How to selected duplicated items and them delete them - sql

I need to find the duplicated rows and then delete.
I have three tables - the data which I need to look into is in tables a and c.
For the moment I have managed to write this query
select
a.name, a.group, a.order, c.p_id
from
a,
join
b using (p_id)
join
c using (p_id)
where
c.p_id = 1
The results is like this:
name group order P_id
John T 0 1
Michael T 1 1
William T 2 1
David QG 3 1
Richard QG 4 1
Joseph QG 5 1
Thomas QA 6 1
James QS 7 1
Robert QM 8 1
Charles QM 10 1
Christopher T 1 1
Daniel T 2 1
Matthew QG 3 1
Anthony QG 4 1
Mark T 0 1
Steven QG 5 1
Paul QA 6 1
James QS 8 1
Robert QM 9 1
Joshua QA 7 1
Kevin CA 13 1
The highlight yellow and red rows is what I need to get but I don't how to get the result.
So what I am trying to do is that when we have duplication name and group and the P_id are the same, and the order is different.
In my query I have used a hard-coded p_id but I want to check all the results not just this one.
After I generate this report I need to delete the duplication keep the one that has the lowest order.
So from the screenshot, if we get the case of red rows, the row that has (Robert, QM, 9) should be deleted
If anybody can help me, I would really appreciated it.
Thank you

You can use the distinct on option on the select statement.
select distinct on (a.name, a.group, c.p_id)
a.name, a.group, a.order, c.p_id
from
a,
join
b using (p_id)
join
c using (p_id)
where
c.p_id = 1
order by a.name, a.group desc, c.p_id;
NOTE: Not Tested.

Related

SQL select with three tables and foreign keys

I have three tables :
field:
f_id
f_start
f_end
1
10
20
2
15
25
3
5
10
person :
p_id
p_name
1
Roger
2
John
3
Alicia
affect :
id
fk_field
fk_person
1
2
1
2
1
2
3
3
3
And I would like to select the dates and the names associated to. Like this
p_name
f_start
f_end
Roger
15
25
John
10
20
Alicia
5
10
I'm new to SQL and I don't know if i have to use JOIN or not... Thanks
You must join all 3 tables on their related columns:
SELECT p.p_name, f.f_start, f.f_end
FROM person p
INNER JOIN affect a ON a.fk_person = p.p_id
INNER JOIN field f ON f.f_id = a.fk_field;
Depending on your requirement you may need LEFT instead of INNER joins, but for this sample data the INNER joins will do.

Select from table 1 unless there is a relationship in 2 other tables

I need to query a name(s) from the Officials table, but exclude that name if the person has the day blocked.
For example, if Sam has blocked 8/21/2021 and 9/11/2021, he should not be selected if the corresponding dates are selected from the Games table. Sam should show up if 9/18/2021 is selected, however. I have 3 tables:
Officials tbl
RefId Name
---------------------
1 Jack
2 Sam
3 Jane
Games tbl Blocks tbl
GameId GameDate BlockId RefId BlockDate
------------------------- ----------------------
1 8/21/2021 1 2 8/21/2021
2 9/11/2021 2 2 9/11/2021
3 9/18/2021 3 3 8/21/2021
Desired Output
----------------------------------
If Game 1 is selected: Jack
If Game 2 is selected: Jack and Jane
If Game 3 is selected: Jack, Sam and Jane
The only 2 tables that are related are the Officials table and Blocks table, with the RefId. I need to compare the BlockDate of Blocks table to GameDate of Games table. I have tried some sql language and this below is obviously not correct, but I'm looking for a way to accomplish what I want to do:
#GameDate datetime,
Select c.Id, c.Name
From Officials c
Where In c.Id And Blocks.BlockDate <> Games.GameDate)
You can do it with NOT EXISTS:
SELECT o.*
FROM Officials o
WHERE NOT EXISTS (
SELECT 1
FROM Blocks b INNER JOIN Games g
ON g.GameDate = b.BlockDate
WHERE b.RefId = o.RefId AND g.GameId = ?
);
See the demo.

SQL count 2 equal columns and select other columns

I have a two separate tables, one with vacancies, and one with applications to those vacancies. I want to select a new table which selects from the vacancies table with a number of other columns from that table, and another column that calculates how many applications there are for those vacancies. So my vacancy table looks like this:
ID Active StartDate JobID JobTypeID HoursPerWeek
1 1 2017-02-28 2 CE 0
2 1 2017-02-15 4 CE 40
3 1 2017-02-14 1 CE 40
4 1 2017-02-28 1 CE 48
My applications table looks like this:
ID VacancyID Forename Surname EmailAddress TelephoneNumber
1 1 John Smith jsmith#gmail.com 447777777777
2 2 John Smith jsmith#gmail.com 447748772641
3 2 John Smith jsmith#gmail.com 447777777777
4 2 John Smith jsmith#gmail.com 447700123456
5 4 John Smith jsmith#gmail.com 447400123569
6 4 John Smith jsmith#gmail.com 447400126547
7 4 John Smith jsmith#gmail.com 447555123654
I want a table that looks like this:
ID Active StartDate JobID HoursPerWeek NumberOfApplicants
1 1 2017-02-28 2 0 1
2 1 2017-02-15 4 40 3
3 1 2017-02-14 1 40 0
4 1 2017-02-28 1 48 3
How can I select that table using joins and count the number of applicants where the VacancyID is equal to the ID of the first vacancy table? I have tried:
select Vacancy.ID, VacancyID, count(*) as NumberOfApplications from VacancyApplication
join Vacancy on Vacancy.ID=VacancyID
group by VacancyID, Vacancy.ID
This obviously doesn't select all the other columns and it also does not select ID 3 because there are 0 applications for that - I want ID 3 to be there with a value of 0 as well as all the other columns. How do I do this? I've tried various forms of grouping and selecting but I'm quite new to SQL so I'm not really sure how this can be done.
Use RIGHT JOIN instead of INNER JOIN and count the vacancyid column from vacancyapplication table. For the non matching records you will get count as 0
SELECT v.id, v.Active, v.StartDate, v.JobID, v.HoursPerWeek
Count(va.vacancyid) AS NumberOfApplications
FROM vacancyapplication va
RIGHT JOIN vacancy v
ON v.id = va.vacancyid
GROUP BY v.id, v.Active, v.StartDate, v.JobID, v.HoursPerWeek
Start using Alias names, it makes the query more readable
Hoping, i understood your problem correctly. Please try below query
select Vacancy.ID, VacancyID, count(*) as NumberOfApplications from VacancyApplication
left join Vacancy on Vacancy.ID=VacancyID
group by VacancyID, Vacancy.ID
You can use count as a window function using the OVER clause, thus eliminating he need for group by:
SELECT v.ID,
v.Active,
v.StartDate,
v.JobID,
v.JobTypeID,
COUNT(va.ID) OVER(PARTITION BY v.ID) HoursPerWeek
FROM Vacancy v
LEFT JOIN vacancyapplication va ON(v.ID = va.VacancyID)
Use left join and table aliases:
select v.ID, count(va.VacancyID) as NumberOfApplications
from Vacancy v join
VacancyApplication va
on v.ID = va.VacancyID
group by v.ID;
You seem to want all the columns. You could include them in the group by. However, a correlated subquery or outer apply is simpler:
select v.*, va.cnt
from vacancy v outer apply
(select count(*) as cnt
from VacancyApplication va
where v.ID = va.VacancyID
) va;
This is probably more efficient anyway, especially if you have an index on VacancyApplication(VacancyID).

How to count linked entries in another table with a specific value

Let's say I have two tables. A students table and an observations table. If the students table looks like:
Id Student Grade
1 Alex 3
2 Barney 3
3 Cara 4
4 Diana 4
And the observations table looks like:
Id Student_Id Observation_Type
1 1 A
2 1 B
3 3 A
4 2 A
5 4 B
6 3 A
7 2 B
8 4 B
9 1 A
Basically, the result I'd like from the query would be the following:
Student Grade Observation_A_Count
Alex 3 2
Barney 3 1
Cara 4 2
Diana 4 0
In other words, I'd like to gather data for each student from the students table and for each student count the number of A observations from the observations table and tack that onto the other information. How do I go about doing this?
This is a simple join and aggregate:
select
a.Student,
a.Grade,
count(b.Id) as Observation_A_Count
from
Student a left join
Observations b on a.Id = b.Student_Id
group by
a.Student,
a.Grade
order by
1
Or, you can use a correlated subquery:
select
a.Student,
a.Grade,
(select count(*) from observations x where x.Student_Id = a.Id) as Observation_A_Count
from
Student a
order by
a.Student
You can join the table with a specific condition, by doing this you can have a field for Observation_B_Count and Observation_C_Count, etc.
SELECT Student.Student, Student.Grade, COUNT(Observation_Type.*) AS Observation_A_Count
FROM Student
LEFT JOIN Observations ON Observations.Student_ID = Student.Student_ID AND Observations.Observation_Type = 'A'
GROUP BY Student.Student, Student.Grade

Tricky SQL - Select non-adjacent numbers

Given this data on SQL Server 2005:
SectionID Name
1 Dan
2 Dan
4 Dan
5 Dan
2 Tom
7 Tom
9 Tom
10 Tom
How would I select records where the sectionID must be +-2 or more from another section for the same name.
The result would be:
1 Dan
4 Dan
2 Tom
7 Tom
9 Tom
Thanks for reading!
SELECT *
FROM mytable a
WHERE NOT EXISTS
(SELECT *
FROM mytable b
WHERE a.Name = b.Name
AND a.SectionID = b.SectionID + 1)
Here's LEFT JOIN variant of Anthony's answer (removes consecutive id's from the results)
SELECT a.*
FROM mytable a
LEFT JOIN mytable b ON a.Name = b.Name AND a.SectionID = b.SectionID + 1
WHERE b.SectionID IS NULL
EDIT: Since there is another interpretation of the question (simply getting results where id's are more than 1 number apart) here is another attempt at an answer:
WITH alternate AS (
SELECT sectionid,
name,
EXISTS(SELECT a.sectionid
FROM mytable b
WHERE a.name = b.name AND
(a.sectionid = b.sectionid-1 or a.sectionid = b.sectionid+1)) as has_neighbour,
row_number() OVER (PARTITION by a.name ORDER BY a.name, a.sectionid) as row_no
FROM mytable a
)
SELECT sectionid, name
FROM alternate
WHERE row_no % 2 = 1 OR NOT(has_neighbour)
ORDER BY name, sectionid;
gives:
sectionid | name
-----------+------
1 | Dan
4 | Dan
2 | Tom
7 | Tom
9 | Tom
Logic: if a record has neighbors with same name and id+/-1 then every odd row is taken, if it has no such neighbors then it gets the row regardless if it is even or odd.
As stated in the comment the condition is ambiguous - on start of each new sequence you might start with odd or even rows and the criteria will still be satisfied with different results (even with different number of results).