sql query with patients - sql

Need help with a database query:
patients(cpr(key), firstname, sirname, address, postalnumber, country, journal)
allergies(allergens(key), allergytype, allergic_reaction)
patientallergies(allergens(key), cpr(key))
How do we list CPR numbers pairwise for patients who are allergic to exactly the same allergens? A CPR number can only be printed once and a CPR number cannot be paired with itself.
Our current suggestion goes something like this:
SELECT p1.cpr, p2.cpr
FROM patients p1, patients p2
Not sure where to go from here

One method is a self-join and aggregation. This uses a window function to count the number of allergens per patient to be sure the match is exact. The condition pa.cpr < pa2.cpr keeps a patient from being paired with itself and lists each pair only once:
with pa as (
select pa.*, count(*) over (partition by cpr) as cnt
from patientallergies pa
)
select pa.cpr, pa2.cpr
from pa join
pa pa2
on pa.allergens = pa2.allergens and pa.cnt = pa2.cnt and pa.cpr < pa2.cpr
group by pa.cpr, pa2.cpr, pa.cnt
having count(*) = pa.cnt;
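To see why the allergen count has to match on both sides, here is a small sketch with made-up data. It assumes the patientallergies table from the question; the CPR values and allergen names are hypothetical, and the counting is done with a plain GROUP BY instead of a window function, for databases that lack them:

```sql
-- Toy data, values are invented for illustration only.
CREATE TABLE patientallergies (
    allergens VARCHAR(20),
    cpr       VARCHAR(10),
    PRIMARY KEY (allergens, cpr)
);

INSERT INTO patientallergies (allergens, cpr) VALUES
    ('pollen', 'A'), ('nuts', 'A'),
    ('pollen', 'B'), ('nuts', 'B'),
    ('pollen', 'C'),
    ('pollen', 'D'), ('nuts', 'D'), ('dust', 'D');

-- Number of allergens per patient.
WITH cnt AS (
    SELECT cpr, COUNT(*) AS n
    FROM patientallergies
    GROUP BY cpr
)
SELECT a.cpr AS cpr1, b.cpr AS cpr2
FROM patientallergies a
JOIN patientallergies b
  ON a.allergens = b.allergens
 AND a.cpr < b.cpr               -- no self-pairs, each pair listed once
JOIN cnt ca ON ca.cpr = a.cpr
JOIN cnt cb ON cb.cpr = b.cpr
WHERE ca.n = cb.n                -- both patients have the same number of allergens
GROUP BY a.cpr, b.cpr, ca.n
HAVING COUNT(*) = ca.n;          -- and every one of them is shared
-- With the toy data this returns a single row: A, B
```

Patient D shares two allergens with A, but D has three in total, so the count check rejects that pair.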

Related

WHERE clause does not find column after a CTE?

New to CTE's and subqueries in SQL.
I have 3 tables:
categories (category_code, category)
countries (country_code, country, continent)
businesses (business, year_founded, category_code, country_code)
Goal is to look at oldest businesses in the world. I used a CTE:
WITH bus_cat_cont AS (
SELECT business, year_founded, category, country,
continent
FROM businesses AS b
INNER JOIN categories AS c1
ON b.category_code = c1.category_code
INNER JOIN countries AS c2
ON b.country_code = c2.country_code
)
SELECT continent,
category,
COUNT(business) AS n
FROM bus_cat_cont
WHERE n > 5
GROUP BY continent, category
ORDER BY n DESC;
The code works without WHERE n > 5. But after adding that, I get the error:
column "n" does not exist
I realized there is a much easier way to get the output I want without a CTE.
But I'm wondering: Why do I get this error?
This would work:
WITH bus_cat_cont AS (
SELECT business, year_founded, category, country, continent
FROM businesses AS b
JOIN categories AS c1 ON b.category_code = c1.category_code
JOIN countries AS c2 ON b.country_code = c2.country_code
)
SELECT continent, category, count(business) AS n
FROM bus_cat_cont
-- WHERE n > 5 -- wrong
GROUP BY continent, category
HAVING count(business) > 5 -- right
ORDER BY n DESC;
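The output column name "n" is not visible (yet) in the WHERE or HAVING clause. WHERE filters individual rows before GROUP BY aggregates them, HAVING filters the resulting groups, and the SELECT list (where "n" is defined) is evaluated only after both. Consider the sequence of events in an SQL query; see:
Best way to get result count before LIMIT was applied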
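If you do want to filter on the alias itself, one option is to compute the aggregate in a derived table, so that "n" is an ordinary column by the time the outer WHERE runs. A minimal sketch, assuming only the businesses table from the question with a few invented rows (the threshold is lowered to 1 so the toy data produces a result):

```sql
-- Invented sample rows, for illustration only.
CREATE TABLE businesses (
    business      VARCHAR(50) NOT NULL,
    year_founded  INT,
    category_code VARCHAR(10),
    country_code  VARCHAR(3)
);

INSERT INTO businesses VALUES
    ('A', 1900, 'CAT1', 'DEU'),
    ('B', 1910, 'CAT1', 'DEU'),
    ('C', 1920, 'CAT2', 'FRA');

SELECT country_code, category_code, n
FROM (
    SELECT country_code, category_code, COUNT(*) AS n
    FROM businesses
    GROUP BY country_code, category_code
) AS agg
WHERE n > 1          -- "n" is a real column of agg here
ORDER BY n DESC;
-- With the sample rows this returns one row: DEU, CAT1, 2
```

HAVING on the aggregate expression does the same job in a single query level; the derived table is mainly useful when the alias is needed in several places.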
For the record, the result has no obvious connection to your declared goal to "look at oldest businesses in the world". year_founded is unused in the query.
You get the most common continent/category combinations among businesses.
Aside, probably better:
SELECT co.continent, ca.category, n
FROM (
SELECT category_code, country_code, count(*) AS n
FROM businesses
GROUP BY 1, 2
HAVING count(*) > 5
) b
JOIN categories ca USING (category_code)
JOIN countries co USING (country_code)
ORDER BY n DESC;
There is really no need for a CTE.
Aggregate first, join later. See:
Query with LEFT JOIN not returning rows for count of 0
Besides being faster, this is also safer. While category_code and country_code should be defined UNIQUE, the same may not be true for continent and category. (You may want to output the codes as well to disambiguate.)
count(*) is implemented separately and slightly faster - and equivalent while business is defined NOT NULL.
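The difference between the two forms only shows up when the counted column can be NULL. A small sketch with a hypothetical demo_businesses table (not part of the question's schema) where business is nullable:

```sql
-- Hypothetical table; business is deliberately nullable here.
CREATE TABLE demo_businesses (
    business     VARCHAR(50),
    country_code VARCHAR(3)
);

INSERT INTO demo_businesses VALUES
    ('A',  'DEU'),
    (NULL, 'DEU'),
    ('C',  'FRA');

SELECT country_code,
       COUNT(*)        AS all_rows,        -- counts every row in the group
       COUNT(business) AS non_null_names   -- skips rows where business IS NULL
FROM demo_businesses
GROUP BY country_code
ORDER BY country_code;
-- DEU: all_rows = 2, non_null_names = 1
-- FRA: all_rows = 1, non_null_names = 1
```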

Returning the Min() of a Count()

I am studying for an SQL test and the previous year has the final question:
Name the student who has studied the least number of papers. How many
papers have they studied?
So far, this is the select query that I have created:
select min(Full_Name), min(Amount)
from (select st.ST_F_Name & ' ' & st.ST_L_Name as Full_Name, count(*) as Amount
from (student_course as sc
inner join students as st
on st.ST_ID=sc.SC_ST_ID)
group by st.ST_F_Name & ' ' & st.ST_L_Name)
This works perfectly for returning the result I want but I'm not sure if this is the way I should be doing this query? I feel like calling min() on the Full_Name could potentially backfire on me under certain circumstances. Is there a better way to be doing this? (this is in MS Access for unknown reasons)
If you want only 1 of such students if there are multiple, this is probably the simplest:
select st.ST_F_Name, st.ST_L_Name, count(*) as Amount
from student_course as sc
inner join students as st
on st.ST_ID=sc.SC_ST_ID
group by st.ST_ID
order by Amount ASC LIMIT 1
However, if you want to find all such students, you follow a different approach. We use a WITH clause to simplify things; it defines a CTE (Common Table Expression) that computes the number of courses per student. Then we select the students whose number equals the minimum in that CTE:
with per_student as (
select st.ST_F_Name, st.ST_L_Name, count(*) as Amount
from student_course as sc
inner join students as st
on st.ST_ID=sc.SC_ST_ID
group by st.ST_ID
)
select * from per_student
where amount = (select min(amount) from per_student)
But the real trick in that question is that there might be students that didn't take ANY courses. But with approaches presented so far you'll never see them. You want something like this:
with per_student as (
select st.ST_F_Name, st.ST_L_Name, count(sc.SC_ST_ID) as Amount
from student_course as sc
right outer join students as st
on st.ST_ID=sc.SC_ST_ID
group by st.ST_ID
)
select * from per_student
where amount = (select min(amount) from per_student)
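To illustrate the point about students without any courses, here is a small sketch with invented rows. The table and key names follow the question; the SC_C_ID column and all values are assumptions made up for the example. It is written as a LEFT JOIN starting from students, which is equivalent to the RIGHT OUTER JOIN above, and groups by every selected column so it runs on stricter databases too:

```sql
-- Invented sample data; SC_C_ID is a hypothetical course-id column.
CREATE TABLE students (
    ST_ID     INT PRIMARY KEY,
    ST_F_Name VARCHAR(30),
    ST_L_Name VARCHAR(30)
);
CREATE TABLE student_course (
    SC_ST_ID INT,
    SC_C_ID  INT
);

INSERT INTO students VALUES (1, 'Ann', 'Lee'), (2, 'Bob', 'Ray'), (3, 'Cy', 'Poe');
INSERT INTO student_course VALUES (1, 10), (1, 11), (2, 10);

SELECT st.ST_ID, st.ST_F_Name, st.ST_L_Name,
       COUNT(sc.SC_ST_ID) AS Amount   -- counts only matched rows, so 0 is possible
FROM students st
LEFT JOIN student_course sc ON st.ST_ID = sc.SC_ST_ID
GROUP BY st.ST_ID, st.ST_F_Name, st.ST_L_Name
ORDER BY Amount, st.ST_ID;
-- 3, Cy,  Poe, 0
-- 2, Bob, Ray, 1
-- 1, Ann, Lee, 2
```

With an inner join and count(*), student 3 would not appear at all and Bob would look like the minimum.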
You can order by count(*) to get the student with the least # of papers:
i.e.
select * from students where st_id in (
select top 1 sc_st_id
from student_course
group by sc_st_id
order by count(*)
)
if you also need the # of papers studied, then join a derived table containing the min count (the count needs an alias, and an inner join keeps only the matching student):
select s.*, t.paper_count from students s
inner join (
select top 1 sc_st_id, count(*) as paper_count
from student_course
group by sc_st_id
order by count(*)
) t on t.sc_st_id = s.st_id

Problems selecting columns with aggregates (SQL Server)

I'm seriously stuck. Please bear with me though because I'm new to databases.:)
Anyway, I need to display the StudentID, the subject where the student has the highest grade in, and the grade of that subject.
Here's the code I have:
SELECT
Grades.Student_ID,
Subject.Subject_Code,
MAX(Grades.Grade)
FROM
Grades
LEFT JOIN
Subject ON Grades.Subject_ID = Subject.Subject_ID
GROUP BY
Grades.Student_ID
But it has this error:
'Subject.Subject_Code' is invalid in the select list because it is not contained in either an aggregate function or the GROUP BY clause.
But I can't include Subject_Code in GROUP BY because the results will be different.
What can I do to only show the
Student_ID || (subject with highest grade) || (grade of that subject)
How can I work around this error?
It seems you are looking for a group-wise maximum. Here's one approach that joins back to a derived table containing the top grade for each student (this approach should work on most RDBMS, including MySQL):
SELECT x.Student_ID,
s.Subject_Code,
x.TopGrade
FROM
(
SELECT
Grades.Student_ID,
MAX(Grades.Grade) AS TopGrade
FROM Grades
GROUP BY Grades.Student_ID
) x
INNER JOIN Grades g
ON g.Student_ID = x.Student_ID AND g.Grade = x.TopGrade
LEFT JOIN Subject s
ON g.Subject_ID = s.Subject_ID
If the same student has two or more marks with exactly the same grade, it will return all Subjects.
Here's my original answer, which will work on SQL Server
SELECT x.Student_ID, x.Subject_Code, x.Grade
FROM
(
SELECT
Grades.Student_ID,
Subject.Subject_Code,
RANK() OVER (PARTITION BY Grades.Student_ID ORDER BY Grades.Grade DESC) AS [Rank],
Grades.Grade
FROM Grades
LEFT JOIN Subject
ON Grades.Subject_ID = Subject.Subject_ID
) x
WHERE x.[Rank] = 1;
SqlFiddle of both the above queries here. In addition, there is an example with ROW_NUMBER with an additional arbitrary ORDER BY to pick one top subject when the student has equal marks in two or more subjects.
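The ROW_NUMBER variant mentioned above might look roughly like this. It is a sketch, not the exact query from the fiddle: the tie-breaker (Subject_Code ascending) is an arbitrary choice, and the sample rows are invented:

```sql
-- Invented sample data for illustration.
CREATE TABLE Subject (
    Subject_ID   INT PRIMARY KEY,
    Subject_Code VARCHAR(10)
);
CREATE TABLE Grades (
    Student_ID INT,
    Subject_ID INT,
    Grade      INT
);

INSERT INTO Subject VALUES (1, 'MATH'), (2, 'PHYS');
INSERT INTO Grades VALUES (100, 1, 90), (100, 2, 90), (200, 1, 70), (200, 2, 85);

SELECT x.Student_ID, x.Subject_Code, x.Grade
FROM (
    SELECT g.Student_ID,
           s.Subject_Code,
           g.Grade,
           -- Unlike RANK(), ROW_NUMBER() never assigns the same number twice,
           -- so the extra ORDER BY column decides which tied subject wins.
           ROW_NUMBER() OVER (
               PARTITION BY g.Student_ID
               ORDER BY g.Grade DESC, s.Subject_Code ASC
           ) AS rn
    FROM Grades g
    LEFT JOIN Subject s ON g.Subject_ID = s.Subject_ID
) x
WHERE x.rn = 1
ORDER BY x.Student_ID;
-- 100, MATH, 90   (tie between MATH and PHYS, MATH wins alphabetically)
-- 200, PHYS, 85
```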

SQL Comparing COUNT values within same table

I'm trying to solve a seemingly simple problem, but I think I'm tripping over my understanding of how the EXISTS keyword works. The problem is simple (this is a dumbed-down version of the actual problem): I have a table of students and a table of hobbies. The students table has their student ID and name. Return only the students that share the same number of hobbies with at least one other student (i.e. those students who have a unique number of hobbies would not be shown).
So the difficulty I run into is working out how to compare the count of hobbies. What I have tried is this.
SELECT sa.studentnum, COUNT(ha.hobbynum)
FROM student sa, hobby ha
WHERE sa.studentnum = ha.studentnum
AND EXISTS (SELECT *
FROM student sb, hobby hb
WHERE sb.studentnum = hb.studentnum
AND sa.studentnum != sb.studentnum
HAVING COUNT(ha.hobbynum) = COUNT(hb.hobbynum)
)
GROUP BY sa.studentnum
ORDER BY sa.studentnum;
So what appears to be happening is that the count of hobbynums is identical each test, resulting in all of the original table being returned, instead of just those that match the same number of hobbies.
Not tested, but maybe something like this (if I understand the problem correctly):
WITH h AS (
SELECT DISTINCT studentnum, COUNT(hobbynum) OVER (PARTITION BY studentnum) AS student_hobby_ct
FROM hobby)
SELECT DISTINCT h1.studentnum, h1.student_hobby_ct
FROM h h1 JOIN h h2 ON h1.student_hobby_ct = h2.student_hobby_ct AND
h1.studentnum <> h2.studentnum;
I think that what your query would do is only return students who had at least one other student with the same number of hobbies. But you're not returning anything about the students with whom they match. Is that intentional? I'd treat both queries as sub-queries and aggregate before a join on the counts. You could do several things... here it's returning the number of students that have matching hobby counts, but you could add a limit such as HAVING COUNT(DISTINCT sb.studentnum) = 0 to get the result your query seemed to return...
with xx as
(SELECT sa.studentnum, count(ha.hobbynum) hobbycount
FROM student sa inner join hobby ha
on sa.studentnum = ha.studentnum
group by sa.studentnum
)
select sa.studentnum, sa.hobbycount, count(distinct sb.studentnum) as matchcount
from
xx sa inner join xx sb on
sa.hobbycount = sb.hobbycount
where
sa.studentnum != sb.studentnum
GROUP by sa.studentnum, sa.hobbycount
ORDER BY sa.studentnum;
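If only the students themselves are needed, and not their match counts, the self-join can be avoided. The following sketch aggregates once per student and then uses a window count over the per-student totals; it assumes a database with window function support, and the hobby rows are invented:

```sql
-- Invented sample data: students 1 and 2 have two hobbies each, student 3 has one.
CREATE TABLE hobby (
    studentnum INT,
    hobbynum   INT
);

INSERT INTO hobby VALUES (1, 1), (1, 2), (2, 1), (2, 3), (3, 1);

WITH per_student AS (
    SELECT studentnum, COUNT(*) AS hobbycount
    FROM hobby
    GROUP BY studentnum
),
with_peers AS (
    SELECT studentnum,
           hobbycount,
           -- how many students have exactly this hobby count
           COUNT(*) OVER (PARTITION BY hobbycount) AS same_count
    FROM per_student
)
SELECT studentnum, hobbycount
FROM with_peers
WHERE same_count > 1     -- drop students whose hobby count is unique
ORDER BY studentnum;
-- 1, 2
-- 2, 2
```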

Count, inner join

I have two tables:
DRIVER(Driver_Id,First name,Last name,...);
PARTICIPANT IN CAR ACCIDENT(Participant_Id,Driver_Id-foreign key,responsibility-yes or no,...).
Now, I need to find out which driver participated in accident where responsibility is 'YES', and how many times. I did this:
Select Driver_ID, COUNT (Participant.Driver_ID)as 'Number of accidents'
from Participant in car accident
where responsibility='YES'
group by Driver_ID
order by COUNT (Participant.Driver_ID) desc
But, I need to add drivers first and last name from the first table(using inner join, I suppose). I don't know how, because it is not contained in either an aggregate function or the GROUP BY clause.
Please help :)
As you suspected, you need to use an inner join. And because the first name and last name are now part of the SELECT, you also need to include those columns in the GROUP BY.
Select d.Driver_ID, d.First_name, d.Last_name, COUNT(p.Driver_ID) as "Number of accidents"
from "Participant in car accident" p join Driver d on p.Driver_ID = d.Driver_ID
where p.responsibility='YES'
group by d.Driver_ID, d.First_name, d.Last_name
order by COUNT(p.Driver_ID) desc
Is this homework?
You could use an inline table:
SELECT d.driver_first_name,
d.driver_last_name,
r.incident_count
FROM DRIVER d
INNER JOIN (SELECT driver_id,
count(*) incident_count
FROM PARTICIPANT_IN_CAR_ACCIDENT
WHERE responsibility = 'YES'
GROUP BY driver_id) r
ON d.driver_id = r.driver_id
ORDER BY r.incident_count DESC
Should work.