Count() on many to many relationship - sql

I have a Students table and a Languages table. They form a many-to-many relationship through a pivot table, Languages_Student.
Is there a way of getting the student who has the largest number of languages in common with another student?
I'm not quite sure how to combine COUNT() with some kind of select. This is what I'm working with now:
select * from students student1
inner join languages_student ls1
on student1.id = ls1.student_id
inner join languages l1
on l1.id = ls1.language_id
inner join languages_student ls2
on l1.id = ls2.language_id
inner join students student2
on ls2.student_id = student2.id
where student1.id = 65
group by 16
I'm trying to get the student with the largest number of languages in common with the student with id 65.
Any ideas?

You can do this using a self-join. Join the junction table to itself on the language column; then aggregate by the student column to get the number in common:
select ls.student_id, count(*) as NumInCommon
from languages_student ls join
languages_student ls65
on ls.language_id = ls65.language_id and ls65.student_id = 65 and
ls65.student_id <> ls.student_id
group by ls.student_id
order by count(*) desc;
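
The self-join above can be tried out with a minimal sketch. It assumes SQLite through Python's sqlite3 module and a few made-up rows in languages_student (the sample data is hypothetical, not from the question):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE languages_student (student_id INTEGER, language_id INTEGER);
    -- Hypothetical sample data: student 65 speaks languages 1, 2 and 3.
    INSERT INTO languages_student VALUES
        (65, 1), (65, 2), (65, 3),
        (70, 1), (70, 2),
        (71, 3),
        (72, 4);
""")

# Pair every row with student 65's rows for the same language,
# skip student 65 itself, then count the pairs per student.
rows = conn.execute("""
    SELECT ls.student_id, COUNT(*) AS num_in_common
    FROM languages_student ls
    JOIN languages_student ls65
      ON ls.language_id = ls65.language_id
     AND ls65.student_id = 65
     AND ls.student_id <> ls65.student_id
    GROUP BY ls.student_id
    ORDER BY num_in_common DESC
""").fetchall()

print(rows)  # [(70, 2), (71, 1)]
```

Student 72 shares no language with student 65 and therefore does not appear at all; the first row is the best match.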

select ls.student_id, count(ls.language_id) as common
from languages_student ls
where ls.language_id in (
select l.id
from students s
inner join languages_student ls65
on s.id = ls65.student_id
inner join languages l
on l.id = ls65.language_id
where s.id = 65 )
and ls.student_id <> 65
group by ls.student_id
order by count(ls.language_id) desc

Related

SQL select clause with three tables

I am trying to write a SQL query to show the student list for each course.
The diagram below shows the database relationship.
The SQL query I have written is:
select * from Courses
inner join Enrollments on Enrollments.CourseId = Courses.CourseId
inner join Student on Enrollments.StudentId = StudentId
where Courses.CourseId = 1
The issue is that I am getting a lot more data back than I expected: only one student is registered for the course, but I get ten entries. I am not sure whether I have done something fundamentally wrong or whether my query is the issue.
This is the data
This is the result
I expected only two rows to be returned.
Thanks
Every column in your query should be qualified with its table's name.
You did not qualify the column StudentId in this join:
inner join Student on Enrollments.StudentId = StudentId
If you had, you would have found the error: there is no column StudentId in the table Student, so the unqualified name resolves to Enrollments.StudentId and the condition is always true. You should use the column Id:
select * from Courses
inner join Enrollments on Enrollments.CourseId = Courses.CourseId
inner join Student on Enrollments.StudentId = Student.Id
where Courses.CourseId = 1
or better with aliases for the tables:
select *
from Courses as c
inner join Enrollments as e on e.CourseId = c.CourseId
inner join Student as s on e.StudentId = s.Id
where c.CourseId = 1
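
To see why the unqualified name multiplies the rows, here is a small sketch. It assumes SQLite through Python's sqlite3 module; the Name columns and the sample rows are made up for illustration, and only CourseId, StudentId and Id come from the question:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Courses (CourseId INTEGER PRIMARY KEY, Name TEXT);
    CREATE TABLE Student (Id INTEGER PRIMARY KEY, Name TEXT);
    CREATE TABLE Enrollments (CourseId INTEGER, StudentId INTEGER);
    -- Hypothetical data: three students, only Bob is enrolled in course 1.
    INSERT INTO Courses VALUES (1, 'Maths');
    INSERT INTO Student VALUES (1, 'Ann'), (2, 'Bob'), (3, 'Cy');
    INSERT INTO Enrollments VALUES (1, 2);
""")

# Unqualified StudentId can only mean Enrollments.StudentId, so the
# condition compares the column with itself and matches every student.
broken = conn.execute("""
    SELECT s.Name
    FROM Courses c
    JOIN Enrollments e ON e.CourseId = c.CourseId
    JOIN Student s ON e.StudentId = StudentId
    WHERE c.CourseId = 1
""").fetchall()

# Qualifying the column joins on the real key.
fixed = conn.execute("""
    SELECT s.Name
    FROM Courses c
    JOIN Enrollments e ON e.CourseId = c.CourseId
    JOIN Student s ON e.StudentId = s.Id
    WHERE c.CourseId = 1
""").fetchall()

print(len(broken), fixed)  # 3 [('Bob',)]
```

The broken join returns one row per student for every enrollment, which matches the symptom of getting far more rows than expected.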
The primary key of table Student is Id, not StudentId.
So the correct query is:
select * from Courses
inner join Enrollments on Enrollments.CourseId = Courses.CourseId
inner join Student on Enrollments.StudentId = Student.Id
where Courses.CourseId = 1

How to make a query from 4 tables

I am having a problem creating a View using SQL. I need to make a View of 4 tables:
tbl_school, tbl_teacher, tbl_student, and tbl_class.
This is my table structure:
And this is my View Statement
SELECT
tbl_school.school_id,
tbl_school.school_nm,
(SELECT COUNT(*) FROM tbl_class) AS class,
(SELECT COUNT(*) FROM tbl_teacher) AS teacher,
(SELECT COUNT(*) FROM tbl_student) AS student
FROM
tbl_school
INNER JOIN tbl_teacher ON tbl_school.school_id = tbl_teacher.school_id
INNER JOIN tbl_class ON tbl_teacher.teacher_id = tbl_class.teacher_id AND tbl_school.school_id = tbl_class.school_id
INNER JOIN tbl_student ON tbl_class.class_id = tbl_student.class_id
GROUP BY
tbl_school.school_id
And this is the query result:
The problem is that I have one teacher in SD1 School and another teacher in SD2 School. Each teacher has one class and SD1 School has two students and SD2 School has one student.
Is there a way I can get the results that I desire?
You can use aggregation with the DISTINCT keyword. It is also better to use table aliases and to add one more column (tbl_school.school_nm) to the GROUP BY list, which makes the query more portable (some DBMSs do not allow omitting that column from GROUP BY, although MySQL does):
SELECT s.school_id, s.school_nm,
COUNT(DISTINCT c.class_id) AS class,
COUNT(DISTINCT t.teacher_id) AS teacher,
COUNT(DISTINCT d.student_id) AS student -- assumes tbl_student has a student_id column
FROM tbl_school s
JOIN tbl_teacher t ON s.school_id = t.school_id
JOIN tbl_class c ON t.teacher_id = c.teacher_id AND s.school_id = c.school_id
JOIN tbl_student d ON c.class_id = d.class_id
GROUP BY s.school_id, s.school_nm
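
The effect of DISTINCT can be checked with a short sketch. It assumes SQLite through Python's sqlite3 module and a reduced, hypothetical version of the four tables holding the data described in the question (one teacher and one class per school, two students in SD1, one in SD2):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Reduced, assumed table definitions; only the key columns are kept.
    CREATE TABLE tbl_school  (school_id INTEGER, school_nm TEXT);
    CREATE TABLE tbl_teacher (teacher_id INTEGER, school_id INTEGER);
    CREATE TABLE tbl_class   (class_id INTEGER, teacher_id INTEGER, school_id INTEGER);
    CREATE TABLE tbl_student (student_id INTEGER, class_id INTEGER);
    INSERT INTO tbl_school  VALUES (1, 'SD1'), (2, 'SD2');
    INSERT INTO tbl_teacher VALUES (1, 1), (2, 2);
    INSERT INTO tbl_class   VALUES (1, 1, 1), (2, 2, 2);
    INSERT INTO tbl_student VALUES (1, 1), (2, 1), (3, 2);
""")

# COUNT(*) counts joined rows (one per student), while
# COUNT(DISTINCT ...) counts each class/teacher/student only once.
rows = conn.execute("""
    SELECT s.school_nm,
           COUNT(*)                     AS joined_rows,
           COUNT(DISTINCT c.class_id)   AS class,
           COUNT(DISTINCT t.teacher_id) AS teacher,
           COUNT(DISTINCT d.student_id) AS student
    FROM tbl_school s
    JOIN tbl_teacher t ON s.school_id = t.school_id
    JOIN tbl_class c   ON t.teacher_id = c.teacher_id AND s.school_id = c.school_id
    JOIN tbl_student d ON c.class_id = d.class_id
    GROUP BY s.school_id, s.school_nm
    ORDER BY s.school_id
""").fetchall()

print(rows)  # [('SD1', 2, 1, 1, 2), ('SD2', 1, 1, 1, 1)]
```

For SD1 the join produces two rows because there are two students, so a plain COUNT(*) would report two classes and two teachers as well.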
Welcome to SO.
It has been a while since I have done this, but have you tried adding a WHERE clause to your inner SELECT statements?
Side note: It makes more sense, to me, to also have a FK on tbl_student that links each student to the school they are in.
Like this:
SELECT
tbl_school.school_id,
tbl_school.school_nm,
(SELECT COUNT(*) FROM tbl_class WHERE school_id=tbl_school.school_id) AS class,
(SELECT COUNT(*) FROM tbl_teacher WHERE school_id=tbl_school.school_id) AS teacher,
(SELECT COUNT(*) FROM tbl_student st INNER JOIN tbl_class cl ON cl.class_id = st.class_id WHERE cl.school_id = tbl_school.school_id) AS student
FROM
tbl_school
INNER JOIN tbl_teacher ON tbl_school.school_id = tbl_teacher.school_id
INNER JOIN tbl_class ON tbl_teacher.teacher_id = tbl_class.teacher_id AND tbl_school.school_id = tbl_class.school_id
INNER JOIN tbl_student ON tbl_class.class_id = tbl_student.class_id
GROUP BY
tbl_school.school_id

sqlite: count with group by clause does not give the expected result

On SQLite, I have these tables:
papers: rero_id, doi, year
writtenby: rero_id, authorid, instid
authors: author_id, name, firstname
inst: inst_id, name, see_id
inst is a table of Institutions: Universities and so on.
Each line in writtenby gives a paper, an author, and an institution this author was attached to at that time. There can be more than one institution, and the (paper, authorid) pair is repeated for each institution.
For a given author, I want a list and a count of the institutions he has co-authored papers with.
For a list I tried
SELECT inst.name as loc
FROM (
(authors INNER JOIN writtenby ON authors.authorid =
writtenby.authorid)
INNER JOIN writtenby AS writtenby_1 ON writtenby.rero_id =
writtenby_1.rero_id
)
INNER JOIN authors AS auth_1 ON writtenby_1.authorid =
auth_1.authorid
inner join inst on writtenby_1.instid = inst.inst_id
WHERE (authors.name) ="Doe" AND (authors.firstname)= "Joe"
ORDER BY loc
I got a list that seems ok.
Now, I would like to regroup these institution names and have a count.
I tried
SELECT inst.name, count(inst.name)
FROM (
(authors INNER JOIN writtenby ON authors.authorid =
writtenby.authorid)
INNER JOIN writtenby AS writtenby_1 ON writtenby.rero_id =
writtenby_1.rero_id
)
INNER JOIN authors AS auth_1 ON writtenby_1.authorid =
auth_1.authorid
inner join inst on writtenby_1.instid = inst.inst_id
GROUP BY inst.name
HAVING (authors.name) ="Doe" AND (authors.firstname)= "John"
I get only three rows, not a count for each of the institutions listed by the first query.
Thanks for correcting me !
François
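Try using WHERE instead of HAVING. WHERE filters rows before they are grouped, whereas HAVING filters the groups after aggregation, so a condition on columns that are not in the GROUP BY list belongs in WHERE: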
SELECT inst.name, count(inst.name)
FROM (
(authors INNER JOIN writtenby ON authors.authorid =
writtenby.authorid)
INNER JOIN writtenby AS writtenby_1 ON writtenby.rero_id =
writtenby_1.rero_id
)
INNER JOIN authors AS auth_1 ON writtenby_1.authorid =
auth_1.authorid
inner join inst on writtenby_1.instid = inst.inst_id
where authors.name ='Doe' AND authors.firstname= 'John'
GROUP BY inst.name
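
The difference between the two clauses can be sketched as follows. The example assumes SQLite through Python's sqlite3 module, a reduced version of the tables, and invented sample rows; the threshold in the HAVING clause is purely illustrative:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Reduced, assumed schema with hypothetical data.
    CREATE TABLE authors   (authorid INTEGER, name TEXT, firstname TEXT);
    CREATE TABLE inst      (inst_id INTEGER, name TEXT);
    CREATE TABLE writtenby (rero_id INTEGER, authorid INTEGER, instid INTEGER);
    INSERT INTO authors VALUES (1, 'Doe', 'John'), (2, 'Roe', 'Ann'), (3, 'Poe', 'Ed');
    INSERT INTO inst VALUES (10, 'Uni A'), (20, 'Uni B'), (30, 'Uni C');
    INSERT INTO writtenby VALUES
        (100, 1, 10), (100, 2, 20),   -- paper 100: Doe and Roe
        (101, 1, 10), (101, 3, 30),   -- paper 101: Doe and Poe
        (102, 2, 20), (102, 3, 30);   -- paper 102: no Doe
""")

base = """
    SELECT inst.name, COUNT(*) AS c
    FROM authors
    JOIN writtenby ON authors.authorid = writtenby.authorid
    JOIN writtenby AS writtenby_1 ON writtenby.rero_id = writtenby_1.rero_id
    JOIN inst ON writtenby_1.instid = inst.inst_id
    WHERE authors.name = 'Doe' AND authors.firstname = 'John'  -- row filter, applied before grouping
    GROUP BY inst.name
"""

# All institutions appearing on John Doe's papers, with counts.
by_inst = conn.execute(base + " ORDER BY inst.name").fetchall()

# HAVING is the right place for a condition on the aggregate itself.
frequent = conn.execute(base + " HAVING COUNT(*) >= 2").fetchall()

print(by_inst)   # [('Uni A', 2), ('Uni B', 1), ('Uni C', 1)]
print(frequent)  # [('Uni A', 2)]
```

As in the original query, the author's own institution rows are included in the counts.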
I got this to work:
SELECT inst.name as loc, count(*) as c
FROM (
(authors INNER JOIN writtenby ON authors.authorid = writtenby.authorid)
INNER JOIN writtenby AS writtenby_1 ON writtenby.rero_id =
writtenby_1.rero_id
inner join inst on writtenby_1.instid = inst.inst_id
)
INNER JOIN authors AS auth_1 ON writtenby_1.authorid = auth_1.authorid
WHERE (authors.name) ="Doe" AND (authors.firstname)= "John"
GROUP BY inst.name
ORDER BY c DESC
So I can still use a WHERE clause, and it is not the same as HAVING.
And thanks to fa6 who gave the answer below
F.

SQL LEFT JOIN for joining three tables, using one of them to exclude content

I have 3 tables
STUDENTS
FEES_PAID
SUSPENDED
I want to get the details of the students who have paid the fees but are not in SUSPENDED.
SELECT
ID
FROM
STUDENTS s
LEFT JOIN
SUSPENDED p ON s.ID = p.ID
INNER JOIN
FEES_PAID f ON f.ID = s.ID
WHERE
s.ID IS NULL
Unfortunately this does not work. Can anyone suggest an efficient query?
Thanks in advance
You need to check whether the LEFT JOIN found no matching row in the second table (SUSPENDED). So, you need to look at a column in that table, which is NULL when there is no match. Change the WHERE to:
WHERE p.ID IS NULL
Alternatively, use NOT EXISTS:
SELECT s.ID
FROM STUDENTS s INNER JOIN
FEES_PAID f
ON f.ID = s.ID
WHERE NOT EXISTS (SELECT 1 FROM SUSPENDED p WHERE s.ID = p.ID);
Note that for both these queries, you will need to qualify the ID in the SELECT to specify the table where it comes from.
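
Both variants can be compared with a minimal sketch. It assumes SQLite through Python's sqlite3 module, single-column versions of the three tables, and made-up IDs:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE STUDENTS  (ID INTEGER PRIMARY KEY);
    CREATE TABLE FEES_PAID (ID INTEGER);
    CREATE TABLE SUSPENDED (ID INTEGER);
    -- Hypothetical data: student 4 has not paid, student 2 is suspended.
    INSERT INTO STUDENTS  VALUES (1), (2), (3), (4);
    INSERT INTO FEES_PAID VALUES (1), (2), (3);
    INSERT INTO SUSPENDED VALUES (2);
""")

# Anti-join: keep rows where the LEFT JOIN found no SUSPENDED match.
left_join = conn.execute("""
    SELECT s.ID
    FROM STUDENTS s
    LEFT JOIN SUSPENDED p ON s.ID = p.ID
    INNER JOIN FEES_PAID f ON f.ID = s.ID
    WHERE p.ID IS NULL
    ORDER BY s.ID
""").fetchall()

# Same result expressed with NOT EXISTS.
not_exists = conn.execute("""
    SELECT s.ID
    FROM STUDENTS s
    INNER JOIN FEES_PAID f ON f.ID = s.ID
    WHERE NOT EXISTS (SELECT 1 FROM SUSPENDED p WHERE s.ID = p.ID)
    ORDER BY s.ID
""").fetchall()

print(left_join, not_exists)  # [(1,), (3,)] [(1,), (3,)]
```

Testing s.ID IS NULL, as in the question, can never succeed because every row starts from STUDENTS; only the columns of the left-joined table become NULL.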
This should work:
SELECT
s.ID
FROM
STUDENTS s
LEFT JOIN
SUSPENDED p
ON s.ID=p.ID
INNER JOIN
FEES_PAID f
ON f.ID= s.ID
WHERE
p.ID IS NULL

Query extensibility with WHERE EXISTS with a large table

The following query is designed to find the number of people who went to a hospital, the total number of people who went to a hospital, and then divide those two to find a percentage. The Claims table has more than two million rows and has a suitable non-clustered index on patientid, admissiondate, and dischargedate. The query runs quickly enough, but I'm interested in how I could make it more usable. I would like to be able to add another code in the line where (hcpcs.hcpcs ='97001') and have the change in percentRehabNotHomeHealth reflected in another column. Is this possible without writing a big, fat join statement where I join the results of the two queries together? I know that by adding the extra column the math won't look right, but I'm not worried about that at the moment. Desired sample output: http://imgur.com/BCLrd
database schema
select h.hospitalname
,count(*) as visitCounts
,hospitalcounts
,round(count(*)/cast(hospitalcounts as float) *100,2) as percentRehabNotHomeHealth
from Patient p
inner join statecounties as sc on sc.countycode = p.countycode
and sc.statecode = p.statecode
inner join hospitals as h on h.npi=p.hospitalnpi
inner join
--this join adds the hospitalCounts column
(
select h.hospitalname, count(*) as hospitalCounts
from hospitals as h
inner join patient as p on p.hospitalnpi=h.npi
where p.statecode='21' and h.statecode='21'
group by h.hospitalname
) as t on t.hospitalname=h.hospitalname
--this where exists clause gives the visitCounts column
where h.stateCode='21' and p.statecode='21'
and exists
(
select distinct p2.patientid
from Patient as p2
inner join Claims as c on c.patientid = p2.patientid
and c.admissiondate = p2.admissiondate
and c.dischargedate = p2.dischargedate
inner join hcpcs on hcpcs.hcpcs=c.hcpcs
inner join hospitals as h on h.npi=p2.hospitalnpi
where (hcpcs.hcpcs ='97001' or hcpcs.hcpcs='9339' or hcpcs.hcpcs='97002')
and p2.patientid=p.patientid
)
and hospitalcounts > 10
group by h.hospitalname, t.hospitalcounts
having count(*)>10
You might look into CTEs (common table expressions) to get what you need. A CTE lets you compute summarized data once and join it back to the detail rows on a common key. As an example, I changed your join on the subquery into a CTE.
;with hospitalCounts as (
select h.hospitalname, count(*) as hospitalCounts
from hospitals as h
inner join patient as p on p.hospitalnpi=h.npi
where p.statecode='21' and h.statecode='21'
group by h.hospitalname
)
select h.hospitalname
,count(*) as visitCounts
,hospitalcounts
,round(count(*)/cast(hospitalcounts as float) *100,2) as percentRehabNotHomeHealth
from Patient p
inner join statecounties as sc on sc.countycode = p.countycode
and sc.statecode = p.statecode
inner join hospitals as h on h.npi=p.hospitalnpi
inner join hospitalCounts as t on t.hospitalname=h.hospitalname
--this where exists clause gives the visitCounts column
where h.stateCode='21' and p.statecode='21'
and exists
(
select p2.patientid
from Patient as p2
inner join Claims as c on c.patientid = p2.patientid
and c.admissiondate = p2.admissiondate
and c.dischargedate = p2.dischargedate
inner join hcpcs on hcpcs.hcpcs=c.hcpcs
inner join hospitals as h on h.npi=p2.hospitalnpi
where (hcpcs.hcpcs ='97001' or hcpcs.hcpcs='9339' or hcpcs.hcpcs='97002')
and p2.patientid=p.patientid
)
and hospitalcounts > 10
group by h.hospitalname, t.hospitalcounts
having count(*)>10
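
The summarize-then-join-back pattern can be sketched on a much smaller scale. The example assumes SQLite through Python's sqlite3 module rather than the original DBMS, a heavily reduced and hypothetical schema (no state filters, dates or thresholds), and invented rows:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Reduced, assumed schema with hypothetical data.
    CREATE TABLE hospitals (npi INTEGER, hospitalname TEXT);
    CREATE TABLE patient   (patientid INTEGER, hospitalnpi INTEGER);
    CREATE TABLE claims    (patientid INTEGER, hcpcs TEXT);
    INSERT INTO hospitals VALUES (1, 'General'), (2, 'Mercy');
    INSERT INTO patient VALUES (1, 1), (2, 1), (3, 1), (4, 1),
                               (5, 2), (6, 2), (7, 2), (8, 2);
    INSERT INTO claims VALUES (1, '97001'), (1, '97002'), (2, '97002'),
                              (3, '99999'), (5, '97001');
""")

rows = conn.execute("""
    WITH hospital_counts AS (          -- summary: all patients per hospital
        SELECT h.hospitalname, COUNT(*) AS hospitalcounts
        FROM hospitals h
        JOIN patient p ON p.hospitalnpi = h.npi
        GROUP BY h.hospitalname
    )
    SELECT h.hospitalname,
           COUNT(*) AS visitcounts,    -- patients with a matching claim
           t.hospitalcounts,
           ROUND(COUNT(*) * 100.0 / t.hospitalcounts, 2) AS pct
    FROM patient p
    JOIN hospitals h ON h.npi = p.hospitalnpi
    JOIN hospital_counts t ON t.hospitalname = h.hospitalname
    WHERE EXISTS (SELECT 1 FROM claims c
                  WHERE c.patientid = p.patientid
                    AND c.hcpcs IN ('97001', '97002'))
    GROUP BY h.hospitalname, t.hospitalcounts
    ORDER BY h.hospitalname
""").fetchall()

print(rows)  # [('General', 2, 4, 50.0), ('Mercy', 1, 4, 25.0)]
```

Patient 1 has two matching claims but is counted once, because EXISTS only tests whether a matching claim is present.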