why results of two queries are different? - sql

select distinct ID, title, takes.course_id
from course join takes
on course.course_id = takes.course_id
where takes.course_id in
(select takes.course_id
from takes
where ID = '10204');
select ID, title, takes.course_id
from course join takes
on course.course_id = takes.course_id
where ID = '10204';
I want to query the course IDs and the titles of the courses that a student whose ID is 10204 takes. The first gives a result with 5000 rows which is incorrect. The second give a correct result. So what is wrong with the first?

The first query gives you data for all students that happen to take a course that 10204 also takes.

Essentially the first query can be read as "Find all courses and the students that take them, for any course that is also taken by student 10204". You can look at the first query as a 3 way join. The results of the subquery select takes.course_id from takes where ID = '10204' would be the "third" table.

Adding to the pile on since everyone seems to be offering bits and pieces, some of which are oddly irate...
The first query says "Give me information on the students and courses where the courses were also taken by student 10204"
The second query says "Give me information on the students and courses taken by student 10204"
You say you wanted to get the course IDs and Titles for the courses taken by the student 10204, so obviously the second query is the correct one. You don't care about other student's that have taken the same courses.
Perhaps, to put it into perspective, rewriting the first, and incorrect query will help:
select distinct ID, title, takes.course_id
from course
join takes
on course.course_id = takes.course_id
join takes as takes2
on takes.course_id = takes2.course_id
WHERE
takes2.ID = '10204');

Well that is could be because in the first query you are quering where the course_id in takes table is equal to a specific course_id in that table (WHICH CAN BE NOT UNIQUE)
and in the second query you are straightly querying where the course_id is equal to a unique ID in that table!

Thanks you guys. I think my problem is that I did not realize that other students can take the same courses with the student having ID 10204. That this why though the condition is to query only courses take by the students 10204, the results is all about the courses taken by both 10204 and other students.

Because takes.ID != course.ID. The first you use takes.ID in the where clause but the second you use course.ID

Related

SQL question: how to find rows that share all of the same rows in a composite table?

I'm working on my SQL project using the Oracle database for class, and I'm asked a question that I see far too often.
You have three tables:
STUDENT: SNO, SNAME
CLASS: CNO, CNAME
ATTENDANCE: SNO, CNO, Grade
The question I keep finding is of a similar type: Find the names of the students that attend in all of the classes that "John" (or anyone else) attends.
John attends three classes, so I have to find the students that also attend those three classes (could be more, but those three must be there). However, I won't always know how many classes John (or whoever) attends, so it can't be hardcoded like that.
SELECT jclass.CNO
FROM attendance jclass
INNER JOIN student on jclass.SNO = student.SNO
WHERE student.SNAME = 'John';
This gets me the classes that John attends. I tried to add the identifier for the other students:
SELECT student.SNAME
FROM student
INNER JOIN attendance on student.SNO = attendance.SNO
INNER JOIN class on attendance.CNO = class.CNO
WHERE student.SNAME <> 'John'
AND class.CNO IN (SELECT jclass.CNO
FROM attendance jclass
INNER JOIN student on jclass.SNO = student.SNO
WHERE student.SNAME = 'John');
However, this only gets me the students that appear in at least one of John's classes, rather than all of them. I can see why it's doing this, but I'm not sure how to fix it. It's the one big struggle I'm having with SQL.
Here is one way - assuming SNO is primary key in the first table, CNO is primary key in the second table, and (SNO, CNO) is (composite) primary key in the third table, and that the input student is given by a unique identifier (first name is distinctly NOT a unique identifier, so the problem stated in terms of giving "John" as the input makes no sense). Here I assume the "special" student is identified by SNO = 1001; you can make 1001 into a variable, or change it to a subquery that selects a (unique!!) SNO based on some other inputs.
I didn't try to make the query as efficient as possible, or use features you most likely haven't seen in your class. Rather, I tried to make it as elementary and as readable as possible.
select sno
from attendance
where cno in (select cno from attendance where sno = 1001)
group by sno
having count(*) = (select count(*) from attendance where sno = 1001)
;
The strategy is simple: the subquery in the in condition finds the classes attended by the "special" student, then from the attendance table we select only rows for those classes. Group by student, and count. Keep only the students for whom the count is equal to the total count for the "special" student. Note the last condition is about groups, not about input rows, so it belongs in the having clause.

SQL View only returns one row

I'm trying to create a view that returns the amount of courses each Californian student is enrolled in. There are 4 CA students listed in my 'Students' table, so it should return that many rows.
create or replace view s2018_courses as
select Students.*, COUNT(Current_Schedule.ID) EnrolledCourses
from Students, Current_Schedule
where Students.ID = Current_Schedule.ID AND state='CA';
This query, however, only returns a single row with one student's information, and the total number of courses all CA students are in (in this case 14, since each student is enrolled in 3-5 classes).
I created a view similar to this recently (in a different DB) and it worked well and executed multiple rows, so I'm not sure what is going wrong? Sorry if this is a confusing question, I'm new to SQL and StackOverflow!! Thank you in advance for any advice!
You are missing an aggregation step in your query/view:
CREATE OR REPLACE VIEW s2018_courses AS
SELECT
s.ID, COUNT(cs.ID) EnrolledCourses
FROM Students s
INNER JOIN Current_Schedule cs
ON s.ID = cs.ID
WHERE
state = 'CA'
GROUP BY
s.ID;
The logical problem with your current query is that you are using COUNT(*) without GROUP BY, and MySQL is interpreting this to mean that you want to take the count of the entire table. Note also that I only select ID in my query above, because this is what is being used to aggregate.

SQL help required

I have a question which asks:
Make a list of all data stored in da_students, da_enrolments and da_courses. Write a query that makes all columns appear only once in the output, the rows are sorted by students' first name ascending AND course_id descending so that students are grouped by course (without using a group by clause) and generated first/ last names( fname.., Lname...) do not appear in the output.
The current code I have is:
SELECT *
FROM DA_STUDENTS
FULL JOIN DA_ENROLMENTS ON student_id = student_student_id
FULL JOIN DA_COURSES ON course_course_id = course_id
ORDER BY first_name ASC, course_id DESC;
Any help please?
To start with: The sorting thing is not well explained. It even looks like it is indented to obfuscate. To sort by student name and course but have the students grouped by course in the end, simply means order by course and then by student.
Then: "generated first/last names ... do not appear in the output". What? How are they generated? Aren't they simple columns in the students table? As to showing something in the output or not: state what you want to show in your SELECT clause. Other columns will not be shown of course.
As to "Make a list of all data stored in da_students, da_enrolments and da_courses": This means all students, all courses plus the associtated enrolements:
from da_students s
cross join da_courses c
left join da_enrolments e on e.student_id = s.student_id and e.course_id = c.course_id
As to "Write a query that makes all columns appear only once in the output": This means, don't
select *
nor
select s.student_id, e.student_id, ...
but only
select s.student_id, ...
so as to show the student ID only once.

Cannot find correct number of values in a table that are not in another table, though I can do otherwise

I want to retrieve the course_id in table course that is not in the table takes. Table takes only contains course_id of courses taken by students. The problem is that if I have:
select count (distinct course.course_id)
from course, takes
where course.course_id = (takes.course_id);
the result is 85 which is smaller than the total number of course_id in table course, which is 200. The result is correct.
But I want to find the number of course_id that are not in the table takes, and I have:
select count (distinct course.course_id)
from course, takes
where course.course_id != (takes.course_id);
The result is 200, which is equal the number of course_id in table course. What is wrong with my code?
This SQL will give you the count of course_id in table course that aren't in the table takes:
select count (*)
from course c
where not exists (select *
from takes t
where c.course_id = t.course_id);
You didn't specify your DBMS, however, this SQL is pretty standard so it should work in the popular DBMSs.
There are a few different ways to accomplish what you're looking for. My personal favorite is the LEFT JOIN condition. Let me walk you through it:
Fact One: You want to return a list of courses
Fact Two: You want to
filter that list to not include anything in the Takes table.
I'd go about this by first mentally selecting a list of courses:
SELECT c.Course_ID
FROM Course c
and then filtering out the ones I don't want. One way to do this is to use a LEFT JOIN to get all the rows from the first table, along with any that happen to match in the second table, and then filter out the rows that actually do match, like so:
SELECT c.Course_ID
FROM
Course c
LEFT JOIN -- note the syntax: 'comma joins' are a bad idea.
Takes t ON
c.Course_ID = t.Course_ID -- at this point, you have all courses
WHERE t.Course_ID IS NULL -- this phrase means that none of the matching records will be returned.
Another note: as mentioned above, comma joins should be avoided. Instead, use the syntax I demonstrated above (INNER JOIN or LEFT JOIN, followed by the table name and an ON condition).

SQL Server query to know if an id in a column equals to other in another column

I’ve been hours trying to build this query and I need your help so I can make it.
This is table Students (made out of inner joins):
SpecialtyChosenID StudentID Subject SubjectSpecialtyID
5ABFB416-8137 15 Math A1EBF3CB-E899
5ABFB416-8137 15 English A1EBF3CB-E899
The info in it means that a student with id no. 15 has chosen an specialty with id 5ABFB416-8137
The two subjects he has passed (Math and English) belong to a specialty with id A1EBF3CB-E899
What would be the query to know if the passed subjects belong to the specialty chosen by the student??
Counting the number of subjects with the same SubjectSpecialtyID as SpecialtyChosenID and vice versa could do.
Thanks a lot
You can do a self join. This finds the number of subjects taken by the student that match the student's chosen specialities.
SELECT l.SpecialtyChosenID, l.StudentID, Count(Distinct r.Subject) FROM Students l
LEFT JOIN Students r ON (l.StudentID=r.StudentID AND l.SpecialityChosenID=r.SubjectSpecialityID)
GROUP BY l.SpecialtyChosenID, l.StudentID
However, this is quite inefficient using the table structure given. If you have a table listing students with their specialities, and another with subjects and specialities, and a third relating students with subjects, it would be better to build this query from the base data, rather than from the derived data.
SELECT * FROM Students WHERE SpecialtyChosenID = SubjectSpecialtyID
If you only need the list of matching subjects and you have the SpecialtyChosenID you can do something like
SELECT * FROM Students
WHERE SubjectSpecialtyID = SpecialityChosenID
CASE WHEN SpecialtyChosenID = SubjectSpecialtyID THEN 1 ELSE 0 END AS specialty