'for all' Queries in sql - sql

In class, professor said that SQL language does not provide 'for all' operator.
In order to use 'for all' you have to use 'not exist( X except Y)'
At this point, I can't figure out why 'for all' is same meaning as 'not exist( X except Y)'
I give you example relation:
course (cID,title,deptName,credit),
teaches (pID,cID,semester,year,classroom),
student (sID,name,gender,deptName)
Q: Find all student names who have taken all courses offered in 'CS' department
The answer is:
Select distinct
S.sid, S.name
from
student as S
where
not exists (
(select cID from course where deptName = 'CS')
except
(select T.cID from takes as T where S.sID = T.sID)
);
Can you give me specific explain about that?
ps. Sorry for my English skill

You professor is right. SQL has no direct way to query all records that have all possible relations of a certain type.
It's easy to query which relations of a certain type a record has. Just INNER JOIN the two tables and you are done.
But in an M:N relationship like "students" to "taken courses" it's not that simple.
To answer the question "which student has taken all possible courses" you must find out which relations could possibly exist and then make sure that all of them do actually exist.
select distinct
S.sid, S.name
from
student as S
where
not exists (
(select cID from course where deptName = 'CS')
except
(select T.cID from takes as T where S.sID = T.sID)
);
can be translated as
give me all students SELECT
for whom it is true: WHERE
that the following set is empty NOT EXISTS
(any course in 'CS') "all relations that can possibly exist"
minus EXCEPT
(all courses the student has taken) "the ones that do actually exist"
In other words: Of all possible relations there is no relation that does not exist.
There are other ways of expressing the same thought that can be used in database systems without support for EXCEPT.
For example
select
S.sid,
S.name
from
student as S
inner join takes as T on T.sID = S.sID
inner join course as C on C. cID = T. cID
where
c. deptName = 'CS'
group by
S.sid,
S.name
having
count(*) = (select count(*) from course where deptName = 'CS');

From your table definition and requirement its not clear what is the use of teaches table. You want the list of students names those have taken all courses offered by 'CS' department. For this students and course table is enough.
SELECT name
FROM
(
SELECT B.name, A.cid
FROM course A
INNER JOIN student B ON A.deptName = B.deptName
WHERE A.deptName = 'CS'
GROUP BY A.cid, B.name
) A
GROUP BY name
HAVING COUNT(name) >= (SELECT COUNT(cid) FROM course WHERE deptName = 'CS')
Internal query just selects all students those have taken any course offered by 'CS' dept and with group by I just make sure that in case a student take same course twice they will be counted as one row. Next I just select those students take all course offered by 'CS' dept.
I think you have some gap to understand your requirement properly. In your requirement no relation with teaches table is specified.

Q: Find all student names who have taken all courses offered in 'CS'
department
NOT EXISTS returns true if the query passed to it contains 0 records.
In this case, your sub-query from NOT EXISTS selects all the courses offered in 'CS', and subtract from this result set all the courses taken by specific student.
If the student have taken all the courses then except will remove all and the sub-query will return 0 records, which in pair with NOT EXISTS will give you true for specific student, and it will be displayed in final result set.

Brief history: Codd invented the Relational Model (RM), some people created a DBMS loosely based on RM to prove a RM product could be performant, and the SQL language emerged based on that DBMS (i.e. not directly based on the RM).
Codd came up with a set of primitive operators to define a database as being relationally complete. His algebra included product, where two relations are 'multiplied' together to give a combined relation; this made it into SQL as CROSS JOIN. [Side note: people refer to this operator as 'Cartesian product', which results in a set of ordered pairs. However, product in RM results in a relation (as do all relational operators), and CROSS JOIN results in a table expression (loosely speaking).]
Codd's algebra also included a division operator. I guess the thinking is, we should be able to take the result of product and one of the relations and use an operator to result in the other relation. But it does has some practical use too, of course. It is commonly expressed as 'the supplier who supplies all products', after Chris Date's parts and suppliers database found in his books. SQL lacks an explicit division operator, so we need to use other operators to get the desired result.
Note there are two flavours of division, being exact division ("suppliers who supply all the parts we are interested in and no more") and division with remainder ("suppliers who supply at least all the parts we are interested in and possibly more"). I tend to be wary of the answers here that do not mention either the name 'division' or that you need to decide whether you need to deal with remainders.
The thinking behind your professor's answer is that a double negative (in mathematics and English) i.e. if the statement "there is no part I don't supply" is true for a given supplier then that supplier will be in the result.
Note there are operators that Codd omitted (e.g. rename and summerize) that can now be found in SQL, so it's a shame we are still waiting for division!

Related

SQL question: how to find rows that share all of the same rows in a composite table?

I'm working on my SQL project using the Oracle database for class, and I'm asked a question that I see far too often.
You have three tables:
STUDENT: SNO, SNAME
CLASS: CNO, CNAME
ATTENDANCE: SNO, CNO, Grade
The question I keep finding is of a similar type: Find the names of the students that attend in all of the classes that "John" (or anyone else) attends.
John attends three classes, so I have to find the students that also attend those three classes (could be more, but those three must be there). However, I won't always know how many classes John (or whoever) attends, so it can't be hardcoded like that.
SELECT jclass.CNO
FROM attendance jclass
INNER JOIN student on jclass.SNO = student.SNO
WHERE student.SNAME = 'John';
This gets me the classes that John attends. I tried to add the identifier for the other students:
SELECT student.SNAME
FROM student
INNER JOIN attendance on student.SNO = attendance.SNO
INNER JOIN class on attendance.CNO = class.CNO
WHERE student.SNAME <> 'John'
AND class.CNO IN (SELECT jclass.CNO
FROM attendance jclass
INNER JOIN student on jclass.SNO = student.SNO
WHERE student.SNAME = 'John');
However, this only gets me the students that appear in at least one of John's classes, rather than all of them. I can see why it's doing this, but I'm not sure how to fix it. It's the one big struggle I'm having with SQL.
Here is one way - assuming SNO is primary key in the first table, CNO is primary key in the second table, and (SNO, CNO) is (composite) primary key in the third table, and that the input student is given by a unique identifier (first name is distinctly NOT a unique identifier, so the problem stated in terms of giving "John" as the input makes no sense). Here I assume the "special" student is identified by SNO = 1001; you can make 1001 into a variable, or change it to a subquery that selects a (unique!!) SNO based on some other inputs.
I didn't try to make the query as efficient as possible, or use features you most likely haven't seen in your class. Rather, I tried to make it as elementary and as readable as possible.
select sno
from attendance
where cno in (select cno from attendance where sno = 1001)
group by sno
having count(*) = (select count(*) from attendance where sno = 1001)
;
The strategy is simple: the subquery in the in condition finds the classes attended by the "special" student, then from the attendance table we select only rows for those classes. Group by student, and count. Keep only the students for whom the count is equal to the total count for the "special" student. Note the last condition is about groups, not about input rows, so it belongs in the having clause.

SQL Query Involving Finding Most Frequent Tuple Value in Column

I have the following relations:
teaches(ID,course_id,sec_id,semester,year)
instructor(ID,name,dept_name,salary)
I am trying to express the following as an SQL query:
Find the ID and name of the instructor who has taught the most courses(i.e has the most tuples in teaches).
My Query
select ID, name
from teaches
natural join instructor
group by ID
order by count(*) desc
I know this isn't correct, but I feel like I'm on the right track. In order to answer the question, you need to work with both relations, hence the natural join operation is required. Since the question asks for the instructor that has taught the most courses, that tells me that we are trying to count the number of times each instructor ID appears in the teaches relation. From what I understand, we are looking to count distinct instructor IDs, hence the group by command is needed.
Don't use natural joins: all they do is rely on column names to decide which columns relate across tables (they don't check for foreign keys constraints or the-like, as you would thought). This is unreliable by nature.
You can use a regular inner join:
select i.id, i.name
from teaches t
inner join instructor i on i.id = t.sec_id
group by i.id, i.name
order by count(*) desc
limit 1
Notes:
this assumes that column teaches.sec_id relates to instructor.id (I cannot see which other column could be used)
I added a limit clause to the query since you stated that you want the top instructor - the syntax may vary across databases
always prefix the column names with the table they belong to, to make the query unambiguous and easier to understand
it is a good practice (and a requirement in many databases) that in an aggregate query all non-aggregared columns listed in the select clause should appear in the group by clause; I added the instructur name to your group by clause

oracle 10g contains clause

I am new to SQL. I am trying to write a query which requires contains operator. I am looking at many examples for the contains clause, but they seem different than what I need to use. I want a divide operator(relational algebra) equivalent in sql.
a sample of what I am doing is :
Give the students names who are enrolled for computer architecture course and have satisfied all its prerequisites. I have a prereq table and a course table which lists all courses taken by student
so what I ultimately want is
(get all courses taken by a student) contains (get all prerequisits)
for what exactly shall I look for for sql equivalent of divide operation in relational algebra?
You can do this by comparing counts.
select s.StudentName
from StudentCourses s, PrereqCourses p
where s.CourseId = p.CourseId
group by s.StudentName
having count(*) = (select count(*) from PrereqCourses)
See: http://www.dba-oracle.com/t_sql_patterns_relational_division.htm

SQL WHERE <from another table>

Say you have these tables:
PHARMACY(**___id_pharmacy___**, name, addr, tel)
PHARMACIST(**___Insurance_number___**, name, surname, qualification, **id_pharmacy**)
SELLS(**___id_pharmacy___**, **___name___**, price)
DRUG(**___Name___**, chem_formula, **id_druggistshop**)
DRUGGISTSHOP(**___id_druggistshop___**, name, address)
I think this will be more specific.
So, I'm trying to construct an SQL statement, in which I will fetch the data from id_pharmacy and name FROM PHARMACY, the insurance_number, name, and surname columns from PHARMACIST, for all the pharmacies that sell the drug called Kronol.
And that's basically it. I know I'm missing the relationships in the code I wrote previously.
Note: Those column names which have underscores left and right to them are underlined(Primary keys).
The query you've written won't work in any DBMS that I know of.
You'll most likely want to use some combination of JOINs.
Since the exact schema isn't provided, consider this pseudo code, but hopefully it will get you on the right track.
SELECT PH.Ph_Number, PH.Name, PHCL.Ins_Number, PHCL.Name, PHCL.Surname
FROM PH
INNER JOIN PHCL ON PHCL.PH_Number = PH.Ph_Number
INNER JOIN MLIST ON MLIST.PH_Number = PH.PH_Number
WHERE MLIST.Name = "Andy"
I've obviously assumed some relationships between tables that may or may not exist, but hopefully this will be pretty close. The UNION operator won't work because you're selecting different columns and a different number of columns from the various tables. This is the wrong approach all together for what you're trying to do. It's also worth mentioning that a LEFT JOIN may or may not be a better option for you, depending on the exact requirements you're trying to meet.
Ok, try this query:
SELECT A.id_pharmacy, A.name AS PharmacyName, B.Insurance_number,
B.name AS PharmacistName, B.surname AS PharmacistSurname
FROM PHARMACY A
LEFT JOIN PHARMACIST B
ON A.id_pharmacy = B.id_pharmacy
WHERE A.id_pharmacy IN (SELECT id_pharmacy FROM SELLS WHERE name = 'Kronol')

MySQL 5.5 Database Query Help

I am having a few issues with a DB query.
I have two tables, students (Fields: FirstName, LastName, StdSSN), and teachers (TFirstName, TLastName, TSSN) which I've stripped down for this example. I need to perform a query that will return all the students except for the students that are teachers themselves.
I have the query
SELECT student.FirstName, student.LastName
FROM `student`,`teachers`
WHERE student.StdSSN=teachers.TSSN
Which gives me a list of all the teachers who are also students it does not provide me with a list of students who are not teachers, so I tried changing to:
SELECT student.FirstName, student.LastName
FROM `student`,`teachers`
WHERE student.StdSSN!=teachers.TSSN
Which gives me a list of all the students with many duplicate values so I am a little stuck here. How can I change things to return a list of all students who are not teachers? I was thinking INNER/OUTER/SELF-JOIN and was playing with that for a few hours but things became complicated and I did not accomplish anything so I've pretty much given up.
Can anyone give me any advice? I did see the query before and it was pretty simple, but I've failed somewhere.
Using NOT IN
SELECT s.*
FROM STUDENTS s
WHERE s.stdssn NOT IN (SELECT t.tssn
FROM TEACHERS t)
Using NOT EXISTS
SELECT s.*
FROM STUDENTS s
WHERE NOT EXISTS (SELECT NULL
FROM TEACHERS t
WHERE t.tssn = s.stdssn)
Using LEFT JOIN / IS NULL
SELECT s.*
FROM STUDENTS s
LEFT JOIN TEACHERS t ON t.tssn = s.stdssn
WHERE t.column IS NULL
I used "column" for any column in TEACHERS other than what is joined on.
Comparison:
If the column(s) compared are nullable (value can be NULL), NOT EXISTS is the best choice. Otherwise, LEFT JOIN/IS NULL is the best choice (for MySQL).