I'm refreshing my SQL with the online Stanford database class exercises, found here. Here is the problem:
"Find names and grades of students who only have friends in the same
grade. Return the result sorted by grade, then by name within each
grade."
We have a highschooler table, with the attributes name, grade, id. Also, the likes table has attributes id1 and id2. id1 and id2 in likes correspond to id in highschooler.
Based on the problem section this comes from, I can tell that I'll need to use subqueries, but I'm not sure where. How should I approach this problem? None of the currently suggested solutions work.
Here is my current SQL statement, that is not working correctly (ignoring sorting):
select distinct
student1.id,
student1.name,
student1.grade
from
highschooler student1,
highschooler student2,
friend
where not exists (select *
from friend
where student1.id = id1
and student2.id = id2
and student1.grade = student2.grade
and student1.id <> student2.id);
I assumed that friendship is symmetric: if A is B's friend, then B is also A's friend.
CREATE VIEW Temp
AS
SELECT id,name,grade,id2,[grd2] FROM highschooler
INNER JOIN Likes ON highschooler.id = Likes.id1
INNER JOIN (SELECT id as [id2t], grade as [grd2] from highschooler) a ON a.id2t = Likes.id2
UNION ALL
SELECT id,name,grade,[id1] as [id2],[grd2] FROM highschooler
INNER JOIN Likes ON highschooler.id = Likes.id2
INNER JOIN (SELECT id as [id2t], grade as [grd2] from highschooler) a ON a.id2t = Likes.id1
The Temp view gives me all the info I need.
CREATE VIEW PlayWithClassMate
AS
SELECT distinct id FROM Temp WHERE grade = grd2
The PlayWithClassMate view gives me every student who has at least one friend in the same grade (a student may also have friends who are not classmates).
CREATE VIEW IDResult
AS
SELECT id FROM (
SELECT id, COUNT(GRD2) as c FROM TEMP
WHERE id in (SELECT id FROM PlayWithClassMate)
GROUP BY ID) A
WHERE C>1
The IDResult view holds all the IDs the question asks for.
Now select whatever you need where the ID is in IDResult.
I don't think it's the best solution, and it may be the worst, but it works.
(Sorry about the terrible grammar.)
This is harder than it looks, because it requires preparing sets sequentially. But, there are a few ways to solve this one. Here's what quickly comes to mind:
First, find the friend-of-friend for everybody by grade producing something like:
[ID], [FoF ID], [Grade of FoF]
You really don't need [FoF ID], but it might help when debugging.
Then, as a second-order operation, you'll need to produce a list of [ID]s where [Grade of FoF] is equal to both the MAX() and MIN():
SELECT [ID], MAX([Grade of FoF]) AS A, MIN([Grade of FoF]) AS B FROM [the above] GROUP BY [ID] HAVING MAX([Grade of FoF]) = MIN([Grade of FoF])
UPDATE:
I realized that I should also add a condition to the final query: A=B and A=Grade. Then this solution works. Keep in mind: it only answers the question "Find names and grades of students who only have friends in the same grade." and it assumes friendship is one-directional. (Sorry, I had to leave something undone.)
For those that need to see some SQL, here you are. It's written for MS Access, but easily ported (start by removing the "()" in the inner-most query) to MySQL, PGSQL, or Oracle. Better still, no procedural extensions and no temp tables.
SELECT
name
FROM
(
SELECT
ID
,name
,grade
,min( friend_grade) as min_friend_grade
,max( friend_grade) as max_friend_grade
FROM
(
SELECT
hs1.ID
,hs1.name
,hs1.grade
,l.ID2 as friend_id
,hs2.name as friend_name
,hs2.grade as friend_grade
FROM
( highschooler hs1
INNER JOIN likes l ON (hs1.ID = l.ID1) )
INNER JOIN highschooler hs2 ON (l.ID2 = hs2.ID)
)FoF
GROUP BY
ID
,name
,grade
)FoF_max_min
WHERE
grade=min_friend_grade
AND min_friend_grade=max_friend_grade
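For anyone who wants to try the min/max idea without MS Access, here is a minimal sketch using Python's built-in sqlite3 module. The sample rows are made up for illustration, only the table and column names come from the question, and it keeps the same assumption that friendship is one-directional (only rows where the student is id1 are considered). It also adds the sorting the exercise asks for.

```python
import sqlite3

# Hypothetical sample data; only the table and column names come from the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE highschooler (id INTEGER PRIMARY KEY, name TEXT, grade INTEGER);
    CREATE TABLE likes (id1 INTEGER, id2 INTEGER);
    INSERT INTO highschooler VALUES
        (1, 'Ann', 9), (2, 'Bob', 9), (3, 'Cy', 10), (4, 'Dee', 10);
    -- Ann -> Bob (same grade); Bob -> Ann and Bob -> Cy (mixed); Cy -> Dee (same grade)
    INSERT INTO likes VALUES (1, 2), (2, 1), (2, 3), (3, 4);
""")

same_grade_only = conn.execute("""
    SELECT hs1.name, hs1.grade
    FROM highschooler hs1
    JOIN likes l ON hs1.id = l.id1
    JOIN highschooler hs2 ON l.id2 = hs2.id
    GROUP BY hs1.id, hs1.name, hs1.grade
    HAVING MIN(hs2.grade) = MAX(hs2.grade)   -- all friends share one grade
       AND MIN(hs2.grade) = hs1.grade        -- and it is the student's own grade
    ORDER BY hs1.grade, hs1.name
""").fetchall()

print(same_grade_only)
```

With this sample data Bob drops out because his friends span two grades, and Dee drops out because she has no outgoing rows in likes.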
The caveat here is I must complete this with only the following tools:
1. The basic SQL construct: SELECT FROM .. AS WHERE... Distinct is ok.
2. Set operators: UNION, INTERSECT, EXCEPT
3. Create temporary relations: CREATE VIEW... AS ...
4. Arithmetic operators like <, >, <=, == etc.
5. Subquery can be used only in the context of NOT IN or a subtraction operation. I.e. (select ... from... where not in (select...)
I can NOT use any join, limit, max, min, count, sum, having, group by, not exists, any exists, count, aggregate functions or anything else not listed in 1-5 above.
Schema:
People (id, name, age, address)
Courses (cid, name, department)
Grades (pid, cid, grade)
I satisfied the query but I used not exists (which I can't use). The sql below shows only people who took every class in the Courses table:
select People.name from People
where not exists
(select Courses.cid from Courses
where not exists
(select grades.cid from grades
where grades.cid = courses.cid and grades.pid = people.id))
Is there way to solve this by using not in or some other method that I am allowed to use? I've struggled with this for hours. If anyone can help with this goofy obstacle, I'll gladly upvote your answer and select your answer.
As Nick.McDermaid said you can use except to identify students that are missing classes and not in to exclude them.
1 Get the complete list with a cartesian product of people x courses. This is what grades would look like if every student has taken every course.
create view complete_view as
select people.id as pid, courses.cid as cid
from people, courses
2 Use except to identify students that are missing at least one class
create view missing_view as select distinct pid from (
select pid, cid from complete_view
except
select pid, cid from grades
) t
3 Use not in to select students that aren't missing any classes
select * from people where id not in (select pid from missing_view)
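To check the three steps end to end, here is a small sketch run through Python's sqlite3 module. The rows are invented for the test, and I wrote missing_view directly as an EXCEPT of two selects (no derived table), which I assume fits the "subquery only with NOT IN" rule better. The cross product uses the comma form, so no JOIN keyword is needed.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE People (id INTEGER, name TEXT, age INTEGER, address TEXT);
    CREATE TABLE Courses (cid INTEGER, name TEXT, department TEXT);
    CREATE TABLE Grades (pid INTEGER, cid INTEGER, grade TEXT);
    -- Hypothetical rows: Ann took both courses, Bob took only one.
    INSERT INTO People VALUES (1, 'Ann', 20, 'x'), (2, 'Bob', 21, 'y');
    INSERT INTO Courses VALUES (10, 'DB', 'CS'), (11, 'OS', 'CS');
    INSERT INTO Grades VALUES (1, 10, 'A'), (1, 11, 'B'), (2, 10, 'C');

    -- Step 1: every (person, course) pair that would exist if everyone took everything.
    CREATE VIEW complete_view AS
        SELECT People.id AS pid, Courses.cid AS cid FROM People, Courses;

    -- Step 2: pairs that are missing from Grades.
    CREATE VIEW missing_view AS
        SELECT pid, cid FROM complete_view
        EXCEPT
        SELECT pid, cid FROM Grades;
""")

# Step 3: people who do not appear in the "missing" set took every course.
took_everything = conn.execute("""
    SELECT name FROM People
    WHERE id NOT IN (SELECT pid FROM missing_view)
    ORDER BY name
""").fetchall()

print(took_everything)
```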
As Nick suggests, you can use EXCEPT in this case. Here is the sample:
select People.name from People
EXCEPT
select People.name from People AS p
join Grades AS g on g.pid = p.id
join Courses as c on c.cid = g.cid
You can turn the first NOT EXISTS into NOT IN by using a constant value.
select *
from People a
where 1 not in (
select 1
from courses b
...
I have two tables:
students(id, name, school_id)
schools(id, name)
I'm trying to use EXISTS in order to learn if there are any students that go to specific school, say "Harvard" for example. I know that EXISTS is used after WHERE but I'm wondering if I can do this:
SELECT EXISTS
(SELECT *
FROM students st, schools sch
WHERE st.school_id=sch.id AND sch.name="Harvard");
Is this query correct? I am working on MySQL Workbench and I don't get an error. But I don't know if it does what it's supposed to do.
If it's not, then what should I change? I just want to know if it's correct and if I can use this syntax in the future.
Note that the desired result is either yes or no (1 or 0).
How do I get this result?
Sorry if my question was unclear, I can edit it again if you still don't understand.
You could just get the count, then you would know if any exist - and if so, how many...
SELECT COUNT(*)
FROM students st, schools sch
WHERE st.school_id=sch.id AND sch.name="Harvard"
The EXISTS keyword is normally used as a pre-condition... i.e. "if not exists, then add this..."
This is a simple Select with a Join.
SELECT *
FROM Students S
JOIN Schools SC
ON S.School_id = SC.id
Where SC.Name = 'Harvard'
This will give you all the rows for the students that go to Harvard, if any. If you want a count, use SELECT COUNT(*) instead, or limit which columns are returned by naming the specific columns in the SELECT statement.
This will give you student names that go to Harvard. If you wish to see how many students learn at Harvard replace st.name with count(*).
Note that it doesn't matter what you put in the SELECT list inside an EXISTS subquery, because only the presence of rows is checked. Selecting a constant such as 1 is a common convention that makes this explicit.
SELECT
st.name
FROM
students st
WHERE
EXISTS(
SELECT 1 FROM schools s WHERE s.name= 'Harvard' AND s.id = st.school_id)
OR
SELECT
st.name
FROM
students st
INNER JOIN schools s ON
st.school_id = s.id
WHERE
s.name = 'Harvard'
Additional note: Code below would only yield result either true or false.
SELECT EXISTS ( <query> )
This means that below query would return true if there are any students that learn at Harvard.
SELECT EXISTS (
SELECT 1
FROM students st
INNER JOIN schools s ON st.school_id = s.id
WHERE s.name = 'Harvard' )
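To see the 1-or-0 result in practice, here is a minimal sketch. It uses Python's sqlite3 module as a stand-in for MySQL (an assumption on my part; SQLite also accepts SELECT EXISTS), the rows are invented, and school_has_students is a hypothetical helper name.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE schools (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE students (id INTEGER PRIMARY KEY, name TEXT, school_id INTEGER);
    -- Hypothetical rows: one student at Harvard, nobody at Yale.
    INSERT INTO schools VALUES (1, 'Harvard'), (2, 'Yale');
    INSERT INTO students VALUES (1, 'Ann', 1);
""")

def school_has_students(school_name):
    """Return 1 if at least one student attends the named school, else 0."""
    row = conn.execute("""
        SELECT EXISTS (
            SELECT 1
            FROM students st
            JOIN schools s ON st.school_id = s.id
            WHERE s.name = ?
        )
    """, (school_name,)).fetchone()
    return row[0]

harvard = school_has_students("Harvard")
yale = school_has_students("Yale")
print(harvard, yale)
```

The query always returns exactly one row, so there is no need to check for an empty result on the application side.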
I'm doing Stanford's introduction to DB course and this is one of the homework assignments. My code does the job well, but I don't like how I reused the same SELECT-FROM-JOIN part twice:
SELECT name, grade
FROM Highschooler
WHERE
ID IN (
SELECT H1.ID
FROM Friend
JOIN Highschooler AS H1
ON Friend.ID1 = H1.ID
JOIN Highschooler AS H2
ON Friend.ID2 = H2.ID
WHERE H1.grade = H2.grade
) AND
ID NOT IN (
SELECT H1.ID
FROM Friend
JOIN Highschooler AS H1
ON Friend.ID1 = H1.ID
JOIN Highschooler AS H2
ON Friend.ID2 = H2.ID
WHERE H1.grade <> H2.grade
)
ORDER BY grade, name
This is the SQL schema for the two tables used in the code:
Highschooler(ID int, name text, grade int);
Friend(ID1 int, ID2 int);
I had to query all the Highschoolers that only have friends in the same grade, and not in any other grades. Is there a way to write the code below only once, and reuse it for the two different WHERE clauses, = and <>?
SELECT H1.ID
FROM Friend
JOIN Highschooler AS H1
ON Friend.ID1 = H1.ID
JOIN Highschooler AS H2
ON Friend.ID2 = H2.ID
EDIT: We are required to provide SQLite code.
This is a "poster child" example for the WHERE EXISTS query:
SELECT name, grade
FROM Highschooler ME
WHERE EXISTS (
SELECT 1
FROM Friend F
JOIN Highschooler OTHER on F.ID2=OTHER.ID
WHERE F.ID1=ME.ID AND OTHER.Grade = ME.GRADE
)
AND NOT EXISTS (
SELECT 1
FROM Friend F
JOIN Highschooler OTHER on F.ID2=OTHER.ID
WHERE F.ID1=ME.ID AND OTHER.Grade <> ME.GRADE
)
An EXISTS condition is true if its SELECT returns one or more rows; otherwise, it is false. All you need to do is to correlate the inner subquery with the outer one (the F.ID1=ME.ID part), and add the remaining constraints that you need (the OTHER.Grade = ME.GRADE or the OTHER.Grade <> ME.GRADE) to your query.
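Since SQLite code is required, here is a quick sketch that runs the EXISTS / NOT EXISTS pair through Python's sqlite3 module. The rows are made up, and I assume friendships are stored in both directions; the ORDER BY is added to match the assignment.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Highschooler (ID INT, name TEXT, grade INT);
    CREATE TABLE Friend (ID1 INT, ID2 INT);
    INSERT INTO Highschooler VALUES (1,'Ann',9), (2,'Bob',9), (3,'Cy',10), (4,'Dee',10);
    -- Hypothetical friendships, stored in both directions:
    -- Ann-Bob (same grade), Bob-Cy (different grades), Cy-Dee (same grade)
    INSERT INTO Friend VALUES (1,2), (2,1), (2,3), (3,2), (3,4), (4,3);
""")

only_same_grade_friends = conn.execute("""
    SELECT name, grade
    FROM Highschooler ME
    WHERE EXISTS (              -- at least one friend in the same grade
        SELECT 1
        FROM Friend F
        JOIN Highschooler OTHER ON F.ID2 = OTHER.ID
        WHERE F.ID1 = ME.ID AND OTHER.grade = ME.grade
    )
    AND NOT EXISTS (            -- and no friend in any other grade
        SELECT 1
        FROM Friend F
        JOIN Highschooler OTHER ON F.ID2 = OTHER.ID
        WHERE F.ID1 = ME.ID AND OTHER.grade <> ME.grade
    )
    ORDER BY grade, name
""").fetchall()

print(only_same_grade_friends)
```

Bob and Cy are filtered out by the NOT EXISTS branch because each has one friend in a different grade.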
This is a typical type of question about groups related to an individual. When you are faced with such a question, one approach is to use joins (looking at things in pairs). Often a better approach is to use aggregation to look at the entire group at once.
The insight here is that if you have a group of friends and all are in the same grade, then the minimum and maximum grades will be the same.
That hint might be enough for you to write the query. If so, stop here.
The query that returns what you want is much simpler than what you were doing. You just need to look at the friends' grades:
SELECT f.id1
FROM Friend f JOIN
Highschooler fh
ON f.ID2 = fh.ID
GROUP BY f.id1
HAVING max(fh.grade) = min(fh.grade)
The HAVING clause ensures that all the friends' grades are the same (ignoring NULL values).
EDIT:
This version answers the question: Which highschoolers have friends all of whom are in the same grade. Your question is ambiguous. Perhaps you mean that the friends and the original person are all in the same grade. If so, then you can do so with a small modification. One way is to change the having clause to:
having max(fh.grade) = min(fh.grade) and
max(fh.grade) = (select grade from Highschooler h where h.ID = f.id1)
This checks that the friends and original person are all in the same grade.
Sometimes you can get more natural query shape when you turn some filtering joins into set operations like UNION or MINUS/EXCEPT. The query of yours could be for example written as (pseudo-code):
SELECT H.id
FROM Highschooler H
JOIN .... | has a friend
WHERE ... | in SAME grade
EXCEPT
SELECT H.id
FROM Highschooler H
JOIN .... | has a friend
WHERE ... | in OTHER grade
some SQL engines use keyword "MINUS", some use "EXCEPT".
But note that, much like UNION, this will execute both queries and then filter their results. This can perform differently than a single do-it-all query, though not necessarily worse. I often find it performs better, since 'excepting' over a single column, especially a sorted one, is very quick.
Also, if your DB engine permits, you might try to use a View or CTE to shorten your original query, but I do not see much sense in doing so, except for aesthetics.
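If you do want the join written only once, one way to combine the CTE and EXCEPT ideas is sketched below, run through Python's sqlite3 module since SQLite is required. The CTE name pairs, its column aliases, and the sample rows are my own choices for illustration, not part of the assignment.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Highschooler (ID INT, name TEXT, grade INT);
    CREATE TABLE Friend (ID1 INT, ID2 INT);
    INSERT INTO Highschooler VALUES (1,'Ann',9), (2,'Bob',9), (3,'Cy',10), (4,'Dee',10);
    -- Hypothetical friendships, stored in both directions.
    INSERT INTO Friend VALUES (1,2), (2,1), (2,3), (3,2), (3,4), (4,3);
""")

result = conn.execute("""
    WITH pairs AS (             -- the shared SELECT-FROM-JOIN, written once
        SELECT H1.ID AS ID, H1.grade AS grade, H2.grade AS friend_grade
        FROM Friend
        JOIN Highschooler AS H1 ON Friend.ID1 = H1.ID
        JOIN Highschooler AS H2 ON Friend.ID2 = H2.ID
    )
    SELECT name, grade
    FROM Highschooler
    WHERE ID IN (
        SELECT ID FROM pairs WHERE grade = friend_grade    -- has a friend in SAME grade
        EXCEPT
        SELECT ID FROM pairs WHERE grade <> friend_grade   -- has a friend in OTHER grade
    )
    ORDER BY grade, name
""").fetchall()

print(result)
```

Whether this is faster or slower than the original IN / NOT IN version depends on the engine and data, so treat it as a readability change first.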
Some databases support the minus keyword.
select whatever
from wherever
where id in
(select id
from somewhere
where something
minus
select id
from somewhere
where something else
)
Other databases support the same concept, but with the keyword except, instead of minus.
I'm trying to solve a seemingly simple problem, but I think I'm tripping over my understanding of how the EXISTS keyword works. The problem is simple (this is a dumbed-down version of the actual problem): I have a table of students and a table of hobbies. The students table has their student ID and name. Return only the students whose number of hobbies matches that of at least one other student (i.e. students who have a unique number of hobbies would not be shown).
So the difficulty I run into is working out how to compare the count of hobbies. What I have tried is this.
SELECT sa.studentnum, COUNT(ha.hobbynum)
FROM student sa, hobby ha
WHERE sa.studentnum = ha.studentnum
AND EXISTS (SELECT *
FROM student sb, hobby hb
WHERE sb.studentnum = hb.studentnum
AND sa.studentnum != sb.studentnum
HAVING COUNT(ha.hobbynum) = COUNT(hb.hobbynum)
)
GROUP BY sa.studentnum
ORDER BY sa.studentnum;
So what appears to be happening is that the count of hobbynums is identical each test, resulting in all of the original table being returned, instead of just those that match the same number of hobbies.
Not tested, but maybe something like this (if I understand the problem correctly):
WITH h AS (
SELECT DISTINCT studentnum, COUNT(hobbynum) OVER (PARTITION BY studentnum) AS student_hobby_ct
FROM hobby)
SELECT DISTINCT h1.studentnum, h1.student_hobby_ct
FROM h h1 JOIN h h2 ON h1.student_hobby_ct = h2.student_hobby_ct AND
h1.studentnum <> h2.studentnum;
I think that what your query would do is only return students who had at least one other student that had the same number of hobbies. But you're not returning anything about the students with whom they match. Is that intentional? I'd treat both queries as sub-queries and aggregate before a join on the counts. You could do several things... here it's returning the number of students that have matching hobby counts, but you could limit HAVING(COUNT(distinct sb.studentnum) = 0 to get the result your query seemed to return...
with xx as
(SELECT sa.studentnum, count(ha.hobbynum) hobbycount
FROM student sa inner join hobby ha
on sa.studentnum = ha.studentnum
group by sa.studentnum
)
select sa.studentnum, sa.hobbycount, count(distinct sb.studentnum) as matchcount
from
xx sa inner join xx sb on
sa.hobbycount = sb.hobbycount
where
sa.studentnum != sb.studentnum
GROUP by sa.studentnum, sa.hobbycount
ORDER BY sa.studentnum;
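For comparison, here is a sketch that keeps the EXISTS shape from the question but aggregates first, so each student's hobby count is computed once and then compared. It runs through Python's sqlite3 module with invented rows; the CTE name counts is just a label I picked.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE student (studentnum INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE hobby (studentnum INTEGER, hobbynum INTEGER);
    INSERT INTO student VALUES (1, 'Ann'), (2, 'Bob'), (3, 'Cy');
    -- Hypothetical rows: Ann and Bob have two hobbies each, Cy has one.
    INSERT INTO hobby VALUES (1, 100), (1, 101), (2, 100), (2, 102), (3, 100);
""")

shared_counts = conn.execute("""
    WITH counts AS (
        SELECT studentnum, COUNT(hobbynum) AS hobbycount
        FROM hobby
        GROUP BY studentnum
    )
    SELECT c1.studentnum, c1.hobbycount
    FROM counts c1
    WHERE EXISTS (              -- some other student has the same count
        SELECT 1
        FROM counts c2
        WHERE c2.hobbycount = c1.hobbycount
          AND c2.studentnum <> c1.studentnum
    )
    ORDER BY c1.studentnum
""").fetchall()

print(shared_counts)
```

Cy is left out because nobody else has exactly one hobby.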
don't know if this is possible.. I'm using sqlite3
schema:
CREATE TABLE docs (id integer primary key, name string);
CREATE TABLE revs (id integer primary key, doc_id integer, number integer);
I want to select every job joined with only one of its revisions, the one with the highest number. How can I achieve this?
Right now I'm doing a left join and getting everything and then I'm filtering it in the application, but this sucks..
(by the way, can you suggest me a good and easy introductory book on databases and how they work and maybe something about sql too..)
thanks!
try this
Select * From docs d
Join revs r
On r.doc_id = d.id
Where r.number =
(Select Max(number ) from revs
Where Doc_Id = d.Id)
or, if you want the docs with no Revs (Is this possible?)
Select * From docs d
Left Join revs r
On r.doc_id = d.id
And r.number =
(Select Max(number ) from revs
Where Doc_Id = d.Id)
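Here is a small sketch of the LEFT JOIN variant running on sqlite3 from Python, since that is the engine in the question. The rows are made up; doc 'c' has no revisions, to show what the outer join returns in that case.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE docs (id integer primary key, name string);
    CREATE TABLE revs (id integer primary key, doc_id integer, number integer);
    -- Hypothetical rows: doc 'a' has revisions 1 and 2, 'b' has one, 'c' has none.
    INSERT INTO docs VALUES (1, 'a'), (2, 'b'), (3, 'c');
    INSERT INTO revs VALUES (1, 1, 1), (2, 1, 2), (3, 2, 1);
""")

latest = conn.execute("""
    SELECT d.name, r.id, r.number
    FROM docs d
    LEFT JOIN revs r
           ON r.doc_id = d.id
          AND r.number = (SELECT MAX(number)   -- keep only the highest revision
                          FROM revs
                          WHERE doc_id = d.id)
    ORDER BY d.id
""").fetchall()

print(latest)
```

Docs without revisions come back with None in the revision columns; switching to an inner join drops them instead.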
Not sure if your engine supports this, but typically, you would do something like this in ANSI SQL:
SELECT docs.*
,revs.*
FROM docs
INNER /* LEFT works here also if you don't have revs */ JOIN revs
ON docs.id = revs.doc_id
AND revs.number IN (
SELECT MAX(number)
FROM revs
WHERE doc_id = docs.id
)
There are a number of ways to write equivalent queries, using common table expressions, correlated aggregate subqueries, etc.
select d.*, r.max_number
from docs d
left outer join (
select doc_id, max(number) as max_number
from revs
group by doc_id
) r on d.id = r.doc_id
Database Design : Database Design for Mere Mortals by Hernandez
SQL : The Practical SQL Handbook
If you want to hurt your head, any of the SQL books by Joe Celko.
Here is a very good list of books for Database Design
https://stackoverflow.com/search?q=database+book
If every job has revisions (e.g., starting with rev 0), I would use the same approach as OrbMan, but with an inner join. (If you are certain you are looking for a 1-to-1 match. why not let SQL know, too?)
select d.*, r.max_number
from docs d
inner join
(
select doc_id, max(number) as max_number
from revs
group by doc_id
) r on d.id = r.doc_id
I'd recommend "A Sane Approach to Database Design" as an excellent introduction to good design practices. (I am slightly biased. I wrote it. But hey, so far it has a 5-star average review on Amazon, none of which reviews were contributed by me or any friends or relatives.)