I am trying to use WHERE with a not-equal condition after joining two tables, but it does not work.
Example: I have a table with data on famous people and a separate table with their works. Some works can have several authors. So I want a table listing authors with their co-authors:
CREATE TABLE famous_people (id INTEGER PRIMARY KEY AUTOINCREMENT,
name TEXT,
profession TEXT,
birth_year INTEGER);
INSERT INTO famous_people (name, profession, birth_year)
VALUES ("Landau", "physicist", 1908);
INSERT INTO famous_people (name, profession, birth_year)
VALUES ("Lifshitz", "physicist", 1908);
INSERT INTO famous_people (name, profession, birth_year)
VALUES ("Fisher", "statistician", 1908);
INSERT INTO famous_people (name, profession, birth_year)
VALUES ("Ginzburg", "physicist", 1916);
INSERT INTO famous_people (name, profession, birth_year)
VALUES ("A. Strugatsky", "writer", 1925);
INSERT INTO famous_people (name, profession, birth_year)
VALUES ("B. Strugatsky", "writer", 1933);
CREATE TABLE works (id INTEGER PRIMARY KEY AUTOINCREMENT,
person_id INTEGER,
work TEXT);
INSERT INTO works (person_id, work)
VALUES (1, "Theoretical Physics");
INSERT INTO works (person_id, work)
VALUES (2, "Theoretical Physics");
INSERT INTO works (person_id, work)
VALUES (1, "Theory of Superconductivity");
INSERT INTO works (person_id, work)
VALUES (4, "Theory of Superconductivity");
INSERT INTO works (person_id, work)
VALUES (3, "Fisher test");
INSERT INTO works (person_id, work)
VALUES (5, "Roadside Picnic");
INSERT INTO works (person_id, work)
VALUES (6, "Roadside Picnic");
INSERT INTO works (person_id, work)
VALUES (5, "Hard to Be a God");
INSERT INTO works (person_id, work)
VALUES (6, "Hard to Be a God");
/* Co-authors */
SELECT a.name AS author, b.name AS coauthor FROM works
JOIN famous_people a
ON works.person_id = a.id
JOIN famous_people b
ON works.person_id = b.id;
It is OK, except that each author is also listed as their own co-author, so I am trying to filter those rows out by adding WHERE author <> coauthor as the last line. But what I get is a table with two columns: work and name. I get the same weird result with WHERE a.name <> b.name.
Funnily enough, WHERE author = coauthor works fine, but this is not what I want.
Expected result: a table with 2 columns:
author co-author
Landau Lifshitz
A. Strugatsky B. Strugatsky
Fisher NULL
Find all works that have two authors (using inner join on same work but different authors) and find all works that have one author (using not exists). Then combine the results:
SELECT w1.work, p1.name AS author, p2.name AS coauthor
FROM works AS w1
JOIN works AS w2 ON w1.work = w2.work AND w1.person_id < w2.person_id
JOIN famous_people AS p1 ON w1.person_id = p1.id
JOIN famous_people AS p2 ON w2.person_id = p2.id
UNION ALL
SELECT w1.work, p1.name, null
FROM works AS w1
JOIN famous_people AS p1 ON w1.person_id = p1.id
WHERE NOT EXISTS (
SELECT 1
FROM works AS w2
WHERE w2.work = w1.work AND w2.person_id <> w1.person_id
)
Demo on DB<>Fiddle
Your query cannot work. Keep in mind that a join works on rows. So in your WHERE clause you look at one works row, with one person ID, at a time. You join the person to that works row as a, and then you join the person to the same works row again as b. That is the same person twice, of course, because one works row refers to only one person.
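To see this row-by-row behaviour in practice, here is a minimal sketch using Python's sqlite3 module. It assumes SQLite and a two-author subset of the question's data with explicit ids; the names SETUP, QUERY and run are made up for the illustration.

```python
import sqlite3

# Two-author subset of the question's data. Explicit ids are an assumption
# made so the example does not depend on AUTOINCREMENT.
SETUP = """
CREATE TABLE famous_people (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE works (id INTEGER PRIMARY KEY, person_id INTEGER, work TEXT);
INSERT INTO famous_people (id, name) VALUES (1, 'Landau');
INSERT INTO famous_people (id, name) VALUES (2, 'Lifshitz');
INSERT INTO works (person_id, work) VALUES (1, 'Theoretical Physics');
INSERT INTO works (person_id, work) VALUES (2, 'Theoretical Physics');
"""

# The question's query: both joins hang off the same works row.
QUERY = """
SELECT a.name AS author, b.name AS coauthor
FROM works
JOIN famous_people a ON works.person_id = a.id
JOIN famous_people b ON works.person_id = b.id
{where}
ORDER BY a.name
"""

def run(where=""):
    """Run the query on a fresh in-memory database, optionally with a WHERE clause."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.executescript(SETUP)
        return conn.execute(QUERY.format(where=where)).fetchall()
    finally:
        conn.close()

pairs = run()                              # every author paired with themselves
filtered = run("WHERE a.name <> b.name")   # nothing survives the filter
print(pairs)
print(filtered)
```

Since a and b are always the same person, the not-equal filter removes every row instead of only the self-pairs.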
This shows another, minor, problem. You call this table works. I would consider "Theoretical Physics" a work. You do so too; you named the column work. But then, why is the same work twice in the works table? This must not be. A works table shall store works, i.e. one work per row. What you have is a work_author table actually, and a work is uniquely identified by its title. This kind of makes sense; a title may uniquely identify a work - as long as no other author happens to name their work "Theoretical Physics", too :-( And as long as there are no typos in the table either.
This would be a better model:
person (person_id, name, birth_year, ...)
work (work_id, title, year, ...)
work_author (work_id, person_id)
If you have a typo in a title in this model, there is one row where you correct it and the data stays intact.
Now you want to get the authors of a work. This is easily done with aggregation:
select w.*, group_concat(p.name) as authors
from work_author wa
join person p on p.person_id = wa.person_id
join work w on w.work_id = wa.work_id
group by w.work_id
order by w.work_id;
You forgot to tell us your DBMS. As you are using double quotes where it must be single quotes according to the SQL standard, and your DBMS doesn't complain, this may be MySQL. (You should still always use single quotes for string literals.) For MySQL the string aggregation function is GROUP_CONCAT, so guessing MySQL, I used that in my query. Other DBMS use STRING_AGG, LISTAGG or something else.
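For what it's worth, AUTOINCREMENT in the question's DDL is SQLite spelling, and SQLite has group_concat as well. A minimal sketch of the three-table model with Python's sqlite3 module follows; the table and column names are the ones proposed above, while the constraints and the small data subset are assumptions for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE person (person_id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE work (work_id INTEGER PRIMARY KEY, title TEXT NOT NULL);
-- One row per (work, author) pair; the composite key is an assumption.
CREATE TABLE work_author (
    work_id INTEGER NOT NULL REFERENCES work (work_id),
    person_id INTEGER NOT NULL REFERENCES person (person_id),
    PRIMARY KEY (work_id, person_id)
);
INSERT INTO person VALUES (1, 'Landau'), (2, 'Lifshitz'), (3, 'Fisher');
INSERT INTO work VALUES (1, 'Theoretical Physics'), (2, 'Fisher test');
INSERT INTO work_author VALUES (1, 1), (1, 2), (2, 3);
""")

rows = conn.execute("""
SELECT w.title, group_concat(p.name, ', ') AS authors
FROM work_author wa
JOIN person p ON p.person_id = wa.person_id
JOIN work w ON w.work_id = wa.work_id
GROUP BY w.work_id, w.title
ORDER BY w.work_id
""").fetchall()
conn.close()

# SQLite does not promise an order inside group_concat, so the names are
# split and sorted here before being compared or displayed.
authors_per_work = [(title, sorted(names.split(", "))) for title, names in rows]
print(authors_per_work)
```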
If you just want to show up to two authors per work, you can take the minimum and maximum name (and compare the two in order not to show the same author twice):
select
w.*,
min(p.name) as author1,
case when min(p.name) <> max(p.name) then max(p.name) end as author2
from ...
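A runnable sketch of the min/max idea, using Python's sqlite3 module. The FROM clause is left open above; reusing the joins from the aggregated query is my assumption, as are the sample rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE person (person_id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE work (work_id INTEGER PRIMARY KEY, title TEXT);
CREATE TABLE work_author (work_id INTEGER, person_id INTEGER);
INSERT INTO person VALUES (1, 'Landau'), (2, 'Lifshitz'), (3, 'Fisher');
INSERT INTO work VALUES (1, 'Theoretical Physics'), (2, 'Fisher test');
INSERT INTO work_author VALUES (1, 1), (1, 2), (2, 3);
""")

two_authors = conn.execute("""
SELECT w.title,
       min(p.name) AS author1,
       -- CASE without ELSE yields NULL when min and max are the same name
       CASE WHEN min(p.name) <> max(p.name) THEN max(p.name) END AS author2
FROM work_author wa            -- assumed: same joins as the aggregated query
JOIN person p ON p.person_id = wa.person_id
JOIN work w ON w.work_id = wa.work_id
GROUP BY w.work_id, w.title
ORDER BY w.work_id
""").fetchall()
conn.close()
print(two_authors)
```

A work with a single author gets NULL (None in Python) as author2. A work with three or more authors would silently lose the names in the middle, which is why this only fits the "up to two authors" case.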
UPDATE
In the comments you say that for every author you want to know all authors who worked with them. For this you need to join authors to authors based on their works. Still assuming MySQL:
select p1.name, group_concat(distinct p2.name) as others
from work_author wa1
join work_author wa2 on wa2.work_id = wa1.work_id
and wa2.person_id <> wa1.person_id
join person p1 on p1.person_id = wa1.person_id
join person p2 on p2.person_id = wa2.person_id
group by p1.name
order by p1.name;
Or not aggregated:
select distinct p1.name as person1, p2.name as person2
from work_author wa1
join work_author wa2 on wa2.work_id = wa1.work_id
and wa2.person_id <> wa1.person_id
join person p1 on p1.person_id = wa1.person_id
join person p2 on p2.person_id = wa2.person_id
order by p1.name, p2.name;
I changed the model as proposed by Thorsten Kettner and solved the task of matching authors with their co-authors as follows:
CREATE TABLE famous_people (id INTEGER PRIMARY KEY AUTOINCREMENT,
name TEXT,
profession TEXT,
birth_year INTEGER);
INSERT INTO famous_people (name, profession, birth_year)
VALUES ("Landau", "physicist", 1908);
INSERT INTO famous_people (name, profession, birth_year)
VALUES ("Lifshitz", "physicist", 1908);
INSERT INTO famous_people (name, profession, birth_year)
VALUES ("Fisher", "statistician", 1908);
INSERT INTO famous_people (name, profession, birth_year)
VALUES ("Ginzburg", "physicist", 1916);
INSERT INTO famous_people (name, profession, birth_year)
VALUES ("A. Strugatsky", "writer", 1925);
INSERT INTO famous_people (name, profession, birth_year)
VALUES ("B. Strugatsky", "writer", 1933);
CREATE TABLE works (id INTEGER PRIMARY KEY AUTOINCREMENT,
work TEXT,
subject TEXT);
INSERT INTO works (work, subject)
VALUES ("Theoretical Physics", "physics");
INSERT INTO works (work, subject)
VALUES ("Theory of Superconductivity", "physics");
INSERT INTO works (work, subject)
VALUES ("Fisher test", "statistics");
INSERT INTO works (work, subject)
VALUES ("Roadside Picnic", "scifi");
INSERT INTO works (work, subject)
VALUES ("Hard to Be a God", "scifi");
CREATE TABLE author_works (id INTEGER PRIMARY KEY AUTOINCREMENT,
work_id INTEGER,
author_id INTEGER);
INSERT INTO author_works (work_id, author_id) VALUES (1, 1);
INSERT INTO author_works (work_id, author_id) VALUES (1, 2);
INSERT INTO author_works (work_id, author_id) VALUES (2, 1);
INSERT INTO author_works (work_id, author_id) VALUES (2, 4);
INSERT INTO author_works (work_id, author_id) VALUES (3, 3);
INSERT INTO author_works (work_id, author_id) VALUES (4, 5);
INSERT INTO author_works (work_id, author_id) VALUES (4, 6);
INSERT INTO author_works (work_id, author_id) VALUES (5, 5);
INSERT INTO author_works (work_id, author_id) VALUES (5, 6);
/* List of authors and their works */
SELECT famous_people.name, works.work FROM author_works
JOIN famous_people
ON author_works.author_id = famous_people.id
JOIN works
ON works.id = author_works.work_id;
/* Authors and their co-authors */
SELECT DISTINCT a.name, b.name
FROM author_works aw1
JOIN author_works aw2
ON aw1.work_id = aw2.work_id
JOIN famous_people a
ON aw1.author_id = a.id
JOIN famous_people b
ON aw2.author_id = b.id
WHERE aw1.author_id <> aw2.author_id;
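The inner join above drops Fisher, who has no co-author, although the expected result in the question lists him with NULL. One possible variation, sketched with Python's sqlite3 module, moves the not-equal condition into a LEFT JOIN. Table names and data follow the script above; the explicit ids and the ordering are assumptions for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE famous_people (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE author_works (work_id INTEGER, author_id INTEGER);
INSERT INTO famous_people VALUES
    (1, 'Landau'), (2, 'Lifshitz'), (3, 'Fisher'),
    (4, 'Ginzburg'), (5, 'A. Strugatsky'), (6, 'B. Strugatsky');
INSERT INTO author_works VALUES
    (1, 1), (1, 2), (2, 1), (2, 4), (3, 3), (4, 5), (4, 6), (5, 5), (5, 6);
""")

coauthors = conn.execute("""
SELECT DISTINCT a.name AS author, b.name AS coauthor
FROM author_works aw1
JOIN famous_people a ON aw1.author_id = a.id
-- The not-equal test lives in the ON clause, so a work without a second
-- author still keeps its row, with NULLs on the right-hand side.
LEFT JOIN author_works aw2
       ON aw2.work_id = aw1.work_id AND aw2.author_id <> aw1.author_id
LEFT JOIN famous_people b ON aw2.author_id = b.id
ORDER BY author, coauthor
""").fetchall()
conn.close()

for row in coauthors:
    print(row)
```

Note that an author with one solo work and one shared work would get both a NULL row and a co-author row with this query.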
I'm wondering if it is possible to inner join an inner join with another inner join.
I have a database of 3 tables:
people
athletes
coaches
Every athlete or coach must exist in the people table, but there are some people who are neither coaches nor athletes.
What I am trying to do is find a list of people who are active (meaning play or coach) in at least 3 different sports. The definition of active is they are either coaches, athletes or both a coach and an athlete for that sport.
The person table would consist of (id, name, height)
the athlete table would be (id, sport)
the coaching table would be (id, sport)
I have created 3 inner joins which tell me who is both a coach and an athlete, who is just a coach and who is just an athlete.
This is done via inner joins.
For example,
1) who is both a coach and an athlete
select
person.id,
person.name,
coach.sport as 'Coaches and plays this sport'
from coach
inner join athlete
on coach.id = athlete.id
and coach.sport = athlete.sport
inner join person
on athlete.id = person.id
That brings up a list of everyone who both coaches and plays the same sport.
2) To find out who only coaches sports, I have used inner joins as below:
select
person.id,
person.name,
coach.sport as 'Coaches this sport'
from coach
inner join person
on coach.id = person.id
3) Then to find out who only plays sports, I've got the same as 2) but just tweaked the words
select
person.id,
person.name,
athlete.sport as 'Plays this sport'
from athlete
inner join person
on athlete.id = person.id
The end result is now I've got:
1) persons who both play and coach the same sport
2) persons who coach a sport
3) persons who play a sport
What I would like to know is how to find a list of people who play or coach at least 3 different sports? I can't figure it out because if someone plays and coaches a sport like hockey in table 1, then I don't want to count them in table 2 and 3.
I tried using these 3 inner joins to make a massive join table so that I could pick the distinct values but it is not working.
Is there an easier way to go about this without making sub-sub-queries?
What I would like to know is how to find a list of people who play /
coach at least 3 different sports? I can't figure it out because if
someone plays and coaches a sport like hockey in table 1, then I don't
want to count them in table 2 and 3.
You can do something like this:
select p.id,min(p.name) name
from
person p inner join
(
select id,sport from athlete
union
select id,sport from coach
)
ca
on ca.id=p.id
group by p.id
having count(ca.sport)>2
CREATE TABLE #person (Id INT, Name VARCHAR(50));
CREATE TABLE #athlete (Id INT, Sport VARCHAR(50));
CREATE TABLE #coach (Id INT, Sport VARCHAR(50));
INSERT INTO #person (Id, Name) VALUES(1, 'Bob');
INSERT INTO #person (Id, Name) VALUES(2, 'Carol');
INSERT INTO #person (Id, Name) VALUES(3, 'Sam');
INSERT INTO #athlete (Id, Sport) VALUES(1, 'Golf');
INSERT INTO #athlete (Id, Sport) VALUES(1, 'Football');
INSERT INTO #coach (Id, Sport) VALUES(1, 'Tennis');
INSERT INTO #athlete (Id, Sport) VALUES(2, 'Tennis');
INSERT INTO #coach (Id, Sport) VALUES(2, 'Tennis');
INSERT INTO #athlete (Id, Sport) VALUES(2, 'Swimming');
-- so Bob has 3 sports, Carol has only 2 (she both coaches and plays Tennis)
SELECT p.Id, p.Name
FROM
(
SELECT Id, Sport
FROM #athlete
UNION -- this has an implicit "distinct"
SELECT Id, Sport
FROM #coach
) a
INNER JOIN #person p ON a.Id = p.Id
GROUP BY p.Id, p.Name
HAVING COUNT(*) >= 3
-- returns 1, Bob
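The effect of the implicit DISTINCT can be checked by running the same count with UNION and with UNION ALL. This is a sketch with Python's sqlite3 module; it assumes SQLite, so plain table names replace the #temp tables, and the helper sport_counts is made up for the example.

```python
import sqlite3

SETUP = """
CREATE TABLE person (id INT, name TEXT);
CREATE TABLE athlete (id INT, sport TEXT);
CREATE TABLE coach (id INT, sport TEXT);
INSERT INTO person VALUES (1, 'Bob'), (2, 'Carol');
INSERT INTO athlete VALUES (1, 'Golf'), (1, 'Football'), (2, 'Tennis'), (2, 'Swimming');
INSERT INTO coach VALUES (1, 'Tennis'), (2, 'Tennis');
"""

def sport_counts(set_operator):
    """Count sports per person; set_operator is 'UNION' or 'UNION ALL'."""
    if set_operator not in ("UNION", "UNION ALL"):
        raise ValueError("unexpected set operator")
    conn = sqlite3.connect(":memory:")
    try:
        conn.executescript(SETUP)
        rows = conn.execute(f"""
            SELECT p.name, COUNT(*)
            FROM (SELECT id, sport FROM athlete
                  {set_operator}
                  SELECT id, sport FROM coach) a
            JOIN person p ON p.id = a.id
            GROUP BY p.id, p.name
        """).fetchall()
        return dict(rows)
    finally:
        conn.close()

print(sport_counts("UNION"))      # Carol's Tennis is counted once
print(sport_counts("UNION ALL"))  # Carol's Tennis is counted twice
```

With UNION ALL, Carol would wrongly reach the threshold of 3, because she both plays and coaches Tennis.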
I have created a script with some test data; it should work in your case.
The two results are connected in the subselect with UNION.
UNION returns only non-duplicate rows, so every sport is counted just once per person.
Finally, the result set is grouped by person.Person_id and person.name.
Due to the HAVING clause, only persons with 3 or more sports are returned.
CREATE TABLE person
(
Person_id int
,name varchar(50)
,height int
)
CREATE TABLE coach
(
id int
,sport varchar(50)
)
CREATE TABLE athlete
(
id int
,sport varchar(50)
)
INSERT INTO person VALUES
(1,'John', 130),
(2,'Jack', 150),
(3,'William', 170),
(4,'Averel', 190),
(5,'Lucky Luke', 180),
(6,'Jolly Jumper', 250),
(7,'Rantanplan ', 90)
INSERT INTO coach VALUES
(1,'Football'),
(1,'Hockey'),
(1,'Skiing'),
(2,'Tennis'),
(2,'Curling'),
(4,'Tennis'),
(5,'Volleyball')
INSERT INTO athlete VALUES
(1,'Football'),
(1,'Hockey'),
(2,'Tennis'),
(2,'Volleyball'),
(2,'Hockey'),
(4,'Tennis'),
(5,'Volleyball'),
(3,'Tennis'),
(6,'Volleyball'),
(6,'Tennis'),
(6,'Hockey'),
(6,'Football'),
(6,'Cricket')
SELECT person.Person_id
,person.name
FROM person
INNER JOIN (
SELECT id
,sport
FROM athlete
UNION
SELECT id
,sport
FROM coach
) sports
ON sports.id = person.Person_id
GROUP BY person.Person_id
,person.name
HAVING COUNT(*) >= 3
ORDER BY Person_id
The coaches and athletes, i.e. people who are coaches or athletes, are what is relevant to your answer. That is a union (rows in one or the other), not an (inner) join (rows in one and the other). (Although an outer join involves a union, so there is a complicated way to use it here.) But there's no point in getting that by unioning only-coaches, only-athletes and coach-athletes.
Idiomatic is to group & count the union of Athletes & Coaches.
select id
from (select * from Athletes union select * from Coaches) as u
group by id
having COUNT(*) >= 3
Alternatively, you want ids of people who coach or play a 1st sport and coach or play a 2nd sport and coach or play a 3rd sport where the sports are all different.
with u as (select * from Athletes union select * from Coaches)
select u1.id
from u u1
join u u2 on u1.id = u2.id
join u u3 on u2.id = u3.id
where u1.sport <> u2.sport and u2.sport <> u3.sport and u1.sport <> u3.sport
If you wanted names you would join that with People.
Is there any rule of thumb to construct SQL query from a human-readable description? (https://stackoverflow.com/a/33952141/3404097)
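One thing to watch with the triple self-join: each qualifying person comes back once per ordering of three different sports, so a DISTINCT is needed when joining to get names. A sketch with Python's sqlite3 module; the People columns and the sample rows are assumptions for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE People (id INT, name TEXT);
CREATE TABLE Athletes (id INT, sport TEXT);
CREATE TABLE Coaches (id INT, sport TEXT);
INSERT INTO People VALUES (1, 'Bob'), (2, 'Carol');
INSERT INTO Athletes VALUES (1, 'Golf'), (1, 'Football'), (2, 'Tennis'), (2, 'Swimming');
INSERT INTO Coaches VALUES (1, 'Tennis'), (2, 'Tennis');
""")

SELF_JOIN = """
WITH u AS (SELECT id, sport FROM Athletes UNION SELECT id, sport FROM Coaches)
SELECT {columns}
FROM u u1
JOIN u u2 ON u1.id = u2.id
JOIN u u3 ON u2.id = u3.id
JOIN People p ON p.id = u1.id
WHERE u1.sport <> u2.sport AND u2.sport <> u3.sport AND u1.sport <> u3.sport
"""

# Bob has exactly three sports, so he matches once per permutation of them.
raw_rows = conn.execute(SELF_JOIN.format(columns="p.name")).fetchall()
names = conn.execute(SELF_JOIN.format(columns="DISTINCT p.name")).fetchall()
conn.close()

print(len(raw_rows))
print(names)
```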
I usually don't ask for "scripts" but for mechanisms, but I think that in this case, if I see an example, I will understand the principle.
I have three tables as shown below:
and I want to get the columns from all three, plus a count of the number of episodes in each series and to get a result like this:
Currently, I am opening multiple DB threads and I am afraid that as I get more visitors on my site it will eventually respond really slowly.
Any ideas?
Thanks a lot!
First join all the tables together to get the columns. Then, to get a count, use a window function:
SELECT count(*) over (partition by seriesID) as NumEpisodesInSeries,
st.SeriesId, st.SeriesName, et.episodeID, et.episodeName,
ct.creatorID, ct.CreatorName
FROM series_table st join
episode_table et
ON et.ofSeries = st.seriesID join
creator_table ct
ON ct.creatorID = st.byCreator;
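A small sketch of the window-function approach with Python's sqlite3 module. It assumes an SQLite build with window functions (3.25 or later); the column names are taken from the query above, while the rows are made-up sample data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE creator_table (creatorID INTEGER PRIMARY KEY, CreatorName TEXT);
CREATE TABLE series_table (seriesID INTEGER PRIMARY KEY, SeriesName TEXT, byCreator INTEGER);
CREATE TABLE episode_table (episodeID INTEGER PRIMARY KEY, episodeName TEXT, ofSeries INTEGER);
INSERT INTO creator_table VALUES (1, 'Some Guy'), (2, 'Seth McFarlane');
INSERT INTO series_table VALUES (1, 'Friends', 1), (2, 'Family Guy', 2);
INSERT INTO episode_table VALUES
    (1, 'Joey', 1), (2, 'Ross', 1), (3, 'Phoebe', 1), (4, 'Stewie', 2);
""")

episodes = conn.execute("""
SELECT count(*) OVER (PARTITION BY st.seriesID) AS NumEpisodesInSeries,
       st.SeriesName, et.episodeName, ct.CreatorName
FROM series_table st
JOIN episode_table et ON et.ofSeries = st.seriesID
JOIN creator_table ct ON ct.creatorID = st.byCreator
ORDER BY et.episodeID
""").fetchall()
conn.close()

for row in episodes:
    print(row)
```

Every episode row keeps its own columns and also carries the episode count of its series, all in a single query and a single round trip to the database.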
Do your appropriate joins between the tables and their IDs as you would expect, and also join onto the result of a subquery that determines the total episode count using the Episodes table.
SELECT SeriesCount.NumEpisodes AS #OfEpisodesInSeries,
S.id AS SeriesId,
S.name AS SeriesName,
E.id AS EpisodeId,
E.name AS EpisodeName,
C.id AS CreatorId,
C.name AS CreatorName
FROM
Series S
INNER JOIN
Episodes E
ON E.seriesId = S.id
INNER JOIN
Creators C
ON S.creatorId = C.id
INNER JOIN
(
SELECT seriesId, COUNT(id) AS NumEpisodes
FROM Episodes
GROUP BY seriesId
) SeriesCount
ON SeriesCount.seriesId = S.id
SQL Fiddle Schema:
CREATE TABLE Series (id int, name varchar(20), creatorId int)
INSERT INTO Series VALUES(1, 'Friends', 1)
INSERT INTO Series VALUES(2, 'Family Guy', 2)
INSERT INTO Series VALUES(3, 'The Tonight Show', 1)
CREATE TABLE Episodes (id int, name varchar(20), seriesId int)
INSERT INTO Episodes VALUES(1, 'Joey', 1)
INSERT INTO Episodes VALUES(2, 'Ross', 1)
INSERT INTO Episodes VALUES(3, 'Phoebe', 1)
INSERT INTO Episodes VALUES(4, 'Stewie', 2)
INSERT INTO Episodes VALUES(5, 'Kevin Kostner', 3)
INSERT INTO Episodes VALUES(6, 'Brad Pitt', 3)
INSERT INTO Episodes VALUES(7, 'Tom Hanks', 3)
INSERT INTO Episodes VALUES(8, 'Morgan Freeman', 3)
CREATE TABLE Creators (id int, name varchar(20))
INSERT INTO Creators VALUES(1, 'Some Guy')
INSERT INTO Creators VALUES(2, 'Seth McFarlane')
Try this:
http://www.sqlfiddle.com/#!3/5f938/17
select min(ec.num) as NumEpisodes,s.Id,S.Name,
Ep.ID as EpisodeID,Ep.name as EpisodeName,
C.ID as CreatorID,C.Name as CreatorName
from Episodes ep
join Series s on s.Id=ep.SeriesID
join Creators c on c.Id=s.CreatorID
join (select seriesId,count(*) as Num from Episodes
group by seriesId) ec on s.id=ec.seriesID
group by s.Id,S.Name,Ep.ID,Ep.name,C.ID,C.Name
Thanks Gordon
I would do the following:
SELECT (SELECT Count(*)
FROM episodetbl e1
WHERE e1.ofseries = s.seriesid) AS "#ofEpisodesInSeries",
s.seriesid,
s.seriesname,
e.episodeid,
e.episodename,
c.creatorid,
c.creatorname
FROM seriestbl s
INNER JOIN creatortbl c
ON s.bycreator = c.creatorid
INNER JOIN episodetbl e
ON e.ofseries = s.seriesid
I'm studying SQL and can't seem to find an answer to this exercise.
Exercise: For all cases where the same reviewer rated the same movie twice and gave it a higher rating the second time, return the reviewer's name and the title of the movie.
I don't know how to compare 2 rows and then get the higher rating.
The tables' schemas are:
Movie ( mID, title, year, director )
English: There is a movie with ID number mID, a title, a release
year, and a director.
Reviewer ( rID, name )
English: The reviewer with ID number rID has a certain name.
Rating ( rID, mID, stars, ratingDate )
English: The reviewer rID gave the movie mID a number of stars rating (1-5) on a certain ratingDate.
Researching here in the forum, I've got as far as this:
select *
from rating a
join Reviewer rv on rv.rid = a.rid
where 1 < (select COUNT(*) from rating b
where b.rid = a.rid and b.mid = a.mid)
I'd also be glad to get an explanation of the code, since even the code above confuses me.
/* Create the schema for our tables */
create table Movie(mID int, title text, year int, director text);
create table Reviewer(rID int, name text);
create table Rating(rID int, mID int, stars int, ratingDate date);
/* Populate the tables with our data */
insert into Movie values(101, 'Gone with the Wind', 1939, 'Victor Fleming');
insert into Movie values(102, 'Star Wars', 1977, 'George Lucas');
insert into Movie values(103, 'The Sound of Music', 1965, 'Robert Wise');
insert into Movie values(104, 'E.T.', 1982, 'Steven Spielberg');
insert into Movie values(105, 'Titanic', 1997, 'James Cameron');
insert into Movie values(106, 'Snow White', 1937, null);
insert into Movie values(107, 'Avatar', 2009, 'James Cameron');
insert into Movie values(108, 'Raiders of the Lost Ark', 1981, 'Steven Spielberg');
insert into Reviewer values(201, 'Sarah Martinez');
insert into Reviewer values(202, 'Daniel Lewis');
insert into Reviewer values(203, 'Brittany Harris');
insert into Reviewer values(204, 'Mike Anderson');
insert into Reviewer values(205, 'Chris Jackson');
insert into Reviewer values(206, 'Elizabeth Thomas');
insert into Reviewer values(207, 'James Cameron');
insert into Reviewer values(208, 'Ashley White');
insert into Rating values(201, 101, 2, '2011-01-22');
insert into Rating values(201, 101, 4, '2011-01-27');
insert into Rating values(202, 106, 4, null);
insert into Rating values(203, 103, 2, '2011-01-20');
insert into Rating values(203, 108, 4, '2011-01-12');
insert into Rating values(203, 108, 2, '2011-01-30');
insert into Rating values(204, 101, 3, '2011-01-09');
insert into Rating values(205, 103, 3, '2011-01-27');
insert into Rating values(205, 104, 2, '2011-01-22');
insert into Rating values(205, 108, 4, null);
insert into Rating values(206, 107, 3, '2011-01-15');
insert into Rating values(206, 106, 5, '2011-01-19');
insert into Rating values(207, 107, 5, '2011-01-20');
insert into Rating values(208, 104, 3, '2011-01-02');
Something like this should work (there are other ways, too):
SELECT rev.name, m.title
FROM Reviewer rev
INNER JOIN Rating r1 on r1.rID = rev.rID
INNER JOIN Rating r2 on r2.rID = rev.rID and r2.mID = r1.mID
INNER JOIN Movie m on m.mID = r1.mID
WHERE r2.ratingDate > r1.ratingDate and r2.stars > r1.stars
Or, in this case, you can put all the conditions in the join (instead of the WHERE clause):
SELECT rev.name, m.title
FROM Reviewer rev
INNER JOIN Rating r1 on r1.rID = rev.rID
INNER JOIN Rating r2
on r2.rID = rev.rID
and r2.mID = r1.mID
and r2.ratingDate > r1.ratingDate
and r2.stars > r1.stars
INNER JOIN Movie m on m.mID = r1.mID
SqlFiddle (with your sample data)
Explanation: I suppose you know the JOIN syntax, so:
The trick is to join Rating twice.
Then the WHERE part checks whether there exists a row where one of the ratings (from the same reviewer on the same movie) has a later ratingDate and more stars. This checks "gave it a higher rating the second time".
If a reviewer has three ratings of a movie, the second with more stars than the first and the third with more than the second, the same name and title come back more than once; grouping by reviewer name and movie title (or using SELECT DISTINCT) avoids these duplicates. With your sample data this is not needed.
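To try the double join on Rating, here is a sketch with Python's sqlite3 module. It uses a subset of the question's data; the third rating passed in later is hypothetical and only there to show the duplicates, and the helper name improved is made up.

```python
import sqlite3

SETUP = """
CREATE TABLE Movie (mID INT, title TEXT);
CREATE TABLE Reviewer (rID INT, name TEXT);
CREATE TABLE Rating (rID INT, mID INT, stars INT, ratingDate DATE);
INSERT INTO Movie VALUES (101, 'Gone with the Wind'), (108, 'Raiders of the Lost Ark');
INSERT INTO Reviewer VALUES (201, 'Sarah Martinez'), (203, 'Brittany Harris');
INSERT INTO Rating VALUES
    (201, 101, 2, '2011-01-22'), (201, 101, 4, '2011-01-27'),
    (203, 108, 4, '2011-01-12'), (203, 108, 2, '2011-01-30');
"""

QUERY = """
SELECT {distinct} rev.name, m.title
FROM Reviewer rev
JOIN Rating r1 ON r1.rID = rev.rID
JOIN Rating r2 ON r2.rID = rev.rID AND r2.mID = r1.mID
JOIN Movie m ON m.mID = r1.mID
WHERE r2.ratingDate > r1.ratingDate AND r2.stars > r1.stars
"""

def improved(extra_ratings=(), distinct=False):
    """Reviewer/movie pairs where a later rating has more stars."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.executescript(SETUP)
        conn.executemany("INSERT INTO Rating VALUES (?, ?, ?, ?)", extra_ratings)
        sql = QUERY.format(distinct="DISTINCT" if distinct else "")
        return conn.execute(sql).fetchall()
    finally:
        conn.close()

sample = improved()
# Hypothetical third rating, not part of the question's data.
third = (201, 101, 5, "2011-02-01")
with_third = improved([third])
with_third_distinct = improved([third], distinct=True)
print(sample, len(with_third), with_third_distinct)
```

With the third rating there are three improving pairs (2 to 4, 2 to 5, 4 to 5), so the plain query repeats the row three times and DISTINCT collapses it to one.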
Start with getting all the reviewer/movie pairs that were rated exactly twice:
select rid, mid
from rating r
group by rid, mid
having count(*) = 2
Now the question is: are the two ratings the same, or is the second larger? To find out, join back to the ratings, but also include the two dates:
from (select rid, mid, min(ratingdate) as minratingdate, max(ratingdate) as maxratingdate
from rating r
group by rid, mid
having count(*) = 2
) twotimes join
rating r1
on r1.rid = twotimes.rid and r1.mid = twotimes.mid and r1.ratingdate = twotimes.minratingdate join
rating r2
on r2.rid = twotimes.rid and r2.mid = twotimes.mid and r2.ratingdate = twotimes.maxratingdate
This brings in the information about the two reviews. You can finish the query from here.
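One way the query might be finished, sketched with Python's sqlite3 module. Two things here are my assumptions: the groups are formed per reviewer and movie (to match "the same movie" in the exercise), and the final step is a plain stars comparison plus joins for the name and title. The data is a subset of the question's rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Movie (mID INT, title TEXT);
CREATE TABLE Reviewer (rID INT, name TEXT);
CREATE TABLE Rating (rID INT, mID INT, stars INT, ratingDate DATE);
INSERT INTO Movie VALUES
    (101, 'Gone with the Wind'), (106, 'Snow White'),
    (107, 'Avatar'), (108, 'Raiders of the Lost Ark');
INSERT INTO Reviewer VALUES
    (201, 'Sarah Martinez'), (203, 'Brittany Harris'), (206, 'Elizabeth Thomas');
INSERT INTO Rating VALUES
    (201, 101, 2, '2011-01-22'), (201, 101, 4, '2011-01-27'),
    (203, 108, 4, '2011-01-12'), (203, 108, 2, '2011-01-30'),
    (206, 107, 3, '2011-01-15'), (206, 106, 5, '2011-01-19');
""")

# Reviewer 206 rated two different movies once each, so those rows must not
# be treated as a first and second rating of the same movie.
higher_second_time = conn.execute("""
SELECT rv.name, m.title
FROM (SELECT rID, mID,
             MIN(ratingDate) AS minratingdate,
             MAX(ratingDate) AS maxratingdate
      FROM Rating
      GROUP BY rID, mID
      HAVING COUNT(*) = 2) twotimes
JOIN Rating r1 ON r1.rID = twotimes.rID AND r1.mID = twotimes.mID
              AND r1.ratingDate = twotimes.minratingdate
JOIN Rating r2 ON r2.rID = twotimes.rID AND r2.mID = twotimes.mID
              AND r2.ratingDate = twotimes.maxratingdate
JOIN Reviewer rv ON rv.rID = twotimes.rID
JOIN Movie m ON m.mID = twotimes.mID
WHERE r2.stars > r1.stars
""").fetchall()
conn.close()
print(higher_second_time)
```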
You can use GROUP BY and HAVING:
SELECT m.mId, m.Title, r.Name, ra.stars
FROM Movie m
JOIN (SELECT mId, rId, MAX(stars) stars
FROM Rating
GROUP BY mId, rId
HAVING COUNT(*) > 1) ra ON m.mId = ra.mId
JOIN Reviewer r ON ra.rId = r.rId
GROUP BY m.mId, m.Title, r.Name, ra.stars
This will return every movie that has multiple reviews from the same reviewer, together with the highest number of stars that reviewer gave it.
Here is the SQL Fiddle for testing.
Good luck.
NOTE: I accidentally put another question's sentence in here (massive apologies on my part), I have updated this post as of Wednesday 14th March at 23:21pm with the correct question.
I have spent a few hours trying to figure out this question without anyone's help but have realised I have wasted too much productive time and should've asked someone sooner. I had a decent crack at this and have come so close but cannot get the final solution I need. What I am supposed to get is:
For all cases where the same reviewer rated the same movie twice and
gave it a higher rating the second time, return the reviewer's name
and the title of the movie.
This is the query I managed to get here:
SELECT reviewer.name, movie.title, rating.stars
FROM (reviewer JOIN rating ON reviewer.rid = rating.rid)
JOIN movie ON movie.mid = rating.mid
GROUP BY reviewer.name
HAVING COUNT(*) >= 2
ORDER BY reviewer.name DESC
(I have a feeling there is a missing WHERE clause from the above query, but am not sure where to place it)
(From what I have learned, RIGHT and FULL OUTER JOINs are not currently supported in SQLite)
And here are the tables and data (in pictures)...
... And the DB code...
/* Delete the tables if they already exist */
drop table if exists Movie;
drop table if exists Reviewer;
drop table if exists Rating;
/* Create the schema for our tables */
create table Movie(mID int, title text, year int, director text);
create table Reviewer(rID int, name text);
create table Rating(rID int, mID int, stars int, ratingDate date);
/* Populate the tables with our data */
insert into Movie values(101, 'Gone with the Wind', 1939, 'Victor Fleming');
insert into Movie values(102, 'Star Wars', 1977, 'George Lucas');
insert into Movie values(103, 'The Sound of Music', 1965, 'Robert Wise');
insert into Movie values(104, 'E.T.', 1982, 'Steven Spielberg');
insert into Movie values(105, 'Titanic', 1997, 'James Cameron');
insert into Movie values(106, 'Snow White', 1937, null);
insert into Movie values(107, 'Avatar', 2009, 'James Cameron');
insert into Movie values(108, 'Raiders of the Lost Ark', 1981, 'Steven Spielberg');
insert into Reviewer values(201, 'Sarah Martinez');
insert into Reviewer values(202, 'Daniel Lewis');
insert into Reviewer values(203, 'Brittany Harris');
insert into Reviewer values(204, 'Mike Anderson');
insert into Reviewer values(205, 'Chris Jackson');
insert into Reviewer values(206, 'Elizabeth Thomas');
insert into Reviewer values(207, 'James Cameron');
insert into Reviewer values(208, 'Ashley White');
insert into Rating values(201, 101, 2, '2011-01-22');
insert into Rating values(201, 101, 4, '2011-01-27');
insert into Rating values(202, 106, 4, null);
insert into Rating values(203, 103, 2, '2011-01-20');
insert into Rating values(203, 108, 4, '2011-01-12');
insert into Rating values(203, 108, 2, '2011-01-30');
insert into Rating values(204, 101, 3, '2011-01-09');
insert into Rating values(205, 103, 3, '2011-01-27');
insert into Rating values(205, 104, 2, '2011-01-22');
insert into Rating values(205, 108, 4, null);
insert into Rating values(206, 107, 3, '2011-01-15');
insert into Rating values(206, 106, 5, '2011-01-19');
insert into Rating values(207, 107, 5, '2011-01-20');
insert into Rating values(208, 104, 3, '2011-01-02');
I have another relatively similar question like this, but if I get some help on this one I should be able to apply the patterns and techniques from this one to the next one.
Thanks in advance! :)
I have added an inner join with derived table that returns maximum stars per movie. Because of inner join between movies and ratings only movies with ratings will be retrieved. Join it back to main query to get maximum stars per movie.
Note: you stated that you wish to order by movie title but your query orders by reviewer.
SELECT reviewer.name, movie.title, rating.stars, maxStarsPerMovie.MaxStars
FROM (reviewer JOIN rating ON reviewer.rid = rating.rid)
JOIN movie ON movie.mid = rating.mid
join
(
select movie.mid, max(rating.stars) MaxStars
from movie
inner join rating
on movie.mid = rating.mid
group by movie.mid
) maxStarsPerMovie
on movie.mid = maxStarsPerMovie.mid
ORDER BY reviewer.name DESC
EDIT: requirements changed. This query returns the list of reviewers who changed their opinion at a later date in favor of the movie. It does so by joining ratings a second time, adding two filters on stars and date to the join.
SELECT reviewer.name, movie.title, rating.ratingDate, rating.stars,
newRating.ratingDate newRatingDate, newRating.Stars newRatingStars
FROM (reviewer JOIN rating ON reviewer.rid = rating.rid)
JOIN movie ON movie.mid = rating.mid
inner join rating newRating
on newRating.mid = movie.mid
and newRating.rid = reviewer.rid
and newRating.ratingdate > rating.ratingdate
and newRating.stars > rating.stars
ORDER BY reviewer.name, movie.title
From the description of the requirement:
Return the movie title and number of stars (sorted by movie title) For each movie that has at least one rating, and find the highest number of stars that movie received.
The reviewer details do not appear to be required - only the Movie and the maximum stars.
Therefore, I suggest:
SELECT movie.mid, MAX(movie.title) as title, MAX(rating.stars) as max_stars
FROM rating
JOIN movie ON movie.mid = rating.mid
GROUP BY movie.mid
ORDER BY 2, 1
You were probably doing this for Stanford's SQL mini-course, as I just did. Here's what I got for my answer (I had no experience with SQL prior to watching the lectures, so hopefully this isn't too terrible):
Start with a query that finds each rID for a reviewer who rated a movie twice and gave it a higher score the second time:
select R1.rID from Rating R1, Rating R2 where R1.mID = R2.mID and R1.rID = R2.rID
and R1.ratingDate < R2.ratingDate and R2.stars > R1.stars;
Think of R1 as the first rating of a particular movie by a particular reviewer, and R2 as the second.
We need to be talking about 2 reviews of the same movie by the same person, hence R1.mID = R2.mID and R1.rID = R2.rID. Next, to make sure that R1 was indeed first, chronologically, we set R1.ratingDate < R2.ratingDate, and to make sure that R2 was indeed given a greater score, we set R2.stars > R1.stars.
You can check to see that this gives us rID 201, which is the correct answer (can verify by checking the data). Now we need to display the movie reviewer name and title instead of the rID.
I did this by doing the cross product of all 3 relations (I suppose using joins would be cleaner?), removing duplicates and using the query listed above as a where clause subquery:
select distinct name, title
from Movie, Reviewer, Rating
where Movie.mID = Rating.mID and Reviewer.rID = Rating.rID and Rating.rID in (select R1.rID from
Rating R1, Rating R2 where R1.mID =R2.mID and R1.rID = R2.rID and R1.ratingDate < R2.ratingDate
and R2.stars > R1.stars);
In the where clause I just made the cross products into natural joins by setting the respective mIDs and rIDs equal, and made sure that the rIDs (called Rating.rID for disambiguation) were determined by the initial query that I wrote.
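A possible caveat, sketched with Python's sqlite3 module: filtering on Rating.rID alone keeps every movie that reviewer rated, not only the re-rated one. The Star Wars rating below is a hypothetical extra row added only to show this, and the row-value form of IN assumes SQLite 3.15 or later.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Movie (mID INT, title TEXT);
CREATE TABLE Reviewer (rID INT, name TEXT);
CREATE TABLE Rating (rID INT, mID INT, stars INT, ratingDate DATE);
INSERT INTO Movie VALUES (101, 'Gone with the Wind'), (102, 'Star Wars');
INSERT INTO Reviewer VALUES (201, 'Sarah Martinez');
INSERT INTO Rating VALUES
    (201, 101, 2, '2011-01-22'), (201, 101, 4, '2011-01-27');
-- Hypothetical extra rating, not part of the original data set.
INSERT INTO Rating VALUES (201, 102, 3, '2011-01-25');
""")

TEMPLATE = """
SELECT DISTINCT name, title
FROM Movie, Reviewer, Rating
WHERE Movie.mID = Rating.mID AND Reviewer.rID = Rating.rID
  AND {left} IN (SELECT {right}
                 FROM Rating R1, Rating R2
                 WHERE R1.mID = R2.mID AND R1.rID = R2.rID
                   AND R1.ratingDate < R2.ratingDate AND R2.stars > R1.stars)
ORDER BY title
"""

# Matching on the reviewer only, as in the query above.
by_reviewer = conn.execute(
    TEMPLATE.format(left="Rating.rID", right="R1.rID")).fetchall()
# Matching on the (reviewer, movie) pair instead.
by_pair = conn.execute(
    TEMPLATE.format(left="(Rating.rID, Rating.mID)", right="R1.rID, R1.mID")).fetchall()
conn.close()

print(by_reviewer)
print(by_pair)
```

On the original sample data both forms give the same answer, because reviewer 201 rated only one movie.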
I would suggest using a self join in a subquery to identify the reviewer, and then using the result to get the name of the reviewer and the movie title.
SELECT name, title
FROM Reviewer R
JOIN (SELECT Ra.mID AS mID, Ra.rID AS rID
FROM Rating Ra
JOIN Rating Rb ON Ra.mID = Rb.mID
WHERE Ra.rID = Rb.rID
AND Ra.ratingDate < Rb.ratingDate
AND Ra.stars < Rb.stars) AS Tmp ON R.rID = Tmp.rID
JOIN Movie M ON M.mID = Tmp.mID