The task: Obtain a list, in alphabetical order, of actors who've had at least 30 starring roles.
My code:
select name, count(ord=1)
from casting
join actor on actorid=actor.id
where ord=1 and count(ord=1) and exists ( select 1 from casting
where count(movieid)>=30)
group by actorid,name
order by name
It gives me an error: invalid use of group by function.
Join the tables, group by actor and put the condition in the having clause.
select
a.name,
sum(case c.ord when 1 then 1 else 0 end) starringroles
from actor a inner join casting c
on c.actorid = a.id
group by a.id, a.name
having sum(case c.ord when 1 then 1 else 0 end) >= 30
order by a.name
The expression sum(case c.ord when 1 then 1 else 0 end) will count the number of starring roles (with ord = 1).
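To see the conditional count in action, here is a minimal sketch that runs the same query shape against an in-memory SQLite database from Python. The toy rows and the MIN_LEADS threshold are made up for illustration (the exercise uses 30), and SQLite is only an assumption here, not the engine SQLZoo runs.

```python
import sqlite3

# Toy data, invented for illustration; this is not the SQLZoo dataset.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE actor   (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE casting (movieid INTEGER, actorid INTEGER, ord INTEGER);
    INSERT INTO actor VALUES (1, 'Ann'), (2, 'Bob'), (3, 'Cy');
    INSERT INTO casting VALUES
        (10, 1, 1), (11, 1, 1), (12, 1, 2),
        (10, 2, 2), (11, 2, 3),
        (12, 3, 1), (13, 3, 1), (14, 3, 1);
""")

MIN_LEADS = 2  # hypothetical threshold so the toy data produces output

# Each row contributes 1 when ord = 1 and 0 otherwise, so the SUM is the
# number of starring roles per actor; HAVING then filters the groups.
rows = conn.execute("""
    SELECT a.name,
           SUM(CASE c.ord WHEN 1 THEN 1 ELSE 0 END) AS starringroles
    FROM actor a
    JOIN casting c ON c.actorid = a.id
    GROUP BY a.id, a.name
    HAVING SUM(CASE c.ord WHEN 1 THEN 1 ELSE 0 END) >= ?
    ORDER BY a.name
""", (MIN_LEADS,)).fetchall()

print(rows)
```

Bob has casting rows but none with ord = 1, so his sum is 0 and the HAVING clause drops him.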
You cannot use an aggregate function in the WHERE clause; you need HAVING for that.
select name, count(*)
from casting
join actor on actorid=actor.id
where ord=1
group by actorid,name
having count(movieid)>=30
order by name
select MAX(name) AS name, count(*) AS roles
from casting
join actor on actorid=actor.id
where ord=1
group by actorid
HAVING COUNT(*)>=30
order by name;
This is a question in SQLZoo, Question # 13. The picture below explains the tables:
A little description about the database:
This database features two entities (movies and actors) in a many-to-many relation. Each entity has its own table. A third table, casting, is used to link them. The relationship is many-to-many because each film features many actors and each actor has appeared in many films.
Here is the answer I found working:
select A.name
from actor A
inner join casting C
on C.actorid = A.id
where C.ord =1 /*only the lead roles*/
group by A.id /*grouped by Actor ID*/
having count(C.movieid) >=15 /*at least 15 starring roles*/
order by A.name /* in alphabetical order*/
SELECT actor.name FROM actor
JOIN casting ON actor.id=casting.actorid
WHERE casting.actorid IN (SELECT actorid FROM casting WHERE ord =1
GROUP BY actorid
HAVING COUNT(ord) >=15)
GROUP BY name
I have to write a query that, for each movie genre, counts the tickets whose purchases consist only of movies of that genre. At the end, I have to return each movie genre and the number of tickets bought for it. I have written a query, but I was wondering whether it can be made shorter and more compact.
Following is the database schema:
movies(movieId, movieGenre, moviePrice)
tickets(ticketId, ticketDate, customerId)
details(ticketId, movieId, numOfTickets)
Here is my query:
select m.genre, count(*)
from(select t.ticketId, m.genre
from(select d.ticketId
from(select m.genre, t.ticketId
from tickets t join details d on t.ticketId =
d.ticketId join movies m on d.movieId = m.movieId
group by m.genre, t.ticketId) d
group by d.ticketId
having count(*) = 1) as t join details d on t.ticketId =
d.ticketId join movies m on d.movieId = m.movieId
group by t.ticketId, m.genre) m
group by m.genre;
This runs on a database so I am only able to post sample output:
comedy 29821
action 27857
rom-com 19663
I see no reason to use the table tickets, because the results do not filter or aggregate by ticketDate or customerId. Thus, a shorter query is
SELECT m.moviegenre,
Sum(d.numoftickets) as SumNum
FROM details d
LEFT JOIN movies m
ON d.movieid = m.movieid
GROUP BY m.moviegenre
HAVING SumNum > 0
ORDER BY m.moviegenre
added 3/28 am
I am not sure what is meant by duplicates in the table details(ticketId, movieId, numOfTickets).
I would expect ticketId to be unique, so what would explain duplicates?
Is the same ticketId being recorded more than once?
Find the ticketId values that appear more than once--
SELECT ticketId, count(*) as cnt
FROM details d
GROUP By ticketId
HAVING count(*) > 1
Find the "details" rows that appear more than once--
SELECT ticketId, movieId, numOfTickets, count(*) as cnt
FROM details d
GROUP By ticketId, movieId, numOfTickets
HAVING count(*) > 1
Then again, it may be that the table movies(movieId, movieGenre, moviePrice) is the one with duplicates.
Find the movieId values that appear more than once--
SELECT movieId, count(*) as cnt
FROM movies m
GROUP BY movieId
HAVING count(*) > 1
Remove duplicates from details--
SELECT m.moviegenre,
Sum(d.numoftickets) as SumNum
FROM
(Select Distinct * From details) d
LEFT JOIN movies m
ON d.movieid = m.movieid
GROUP BY m.moviegenre
ORDER BY m.moviegenre
movie:
id ,title,yr,director,budget,gross
actor:
id,name
casting:
movieid,actorid,ord
I have a question about this schema. I have been asked to list the films together with the leading star for all 1962 films. [Note: the ord field of casting gives the position of the actor. If ord=1 then this actor is in the starring role]
My answer was this:
select
title, name
from
movie
join
casting on movie.id = casting.movieid
join
actor on actor.id = casting.actorid
where
yr = 1962 and movie.id = casting.movieid and actor.id = casting.actorid and casting.ord = 1
group by
title
My query gets close to the answer, but the problem is the ord condition: some films have no casting row with ord = 1 (their lowest value is 2), so those films do not appear in the output.
How can I make it select ord = 1 or ord = 2, but not both, with 1 taking priority?
I hope someone can help me with this.
select m.title, coalesce(a1.name, a2.name) as actorname from movie m
left join (select * from casting where ord=1) c1 on m.id = c1.movieid
left join (select * from casting where ord=2) c2 on m.id = c2.movieid
left join actor a1 on a1.id = c1.actorid
left join actor a2 on a2.id = c2.actorid
where m.yr=1962
Hope this will output your expected result
Select M.title, A.name
From Movie M
Cross Apply (Select top 1 C.actorid from Casting C Where C.movieid = M.id Order By C.ord) as TopCast
Inner join Actor A On A.id = TopCast.actorid
Where M.yr = 1962
Use CROSS APPLY (http://technet.microsoft.com/en-us/library/ms175156(v=sql.105).aspx) with a subquery which is ordered by the 'ord' column and has a TOP(1) clause.
SELECT
M.title
, CA.name
FROM
movie AS M
CROSS APPLY (
SELECT TOP(1)
A.name
FROM
casting C
INNER JOIN actor A
ON C.actorid = A.id
WHERE
M.id = C.movieid
ORDER BY
ord
ASC
) AS CA
WHERE
M.yr = 1962
I have 2 answers:
SELECT movie.title, (select actor.name from actor where actor.id = casting.actorid)
FROM movie
JOIN casting
ON movie.id =casting.movieid
WHERE movie.yr = 1962 and casting.ord = 1
But then I realized that you can chain joins:
SELECT movie.title, actor.name
FROM movie
JOIN casting ON movie.id=casting.movieid
JOIN actor ON casting.actorid = actor.id
WHERE movie.yr = 1962 and casting.ord = 1
The second one is clearly much simpler. (There's no need to nest a SELECT statement).
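To have it accept either value, add ... and (casting.ord = 1 xor casting.ord = 2). Since a single row cannot have both values, this is the same as casting.ord in (1, 2), and it returns both rows when a film has both.
To give 1 priority, try something like order by casting.ord. (I haven't tested that.)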
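One way to get "ord = 1, otherwise ord = 2, but never both" is to keep only the casting row with the lowest ord among 1 and 2 for each film. The sketch below tries that idea with a correlated MIN subquery on an in-memory SQLite database from Python; the films and actors are invented toy data, and this is a swapped-in technique rather than anything proposed above.

```python
import sqlite3

# Toy data, invented for illustration; this is not the SQLZoo dataset.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE movie   (id INTEGER PRIMARY KEY, title TEXT, yr INTEGER);
    CREATE TABLE actor   (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE casting (movieid INTEGER, actorid INTEGER, ord INTEGER);
    INSERT INTO movie VALUES (1, 'Film A', 1962), (2, 'Film B', 1962), (3, 'Film C', 1970);
    INSERT INTO actor VALUES (1, 'Ann'), (2, 'Bob'), (3, 'Cy');
    INSERT INTO casting VALUES
        (1, 1, 1), (1, 2, 2),
        (2, 3, 2), (2, 1, 3),
        (3, 2, 1);
""")

# For each 1962 film, keep the casting row whose ord equals the smallest
# ord found among {1, 2} for that film: 1 if present, otherwise 2.
rows = conn.execute("""
    SELECT m.title, a.name
    FROM movie m
    JOIN casting c ON c.movieid = m.id
    JOIN actor a   ON a.id = c.actorid
    WHERE m.yr = 1962
      AND c.ord = (SELECT MIN(c2.ord)
                   FROM casting c2
                   WHERE c2.movieid = m.id
                     AND c2.ord IN (1, 2))
    ORDER BY m.title
""").fetchall()

print(rows)
```

Film A has both ord 1 and ord 2 and returns only its ord 1 actor; Film B has no ord 1 row and falls back to ord 2.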
This question is from SQLZoo. I wrote the following code, but I feel it is too redundant:
SELECT year, freq
FROM (SELECT yr AS year,count(title) AS freq
FROM movie, actor, casting
WHERE name= 'John Travolta'
AND movie.id=movieid
AND actor.id=actorid
GROUP BY yr) AS a
WHERE freq=(
SELECT MAX(freq)
FROM (SELECT yr AS year,count(title) AS freq
FROM movie, actor, casting
WHERE name= 'John Travolta'
AND movie.id=movieid
AND actor.id=actorid
GROUP BY yr) AS b
)
Why can't it be written like this?
SELECT year, freq
FROM (SELECT yr AS year,count(title) AS freq
FROM movie, actor, casting
WHERE name= 'John Travolta'
AND movie.id=movieid
AND actor.id=actorid
GROUP BY yr) AS a
WHERE freq=(
SELECT MAX(freq)
FROM a
)
In cases like this it may be helpful to use CTEs (Common Table Expressions). A CTE lets you reference the same subquery more than once within a single statement. Look how you can use ROW_NUMBER to find the largest frequency. Note that ROW_NUMBER returns a single row even when several years tie for the top count; RANK would keep the ties. I have also updated the old-school FROM A, B, C WHERE ... to the new-school FROM A INNER JOIN B ... (I'm not 100% sure the JOIN criteria are correct, though.)
WITH a AS
(
SELECT
yr AS year,
COUNT(title) AS freq
FROM
movie
INNER JOIN
casting ON movie.id = casting.movieid
INNER JOIN
actor ON actor.id = casting.actorid
WHERE
name = 'John Travolta'
GROUP BY
yr),
b AS
(
SELECT
year, freq,
ROW_NUMBER() OVER (ORDER BY freq DESC) as RowNum
FROM a
)
SELECT year, freq
FROM b
WHERE RowNum = 1
A derived-table alias such as "a" is visible only at the query level where it is declared; a nested subquery cannot use it as a table in its own FROM clause. In your second query the subquery refers to "a" as if it were a table, but no such table exists. That is the reason you get an error in the second query and cannot use it. The first query is the syntactically correct one.
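The difference can be checked with a small experiment. The sketch below uses an in-memory SQLite database from Python with invented toy rows: the first attempt reuses the derived-table alias inside a nested subquery and is expected to be rejected, while the CTE version can name "a" twice. Other engines word the error differently, so the message itself is only captured, not relied upon.

```python
import sqlite3

# Toy data, invented for illustration; this is not the SQLZoo dataset.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE movie   (id INTEGER PRIMARY KEY, title TEXT, yr INTEGER);
    CREATE TABLE actor   (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE casting (movieid INTEGER, actorid INTEGER, ord INTEGER);
    INSERT INTO actor VALUES (1, 'John Travolta');
    INSERT INTO movie VALUES
        (1, 'M1', 1990), (2, 'M2', 1990), (3, 'M3', 1991),
        (4, 'M4', 1992), (5, 'M5', 1992);
    INSERT INTO casting VALUES (1, 1, 1), (2, 1, 1), (3, 1, 1), (4, 1, 1), (5, 1, 2);
""")

PER_YEAR = """
    SELECT yr AS year, COUNT(title) AS freq
    FROM movie
    JOIN casting ON movie.id = casting.movieid
    JOIN actor   ON actor.id = casting.actorid
    WHERE name = 'John Travolta'
    GROUP BY yr
"""

# Attempt 1: the alias "a" is used as a table inside the nested subquery.
try:
    conn.execute(
        f"SELECT year, freq FROM ({PER_YEAR}) AS a "
        "WHERE freq = (SELECT MAX(freq) FROM a)"
    ).fetchall()
    alias_error = None
except sqlite3.OperationalError as exc:
    alias_error = str(exc)  # SQLite cannot resolve "a" as a table here

# Attempt 2: a CTE named "a" can be referenced in both places.
rows = conn.execute(
    f"WITH a AS ({PER_YEAR}) "
    "SELECT year, freq FROM a "
    "WHERE freq = (SELECT MAX(freq) FROM a) "
    "ORDER BY year"
).fetchall()

print(alias_error)
print(rows)
```

Comparing against MAX(freq) keeps every year that ties for the top count, which is why both 1990 and 1992 come back here.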
I am doing some SQL exercises and have this problem: the query below gives me a 'half correct' result. I only want the row(s) with the most titles to be displayed, but this query displays all records. Can someone help? Thanks.
Question:
Which were the busiest years for 'John Travolta'. Show the number of movies he made for each year.
Tables:
movie (id, title, yr, score, votes, director)
actor (id, name)
casting (movieid, actorid, ord)
Query:
select yr, max(title)
from
(
select yr, count(title) title from movie
join casting
on (movie.id=casting.movieid)
join actor
on (casting.actorid=actor.id)
where actor.name="John Travolta"
group by yr Asc
) a
The question asks
Which were the busiest years?
... plural. So, what were his top 5 years?
select top 5
m.yr
,count(*)
from actor as a
join casting as c
join movie as m
on m.id = c.movieid
on c.actorid = a.id
where a.name = 'John Travolta'
group by
m.yr
order by
count(*) desc
However, the second part of the question specifies that you should
Show the number of movies he made for each year.
So far our query doesn't account for years in which John made no movies... so, this might be where your half correct comes into play. That said, you may want to create a table variable filled with year values from 1954 through the current year... and left join off of that.
declare @year table
(
[yr] int
)
declare @currentYear int = datepart(year,getdate())
while @currentYear >= 1954 begin -- Travolta was born in 1954!
insert @year values (@currentYear)
set @currentYear -= 1
end
select
y.yr
,count(m.id)
from @year y
left join movie as m
join casting as c
join actor as a
on a.id = c.actorid
and a.name = 'John Travolta'
on c.movieid = m.id
on m.yr = y.yr
group by
y.yr
order by
count(m.id) desc
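For engines without table variables and WHILE loops, the same "one row per year, including empty years" idea can be sketched with a recursive CTE. The example below is a swapped-in technique, not the T-SQL above: it uses an in-memory SQLite database from Python, invented toy rows, and a hypothetical FIRST_YEAR/LAST_YEAR range.

```python
import sqlite3

# Toy data, invented for illustration; this is not the SQLZoo dataset.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE movie   (id INTEGER PRIMARY KEY, title TEXT, yr INTEGER);
    CREATE TABLE actor   (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE casting (movieid INTEGER, actorid INTEGER, ord INTEGER);
    INSERT INTO actor VALUES (1, 'John Travolta'), (2, 'Someone Else');
    INSERT INTO movie VALUES (1, 'M1', 1990), (2, 'M2', 1990), (3, 'M3', 1992), (4, 'M4', 1991);
    INSERT INTO casting VALUES (1, 1, 1), (2, 1, 1), (3, 1, 2), (4, 2, 1);
""")

FIRST_YEAR, LAST_YEAR = 1990, 1993  # hypothetical range for the toy data

rows = conn.execute("""
    WITH RECURSIVE years(yr) AS (
        SELECT ?                                  -- first year in the range
        UNION ALL
        SELECT yr + 1 FROM years WHERE yr < ?     -- stop at the last year
    ),
    travolta AS (
        SELECT m.id, m.yr
        FROM movie m
        JOIN casting c ON c.movieid = m.id
        JOIN actor a   ON a.id = c.actorid
        WHERE a.name = 'John Travolta'
    )
    SELECT y.yr, COUNT(t.id) AS n                 -- COUNT skips the NULLs of empty years
    FROM years y
    LEFT JOIN travolta t ON t.yr = y.yr
    GROUP BY y.yr
    ORDER BY n DESC, y.yr
""", (FIRST_YEAR, LAST_YEAR)).fetchall()

print(rows)
```

Counting t.id rather than * matters: a year with no matching film still produces one row from the left join, and COUNT(*) would report it as 1 instead of 0.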
[Edit: based on comments] And a final query to return all years whose count matches the highest of any year.
;with TravoltaMovies as
(
select
m.yr
,count(*) as [Count]
from actor as a
join casting as c
join movie as m
on m.id = c.movieid
on c.actorid = a.id
where a.name = 'John Travolta'
group by m.yr
)
select
*
from TravoltaMovies as tm
where tm.[Count] = (select max([Count]) from TravoltaMovies)
the answer is:
SELECT y.yr, y.count
FROM(
SELECT movie.yr,COUNT(movie.yr) AS count
FROM (movie JOIN casting ON (movie.id=movieid)) JOIN actor ON (actor.id=actorid)
WHERE name='John Travolta'
GROUP BY yr) y
ORDER BY y.count DESC
LIMIT 1
This is a much easier solution.
select yr, count(yr)
from movie
join casting on movie.id = movieid
join actor on actorid = actor.id
where name = 'John Travolta'
group by yr
having count(yr) > 2
Happy to help
select TOP 1 yr, title
from
(
select yr, count(title) title from movie
join casting
on (movie.id=casting.movieid)
join actor
on (casting.actorid=actor.id)
where actor.name="John Travolta"
group by yr Asc
) a
ORDER BY title DESC
Just add a TOP selection and an ORDER BY.
The aggregation is unnecessary.
Thanks, all. This is the query:
SELECT yr,COUNT(title) FROM
movie JOIN casting ON movie.id=movieid
JOIN actor ON actorid=actor.id
where name='John Travolta'
GROUP BY yr
HAVING COUNT(title)=(SELECT MAX(c) FROM
(SELECT yr,COUNT(title) AS c FROM
movie JOIN casting ON movie.id=movieid
JOIN actor ON actorid=actor.id
where name='John Travolta'
GROUP BY yr) AS t
)
This is a simple answer using both a join and a subquery:
select yr,count(title) from movie
inner join casting on
movie.id=casting.movieid
where actorid= (select id from actor where name ='John Travolta')
group by yr
having count(title)>2
Here is an easier solution, with an explanation:
First we make a join of all the tables
Then we put a category filter on name with name= 'John Travolta'
Now we put a group function on yr so that we have yr and corresponding count(yr)
Solution wants to have years with count>2 only so we apply that filter on grouped data by using 'having' clause
select yr, count(yr) from movie
join casting on movie.id=movieid
join actor on actor.id=actorid
where name= 'John Travolta'
group by yr
having count(yr)>2
Hope this helps. I don't see a need to write a procedure for this.
Having 3 tables:
movie(id, title, yr, score, votes, director)
actor(id, name)
casting(movieid, actorid, ord)
Q: Which were the busiest years for 'John Travolta'? Show the number of movies he made for each year.
A: My try is syntactically wrong. Why?
select yr, count(*)
from
(actor join casting
on (actor.id = casting.actorid)
join
on (movie.id = casting.movieid)
group by yr
having actor.name='John Travolta'
You are missing the second table name after join.
Use where, not having.
Try this:
select yr, count(*)
from actor
join casting on actor.id = casting.actorid
join movie on movie.id = casting.movieid -- you were missing table name "movie"
where actor.name='John Travolta' -- "where", not "having"
group by yr
Also note the consistent formatting I used. If you use a good format, it's easier to find syntax errors.
FYI, having is used for conditions on aggregate functions, e.g. having count(*) > 3
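A quick way to convince yourself of the where/having split is to run both on toy data. The sketch below does that on an in-memory SQLite database from Python (rows invented for illustration): where filters rows before grouping, having filters the groups afterwards, and an aggregate placed in where is expected to be rejected by the engine.

```python
import sqlite3

# Toy data, invented for illustration; this is not the SQLZoo dataset.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE movie   (id INTEGER PRIMARY KEY, title TEXT, yr INTEGER);
    CREATE TABLE actor   (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE casting (movieid INTEGER, actorid INTEGER, ord INTEGER);
    INSERT INTO actor VALUES (1, 'John Travolta'), (2, 'Someone Else');
    INSERT INTO movie VALUES (1, 'M1', 1990), (2, 'M2', 1990), (3, 'M3', 1990), (4, 'M4', 1991);
    INSERT INTO casting VALUES (1, 1, 1), (2, 1, 1), (3, 1, 2), (4, 1, 1), (4, 2, 2), (1, 2, 3);
""")

BASE = """
    SELECT yr, COUNT(*) AS n
    FROM actor
    JOIN casting ON actor.id = casting.actorid
    JOIN movie   ON movie.id = casting.movieid
"""

# WHERE filters individual rows before they are grouped.
per_year = conn.execute(
    BASE + " WHERE actor.name = 'John Travolta' GROUP BY yr ORDER BY yr"
).fetchall()

# HAVING filters whole groups after the counts are known.
busy_years = conn.execute(
    BASE + " WHERE actor.name = 'John Travolta' GROUP BY yr HAVING COUNT(*) > 2"
).fetchall()

# An aggregate in WHERE has no group to work on yet, so it is an error.
try:
    conn.execute(BASE + " WHERE COUNT(*) > 2 GROUP BY yr").fetchall()
    where_aggregate_ok = True
except sqlite3.OperationalError:
    where_aggregate_ok = False

print(per_year, busy_years, where_aggregate_ok)
```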
Remove the ( ) from around the table name and add movie to your second join.
select yr, count(*)
from actor join
casting on actor.id = casting.actorid join
movie on movie.id = casting.movieid
group by yr
having actor.name='John Travolta'
EDIT:
You need to switch your having to a where, because having is used for conditions on aggregate functions in conjunction with your group by.
select yr, count(*)
from actor join
casting on actor.id = casting.actorid join
movie on movie.id = casting.movieid
where actor.name = 'John Travolta'
group by yr
To join, you have to specify the table you are joining; it should be
join movie
on movie.id = casting.movieid