MAX(COUNT(*)) for each ID in a SQL query

I need some help with a SQL query in PostgreSQL 9.4.
I need the most rented movies at each location (local). This is the data I'm asked to select:
movie title
year
local id
number of rents
Tables:
rental(idMovie, idLocal, idClient)
movies(idMovie, title, description, year)
This is what I have so far, but it is not what I am asked to do:
SELECT m.title, m.year, r.idLocal, COUNT(*) AS cont
FROM rental r, movies m
WHERE m.idMovie=r.idMovie
GROUP BY r.idLocal, m.title, m.year
ORDER BY COUNT(*) DESC;

If I understand your question correctly what you want is to show the most rented out movie(s) for every location.
If that is the case, then you could use a window function like rank(), which assigns a ranking number to every movie based on its number of rentals at each location. The where clause then keeps only the highest-ranking movies (there can of course be more than one sharing the top spot).
select m.title, m.year, r.idLocal, r.rents
from movies m
join (
    select
        idMovie,
        idlocal,
        count(idmovie) rents,
        rank() over (
            partition by idlocal
            order by count(idmovie) desc
        ) rn
    from rental
    group by idMovie, idLocal
) r on m.idMovie = r.idMovie
where rn = 1
order by idlocal;
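To make the rank() approach above easy to try, here is a minimal sketch that runs the same kind of query against an in-memory SQLite database through Python's sqlite3 module. SQLite (3.25 or newer, for window functions) stands in for PostgreSQL, and the sample rows are invented for illustration; only the table and column names come from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE movies (idMovie INTEGER PRIMARY KEY, title TEXT, year INTEGER);
CREATE TABLE rental (idMovie INTEGER, idLocal INTEGER, idClient INTEGER);
-- Invented sample data (not from the original question).
INSERT INTO movies VALUES (1, 'Alien', 1979), (2, 'Brazil', 1985), (3, 'Casablanca', 1942);
INSERT INTO rental VALUES
  (1, 10, 100), (1, 10, 101), (2, 10, 102),
  (2, 20, 100), (3, 20, 101);
""")

# Rank movies by rental count inside each location, then keep rank 1 only.
top_per_local = conn.execute("""
SELECT m.title, m.year, r.idLocal, r.rents
FROM movies m
JOIN (
    SELECT idMovie, idLocal, COUNT(*) AS rents,
           RANK() OVER (PARTITION BY idLocal ORDER BY COUNT(*) DESC) AS rn
    FROM rental
    GROUP BY idMovie, idLocal
) r ON m.idMovie = r.idMovie
WHERE r.rn = 1
ORDER BY r.idLocal, m.title
""").fetchall()

for row in top_per_local:
    print(row)
```

In this data, location 10 has a single winner, while location 20 has two movies tied on one rental each, so both are returned for location 20.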

Try this...
SELECT r.idLocal, m.title, m.year, count(r.idMovie) as count_movie
FROM rental r, movies m
WHERE m.idMovie=r.idMovie
GROUP BY r.idLocal, m.title, m.year
ORDER BY count_movie DESC;
Or
SELECT r.idLocal, m.title, m.year, count(r.idMovie) as count_movie
FROM rental r, movies m
WHERE m.idMovie=r.idMovie
GROUP BY r.idLocal, m.title, m.year
ORDER BY count(r.idMovie) DESC;

Shorten a query

I have to write a query that calculates, for each movie genre, the number of tickets purchased that consist only of movies of that genre. In the end, I have to return the movie genre and the number of tickets bought for that genre. I have written a query, but I was wondering whether it can be made shorter and more compact.
Following is the database schema:
movies(movieId, movieGenre, moviePrice)
tickets(ticketId, ticketDate, customerId)
details(ticketId, movieId, numOfTickets)
Here is my query:
select m.genre, count(*)
from (select t.ticketId, m.genre
      from (select d.ticketId
            from (select m.genre, t.ticketId
                  from tickets t
                  join details d on t.ticketId = d.ticketId
                  join movies m on d.movieId = m.movieId
                  group by m.genre, t.ticketId) d
            group by d.ticketId
            having count(*) = 1) as t
      join details d on t.ticketId = d.ticketId
      join movies m on d.movieId = m.movieId
      group by t.ticketId, m.genre) m
group by m.genre;
This runs on a database so I am only able to post sample output:
comedy 29821
action 27857
rom-com 19663
I see no reason to use the tickets table, because the results are not filtered or aggregated by ticketDate or customerId. Thus, a shorter query is:
SELECT m.moviegenre,
Sum(d.numoftickets) as SumNum
FROM details d
LEFT JOIN movies m
ON d.movieid = m.movieid
GROUP BY m.moviegenre
HAVING Sum(d.numoftickets) > 0
ORDER BY m.moviegenre
Added 3/28 a.m.:
I am not sure what is meant by duplicates in the table details(ticketId, movieId, numOfTickets).
I would expect ticketId to be unique, so what would explain duplicates?
Is the same ticketId being printed twice, repeatedly?
Find the ticketId values that occur more than once:
SELECT ticketId, count(*) as cnt
FROM details d
GROUP By ticketId
HAVING count(*) > 1
Find the details rows that are exact duplicates:
SELECT ticketId, movieId, numOfTickets, count(*) as cnt
FROM details d
GROUP By ticketId, movieId, numOfTickets
HAVING count(*) > 1
Then again, it may be that the table movies(movieId, movieGenre, moviePrice) is the one with duplicates.
Find the movieId values that occur more than once:
SELECT movieId, count(*) as cnt
FROM movies m
GROUP BY movieId
HAVING count(*) > 1
Remove duplicates from details before aggregating:
SELECT m.moviegenre,
Sum(d.numoftickets) as SumNum
FROM
(Select Distinct * From details) d
LEFT JOIN movies m
ON d.movieid = m.movieid
GROUP BY m.moviegenre
ORDER BY m.moviegenre
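If the intent of the original nested query is to count only those tickets whose lines all belong to a single genre (that reading is an assumption), a more compact form might use HAVING COUNT(DISTINCT ...) = 1. The sketch below runs it on SQLite through Python's sqlite3 module with invented sample rows; the table and column names follow the schema in the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE movies (movieId INTEGER PRIMARY KEY, movieGenre TEXT, moviePrice REAL);
CREATE TABLE details (ticketId INTEGER, movieId INTEGER, numOfTickets INTEGER);
-- Invented sample data (not from the original question).
INSERT INTO movies VALUES (1, 'comedy', 8.0), (2, 'comedy', 9.0), (3, 'action', 10.0);
INSERT INTO details VALUES
  (100, 1, 2), (100, 2, 1),
  (101, 3, 1),
  (102, 1, 1), (102, 3, 1);
""")

# Inner query: one row per ticket that contains exactly one distinct genre.
# MIN() simply returns that single genre. Outer query: tickets per genre.
single_genre_counts = conn.execute("""
SELECT genre, COUNT(*) AS tickets
FROM (
    SELECT d.ticketId, MIN(m.movieGenre) AS genre
    FROM details d
    JOIN movies m ON m.movieId = d.movieId
    GROUP BY d.ticketId
    HAVING COUNT(DISTINCT m.movieGenre) = 1
) t
GROUP BY genre
ORDER BY genre
""").fetchall()

print(single_genre_counts)
```

Ticket 102 mixes comedy and action, so it is excluded; tickets 100 and 101 each count once for their genre.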

hive select max count by grouping on two fields

I am trying to write a SQL query to find the most popular artist in each country. The most popular artist is the one with the maximum number of ratings >= 8.
Below is the table structure:
describe album;
albumid string
album_title string
album_artist string
describe album_ratings;
userid int
albumid string
rating int
describe cusers;
userid int
state string
country string
Below is one query that I wrote but it is not working.
select album_artist, country, count(rating)
from album, album_ratings, cusers
where album.albumid=album_ratings.albumid
and album_ratings.userid=cusers.userid
and rating>=6
group by country, album_artist
having count(rating) = (
select max(t.cnt)
from (
select count(rating) as cnt
from album, album_ratings, cusers
where album.albumid=album_ratings.albumid
and album_ratings.userid=cusers.userid
and rating>=6
group by country, album_artist
) as t
group by t.country
);
Learn to use proper, explicit JOIN syntax. Never use commas in the FROM clause.
You can do this with window functions:
select *
from (select album_artist, country, count(*) as cnt,
row_number() over (partition by country order by count(*) desc) as seqnum
from album a join
album_ratings ar
on a.albumid = ar.albumid join
cusers u
on ar.userid = u.userid
where rating >= 6
group by country, album_artist
) aru
where seqnum = 1;
If you want ties, use rank() instead of row_number().
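The difference between row_number() and rank() is easiest to see on a small data set with a tie. The sketch below uses SQLite through Python's sqlite3 module rather than Hive, with invented rows and a rating cutoff of 8 taken from the question's wording (the question's own query uses 6); treat it as an illustration, not as a tested Hive query.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE album (albumid TEXT, album_title TEXT, album_artist TEXT);
CREATE TABLE album_ratings (userid INTEGER, albumid TEXT, rating INTEGER);
CREATE TABLE cusers (userid INTEGER, state TEXT, country TEXT);
-- Invented sample data (not from the original question).
INSERT INTO album VALUES ('a1', 'First', 'Ana'), ('b1', 'Second', 'Bo');
INSERT INTO cusers VALUES (1, 'CA', 'US'), (2, 'NY', 'US'), (3, 'ON', 'CA');
INSERT INTO album_ratings VALUES
  (1, 'a1', 9), (2, 'b1', 8),
  (3, 'a1', 10), (3, 'b1', 3);
""")

# {fn} is replaced by RANK or ROW_NUMBER; the rest of the query is identical.
QUERY = """
SELECT country, album_artist, cnt
FROM (
    SELECT u.country, a.album_artist, COUNT(*) AS cnt,
           {fn}() OVER (PARTITION BY u.country ORDER BY COUNT(*) DESC) AS seqnum
    FROM album a
    JOIN album_ratings ar ON a.albumid = ar.albumid
    JOIN cusers u ON ar.userid = u.userid
    WHERE ar.rating >= 8
    GROUP BY u.country, a.album_artist
) aru
WHERE seqnum = 1
ORDER BY country, album_artist
"""

with_ties = conn.execute(QUERY.format(fn="RANK")).fetchall()
one_per_country = conn.execute(QUERY.format(fn="ROW_NUMBER")).fetchall()

print(with_ties)        # both tied US artists appear
print(one_per_country)  # exactly one artist per country; the US pick is arbitrary
```

With row_number() the winner among tied artists is not deterministic unless the ORDER BY inside the window adds a tie-breaker such as album_artist.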
You can use the window function row_number() to find the most popular artist in each country (the higher the rating, the more popular):
select *
from (
select c.country,
a.album_artist,
sum(rating) as total_rating,
row_number() over (partition by c.country order by sum(rating) desc) as rn
from cusers c
join album_ratings r on c.userid = r.userid
join album a on r.albumid = a.albumid
where r.rating >= 8
group by c.country,
a.album_artist
) t
where rn = 1;
I used sum(rating) instead of a count, because I think ratings should be additive.
Also, always use explicit join syntax instead of the old comma-based join.

Movie with highest rating?

Find the movie(s) with the highest average rating. Return the movie title(s) and average rating.
I tried this and got stuck because I am not able to retrieve mid: if I select mid, max(avg_stars), it gives the max for every mid, but I want only the single max value.
http://sqlfiddle.com/#!3/e3ee1/13
select max(avg_stars) from
(
select top 1 mid, avg(stars) as avg_stars
from rating
group by mid
order by avg_stars desc
) z
Expected output: Snow White 4.5. Also, how can I handle two movies having the same max(avg_stars)?
This should serve your purpose and perform well:
SELECT
Title
,AVG_RATING
FROM
(
SELECT
M.Title
,M.mID
,CAST(ROUND(AVG(R.stars),2) AS DECIMAL(10,2)) AS AVG_RATING
,RANK() OVER (ORDER BY AVG(R.stars) DESC) RATING_RANK
FROM Movie M
INNER JOIN Rating R
ON M.mID = R.mID
GROUP BY M.Title,M.mID
)RANKED_RATING
WHERE RATING_RANK = 1
You may have to play around the casting a little to suit your table definitions.
Note - If 2 or more movies have the highest avg rating - all would be ranked 1 and all would get selected. If you still want only one - you'll need to define a rule as to which one you want to be selected.
Try this http://sqlfiddle.com/#!3/e3ee1/143:
;WITH CTE as
(
select r.mid, avg(r.stars) as avg_stars, m.title
from rating r
INNER JOIN Movie m ON m.mid=r.mid
group by r.mid, m.title
--order by avg_stars desc
)
select TOP 1 mid, title,avg_stars from CTE
Group by avg_stars,mid,title
--having avg_stars=Max(avg_stars)
Order By avg_stars desc
Output:
MID TITLE AVG_STARS
106 Snow White 4.5
SELECT TOP 1 MAX(m.title) AS title, AVG(stars) AS averageStars
FROM rating r
JOIN movie m
ON r.mId = m.mId
GROUP BY r.mId
ORDER BY AVG(stars) DESC,
--Order by a second column of your
--choice to break ties for AVG(stars)
MAX(m.title)
You can probably optimize or come up with something cleaner but this works:
SELECT m.title, AVG(r.stars) AS AverageStars
FROM Rating AS r (NOLOCK)
INNER JOIN Movie AS m (NOLOCK) ON m.mID = r.mID
GROUP BY r.mID, m.Title
HAVING AVG(r.stars) =
(
SELECT TOP 1 AVG(stars) AS AverageStars
FROM Rating (NOLOCK)
GROUP BY mID
ORDER BY AverageStars DESC
)
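The HAVING-based approach above also carries over to engines without TOP: compare each movie's average with the maximum of all per-movie averages. Here is a small sketch on SQLite through Python's sqlite3 module; the rows are invented so that two movies tie at 4.5, and the subquery alias per_movie is a name chosen for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Movie (mID INTEGER PRIMARY KEY, title TEXT);
CREATE TABLE Rating (mID INTEGER, stars INTEGER);
-- Invented sample data (not the contents of the linked fiddle).
INSERT INTO Movie VALUES (101, 'Gone with the Wind'), (106, 'Snow White'), (107, 'Avatar');
INSERT INTO Rating VALUES (101, 2), (101, 4), (106, 4), (106, 5), (107, 5), (107, 4);
""")

# Keep every movie whose average equals the highest per-movie average.
best = conn.execute("""
SELECT m.title, AVG(r.stars) AS avg_stars
FROM Rating r
JOIN Movie m ON m.mID = r.mID
GROUP BY m.mID, m.title
HAVING AVG(r.stars) = (
    SELECT MAX(movie_avg)
    FROM (SELECT AVG(stars) AS movie_avg FROM Rating GROUP BY mID) per_movie
)
ORDER BY m.title
""").fetchall()

print(best)
```

Because the filter is an equality against the maximum, tied movies are all returned, which addresses the question about two movies sharing the same max(avg_stars).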

I don't understand this SQL Server MIN() resultset

I have this SQL Server query which I wrote to find the Movie title that has the least amount of records in the RENTAL table.
When run, it returns a resultset that is identical to the resultset I get from executing the sub-query by itself.
In other words, rather than returning the single movie with the minimum RentalCount, it returns all movie titles and their corresponding RentalCount.
SELECT B.Title, MIN(B.RentalCount) AS RentalCount
FROM (
SELECT Movie.Title, Count(*) AS RentalCount
FROM Rental
JOIN Dvd ON Rental.RentalID=Dvd.DvdID
JOIN Movie ON Dvd.Movieid=movie.MovieID
GROUP BY Movie.Title
) B
GROUP BY B.Title
The result is correct. Your subquery returns the total count for each title in the Rental table, and the outer query returns the same rows because it is also grouped by title, so MIN() is computed within each one-row group.
Follow-up question: what result do you want to achieve?
To find the movie title that has the least amount of records in the RENTAL table:
SELECT Movie.Title, Count(*) AS RentalCount
FROM Rental
JOIN Dvd ON Rental.RentalID=Dvd.DvdID
JOIN Movie ON Dvd.Movieid=movie.MovieID
GROUP BY Movie.Title
HAVING Count(*) =
(
SELECT MIN(t_count)
FROM
(
SELECT Count(*) t_count
FROM Rental
JOIN Dvd ON Rental.RentalID=Dvd.DvdID
JOIN Movie ON Dvd.Movieid=movie.MovieID
GROUP BY Movie.Title
) a
)
UPDATE 1
Thanks to Martin Smith for introducing me to TOP ... WITH TIES:
SELECT TOP 1 WITH TIES Movie.Title, Count(*) AS RentalCount
FROM Rental
JOIN Dvd ON Rental.RentalID=Dvd.DvdID
JOIN Movie ON Dvd.Movieid=movie.MovieID
GROUP BY Movie.Title
ORDER BY RentalCount ASC
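To illustrate why grouping by title a second time changes nothing, and how comparing against the overall minimum does, here is a small sketch on SQLite through Python's sqlite3 module. The table B is a stand-in for the subquery result in the question, filled with invented counts; SQLite has no TOP ... WITH TIES, so the tie-preserving filter is written with a scalar subquery.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Stand-in for the subquery result B: one row per title (invented data).
CREATE TABLE B (Title TEXT, RentalCount INTEGER);
INSERT INTO B VALUES ('Alien', 3), ('Brazil', 1), ('Casablanca', 1);
""")

# Grouping by Title again: every group holds exactly one row, so MIN() changes nothing.
regrouped = conn.execute(
    "SELECT Title, MIN(RentalCount) FROM B GROUP BY Title ORDER BY Title"
).fetchall()

# Comparing against the overall minimum keeps only the least-rented titles, ties included.
least_rented = conn.execute("""
SELECT Title, RentalCount
FROM B
WHERE RentalCount = (SELECT MIN(RentalCount) FROM B)
ORDER BY Title
""").fetchall()

print(regrouped)
print(least_rented)
```

The first query returns all three titles unchanged, while the second returns only the two titles tied at the minimum count.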
You could have done this without a subquery
SELECT TOP 1 Movie.Title, Count(*) AS RentalCount
FROM Rental
JOIN Dvd ON Rental.RentalID=Dvd.DvdID
JOIN Movie ON Dvd.Movieid=movie.MovieID
GROUP BY Movie.Title
ORDER BY Count(*)
If you are looking for a specific movie title, then do it like this:
SELECT Movie.Title, Count(*) AS RentalCount
FROM Rental
JOIN Dvd ON Rental.RentalID=Dvd.DvdID
JOIN Movie ON Dvd.Movieid=movie.MovieID
where Movie.Title='xyz'
GROUP BY Movie.Title

How should I join these 3 SQL queries in Oracle?

I have these 3 queries:
SELECT
title, year, MovieGenres(m.mid) genres,
MovieDirectors(m.mid) directors, MovieWriters(m.mid) writers,
synopsis, poster_url
FROM movies m
WHERE m.mid = 1;
SELECT AVG(rating) FROM movie_ratings WHERE mid = 1;
SELECT COUNT(rating) FROM movie_ratings WHERE mid = 1;
And I need to join them into a single query. I was able to do it like this:
SELECT
title, year, MovieGenres(m.mid) genres,
MovieDirectors(m.mid) directors, MovieWriters(m.mid) writers,
synopsis, poster_url, AVG(rating) average, COUNT(rating) count
FROM movies m INNER JOIN movie_ratings mr
ON m.mid = mr.mid
WHERE m.mid = 1
GROUP BY
title, year, MovieGenres(m.mid), MovieDirectors(m.mid),
MovieWriters(m.mid), synopsis, poster_url;
But I don't really like that "huge" GROUP BY, is there a simpler way to do it?
You could do something like this:
SELECT title
,year
,MovieGenres(m.mid) genres
,MovieDirectors(m.mid) directors
,MovieWriters(m.mid) writers
,synopsis
,poster_url
,(select avg(mr.rating)
from movie_ratings mr
where mr.mid = m.mid) as avg_rating
,(select count(rating)
from movie_ratings mr
where mr.mid = m.mid) as num_ratings
FROM movies m
WHERE m.mid = 1;
or even
with grouped as(
select avg(rating) as avg_rating
,count(rating) as num_ratings
from movie_ratings
where mid = 1
)
select title
,year
,MovieGenres(m.mid) genres
,MovieDirectors(m.mid) directors
,MovieWriters(m.mid) writers
,synopsis
,poster_url
,avg_rating
,num_ratings
from movies m cross join grouped
where m.mid = 1;
I guess I don't see the problem with having several GROUP BY columns. That's a very common pattern in SQL. Of course, code clarity is often in the eye of the beholder.
Check the explain plans for the two approaches; my guess is you'll get better performance with your original version since it only needs to process the movie_ratings table once. But I haven't checked, and that will be somewhat data and installation dependent.
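Besides performance, the two approaches can differ in what they return for a movie that has no ratings: the INNER JOIN with GROUP BY yields no row at all, while the scalar subqueries still return the movie with a NULL average and a count of 0. The sketch below shows this on SQLite through Python's sqlite3 module with invented rows; the Oracle functions MovieGenres, MovieDirectors and MovieWriters are left out because they are specific to the asker's database.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE movies (mid INTEGER PRIMARY KEY, title TEXT);
CREATE TABLE movie_ratings (mid INTEGER, rating INTEGER);
-- Invented sample data: movie 2 has no ratings.
INSERT INTO movies VALUES (1, 'Rated'), (2, 'Unrated');
INSERT INTO movie_ratings VALUES (1, 3), (1, 5);
""")

JOINED = """
SELECT m.title, AVG(mr.rating), COUNT(mr.rating)
FROM movies m
INNER JOIN movie_ratings mr ON m.mid = mr.mid
WHERE m.mid = ?
GROUP BY m.title
"""

SCALAR = """
SELECT m.title,
       (SELECT AVG(mr.rating) FROM movie_ratings mr WHERE mr.mid = m.mid),
       (SELECT COUNT(mr.rating) FROM movie_ratings mr WHERE mr.mid = m.mid)
FROM movies m
WHERE m.mid = ?
"""

# Same answer when ratings exist ...
rated_joined = conn.execute(JOINED, (1,)).fetchall()
rated_scalar = conn.execute(SCALAR, (1,)).fetchall()

# ... but different answers when there are none.
unrated_joined = conn.execute(JOINED, (2,)).fetchall()
unrated_scalar = conn.execute(SCALAR, (2,)).fetchall()

print(rated_joined, rated_scalar)
print(unrated_joined, unrated_scalar)
```

If unrated movies must still be shown, the join version would need a LEFT JOIN instead of an INNER JOIN to behave like the subquery version.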
How about:
SELECT
title, year, MovieGenres(m.mid) genres,
MovieDirectors(m.mid) directors, MovieWriters(m.mid) writers,
synopsis, poster_url,
(SELECT AVG(rating) FROM movie_ratings WHERE mid = 1) av,
(SELECT COUNT(rating) FROM movie_ratings WHERE mid = 1) cnt
FROM movies m
WHERE m.mid = 1;
or
SELECT
title, year, MovieGenres(m.mid) genres,
MovieDirectors(m.mid) directors, MovieWriters(m.mid) writers,
synopsis, poster_url,
av.av,
cnt.cnt
FROM movies m,
(SELECT AVG(rating) av FROM movie_ratings WHERE mid = 1) av,
(SELECT COUNT(rating) cnt FROM movie_ratings WHERE mid = 1) cnt
WHERE m.mid = 1;