SQL: get a unique ID, then count the number of tuples relating to that ID

Database Structure
MovieInfo (mvID, title, rating, year, length, studio)
DirectorInfo(directorID, firstname, lastname)
MemberInfo(username, email, password)
ActorInfo(actorID, firstname, lastname, gender, birthplace)
CastInfo(mvID*, actorID*)
DirectInfo(mvID*, directorID*)
GenreInfo(mvID*, genre)
RankingInfo(username*, mvID*, score, voteDate)
Query
I need to get the director with the largest number of comedy movies (and I'm required to use the ALL operator). My understanding is that I should start by getting the list of mvid values where genre = 'Comedy', together with the directorid values:
select mvid
from genreinfo
where genre = 'Comedy'
union all
select directorid
from directorinfo
;
But then how do I count the number of movies a specific director has? And how do I get that single one with the highest count of "comedy" movies?

You're on the right track. I'd recommend looking at JOINs.
I've provided a step-by-step answer on how to obtain the desired results. If you just want the final query, go down to step 5 and pick the one appropriate for your DBMS.
1: Selecting all comedy movie IDs:
SELECT mvid
FROM GenreInfo
WHERE genre = 'Comedy';
2: Selecting the directorIDs of those movies
SELECT directorID
FROM DirectInfo
JOIN GenreInfo
ON DirectInfo.mvID = GenreInfo.mvID
WHERE genre = 'Comedy';
3: Selecting the director name of those directors.
SELECT firstname
FROM DirectorInfo
JOIN DirectInfo
ON DirectorInfo.directorID = DirectInfo.directorID
JOIN GenreInfo
ON DirectInfo.mvID = GenreInfo.mvID
WHERE genre = 'Comedy';
4: Grouping that query by director to get number of movies:
SELECT firstname, COUNT(*) AS NumberOfMovies
FROM DirectorInfo
JOIN DirectInfo
ON DirectorInfo.directorID = DirectInfo.directorID
JOIN GenreInfo
ON DirectInfo.mvID = GenreInfo.mvID
WHERE genre = 'Comedy'
GROUP BY DirectorInfo.directorID;
5: Sort the results and get only the first one:
SELECT firstname, COUNT(*) AS NumberOfMovies
FROM DirectorInfo
JOIN DirectInfo
ON DirectorInfo.directorID = DirectInfo.directorID
JOIN GenreInfo
ON DirectInfo.mvID = GenreInfo.mvID
WHERE genre = 'Comedy'
GROUP BY DirectorInfo.directorID
ORDER BY NumberOfMovies DESC
LIMIT 1;
If you're using SQL Server, use TOP instead:
SELECT TOP 1 firstname, COUNT(*) AS NumberOfMovies
FROM DirectorInfo
JOIN DirectInfo
ON DirectorInfo.directorID = DirectInfo.directorID
JOIN GenreInfo
ON DirectInfo.mvID = GenreInfo.mvID
WHERE genre = 'Comedy'
GROUP BY DirectorInfo.directorID, firstname
ORDER BY NumberOfMovies DESC;
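To check the grouped query against real rows, here is a minimal sketch that runs it on an in-memory SQLite database through Python's sqlite3 module. The sample rows and the helper names (make_sample_db, top_comedy_director) are made up for illustration, and only the three tables the query touches are created. Note that the sort has to be descending for LIMIT 1 to return the director with the most comedies.

```python
import sqlite3

def make_sample_db():
    """Build a tiny in-memory database with invented sample rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE DirectorInfo (directorID INTEGER PRIMARY KEY, firstname TEXT, lastname TEXT);
        CREATE TABLE DirectInfo (mvID INTEGER, directorID INTEGER);
        CREATE TABLE GenreInfo (mvID INTEGER, genre TEXT);
        INSERT INTO DirectorInfo VALUES (1, 'Ann', 'Lee'), (2, 'Bob', 'Ray');
        INSERT INTO DirectInfo VALUES (10, 1), (11, 1), (12, 2), (13, 2), (14, 2);
        INSERT INTO GenreInfo VALUES
            (10, 'Comedy'), (11, 'Comedy'), (12, 'Comedy'), (13, 'Drama'), (14, 'Drama');
    """)
    return conn

def top_comedy_director(conn):
    """Return (firstname, count) for the director with the most comedy movies."""
    return conn.execute("""
        SELECT d.firstname, COUNT(*) AS NumberOfMovies
        FROM DirectorInfo d
        JOIN DirectInfo di ON d.directorID = di.directorID
        JOIN GenreInfo g ON di.mvID = g.mvID
        WHERE g.genre = 'Comedy'
        GROUP BY d.directorID
        ORDER BY NumberOfMovies DESC  -- largest count first
        LIMIT 1
    """).fetchone()

print(top_comedy_director(make_sample_db()))
```

In this sample Bob directed more movies overall, but only one of them is a comedy, so Ann comes out on top with two.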

You can use a join and group by to get the result.
SELECT d.DirectorID, COUNT(d.mvid) AS NumberOfMovies
FROM DirectInfo d
INNER JOIN GenreInfo g
ON d.mvid = g.mvid
WHERE g.genre = 'Comedy'
GROUP BY d.DirectorID
ORDER BY COUNT(d.mvid) DESC;

Is this homework? Well, right now you are selecting a list of IDs, some of them representing directors, others representing movies. You notice that this is not at all what you are supposed to do, right?
What you want is a list of directors, so you select from the DirectorInfo table. You also want information about their movies (specifically, the number of movies of a certain kind), so you must join that information from MovieInfo. Now think about what else you need to glue together to get from a director to their movies. Then think about how to glue in the genre criterion.
Once you have joined it all together, you group your results. You want one record per director (instead of one record per director and movie), so you make a group and count within that group.
I hope this helps you solve your task. Good luck!

select di.directorid, count(*) as no_of_comedy_movies
from DirectorInfo di inner join DirectInfo dri
on di.directorid = dri.directorid
inner join genreinfo gi
on gi.mvid = dri.mvid
where gi.genre = 'Comedy'
group by di.directorid
order by no_of_comedy_movies desc

Related

is there a more efficient alternative for this SQL query?

I'm working on a movie data set that has tables for movies and genres and a bridge table, in_genre.
The following query tries to find the genres that two movies have in common. I'm doing two joins to get the genre lists and an INTERSECT to find the common genres.
Is there a more efficient way?
Table schema:
movie : movie_id(PK)(int)
in_genre(bridge_table): movie_id(FK)(int), genre_id(int)
SELECT count(*) as common_genre
FROM(
-- getting genres of first movie
SELECT in_genre.genre_id
FROM movie INNER JOIN in_genre ON movie.id = in_genre.movie_id
WHERE movie.id = 0109830
INTERSECT
-- getting genres of second movie
SELECT in_genre.genre_id
FROM movie INNER JOIN in_genre ON movie.id = in_genre.movie_id
WHERE movie.id = 1375666
) as genres
If you only need the data from in_genre, there's no need to join the movie table.
You can also use EXISTS to find the common genres.
SELECT COUNT(DISTINCT genre_id) as common_genre
FROM in_genre ig
WHERE movie_id = 0109830
AND EXISTS
(
SELECT 1
FROM in_genre ig2
WHERE ig2.movie_id = 1375666
AND ig2.genre_id = ig.genre_id
)
If you want the genres, I would simply do:
SELECT genre_id as common_genre
FROM in_genre ig
WHERE movie_id IN (0109830, 1375666)
GROUP BY genre_id
HAVING COUNT(*) = 2;
If you want the count, a subquery is simple enough:
SELECT COUNT(*)
FROM (SELECT genre_id as common_genre
FROM in_genre ig
WHERE movie_id IN (0109830, 1375666)
GROUP BY genre_id
HAVING COUNT(*) = 2
) g;
If you want full information about the genres, then I would suggest exists:
select g.*
from genres g
where exists (select 1
from in_genre ig
where ig.genre_id = g.genre_id and ig.movie_id = 0109830
) and
exists (select 1
from in_genre ig
where ig.genre_id = g.genre_id and ig.movie_id = 1375666
);
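To compare the approaches above, here is a small sketch that runs the INTERSECT, EXISTS and GROUP BY/HAVING variants against the same rows in an in-memory SQLite database (via Python's sqlite3). The genre assignments are invented sample data, and the composite primary key on in_genre is an assumption: the HAVING COUNT(*) = 2 variant relies on each (movie_id, genre_id) pair appearing at most once.

```python
import sqlite3

def make_db():
    """In-memory database with invented genre assignments for the two movies."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE in_genre (
            movie_id INTEGER,
            genre_id INTEGER,
            PRIMARY KEY (movie_id, genre_id)  -- assumed: no duplicate pairs
        );
        INSERT INTO in_genre VALUES
            (109830, 1), (109830, 2), (109830, 3),
            (1375666, 2), (1375666, 3), (1375666, 4);
    """)
    return conn

QUERIES = {
    "intersect": """
        SELECT COUNT(*) FROM (
            SELECT genre_id FROM in_genre WHERE movie_id = :a
            INTERSECT
            SELECT genre_id FROM in_genre WHERE movie_id = :b
        )
    """,
    "exists": """
        SELECT COUNT(DISTINCT ig.genre_id)
        FROM in_genre ig
        WHERE ig.movie_id = :a
          AND EXISTS (SELECT 1 FROM in_genre ig2
                      WHERE ig2.movie_id = :b AND ig2.genre_id = ig.genre_id)
    """,
    "group_having": """
        SELECT COUNT(*) FROM (
            SELECT genre_id FROM in_genre
            WHERE movie_id IN (:a, :b)
            GROUP BY genre_id
            HAVING COUNT(*) = 2  -- genre appears for both movies
        )
    """,
}

def common_genre_counts(conn, a, b):
    """Run every variant and return {variant name: number of common genres}."""
    return {name: conn.execute(sql, {"a": a, "b": b}).fetchone()[0]
            for name, sql in QUERIES.items()}

print(common_genre_counts(make_db(), 109830, 1375666))
```

All three variants should agree on the count. Which one is fastest depends on the DBMS and the indexes on in_genre, so it is worth checking the query plan on the real data.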

SQL ORDER BY number of rows?

How do you ORDER BY number of rows found in another table? I have a table for animals (these are livestock animals) and another table for awards. When an animal wins an award, the award gets added to the awards table.
People want to be able to find the animals who have won the most awards (WHERE award type is 1), ordered from most awards to least. How do I ORDER BY how many awards they have if the awards are in a separate table each with their own row?
SELECT animals.id
FROM animals
LEFT JOIN awards ON animals.id = awards.animalid
WHERE awards.type = 1
ORDER BY...
You would seem to want GROUP BY:
SELECT a.id
FROM animals a LEFT JOIN
awards aw
ON a.id = aw.animalid AND aw.type = 1
GROUP BY a.id
ORDER BY COUNT(aw.animalid) DESC;
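The reason the award-type condition sits in the ON clause rather than in WHERE is easiest to see on sample data. The sketch below (Python's sqlite3 with an in-memory database and invented rows) runs both placements: with the filter in ON, animals without a type-1 award stay in the result with a count of 0, while a WHERE filter drops them, which effectively turns the LEFT JOIN into an inner join.

```python
import sqlite3

def make_db():
    """In-memory database with invented animals and awards."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE animals (id INTEGER PRIMARY KEY);
        CREATE TABLE awards (animalid INTEGER, type INTEGER);
        INSERT INTO animals VALUES (1), (2), (3);
        -- animal 1: two type-1 awards, animal 2: one, animal 3: only a type-2 award
        INSERT INTO awards VALUES (1, 1), (1, 1), (2, 1), (3, 2);
    """)
    return conn

FILTER_IN_ON = """
    SELECT a.id, COUNT(aw.animalid) AS n
    FROM animals a
    LEFT JOIN awards aw ON a.id = aw.animalid AND aw.type = 1
    GROUP BY a.id
    ORDER BY n DESC, a.id
"""

FILTER_IN_WHERE = """
    SELECT a.id, COUNT(aw.animalid) AS n
    FROM animals a
    LEFT JOIN awards aw ON a.id = aw.animalid
    WHERE aw.type = 1
    GROUP BY a.id
    ORDER BY n DESC, a.id
"""

conn = make_db()
on_rows = conn.execute(FILTER_IN_ON).fetchall()
where_rows = conn.execute(FILTER_IN_WHERE).fetchall()
print(on_rows)     # animal 3 is kept with a count of 0
print(where_rows)  # animal 3 is missing
```

If animals with no awards should not be listed at all, the WHERE version (or a plain inner join) is fine; the ON version is the one to use when every animal must appear.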

SQL query return name and count of specific attributes across multiple tables

Given multiple tables, I'm trying to write a query that returns the names that satisfy a specific count condition.
I have the tables:
genre(genre, movieid)
moviedirectors(movieid, directorid)
directors(directorid, firstname, lastname)
I want to write a query that returns the first and last name of directors that directed at least 50 movies of the genre comedy, and return that number as well.
This is what I have
select d.fname, d.lname, count(*)
from genre g, directors d, moviedirectors md
where g.genre='Comedy' and g.movieid=md.movieid and
md.directorid=d.directorid
group by d.id
having count(*) >= 50
I believe this should be correct but when I run this query on the command line it never finishes. I waited 30 minutes and got no results.
You need inner joins:
SELECT d.fname,
d.lname,
COUNT(*) AS comedy_count
FROM genre g
INNER JOIN moviedirectors md
ON g.movieid = md.movieid
INNER JOIN directors d
ON md.directorid = d.directorid
WHERE g.genre = 'Comedy'
GROUP BY d.fname, -- group by columns in select
d.lname
HAVING COUNT(*) >= 50
select c.firstname, c.lastname, count(d.movieid)
from (select md.* from moviedirectors md, genre g where g.genre = 'Comedy' and g.movieid = md.movieid) d, directors c
where c.directorid = d.directorid
group by c.directorid, c.firstname, c.lastname
having count(d.movieid) >= 50;
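One detail worth checking with the HAVING approach is what you group by. The sketch below (Python's sqlite3, in-memory database, invented rows, and a threshold parameter instead of the fixed 50) uses the table and column names from the schema in the question. It includes two different directors who share the same name, to show that grouping by name alone merges their counts, while grouping by directorid keeps them apart.

```python
import sqlite3

def make_db():
    """In-memory database with invented rows; directors 1 and 3 share a name."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE genre (genre TEXT, movieid INTEGER);
        CREATE TABLE moviedirectors (movieid INTEGER, directorid INTEGER);
        CREATE TABLE directors (directorid INTEGER PRIMARY KEY, firstname TEXT, lastname TEXT);
        INSERT INTO directors VALUES (1, 'Ann', 'Lee'), (2, 'Bob', 'Ray'), (3, 'Ann', 'Lee');
        INSERT INTO moviedirectors VALUES (1, 1), (2, 1), (3, 2), (4, 2), (5, 3);
        INSERT INTO genre VALUES
            ('Comedy', 1), ('Comedy', 2), ('Comedy', 3), ('Drama', 4), ('Comedy', 5);
    """)
    return conn

GROUP_BY_ID = """
    SELECT d.firstname, d.lastname, COUNT(*) AS n
    FROM genre g
    INNER JOIN moviedirectors md ON g.movieid = md.movieid
    INNER JOIN directors d ON md.directorid = d.directorid
    WHERE g.genre = 'Comedy'
    GROUP BY d.directorid, d.firstname, d.lastname
    HAVING COUNT(*) >= ?
"""

GROUP_BY_NAME = """
    SELECT d.firstname, d.lastname, COUNT(*) AS n
    FROM genre g
    INNER JOIN moviedirectors md ON g.movieid = md.movieid
    INNER JOIN directors d ON md.directorid = d.directorid
    WHERE g.genre = 'Comedy'
    GROUP BY d.firstname, d.lastname
    HAVING COUNT(*) >= ?
"""

def directors_with_at_least(conn, sql, min_count):
    """Return (firstname, lastname, count) rows meeting the comedy threshold."""
    return conn.execute(sql, (min_count,)).fetchall()

conn = make_db()
print(directors_with_at_least(conn, GROUP_BY_ID, 3))    # no single director has 3
print(directors_with_at_least(conn, GROUP_BY_NAME, 3))  # the two Ann Lees are merged
```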

Oracle sql - referencing tables

My school task was to get the names of the actors in my movie database who play in the movies with the highest rating.
I did it this way and it works:
select name,surname
from actor
where ACTORID in(
select actorid
from actor_movie
where MOVIEID in (
select movieid
from movie
where RATINGID in (
select ratingid
from rating
where PERCENT_CSFD = (
select max(percent_csfd)
from rating
)
)
)
);
The output is:
Gary Oldman
Sigourney Weaver
...but I'd also like to add the movie and its rating to this select. They are accessible in the inner selects, but I don't know how to join them with the outer select, in which I can only work with the rows found in the actor table.
Thank you for your answers.
You just need to join the tables properly. Afterwards you can simply add the columns you'd like to select. The final select could look like this.
select ac.name, ac.surname, ra.percent_csfd -- add further columns from the joined tables here
from actor ac
inner join actor_movie amo
on amo.actorid = ac.actorid
inner join movie mo
on amo.movieid = mo.movieid
inner join rating ra
on ra.ratingid = mo.ratingid
where ra.PERCENT_CSFD =
(select max(percent_csfd)
from rating)
A way to get your result with a slightly different method could be something like:
select *
from
(
select name, surname, percent_csfd, rank() over (order by percent_csfd desc) as rnk
from actor
inner join actor_movie
using (actorId)
inner join movie
using (movieId)
inner join rating
using (ratingId)
)
where rnk = 1
This uses rank() to rank the rows by rating and then filters for the movie(s) with the highest rating. rank() is used rather than row_number() because row_number() numbers tied rows 1, 2, ... and would therefore return only a single row.
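As a quick check of the join-based answer, here is a sketch that runs it on an in-memory SQLite database through Python's sqlite3. The table and column names follow the queries above. The ids and ratings are invented, the third actor is a placeholder, and movieid stands in for whatever movie columns you want to show, since the question does not list them. Two movies share the top rating here, so both actors should come back.

```python
import sqlite3

def make_db():
    """In-memory database with invented ids and ratings."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE actor (actorid INTEGER PRIMARY KEY, name TEXT, surname TEXT);
        CREATE TABLE rating (ratingid INTEGER PRIMARY KEY, percent_csfd INTEGER);
        CREATE TABLE movie (movieid INTEGER PRIMARY KEY, ratingid INTEGER);
        CREATE TABLE actor_movie (actorid INTEGER, movieid INTEGER);
        INSERT INTO actor VALUES
            (1, 'Gary', 'Oldman'), (2, 'Sigourney', 'Weaver'), (3, 'Jane', 'Doe');
        INSERT INTO rating VALUES (1, 95), (2, 80);
        INSERT INTO movie VALUES (100, 1), (101, 1), (102, 2);
        INSERT INTO actor_movie VALUES (1, 100), (2, 101), (3, 102);
    """)
    return conn

def actors_in_top_rated(conn):
    """Return (name, surname, movieid, rating) for actors in the top-rated movies."""
    return conn.execute("""
        SELECT ac.name, ac.surname, mo.movieid, ra.percent_csfd
        FROM actor ac
        JOIN actor_movie amo ON amo.actorid = ac.actorid
        JOIN movie mo ON amo.movieid = mo.movieid
        JOIN rating ra ON ra.ratingid = mo.ratingid
        WHERE ra.percent_csfd = (SELECT MAX(percent_csfd) FROM rating)
        ORDER BY ac.surname
    """).fetchall()

print(actors_in_top_rated(make_db()))
```

Because every table is joined in the outer query, any column from movie or rating can be added to the SELECT list directly, which is what the nested IN version could not do.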

SQL count by group

I have the following schema:
Movie(mvID, title, rating, year)
Director(directorID, firstname, lastname)
Genre(mvID*, genre)
Direct(mvID*, directorID*)
I need to find the director who directed the most movies of, say, the comedy genre and output their details along with the count of how many movies they made in that genre.
So I have
SELECT Director.DirectorID, Director.FirstName, Director.LastName, COUNT(*)
FROM Direct, Genre, Director
WHERE Direct.mvID = Genre.mvID
AND Genre.genre = 'Comedy'
AND Direct.DirectorID = Director.DirectorID
AND COUNT(*) > ALL
GROUP BY Director.DirectorID, Director.FirstName, Director.LastName
ORDER by COUNT(*) DESC
but I get a group function not allowed error.
The SQL below will solve your problem:
SELECT *
FROM (
SELECT Director.DirectorID, Director.FirstName, Director.LastName, COUNT(*) AS NumMovies
FROM Direct, Genre, Director
WHERE Direct.mvID = Genre.mvID
AND Genre.genre = 'Comedy'
AND Direct.DirectorID = Director.DirectorID
GROUP BY Director.DirectorID, Director.FirstName, Director.LastName
ORDER BY COUNT(*) DESC
)
WHERE rownum = 1
Specifically, the inner query joins the tables and counts the movies directed by each director for movies of genre 'Comedy', then sorts the rows in descending order of count. The outer query grabs the first row. The rownum filter has to sit in the outer query because Oracle assigns rownum before GROUP BY and ORDER BY are applied within the same query block.
Note that you have two many<->many relationships, so if you weren't only looking at comedy (i.e. if you ran this across multiple genres), you would probably have to use a different technique (i.e. multiple temp tables or similar virtual aggregates) in order to avoid double counting in your SQL.
Also note that if two directors have the same count of movies for this genre, only one row would be brought back. You would have to modify the SQL slightly if you wanted all directors with the same count of comedies to be returned.
Cheers,
Noah Wollowick
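select d.firstname, d.lastname, count(g.mvId)
from Director d
inner join Direct dr on d.directorId = dr.directorId
inner join Genre g on dr.mvId = g.mvId
where g.genre = 'Comedy'
group by d.directorId, d.firstname, d.lastname
having count(g.mvId) >= all (
select count(g2.mvId)
from Direct dr2
inner join Genre g2 on dr2.mvId = g2.mvId
where g2.genre = 'Comedy'
group by dr2.directorId
)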
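On the point about ties: a "first row only" query returns just one of several directors with the same top count. The sketch below (Python's sqlite3, in-memory database, invented rows) keeps every tied director by comparing each director's count with the maximum of the per-director counts. SQLite does not support the ALL quantifier, so MAX over a derived table is swapped in here as an equivalent condition; on a DBMS that has ALL, the same comparison can be written with >= ALL.

```python
import sqlite3

def make_db():
    """In-memory database with invented rows; directors 1 and 2 tie on comedies."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE Director (directorID INTEGER PRIMARY KEY, firstname TEXT, lastname TEXT);
        CREATE TABLE Direct (mvID INTEGER, directorID INTEGER);
        CREATE TABLE Genre (mvID INTEGER, genre TEXT);
        INSERT INTO Director VALUES (1, 'Ann', 'Lee'), (2, 'Bob', 'Ray'), (3, 'Cy', 'Fox');
        INSERT INTO Direct VALUES (1, 1), (2, 1), (3, 2), (4, 2), (5, 3);
        INSERT INTO Genre VALUES
            (1, 'Comedy'), (2, 'Comedy'), (3, 'Comedy'), (4, 'Comedy'),
            (5, 'Comedy'), (5, 'Drama');
    """)
    return conn

def top_comedy_directors(conn):
    """Return every director whose comedy count equals the highest comedy count."""
    return conn.execute("""
        SELECT d.directorID, d.firstname, d.lastname, COUNT(*) AS n
        FROM Director d
        JOIN Direct dr ON d.directorID = dr.directorID
        JOIN Genre g ON dr.mvID = g.mvID
        WHERE g.genre = 'Comedy'
        GROUP BY d.directorID, d.firstname, d.lastname
        HAVING COUNT(*) = (
            -- highest per-director comedy count (stand-in for >= ALL)
            SELECT MAX(c) FROM (
                SELECT COUNT(*) AS c
                FROM Direct dr2
                JOIN Genre g2 ON dr2.mvID = g2.mvID
                WHERE g2.genre = 'Comedy'
                GROUP BY dr2.directorID
            )
        )
        ORDER BY d.directorID
    """).fetchall()

print(top_comedy_directors(make_db()))
```

With the sample rows, both directors who made two comedies are returned, and the director with a single comedy is left out.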