A query by using count with multiple columns in SQL - sql

I am pretty new to SQL.
I have a movie database with the following tables and their columns:
Category Table
columns - category_id, name, last_update
Film_Category Table
columns - film_id, category_id, last_update
Inventory Table
columns - inventory_id, film_id, store_id, last_update
Rental Table
columns - rental_id, rental_date, inventory_id, customer_id, return_date, staff_id, last_update
Film Table
columns - film_id, title
Question/ Issue
I wish to create a query that lists each movie, the film category it is classified in, and how often it is rented. I wish to use the data from the five tables as much as possible.
I want the table to output the film title column, the category name column and the count of how many times it is rented out. The output should be something like this:
title name rental_count
Alter Victory Animation 10
Goofy Movie Animation 20
Help would really be appreciated for this task!

Use joins and the aggregate function COUNT:
select F.title,C.name,count(rental_id) as rental_count from Rental R
left join Inventory I on R.inventory_id=I.inventory_id
inner join Film_Category Fc on I.film_id=Fc.film_id
inner join Film F on F.film_id=Fc.film_id
inner join Category C on Fc.category_id=C.category_id
group by F.title,C.name

WITH film_rents AS
(
SELECT I.film_id, COUNT(1) AS rental_count
FROM Inventory AS I
INNER JOIN Rental AS R ON R.inventory_id = I.inventory_id
GROUP BY I.film_id)
SELECT F.title, ISNULL(rental_count, 0 ) AS rental_count, C.name
FROM Film AS F
LEFT JOIN film_rents AS FR ON F.film_id = FR.film_id
INNER JOIN Film_Category AS FC ON FC.film_id = F.film_id
INNER JOIN Category AS C ON C.category_id = FC.category_id
This does what you asked; however, I think what you really want is more than this. I say this because you have a junction table, Film_Category, which means one film can have one or more categories. In that case the query you asked for (and the query above) returns one row per film and category rather than one row per film. Assuming you are using SQL Server 2017 or later, you can use this:
WITH film_rents AS
(
SELECT I.film_id, COUNT(1) AS rental_count
FROM Inventory AS I
INNER JOIN Rental AS R ON R.inventory_id = I.inventory_id
GROUP BY I.film_id),
film_categories AS
(
SELECT FC.film_id, STRING_AGG(C.name, ',') AS categories
FROM Film_Category AS FC
INNER JOIN Category AS C ON C.category_id = FC.category_id
GROUP BY FC.film_id
)
SELECT F.title, ISNULL(rental_count, 0 ) AS rental_count, FC.categories AS [name]
FROM Film AS F
LEFT JOIN film_rents AS FR ON F.film_id = FR.film_id
INNER JOIN film_categories AS FC ON FC.film_id = F.film_id

Related

Postgres inner join or complicated selects?

Let's say I have the example below in the picture, with the intended result below the tables
So far, I am able to get the rental count with
SELECT inventory.film_id,
SUM((SELECT COUNT(*) FROM rental WHERE rental.inventory_id = inventory.inventory_id)) AS rentals,
(SELECT title FROM film WHERE film.film_id = inventory.film_id) AS title
FROM inventory
GROUP BY inventory.film_id
ORDER BY inventory.film_id
Result
film_id rentals title
1 23 Academy Dinosaur
2 7 Ace Goldfinger
. .. ................
I just cannot seem to figure out how to get, let's say, category.name by linking film_id to film_category and then to category. I have tried adding this code
INNER JOIN film_category ON inventory.film_id = film_category.film_id
but it just returns the same result, it doesn't join it. This wouldn't grab category.name anyways like I am needing.
Any help in understanding what the logic is behind this example would be awesome. thanks
Be careful here. A film can have many rentals and belong to many categories. So you want to join the number of rentals and the list of categories. This means, you should first aggregate your data and then join:
select f.film_id, f.title, c.categories, r.number_of_rentals
from film f
left join
(
select fc.film_id, string_agg(c.name, ', ' order by c.name) as categories
from film_category fc
join category c on c.category_id = fc.category_id
group by fc.film_id
) c on c.film_id = f.film_id
left join
(
select i.film_id, count(*) as number_of_rentals
from rental r
join inventory i on i.inventory_id = r.inventory_id
group by i.film_id
) r on r.film_id = f.film_id
order by f.film_id;
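The "aggregate first, then join" advice can be checked on a toy data set. The following is a minimal sketch, assuming an in-memory SQLite database driven from Python's sqlite3 module rather than the PostgreSQL database from the question; the rows are made up for illustration. It compares a join-everything count with a pre-aggregated count for a film that sits in two categories.

```python
import sqlite3

# Hypothetical data: one film, two categories, one inventory copy, three rentals.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE film (film_id INTEGER PRIMARY KEY, title TEXT);
CREATE TABLE category (category_id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE film_category (film_id INTEGER, category_id INTEGER);
CREATE TABLE inventory (inventory_id INTEGER PRIMARY KEY, film_id INTEGER);
CREATE TABLE rental (rental_id INTEGER PRIMARY KEY, inventory_id INTEGER);
INSERT INTO film VALUES (1, 'Academy Dinosaur');
INSERT INTO category VALUES (1, 'Animation'), (2, 'Family');
INSERT INTO film_category VALUES (1, 1), (1, 2);
INSERT INTO inventory VALUES (10, 1);
INSERT INTO rental VALUES (100, 10), (101, 10), (102, 10);
""")

# Joining everything first: each rental row is repeated once per category.
naive = conn.execute("""
SELECT f.title, COUNT(*) AS rentals
FROM film f
JOIN film_category fc ON fc.film_id = f.film_id
JOIN inventory i ON i.film_id = f.film_id
JOIN rental rt ON rt.inventory_id = i.inventory_id
GROUP BY f.film_id, f.title
""").fetchall()

# Aggregating rentals per film first, then joining the result to film.
pre_aggregated = conn.execute("""
SELECT f.title, r.number_of_rentals
FROM film f
LEFT JOIN (
    SELECT i.film_id, COUNT(*) AS number_of_rentals
    FROM rental rt
    JOIN inventory i ON i.inventory_id = rt.inventory_id
    GROUP BY i.film_id
) r ON r.film_id = f.film_id
""").fetchall()

print(naive)           # [('Academy Dinosaur', 6)]: 3 rentals x 2 categories
print(pre_aggregated)  # [('Academy Dinosaur', 3)]
```

The film was rented three times, but the join-everything version reports six because every rental row is paired with every category row of the same film.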
You can write simple query using JOIN
SELECT inventory.film_id, film.title, COUNT(*) AS rentals
FROM inventory
JOIN rental ON rental.inventory_id = inventory.inventory_id
JOIN film ON film.film_id = inventory.film_id
GROUP BY inventory.film_id, film.title
ORDER BY inventory.film_id;
If you need to get the film categories, use the following version:
SELECT
inventory.film_id,
film.title,
COUNT(DISTINCT rental_id) AS rentals,
ARRAY_AGG(DISTINCT category.name)
FROM inventory
JOIN rental ON rental.inventory_id = inventory.inventory_id
JOIN film ON film.film_id = inventory.film_id
JOIN film_category ON film_category.film_id = inventory.film_id
JOIN category ON film_category.category_id = category.category_id
GROUP BY inventory.film_id, film.title
ORDER BY inventory.film_id;

What is the most efficient way of selecting data from relational database?

I just started working with databases and
I have this data sample from PostgreSQL tutorial
https://www.postgresqltutorial.com/postgresql-sample-database/
Its diagram looks like this:
I want to find all film categories rented in for example Canada. Is there a way of doing it without using SELECT within SELECT.. statement like this:
SELECT * FROM category WHERE category_id IN (
SELECT category_id FROM film_category WHERE film_id IN (
SELECT film_id FROM film WHERE film_id IN (
SELECT film_id FROM inventory WHERE inventory_id IN (
SELECT inventory_id FROM rental WHERE staff_id IN (
SELECT staff_id FROM staff WHERE store_id IN (
SELECT store_id FROM store WHERE address_id IN (
SELECT address_id FROM address WHERE city_id IN (
SELECT city_id FROM city WHERE country_id IN (
SELECT country_id FROM country WHERE country IN ('Canada')
)
)
)
)
)
)
)
)
)
I'm sure there must be something that i'm missing.
The proper way is to use joins instead of all these nested subqueries:
select distinct c.category_id, c.name
from category c
inner join film_category fc on fc.category_id = c.category_id
inner join inventory i on i.film_id = fc.film_id
inner join rental r on r.inventory_id = i.inventory_id
inner join staff s on s.staff_id = r.staff_id
inner join store sr on sr.store_id = s.store_id
inner join address a on a.address_id = sr.address_id
inner join city ct on ct.city_id = a.city_id
inner join country cr on cr.country_id = ct.country_id
where cr.country = 'Canada'
For your requirement you must join 9 tables (1 less than your code because the table film is not really needed as the column film_id can link the tables film_category and inventory directly).
Notice the aliases for each table which shortens the code and makes it more readable and the ON clauses which are used to link each pair of tables.
Also the keyword DISTINCT is used so you don't get duplicates in the results because all these joins will return many rows for each category.
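The effect of DISTINCT on a join chain can be seen on a small example. This is a sketch only, run on an in-memory SQLite database through Python's sqlite3 module. To keep it short, it assumes a hypothetical `country` column stored directly on `rental`, standing in for the staff, store, address, city and country tables of the real schema; the rows are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE category (category_id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE film_category (film_id INTEGER, category_id INTEGER);
CREATE TABLE inventory (inventory_id INTEGER PRIMARY KEY, film_id INTEGER);
-- Assumption: country lives on rental here; the real schema needs five more joins.
CREATE TABLE rental (rental_id INTEGER PRIMARY KEY, inventory_id INTEGER, country TEXT);
INSERT INTO category VALUES (1, 'Action'), (2, 'Comedy');
INSERT INTO film_category VALUES (1, 1), (2, 2);
INSERT INTO inventory VALUES (10, 1), (20, 2);
INSERT INTO rental VALUES (100, 10, 'Canada'), (101, 10, 'Canada'), (102, 20, 'Brazil');
""")

# Shared join chain, shortened to four tables.
joins = """
FROM category c
INNER JOIN film_category fc ON fc.category_id = c.category_id
INNER JOIN inventory i ON i.film_id = fc.film_id
INNER JOIN rental r ON r.inventory_id = i.inventory_id
WHERE r.country = 'Canada'
"""
without_distinct = conn.execute("SELECT c.category_id, c.name " + joins).fetchall()
with_distinct = conn.execute("SELECT DISTINCT c.category_id, c.name " + joins).fetchall()

# The nested IN form from the question, shortened the same way.
nested = conn.execute("""
SELECT category_id, name FROM category WHERE category_id IN (
    SELECT category_id FROM film_category WHERE film_id IN (
        SELECT film_id FROM inventory WHERE inventory_id IN (
            SELECT inventory_id FROM rental WHERE country = 'Canada')))
""").fetchall()

print(without_distinct)  # [(1, 'Action'), (1, 'Action')]: one row per matching rental
print(with_distinct)     # [(1, 'Action')]
print(nested)            # [(1, 'Action')]
```

Without DISTINCT the join returns one row per matching rental; with it, the result matches what the nested IN version produces.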

Avoid a SQL subquery double join on the right element in PostgreSQL?

I am trying to optimize a SQL query and want to see how I can avoid subqueries when doing a second join on the resulting table. I have the following query from the DVD rental database provided by PostgreSQL and have joined three tables with the purpose of getting the category of the film. I know that I can use a CTE or temp table, but I was wondering if there was a shorter route to accomplish what is below:
--------get the category of a film
--------link film table to category id table with film id
--------then link resulting table to the category name table with category_id
SELECT
t1.title,
t1.film_id,
t1.category_id,
c.name
FROM
(
SELECT
f.title,
f.film_id,
fc.category_id
FROM
film as f
left join film_category as fc on f.film_id = fc.film_id
) as T1 left join category as c on t1.category_id = c.category_id
ORDER by title
I don't see why you have any subqueries at all:
SELECT f.title, f.film_id, fc.category_id, c.name
FROM film f LEFT JOIN
film_category fc
ON f.film_id = fc.film_id LEFT JOIN
category c
ON fc.category_id = c.category_id
ORDER by f.title

PostgreSQL - Which film is the most popular in category “Sports”?

I'm trying to answer a specific question "Which film is the most popular in category “Sports”?"
I've tried this
WITH CustomerRentalsPerStore AS
(
SELECT R.customer_id, I.category_id, COUNT (R.inventory_id) as rental_count
from rental AS R
INNER JOIN inventory AS I
on R.inventory_id = I.inventory_id
GROUP BY customer_id, I.category_id
--ORDER BY COUNT (R.inventory_id) desc
)
SELECT c.customer_id, c.first_name, c.last_name, cr.rental_count, cr.store_id
FROM Customer C
INNER JOIN CustomerRentalsPerStore CR
on C.customer_id = CR.customer_id
where cr.rental_count = (SELECT MAX(rental_count) FROM CustomerRentalsPerStore)
AND CR.category_id='Sports'
Here are the ER Diagrams:
Any help would be appreciated! Thank you
Based on what you clarified (you want the film with the highest number of rentals in the Sports category), I have the following untested SQL that should give you the result:
SELECT f.title, COUNT(*) AS RentalCount
FROM film f
INNER JOIN film_category fc ON fc.film_id = f.film_id
INNER JOIN category c ON c.category_id = fc.category_id
INNER JOIN inventory i ON i.film_id = f.film_id
INNER JOIN rental r ON r.inventory_id = i.inventory_id
WHERE (c.name = 'Sports')
GROUP BY f.title
ORDER BY 2 DESC;
This effectively gets the number of all rentals (COUNT) for all films in the Sports category. You obviously only want the first result so just limit the output to one row.
The code is untested but should point you in the correct direction.
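Limiting the output to one row could look like the sketch below. It runs the same join on an in-memory SQLite database via Python's sqlite3 module, with invented rows, and appends LIMIT 1 (which PostgreSQL also accepts). The film titles and counts are hypothetical, and ties between films are not handled.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE film (film_id INTEGER PRIMARY KEY, title TEXT);
CREATE TABLE category (category_id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE film_category (film_id INTEGER, category_id INTEGER);
CREATE TABLE inventory (inventory_id INTEGER PRIMARY KEY, film_id INTEGER);
CREATE TABLE rental (rental_id INTEGER PRIMARY KEY, inventory_id INTEGER);
INSERT INTO film VALUES (1, 'Film A'), (2, 'Film B'), (3, 'Film C');
INSERT INTO category VALUES (1, 'Sports'), (2, 'Drama');
INSERT INTO film_category VALUES (1, 1), (2, 1), (3, 2);
INSERT INTO inventory VALUES (10, 1), (20, 2), (30, 3);
INSERT INTO rental VALUES (100, 10), (101, 10), (102, 20),
                          (103, 30), (104, 30), (105, 30);
""")

# Count rentals per Sports film, sort by the count, keep only the top row.
top_sports_film = conn.execute("""
SELECT f.title, COUNT(*) AS RentalCount
FROM film f
INNER JOIN film_category fc ON fc.film_id = f.film_id
INNER JOIN category c ON c.category_id = fc.category_id
INNER JOIN inventory i ON i.film_id = f.film_id
INNER JOIN rental r ON r.inventory_id = i.inventory_id
WHERE c.name = 'Sports'
GROUP BY f.title
ORDER BY RentalCount DESC
LIMIT 1
""").fetchone()

print(top_sports_film)  # ('Film A', 2)
```

Film C has more rentals overall, but it is filtered out by the category condition before the counting happens.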
Another approach would be to limit the selection to as few rows as possible before going into the GROUP BY and COUNT(*) processing.
The optimizer might choose a better execution plan, but that will depend on indexing.
SELECT
film.title
, COUNT(*)
FROM (
SELECT
film.film_id
FROM
category
INNER JOIN
film_category
ON
category.category_id = film_category.category_id
INNER JOIN
film
ON
film_category.film_id = film.film_id
INNER JOIN
inventory
ON
film.film_id = inventory.film_id
INNER JOIN
rental
ON
inventory.inventory_id = rental.inventory_id
WHERE
category.name = 'Sports'
) AS alias
INNER JOIN
film
ON
alias.film_id = film.film_id
GROUP BY
film.title
ORDER BY
COUNT(*) DESC
LIMIT 1

WITH CTE query using postgresql

I am learning how to use SQL CTEs and I would like to write two queries that give the same answer (using PostgreSQL), but I have failed. Can someone help, please?
I create this query and I have the total of each film title (Sakila database):
SELECT COUNT(r.rental_id) rental_count,
f.title as "Film"
FROM film f
JOIN inventory i
ON f.film_id = i.film_id
JOIN rental r USING (inventory_id)
GROUP BY f.title
ORDER BY rental_count DESC;
I would like to do the same using the WITH (CTE) and for that I create this code :
WITH table1 AS (
SELECT f.film_id,
f.title as "Film"
FROM film f),
table2 AS (
SELECT r.inventory_id,
COUNT(r.rental_id) rental_count,
i.film_id,
i.inventory_id
FROM inventory i
JOIN rental r USING (inventory_id)
GROUP BY r.inventory_id, i.film_id, i.inventory_id)
SELECT *
FROM table1
JOIN table2
ON table1.film_id = table2.film_id;
The problem is that the result did not show the total of each film title, but instead every film title separately.
The second CTE would need to be grouped by film to produce an equivalent end result.
WITH table1 AS (
SELECT
f.film_id
, f.title AS "Film"
FROM film f
)
, table2 AS (
SELECT
COUNT(r.rental_id) rental_count
, i.film_id
FROM inventory i
JOIN rental r ON i.inventory_id = r.inventory_id
GROUP BY i.film_id
)
SELECT
table2.rental_count
, table1."Film"
FROM table1
JOIN table2 ON table1.film_id = table2.film_id
ORDER BY rental_count DESC;
Just a note: I would not recommend mixing JOIN ... USING and JOIN ... ON in a single query; it can get quite confusing.
SELECT
COUNT(r.rental_id) rental_count
, f.title AS "Film"
FROM film f
JOIN inventory i ON f.film_id = i.film_id
JOIN rental r ON i.inventory_id = r.inventory_id -- change here
GROUP BY f.title
ORDER BY rental_count DESC;
To get the same result, you'd have to aggregate and group in the second query just like in the first:
WITH table1 AS (...),
table2 AS (...)
SELECT sum(table2.rental_count) AS rental_count,
table1."Film"
FROM table1
JOIN table2 USING (film_id)
GROUP BY table1."Film"
ORDER BY rental_count DESC;
Basically you use the CTEs instead of the original tables.
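When the second CTE keeps one row per inventory copy, as in the question, the outer query has to add the per-copy counts together to get a per-film total. The sketch below illustrates the difference between summing and counting those rows. It assumes an in-memory SQLite database accessed through Python's sqlite3 module and made-up rows; `totals` is a hypothetical helper introduced only for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE film (film_id INTEGER PRIMARY KEY, title TEXT);
CREATE TABLE inventory (inventory_id INTEGER PRIMARY KEY, film_id INTEGER);
CREATE TABLE rental (rental_id INTEGER PRIMARY KEY, inventory_id INTEGER);
INSERT INTO film VALUES (1, 'Film A');
-- Two copies of the same film: copy 10 rented three times, copy 11 once.
INSERT INTO inventory VALUES (10, 1), (11, 1);
INSERT INTO rental VALUES (100, 10), (101, 10), (102, 10), (103, 11);
""")

# table2 is grouped per inventory copy, like the CTE in the question.
ctes = """
WITH table1 AS (
    SELECT f.film_id, f.title AS "Film" FROM film f
),
table2 AS (
    SELECT i.inventory_id, i.film_id, COUNT(r.rental_id) AS rental_count
    FROM inventory i
    JOIN rental r ON r.inventory_id = i.inventory_id
    GROUP BY i.inventory_id, i.film_id
)
"""

def totals(aggregate):
    """Hypothetical helper: apply SUM or COUNT to the per-copy counts."""
    return conn.execute(ctes + f"""
    SELECT table1."Film", {aggregate}(table2.rental_count) AS rental_count
    FROM table1
    JOIN table2 USING (film_id)
    GROUP BY table1."Film"
    """).fetchall()

summed = totals("SUM")
counted = totals("COUNT")
print(summed)   # [('Film A', 4)]: total rentals across both copies
print(counted)  # [('Film A', 2)]: number of copies that were rented
```

Summing reproduces the total from the original single query, while counting only reports how many inventory rows contributed.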