SQL - Get consecutively minimum numbers - sql

Title may not make sense so I will provide some context.
I have a table, call it Movies.
A movie tuple has the values: Name, Director, Genre, Year
I'm trying to write a query that returns all Directors who have never released two consecutive Horror films more than 4 years apart.
I'm not sure where to begin, but my idea is to start with a query that, given a specific year, returns the next smallest year, so that I can check whether the difference between the two is at most 4, and repeat that for all movies.
My attempt was:
SELECT D1.Director
FROM Movies D1
WHERE D1.Director NOT IN
(SELECT D2.Director FROM Director D2
WHERE D2.Director = D1.Director
AND D2.Genre = 'Horror'
AND D1.Genre = 'Horror' AND D2.Year - D1.Year > 4
OR D1.Year - D2.Year > 4)
which does not work for obvious reasons.
I've also made a few attempts using joins; they work for films that follow a pattern such as 2000, 2003, 2006, but fail when a director has more than 3 films.

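You could try this:
Restrict the data to Horror films and use LAG (or LEAD) to fetch each director's previous (or next) release year, then look at the difference between the two years.
WITH TempTable AS (
    SELECT
        Name,
        Director,
        Year,
        -- Previous Horror release year of the same director (NULL for the first film)
        LAG(Year) OVER (PARTITION BY Director ORDER BY Year ASC) AS PriorYear
    FROM
        Movies
    WHERE
        Genre = 'Horror'
)
SELECT
    Director
FROM
    TempTable
GROUP BY
    Director
-- Keep directors whose largest gap is at most 4 years;
-- COALESCE also keeps directors with a single Horror film
HAVING
    COALESCE(MAX(Year - PriorYear), 0) <= 4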
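A minimal sketch of the LAG idea, run against an in-memory SQLite database through Python's sqlite3 module. It partitions by director alone, so each Horror film is compared with that director's previous Horror film. The sample rows and the CTE name Gaps are made up for illustration, and window functions need SQLite 3.25 or newer.

```python
import sqlite3

# Hypothetical sample data; only the column names come from the question.
rows = [
    ("A1", "Alice", "Horror", 2000),
    ("A2", "Alice", "Horror", 2003),
    ("A3", "Alice", "Horror", 2006),
    ("A4", "Alice", "Comedy", 2020),  # not Horror, so it is ignored
    ("B1", "Bob", "Horror", 2000),
    ("B2", "Bob", "Horror", 2003),
    ("B3", "Bob", "Horror", 2010),    # 7-year gap after B2
    ("C1", "Cara", "Horror", 1999),   # only one Horror film
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Movies (Name TEXT, Director TEXT, Genre TEXT, Year INTEGER)")
conn.executemany("INSERT INTO Movies VALUES (?, ?, ?, ?)", rows)

query = """
WITH Gaps AS (
    SELECT Director,
           -- Gap to the same director's previous Horror film (NULL for the first one)
           Year - LAG(Year) OVER (PARTITION BY Director ORDER BY Year) AS Gap
    FROM Movies
    WHERE Genre = 'Horror'
)
SELECT Director
FROM Gaps
GROUP BY Director
HAVING COALESCE(MAX(Gap), 0) <= 4
ORDER BY Director
"""
directors = [r[0] for r in conn.execute(query)]
print(directors)  # ['Alice', 'Cara']
```

Bob is dropped because of the 2003 to 2010 gap. Whether a director with a single Horror film (Cara) should count is a judgment call; removing the COALESCE would exclude such directors.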

Try this:
SELECT director FROM (
    -- Gap from each horror film to the same director's next later horror film
    SELECT m1.director, m1.name, MIN(m2.year - m1.year) AS diff
    FROM movies m1
    JOIN movies m2 ON m1.director = m2.director AND m1.year < m2.year
    WHERE m1.genre = 'Horror' AND m2.genre = 'Horror'
    GROUP BY m1.director, m1.name
) d1
GROUP BY director
HAVING MAX(diff) <= 4
The inner SELECT pairs each of a director's horror movies with that director's later horror movies and keeps the minimum year difference per movie, which is the gap to the next consecutive film. The outer query then keeps the directors whose largest such gap is at most 4 years. Directors with only one horror movie produce no pairs and do not appear in the result.
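For the pairwise approach, the minimum has to be taken per film rather than per director, otherwise one short gap hides a long one. The sketch below, with made-up rows and a hypothetical column alias gap_to_next, lists the gap from each horror film to the same director's next later horror film, using SQLite through Python's sqlite3 module.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE movies (name TEXT, director TEXT, genre TEXT, year INTEGER)")
# Hypothetical rows; Bob has two films in the same year followed by a long pause.
conn.executemany(
    "INSERT INTO movies VALUES (?, ?, ?, ?)",
    [
        ("A1", "Alice", "Horror", 2000),
        ("A2", "Alice", "Horror", 2004),
        ("B1", "Bob", "Horror", 2000),
        ("B2", "Bob", "Horror", 2000),
        ("B3", "Bob", "Horror", 2010),
    ],
)

next_gaps = conn.execute(
    """
    SELECT m1.director, m1.name, MIN(m2.year - m1.year) AS gap_to_next
    FROM movies m1
    JOIN movies m2
      ON m2.director = m1.director
     AND m2.year > m1.year          -- strictly later films only
    WHERE m1.genre = 'Horror' AND m2.genre = 'Horror'
    GROUP BY m1.director, m1.name
    ORDER BY m1.director, m1.name
    """
).fetchall()
print(next_gaps)  # [('Alice', 'A1', 4), ('Bob', 'B1', 10), ('Bob', 'B2', 10)]
```

The strict comparison on year means Bob's two films from 2000 are each measured against the 2010 film instead of against each other, so the 10-year gap stays visible.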

Related

Selecting and comparing the averages of multiple SQL tables

I'm building an NBA database with random data for one team (for example the Lakers). I have one table per season, and each season table holds the game number, player name, points, assists, rebounds, steals and blocks. What I want, and don't know how to express in one or more SQL statements, is this: select the average points, assists, rebounds, etc. of ONE player in ONE season where that player averaged, for example, more than 25 points, 5 assists and 5 rebounds in that SINGLE season. If a player did this in multiple seasons, that player's averages should appear once per qualifying season.
Written in postgres, the CTE is a UNION of all your different tables, with calculated averages using each respective table. Then you simply select from the CTE.
with seasons as (
select '2018' as season, player,
avg(points) as avg_points,
avg(assists) as avg_assists,
avg(rebounds) as avg_rebounds,
avg(steals) as avg_steals,
avg(blocks) as avg_blocks
from table1
group by 1, 2
union
select '2019' as season, player,
avg(points) as avg_points,
avg(assists) as avg_assists,
avg(rebounds) as avg_rebounds,
avg(steals) as avg_steals,
avg(blocks) as avg_blocks
from table2
group by 1, 2
union
select '2020' as season, player,
avg(points) as avg_points,
avg(assists) as avg_assists,
avg(rebounds) as avg_rebounds,
avg(steals) as avg_steals,
avg(blocks) as avg_blocks
from table3
group by 1, 2
)
select *
from seasons
where avg_points > 25
and avg_assists > 5
and avg_rebounds > 5
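A small runnable sketch of the same idea on SQLite through Python's sqlite3 module. The table names season_2018 and season_2019, the player names and the numbers are all hypothetical, and only three of the stat columns are included to keep it short. Because each branch already groups by player, the branches cannot produce duplicate rows, so UNION ALL is used here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical per-season tables, one row per player per game.
for table in ("season_2018", "season_2019"):
    conn.execute(
        f"CREATE TABLE {table} "
        "(game INTEGER, player TEXT, points REAL, assists REAL, rebounds REAL)"
    )

conn.executemany("INSERT INTO season_2018 VALUES (?, ?, ?, ?, ?)", [
    (1, "Player A", 30, 6, 7),
    (2, "Player A", 28, 8, 5),
    (1, "Player B", 12, 3, 4),
])
conn.executemany("INSERT INTO season_2019 VALUES (?, ?, ?, ?, ?)", [
    (1, "Player A", 20, 4, 6),
    (1, "Player B", 31, 6, 9),
    (2, "Player B", 27, 6, 7),
])

query = """
WITH seasons AS (
    SELECT '2018' AS season, player,
           AVG(points) AS avg_points,
           AVG(assists) AS avg_assists,
           AVG(rebounds) AS avg_rebounds
    FROM season_2018
    GROUP BY player
    UNION ALL
    SELECT '2019', player, AVG(points), AVG(assists), AVG(rebounds)
    FROM season_2019
    GROUP BY player
)
SELECT season, player, avg_points, avg_assists, avg_rebounds
FROM seasons
WHERE avg_points > 25 AND avg_assists > 5 AND avg_rebounds > 5
ORDER BY season, player
"""
result = conn.execute(query).fetchall()
print(result)
# [('2018', 'Player A', 29.0, 7.0, 6.0), ('2019', 'Player B', 29.0, 6.0, 8.0)]
```

The column names of a UNION come from its first SELECT, which is why the aliases in that branch must match the names used in the final WHERE clause.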

Getting top 10 most popular within an array column using SQL UNNEST

I am working with a sample data set which gives the following result:
Continuing to work, I am now trying to get the top 10 Production Companies (based on "production_companies" field) that made the most number of movies in the most popular genre for a year.
The output
Rank | Production Company | Popular Genre | Movie Count
I thought breaking this down to getting the most popular genre for the year would be the 1st step with the following query:
select
genres.name AS _genre,
FROM
commons.movies m,
UNNEST(m.genres) as genres
WHERE
SUBSTR(m.release_date, 1, 4) = '2008'
GROUP BY
genres.name
ORDER BY
COUNT(genres.name) DESC
LIMIT
1
I have now got 'Drama' as the most popular genre for the year 2008.
Getting the most popular production companies and their movie counts has been more challenging, and my attempts have failed several times.
After several tries I have got to:
select
o_prd_cmp.name,
o_mov.title
from
commons.movies o_mov,
unnest(o_mov.genres) as o_gnr,
UNNEST(o_mov.production_companies) AS o_prd_cmp
where
SUBSTR(o_mov.release_date, 1, 4) = '2008'
AND o_gnr.name = (
select
genres.name AS _genre,
FROM
commons.movies m,
UNNEST(m.genres) as genres
WHERE
SUBSTR(m.release_date, 1, 4) = '2008'
GROUP BY
genres.name
ORDER BY
COUNT(genres.name) DESC
LIMIT
1
)
Any help with this is greatly appreciated.

List all directors who were very prolific in their first 10 years of activity

List all directors who were very prolific in their first 10 years of activity, i.e., they directed at least one movie each year, in the 10 years after their first movie (i.e., the year of their first movie counts as the first of 10.)
Movie(title, year, genre, budget, gross)
Director(name, country, YofB)
Actor(name, country, YofB)
Producer(name, country, YofB)
DirectorMovie(d_name, m_title, m_year)
ActorMovie(a_name, m_title, m_year)
ProducerMovie(p_name, m_title, m_year)
Attribute genre in table Movie has as value one of {“comedy”, “drama”, “tragedy”, “musical”, “horror”}.
You can filter the rows of the table DirectorMovie, so there are only the rows of the first 10 years of each director, then group by the director's name and set a condition in the having clause where you count only the distinct years:
select dm.d_name
from DirectorMovie dm
where dm.m_year < (select min(m_year) from DirectorMovie where d_name = dm.d_name) + 10
group by dm.d_name
having count(distinct dm.m_year) = 10
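The filter-then-count idea can be checked with a small sketch on SQLite through Python's sqlite3 module. The directors Dana and Eli and their films are invented for illustration, and the alias f for the inner table is an assumption. Grouping is on d_name, the column the DirectorMovie schema actually defines.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE DirectorMovie (d_name TEXT, m_title TEXT, m_year INTEGER)")

# Hypothetical data: Dana releases a film in each of her first 10 years.
rows = [("Dana", f"D{y}", y) for y in range(1990, 2000)]
# Eli also has 10 films in his first 10 years, but two share 1990 and 1999 is empty.
rows += [("Eli", f"E{y}", y) for y in range(1990, 1999)]
rows += [("Eli", "E1990b", 1990), ("Eli", "E2000", 2000)]
conn.executemany("INSERT INTO DirectorMovie VALUES (?, ?, ?)", rows)

query = """
SELECT dm.d_name
FROM DirectorMovie dm
-- Keep only films from the director's first 10 years (first year counts as year 1)
WHERE dm.m_year < (SELECT MIN(f.m_year)
                   FROM DirectorMovie f
                   WHERE f.d_name = dm.d_name) + 10
GROUP BY dm.d_name
-- Ten distinct years inside a 10-year window means no year was skipped
HAVING COUNT(DISTINCT dm.m_year) = 10
"""
prolific = [r[0] for r in conn.execute(query)]
print(prolific)  # ['Dana']
```

Eli shows why DISTINCT matters: counting rows would give 10 for him as well, even though he released nothing in 1999.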

SQL MAx function with multiple columns showing in apex?

I am trying to write an SQL query to show the year, player name, and PPG of the player with the highest PPG from each year in our database.
We have a Players table with all the stats, and a team table with stats linked to each season as a team total.
What I want to do is get the highest scorer from each season so:
2010: Jake 10 ppg
2011: Jake 12 ppg
2012: Carl 13 ppg
Etc.
Here is my current query:
SELECT Year, PlayerName, MAX(PPG) AS PPG
FROM PLAYERS_T, TEAM_T
GROUP BY Year
ORDER BY PPG;
However, this is not working. What do I need to do to make it work?
This should work, but it will return more than one row for a year if several players share the same top PPG. I don't know what the Team table is used for there.
WITH PLAYERS_T as (
SELECT 2010 "Year", 'Jake' "PlayerName", 10 ppg
UNION
SELECT 2011 "Year", 'Jake' "PlayerName", 12 ppg
UNION
SELECT 2012 "Year", 'Carl' "PlayerName", 13 ppg
)
SELECT T1."Year", T1."PlayerName", T1.PPG
FROM PLAYERS_T T1
LEFT JOIN PLAYERS_T T2
ON T1."Year" = T2."Year"
AND T1.PPG < T2.PPG
WHERE T2."Year" IS NULL
Try this one:
SELECT players_T.playername, players_T.ppg, players_T.year
FROM
(SELECT year, MAX(PPG) AS mx
FROM players_T
GROUP BY year) sub
INNER JOIN players_T ON sub.mx = players_T.ppg
WHERE sub.year = players_T.year
ORDER BY players_T.year
In the subquery, this finds the max ppg per year. Then we join with the players table on the ppg to find the player name. The result should be the player name, ppg and year together. Let me know what you find!
Edit: Need to include a WHERE clause for year
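A runnable sketch of the per-year maximum joined back to the players table, on SQLite through Python's sqlite3 module. The first three rows come from the question's expected output; the other rows, including the 2011 tie, are made up to show how ties behave. The alias p and the lowercase table name are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE players_t (year INTEGER, playername TEXT, ppg INTEGER)")
conn.executemany(
    "INSERT INTO players_t VALUES (?, ?, ?)",
    [
        (2010, "Jake", 10),
        (2011, "Jake", 12),
        (2012, "Carl", 13),
        (2010, "Carl", 8),   # hypothetical extra rows
        (2011, "Carl", 12),  # ties with Jake in 2011
        (2012, "Jake", 9),
    ],
)

query = """
SELECT p.year, p.playername, p.ppg
FROM players_t p
JOIN (SELECT year, MAX(ppg) AS mx
      FROM players_t
      GROUP BY year) sub
  ON sub.year = p.year   -- match the season
 AND sub.mx = p.ppg      -- and the season's top PPG
ORDER BY p.year, p.playername
"""
top_scorers = conn.execute(query).fetchall()
print(top_scorers)
# [(2010, 'Jake', 10), (2011, 'Carl', 12), (2011, 'Jake', 12), (2012, 'Carl', 13)]
```

Both players appear for 2011 because they share the top PPG. If exactly one row per year is required, a tie-breaking rule has to be chosen explicitly.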

PostgreSQL show latest date and one value of a column from many

I have a problem with a query that I can't figure out. I have tried for some time without success and would appreciate any help. I have 4 tables:
cars - ID, make, model, plate_number, price, type, year, owner_ID
persons - ID, name, surname, pers_code
insurance_data - company_ID, car_ID, first_date, last_date
companies - ID, title
My query so far is..
SELECT cars.plate_number, persons.name, persons.surname, insurance_data.last_date
FROM cars,persons,insurance_data
WHERE cars.owner_ID = persons.ID AND cars.ID = insurance_data.car_ID
This outputs each car's plate number, its owner, and the last date of the car's insurance. The problem is that two cars have two insurance end dates each, so the output contains two entries for the same car, one per end date. What I need is a single entry per car, showing the latest insurance end date.
I know this is pretty basic, but I'm a first-year database student and this is one of my first assignments. Thanks in advance.
(1) Never use commas in the FROM clause. Always use proper, explicit JOIN syntax.
(2) Use table aliases!
The answer to your question is DISTINCT ON:
SELECT DISTINCT ON (c.plate_number) c.plate_number, p.name, p.surname, id.last_date
FROM cars c JOIN
persons p
ON c.owner_ID = p.ID JOIN
insurance_data id
ON c.ID = id.car_ID
ORDER BY c.plate_number, id.last_date DESC;
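DISTINCT ON is specific to PostgreSQL. As a portable alternative (a different technique, not the query above), the sketch below gets the same one-row-per-car result with GROUP BY and MAX, run on SQLite through Python's sqlite3 module. The tables carry only the columns the query needs, and all rows, plate numbers and names are invented. Dates are stored as ISO-8601 text, which sorts correctly as strings.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE persons (ID INTEGER, name TEXT, surname TEXT);
CREATE TABLE cars (ID INTEGER, plate_number TEXT, owner_ID INTEGER);
CREATE TABLE insurance_data (car_ID INTEGER, first_date TEXT, last_date TEXT);
""")
# Hypothetical rows; car 1 has two insurance periods.
conn.executemany("INSERT INTO persons VALUES (?, ?, ?)",
                 [(1, "Anna", "Berg"), (2, "Janis", "Ozols")])
conn.executemany("INSERT INTO cars VALUES (?, ?, ?)",
                 [(1, "AA-1001", 1), (2, "BB-2002", 2)])
conn.executemany("INSERT INTO insurance_data VALUES (?, ?, ?)", [
    (1, "2019-01-01", "2019-12-31"),
    (1, "2020-01-01", "2020-12-31"),
    (2, "2020-06-01", "2021-05-31"),
])

query = """
SELECT c.plate_number, p.name, p.surname, MAX(i.last_date) AS last_date
FROM cars c
JOIN persons p ON c.owner_ID = p.ID
JOIN insurance_data i ON c.ID = i.car_ID
GROUP BY c.plate_number, p.name, p.surname  -- one group per car and owner
ORDER BY c.plate_number
"""
latest = conn.execute(query).fetchall()
print(latest)
# [('AA-1001', 'Anna', 'Berg', '2020-12-31'), ('BB-2002', 'Janis', 'Ozols', '2021-05-31')]
```

GROUP BY with MAX is enough when only the latest date is needed. DISTINCT ON remains handy in PostgreSQL when other columns of the latest insurance row, such as the company, must come along too.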