Is there an easier way to find the row with a max value? - sql

I have a schema where these two tables exist (among others)
participation
+-------+--------+------------------+
| movie | person | role             |
+-------+--------+------------------+
| 1     | 1      | "Regisseur"      |
| 1     | 1      | "Schauspieler"   |
| 1     | 2      | "Schauspielerin" |
| 2     | 3      | "Regisseur"      |
| 3     | 4      | "Regisseur"      |
| 3     | 5      | "Schauspieler"   |
| 3     | 6      | "Schauspieler"   |
| 4     | 7      | "Schauspielerin" |
| 4     | 8      | "Schauspieler"   |
| 5     | 1      | "Schauspieler"   |
| 5     | 8      | "Schauspieler"   |
| 5     | 14     | "Schauspieler"   |
+-------+--------+------------------+
movie
+----+------------------------------+------+-----+
| id | title                        | year | fsk |
+----+------------------------------+------+-----+
| 1  | "Die Bruecke am Fluss"       | 1995 | 12  |
| 2  | "101 Dalmatiner"             | 1961 | 0   |
| 3  | "Vernetzt - Johnny Mnemonic" | 1995 | 16  |
| 4  | "Waehrend Du schliefst..."   | 1995 | 6   |
| 5  | "Casper"                     | 1995 | 6   |
| 6  | "French Kiss"                | 1995 | 6   |
| 7  | "Stadtgespraech"             | 1995 | 12  |
| 8  | "Apollo 13"                  | 1995 | 6   |
| 9  | "Schlafes Bruder"            | 1995 | 12  |
| 10 | "Assassins - Die Killer"     | 1995 | 16  |
| 11 | "Braveheart"                 | 1995 | 16  |
| 12 | "Das Netz"                   | 1995 | 12  |
| 13 | "Free Willy 2"               | 1995 | 6   |
+----+------------------------------+------+-----+
I want to get the movie with the highest number of people that participated. I figured out an SQL statement that actually does this, but it looks super complicated. It looks like this:
SELECT title
FROM   movie.movie
       JOIN (SELECT *
             FROM   (SELECT Max(count_person) AS max_count_person
                     FROM   (SELECT movie,
                                    Count(person) AS count_person
                             FROM   movie.participation
                             GROUP  BY movie) AS countPersons) AS maxCountPersons
                    JOIN (SELECT movie,
                                 Count(person) AS count_person
                          FROM   movie.participation
                          GROUP  BY movie) AS countPersons
                      ON maxCountPersons.max_count_person = countPersons.count_person)
            AS maxPersonsmovie
         ON maxPersonsmovie.movie = movie.id
The main problem is that I can't find an easier way to select the row with the highest value. If I could simply select from the inner table and pick the row with the highest count_person without losing the information about the movie itself, this would look much simpler. Is there a way to simplify this, or is this really the easiest way to do it?

Here is a way without subqueries:
SELECT m.title
FROM movie.movie m JOIN
movie.participation p
ON m.id = p.movie
GROUP BY m.title
ORDER BY COUNT(*) DESC
FETCH FIRST 1 ROW ONLY;
You can use LIMIT 1 instead of FETCH, if you prefer.
Note: In the event of ties, this only returns one value. That seems consistent with your question.

You can use the rank() window function to do this; it returns every movie tied for the highest count.
SELECT title
FROM (SELECT m.title,rank() over(order by count(p.person) desc) as rnk
FROM movie.movie m
LEFT JOIN movie.participation p ON m.id=p.movie
GROUP BY m.title
) t
WHERE rnk=1
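
To see how the two approaches differ on ties, here is a minimal sketch that loads the sample rows into an in-memory SQLite database from Python. The SQLite setup, the dropped `movie.` schema prefix, and the `counts` CTE are assumptions made for the demonstration (window functions need SQLite 3.25 or later); the logic mirrors the LIMIT and rank() queries above.

```python
import sqlite3

# In-memory SQLite database; the "movie." schema prefix is dropped (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE movie (id INTEGER PRIMARY KEY, title TEXT);
CREATE TABLE participation (movie INTEGER, person INTEGER, role TEXT);
INSERT INTO movie VALUES
  (1, 'Die Bruecke am Fluss'), (2, '101 Dalmatiner'),
  (3, 'Vernetzt - Johnny Mnemonic'), (4, 'Waehrend Du schliefst...'),
  (5, 'Casper');
INSERT INTO participation VALUES
  (1, 1, 'Regisseur'), (1, 1, 'Schauspieler'), (1, 2, 'Schauspielerin'),
  (2, 3, 'Regisseur'),
  (3, 4, 'Regisseur'), (3, 5, 'Schauspieler'), (3, 6, 'Schauspieler'),
  (4, 7, 'Schauspielerin'), (4, 8, 'Schauspieler'),
  (5, 1, 'Schauspieler'), (5, 8, 'Schauspieler'), (5, 14, 'Schauspieler');
""")

# ORDER BY ... LIMIT 1 keeps a single row, even when several movies tie.
top_one = conn.execute("""
SELECT m.title
FROM movie m JOIN participation p ON m.id = p.movie
GROUP BY m.id, m.title
ORDER BY COUNT(*) DESC
LIMIT 1
""").fetchall()

# RANK() assigns 1 to every movie that shares the highest count.
ranked = conn.execute("""
WITH counts AS (
  SELECT m.title, COUNT(p.person) AS cnt
  FROM movie m LEFT JOIN participation p ON m.id = p.movie
  GROUP BY m.id, m.title
)
SELECT title
FROM (SELECT title, RANK() OVER (ORDER BY cnt DESC) AS rnk FROM counts) t
WHERE rnk = 1
ORDER BY title
""").fetchall()

print(top_one)
print(ranked)
```

In the sample data, movies 1, 3 and 5 each have three participation rows, so the rank() variant returns three titles while the LIMIT variant returns just one of them.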

SELECT title
FROM movie.movie
WHERE id = (SELECT movie
FROM movie.participation
GROUP BY movie
ORDER BY count(*) DESC
LIMIT 1);

Related

How to get data from previous year?

Here is my sample data.
I need to fill the Hello column with the Plus value from the previous period, using lag.
Could you help me?
+------+--------+------+-------+
| Year | Animal | Plus | Hello |
+------+--------+------+-------+
| 2    | Cat    | 3    |       |
| 2    | Dog    | 4    |       |
| 2    | Mouse  | 5    |       |
| 3    | Cat    | 5    | 3     |
| 3    | Dog    | 6    | 4     |
| 3    | Mouse  | 6    | 5     |
| 3    | Horse  | 6    |       |
| 3    | Pig    | 6    |       |
| 3    | Goose  | 6    |       |
| 4    | Cat    |      | 5     |
| 4    | Dog    |      | 6     |
| 4    | Mouse  |      | 6     |
| 4    | Horse  |      | 6     |
| 4    | Pig    |      | 6     |
+------+--------+------+-------+
You are looking for LAG. This function looks into previous rows.
select
place, year, animal, plus,
lag(plus) over (partition by animal order by year) as hello
from mytable
order by year, animal;
The "previous" row is the closest previous one, i.e. if for 'Goose' there are rows for years 3 and 5 and none for year 4, then year 3 is considered the previous row for year 5 and LAG shows that value.
If you really want the adjacent previous year, i.e. year - 1, then you can select this year as follows:
select
place, year, animal, plus,
(
select plus
from mytable prev_year
where prev_year.animal = mytable.animal
and prev_year.year = mytable.year - 1
) as hello
from mytable
order by year, animal;
Same thing with an outer join:
select
t.place, t.year, t.animal, t.plus, prev_year.plus as hello
from mytable t
left join mytable prev_year on prev_year.animal = t.animal
and prev_year.year = t.year - 1
order by t.year, t.animal;
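
The difference between LAG and the year - 1 lookup only shows up when a year is missing. The following sketch runs both on a tiny in-memory SQLite table from Python; the SQLite setup, the omitted `place` column, and the Goose row for year 5 are assumptions added for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE mytable (year INTEGER, animal TEXT, plus INTEGER);
-- Goose has rows for years 3 and 5 but none for year 4 (hypothetical data).
INSERT INTO mytable VALUES
  (2, 'Cat', 3), (3, 'Cat', 5), (3, 'Goose', 6), (5, 'Goose', 7);
""")

# LAG takes the closest earlier row per animal, whatever its year is.
lag_rows = conn.execute("""
SELECT year, animal, plus,
       LAG(plus) OVER (PARTITION BY animal ORDER BY year) AS hello
FROM mytable
ORDER BY year, animal
""").fetchall()

# The outer self-join only matches a row from exactly year - 1.
join_rows = conn.execute("""
SELECT t.year, t.animal, t.plus, prev_year.plus AS hello
FROM mytable t
LEFT JOIN mytable prev_year ON prev_year.animal = t.animal
                           AND prev_year.year = t.year - 1
ORDER BY t.year, t.animal
""").fetchall()

print(lag_rows)
print(join_rows)
```

For Goose in year 5, LAG reports the year 3 value (6), whereas the self-join leaves hello empty because there is no year 4 row.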

Get Field Hierarchy

I have the following tables and I want to get the quantity of users by country:
Users
+--------+------+
| user | zone |
+--------+------+
| Paul | 7 |
+--------+------+
| John | 5 |
+--------+------+
| Peter | 6 |
+--------+------+
| Frank | 5 |
+--------+------+
| Silvia | 2 |
+--------+------+
| Carl | 4 |
+--------+------+
| Mark | 3 |
+--------+------+
Regions
+---------+----------------+----------+
| zone_id | zone_name      | idUpzone |
+---------+----------------+----------+
| 1       | Global         | null     |
+---------+----------------+----------+
| 2       | US             | 1        |
+---------+----------------+----------+
| 3       | Florida        | 2        |
+---------+----------------+----------+
| 4       | Orlando        | 3        |
+---------+----------------+----------+
| 5       | China          | 1        |
+---------+----------------+----------+
| 6       | Orlando Sector | 4        |
+---------+----------------+----------+
| 7       | Beijing        | 5        |
+---------+----------------+----------+
so I get something like this
+---------+-----+
| Country | QTY |
+---------+-----+
| US | 4 |
+---------+-----+
| China | 3 |
+---------+-----+
Use a recursive CTE to get the highest level and then join:
with cte as (
select zone_id, zone_id as top_zone_id, zone_name as top_zone_name, 1 as lev
from regions
where idUpzone = 1
union all
select r.zone_id, cte.top_zone_id, top_zone_name, lev + 1
from cte join
regions r
on r.idUpzone = cte.zone_id
)
select cte.top_zone_name, count(*)
from users u join
cte
on u.zone = cte.zone_id
group by cte.top_zone_name;
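
As a check against the sample data, here is a sketch that runs the recursive CTE in an in-memory SQLite database from Python. It assumes the parent column is idUpzone, as in the Regions table, and uses SQLite's `WITH RECURSIVE` spelling; the `lev` column is left out because the final count does not need it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE users (user TEXT, zone INTEGER);
CREATE TABLE regions (zone_id INTEGER PRIMARY KEY, zone_name TEXT, idUpzone INTEGER);
INSERT INTO users VALUES
  ('Paul', 7), ('John', 5), ('Peter', 6), ('Frank', 5),
  ('Silvia', 2), ('Carl', 4), ('Mark', 3);
INSERT INTO regions VALUES
  (1, 'Global', NULL), (2, 'US', 1), (3, 'Florida', 2), (4, 'Orlando', 3),
  (5, 'China', 1), (6, 'Orlando Sector', 4), (7, 'Beijing', 5);
""")

# Anchor: zones directly under Global (the countries).
# Recursive step: attach each child zone to its ancestor country.
country_counts = conn.execute("""
WITH RECURSIVE cte(zone_id, top_zone_name) AS (
  SELECT zone_id, zone_name
  FROM regions
  WHERE idUpzone = 1
  UNION ALL
  SELECT r.zone_id, cte.top_zone_name
  FROM cte JOIN regions r ON r.idUpzone = cte.zone_id
)
SELECT cte.top_zone_name, COUNT(*)
FROM users u JOIN cte ON u.zone = cte.zone_id
GROUP BY cte.top_zone_name
ORDER BY cte.top_zone_name
""").fetchall()

print(country_counts)
```

This yields 3 users for China and 4 for US, matching the expected output in the question.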
Try this out:
SELECT
r.zone_name AS Country, COUNT(*) AS QTY
FROM users u
INNER JOIN regions r ON u.zone = r.zone_id
GROUP BY r.zone_name
Note that this counts users per directly assigned zone; it does not roll child zones up to their country.

Get the Id of the matched data from other table. No duplicates of ID from both tables

Here is my table A.
| Id | GroupId | StoreId | Amount |
| 1 | 20 | 7 | 15000 |
| 2 | 20 | 7 | 1230 |
| 3 | 20 | 7 | 14230 |
| 4 | 20 | 7 | 9540 |
| 5 | 20 | 7 | 24230 |
| 6 | 20 | 7 | 1230 |
| 7 | 20 | 7 | 1230 |
Here is my table B.
| Id | GroupId | StoreId | Credit |
| 12 | 20 | 7 | 1230 |
| 14 | 20 | 7 | 15000 |
| 15 | 20 | 7 | 14230 |
| 16 | 20 | 7 | 1230 |
| 17 | 20 | 7 | 7004 |
| 18 | 20 | 7 | 65523 |
I want to get this result without duplicate Ids from either table.
I need the Ids from tables A and B where Amount = Credit.
| A.ID | B.ID | Amount |
| 1 | 14 | 15000 |
| 2 | 12 | 1230 |
| 3 | 15 | 14230 |
| 4 | null | 9540 |
| 5 | null | 24230 |
| 6 | 16 | 1230 |
| 7 | null | 1230 |
My problem is that when table A has two or more rows with the same Amount, I get duplicate Ids from table B where the value should be null. Please help me. Thank you.
I think you want a left join. But this is tricky because you have duplicate amounts, but you only want one to match. The solution is to use row_number():
select . . .
from (select a.*, row_number() over (partition by amount order by id) as seqnum
from a
) a left join
(select b.*, row_number() over (partition by credit order by id) as seqnum
from b
)b
on a.amount = b.credit and a.seqnum = b.seqnum;
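
To show the pairing at work, this sketch runs the row_number() join on the sample rows in an in-memory SQLite database from Python. The SQLite setup and the explicit select list are assumptions, since the original query elides its columns.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE a (id INTEGER, groupid INTEGER, storeid INTEGER, amount INTEGER);
CREATE TABLE b (id INTEGER, groupid INTEGER, storeid INTEGER, credit INTEGER);
INSERT INTO a VALUES
  (1, 20, 7, 15000), (2, 20, 7, 1230), (3, 20, 7, 14230), (4, 20, 7, 9540),
  (5, 20, 7, 24230), (6, 20, 7, 1230), (7, 20, 7, 1230);
INSERT INTO b VALUES
  (12, 20, 7, 1230), (14, 20, 7, 15000), (15, 20, 7, 14230),
  (16, 20, 7, 1230), (17, 20, 7, 7004), (18, 20, 7, 65523);
""")

# Number the rows within each amount on both sides, then pair the n-th
# occurrence in A with the n-th occurrence in B, so each B id is used once.
matched = conn.execute("""
SELECT a.id, b.id, a.amount
FROM (SELECT a.*, ROW_NUMBER() OVER (PARTITION BY amount ORDER BY id) AS seqnum
      FROM a) a
LEFT JOIN
     (SELECT b.*, ROW_NUMBER() OVER (PARTITION BY credit ORDER BY id) AS seqnum
      FROM b) b
  ON a.amount = b.credit AND a.seqnum = b.seqnum
ORDER BY a.id
""").fetchall()

for row in matched:
    print(row)
```

The third 1230 row in A (Id 7) has no third 1230 row in B to pair with, so its B id comes back as null, as in the desired result.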
Another approach, I think simpler and shorter :) Note that it returns the same B.ID for every row in A that shares an Amount.
select ID [A.ID],
(select top 1 ID from TABLE_B where Credit = A.Amount) [B.ID],
Amount
from TABLE_A [A]

How to structure a proper SQL subquery?

I'm trying to wrap my head around how to do a proper subquery; it's not making sense to me. Let's say I have two tables, books and chapters:
Books
+----+------------------+----------+---------------------+
| id | name | author | last_great_chapters |
+----+------------------+----------+---------------------+
| 1 | some book title | john doe | 2 |
| 2 | foo novel title | some guy | 4 |
| 3 | other book title | lol man | 3 |
+----+------------------+----------+---------------------+
Chapters
+----+---------+----------------+
| id | book_id | chapter_number |
+----+---------+----------------+
| 1 | 1 | 1 |
| 2 | 1 | 3 |
| 3 | 1 | 4 |
| 4 | 1 | 5 |
| 5 | 2 | 1 |
| 6 | 2 | 2 |
| 7 | 2 | 3 |
| 8 | 2 | 4 |
| 9 | 2 | 5 |
| 10 | 3 | 1 |
| 11 | 3 | 2 |
| 12 | 3 | 3 |
| 13 | 3 | 4 |
| 14 | 3 | 5 |
+----+---------+----------------+
How can I join the two tables and, for each book, print only as many chapter rows as its "last_great_chapters" value (i.e. sorted, with a limit of last_great_chapters)?
If I understood correctly, you want to print the Books table along with a count of the rows in the Chapters table that match last_great_chapters?
If so, try this:
select b.id, b.name, b.author, b.last_great_chapters, COUNT(c.chapter_number) as rownumbers
FROM Books as b
LEFT JOIN Chapters AS c ON c.book_id = b.id AND c.chapter_number = b.last_great_chapters
group by b.id, b.name, b.author, b.last_great_chapters
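
The question can also be read as "for each book, list its last N chapters, where N is last_great_chapters". Under that assumed reading, a possible sketch numbers each book's chapters from the end with ROW_NUMBER() and keeps the first N. It runs in an in-memory SQLite database from Python; the setup and the `rn` alias are illustrative, not part of the original.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE books (id INTEGER PRIMARY KEY, name TEXT, author TEXT,
                    last_great_chapters INTEGER);
CREATE TABLE chapters (id INTEGER PRIMARY KEY, book_id INTEGER,
                       chapter_number INTEGER);
INSERT INTO books VALUES
  (1, 'some book title', 'john doe', 2),
  (2, 'foo novel title', 'some guy', 4),
  (3, 'other book title', 'lol man', 3);
INSERT INTO chapters VALUES
  (1, 1, 1), (2, 1, 3), (3, 1, 4), (4, 1, 5),
  (5, 2, 1), (6, 2, 2), (7, 2, 3), (8, 2, 4), (9, 2, 5),
  (10, 3, 1), (11, 3, 2), (12, 3, 3), (13, 3, 4), (14, 3, 5);
""")

# rn = 1 is the highest chapter_number of a book, rn = 2 the next, and so on.
# Keeping rn <= last_great_chapters acts as a per-book LIMIT.
last_chapters = conn.execute("""
SELECT b.id, b.name, c.chapter_number
FROM books b
JOIN (SELECT book_id, chapter_number,
             ROW_NUMBER() OVER (PARTITION BY book_id
                                ORDER BY chapter_number DESC) AS rn
      FROM chapters) c
  ON c.book_id = b.id AND c.rn <= b.last_great_chapters
ORDER BY b.id, c.chapter_number DESC
""").fetchall()

for row in last_chapters:
    print(row)
```

With the sample data this lists chapters 5 and 4 for book 1, chapters 5 to 2 for book 2, and chapters 5 to 3 for book 3.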

Percentage to total in BigQuery Legacy SQL (Subqueries?)

I can't understand how to calculate percentage of total in BigQuery Legacy SQL.
So, I have a table:
ID | Name | Group | Mark
1 | John | A | 10
2 | Lucy | A | 5
3 | Jane | A | 7
4 | Lily | B | 9
5 | Steve | B | 14
6 | Rita | B | 11
I want to calculate percentage like this:
ID | Name | Group | Mark | Percent
1 | John | A | 10 | 10/(10+5+7)=45%
2 | Lucy | A | 5 | 5/(10+5+7)=23%
3 | Jane | A | 7 | 7/(10+5+7)=32%
4 | Lily | B | 9 | 9/(9+14+11)=26%
5 | Steve | B | 14 | 14/(9+14+11)=41%
6 | Rita | B | 11 | 11/(9+14+11)=32%
My table is quite large (3 million rows).
I thought that I could do it with subqueries, but I can't use subqueries in SELECT.
Does anyone know a way to do it?
SELECT
ID, Name, [Group], Mark,
RATIO_TO_REPORT(Mark) OVER(PARTITION BY [Group]) AS percent
FROM YourTable
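RATIO_TO_REPORT returns a fraction between 0 and 1, so multiply by 100 for a percentage. See the RATIO_TO_REPORT documentation for more details.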
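
To illustrate what the ratio computes, the sketch below reproduces it with a plain windowed SUM in an in-memory SQLite database from Python. This is not Legacy SQL: SQLite has no RATIO_TO_REPORT, so `mark / SUM(mark) OVER (PARTITION BY ...)` stands in for it, and the table name `marks` and column name `grp` are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE marks (id INTEGER, name TEXT, grp TEXT, mark INTEGER);
INSERT INTO marks VALUES
  (1, 'John', 'A', 10), (2, 'Lucy', 'A', 5), (3, 'Jane', 'A', 7),
  (4, 'Lily', 'B', 9), (5, 'Steve', 'B', 14), (6, 'Rita', 'B', 11);
""")

# Each mark divided by the total of its group, as a rounded percentage.
# 100.0 forces floating-point division.
percent_rows = conn.execute("""
SELECT id, name, grp, mark,
       ROUND(100.0 * mark / SUM(mark) OVER (PARTITION BY grp)) AS percent
FROM marks
ORDER BY id
""").fetchall()

for row in percent_rows:
    print(row)
```

Because the window total is computed once per group, no subquery in the SELECT list is needed.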