How to apply group by here? - sql

I have a table Movie with columns Movie and User, where each movie can be viewed by any user any number of times, so the table can contain duplicate rows. I want to find the Top N most viewed movies and then the Top K viewers for each of those Top N movies. How can I apply GROUP BY or PARTITION BY effectively in this scenario? If there is a better approach, please share. Thanks!
Movie    | User
------------------
Avengers | John
Batman   | Chris
Batman   | Ron
X-Men    | Chris
X-Men    | Ron
Matrix   | John
Batman   | Martin
Matrix   | Chris
Batman   | Chris
X-Men    | Ron
So, in this table Batman is the most watched movie, followed by X-Men, so I want the result table to look like:
Movie    | User   | View count
-------------------------------
Batman   | Chris  | 2
Batman   | Ron    | 1
Batman   | Martin | 1
X-Men    | Ron    | 2
X-Men    | Chris  | 1
Matrix   | John   | 1
Matrix   | Chris  | 1
Avengers | John   | 1
I understand that I can group by movie and then order by count(*) desc, but this doesn't give me the second column, which is grouped by viewer, or the count for each viewer.

Consider the approach below (assuming Top 3 movies with Top 2 users):
select movie, user, view_count
from (
  select distinct *,
    count(*) over(partition by movie) movie_views,       -- total views per movie
    count(*) over(partition by movie, user) view_count   -- views per movie and user
  from your_table
)
qualify dense_rank() over(order by movie_views desc) <= 3                  -- keep the Top 3 movies
  and row_number() over(partition by movie order by view_count desc) <= 2  -- keep the Top 2 users per movie
-- order by movie_views desc, view_count desc
Applied to the sample data in your question, this returns the top two viewers for each of Batman, X-Men, and Matrix.
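QUALIFY is not available in every SQL dialect, so here is a minimal sketch of the same idea using plain CTEs, run through Python's built-in sqlite3 module so it can be tried locally. The table name `views`, the column name `viewer`, and the function name are assumptions made for this illustration, and it assumes a SQLite build with window-function support (3.25 or later). Unlike the query above, ties between viewers are broken alphabetically so the result is deterministic.

```python
import sqlite3

# Sample rows from the question (movie, viewer).
SAMPLE = [
    ("Avengers", "John"), ("Batman", "Chris"), ("Batman", "Ron"),
    ("X-Men", "Chris"), ("X-Men", "Ron"), ("Matrix", "John"),
    ("Batman", "Martin"), ("Matrix", "Chris"), ("Batman", "Chris"),
    ("X-Men", "Ron"),
]

QUERY = """
WITH per_viewer AS (              -- one row per (movie, viewer) with its count
    SELECT movie, viewer, COUNT(*) AS view_count
    FROM views
    GROUP BY movie, viewer
),
ranked AS (                       -- total views per movie and viewer rank within it
    SELECT movie, viewer, view_count,
           SUM(view_count) OVER (PARTITION BY movie) AS movie_views,
           ROW_NUMBER() OVER (PARTITION BY movie
                              ORDER BY view_count DESC, viewer) AS viewer_rank
    FROM per_viewer
),
final AS (                        -- rank the movies by their total views
    SELECT ranked.*,
           DENSE_RANK() OVER (ORDER BY movie_views DESC) AS movie_rank
    FROM ranked
)
SELECT movie, viewer, view_count
FROM final
WHERE movie_rank <= ? AND viewer_rank <= ?
ORDER BY movie_views DESC, movie, view_count DESC, viewer
"""

def top_movies_with_top_viewers(rows, n_movies, k_viewers):
    """Return (movie, viewer, view_count) for the top K viewers of the top N movies."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE views (movie TEXT, viewer TEXT)")
    conn.executemany("INSERT INTO views VALUES (?, ?)", rows)
    result = conn.execute(QUERY, (n_movies, k_viewers)).fetchall()
    conn.close()
    return result

if __name__ == "__main__":
    for row in top_movies_with_top_viewers(SAMPLE, 3, 2):
        print(row)
```

Aggregating to one row per movie and viewer first plays the role of `select distinct *` in the query above, and the outer WHERE clause stands in for QUALIFY.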

Related

More concise way to count no. of lessons/events held if multiple per date exist

This is an example for the question's sake, but I basically have MySQL data similar to the following:
ID   | LessonDate | PersonID | Subject
------------------------------------------
1234 | 2021-01-11 | 1        | Spanish
1235 | 2021-01-11 | 1        | Spanish
1236 | 2021-01-11 | 2        | Spanish
1237 | 2021-01-12 | 1        | Music
1238 | 2021-01-12 | 1        | Music
1239 | 2021-01-12 | 1        | Music
1240 | 2021-01-12 | 3        | Music
1241 | 2021-01-12 | 3        | Music
1242 | 2021-01-13 | 2        | Chemistry
1243 | 2021-01-13 | 3        | Chemistry
1244 | 2021-01-13 | 2        | Spanish
1245 | 2021-01-14 | 3        | Mathematics
This is interpreted to mean that:
Person 1 had two Spanish lessons on 11th Jan (Person 2 attended one of those)
Person 1 had three Music lessons on 12th Jan (Person 3 attended two of those)
Person 2 and Person 3 shared a Chemistry lesson on 13th Jan, while only Person 2 went to Spanish that day, etc.
To get my desired output I'm currently using 4 levels of grouping as follows. Starting with a count per subject-person-date, then the max per subject per date, then the total per subject, and finally listing out the subjects for each total:
SELECT LessonsHeld, GROUP_CONCAT(Subject ORDER BY Subject SEPARATOR ', ') AS Subjects
FROM (
  SELECT Subject, SUM(DayCount) AS LessonsHeld
  FROM (
    SELECT Subject, LessonDate, MAX(PersonDayCount) AS DayCount
    FROM (
      SELECT Subject, LessonDate, PersonID, COUNT(*) AS PersonDayCount
      FROM `lessons`
      GROUP BY Subject, LessonDate, PersonID
    ) x
    GROUP BY Subject, LessonDate
  ) y
  GROUP BY Subject
) z
GROUP BY LessonsHeld
ORDER BY LessonsHeld DESC
The output:
[LessonsHeld] [Subjects]
3 Music, Spanish
1 Chemistry, Mathematics
Is there a more concise way to count people's multiple events/classes etc. held on given dates? 4 levels of GROUP BY seems a tad extreme here.
Note the reason I have multiple IDs for single events is that each attendee can enter and delete their own data. I've focused on minimising data entry rather than having them do anything extra such as remembering start times or matching up their attendance with each other.
MySQL 8.0 supports window functions, so it can be a bit shorter:
SELECT LessonsHeld, GROUP_CONCAT(Subject ORDER BY Subject SEPARATOR ', ') AS Subjects
FROM
(SELECT Subject, SUM(DayCount) AS LessonsHeld
FROM
(SELECT Subject, LessonDate,
max(COUNT(*)) over(partition by Subject, LessonDate) AS DayCount,
row_number() over(partition by Subject, LessonDate order by PersonID) rn
FROM `lessons`
GROUP BY Subject, LessonDate, PersonID
) x
where rn = 1
GROUP BY Subject) z
GROUP BY LessonsHeld
ORDER BY LessonsHeld DESC;
Not sure if it will perform better.
MariaDB 10.3 fiddle
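To check the counting logic against the sample data without a MySQL server, here is a rough sketch using Python's built-in sqlite3 module. It keeps the three inner aggregation levels in SQL (count per subject, date and person, then max per subject and date, then sum per subject) and does the final "list the subjects for each total" step in Python instead of GROUP_CONCAT. The function name and the Python-side grouping are assumptions for this illustration, not part of either query above.

```python
import sqlite3
from collections import defaultdict

# Sample rows from the question: (ID, LessonDate, PersonID, Subject).
LESSONS = [
    (1234, "2021-01-11", 1, "Spanish"), (1235, "2021-01-11", 1, "Spanish"),
    (1236, "2021-01-11", 2, "Spanish"), (1237, "2021-01-12", 1, "Music"),
    (1238, "2021-01-12", 1, "Music"), (1239, "2021-01-12", 1, "Music"),
    (1240, "2021-01-12", 3, "Music"), (1241, "2021-01-12", 3, "Music"),
    (1242, "2021-01-13", 2, "Chemistry"), (1243, "2021-01-13", 3, "Chemistry"),
    (1244, "2021-01-13", 2, "Spanish"), (1245, "2021-01-14", 3, "Mathematics"),
]

PER_SUBJECT = """
SELECT Subject, SUM(DayCount) AS LessonsHeld
FROM (
    -- lessons held per subject and date = the busiest attendee's row count
    SELECT Subject, LessonDate, MAX(PersonDayCount) AS DayCount
    FROM (
        SELECT Subject, LessonDate, PersonID, COUNT(*) AS PersonDayCount
        FROM lessons
        GROUP BY Subject, LessonDate, PersonID
    ) x
    GROUP BY Subject, LessonDate
) y
GROUP BY Subject
"""

def lessons_held_by_total(rows):
    """Return [(LessonsHeld, 'Subject, Subject, ...')] sorted by total, descending."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE lessons (ID INTEGER, LessonDate TEXT, PersonID INTEGER, Subject TEXT)"
    )
    conn.executemany("INSERT INTO lessons VALUES (?, ?, ?, ?)", rows)
    per_subject = conn.execute(PER_SUBJECT).fetchall()
    conn.close()

    # Group subject names under their total, replacing the outer GROUP_CONCAT level.
    grouped = defaultdict(list)
    for subject, held in per_subject:
        grouped[held].append(subject)
    return [
        (held, ", ".join(sorted(subjects)))
        for held, subjects in sorted(grouped.items(), reverse=True)
    ]

if __name__ == "__main__":
    for held, subjects in lessons_held_by_total(LESSONS):
        print(held, subjects)
```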

SQL query count rows with the same entry

Given a dataset Roster_table as such:
Group ID | Group Name   | Name | Phone
------------------------------------------
42       | Red Dragon   | Jon  | 123455678
32       | Green Lizard | Liz  | 932143211
19       | Blue Falcon  | Ben  | 134554678
42       | Red Dragon   | Reed | 432143211
42       | Red Dragon   | Brad | 231314155
19       | Blue Falcon  | Chad | 214124412
How do I get the following query output combining rows with the same Group ID from the dataset, and the new column Count in descending order:
Group ID | Group Name   | Count
--------------------------------
42       | Red Dragon   | 3
19       | Blue Falcon  | 2
32       | Green Lizard | 1
SELECT * FROM Roster_table
Please try this where alias tot_count is used in ORDER BY clause.
-- PostgreSQL(v11)
SELECT Group_ID
, MAX(Group_Name) Group_Name
, COUNT(1) tot_count
FROM Roster_table
GROUP BY Group_ID
ORDER BY tot_count DESC;
Please check from url https://dbfiddle.uk/?rdbms=postgres_11&fiddle=b66f9f0d40e804e89be12e3530fe00a0
Based on Rahul Biswas's answer:
Solution without using Max function
SELECT Group_ID, Group_Name, COUNT(*)
FROM Roster_table
GROUP BY Group_ID, Group_Name
ORDER BY COUNT(*) DESC
Credit goes to Eric S.
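For anyone who wants to try the GROUP BY version against the sample roster, here is a small sketch using Python's built-in sqlite3 module. The function name and the extra `Group_ID` tie-breaker in the ORDER BY are assumptions added for this illustration so that groups with equal counts come back in a stable order.

```python
import sqlite3

# Sample rows from the question: (Group_ID, Group_Name, Name, Phone).
ROSTER = [
    (42, "Red Dragon", "Jon", "123455678"),
    (32, "Green Lizard", "Liz", "932143211"),
    (19, "Blue Falcon", "Ben", "134554678"),
    (42, "Red Dragon", "Reed", "432143211"),
    (42, "Red Dragon", "Brad", "231314155"),
    (19, "Blue Falcon", "Chad", "214124412"),
]

def group_counts(rows):
    """Return (Group_ID, Group_Name, count) per group, largest group first."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE Roster_table (Group_ID INTEGER, Group_Name TEXT, Name TEXT, Phone TEXT)"
    )
    conn.executemany("INSERT INTO Roster_table VALUES (?, ?, ?, ?)", rows)
    result = conn.execute(
        """
        SELECT Group_ID, Group_Name, COUNT(*) AS tot_count
        FROM Roster_table
        GROUP BY Group_ID, Group_Name
        ORDER BY tot_count DESC, Group_ID   -- Group_ID only breaks ties
        """
    ).fetchall()
    conn.close()
    return result

if __name__ == "__main__":
    for row in group_counts(ROSTER):
        print(row)
```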

Writing SQL query without count

Can anyone help me write this query without using COUNT?
"Some directors directed more than one movie. For all such directors, return the titles of all movies directed by them, along with the director name. Sort by director name, then movie title." (Without COUNT.)
mID | title | director
--------------------------------------
101 |Gone with the Wind |Victor Fleming
102 |Star Wars |George Lucas
103 |The Sound of Music |Robert Wise
104 |E.T. |Steven Spielberg
105 |Titanic |James Cameron
106 |Snow White |<null>
107 |Avatar |James Cameron
108 |Raiders of the Lost Ark |Steven Spielberg
For each director, you can compare the min and max of mID.
If they are different, the director has more than one movie.
Select mid, title, director
From tbl
where director in (Select director
From tbl
Group by director
Having max(mid) > min (mid))
order by director, title
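Here is a quick sketch of the min/max comparison run against the sample data with Python's built-in sqlite3 module. The table name `tbl` follows the query above; the function name is an assumption for this illustration. Note that the row with a NULL director drops out on its own, because `NULL IN (...)` never evaluates to true.

```python
import sqlite3

# Sample rows from the question: (mID, title, director); None stands for <null>.
MOVIES = [
    (101, "Gone with the Wind", "Victor Fleming"),
    (102, "Star Wars", "George Lucas"),
    (103, "The Sound of Music", "Robert Wise"),
    (104, "E.T.", "Steven Spielberg"),
    (105, "Titanic", "James Cameron"),
    (106, "Snow White", None),
    (107, "Avatar", "James Cameron"),
    (108, "Raiders of the Lost Ark", "Steven Spielberg"),
]

def movies_by_repeat_directors(rows):
    """Return (title, director) for directors with more than one movie, without COUNT."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE tbl (mID INTEGER, title TEXT, director TEXT)")
    conn.executemany("INSERT INTO tbl VALUES (?, ?, ?)", rows)
    result = conn.execute(
        """
        SELECT title, director
        FROM tbl
        WHERE director IN (
            SELECT director
            FROM tbl
            GROUP BY director
            HAVING MAX(mID) > MIN(mID)   -- two distinct ids means at least two movies
        )
        ORDER BY director, title
        """
    ).fetchall()
    conn.close()
    return result

if __name__ == "__main__":
    for row in movies_by_repeat_directors(MOVIES):
        print(row)
```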
You can use ROW_NUMBER() and a CTE or sub-query. Not using COUNT() is pretty silly though. It's an ideal case for such an aggregate.
with cte as(
select
director
,title
,row_number() over (partition by director order by title) as rn
from
yourTable)
select
director,
title
from cte
where director in (select director from cte where rn > 1)
order by
director,
title
How about this?
select t.*
from t join
(select director, sum(1) as cnt
from t
group by director
) d
on t.director = d.director
where cnt > 1
order by director, title;

Retrieve highest value from sql table

How can I query this data:
Name Title Profit
Peter CEO 2
Robert A.D 3
Michael Vice 5
Peter CEO 4
Robert Admin 5
Robert CEO 13
Adrin Promotion 8
Michael Vice 21
Peter CEO 3
Robert Admin 15
to get this:
Peter........4
Robert.......15
Michael......21
Adrin........8
I want to get the highest profit value from each name.
If there are multiple equal names always take the highest value.
select name,max(profit) from table group by name
Since this type of request almost always follows with "now can I include the title?" - here is a query that gets the highest profit for each name but can include all the other columns without grouping or applying arbitrary aggregates to those other columns:
;WITH x AS
(
SELECT Name, Title, Profit, rn = ROW_NUMBER()
OVER (PARTITION BY Name ORDER BY Profit DESC)
FROM dbo.table
)
SELECT Name, Title, Profit
FROM x
WHERE rn = 1;
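A minimal sketch of the ROW_NUMBER() approach on the sample data, run through Python's built-in sqlite3 module (assuming a SQLite build with window functions, 3.25 or later). The table name `profits` and the function name are assumptions for this illustration, since `table` itself is a reserved word. If two rows for the same name tie on the highest profit, which title is returned is arbitrary unless a tie-breaker is added to the ORDER BY.

```python
import sqlite3

# Sample rows from the question: (Name, Title, Profit).
PROFITS = [
    ("Peter", "CEO", 2), ("Robert", "A.D", 3), ("Michael", "Vice", 5),
    ("Peter", "CEO", 4), ("Robert", "Admin", 5), ("Robert", "CEO", 13),
    ("Adrin", "Promotion", 8), ("Michael", "Vice", 21), ("Peter", "CEO", 3),
    ("Robert", "Admin", 15),
]

def highest_profit_rows(rows):
    """Return the full (Name, Title, Profit) row holding each name's highest profit."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE profits (Name TEXT, Title TEXT, Profit INTEGER)")
    conn.executemany("INSERT INTO profits VALUES (?, ?, ?)", rows)
    result = conn.execute(
        """
        WITH x AS (
            SELECT Name, Title, Profit,
                   ROW_NUMBER() OVER (PARTITION BY Name ORDER BY Profit DESC) AS rn
            FROM profits
        )
        SELECT Name, Title, Profit
        FROM x
        WHERE rn = 1          -- rn = 1 is the highest-profit row per name
        ORDER BY Name
        """
    ).fetchall()
    conn.close()
    return result

if __name__ == "__main__":
    for row in highest_profit_rows(PROFITS):
        print(row)
```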

How to produce detail, not summary, report sorted by count(*)?

Oracle 11g:
I want the results listed by highest count, then ch_id. When I use GROUP BY to get the count, I lose the granularity of the detail. Is there an analytic function I could use?
SALES
ch_id desc customer
=========================
ANAR Anari BOB
SWIS Swiss JOE
SWIS Swiss AMY
BRUN Brunost SAM
BRUN Brunost ANN
BRUN Brunost ROB
Desired Results
count ch_id customer
===========================================
3 BRUN ANN
3 BRUN ROB
3 BRUN SAM
2 SWIS AMY
2 SWIS JOE
1 ANAR BOB
Use the analytic count(*):
select * from
(
select count(*) over (partition by ch_id) cnt,
ch_id, customer
from sales
)
order by cnt desc
select b.total, s.ch_id, s.customer
from sales s
inner join (select count(*) total, ch_id from sales group by ch_id) b
on b.ch_id = s.ch_id
order by b.total desc, s.ch_id
ok - the other post that happened at the same time, using partition, is the better solution for Oracle. But this one works regardless of DB.
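As a rough check that the two approaches agree, the sketch below runs both the analytic count and the join against the sample rows using Python's built-in sqlite3 module (assuming SQLite 3.25 or later for the window function). The column name `descr` replaces `desc`, which is a reserved word, and the function names and the extra `customer` sort key are assumptions added for this illustration so the output order matches the desired results.

```python
import sqlite3

# Sample rows from the question: (ch_id, descr, customer).
SALES = [
    ("ANAR", "Anari", "BOB"), ("SWIS", "Swiss", "JOE"), ("SWIS", "Swiss", "AMY"),
    ("BRUN", "Brunost", "SAM"), ("BRUN", "Brunost", "ANN"), ("BRUN", "Brunost", "ROB"),
]

WINDOW_QUERY = """
SELECT * FROM (
    SELECT COUNT(*) OVER (PARTITION BY ch_id) AS cnt, ch_id, customer
    FROM sales
)
ORDER BY cnt DESC, ch_id, customer
"""

JOIN_QUERY = """
SELECT b.total, s.ch_id, s.customer
FROM sales s
INNER JOIN (SELECT ch_id, COUNT(*) AS total FROM sales GROUP BY ch_id) b
        ON b.ch_id = s.ch_id
ORDER BY b.total DESC, s.ch_id, s.customer
"""

def run_query(rows, query):
    """Load the sample rows into an in-memory table and run one of the queries."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE sales (ch_id TEXT, descr TEXT, customer TEXT)")
    conn.executemany("INSERT INTO sales VALUES (?, ?, ?)", rows)
    result = conn.execute(query).fetchall()
    conn.close()
    return result

def detail_by_count_window(rows):
    return run_query(rows, WINDOW_QUERY)

def detail_by_count_join(rows):
    return run_query(rows, JOIN_QUERY)

if __name__ == "__main__":
    for row in detail_by_count_window(SALES):
        print(row)
```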