Get max count with lead name - sql

My dataset is bollywood.csv.
I need the actors who have the most lead roles in movies: the name of each lead actor and the number of films in which they played the lead.
My code is:
select lead, count(*) as nos from bollywood group by lead order by nos desc;
And the result is:
Amitabh 3
Akshay 3
John 3
Riteish 2
Shahrukh 2
Sunny 2
Emraan 2
Katrina 2
Nawazuddin 2
Tiger 2
Sharman 2
Manoj 2
Vidya 1
Tusshar 1
Tannishtha 1
Sushant 1
SunnyDeol 1
Sonam 1
Sonakshi 1
Siddarth 1
Shahid 1
Sandeep 1
Salman 1

If you want all actors with most lead roles (possibly multiple records):
select lead, count(*) as nos
from bollywood
group by lead
having count(*) =
(select max(cnt) from
(select lead, count(*) cnt
from bollywood
group by lead ) tblBolly )

Use the rownum pseudocolumn with ORDER BY (Oracle). Note that this returns a single row only, so actors tied for first place are dropped:
select * from (
select lead, count(*) as nos
from bollywood
group by lead
order by nos desc
)
where rownum = 1

Use window functions:
select lead, cnt
from (select lead, count(*) as cnt,
rank() over (order by count(*) desc) as rnk
from bollywood
group by lead
) b
where rnk = 1;
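To see how the ranking approach keeps ties, here is a minimal sketch that runs an equivalent query through Python's built-in sqlite3 module. The rows and the movie column are made up for illustration (the real bollywood.csv is not shown), and it assumes the bundled SQLite is 3.25 or newer, which is when window functions were added.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical schema: only the "lead" column comes from the question.
conn.execute("CREATE TABLE bollywood (movie TEXT, lead TEXT)")
conn.executemany(
    "INSERT INTO bollywood VALUES (?, ?)",
    [
        ("m1", "Amitabh"), ("m2", "Amitabh"), ("m3", "Amitabh"),
        ("m4", "Akshay"), ("m5", "Akshay"), ("m6", "Akshay"),
        ("m7", "Vidya"),
    ],
)

# Count per lead first, then rank the counts; rank 1 keeps every tied leader.
top_leads = conn.execute(
    """
    SELECT lead, cnt
    FROM (
        SELECT lead, cnt, RANK() OVER (ORDER BY cnt DESC) AS rnk
        FROM (SELECT lead, COUNT(*) AS cnt FROM bollywood GROUP BY lead) AS counted
    ) AS ranked
    WHERE rnk = 1
    ORDER BY lead
    """
).fetchall()
print(top_leads)
```

Both tied actors come back, whereas a rownum = 1 or LIMIT 1 filter would keep only one of them.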

Related

Select highest aggregated group

I'm having trouble with selecting the highest aggregated group.
I have data in a table like this: Sales table:
ID GroupDescription Sales
1 Group1 2
1 Group1 15
1 Group2 3
1 Group3 2
1 Group3 2
1 Group3 2
2 Group1 2
2 Group2 5
2 Group3 3
2 Group4 12
2 Group4 2
2 Group4 2
I want to return 1 record for each ID. I also want to include the Group that had the most sales and the total sales for that group and ID.
Expected output:
ID GroupDescription SumSales
1 Group1 17
2 Group4 16
I have code working but I feel like it can be written much better:
SELECT * FROM
(
SELECT ROW_NUMBER() OVER(PARTITION BY ID ORDER BY ID, SumSales DESC) as RowNum, * FROM
(
SELECT
ID
,GroupDescription
,SUM(Sales) OVER(PARTITION BY ID,GroupDescription) as SumSales
FROM Sales
) t1
) t2
WHERE RowNum = 1
Aggregate by ID and GroupDescription and use window functions FIRST_VALUE() and MAX() to get the top group and its total:
SELECT DISTINCT ID,
FIRST_VALUE(GroupDescription) OVER (PARTITION BY ID ORDER BY SUM(Sales) DESC) GroupDescription,
MAX(SUM(Sales)) OVER (PARTITION BY ID) SumSales
FROM Sales
GROUP BY ID, GroupDescription;
Seems like you could use normal aggregation in the inner table. You can also put the row-number on the same level as that.
SELECT
s.ID
,s.GroupDescription
,s.SumSales
FROM
(
SELECT
s.ID
,s.GroupDescription
,SUM(s.Sales) as SumSales
,ROW_NUMBER() OVER (PARTITION BY s.ID ORDER BY SUM(s.Sales) DESC) as RowNum
FROM Sales s
GROUP BY
s.ID
,s.GroupDescription
) s
WHERE s.RowNum = 1;
Note that ordering a window function by the same column it is partitioned by has no effect, because every row in a partition shares that value. That is why ORDER BY ID is dropped from the ROW_NUMBER() here.
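The aggregate-then-number approach can be checked against the question's sample rows. This is a sketch using Python's sqlite3 module (assuming SQLite 3.25+ for window functions); it splits the aggregation into a CTE named totals, which is my own naming, rather than computing ROW_NUMBER() in the same SELECT as the GROUP BY.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Sales (ID INTEGER, GroupDescription TEXT, Sales INTEGER)")
# Sample rows copied from the question.
conn.executemany(
    "INSERT INTO Sales VALUES (?, ?, ?)",
    [
        (1, "Group1", 2), (1, "Group1", 15), (1, "Group2", 3),
        (1, "Group3", 2), (1, "Group3", 2), (1, "Group3", 2),
        (2, "Group1", 2), (2, "Group2", 5), (2, "Group3", 3),
        (2, "Group4", 12), (2, "Group4", 2), (2, "Group4", 2),
    ],
)

top_groups = conn.execute(
    """
    WITH totals AS (
        -- one row per (ID, group) with its total sales
        SELECT ID, GroupDescription, SUM(Sales) AS SumSales
        FROM Sales
        GROUP BY ID, GroupDescription
    ),
    ranked AS (
        -- number the groups within each ID, largest total first
        SELECT totals.*,
               ROW_NUMBER() OVER (PARTITION BY ID ORDER BY SumSales DESC) AS RowNum
        FROM totals
    )
    SELECT ID, GroupDescription, SumSales
    FROM ranked
    WHERE RowNum = 1
    ORDER BY ID
    """
).fetchall()
print(top_groups)
```

The result matches the expected output in the question: Group1 with 17 for ID 1 and Group4 with 16 for ID 2.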

How to rank groups of data?

Given the following, and tasked with ranking the raw data by the SUM(volume) within each group:
group_id volume
1 2
1 3
2 5
3 1
3 3
How can I obtain the following?
group_id volume group_volume rank
1 2 5 1
1 3 5 1
2 5 5 2
3 1 4 3
3 3 4 3
I can get group_volume easily, but am struggling on how to break the ties in rank without grouping by + ranking in a separate subquery and joining in.
SELECT *
, SUM(volume) OVER (PARTITION BY group_id) AS grouped_volume
, ... AS rank
FROM groups
Use a CTE and DENSE_RANK, ordering by the group total first and using group_id only as the tie-breaker:
WITH CTE1 AS (SELECT group_id, volume,
sum(volume) over(partition by group_id) group_volume
from groups)
SELECT A.*, dense_rank() over(order by group_volume desc, group_id) rank FROM CTE1 A;
Use two levels of window functions for this:
select g.*,
dense_rank() over (order by group_volume desc, group_id) as rank
from (select g.*,
sum(volume) over (partition by group_id) as group_volume
from groups g
) g;
There is no need for a JOIN.
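As a quick check of the two-level window approach, the sketch below runs it on the question's five rows with Python's sqlite3 module (assuming SQLite 3.25+). The table name is quoted because GROUPS is a keyword in newer SQLite versions, and the alias group_rank is my own choice to avoid naming a column after a function.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute('CREATE TABLE "groups" (group_id INTEGER, volume INTEGER)')
# Sample rows copied from the question.
conn.executemany(
    'INSERT INTO "groups" VALUES (?, ?)',
    [(1, 2), (1, 3), (2, 5), (3, 1), (3, 3)],
)

ranked_rows = conn.execute(
    """
    SELECT group_id, volume, group_volume,
           -- group_id breaks the tie between groups with equal totals
           DENSE_RANK() OVER (ORDER BY group_volume DESC, group_id) AS group_rank
    FROM (
        SELECT group_id, volume,
               SUM(volume) OVER (PARTITION BY group_id) AS group_volume
        FROM "groups"
    ) AS g
    ORDER BY group_id, volume
    """
).fetchall()
for row in ranked_rows:
    print(row)
```

Groups 1 and 2 both total 5, and the group_id tie-breaker is what gives them ranks 1 and 2 instead of a shared rank.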

SQL : Return joint most frequent values from a column

I have the following table named customerOrders.
ID user order
1 1 2
2 1 3
3 1 1
4 2 1
5 1 5
6 2 4
7 3 1
8 6 2
9 2 2
10 2 3
I want to return the users with the most orders. Currently, I have the following query:
SELECT user, COUNT(user) AS UsersWithMostOrders
FROM customerOrders
GROUP BY user
ORDER BY UsersWithMostOrders DESC;
This returns all the values grouped by total orders, like this:
user UsersWithMostOrders
1 4
2 4
3 1
6 1
I only want to return the users with the most orders. In my case that would be users 1 and 2, since both of them have 4 orders. If I use TOP 1 or LIMIT, it will only return the first user. If I use TOP 2, it will only work in this scenario; it will return invalid data when the top two users have different order counts.
Required Result
user UsersWithMostOrders
1 4
2 4
You can use TOP 1 WITH TIES:
SELECT TOP 1 WITH TIES
[user], COUNT(*) AS UsersWithMostOrders
FROM customerOrders
GROUP BY [user]
ORDER BY UsersWithMostOrders DESC;
Results:
> user | UsersWithMostOrders
> ---: | ------------------:
> 1 | 4
> 2 | 4
Option 1
Should work with most versions of SQL.
select *
from (
select *,
rank() over(order by numOrders desc) as rrank
from (
select `user`, count(*) as numOrders
from customerOrders
group by `user`
) summed
) ranked
where rrank = 1
Option 2
If your version of SQL supports common table expressions (WITH), here is a much more readable solution which does the same thing:
with summed as (
select `user`, count(*) as numOrders
from customerOrders
group by `user`
),
ranked as (
select *,
rank() over(order by numOrders desc) as rrank
from summed
)
select *
from ranked
where rrank = 1
You can use a CTE to achieve this:
;WITH CTE AS(
SELECT [user], COUNT(*) AS UsersWithMostOrders
FROM customerOrders
GROUP BY [user])
SELECT M.* from CTE M
INNER JOIN ( SELECT
MAX(UsersWithMostOrders) AS MaximumOrders FROM CTE) S ON
M.UsersWithMostOrders=S.MaximumOrders
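The following Oracle query can help. USER is a reserved word in Oracle, so the column name is quoted, and the window has to order by the aggregate itself because a select-list alias cannot be referenced within the same SELECT:
WITH test_table AS
(
SELECT "user", COUNT(*) AS total_order, DENSE_RANK() OVER (ORDER BY
COUNT(*) DESC) AS rank_orders FROM customerOrders
GROUP BY "user"
)
select * from test_table where rank_orders = 1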
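TOP 1 WITH TIES is specific to SQL Server, so here is a sketch of the rank-based variant on the question's ten rows, run through Python's sqlite3 module (assuming SQLite 3.25+). The column names are double-quoted because ORDER is a reserved word; the CTE names summed and ranked follow Option 2 above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute('CREATE TABLE customerOrders (ID INTEGER, "user" INTEGER, "order" INTEGER)')
# Sample rows copied from the question.
conn.executemany(
    "INSERT INTO customerOrders VALUES (?, ?, ?)",
    [
        (1, 1, 2), (2, 1, 3), (3, 1, 1), (4, 2, 1), (5, 1, 5),
        (6, 2, 4), (7, 3, 1), (8, 6, 2), (9, 2, 2), (10, 2, 3),
    ],
)

top_users = conn.execute(
    """
    WITH summed AS (
        SELECT "user", COUNT(*) AS numOrders
        FROM customerOrders
        GROUP BY "user"
    ),
    ranked AS (
        -- RANK gives every user with the highest count the value 1
        SELECT summed.*, RANK() OVER (ORDER BY numOrders DESC) AS rrank
        FROM summed
    )
    SELECT "user", numOrders
    FROM ranked
    WHERE rrank = 1
    ORDER BY "user"
    """
).fetchall()
print(top_users)
```

Users 1 and 2 are both returned with 4 orders each, which is the required result.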

Oracle Nested Grouping

The question is: For each day, list the user ID that has read the most messages.
user_id msgID read_date
1 1 10
1 2 10
2 2 10
2 2 23
3 2 23
I believe the date is an outer group and user_id is an inner group, but how to do group nesting in sql? Or somehow avoid this?
This is a task for a Window Function:
select *
from
(
select user_id, read_date, count(*) as cnt,
rank()
over (partition by read_date -- each day
order by count(*) desc) as rnk -- maximum number
from tab
group by user_id, read_date
) dt
where rnk = 1
This might return multiple users for a day when they share the same maximum count. If you want just one (chosen arbitrarily), switch to ROW_NUMBER.
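-- Alternative without window functions: keep the users whose count equals the day's maximum
select read_date, user_id, count(msgID) as cnt
from tab t
group by read_date, user_id
having count(msgID) = (
-- Oracle allows one level of nested aggregates with GROUP BY
select max(count(msgID))
from tab t2
where t2.read_date = t.read_date
group by t2.user_id
);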
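The difference between RANK and ROW_NUMBER described in the window-function answer can be checked on the question's five rows. This is a sketch using Python's sqlite3 module (assuming SQLite 3.25+); the helper top_readers and the user_id tie-breaker in the ROW_NUMBER variant are my own additions to make the single-row pick deterministic.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tab (user_id INTEGER, msgID INTEGER, read_date INTEGER)")
# Sample rows copied from the question.
conn.executemany(
    "INSERT INTO tab VALUES (?, ?, ?)",
    [(1, 1, 10), (1, 2, 10), (2, 2, 10), (2, 2, 23), (3, 2, 23)],
)

def top_readers(window_expr):
    """Return (read_date, user_id, cnt) rows where the given window expression equals 1."""
    sql = f"""
        WITH counted AS (
            SELECT read_date, user_id, COUNT(*) AS cnt
            FROM tab
            GROUP BY read_date, user_id
        )
        SELECT read_date, user_id, cnt
        FROM (SELECT counted.*, {window_expr} AS pos FROM counted) AS ranked
        WHERE pos = 1
        ORDER BY read_date, user_id
    """
    return conn.execute(sql).fetchall()

# RANK keeps every user tied for the daily maximum.
with_ties = top_readers("RANK() OVER (PARTITION BY read_date ORDER BY cnt DESC)")
# ROW_NUMBER keeps exactly one user per day; user_id is an assumed tie-breaker.
one_per_day = top_readers(
    "ROW_NUMBER() OVER (PARTITION BY read_date ORDER BY cnt DESC, user_id)"
)
print(with_ties)
print(one_per_day)
```

On day 23 users 2 and 3 each read one message, so RANK returns both while ROW_NUMBER returns only one of them.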

SQL Server dense_rank with sum

I have a query that is not returning the proper result.
I want it to return the total score for each user_id, so that each user_id has only one record with the sum of all its scores.
My query is this:
SELECT
DENSE_RANK() OVER (ORDER BY score DESC) AS rank,
user_id,
SUM(score) AS total_score
FROM
account_game
GROUP BY
user_id, score
ORDER BY
rank ASC
Query output is :
rank user_id total_score
1 2 4837
2 1 600
2 6 600
3 1 30
4 1 20
There should be three records with user_id 1,2,6
Expected result should be
rank user_id total_score
-------------------------
1 2 4837
2 6 700
3 1 650
Please suggest
As StuartLC commented, you can just remove score from your GROUP BY and rank by SUM(score) instead of score:
SELECT DENSE_RANK() OVER (Order by SUM(score) DESC) AS rank,
user_id,
SUM(score) as total_score
FROM
account_game
GROUP BY user_id
ORDER BY rank ASC
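A small check of the corrected grouping, sketched with Python's sqlite3 module (assuming SQLite 3.25+). The question does not show the raw rows, so the rows below are assumptions chosen to reproduce the expected totals (in particular, the second row for user 6 is invented), and the aggregation is moved into a CTE named totals for clarity.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE account_game (user_id INTEGER, score INTEGER)")
# Assumed rows: picked so the per-user totals match the expected result.
conn.executemany(
    "INSERT INTO account_game VALUES (?, ?)",
    [(2, 4837), (1, 600), (1, 30), (1, 20), (6, 600), (6, 100)],
)

ranking = conn.execute(
    """
    WITH totals AS (
        -- group by user only, so each user collapses to a single row
        SELECT user_id, SUM(score) AS total_score
        FROM account_game
        GROUP BY user_id
    )
    SELECT DENSE_RANK() OVER (ORDER BY total_score DESC) AS score_rank,
           user_id,
           total_score
    FROM totals
    ORDER BY score_rank
    """
).fetchall()
print(ranking)
```

Grouping by user_id alone yields one row per user, and ranking on the summed score gives the 1, 2, 3 ordering from the expected result.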