Select highest aggregated group - sql

I'm having trouble with selecting the highest aggregated group.
I have data in a table like this (Sales table):
ID GroupDescription Sales
1 Group1 2
1 Group1 15
1 Group2 3
1 Group3 2
1 Group3 2
1 Group3 2
2 Group1 2
2 Group2 5
2 Group3 3
2 Group4 12
2 Group4 2
2 Group4 2
I want to return 1 record for each ID. I also want to include the Group that had the most sales and the total sales for that group and ID.
Expected output:
ID GroupDescription SumSales
1 Group1 17
2 Group4 16
I have code working but I feel like it can be written much better:
SELECT * FROM
(
SELECT ROW_NUMBER() OVER(PARTITION BY ID ORDER BY ID, SumSales DESC) as RowNum, * FROM
(
SELECT
ID
,GroupDescription
,SUM(Sales) OVER(PARTITION BY ID,GroupDescription) as SumSales
FROM Sales
) t1
) t2
WHERE RowNum = 1

Aggregate by ID and GroupDescription and use window functions FIRST_VALUE() and MAX() to get the top group and its total:
SELECT DISTINCT ID,
FIRST_VALUE(GroupDescription) OVER (PARTITION BY ID ORDER BY SUM(Sales) DESC) GroupDescription,
MAX(SUM(Sales)) OVER (PARTITION BY ID) SumSales
FROM Sales
GROUP BY ID, GroupDescription;

Seems like you could use normal aggregation in the inner table. You can also put the row-number on the same level as that.
SELECT
s.ID
,s.GroupDescription
,s.SumSales
FROM
(
SELECT
s.ID
,s.GroupDescription
,SUM(s.Sales) as SumSales
,ROW_NUMBER() OVER (PARTITION BY s.ID ORDER BY SUM(s.Sales) DESC) as RowNum
FROM Sales s
GROUP BY
s.ID
,s.GroupDescription
) s
WHERE s.RowNum = 1;
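Note that ordering a window function by the same column it is partitioned by (ORDER BY ID inside PARTITION BY ID) has no effect: every row in a partition has the same value for that column, so only SumSales DESC decides the order.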
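To check the aggregate-then-number approach against the sample data, here is a minimal sketch that loads the question's rows into an in-memory SQLite database from Python. SQLite and the helper name top_group_per_id are assumptions made for illustration (the question does not name its engine). The query is split into two CTEs so it does not depend on mixing GROUP BY and window functions in one SELECT.

```python
import sqlite3

# Sample rows copied from the question's Sales table.
SALES = [
    (1, "Group1", 2), (1, "Group1", 15), (1, "Group2", 3),
    (1, "Group3", 2), (1, "Group3", 2), (1, "Group3", 2),
    (2, "Group1", 2), (2, "Group2", 5), (2, "Group3", 3),
    (2, "Group4", 12), (2, "Group4", 2), (2, "Group4", 2),
]

# Step 1: total sales per (ID, GroupDescription).
# Step 2: number the groups inside each ID, largest total first.
# Step 3: keep only the first group per ID.
QUERY = """
WITH totals AS (
    SELECT ID, GroupDescription, SUM(Sales) AS SumSales
    FROM Sales
    GROUP BY ID, GroupDescription
),
ranked AS (
    SELECT totals.*,
           ROW_NUMBER() OVER (PARTITION BY ID ORDER BY SumSales DESC) AS RowNum
    FROM totals
)
SELECT ID, GroupDescription, SumSales
FROM ranked
WHERE RowNum = 1
ORDER BY ID
"""


def top_group_per_id(sales_rows):
    """Hypothetical helper: load the rows into SQLite and run QUERY."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute(
            "CREATE TABLE Sales (ID INTEGER, GroupDescription TEXT, Sales INTEGER)"
        )
        conn.executemany("INSERT INTO Sales VALUES (?, ?, ?)", sales_rows)
        return conn.execute(QUERY).fetchall()
    finally:
        conn.close()


rows = top_group_per_id(SALES)
print(rows)
```

With the sample data this yields one row per ID, matching the expected output in the question (Group1 with 17 for ID 1, Group4 with 16 for ID 2).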

Related

DB2 Toad SQL - Group by Certain Columns using Max Command

I am having some trouble with the below query. I do understand I need to group by ID and Category, but I only want to group by ID while keeping the rest of the columns based on Rank being max. Is there a way to only group by certain columns?
select ID, Category, max(rank)
from schema.table1
group by ID
Input:
ID Category Rank
111 3 4
111 1 5
123 5 3
124 7 2
Current Output
ID Category Rank
111 3 4
111 9 1
123 5 3
124 7 2
Desired Output
ID Category Rank
111 1 5
123 5 3
124 7 2
You can use:
select *
from table1
where (id, rank) in (select id, max(rank) from table1 group by id)
Result:
ID CATEGORY RANK
---- --------- ----
111 1 5
123 5 3
124 7 2
Or you can use the ROW_NUMBER() window function. For example:
select *
from (
select *,
row_number() over(partition by id order by rank desc) as rn
from table1
) x
where rn = 1
You can try using row_number():
select * from
(
select ID, Category,rank, row_number() over(partition by id order by rank desc) as rn
from schema.table1
)A where rn=1
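The two approaches above agree on the sample input, but they differ when two rows of the same ID share the maximum rank: the IN (... MAX ...) form keeps every tied row, while ROW_NUMBER() keeps exactly one. The sketch below shows this with an in-memory SQLite database; SQLite, the helper run, and the extra tied row (111, 8, 5) are assumptions added for illustration and are not part of the question.

```python
import sqlite3

# Input rows from the question.
ROWS = [(111, 3, 4), (111, 1, 5), (123, 5, 3), (124, 7, 2)]
# Hypothetical extra row that ties with (111, 1, 5) on the maximum rank.
TIED_ROW = (111, 8, 5)

# Keeps every row whose rank equals the per-id maximum (ties included).
IN_MAX = """
SELECT id, category, "rank"
FROM table1
WHERE (id, "rank") IN (SELECT id, MAX("rank") FROM table1 GROUP BY id)
ORDER BY id, category
"""

# Keeps exactly one row per id, even when the maximum is shared.
ROW_NUMBER = """
SELECT id, category, "rank"
FROM (
    SELECT t.*,
           ROW_NUMBER() OVER (PARTITION BY id ORDER BY "rank" DESC) AS rn
    FROM table1 AS t
) AS x
WHERE rn = 1
ORDER BY id
"""


def run(rows, query):
    """Hypothetical helper: load rows into a fresh table and run the query."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute('CREATE TABLE table1 (id INTEGER, category INTEGER, "rank" INTEGER)')
        conn.executemany("INSERT INTO table1 VALUES (?, ?, ?)", rows)
        return conn.execute(query).fetchall()
    finally:
        conn.close()


no_ties_in = run(ROWS, IN_MAX)
no_ties_rn = run(ROWS, ROW_NUMBER)
ties_in = run(ROWS + [TIED_ROW], IN_MAX)
ties_rn = run(ROWS + [TIED_ROW], ROW_NUMBER)
print(len(ties_in), len(ties_rn))
```

Which behavior is right depends on whether the desired output should have one row per ID or every row that reaches the maximum.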

SQL : Return joint most frequent values from a column

I have the following table named customerOrders.
ID user order
1 1 2
2 1 3
3 1 1
4 2 1
5 1 5
6 2 4
7 3 1
8 6 2
9 2 2
10 2 3
I want to return the users with the most orders. Currently, I have the following query:
SELECT user, COUNT(user) AS UsersWithMostOrders
FROM customerOrders
GROUP BY user
ORDER BY UsersWithMostOrders DESC;
This returns every user with their order count, like this:
user UsersWithMostOrders
1 4
2 4
3 1
6 1
I only want to return the users with the most orders. In my case that would be users 1 and 2, since both of them have 4 orders. If I use TOP 1 or LIMIT, it will only return the first user. TOP 2 works only in this scenario; it returns wrong data when the top two users have different order counts.
Required Result
user UsersWithMostOrders
1 4
2 4
You can use TOP 1 WITH TIES:
SELECT TOP 1 WITH TIES
[user], COUNT(*) AS UsersWithMostOrders
FROM customerOrders
GROUP BY [user]
ORDER BY UsersWithMostOrders DESC;
Results:
> user | UsersWithMostOrders
> ---: | ------------------:
> 1 | 4
> 2 | 4
Option 1
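Should work on any engine that supports window functions (the backtick quoting below is MySQL-style; use your engine's identifier quoting).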
select *
from (
select *,
rank() over(order by numOrders desc) as rrank
from (
select `user`, count(*) as numOrders
from customerOrders
group by `user`
) summed
) ranked
where rrank = 1
Option 2
If your version of SQL supports common table expressions (WITH), here is a much more readable solution which does the same thing:
with summed as (
select `user`, count(*) as numOrders
from customerOrders
group by `user`
),
ranked as (
select *,
rank() over(order by numOrders desc) as rrank
from summed
)
select *
from ranked
where rrank = 1
You can use a CTE to achieve this:
;WITH CTE AS (
SELECT [user], COUNT(*) AS UsersWithMostOrders
FROM customerOrders
GROUP BY [user]
)
SELECT M.*
FROM CTE M
INNER JOIN (SELECT MAX(UsersWithMostOrders) AS MaximumOrders FROM CTE) S
ON M.UsersWithMostOrders = S.MaximumOrders;
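The Oracle query below can help. USER and ORDER are reserved words in Oracle, so the columns are quoted here (assuming they were created with these exact quoted names), and the window ORDER BY repeats the aggregate because a column alias cannot be referenced in the same SELECT list:
WITH test_table AS
(
SELECT "user", COUNT("order") AS total_order,
DENSE_RANK() OVER (ORDER BY COUNT("order") DESC) AS rank_orders
FROM customerOrders
GROUP BY "user"
)
SELECT * FROM test_table WHERE rank_orders = 1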
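The answers above all follow the same pattern: count per user, then keep every user whose count equals the maximum. As a rough check against the question's data, the sketch below runs two variants in an in-memory SQLite database: a RANK() version and a window-free HAVING version for engines without window functions. SQLite and the helper run_query are assumptions made for illustration; TOP ... WITH TIES is SQL Server syntax and is not shown because SQLite does not have it.

```python
import sqlite3

# (ID, user, order) rows copied from the question's customerOrders table.
ORDERS = [
    (1, 1, 2), (2, 1, 3), (3, 1, 1), (4, 2, 1), (5, 1, 5),
    (6, 2, 4), (7, 3, 1), (8, 6, 2), (9, 2, 2), (10, 2, 3),
]

# Variant 1: rank the per-user counts; rank 1 includes all joint leaders.
WITH_RANK = """
WITH summed AS (
    SELECT "user", COUNT(*) AS numOrders
    FROM customerOrders
    GROUP BY "user"
),
ranked AS (
    SELECT summed.*, RANK() OVER (ORDER BY numOrders DESC) AS rrank
    FROM summed
)
SELECT "user", numOrders FROM ranked WHERE rrank = 1 ORDER BY "user"
"""

# Variant 2: no window functions; compare each count with the maximum count.
WITH_HAVING = """
SELECT "user", COUNT(*) AS numOrders
FROM customerOrders
GROUP BY "user"
HAVING COUNT(*) = (
    SELECT MAX(cnt)
    FROM (SELECT COUNT(*) AS cnt FROM customerOrders GROUP BY "user")
)
ORDER BY "user"
"""


def run_query(order_rows, query):
    """Hypothetical helper: load the rows into SQLite and run the query."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute(
            'CREATE TABLE customerOrders (ID INTEGER, "user" INTEGER, "order" INTEGER)'
        )
        conn.executemany("INSERT INTO customerOrders VALUES (?, ?, ?)", order_rows)
        return conn.execute(query).fetchall()
    finally:
        conn.close()


by_rank = run_query(ORDERS, WITH_RANK)
by_having = run_query(ORDERS, WITH_HAVING)
print(by_rank, by_having)
```

Both variants return users 1 and 2 with 4 orders each for the sample data, and both collapse to a single row when one user leads outright.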

Get max count with lead name

My dataset is named bollywood.csv.
I need the actors who have the most lead roles in movies.
I need the names of the lead actors and the number of films in which they played the lead.
My code is:
select lead, count(*) as nos from bollywood group by lead order by nos desc;
And the result is:
Amitabh 3
Akshay 3
John 3
Riteish 2
Shahrukh 2
Sunny 2
Emraan 2
Katrina 2
Nawazuddin 2
Tiger 2
Sharman 2
Manoj 2
Vidya 1
Tusshar 1
Tannishtha 1
Sushant 1
SunnyDeol 1
Sonam 1
Sonakshi 1
Siddarth 1
Shahid 1
Sandeep 1
Salman 1
If you want all actors with most lead roles (possibly multiple records):
select lead, count(*) as nos
from bollywood
group by lead
having count(*) =
(select max(cnt) from
(select lead, count(*) cnt
from bollywood
group by lead ) tblBolly )
Use the rownum pseudocolumn with order by (Oracle). Note that this returns a single row even when several actors tie:
select * from (
select lead, count(*) as nos
from bollywood
group by lead
order by nos desc
)
where rownum = 1
Use window functions:
select lead, cnt
from (select lead, count(*) as cnt,
rank() over (order by count(*) desc) as rnk
from bollywood
group by lead
) b
where rnk = 1;

Oracle Nested Grouping

The question is: For each day, list the User ID who has read the most number of messages.
user_id msgID read_date
1 1 10
1 2 10
2 2 10
2 2 23
3 2 23
I believe the date is an outer group and user_id is an inner group, but how to do group nesting in sql? Or somehow avoid this?
This is a task for a Window Function:
select *
from
(
select user_id, read_date, count(*) as cnt,
rank()
over (partition by read_date -- each day
order by count(*) desc) as rnk -- maximum number
from tab
group by user_id, read_date
) dt
where rnk = 1
This might return multiple users for one day if they share the same maximum count. If you want just one (chosen arbitrarily), switch to ROW_NUMBER.
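Alternatively, number the users within each day and keep the first one (ROWNUM <= 1 would keep a single row overall, not one per day):
select user_id, read_date, cnt
from
(
select user_id, read_date, count(msgID) as cnt,
row_number() over (partition by read_date order by count(msgID) desc) as rn
from tab
group by user_id, read_date
)
where rn = 1;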
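The difference between RANK and ROW_NUMBER matters for the question's own data, because on day 23 users 2 and 3 each read one message. The sketch below counts reads per (day, user) and then ranks within each day, once with each function, using an in-memory SQLite database. SQLite and the helper top_readers are assumptions made for illustration; the table name tab follows the first answer.

```python
import sqlite3

# (user_id, msgID, read_date) rows copied from the question.
READS = [(1, 1, 10), (1, 2, 10), (2, 2, 10), (2, 2, 23), (3, 2, 23)]

# {fn} is filled with either RANK or ROW_NUMBER; nothing else varies.
TEMPLATE = """
WITH counts AS (
    SELECT read_date, user_id, COUNT(*) AS cnt
    FROM tab
    GROUP BY read_date, user_id
),
ranked AS (
    SELECT counts.*,
           {fn}() OVER (PARTITION BY read_date ORDER BY cnt DESC) AS rnk
    FROM counts
)
SELECT read_date, user_id, cnt
FROM ranked
WHERE rnk = 1
ORDER BY read_date, user_id
"""


def top_readers(read_rows, keep_ties):
    """Hypothetical helper: top reader(s) per day, with or without ties."""
    query = TEMPLATE.format(fn="RANK" if keep_ties else "ROW_NUMBER")
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE tab (user_id INTEGER, msgID INTEGER, read_date INTEGER)")
        conn.executemany("INSERT INTO tab VALUES (?, ?, ?)", read_rows)
        return conn.execute(query).fetchall()
    finally:
        conn.close()


with_ties = top_readers(READS, keep_ties=True)
one_per_day = top_readers(READS, keep_ties=False)
print(with_ties)
print(one_per_day)
```

RANK returns both tied users for day 23, while ROW_NUMBER returns one row per day and picks between the tied users arbitrarily unless a tie-breaker is added to the ORDER BY.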

Make Two Queries into 1 result set with 2 columns

Say I have a table that looks like this:
Person Table
ID AccountID Name
1 6 Billy
2 6 Joe
3 6 Tom
4 8 Jamie
5 8 Jake
6 8 Sam
I have two queries that I know work by themselves:
Select Name Group1 from person where accountid = 6
Select Name Group2 from person where accountid = 8
But I want a single Result Set to look like this:
Group1 Group2
Billy Jamie
Joe Jake
Tom Sam
You can use row_number() to assign a distinct value to each row, and then use a FULL OUTER JOIN to join the two subqueries on that value:
select t1.group1,
t2.group2
from
(
select name group1,
row_number() over(order by id) rn
from yourtable
where accountid = 6
) t1
full outer join
(
select name group2,
row_number() over(order by id) rn
from yourtable
where accountid = 8
) t2
on t1.rn = t2.rn;
I agree you should do this client side. But it can be done in T-SQL:
select G1.Name as Group1
, G2.Name as Group2
from (
select row_number() over (order by ID) as rn
, *
from Person
where AccountID = 6
) as G1
full outer join
(
select row_number() over (order by ID) as rn
, *
from Person
where AccountID = 8
) as G2
on G1.rn = G2.rn
order by
coalesce(G1.rn, G2.rn)
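For the client-side route mentioned above, a minimal sketch in Python: run the two original queries separately and pair their rows by position. The in-memory SQLite database and the helpers names_for_account and side_by_side are assumptions made for illustration. itertools.zip_longest pads the shorter list with None, which plays the role of the NULLs a FULL OUTER JOIN would produce.

```python
import sqlite3
from itertools import zip_longest

# (ID, AccountID, Name) rows copied from the question's Person table.
PEOPLE = [
    (1, 6, "Billy"), (2, 6, "Joe"), (3, 6, "Tom"),
    (4, 8, "Jamie"), (5, 8, "Jake"), (6, 8, "Sam"),
]


def names_for_account(conn, account_id):
    """Run one of the question's single-account queries, ordered by ID."""
    cur = conn.execute(
        "SELECT Name FROM Person WHERE AccountID = ? ORDER BY ID", (account_id,)
    )
    return [name for (name,) in cur]


def side_by_side(people, left_account, right_account):
    """Hypothetical helper: pair the two result lists row by row."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute(
            "CREATE TABLE Person (ID INTEGER PRIMARY KEY, AccountID INTEGER, Name TEXT)"
        )
        conn.executemany("INSERT INTO Person VALUES (?, ?, ?)", people)
        left = names_for_account(conn, left_account)
        right = names_for_account(conn, right_account)
    finally:
        conn.close()
    # zip_longest keeps unmatched rows and fills the missing side with None.
    return list(zip_longest(left, right))


pairs = side_by_side(PEOPLE, 6, 8)
uneven = side_by_side(PEOPLE[:5], 6, 8)  # drop Sam to show the padding
print(pairs)
print(uneven)
```

The pairing is purely positional, exactly like joining on row_number(): the names in one row have no relationship beyond their order within each account.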