Max 3 values of every group - sql

I sorted my data into something like this, and now I want the top 3 person_occurence values for every company, but I couldn't figure out how to do it.
person_occurence company person_id
67 company_1 110
66 company_2 176
64 company_3 100
64 company_3 196
63 company_4 127
62 company_1 150
61 company_5 120
60 company_3 140
59 company_5 154
59 company_5 162
59 company_4 194
58 company_4 109
58 company_3 128
58 company_1 156
I used this query to get the max for every company, but I can't get the top 3 person_occurence values:
SELECT max(person_occurence), company FROM table GROUP BY company;

Use window functions:
select t.*
from (select t.*,
row_number() over (partition by company order by person_occurence desc) as seqnum
from t
) t
where seqnum <= 3;
Now, this assumes that you want the top 3 regardless of ties -- that is, if 4 are tied with the same highest value, this returns three of them. Ties make things more difficult. You may want dense_rank() or rank() instead.
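To make the tie behaviour concrete, here is a minimal sketch that runs the same query against the question's sample rows. It uses Python's built-in sqlite3 module as a stand-in database (an assumption; the question does not name a DBMS) and needs an SQLite build with window functions (3.25 or later). The helper name top3 is only for illustration.

```python
import sqlite3

# Sample rows from the question: (person_occurence, company, person_id)
rows = [
    (67, "company_1", 110), (66, "company_2", 176), (64, "company_3", 100),
    (64, "company_3", 196), (63, "company_4", 127), (62, "company_1", 150),
    (61, "company_5", 120), (60, "company_3", 140), (59, "company_5", 154),
    (59, "company_5", 162), (59, "company_4", 194), (58, "company_4", 109),
    (58, "company_3", 128), (58, "company_1", 156),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE t (person_occurence INTEGER, company TEXT, person_id INTEGER)"
)
conn.executemany("INSERT INTO t VALUES (?, ?, ?)", rows)


def top3(rank_fn):
    """Return the top-3 rows per company using the given ranking function."""
    sql = f"""
        SELECT company, person_occurence, person_id
        FROM (SELECT t.*,
                     {rank_fn}() OVER (PARTITION BY company
                                       ORDER BY person_occurence DESC) AS seqnum
              FROM t) AS ranked
        WHERE seqnum <= 3
        ORDER BY company, person_occurence DESC, person_id
    """
    return conn.execute(sql).fetchall()


by_row_number = top3("row_number")  # exactly 3 rows per company at most
by_dense_rank = top3("dense_rank")  # top 3 distinct values, ties included

for row in by_dense_rank:
    print(row)
```

With this data the two variants differ only for company_3, where 64 appears twice: row_number() stops after 64, 64, 60, while dense_rank() also keeps the row with 58 because it is the third distinct value.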

You can use a correlated subquery, although as written it returns only the row(s) holding the maximum for each company, not the top 3:
SELECT t1.*
FROM table t1
WHERE t1.person_occurence = (SELECT max(t2.person_occurence) FROM table t2 WHERE t1.company = t2.company);
If your DBMS supports analytic functions, you can also do:
select t.*
from (select t.*,
rank() over (partition by company order by person_occurence desc) as seq
from t
) t
where seq <= 3;

Related

Count similar names block per independent partitions

I have a dataframe that looks like this:
id name datetime
44 once 2022-11-22T15:41:00
44 once 2022-11-22T15:42:00
44 once 2022-11-22T15:43:00
44 twice 2022-11-22T15:44:00
44 once 2022-11-22T16:41:00
55 thrice 2022-11-22T17:44:00
55 thrice 2022-11-22T17:46:00
55 once 2022-11-22T17:47:00
55 once 2022-11-22T17:51:00
55 twice 2022-11-22T18:41:00
55 thrice 2022-11-22T18:51:00
My desired output is
id name datetime cnt
44 once 2022-11-22T15:41:00 3
44 once 2022-11-22T15:42:00 3
44 once 2022-11-22T15:43:00 3
44 twice 2022-11-22T15:44:00 1
44 once 2022-11-22T16:41:00 1
55 thrice 2022-11-22T17:44:00 2
55 thrice 2022-11-22T17:46:00 2
55 once 2022-11-22T17:47:00 2
55 once 2022-11-22T17:51:00 2
55 twice 2022-11-22T18:41:00 1
55 thrice 2022-11-22T18:51:00 1
where the new column, cnt, is the number of rows in each block of consecutive rows that share the same name within an id.
I attempted the problem by doing:
select
id,
name,
datetime,
row_number() over (partition by id order by datetime) rn1,
row_number() over (partition by id, name order by name, datetime) rn2
from table
but it is obviously not giving the desired output.
I also tried looking at the solutions in SQL count consecutive days but could not work out a solution from the answers given there.
As noted in the question you linked to, this is a typical gaps & islands problem.
The solution is provided in the answers to that question, but I've applied it to your sample data specifically for you here:
with gp as (
select *,
Row_Number() over(partition by id order by [datetime])
- Row_Number() over(partition by id, name order by [datetime]) g
from t
)
select id, name, [datetime],
Count(*) over(partition by id, name, g) cnt
from gp;
See Demo DBFiddle
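As a quick check of the idea, the sketch below runs the same two-row_number() difference against the sample data. It assumes SQLite (via Python's sqlite3, version 3.25 or later for window functions) in place of SQL Server, so the bracketed [datetime] becomes a double-quoted identifier; the table name t follows the answer.

```python
import sqlite3

# Sample rows from the question: (id, name, datetime)
rows = [
    (44, "once", "2022-11-22T15:41:00"), (44, "once", "2022-11-22T15:42:00"),
    (44, "once", "2022-11-22T15:43:00"), (44, "twice", "2022-11-22T15:44:00"),
    (44, "once", "2022-11-22T16:41:00"), (55, "thrice", "2022-11-22T17:44:00"),
    (55, "thrice", "2022-11-22T17:46:00"), (55, "once", "2022-11-22T17:47:00"),
    (55, "once", "2022-11-22T17:51:00"), (55, "twice", "2022-11-22T18:41:00"),
    (55, "thrice", "2022-11-22T18:51:00"),
]

conn = sqlite3.connect(":memory:")
conn.execute('CREATE TABLE t (id INTEGER, name TEXT, "datetime" TEXT)')
conn.executemany("INSERT INTO t VALUES (?, ?, ?)", rows)

# Within one id, the overall row number minus the per-name row number stays
# constant while the same name repeats consecutively, so (id, name, g)
# identifies one island.
sql = """
    WITH gp AS (
        SELECT id, name, "datetime",
               ROW_NUMBER() OVER (PARTITION BY id ORDER BY "datetime")
             - ROW_NUMBER() OVER (PARTITION BY id, name ORDER BY "datetime") AS g
        FROM t
    )
    SELECT id, name, "datetime",
           COUNT(*) OVER (PARTITION BY id, name, g) AS cnt
    FROM gp
    ORDER BY id, "datetime"
"""
result = conn.execute(sql).fetchall()
counts = [row[3] for row in result]
print(counts)
```

For id 44, for example, the three leading "once" rows get g = 0 while the later, isolated "once" row gets g = 1, which is why it is counted separately.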

Get sum qty over specific range

I have the below table
substring(area,6,3) qty
101 10
103 15
102 11
104 30
105 25
107 17
108 23
106 48
And I am looking to get a result like the one below, where each new area covers 4 consecutive area codes and its qty values are summed, without repeating an IIF for every range:
new_area(substring(area,6,3)) sum_qty
101-104 66
105-108 117
I don't know how to create the new area column to be able to get the sum qty
Looking forward to your help.
Please also add an explanation so I will understand how the query is running.
I think this is what you are looking for.
We just use the window function row_number() to create the Grp
NOTE: If you have repeating values in AREA use dense_rank() instead of row_number()
Example
Select new_area = concat(min(area),'-',max(area))
,qty = sum(qty)
From (
Select area=substring(area,6,3)
,qty
,Grp = (row_number() over (order by substring(area,6,3))-1) / 4
From YourTable
) A
Group By Grp
Results
new_area qty
101-104 66
105-108 113 -- differs from the 117 in the question; the sample rows sum to 113
If you were to run the subquery on its own, you would see the following:
area qty Grp
101 10 0
102 11 0
103 15 0
104 30 0
105 25 1
106 48 1
107 17 1
108 23 1
Then it becomes a small matter to aggregate the data grouped by the created column Grp.
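The bucketing step can be tried out with a minimal sketch like the following. It assumes SQLite through Python's sqlite3 (3.25 or later) rather than SQL Server, stores the three-digit code directly in a column named area instead of taking substring(area,6,3), and uses || for concatenation; your_table is a placeholder name.

```python
import sqlite3

# Sample rows from the question: (three-digit area code, qty)
rows = [
    ("101", 10), ("103", 15), ("102", 11), ("104", 30),
    ("105", 25), ("107", 17), ("108", 23), ("106", 48),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE your_table (area TEXT, qty INTEGER)")
conn.executemany("INSERT INTO your_table VALUES (?, ?)", rows)

# row_number() numbers the areas 1..8 in order; subtracting 1 and using
# integer division by 4 maps rows 1-4 to group 0 and rows 5-8 to group 1.
sql = """
    SELECT MIN(area) || '-' || MAX(area) AS new_area,
           SUM(qty) AS qty
    FROM (SELECT area, qty,
                 (ROW_NUMBER() OVER (ORDER BY area) - 1) / 4 AS grp
          FROM your_table) AS a
    GROUP BY grp
    ORDER BY grp
"""
buckets = conn.execute(sql).fetchall()
print(buckets)
```

Because both operands are integers, the division truncates, which is what turns a running row number into a group number.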

SQL Subqueries and RANK()

I'm using the query below for a tournament system. The table contains the registered lengths for all teams. The result will be a scoreboard that sums each team's lengths into a totalScore.
I'm trying to get the RANK() function into my SQL but I'm stuck right now. I want to get each team's current rank out of my DB. Anyone got any ideas? I'm using MariaDB.
select team, sum(length) as totalScore
from
(SELECT t.*,
@num_in_group:=case when @team!=team then @num_in_group:=0 else @num_in_group:=@num_in_group+1 end as num_in_group,
@team:=team as t
FROM reg_catches t, (select @team:=-1, @num_in_group:=0) init
ORDER BY team asc, length desc) sub
WHERE sub.num_in_group<=4
GROUP BY team
ORDER BY totalScore DESC;
Table
team length
-----------
26 70
25 70
25 95
25 98
25 100
25 100
25 100
25 122
Current output
team totalScore
-- --
25 520
26 70
Wanted output
rank team totalScore
-- -- --
1 25 520
2 26 70
SET @row = 0;
SELECT @row:=@row + 1 rank, a.team, a.total_score
FROM(SELECT team, sum(r.length) as total_score FROM reg_catches r GROUP BY
r.team) a;
Try the above
With the above help from Dickson I got this far; the problem now is that the rank seems to follow the team ID instead of totalScore.
SET @row = 0;
SELECT @row:=@row + 1 rank, team, sum(length) as totalScore
from
(SELECT t.*,
@num_in_group:=case when @team!=team then @num_in_group:=0 else @num_in_group:=@num_in_group+1 end as num_in_group,
@team:=team as t
FROM reg_catches t, (select @team:=-1, @num_in_group:=0) init
ORDER BY team asc, length desc) sub
WHERE sub.num_in_group<=4 and competition = "#COMPID" and disqualified = 0
GROUP BY team
ORDER BY totalScore DESC
Current output
rank team totalScore
1 28381 479
58 28468 439
20 28412 436
25 28419 432
14 28404 427
5 28388 421
Wanted would be
rank team totalScore
1 28381 479
2 28468 439
3 28412 436
4 28419 432
5 28404 427
6 28388 421
SQL Fiddle: http://sqlfiddle.com/#!9/107d98/2/1
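One way to keep the rank tied to totalScore is to aggregate first and rank the aggregated rows afterwards. The sketch below is not from the thread: it assumes a database with window functions (MariaDB has them from 10.2; SQLite 3.25 or later is used here through Python's sqlite3 as a stand-in) and replaces the user variables with ROW_NUMBER() and RANK(). The alias team_rank and the top-5 cut-off (matching num_in_group<=4 counted from 0) are illustrative choices.

```python
import sqlite3

# Sample rows from the question's small table: (team, length)
rows = [
    (26, 70), (25, 70), (25, 95), (25, 98),
    (25, 100), (25, 100), (25, 100), (25, 122),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE reg_catches (team INTEGER, length INTEGER)")
conn.executemany("INSERT INTO reg_catches VALUES (?, ?)", rows)

sql = """
    SELECT RANK() OVER (ORDER BY totalScore DESC) AS team_rank,
           team,
           totalScore
    FROM (
        -- sum the five longest catches of each team
        SELECT team, SUM(length) AS totalScore
        FROM (SELECT team, length,
                     ROW_NUMBER() OVER (PARTITION BY team
                                        ORDER BY length DESC) AS rn
              FROM reg_catches) AS numbered
        WHERE rn <= 5
        GROUP BY team
    ) AS scores
    ORDER BY team_rank
"""
scoreboard = conn.execute(sql).fetchall()
print(scoreboard)
```

Since RANK() is evaluated over the already grouped rows, its numbering follows totalScore and no longer depends on the order in which rows were read.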

Something like rank() in SQL Server

How can I write a query in SQL Server that works like rank() but numbers the rows slightly differently?
For example, rank() gives:
rankNumber uniqueId
1 160
2 159
3 158
4 157
5 156
5 156
7 152
8 151
8 151
10 150
I need a result like this:
rankNumber uniqueId
1 160
2 159
3 158
4 157
5 156
5 156
6 152
7 151
7 151
8 150
How can I do this? Is there such a function in SQL Server?
SELECT DENSE_RANK() OVER (ORDER BY TotCnt DESC) AS TopCustomers, CustomerID, TotCnt
FROM (SELECT CustomerID, COUNT(*) AS TotCnt
FROM Orders Group BY CustomerID) AS Cust
To expand on the DENSE_RANK comment, the full query is short and sweet:
SELECT
DENSE_RANK() OVER (ORDER BY uniqueId DESC) AS rankNumber,
uniqueId
FROM myTable
ORDER BY rankNumber
There's a SQL Fiddle here
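To see the two functions side by side on the question's values, here is a small sketch. It assumes SQLite (Python's sqlite3, version 3.25 or later) instead of SQL Server; the table name myTable follows the answer and the output column names are illustrative.

```python
import sqlite3

# uniqueId values from the question, including the duplicates
ids = [160, 159, 158, 157, 156, 156, 152, 151, 151, 150]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE myTable (uniqueId INTEGER)")
conn.executemany("INSERT INTO myTable VALUES (?)", [(i,) for i in ids])

sql = """
    SELECT uniqueId,
           RANK()       OVER (ORDER BY uniqueId DESC) AS rank_number,
           DENSE_RANK() OVER (ORDER BY uniqueId DESC) AS dense_rank_number
    FROM myTable
    ORDER BY uniqueId DESC
"""
result = conn.execute(sql).fetchall()
ranks = [r[1] for r in result]        # leaves gaps after ties
dense_ranks = [r[2] for r in result]  # no gaps after ties
print(ranks)
print(dense_ranks)
```

RANK() skips a number after each tie, while DENSE_RANK() continues with the next integer, which is the numbering the question asks for.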

Sql get latest records of the month for each name

This question has probably been answered before, but I can't find how to get the latest record of each month.
The problem is that I have a table that sometimes has 2 rows for the same month. I can't use an aggregate function (I guess), because the 2 rows hold different data and I need the latest one.
Example:
name Date nuA nuB nuC nuD
test1 05/06/2013 356 654 3957 7033
test1 05/26/2013 113 237 399 853
test3 06/06/2013 145 247 68 218
test4 06/22/2013 37 37 6 25
test4 06/27/2013 50 76 20 84
test4 05/15/2013 34 43 34 54
I need to get a result like:
test1 05/26/2013 113 237 399 853
test3 06/06/2013 145 247 68 218
test4 05/15/2013 34 43 34 54
test4 06/27/2013 50 76 20 84
** In my example the data is in order, but in my real table it is not.
For now I have something like:
SELECT Name, max(DATE) , nuA,nuB,nuC,nuD
FROM tableA INNER JOIN
Group By Name, nuA,nuB,nuC,nuD
But it doesn't work the way I want.
Thanks in advance
Edit1:
It seems that I wasn't clear with my question...
So I added some data to my example to show how I need it to work.
Thanks guys
Use SQL Server ranking functions.
select name, Date, nuA, nuB, nuC, nuD from
(Select *, row_number() over (partition by name, datepart(year, Date),
datepart(month, Date) order by Date desc) as ranker from Table
) Z
where ranker = 1
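The partition-by-month idea can be tried on the question's sample rows with the sketch below. It makes several assumptions: SQLite (Python's sqlite3, version 3.25 or later) instead of SQL Server, dates stored as ISO strings in a column named rec_date (a hypothetical name), and strftime('%Y-%m', ...) standing in for the two datepart() calls. The table name tableA follows the question.

```python
import sqlite3

# Sample rows from the question, dates rewritten as YYYY-MM-DD
rows = [
    ("test1", "2013-05-06", 356, 654, 3957, 7033),
    ("test1", "2013-05-26", 113, 237, 399, 853),
    ("test3", "2013-06-06", 145, 247, 68, 218),
    ("test4", "2013-06-22", 37, 37, 6, 25),
    ("test4", "2013-06-27", 50, 76, 20, 84),
    ("test4", "2013-05-15", 34, 43, 34, 54),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE tableA (name TEXT, rec_date TEXT, "
    "nuA INTEGER, nuB INTEGER, nuC INTEGER, nuD INTEGER)"
)
conn.executemany("INSERT INTO tableA VALUES (?, ?, ?, ?, ?, ?)", rows)

# Number the rows of each (name, year-month) from newest to oldest,
# then keep only the newest one.
sql = """
    SELECT name, rec_date, nuA, nuB, nuC, nuD
    FROM (SELECT tableA.*,
                 ROW_NUMBER() OVER (
                     PARTITION BY name, strftime('%Y-%m', rec_date)
                     ORDER BY rec_date DESC) AS ranker
          FROM tableA) AS z
    WHERE ranker = 1
    ORDER BY name, rec_date
"""
latest = conn.execute(sql).fetchall()
for row in latest:
    print(row)
```

The insertion order of the rows does not matter here, because the ordering inside each partition comes from the ORDER BY in the window definition.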
Try this
SELECT t1.* FROM Table1 t1
INNER JOIN
(
SELECT [name],MAX([date]) as [date] FROM Table1
GROUP BY [name],YEAR([date]),MONTH([date])
) t2
ON t1.[date]=t2.[date] and t1.[name]=t2.[name]
ORDER BY t1.[name]
Can you not just do an order
select * from tablename where Date = (select max(Date) from tablename)
followed by only pulling the first 3?