Max 3 values of every group

I sorted my data into something like this, and now I want the top 3 person_occurence values for every company, but I couldn't figure out how to do it.
person_occurence company person_id
67 company_1 110
66 company_2 176
64 company_3 100
64 company_3 196
63 company_4 127
62 company_1 150
61 company_5 120
60 company_3 140
59 company_5 154
59 company_5 162
59 company_4 194
58 company_4 109
58 company_3 128
58 company_1 156
I used this query to get the max for every company, but I can't get the top 3 person_occurence values:
SELECT max(person_occurence), company FROM table GROUP BY company;

Use window functions:
select t.*
from (select t.*,
row_number() over (partition by company order by person_occurence desc) as seqnum
from t
) t
where seqnum <= 3;
This assumes you want exactly three rows per company regardless of ties: if four rows are tied for the highest value, it returns an arbitrary three of them. If tied rows should be kept together, use rank() or dense_rank() instead.
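To see how the three functions differ on ties, here is a minimal sketch using the company_3 and company_5 rows from the question. The inline CTE named t only stands in for the real table (an assumption for illustration), and the UNION ALL form is used because it runs on most engines that support window functions.

```sql
-- Sample rows copied from the question (company_3 and company_5 only).
-- The CTE "t" is a stand-in for the real table.
WITH t AS (
    SELECT 64 AS person_occurence, 'company_3' AS company, 100 AS person_id
    UNION ALL SELECT 64, 'company_3', 196
    UNION ALL SELECT 60, 'company_3', 140
    UNION ALL SELECT 58, 'company_3', 128
    UNION ALL SELECT 61, 'company_5', 120
    UNION ALL SELECT 59, 'company_5', 154
    UNION ALL SELECT 59, 'company_5', 162
)
SELECT company,
       person_id,
       person_occurence,
       -- always 1, 2, 3, ...; the order among tied rows is arbitrary
       ROW_NUMBER() OVER (PARTITION BY company ORDER BY person_occurence DESC) AS rn,
       -- tied rows share a number, then the sequence skips (1, 1, 3, 4)
       RANK()       OVER (PARTITION BY company ORDER BY person_occurence DESC) AS rnk,
       -- tied rows share a number, with no gaps afterwards (1, 1, 2, 3)
       DENSE_RANK() OVER (PARTITION BY company ORDER BY person_occurence DESC) AS drnk
FROM t
ORDER BY company, person_occurence DESC, person_id;
```

With a filter of <= 3, rn and rnk both keep three company_3 rows (64, 64, 60), while drnk also keeps the 58 row because dense_rank() leaves no gap after the tie. rank() returns more than three rows only when a tie straddles the third position.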

You can use a correlated subquery, although this form returns only the rows holding each company's maximum rather than the top 3:
SELECT t1.*
FROM table t1
WHERE t1.person_occurence = (SELECT max(t2.person_occurence) FROM table t2 WHERE t1.company = t2.company);
If your DBMS supports analytic functions, you can get the top 3 with:
select t.*
from (select t.*,
rank() over (partition by company order by person_occurence desc) as seq
from t
) t
where seq <= 3;
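The correlated subquery earlier in this answer compares each row with its company's maximum, so it can only return the top value. One possible way to extend the same idea to the top 3 without window functions is to count how many rows of the same company have a strictly larger value. This is only a sketch; the CTE t stands in for the real table and holds a few rows from the question.

```sql
-- The CTE "t" is a stand-in for the real table (sample rows from the question).
WITH t AS (
    SELECT 64 AS person_occurence, 'company_3' AS company, 100 AS person_id
    UNION ALL SELECT 64, 'company_3', 196
    UNION ALL SELECT 60, 'company_3', 140
    UNION ALL SELECT 58, 'company_3', 128
    UNION ALL SELECT 61, 'company_5', 120
    UNION ALL SELECT 59, 'company_5', 154
    UNION ALL SELECT 59, 'company_5', 162
)
SELECT t1.*
FROM t t1
-- keep a row when fewer than 3 rows of the same company beat it
WHERE (SELECT COUNT(*)
       FROM t t2
       WHERE t2.company = t1.company
         AND t2.person_occurence > t1.person_occurence) < 3
ORDER BY t1.company, t1.person_occurence DESC;
```

Because ties are counted together, this behaves like rank() <= 3: for company_3 it keeps 64, 64 and 60 and drops 58. On large tables the window-function version is usually the better choice, since the subquery runs once per row.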

Related

Count similar names block per independent partitions

I have a dataframe that looks like this:
id name datetime
44 once 2022-11-22T15:41:00
44 once 2022-11-22T15:42:00
44 once 2022-11-22T15:43:00
44 twice 2022-11-22T15:44:00
44 once 2022-11-22T16:41:00
55 thrice 2022-11-22T17:44:00
55 thrice 2022-11-22T17:46:00
55 once 2022-11-22T17:47:00
55 once 2022-11-22T17:51:00
55 twice 2022-11-22T18:41:00
55 thrice 2022-11-22T18:51:00
My desired output is
id name datetime cnt
44 once 2022-11-22T15:41:00 3
44 once 2022-11-22T15:42:00 3
44 once 2022-11-22T15:43:00 3
44 twice 2022-11-22T15:44:00 1
44 once 2022-11-22T16:41:00 1
55 thrice 2022-11-22T17:44:00 2
55 thrice 2022-11-22T17:46:00 2
55 once 2022-11-22T17:47:00 2
55 once 2022-11-22T17:51:00 2
55 twice 2022-11-22T18:41:00 1
55 thrice 2022-11-22T18:51:00 1
where the new column, cnt, is the number of rows in each block of consecutive identical name values within an id.
I attempted the problem by doing:
select
id,
name,
datetime,
row_number() over (partition by id order by datetime) rn1,
row_number() over (partition by id, name order by name, datetime) rn2
from table
but it is obviously not giving the desired output.
I also looked at the solutions in "SQL count consecutive days" but could not work out the answer from what was given there.

As noted in the question you linked to, this is a typical gaps & islands problem.
The solution is provided in the answers to that question, but I've applied it to your sample data specifically for you here:
with gp as (
select *,
Row_Number() over(partition by id order by [datetime])
- Row_Number() over(partition by id, name order by [datetime]) g
from t
)
select id, name, [datetime],
Count(*) over(partition by id, name, g) cnt
from gp;
See Demo DBFiddle
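To show why the difference of the two row numbers identifies each block, here is a sketch that exposes the intermediate values for the id 44 rows from the question. The column is renamed to dt (an assumption, to avoid quoting the name datetime), and the CTE t stands in for the real table.

```sql
-- Rows for id 44 copied from the question; "dt" replaces the datetime column.
WITH t AS (
    SELECT 44 AS id, 'once' AS name, '2022-11-22T15:41:00' AS dt
    UNION ALL SELECT 44, 'once',  '2022-11-22T15:42:00'
    UNION ALL SELECT 44, 'once',  '2022-11-22T15:43:00'
    UNION ALL SELECT 44, 'twice', '2022-11-22T15:44:00'
    UNION ALL SELECT 44, 'once',  '2022-11-22T16:41:00'
)
SELECT id, name, dt,
       -- position of the row within its id
       ROW_NUMBER() OVER (PARTITION BY id ORDER BY dt)       AS rn_all,
       -- position of the row among rows with the same id and name
       ROW_NUMBER() OVER (PARTITION BY id, name ORDER BY dt) AS rn_name,
       -- constant while the same name repeats, changes when the run is broken
       ROW_NUMBER() OVER (PARTITION BY id ORDER BY dt)
     - ROW_NUMBER() OVER (PARTITION BY id, name ORDER BY dt) AS g
FROM t
ORDER BY id, dt;
-- rn_all, rn_name, g per row:
--   once  15:41 -> 1, 1, 0
--   once  15:42 -> 2, 2, 0
--   once  15:43 -> 3, 3, 0
--   twice 15:44 -> 4, 1, 3
--   once  16:41 -> 5, 4, 1
```

The first three rows share (once, g = 0), so counting over (id, name, g) gives 3, while the later once row lands in its own island (once, g = 1) with a count of 1. The value of g has no meaning by itself; it only needs to stay constant within a run.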

Get sum qty over specific range

I have the below table
substring(area,6,3) qty
101 10
103 15
102 11
104 30
105 25
107 17
108 23
106 48
I am looking for the result below, where each new area covers four consecutive areas, without repeating an IIF for every range:
new_area sum_qty
101-104 66
105-108 117
I don't know how to create the new area column to be able to get the sum qty
Looking forward to your help.
Please also add an explanation so I will understand how the query is running.

I think this is what you are looking for.
We just use the window function row_number() to create the Grp
NOTE: If you have repeating values in AREA use dense_rank() instead of row_number()
Example
Select new_area = concat(min(area),'-',max(area))
,qty = sum(qty)
From (
Select area=substring(area,6,3)
,qty
,Grp = (row_number() over (order by substring(area,6,3))-1) / 4
From YourTable
) A
Group By Grp
Results
new_area qty
101-104 66
105-108 113 -- differs from the 117 in the question: 25 + 48 + 17 + 23 = 113
If you run the subquery on its own, you will see that each row carries a Grp value, one per block of four areas.
It then becomes a small matter to aggregate the data grouped by that Grp column.
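As a sketch of what the subquery produces, the query below runs the same Grp expression over the sample rows from the question. The CTE src is a hypothetical stand-in for YourTable with the substring already applied, and the expression relies on integer division as in SQL Server; an engine where / returns a decimal (MySQL or MariaDB, for instance) would need its integer-division operator instead.

```sql
-- "src" stands in for YourTable; area already holds substring(area,6,3).
WITH src AS (
    SELECT '101' AS area, 10 AS qty
    UNION ALL SELECT '103', 15
    UNION ALL SELECT '102', 11
    UNION ALL SELECT '104', 30
    UNION ALL SELECT '105', 25
    UNION ALL SELECT '107', 17
    UNION ALL SELECT '108', 23
    UNION ALL SELECT '106', 48
)
SELECT area,
       qty,
       ROW_NUMBER() OVER (ORDER BY area)           AS rn,
       -- rn 1..4 -> (0..3) / 4 = 0, rn 5..8 -> (4..7) / 4 = 1
       (ROW_NUMBER() OVER (ORDER BY area) - 1) / 4 AS grp
FROM src
ORDER BY area;
-- areas 101, 102, 103, 104 get grp 0 (qty 10 + 11 + 15 + 30 = 66)
-- areas 105, 106, 107, 108 get grp 1 (qty 25 + 48 + 17 + 23 = 113)
```

Subtracting 1 before dividing is what makes the first block hold four rows instead of three; changing the divisor changes the block size.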

SQL Subquerys and RANK()

I'm using the query below for a tournament system. The table contains the registered lengths for all teams. The result will be a scoreboard that sums each team's lengths into a totalScore.
I'm trying to work the RANK() function into my SQL but I'm stuck. I want to get each team's current rank out of my DB. Does anyone have any ideas? I'm using MariaDB.
select team, sum(length) as totalScore
from
(SELECT t.*,
@num_in_group:=case when @team!=team then @num_in_group:=0 else @num_in_group:=@num_in_group+1 end as num_in_group,
@team:=team as t
FROM reg_catches t, (select @team:=-1, @num_in_group:=0) init
ORDER BY team asc, length desc) sub
WHERE sub.num_in_group<=4
GROUP BY team
ORDER BY totalScore DESC;
Table
team length
-----------
26 70
25 70
25 95
25 98
25 100
25 100
25 100
25 122
Current output
team totalScore
-- --
25 520
26 70
Wanted output
rank team totalScore
-- -- --
1 25 520
2 26 70

SET @row = 0;
SELECT @row:=@row + 1 rank, a.team, a.total_score
FROM(SELECT team, sum(r.length) as total_score FROM reg_catches r GROUP BY
r.team) a;
Try the above

I got this far with the help above from Dickson; the problem now is that the rank seems to follow the team ID instead of totalScore.
SET @row = 0;
SELECT @row:=@row + 1 rank, team, sum(length) as totalScore
from
(SELECT t.*,
@num_in_group:=case when @team!=team then @num_in_group:=0 else @num_in_group:=@num_in_group+1 end as num_in_group,
@team:=team as t
FROM reg_catches t, (select @team:=-1, @num_in_group:=0) init
ORDER BY team asc, length desc) sub
WHERE sub.num_in_group<=4 and competition = "#COMPID" and disqualified = 0
GROUP BY team
ORDER BY totalScore DESC
Current output
rank team totalScore
1 28381 479
58 28468 439
20 28412 436
25 28419 432
14 28404 427
5 28388 421
Wanted would be
rank team totalScore
1 28381 479
2 28468 439
3 28412 436
4 28419 432
5 28404 427
6 28388 421
SQL Fiddle: http://sqlfiddle.com/#!9/107d98/2/1
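The rank in the attempt above is likely off because the @row counter is incremented as rows are read, before GROUP BY and the final ORDER BY are applied. If the server is MariaDB 10.2 or later (an assumption), window functions avoid user variables altogether. The sketch below uses the sample rows from the question; the CTE reuses the name reg_catches only so the sample stands in for the real table, and team_rank is a hypothetical alias chosen to avoid the keyword rank.

```sql
-- Assumes MariaDB 10.2+ (CTEs and window functions).
WITH reg_catches AS (
    SELECT 26 AS team, 70 AS length
    UNION ALL SELECT 25, 70
    UNION ALL SELECT 25, 95
    UNION ALL SELECT 25, 98
    UNION ALL SELECT 25, 100
    UNION ALL SELECT 25, 100
    UNION ALL SELECT 25, 100
    UNION ALL SELECT 25, 122
),
numbered AS (
    -- number each team's catches from longest to shortest
    SELECT team, length,
           ROW_NUMBER() OVER (PARTITION BY team ORDER BY length DESC) AS rn
    FROM reg_catches
),
totals AS (
    -- keep the five longest per team (num_in_group <= 4 in the original is 0-based)
    SELECT team, SUM(length) AS totalScore
    FROM numbered
    WHERE rn <= 5
    GROUP BY team
)
SELECT RANK() OVER (ORDER BY totalScore DESC) AS team_rank,
       team,
       totalScore
FROM totals
ORDER BY team_rank;
-- team 25: 122 + 100 + 100 + 100 + 98 = 520 -> team_rank 1
-- team 26: 70                               -> team_rank 2
```

Filters such as the competition and disqualified conditions would go in the WHERE clause of the numbered step, so that excluded catches never take one of the five slots. Teams with equal totals share a rank with RANK(); DENSE_RANK() or ROW_NUMBER() can be swapped in if a different tie rule is wanted.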

Something like rank() in SQL Server

How can I write a query in SQL Server that works like rank() but numbers the rows slightly differently?
For example, rank() gives:
rankNumber uniqueId
1 160
2 159
3 158
4 157
5 156
5 156
7 152
8 151
8 151
10 150
I need a result like this:
rankNumber uniqueId
1 160
2 159
3 158
4 157
5 156
5 156
6 152
7 151
7 151
8 150
How can I do this? Is there such a function in SQL Server?

SELECT DENSE_RANK() OVER (ORDER BY TotCnt DESC) AS TopCustomers, CustomerID, TotCnt
FROM (SELECT CustomerID, COUNT(*) AS TotCnt
FROM Orders Group BY CustomerID) AS Cust

To expand on the DENSE_RANK comment, the full query is short and sweet:
SELECT
DENSE_RANK() OVER (ORDER BY uniqueId DESC) AS rankNumber,
uniqueId
FROM myTable
ORDER BY rankNumber
There's a SQL Fiddle here

Sql get latest records of the month for each name

This has probably been answered before, but I can't find how to get the latest record of each month.
The problem is that my table sometimes has 2 rows for the same month. I can't use an aggregate function (I guess) because the 2 rows hold different data and I need the latest one.
Example:
name Date nuA nuB nuC nuD
test1 05/06/2013 356 654 3957 7033
test1 05/26/2013 113 237 399 853
test3 06/06/2013 145 247 68 218
test4 06/22/2013 37 37 6 25
test4 06/27/2013 50 76 20 84
test4 05/15/2013 34 43 34 54
I need to get a result like:
test1 05/26/2013 113 237 399 853
test3 06/06/2013 145 247 68 218
test4 05/15/2013 34 43 34 54
test4 06/27/2013 50 76 20 84
** in my example the data is in order but in my real table the data is not in order.
For now I have something like:
SELECT Name, max(DATE) , nuA,nuB,nuC,nuD
FROM tableA INNER JOIN
Group By Name, nuA,nuB,nuC,nuD
But it doesn't work the way I want.
Thanks in advance
Edit1:
It seems that I wasn't clear with my question...
So I added some data to my example to show what I need.
Thanks guys

Use SQL Server ranking functions.
select name, Date, nuA, nuB, nuC, nuD from
(Select *, row_number() over (partition by name, datepart(year, Date),
datepart(month, Date) order by Date desc) as ranker from Table
) Z
where ranker = 1

Try this
SELECT t1.* FROM Table1 t1
INNER JOIN
(
SELECT [name],MAX([date]) as [date] FROM Table1
GROUP BY [name],YEAR([date]),MONTH([date])
) t2
ON t1.[date]=t2.[date] and t1.[name]=t2.[name]
ORDER BY t1.[name]

Can you not just do an order
select * from tablename where Date = (select max(Date) from tablename)
followed by only pulling the first 3?
