How to rank on aggregated sum in Postgresql?

I want to rank by aggregated points. Example: a guessing game. On day 1, person A guesses and gets 10 points, and person B guesses and gets 9 points. On day 2, person A gets 5 points and person B gets 9.
What I want to get is:
On day 2, person A has an aggregated total of 15 points and ranks 2nd.
Here's the basic table guesses:
id, person, points, day
1, thomas, 10, 1
2, thomas, 5, 2
3, marie, 9, 1
4, marie, 9, 2
I have no problem getting the running total of points per person:
SELECT
*,
sum(points) OVER (PARTITION BY person ORDER BY id) AS total_running_points
FROM
guesses
ORDER BY
day asc;
But now I need a rank for every day.
I tried the following, but it fails because total_running_points is an alias defined in the same SELECT list and cannot be referenced there:
SELECT
*,
sum(points) OVER (PARTITION BY person ORDER BY id) AS total_running_points,
rank() OVER (ORDER BY total_running_points desc)
FROM
bets_by_day
ORDER BY
day asc;
I sense that I should use a subquery but then I wonder how to partition on it.
How can I solve this?

You can use a subquery:
SELECT b.*, rank() over (order by total_running_points desc) rnk
FROM (
SELECT b.*, sum(points) over (partition by person order by id) AS total_running_points
FROM bets_by_day b
) b
ORDER BY day asc;
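The question asks for a rank on every day, so one possible reading is that the ranking should restart for each day. The sketch below is a minimal illustration of that reading, not the answerer's exact query: it adds PARTITION BY day to the outer rank() and runs against the sample guesses table from the question. Using Python's sqlite3 module (with a SQLite build of 3.25 or later for window functions) as a stand-in for PostgreSQL is an assumption made only so the example is self-contained.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for PostgreSQL (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE guesses (id INTEGER PRIMARY KEY, person TEXT, points INTEGER, day INTEGER)"
)
conn.executemany(
    "INSERT INTO guesses VALUES (?, ?, ?, ?)",
    [(1, "thomas", 10, 1), (2, "thomas", 5, 2), (3, "marie", 9, 1), (4, "marie", 9, 2)],
)

# Inner query: running total per person. Outer query: rank within each day.
query = """
SELECT g.person, g.day, g.total_running_points,
       rank() OVER (PARTITION BY g.day ORDER BY g.total_running_points DESC) AS rnk
FROM (
    SELECT guesses.*,
           sum(points) OVER (PARTITION BY person ORDER BY id) AS total_running_points
    FROM guesses
) AS g
ORDER BY g.day, rnk
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

With the sample data, thomas has 15 points on day 2 and ranks 2nd behind marie with 18, which matches the result described in the question.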

Related

Get last record by month/year and id

I need to get the last record of each month/year for each id.
My table captures, daily and for each id, an order value which is cumulative. So in the end I need only the last record of each month for each id.
I believe it is something simple, but I could not adapt the examples I found to my case.
Here is an example of my input data and the expected result: db_fiddle.
My attempt doesn't include grouping by month and year:
select ar.id, ar.value, ar.aquisition_date
from table_views ar
inner join (
select id, max(aquisition_date) as last_aquisition_date_month
from table_views
group by id
)ld
on ar.id = ld.id and ar.aquisition_date = ld.last_aquisition_date_month

You could do this:
with tn as (
select
*,
row_number() over (partition by id, date_trunc('month', aquisition_date) order by aquisition_date desc) as rn
from table_views
)
select * from tn where rn = 1
The tn CTE adds a row number that counts up in descending order of date within each id/month. Then you keep only the rows with rn = 1, which hold the last aquisition_date of each month for each id.
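To see the row-number approach at work, here is a small sketch run through Python's sqlite3 module. Several things are assumptions for illustration only: the sample rows are made up (the original fiddle data is not shown here), dates are stored as ISO text, and SQLite's strftime('%Y-%m', ...) stands in for PostgreSQL's date_trunc('month', ...), which SQLite does not provide.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table_views (id INTEGER, value INTEGER, aquisition_date TEXT)")
# Hypothetical sample rows: cumulative values captured on a few days per id.
conn.executemany(
    "INSERT INTO table_views VALUES (?, ?, ?)",
    [
        (1, 10, "2021-01-05"),
        (1, 25, "2021-01-31"),
        (1, 40, "2021-02-14"),
        (2, 7, "2021-01-20"),
        (2, 9, "2021-02-03"),
        (2, 12, "2021-02-27"),
    ],
)

# Number the rows of each (id, month) from newest to oldest, then keep rn = 1.
query = """
WITH tn AS (
    SELECT *,
           row_number() OVER (
               PARTITION BY id, strftime('%Y-%m', aquisition_date)
               ORDER BY aquisition_date DESC
           ) AS rn
    FROM table_views
)
SELECT id, value, aquisition_date
FROM tn
WHERE rn = 1
ORDER BY id, aquisition_date
"""
last_per_month = conn.execute(query).fetchall()
for row in last_per_month:
    print(row)
```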

Complex Ranking in SQL (Teradata)

I have a peculiar problem at hand. I need to rank in the following manner:
Each ID gets a new rank.
Rank 1 is assigned to the ID with the lowest date. Subsequent dates for that ID may be later than dates of other IDs, but they still receive the next ranks before any other ID is ranked.
(E.g. the ADF32 series is ranked first because it has the lowest date. Although it ends with dates on 09-Nov and RT659 starts on 13-Aug, RT659 is ranked after it.)
For a particular ID, if the dates are consecutive, the rank stays the same; otherwise it increases by 1.
For a particular ID, ranks are assigned in ascending date order.
How to formulate a query?

You need two steps:
select
id_col
,dt_col
,dense_rank()
over (order by min_dt, id_col, dt_col - rnk) as part_col
from
(
select
id_col
,dt_col
,min(dt_col)
over (partition by id_col) as min_dt
,rank()
over (partition by id_col
order by dt_col) as rnk
from tab
) as dt
dt_col - rnk calculates the same result for consecutive dates, so they get the same rank.
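The date-minus-rank trick can be checked on a few rows. The following is a sketch under stated assumptions: the sample dates are invented (only the IDs ADF32 and RT659 come from the question), SQLite via Python's sqlite3 stands in for Teradata, and julianday(dt_col) - rnk replaces Teradata's direct date arithmetic.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tab (id_col TEXT, dt_col TEXT)")
# Hypothetical dates: ADF32 has a three-day run and a late straggler.
conn.executemany(
    "INSERT INTO tab VALUES (?, ?)",
    [
        ("ADF32", "2020-08-01"),
        ("ADF32", "2020-08-02"),
        ("ADF32", "2020-08-03"),
        ("ADF32", "2020-11-09"),
        ("RT659", "2020-08-13"),
        ("RT659", "2020-08-15"),
    ],
)

# Consecutive dates share the same (date - rank) value, so dense_rank()
# gives them the same number; min_dt orders the IDs by their first date.
query = """
SELECT id_col, dt_col,
       dense_rank() OVER (
           ORDER BY min_dt, id_col, julianday(dt_col) - rnk
       ) AS part_col
FROM (
    SELECT id_col, dt_col,
           min(dt_col) OVER (PARTITION BY id_col) AS min_dt,
           rank() OVER (PARTITION BY id_col ORDER BY dt_col) AS rnk
    FROM tab
) AS dt
ORDER BY part_col, dt_col
"""
ranked = conn.execute(query).fetchall()
for row in ranked:
    print(row)
```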

Try datediff on lead/lag and then perform partitioned ranking
select t.ID_COL,t.dt_col,
rank() over(partition by t.ID_COL, t.date_diff order by t.dt_col desc) as rankk
from ( SELECT ID_COL,dt_col,
DATEDIFF(day, Lag(dt_col, 1) OVER(ORDER BY dt_col),dt_col) as date_diff FROM table1 ) t

One way to think about this problem is "when to add 1 to the rank". Well, that occurs when the previous value on a row with the same id_col differs by more than one day. Or when the row is the earliest day for an id.
This turns the problem into a cumulative sum:
select t.*,
sum(case when prev_dt_col = dt_col - 1 then 0 else 1
end) over
(order by min_dt_col, id_col, dt_col) as ranking
from (select t.*,
lag(dt_col) over (partition by id_col order by dt_col) as prev_dt_col,
min(dt_col) over (partition by id_col) as min_dt_col
from t
) t;
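The cumulative-sum formulation can be tried out the same way. This is a rough sketch, assuming invented sample dates, SQLite (through Python's sqlite3) in place of Teradata, and date(prev_dt_col, '+1 day') = dt_col as the SQLite spelling of prev_dt_col = dt_col - 1.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id_col TEXT, dt_col TEXT)")
# Hypothetical dates; only the two IDs are taken from the question.
conn.executemany(
    "INSERT INTO t VALUES (?, ?)",
    [
        ("ADF32", "2020-08-01"),
        ("ADF32", "2020-08-02"),
        ("ADF32", "2020-11-09"),
        ("RT659", "2020-08-13"),
        ("RT659", "2020-08-14"),
        ("RT659", "2020-08-16"),
    ],
)

# Add 1 whenever the previous date of the same id is not exactly one day
# earlier (this includes the first row of each id, where lag() is NULL).
query = """
SELECT id_col, dt_col,
       sum(CASE WHEN date(prev_dt_col, '+1 day') = dt_col THEN 0 ELSE 1 END)
           OVER (ORDER BY min_dt_col, id_col, dt_col) AS ranking
FROM (
    SELECT id_col, dt_col,
           lag(dt_col) OVER (PARTITION BY id_col ORDER BY dt_col) AS prev_dt_col,
           min(dt_col) OVER (PARTITION BY id_col) AS min_dt_col
    FROM t
) AS s
ORDER BY ranking, dt_col
"""
cumulative = conn.execute(query).fetchall()
for row in cumulative:
    print(row)
```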

Sum having a condition

I've a table that has this information:
And need to get the following information:
If the country of the same person name (in this case Artur) is different, then I need to sum the two values of quantity from the max date (in this case 04/10) and return both person (Artur) and the qty (15k)
If the country of the same person name (in this case Joseph) is the same, then I need only the first row of the max date available.
I'm really struggling, as I'm not sure how to implement the logic in my code:
Select
table.person,
table.quantity
From
(
Select
table.date,
table.person,
table.country,
table.quantity,
ROW_NUMBER () over (
PARTITION by table.code, table.person
ORDER by table.date DESC
) AS rn
FROM
table
WHERE table.date >= DATE '{2020-04-10}' -5
) a
WHERE a.RN IN (1,2)
Is it possible to create a rule to sum rows 1 and 2 when country is different (Artur case) and only return row number 1 when the country is the same for a name (Joseph case)?

Use dense_rank() or max() as a window function:
select person, sum(quantity)
from (select t.*,
max(date) over (partition by person) as max_date
from t
) t
where date = max_date
group by person;
EDIT:
Hmmm . . . I think you might want one row per country per person on the max date. If so:
select person, sum(quantity)
from (select t.*,
row_number() over (partition by person, country order by date desc) as seqnum_pc,
rank() over (partition by person order by date desc) as seqnum_p
from t
) t
where seqnum_p = 1 and seqnum_pc = 1
group by person;
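Since the question's table is not shown, the sketch below uses made-up rows to illustrate the second query: Artur has two different countries on the max date, Joseph has the same country twice. The countries, quantities, and the use of SQLite through Python's sqlite3 are all assumptions. Note that when two rows tie on person, country and date, row_number() picks one of them arbitrarily, so the duplicate Joseph rows here deliberately carry the same quantity.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (date TEXT, person TEXT, country TEXT, quantity INTEGER)")
# Hypothetical rows, chosen so Artur sums to 15000 as in the question.
conn.executemany(
    "INSERT INTO t VALUES (?, ?, ?, ?)",
    [
        ("2020-04-10", "Artur", "BR", 10000),
        ("2020-04-10", "Artur", "PT", 5000),
        ("2020-04-09", "Artur", "BR", 3000),
        ("2020-04-10", "Joseph", "US", 7000),
        ("2020-04-10", "Joseph", "US", 7000),
    ],
)

# Keep one row per (person, country), and only rows on the person's max date.
query = """
SELECT person, sum(quantity) AS total
FROM (
    SELECT t.*,
           row_number() OVER (PARTITION BY person, country ORDER BY date DESC) AS seqnum_pc,
           rank() OVER (PARTITION BY person ORDER BY date DESC) AS seqnum_p
    FROM t
) AS s
WHERE seqnum_p = 1 AND seqnum_pc = 1
GROUP BY person
ORDER BY person
"""
totals = conn.execute(query).fetchall()
for row in totals:
    print(row)
```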

Selecting type(s) of account with 2nd maximum number of accounts

Suppose we have an accounts table along with the already given values
I want to find the type of account with the second highest number of accounts. In this case, the result should be 'FD'. If there is a tie for the second highest count, I need all those types in the result.
I'm not getting any idea of how to do it. I've found numerous posts for finding second highest values, say salary, in a table. But not for second highest COUNT.

This can be done using CTEs. Get the counts for each type as the first step. Then use dense_rank (so that types with equal counts share a rank) to rank the types by count. Finally, select the rows ranked second.
with counts as (
select type, count(*) cnt
from yourtable
group by type)
, ranks as (
select type, dense_rank() over(order by cnt desc) rnk
from counts)
select type
from ranks
where rnk = 2;
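A quick way to convince yourself that dense_rank() returns every type tied for second place is to run the CTE on a toy table. The rows below are hypothetical (the question's data is not shown; only 'FD' is mentioned), and SQLite through Python's sqlite3 is used as a convenient stand-in for the asker's database.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE accounts (acc_no INTEGER, type TEXT)")
# Hypothetical data: SB has 3 accounts, FD and RD tie with 2, CA has 1.
conn.executemany(
    "INSERT INTO accounts VALUES (?, ?)",
    [
        (1, "SB"), (2, "SB"), (3, "SB"),
        (4, "FD"), (5, "FD"),
        (6, "RD"), (7, "RD"),
        (8, "CA"),
    ],
)

# Count per type, rank the counts densely, keep everything ranked second.
query = """
WITH counts AS (
    SELECT type, count(*) AS cnt
    FROM accounts
    GROUP BY type
),
ranks AS (
    SELECT type, dense_rank() OVER (ORDER BY cnt DESC) AS rnk
    FROM counts
)
SELECT type
FROM ranks
WHERE rnk = 2
ORDER BY type
"""
second_types = [row[0] for row in conn.execute(query)]
print(second_types)
```

With row_number() instead of dense_rank(), only one of the two tied types would be returned.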

One option is to use row_number() (or dense_rank(), depending on what "second" means when there are ties):
select a.*
from (select a.type, count(*) as cnt,
row_number() over (order by count(*) desc) as seqnum
from accounta a
group by a.type
) a
where seqnum = 2;
In Oracle 12c+, you can use offset/fetch:
select a.type, count(*) as cnt
from accounta a
group by a.type
order by count(*) desc
offset 1 row
fetch first 1 row only;

Select only 3 best ranked after rank() over

I'd like to select the 3 best results of a rank() function for each partition
For instance, in this query :
SELECT id, rank() over (PARTITION BY year order by ...) as rank
FROM table1
GROUP BY year
I'd like to have 3 best ranked for every year.
I can manage that by wrapping it in an outer query:
Select *
from ...
where rank <= 3
but then, if there are ties, I'll get more than 3 rows per year.
Does anyone have an idea how to solve that?

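We don't have much information about your table and query structure, but as a generic solution I'd suggest adding row_number() over (PARTITION BY year ORDER BY ... desc) as rn and filtering on it with where rn <= 3. Unlike rank(), row_number() never repeats a number, so ties cannot push the result past 3 rows per year.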
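As a rough illustration of capping the result at 3 rows per year even when there are ties, the sketch below uses row_number() with an extra tiebreaker. The score column, the sample rows, and the choice of id as tiebreaker are all hypothetical, since the question elides its ORDER BY expression. SQLite through Python's sqlite3 is used only to keep the example self-contained.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table1 (id INTEGER, year INTEGER, score INTEGER)")
# Hypothetical data: 2020 has a three-way tie at 80, so rank() <= 3
# would return four rows for that year.
conn.executemany(
    "INSERT INTO table1 VALUES (?, ?, ?)",
    [
        (1, 2020, 90), (2, 2020, 80), (3, 2020, 80), (4, 2020, 80), (5, 2020, 70),
        (6, 2021, 60), (7, 2021, 95),
    ],
)

# row_number() numbers rows 1, 2, 3, ... with no repeats; id breaks ties.
query = """
SELECT id, year, score
FROM (
    SELECT id, year, score,
           row_number() OVER (PARTITION BY year ORDER BY score DESC, id) AS rn
    FROM table1
) AS ranked
WHERE rn <= 3
ORDER BY year, rn
"""
top3 = conn.execute(query).fetchall()
for row in top3:
    print(row)
```

Which of the tied rows survives depends entirely on the tiebreaker, so it is worth choosing one that is meaningful for your data.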
