Find the top 2 records for each key in a table [duplicate] - sql

I have a list of players' scores in games, and I need to get the top two finishers for each game. LIMIT 2 works on the result set as a whole, but I need to limit it to 2 (or 1 if there is only one) per game.
Table being queried:
game_id  player_id  score
1        10         100
1        20         300
1        30         200
2        40         100
2        50         200
Desired results:
game_id  player_id  score
1        20         300
1        30         200
2        50         200
2        40         100

Using RANK() we can try:
WITH cte AS (
    SELECT *, RANK() OVER (PARTITION BY game_id ORDER BY score DESC) rnk
    FROM yourTable
)
SELECT game_id, player_id, score
FROM cte
WHERE rnk <= 2
ORDER BY game_id, score DESC;
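Note that if ties are possible, the choice of function matters. RANK leaves a gap after tied rows (a two-way tie for first is followed by rank 3), so it can return more than two rows per game. DENSE_RANK leaves no gaps and keeps every row holding one of the top two distinct scores. ROW_NUMBER breaks ties arbitrarily and returns at most two rows per game. If ties are not a concern, all three give the same result.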
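To see how the choice of window function plays out, here is a minimal sketch that runs the query above against the question's sample data. It assumes an in-memory SQLite database (3.25 or newer, for window function support) as a stand-in for the real one; the helper name top_two and the extra tied row are hypothetical, added only for illustration.

```python
import sqlite3

# In-memory SQLite database standing in for the real one (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE yourTable (game_id INTEGER, player_id INTEGER, score INTEGER)"
)
conn.executemany(
    "INSERT INTO yourTable VALUES (?, ?, ?)",
    [(1, 10, 100), (1, 20, 300), (1, 30, 200), (2, 40, 100), (2, 50, 200)],
)


def top_two(rank_fn):
    """Return the top two rows per game using the named window function."""
    # rank_fn is only ever one of the fixed strings passed below.
    query = f"""
        WITH cte AS (
            SELECT *, {rank_fn}() OVER (PARTITION BY game_id ORDER BY score DESC) AS rnk
            FROM yourTable
        )
        SELECT game_id, player_id, score
        FROM cte
        WHERE rnk <= 2
        ORDER BY game_id, score DESC
    """
    return conn.execute(query).fetchall()


# Without ties, RANK and ROW_NUMBER agree.
rank_rows = top_two("RANK")
row_number_rows = top_two("ROW_NUMBER")

# Hypothetical extra row: a third player in game 2 tied for second place.
conn.execute("INSERT INTO yourTable VALUES (2, 60, 100)")
rank_rows_tied = top_two("RANK")              # keeps both tied players
row_number_rows_tied = top_two("ROW_NUMBER")  # keeps exactly two rows per game

print(rank_rows)
print(len(rank_rows_tied), len(row_number_rows_tied))
```

With the tied row added, RANK returns three rows for game 2 while ROW_NUMBER still returns two, which is the trade-off to decide on before picking one.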

Related

Limit SQL query results by distinct column value

I have a table with columns id, score, parent_id, ordered by score.
id        score  parent_id
5859      10     5859
2157043   9      5859
21064154  8      21064154
51992     7      51992
34384599  6      51992
1675761   5      1675761
3465729   4      3465729
401202    3      401203
1817458   2      1817458
I want to query all columns from this table in the same order, but limit the results to the rows that belong to the first 5 distinct parent_id values (so at least 5 rows are returned). As a result, parent_id would contain only 5 ids: 5859, 21064154, 51992, 1675761, 3465729.
Expected results:
id        score  parent_id
5859      10     5859
2157043   9      5859
21064154  8      21064154
51992     7      51992
34384599  6      51992
1675761   5      1675761
3465729   4      3465729
Another way to accomplish this is to use LAG to flag each row where parent_id changes, take a cumulative sum of those flags over a window, and then filter on that running count:
select id, score, parent_id
from (
    select *, Sum(diff) over(order by score desc) seq
    from (
        select *,
            case when Lag(parent_id) over(order by score desc) = parent_id then 0 else 1 end diff
        from t
    ) t
) d
where seq <= 5
order by score desc;
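As a rough check of the LAG plus running-sum idea, the sketch below loads the question's rows into an in-memory SQLite database (an assumption; SQLite 3.25 or newer is needed for window functions) and runs an equivalent query. The subquery aliases marked and numbered are illustrative names, not part of the original answer.

```python
import sqlite3

# In-memory SQLite database standing in for the real one (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER, score INTEGER, parent_id INTEGER)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?, ?)",
    [
        (5859, 10, 5859),
        (2157043, 9, 5859),
        (21064154, 8, 21064154),
        (51992, 7, 51992),
        (34384599, 6, 51992),
        (1675761, 5, 1675761),
        (3465729, 4, 3465729),
        (401202, 3, 401203),
        (1817458, 2, 1817458),
    ],
)

query = """
select id, score, parent_id
from (
    -- running count of parent_id changes, in score order
    select *, sum(diff) over (order by score desc) as seq
    from (
        -- diff is 1 on the first row of each run of equal parent_id values
        select *,
               case when lag(parent_id) over (order by score desc) = parent_id
                    then 0 else 1 end as diff
        from t
    ) marked
) numbered
where seq <= ?
order by score desc
"""

rows = conn.execute(query, (5,)).fetchall()
parent_ids = sorted({parent_id for _, _, parent_id in rows})
print(rows)
print(parent_ids)
```

One caveat worth keeping in mind: the running count goes up every time parent_id changes between neighbouring rows, so this only counts distinct parents correctly when rows sharing a parent_id are adjacent in score order, as they are in the sample data.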
I understand that you want to retain rows that belong to the top 5 scoring parent_ids.
Although a sophisticated approach based on window functions might be possible here, here is one way to do it simply using a subquery:
select *
from mytable t
where parent_id in (
    select top 5 parent_id
    from mytable
    group by parent_id
    order by max(score) desc
)
order by score desc
If your data may contain tied scores, consider adding the WITH TIES option to the subquery (select top 5 with ties parent_id ...) so that the result is deterministic.
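The subquery approach can be tried out the same way. The sketch below is an adaptation, not the answer's exact query: TOP 5 is SQL Server syntax, so it is swapped for LIMIT 5 to run on an in-memory SQLite database, which is assumed here purely as a test bed.

```python
import sqlite3

# In-memory SQLite database standing in for the real one (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (id INTEGER, score INTEGER, parent_id INTEGER)")
conn.executemany(
    "INSERT INTO mytable VALUES (?, ?, ?)",
    [
        (5859, 10, 5859),
        (2157043, 9, 5859),
        (21064154, 8, 21064154),
        (51992, 7, 51992),
        (34384599, 6, 51992),
        (1675761, 5, 1675761),
        (3465729, 4, 3465729),
        (401202, 3, 401203),
        (1817458, 2, 1817458),
    ],
)

# LIMIT 5 replaces SQL Server's TOP 5 (adaptation for SQLite).
query = """
select *
from mytable t
where parent_id in (
    select parent_id
    from mytable
    group by parent_id
    order by max(score) desc
    limit 5
)
order by score desc
"""

rows = conn.execute(query).fetchall()
kept_parents = {parent_id for _, _, parent_id in rows}
print(rows)
```

Unlike the LAG-based version, this one ranks parents by their best score, so it does not depend on rows of the same parent being adjacent.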

Select column's occurrence order without group by

I currently have two tables, users and coupons:
users
id  first_name
1   Roberta
2   Oliver
3   Shayna
4   Fechin
coupons
id  discount  user_id
1   20%       1
2   40%       2
3   15%       3
4   30%       1
5   10%       1
6   70%       4
What I want to do is select from the coupons table until I've selected X users.
So if I chose X = 2, the resulting table would be:
id  discount  user_id
1   20%       1
2   40%       2
4   30%       1
5   10%       1
I've tried using both dense_rank and row_number, but they return the count of occurrences of each user_id, not its order.
SELECT id,
discount,
user_id,
dense_rank() OVER (PARTITION BY user_id)
FROM coupons
I'm guessing I need to do it in multiple subqueries (which is fine) where the first subquery would return something like
id  discount  user_id  order_of_occurence
1   20%       1        1
2   40%       2        2
3   15%       3        3
4   30%       1        1
5   10%       1        1
6   70%       4        4
which I can then use to filter by what I need.
PS: I'm using postgresql.
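You've stated that you want to parameterize the query so that you can retrieve X users. I'm reading that as all coupons for the first X distinct user_ids, in order of each user's first coupon id.
It appears your attempt was close. dense_rank() is the right idea. Since you want to look over the entire table, you can't use partition by in the ranking, and a sorting column is required to determine the ranking. Ordering by id alone would give every row its own rank, because id is unique, so first compute each user's earliest coupon id and rank by that:
with first_seen as (
    select *,
        min(id) over (partition by user_id) as first_id
    from coupons
),
data as (
    select *,
        dense_rank() over (order by first_id) as dr
    from first_seen
)
select id, discount, user_id from data where dr <= <X>;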
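A small sketch of one way to rank users by order of first appearance, checked against the question's coupons data. It assumes an in-memory SQLite database (3.25 or newer) rather than PostgreSQL, and the names first_seen, first_id, ranked and coupons_for_first_users are hypothetical. The idea is to compute each user's earliest coupon id first, then apply dense_rank() over that value, so that all coupons of one user share a rank.

```python
import sqlite3

# In-memory SQLite database standing in for PostgreSQL (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE coupons (id INTEGER PRIMARY KEY, discount TEXT, user_id INTEGER)"
)
conn.executemany(
    "INSERT INTO coupons VALUES (?, ?, ?)",
    [
        (1, "20%", 1),
        (2, "40%", 2),
        (3, "15%", 3),
        (4, "30%", 1),
        (5, "10%", 1),
        (6, "70%", 4),
    ],
)


def coupons_for_first_users(x):
    """Return all coupons belonging to the first x users, by first coupon id."""
    query = """
        with first_seen as (
            -- earliest coupon id per user, repeated on each of that user's rows
            select *, min(id) over (partition by user_id) as first_id
            from coupons
        ),
        ranked as (
            -- users are numbered 1, 2, 3, ... in order of first appearance
            select *, dense_rank() over (order by first_id) as dr
            from first_seen
        )
        select id, discount, user_id
        from ranked
        where dr <= ?
        order by id
    """
    return conn.execute(query, (x,)).fetchall()


result = coupons_for_first_users(2)
print(result)
```

Passing X as a bound parameter keeps the query reusable for any number of users.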

How to get rank of a user from all users

I have a table called summary_coins. I am trying to get a user's rank based on their total coins.
I have tried the query below:
SELECT
user_id,
sum(get_count),
rank() over (order by sum(get_count) asc) as rank
FROM summary_coins
WHERE user_id = 2
GROUP BY user_id
Sample data: without user_id = 2 in the WHERE clause, I get the list below:
user_id sum rank
44 2 1
13 4 2
57 4 2
47 4 2
11 5 5
2 5 5
My desired output:
2 5 5
Here I always get rank 1 for user ID 2, but based on the list of all users it should be rank 5.
You want to apply WHERE user_id = 2 late. Window functions such as RANK() OVER are evaluated after WHERE and GROUP BY, so your query ranks only the single row that survives the filter, which is why you always get 1. Rank all users first in a subquery, then filter in the outer query:
SELECT user_id, sum_count, rank
FROM
(
SELECT
user_id,
sum(get_count) AS sum_count,
rank() over (order by sum(get_count) asc) as rank
FROM summary_coins
GROUP BY user_id
) all_users
WHERE user_id = 2;
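The effect of filtering before versus after ranking can be reproduced with a short sketch. It assumes an in-memory SQLite database (3.25 or newer), and the individual get_count rows are hypothetical, chosen only so that the per-user sums match the list in the question. For simplicity the sums are computed in an inner subquery and ranked one level up; that restructuring is a choice made for this sketch, not part of the answer above.

```python
import sqlite3

# In-memory SQLite database standing in for the real one (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE summary_coins (user_id INTEGER, get_count INTEGER)")
# Hypothetical rows whose per-user sums are 44:2, 13:4, 57:4, 47:4, 11:5, 2:5.
conn.executemany(
    "INSERT INTO summary_coins VALUES (?, ?)",
    [(44, 2), (13, 4), (57, 1), (57, 3), (47, 4), (11, 5), (2, 2), (2, 3)],
)

# Filtering first: only user 2 is left to be ranked, so the rank is always 1.
early_filter = """
    SELECT user_id, sum_count, RANK() OVER (ORDER BY sum_count ASC) AS rnk
    FROM (
        SELECT user_id, SUM(get_count) AS sum_count
        FROM summary_coins
        WHERE user_id = 2
        GROUP BY user_id
    ) totals
"""

# Ranking first, filtering afterwards: user 2 is ranked against everyone.
late_filter = """
    SELECT user_id, sum_count, rnk
    FROM (
        SELECT user_id, sum_count, RANK() OVER (ORDER BY sum_count ASC) AS rnk
        FROM (
            SELECT user_id, SUM(get_count) AS sum_count
            FROM summary_coins
            GROUP BY user_id
        ) totals
    ) all_users
    WHERE user_id = 2
"""

early_row = conn.execute(early_filter).fetchone()
late_row = conn.execute(late_filter).fetchone()
print(early_row, late_row)
```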

Get specified row ranking number

Here is what the rows look like:
Id Gold
1 200
2 100
3 300
4 900
5 800
6 1000
What I want to achieve is getting the rank of the row whose Id equals 5, when ordered by Gold descending.
So after ordering, the intermediate rows should be (not returned):
Id Gold
6 1000
4 900
5 800
And the SQL should just return 3, which is the ranking of Id = 5 row.
What is the most efficient way to achieve this?
You simply want top, I think:
select top 3 t.*
from t
order by gold desc;
If you want the ranking of id = 5:
select count(*)
from t
where t.gold >= (select t2.gold from t t2 where t2.id = 5);
Try this code, using DENSE_RANK():
WITH cte AS (
    SELECT *,
        DENSE_RANK() OVER (ORDER BY [Gold] DESC) AS rank
    FROM your_table
)
SELECT rank
FROM cte
WHERE id = 5
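Both suggestions, the COUNT(*) comparison and DENSE_RANK(), can be compared on the question's data with a sketch like the following. It assumes an in-memory SQLite database (3.25 or newer) and uses the table name t for both queries; the alias rnk is used instead of rank purely as a precaution in this sketch.

```python
import sqlite3

# In-memory SQLite database standing in for the real one (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER PRIMARY KEY, gold INTEGER)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?)",
    [(1, 200), (2, 100), (3, 300), (4, 900), (5, 800), (6, 1000)],
)

# Approach 1: count the rows with at least as much gold as the target row.
count_query = """
    select count(*)
    from t
    where t.gold >= (select t2.gold from t t2 where t2.id = ?)
"""

# Approach 2: rank every row with DENSE_RANK, then pick out the target row.
dense_rank_query = """
    with cte as (
        select *, dense_rank() over (order by gold desc) as rnk
        from t
    )
    select rnk from cte where id = ?
"""

rank_by_count = conn.execute(count_query, (5,)).fetchone()[0]
rank_by_dense_rank = conn.execute(dense_rank_query, (5,)).fetchone()[0]
print(rank_by_count, rank_by_dense_rank)
```

The two agree here because all Gold values are distinct. With ties they can differ: the count includes every row tied with the target, while DENSE_RANK counts distinct Gold values only.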

SQL - Overall average Points

I have a table like this:
[challenge_log]
User_id | challenge | Try | Points
==============================================
1 1 1 5
1 1 2 8
1 1 3 10
1 2 1 5
1 2 2 8
2 1 1 5
2 2 1 8
2 2 2 10
I want the overall average points. To do so, I believe I need 3 steps:
Step 1 - Get the MAX value (of points) of each user in each challenge:
User_id | challenge | Points
===================================
1 1 10
1 2 8
2 1 5
2 2 10
Step 2 - SUM all the MAX values of one user
User_id | Points
===================
1 18
2 15
Step 3 - The average
AVG = SUM (Points from step 2) / number of users = 16.5
Can you help me find a query for this?
You can get the overall average by dividing the total number of points by the number of distinct users. However, you need the maximum per challenge, so the sum is a bit more complicated. One way is with a subquery:
-- * 1.0 avoids integer division on databases that truncate int / int
select sum(Points) * 1.0 / count(distinct user_id)
from (select user_id, challenge, max(Points) as Points
      from challenge_log
      group by user_id, challenge
     ) cl;
You can also do this with one level of aggregation, by finding the maximum in the where clause:
-- * 1.0 avoids integer division on databases that truncate int / int
select sum(Points) * 1.0 / count(distinct user_id)
from challenge_log cl
where not exists (select 1
                  from challenge_log cl2
                  where cl2.user_id = cl.user_id and
                        cl2.challenge = cl.challenge and
                        cl2.points > cl.points
                 );
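As a sanity check, both formulations can be run against the question's challenge_log rows. This is a minimal sketch assuming an in-memory SQLite database, column names user_id, challenge, try and points taken from the question's table, and a multiplication by 1.0 to force non-integer division.

```python
import sqlite3

# In-memory SQLite database standing in for the real one (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE challenge_log "
    "(user_id INTEGER, challenge INTEGER, try INTEGER, points INTEGER)"
)
conn.executemany(
    "INSERT INTO challenge_log VALUES (?, ?, ?, ?)",
    [
        (1, 1, 1, 5), (1, 1, 2, 8), (1, 1, 3, 10),
        (1, 2, 1, 5), (1, 2, 2, 8),
        (2, 1, 1, 5),
        (2, 2, 1, 8), (2, 2, 2, 10),
    ],
)

# Best score per user and challenge in a subquery, then sum / distinct users.
subquery_version = """
    select sum(points) * 1.0 / count(distinct user_id)
    from (select user_id, challenge, max(points) as points
          from challenge_log
          group by user_id, challenge) cl
"""

# Same result with NOT EXISTS: keep only rows that no better try beats.
not_exists_version = """
    select sum(points) * 1.0 / count(distinct user_id)
    from challenge_log cl
    where not exists (select 1
                      from challenge_log cl2
                      where cl2.user_id = cl.user_id
                        and cl2.challenge = cl.challenge
                        and cl2.points > cl.points)
"""

avg_subquery = conn.execute(subquery_version).fetchone()[0]
avg_not_exists = conn.execute(not_exists_version).fetchone()[0]
print(avg_subquery, avg_not_exists)
```

Note that the NOT EXISTS form would count a best score twice if a user reached the same maximum in two tries of one challenge; the sample data has no such case.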
Try these on for size.
Overall Mean
select avg( Points ) as mean_score
from challenge_log
Per-Challenge Mean
select challenge ,
avg( Points ) as mean_score
from challenge_log
group by challenge
If you want to compute the mean of each user's highest score per challenge, that adds very little complexity:
Overall Mean
select avg( high_score )
from ( select user_id ,
              challenge ,
              max( Points ) as high_score
       from challenge_log
       group by user_id , challenge
     ) t
Per-Challenge Mean
select challenge ,
       avg( high_score )
from ( select user_id ,
              challenge ,
              max( Points ) as high_score
       from challenge_log
       group by user_id , challenge
     ) t
group by challenge
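The two high-score means can be checked the same way. The sketch below assumes an in-memory SQLite database and includes a GROUP BY user_id, challenge in the inner query, which max() needs in order to produce one high score per user and challenge. Note that this computes the mean of the high scores, which is a different figure from the 16.5 the question asks for.

```python
import sqlite3

# In-memory SQLite database standing in for the real one (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE challenge_log "
    "(user_id INTEGER, challenge INTEGER, try INTEGER, points INTEGER)"
)
conn.executemany(
    "INSERT INTO challenge_log VALUES (?, ?, ?, ?)",
    [
        (1, 1, 1, 5), (1, 1, 2, 8), (1, 1, 3, 10),
        (1, 2, 1, 5), (1, 2, 2, 8),
        (2, 1, 1, 5),
        (2, 2, 1, 8), (2, 2, 2, 10),
    ],
)

# One row per (user, challenge) holding that user's best score.
high_scores = """
    select user_id, challenge, max(points) as high_score
    from challenge_log
    group by user_id, challenge
"""

overall_mean = conn.execute(
    f"select avg(high_score) from ({high_scores}) t"
).fetchone()[0]

per_challenge_mean = conn.execute(
    f"""
    select challenge, avg(high_score)
    from ({high_scores}) t
    group by challenge
    order by challenge
    """
).fetchall()

print(overall_mean)
print(per_challenge_mean)
```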
After step 1 do
SELECT USER_ID, AVG(POINTS)
FROM STEP1
GROUP BY USER_ID
You can combine step 1 and 2 into a single query/subquery as follows:
Select BestShot.[User_ID], AVG(cast (BestShot.MostPoints as money))
from (select tLog.Challenge, tLog.[User_ID], MostPoints = max(tLog.points)
from dbo.tmp_Challenge_Log tLog
Group by tLog.User_ID, tLog.Challenge
) BestShot
Group by BestShot.User_ID
The subquery determines the most points for each user/challenge combo, and the outer query takes these max values and uses the AVG function to return the average value of them. The last Group By tells SQL to average all the values across each User_ID.