Getting unique record based on max conditions including null values - sql

I need to get back one record of a player's (Rank A players only) most recent win date (some win dates are null but we need to include them) but picking only the last place of their most recent game session. So basically in that order: get their max win_date (if null, still include them) > from there grab their max place > and from there, pick only their max game_session_id.
Table players:
Badge_No Name Game_Session_ID Place Win_Date Rank
565 Barry 012550 4 6/17/2021 A
565 Barry 003521 2 3/04/2021 A
565 Barry 003521 3 3/04/2021 A
565 Barry 003521 4 3/04/2021 A
565 Barry 003521 5 3/04/2021 A
565 Barry 095945 1 6/17/2021 A
101 Lee 065411 1 A
018 Jess 001561 1 5/23/2020 A
018 Jess 002075 1 5/23/2020 A
209 Linda 026541 2 5/06/2021 A
728 Perry 000940 1 1/23/2021 B
Expected Output:
Badge_No Name Game_Session_ID Place Win_Date Rank
565 Barry 012550 4 6/17/2021 A
101 Lee 065411 1 A
018 Jess 002075 1 5/23/2020 A
209 Linda 026541 2 5/06/2021 A
My (wrong) code:
select distinct badge_no,
name, max(game_session_id) game_session_id,
max(place) place, max(win_date) win_date, rank
from players p
where not exists
(select 'x' from players p2
where p2.badge_no = p.badge_no and p.rank = 'B')
group by badge_no, name, rank

Use ROW_NUMBER with an approriate partition:
WITH cte AS (
SELECT p.*, ROW_NUMBER() OVER (PARTITION BY Badge_No
ORDER BY Win_Date DESC, Place DESC, Game_Session_ID DESC) rn
FROM players p
WHERE "Rank" = 'A'
)
SELECT Badge_No, Name, Game_Session_ID, Place, Win_Date, "Rank"
FROM cte
WHERE rn = 1;

select badge_no, name,
max(game_session_id) keep (dense_rank last
order by win_date nulls first, place) as game_session_id,
max(place) keep (dense_rank last order by win_date nulls first) as place,
max(win_date) as win_date, rank
from players
where rank = 'A'
group by badge_no, name, rank
;
If you are not familiar with the first / last aggregate function (don't worry, you would not be alone!), you may want to take a quick look at the documentation to see what it does.

Related

Display DISTINCT value on SQL statement by column condition

i'm introducing you the problem with DISTINCT values by column condition i have dealt with and can't provide
any idea how i can solve it.
So. The problem is i have two Stephen here declared , but i don't want duplicates:
**
The problem:
**
id vehicle_id worker_id user_type user_fullname
9 1 NULL external_users John Dalton
10 1 16 employees Mike
11 1 1 employees Stephen
12 2 173 employee Nicholas
13 2 1 employee Stephen
14 1 NULL external_users Peter
**
The desired output:**
id vehicle_id worker_id user_type user_fullname
9 1 NULL external_users John Dalton
10 1 16 employees Mike
12 2 173 employee Nicholas
13 2 1 employee Stephen
14 1 NULL external_users Peter
I have tried CASE statements but without success. When i group by it by worker_id,
it removes another duplicates, so i figured out it needs to be grouped by some special condition?
If anyone can provide me some hint how i can solve this problem , i will be very grateful.
Thank's!
There are no duplicate rows in this table. Just because Stephen appears twice doesn't make them duplicates because the ID, VEHICLE_ID, and USER_TYPE are different.
What you need to do is decide how you want to identify the Stephen record you wish to see in the output. Is it the one with the highest VEHICLE_ID? The "latest" record, i.e. the one with the highest ID?
You will use that rule in a window function to order the rows within your criteria, and then use that row number to filter down to the results you want. Something like this:
select id, vehicle_id, worker_id, user_type, user_fullname
from (
select id, vehicle_id, worker_id, user_type, user_fullname,
row_number() over (partition by worker_id, user_fullname order by id desc) n
from user_vehicle
) t
where t.n = 1

SQL find and group consecutive number in rows without duplicate

So I have a table like this:
Taxi Client Time
Tom A 1
Tom A 2
Tom B 3
Tom A 4
Tom A 5
Tom A 6
Tom B 7
Tom B 8
Bob A 1
Bob A 2
Bob A 3
and the expected result will be like this:
Tom 3
Bob 1
I have used the partition function to count the consecutive value but the result become this:
Tom A 2
Tom A 3
Tom B 2
Bob A 2
Please help, I am not good in English, thanks!
This is a variation of a gaps-and-islands problem. You can solve it using window functions:
select taxi, count(*)
from (select t.taxi, t.client, count(*) as num_times
from (select t.*,
row_number() over (partition by taxi order by time) as seqnum,
row_number() over (partition by taxi, client order by time) as seqnum_c
from t
) t
group by t.taxi, t.client, (seqnum - seqnum_c)
having count(*) >= 2
)
group by taxi;
use distinct count
select taxi ,count( distinct cient)
from table_name
group by taxi
It seems your expected output is wrong
I don't see where you get the number 3 from. If you're trying to do what your question says and group by client in consecutive order only and then get the number of different groups, I can help you out with the following query. Bob has 1 group and Tom has 4.
Partition by taxi, ORDER BY taxi, time and check if this client matches the previous client for this taxi. If yes, do not count this row. If no, count this row, this is a new group.
SELECT FEE.taxi,
SUM(FEE.clientNotSameAsPreviousInSequence)
FROM
(
SELECT taxi,
CASE
WHEN PreviousClient IS NULL THEN
1
WHEN PreviousClient <> client THEN
1
ELSE
0
END AS clientNotSameAsPreviousInSequence
FROM
(
SELECT *,
LAG(client) OVER (PARTITION BY taxi ORDER BY taxi, time) AS PreviousClient
FROM table
) taxisWithPreviousClient
) FEE
GROUP BY FEE.taxi;

Limit column value repeats to top 2

So I have this query:
SELECT
Search.USER_ID,
Search.SEARCH_TERM,
COUNT(*) AS Search.count
FROM Search
GROUP BY 1,2
ORDER BY 3 DESC
Which returns a response that looks like this:
USER_ID SEARCH_TERM count
bob dog 50
bob cat 45
sally cat 38
john mouse 30
sally turtle 10
sally lion 5
john zebra 3
john leopard 1
And my question is: How would I change the query, so that it only returns the top 2 most-searched-for-terms for any given user? So in the example above, the last row for Sally would be dropped, and the last row for John would also be dropped, leaving a total of 6 rows; 2 for each user, like so:
USER_ID SEARCH_TERM count
bob dog 50
bob cat 45
sally cat 38
john mouse 30
sally turtle 10
john zebra 3
In SQL Server, you can put the original query into a CTE, add the ROW_NUMBER() function. Then in the new main query, just add a WHERE clause to limit by the row number. Your query would look something like this:
;WITH OriginalQuery AS
(
SELECT
s.[User_id]
,s.Search_Term
,COUNT(*) AS 'count'
,ROW_NUMBER() OVER (PARTITION BY s.[USER_ID] ORDER BY COUNT(*) DESC) AS rn
FROM Search s
GROUP BY s.[User_id], s.Search_Term
)
SELECT oq.User_id
,oq.Search_Term
,oq.count
FROM OriginalQuery oq
WHERE rn <= 2
ORDER BY oq.count DESC
EDIT: I specified SQL Server as the dbms I used here, but the above should be ANSI-compliant and work in Snowflake.

SQL Server select first instance of ranked data

I have a query that creates a result set like this:
Rank Name
1 Fred
1 John
2 Mary
2 Fred
2 Betty
3 John
4 Betty
4 Frank
I need to then select the lowest rank for each name, e.g.:
Rank Name
1 Fred
1 John
2 Mary
2 Betty
4 Frank
Can this be done within TSQL?
SELECT MIN(Rank) AS Rank, Name
FROM TableName
GROUP BY Name
yes
select name, min(rank)
from nameTable
group by name
As Paul + Kevin have pointed out, simple cases of returning a value from an aggregate can be extracted using MIN / MAX etc (just note that RANK is a reserved word)
In a more general / complicated case, e.g. where you need to find the second / Nth highest rank, you can use PARTITIONs with ROW_NUMBER() to do ranking and then filter by the rank.
SELECT [Rank], [Name]
FROM
(
SELECT [RANK], [Name],
ROW_NUMBER() OVER (PARTITION BY [Name] ORDER BY [Rank]) as [RowRank]
FROM [MyTable]
) AS [MyTableReRanked]
WHERE [RowRank] = #N
ORDER BY [Rank];

selecting top N rows for each group in a table

I am facing a very common issue regarding "Selecting top N rows for each group in a table".
Consider a table with id, name, hair_colour, score columns.
I want a resultset such that, for each hair colour, get me top 3 scorer names.
To solve this i got exactly what i need on Rick Osborne's blogpost "sql-getting-top-n-rows-for-a-grouped-query"
That solution doesn't work as expected when my scores are equal.
In above example the result as follow.
id name hair score ranknum
---------------------------------
12 Kit Blonde 10 1
9 Becca Blonde 9 2
8 Katie Blonde 8 3
3 Sarah Brunette 10 1
4 Deborah Brunette 9 2 - ------- - - > if
1 Kim Brunette 8 3
Consider the row 4 Deborah Brunette 9 2. If this also has same score (10) same as Sarah, then ranknum will be 2,2,3 for "Brunette" type of hair.
What's the solution to this?
If you're using SQL Server 2005 or newer, you can use the ranking functions and a CTE to achieve this:
;WITH HairColors AS
(SELECT id, name, hair, score,
ROW_NUMBER() OVER(PARTITION BY hair ORDER BY score DESC) as 'RowNum'
)
SELECT id, name, hair, score
FROM HairColors
WHERE RowNum <= 3
This CTE will "partition" your data by the value of the hair column, and each partition is then order by score (descending) and gets a row number; the highest score for each partition is 1, then 2 etc.
So if you want to the TOP 3 of each group, select only those rows from the CTE that have a RowNum of 3 or less (1, 2, 3) --> there you go!
The way the algorithm comes up with the rank, is to count the number of rows in the cross-product with a score equal to or greater than the girl in question, in order to generate rank. Hence in the problem case you're talking about, Sarah's grid would look like
a.name | a.score | b.name | b.score
-------+---------+---------+--------
Sarah | 9 | Sarah | 9
Sarah | 9 | Deborah | 9
and similarly for Deborah, which is why both girls get a rank of 2 here.
The problem is that when there's a tie, all girls take the lowest value in the tied range due to this count, when you'd want them to take the highest value instead. I think a simple change can fix this:
Instead of a greater-than-or-equal comparison, use a strict greater-than comparison to count the number of girls who are strictly better. Then, add one to that and you have your rank (which will deal with ties as appropriate). So the inner select would be:
SELECT a.id, COUNT(*) + 1 AS ranknum
FROM girl AS a
INNER JOIN girl AS b ON (a.hair = b.hair) AND (a.score < b.score)
GROUP BY a.id
HAVING COUNT(*) <= 3
Can anyone see any problems with this approach that have escaped my notice?
Use this compound select which handles OP problem properly
SELECT g.* FROM girls as g
WHERE g.score > IFNULL( (SELECT g2.score FROM girls as g2
WHERE g.hair=g2.hair ORDER BY g2.score DESC LIMIT 3,1), 0)
Note that you need to use IFNULL here to handle case when table girls has less rows for some type of hair then we want to see in sql answer (in OP case it is 3 items).