SQL query count rows with the same entry - sql

Given a dataset Roster_table as such:
Group ID  Group Name    Name  Phone
42        Red Dragon    Jon   123455678
32        Green Lizard  Liz   932143211
19        Blue Falcon   Ben   134554678
42        Red Dragon    Reed  432143211
42        Red Dragon    Brad  231314155
19        Blue Falcon   Chad  214124412
How do I get the following query output combining rows with the same Group ID from the dataset, and the new column Count in descending order:
Group ID  Group Name    Count
42        Red Dragon    3
19        Blue Falcon   2
32        Green Lizard  1
SELECT * FROM Roster_table

Try the following, where the alias tot_count is used in the ORDER BY clause.
-- PostgreSQL(v11)
SELECT Group_ID
, MAX(Group_Name) Group_Name
, COUNT(1) tot_count
FROM Roster_table
GROUP BY Group_ID
ORDER BY tot_count DESC;
A working fiddle is at https://dbfiddle.uk/?rdbms=postgres_11&fiddle=b66f9f0d40e804e89be12e3530fe00a0

Based on Rahul Biswas's answer:
Solution without using Max function
SELECT Group_ID, Group_Name, COUNT(*)
FROM Roster_table
GROUP BY Group_ID, Group_Name
ORDER BY COUNT(*) DESC
Credit goes to Eric S.
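To check the GROUP BY version against the sample roster, here is a minimal sketch that runs it in an in-memory SQLite database via Python's sqlite3 module. SQLite and the column types are assumptions made for illustration; the original answers target PostgreSQL.

```python
import sqlite3

# In-memory database holding the sample roster from the question.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Roster_table "
    "(Group_ID INTEGER, Group_Name TEXT, Name TEXT, Phone TEXT)"
)
conn.executemany(
    "INSERT INTO Roster_table VALUES (?, ?, ?, ?)",
    [
        (42, "Red Dragon", "Jon", "123455678"),
        (32, "Green Lizard", "Liz", "932143211"),
        (19, "Blue Falcon", "Ben", "134554678"),
        (42, "Red Dragon", "Reed", "432143211"),
        (42, "Red Dragon", "Brad", "231314155"),
        (19, "Blue Falcon", "Chad", "214124412"),
    ],
)

# One output row per group, largest group first.
rows = conn.execute(
    """
    SELECT Group_ID, Group_Name, COUNT(*) AS tot_count
    FROM Roster_table
    GROUP BY Group_ID, Group_Name
    ORDER BY tot_count DESC
    """
).fetchall()

for row in rows:
    print(row)
```

Grouping by both Group_ID and Group_Name gives the same result as MAX(Group_Name) only as long as each Group_ID maps to a single name, which holds in the sample data.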

Related

Getting unique record based on max conditions including null values

I need to get back one record of a player's (Rank A players only) most recent win date (some win dates are null but we need to include them) but picking only the last place of their most recent game session. So basically in that order: get their max win_date (if null, still include them) > from there grab their max place > and from there, pick only their max game_session_id.
Table players:
Badge_No Name Game_Session_ID Place Win_Date Rank
565 Barry 012550 4 6/17/2021 A
565 Barry 003521 2 3/04/2021 A
565 Barry 003521 3 3/04/2021 A
565 Barry 003521 4 3/04/2021 A
565 Barry 003521 5 3/04/2021 A
565 Barry 095945 1 6/17/2021 A
101 Lee 065411 1 A
018 Jess 001561 1 5/23/2020 A
018 Jess 002075 1 5/23/2020 A
209 Linda 026541 2 5/06/2021 A
728 Perry 000940 1 1/23/2021 B
Expected Output:
Badge_No Name Game_Session_ID Place Win_Date Rank
565 Barry 012550 4 6/17/2021 A
101 Lee 065411 1 A
018 Jess 002075 1 5/23/2020 A
209 Linda 026541 2 5/06/2021 A
My (wrong) code:
select distinct badge_no,
name, max(game_session_id) game_session_id,
max(place) place, max(win_date) win_date, rank
from players p
where not exists
(select 'x' from players p2
where p2.badge_no = p.badge_no and p.rank = 'B')
group by badge_no, name, rank
Use ROW_NUMBER with an appropriate partition:
WITH cte AS (
SELECT p.*, ROW_NUMBER() OVER (PARTITION BY Badge_No
ORDER BY Win_Date DESC, Place DESC, Game_Session_ID DESC) rn
FROM players p
WHERE "Rank" = 'A'
)
SELECT Badge_No, Name, Game_Session_ID, Place, Win_Date, "Rank"
FROM cte
WHERE rn = 1;
select badge_no, name,
max(game_session_id) keep (dense_rank last
order by win_date nulls first, place) as game_session_id,
max(place) keep (dense_rank last order by win_date nulls first) as place,
max(win_date) as win_date, rank
from players
where rank = 'A'
group by badge_no, name, rank
;
If you are not familiar with the first / last aggregate function (don't worry, you would not be alone!), you may want to take a quick look at the documentation to see what it does.
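The ROW_NUMBER approach can be tried on the sample data with the sketch below, which uses SQLite through Python's sqlite3 module (an assumption; the answers above are written for Oracle). Dates are stored as ISO strings so that they sort correctly as text, and window functions need SQLite 3.25 or newer. Note that SQLite sorts NULLs last under DESC while Oracle sorts them first, so a player who has both NULL and non-NULL win dates could get a different row in each system; in the sample data the only NULL belongs to a player with a single row.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    'CREATE TABLE players (Badge_No TEXT, Name TEXT, Game_Session_ID TEXT, '
    'Place INTEGER, Win_Date TEXT, "Rank" TEXT)'
)
conn.executemany(
    "INSERT INTO players VALUES (?, ?, ?, ?, ?, ?)",
    [
        ("565", "Barry", "012550", 4, "2021-06-17", "A"),
        ("565", "Barry", "003521", 2, "2021-03-04", "A"),
        ("565", "Barry", "003521", 3, "2021-03-04", "A"),
        ("565", "Barry", "003521", 4, "2021-03-04", "A"),
        ("565", "Barry", "003521", 5, "2021-03-04", "A"),
        ("565", "Barry", "095945", 1, "2021-06-17", "A"),
        ("101", "Lee", "065411", 1, None, "A"),
        ("018", "Jess", "001561", 1, "2020-05-23", "A"),
        ("018", "Jess", "002075", 1, "2020-05-23", "A"),
        ("209", "Linda", "026541", 2, "2021-05-06", "A"),
        ("728", "Perry", "000940", 1, "2021-01-23", "B"),
    ],
)

# Number each Rank A player's rows: latest win date first, then highest
# place, then highest session id. Row 1 is the one to keep.
rows = conn.execute(
    """
    WITH cte AS (
        SELECT p.*,
               ROW_NUMBER() OVER (
                   PARTITION BY Badge_No
                   ORDER BY Win_Date DESC, Place DESC, Game_Session_ID DESC
               ) AS rn
        FROM players p
        WHERE "Rank" = 'A'
    )
    SELECT Badge_No, Name, Game_Session_ID, Place, Win_Date
    FROM cte
    WHERE rn = 1
    ORDER BY Badge_No
    """
).fetchall()

for row in rows:
    print(row)
```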

Limit column value repeats to top 2

So I have this query:
SELECT
Search.USER_ID,
Search.SEARCH_TERM,
COUNT(*) AS count
FROM Search
GROUP BY 1,2
ORDER BY 3 DESC
Which returns a response that looks like this:
USER_ID SEARCH_TERM count
bob dog 50
bob cat 45
sally cat 38
john mouse 30
sally turtle 10
sally lion 5
john zebra 3
john leopard 1
And my question is: How would I change the query, so that it only returns the top 2 most-searched-for-terms for any given user? So in the example above, the last row for Sally would be dropped, and the last row for John would also be dropped, leaving a total of 6 rows; 2 for each user, like so:
USER_ID SEARCH_TERM count
bob dog 50
bob cat 45
sally cat 38
john mouse 30
sally turtle 10
john zebra 3
In SQL Server, you can put the original query into a CTE and add the ROW_NUMBER() function to it. Then, in the new main query, add a WHERE clause that filters on the row number. Your query would look something like this:
;WITH OriginalQuery AS
(
SELECT
s.[User_id]
,s.Search_Term
,COUNT(*) AS 'count'
,ROW_NUMBER() OVER (PARTITION BY s.[USER_ID] ORDER BY COUNT(*) DESC) AS rn
FROM Search s
GROUP BY s.[User_id], s.Search_Term
)
SELECT oq.User_id
,oq.Search_Term
,oq.count
FROM OriginalQuery oq
WHERE rn <= 2
ORDER BY oq.count DESC
EDIT: I specified SQL Server as the dbms I used here, but the above should be ANSI-compliant and work in Snowflake.
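As a quick check of the top-2-per-user idea, the sketch below rebuilds the sample counts in SQLite through Python's sqlite3 module (an assumption; the question is about Snowflake and the answer about SQL Server). For clarity it splits the work into two CTEs, one that counts and one that numbers the rows, instead of mixing COUNT(*) and ROW_NUMBER() in a single SELECT; the CTE names and the cnt alias are made up for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Search (USER_ID TEXT, SEARCH_TERM TEXT)")

# Recreate the raw search log from the counts shown in the question.
counts = [
    ("bob", "dog", 50), ("bob", "cat", 45), ("sally", "cat", 38),
    ("john", "mouse", 30), ("sally", "turtle", 10), ("sally", "lion", 5),
    ("john", "zebra", 3), ("john", "leopard", 1),
]
for user, term, n in counts:
    conn.executemany("INSERT INTO Search VALUES (?, ?)", [(user, term)] * n)

rows = conn.execute(
    """
    WITH term_counts AS (
        SELECT USER_ID, SEARCH_TERM, COUNT(*) AS cnt
        FROM Search
        GROUP BY USER_ID, SEARCH_TERM
    ),
    ranked AS (
        SELECT USER_ID, SEARCH_TERM, cnt,
               ROW_NUMBER() OVER (
                   PARTITION BY USER_ID ORDER BY cnt DESC
               ) AS rn
        FROM term_counts
    )
    SELECT USER_ID, SEARCH_TERM, cnt
    FROM ranked
    WHERE rn <= 2          -- keep each user's two most frequent terms
    ORDER BY cnt DESC
    """
).fetchall()

for row in rows:
    print(row)
```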

select distinct rows order by Id Asc

I want to select rows that have a distinct Title Column.
Id Title Type
1 Bronze Group
2 Bronze Group
3 Bronze Group
4 Silver Group
5 Silver Group
6 Silver Group
7 Gold Group
8 Gold Group
9 Gold Group
10 Platinum Group
11 Platinum Group
12 Platinum Group
I thought this would be a simple query but I'm struggling! If anyone can help, that would be great.
SELECT DISTINCT(Title), Id
FROM Package
WHERE Type='Group'
ORDER BY Id ASC
You need to group by the title. And you have to tell the DB which rule to apply when selecting the id for duplicate entries. For instance the smallest id for every unique title:
SELECT Title, min(Id) as minid
FROM Package
WHERE Type='Group'
GROUP BY Title
ORDER BY min(Id) ASC
You have to drop Id from the select list: it is unique, so including it makes every row distinct. Something like this:
SELECT DISTINCT Title
FROM Package
WHERE Type='Group'
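To see how the GROUP BY variant with MIN(Id) behaves on the sample table, here is a small sketch using SQLite through Python's sqlite3 module (the database engine and column types are assumptions; the question does not name one).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Package (Id INTEGER, Title TEXT, Type TEXT)")

# Ids 1..12, three consecutive rows per title, as in the question.
titles = ["Bronze", "Silver", "Gold", "Platinum"]
data = [(i + 1, titles[i // 3], "Group") for i in range(12)]
conn.executemany("INSERT INTO Package VALUES (?, ?, ?)", data)

# One row per title, keeping the smallest Id, in Id order.
rows = conn.execute(
    """
    SELECT Title, MIN(Id) AS minid
    FROM Package
    WHERE Type = 'Group'
    GROUP BY Title
    ORDER BY MIN(Id) ASC
    """
).fetchall()

for row in rows:
    print(row)
```

Ordering by MIN(Id) keeps the titles in their original Bronze, Silver, Gold, Platinum sequence, which a plain SELECT DISTINCT Title does not guarantee.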

How to produce detail, not summary, report sorted by count(*)?

Oracle 11g:
I want the results listed by highest count, then by ch_id. When I use GROUP BY to get the count, I lose the granularity of the detail. Is there an analytic function I could use?
SALES
ch_id desc customer
=========================
ANAR Anari BOB
SWIS Swiss JOE
SWIS Swiss AMY
BRUN Brunost SAM
BRUN Brunost ANN
BRUN Brunost ROB
Desired Results
count ch_id customer
===========================================
3 BRUN ANN
3 BRUN ROB
3 BRUN SAM
2 SWIS AMY
2 SWIS JOE
1 ANAR BOB
Use the analytic count(*):
select * from
(
select count(*) over (partition by ch_id) cnt,
ch_id, customer
from sales
)
order by cnt desc
select b.total, s.ch_id, s.customer
from sales s
inner join (select count(*) total, ch_id from sales group by ch_id) b
on b.ch_id = s.ch_id
order by b.total desc, s.ch_id
OK, the other post that came in at the same time, using partition, is the better solution for Oracle. But this one works regardless of DB.
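Both approaches can be compared side by side with the sketch below, run in SQLite through Python's sqlite3 module (an assumption; the question is about Oracle 11g, and window functions need SQLite 3.25 or newer). The desc column is named descr here because desc is a keyword, and customer is added to the ORDER BY so the output order is fully determined.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (ch_id TEXT, descr TEXT, customer TEXT)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?)",
    [
        ("ANAR", "Anari", "BOB"),
        ("SWIS", "Swiss", "JOE"),
        ("SWIS", "Swiss", "AMY"),
        ("BRUN", "Brunost", "SAM"),
        ("BRUN", "Brunost", "ANN"),
        ("BRUN", "Brunost", "ROB"),
    ],
)

# Analytic count: every detail row carries the size of its ch_id group.
analytic_rows = conn.execute(
    """
    SELECT COUNT(*) OVER (PARTITION BY ch_id) AS cnt, ch_id, customer
    FROM sales
    ORDER BY cnt DESC, ch_id, customer
    """
).fetchall()

# Portable alternative: join the detail rows to a grouped subquery.
join_rows = conn.execute(
    """
    SELECT b.total, s.ch_id, s.customer
    FROM sales s
    INNER JOIN (SELECT COUNT(*) AS total, ch_id
                FROM sales GROUP BY ch_id) b
            ON b.ch_id = s.ch_id
    ORDER BY b.total DESC, s.ch_id, s.customer
    """
).fetchall()

for row in analytic_rows:
    print(row)
```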

selecting top N rows for each group in a table

I am facing a very common issue regarding "Selecting top N rows for each group in a table".
Consider a table with id, name, hair_colour, score columns.
I want a resultset such that, for each hair colour, get me top 3 scorer names.
To solve this, I found exactly what I need in Rick Osborne's blog post "sql-getting-top-n-rows-for-a-grouped-query".
However, that solution doesn't work as expected when scores are equal.
In the example above, the result is as follows.
id name hair score ranknum
---------------------------------
12 Kit Blonde 10 1
9 Becca Blonde 9 2
8 Katie Blonde 8 3
3 Sarah Brunette 10 1
4 Deborah Brunette 9 2   <-- if this score were 10
1 Kim Brunette 8 3
Consider the row 4 Deborah Brunette 9 2. If she also had the same score (10) as Sarah, then ranknum would be 2, 2, 3 for the "Brunette" hair type.
What's the solution to this?
If you're using SQL Server 2005 or newer, you can use the ranking functions and a CTE to achieve this:
;WITH HairColors AS
(SELECT id, name, hair, score,
ROW_NUMBER() OVER(PARTITION BY hair ORDER BY score DESC) as 'RowNum'
FROM girl -- table name assumed; the original snippet omitted the FROM clause
)
SELECT id, name, hair, score
FROM HairColors
WHERE RowNum <= 3
This CTE will "partition" your data by the value of the hair column; each partition is then ordered by score (descending) and gets a row number: the highest score in each partition is 1, then 2, etc.
So if you want the TOP 3 of each group, select only those rows from the CTE that have a RowNum of 3 or less (1, 2, 3) --> there you go!
The way the algorithm comes up with the rank, is to count the number of rows in the cross-product with a score equal to or greater than the girl in question, in order to generate rank. Hence in the problem case you're talking about, Sarah's grid would look like
a.name | a.score | b.name | b.score
-------+---------+---------+--------
Sarah | 9 | Sarah | 9
Sarah | 9 | Deborah | 9
and similarly for Deborah, which is why both girls get a rank of 2 here.
The problem is that when there's a tie, all girls take the lowest value in the tied range due to this count, when you'd want them to take the highest value instead. I think a simple change can fix this:
Instead of a greater-than-or-equal comparison, use a strict greater-than comparison to count the number of girls who are strictly better. Then, add one to that and you have your rank (which will deal with ties as appropriate). So the inner select would be:
-- LEFT JOIN keeps the top scorers, who have no strictly better row to match
SELECT a.id, COUNT(b.id) + 1 AS ranknum
FROM girl AS a
LEFT JOIN girl AS b ON (a.hair = b.hair) AND (a.score < b.score)
GROUP BY a.id
HAVING COUNT(b.id) + 1 <= 3
Can anyone see any problems with this approach that have escaped my notice?
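One way to sanity-check the "count the strictly better rows, then add one" idea is to run it on the tie scenario from the question (Deborah's score changed to 10). The sketch below does that in SQLite through Python's sqlite3 module and compares the result with the built-in RANK() window function, which is a swapped-in technique for comparison and not part of the answer above. It is a variant under assumptions: it uses a LEFT JOIN and COUNT(b.id) so that top scorers, who have no strictly better row, are kept with rank 1, and the table layout is made up to match the question's columns.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE girl (id INTEGER, name TEXT, hair TEXT, score INTEGER)")
conn.executemany(
    "INSERT INTO girl VALUES (?, ?, ?, ?)",
    [
        (12, "Kit", "Blonde", 10),
        (9, "Becca", "Blonde", 9),
        (8, "Katie", "Blonde", 8),
        (3, "Sarah", "Brunette", 10),
        (4, "Deborah", "Brunette", 10),  # tie with Sarah (9 in the question)
        (1, "Kim", "Brunette", 8),
    ],
)

# Rank = number of strictly better rows with the same hair colour, plus one.
self_join_rows = conn.execute(
    """
    SELECT a.id, a.name, a.hair, a.score, COUNT(b.id) + 1 AS ranknum
    FROM girl AS a
    LEFT JOIN girl AS b ON a.hair = b.hair AND a.score < b.score
    GROUP BY a.id, a.name, a.hair, a.score
    HAVING COUNT(b.id) + 1 <= 3
    ORDER BY a.hair, ranknum, a.id
    """
).fetchall()

# The same ranking expressed with the RANK() window function.
rank_rows = conn.execute(
    """
    SELECT id, name, hair, score,
           RANK() OVER (PARTITION BY hair ORDER BY score DESC) AS ranknum
    FROM girl
    ORDER BY hair, ranknum, id
    """
).fetchall()

for row in self_join_rows:
    print(row)
```

With the tie, Sarah and Deborah both get rank 1 and Kim gets rank 3, so the tied rows share the best position in the tied range instead of the worst.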
Use this compound select, which handles the OP's problem properly:
SELECT g.* FROM girls as g
WHERE g.score > IFNULL( (SELECT g2.score FROM girls as g2
WHERE g.hair=g2.hair ORDER BY g2.score DESC LIMIT 3,1), 0)
Note that you need IFNULL here to handle the case where the girls table has fewer rows for some hair type than the number we want to return (3 in the OP's case).
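The role of IFNULL can be seen in the sketch below, which runs the same query shape in SQLite through Python's sqlite3 module (an assumption; the LIMIT 3,1 syntax suggests the answer targets MySQL, and SQLite happens to accept it too). The row for "Dana" is hypothetical and is added only so that one hair colour has more than three rows, while "Brunette" is cut down to two rows to trigger the NULL case.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE girls (id INTEGER, name TEXT, hair TEXT, score INTEGER)")
conn.executemany(
    "INSERT INTO girls VALUES (?, ?, ?, ?)",
    [
        (12, "Kit", "Blonde", 10),
        (9, "Becca", "Blonde", 9),
        (8, "Katie", "Blonde", 8),
        (20, "Dana", "Blonde", 7),      # hypothetical fourth blonde
        (3, "Sarah", "Brunette", 10),
        (4, "Deborah", "Brunette", 9),  # only two brunettes in this sketch
    ],
)

# LIMIT 3,1 skips three rows and returns the fourth-highest score for the
# hair colour. Rows scoring above it are kept. When a colour has fewer than
# four rows the subquery yields NULL, and IFNULL turns that into 0 so the
# comparison still succeeds for every positive score.
rows = conn.execute(
    """
    SELECT g.name, g.hair, g.score
    FROM girls AS g
    WHERE g.score > IFNULL(
        (SELECT g2.score
         FROM girls AS g2
         WHERE g.hair = g2.hair
         ORDER BY g2.score DESC
         LIMIT 3,1),
        0)
    ORDER BY g.hair, g.score DESC
    """
).fetchall()

for row in rows:
    print(row)
```

Without IFNULL, the comparison against NULL would never be true and both brunettes would drop out of the result.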