Suppose I get the following result for a GROUP BY query on a table:
Name Count(*)
Apple 6
Mango 3
Grape 8
Pomegranate 1
Strawberry 13
How can I list the top three elements and sum up the rest under a name like 'Others'? Something like the following:
Name Count(*)
Strawberry 13
Grape 8
Apple 6
Others 4
This is to be done in Oracle. Searching yields results that use TOP, which is not available in Oracle.
Here's a complete solution:
WITH fruit_summary AS (
SELECT fruit, cnt,
RANK() OVER (ORDER BY cnt DESC) AS cnt_rank
FROM (
SELECT fruit, count(*) AS cnt
FROM fruit_table
GROUP BY fruit
)
)
SELECT fruit, cnt
FROM (
SELECT fruit, cnt, cnt AS val
FROM fruit_summary
WHERE cnt_rank <= 3
UNION ALL
SELECT 'Others', SUM(cnt), -1
FROM fruit_summary
WHERE cnt_rank > 3
)
ORDER BY val DESC
Note that the query can return more than 4 (3 + 1) rows if several fruits in the top 3 have the same count, because RANK() gives tied rows the same rank.
The WITH clause groups the source table by fruit and assigns a rank to each row. The resulting intermediate table is then used both to display the top 3 and to sum up the remaining rows.
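To try the query without an Oracle instance, here is a minimal sketch that runs the same SQL through Python's sqlite3 module. It assumes an SQLite build with window functions (3.25 or later); the function name top3_with_others and the sample dictionary are made up for illustration, and SQLite is only a stand-in for Oracle here.

```python
import sqlite3

# Same shape as the Oracle query above; SQLite 3.25+ is assumed for RANK().
QUERY = """
WITH fruit_summary AS (
    SELECT fruit, cnt,
           RANK() OVER (ORDER BY cnt DESC) AS cnt_rank
    FROM (SELECT fruit, COUNT(*) AS cnt FROM fruit_table GROUP BY fruit)
)
SELECT fruit, cnt
FROM (
    SELECT fruit, cnt, cnt AS val FROM fruit_summary WHERE cnt_rank <= 3
    UNION ALL
    SELECT 'Others', SUM(cnt), -1 FROM fruit_summary WHERE cnt_rank > 3
)
ORDER BY val DESC
"""

def top3_with_others(counts):
    """Load one row per fruit occurrence, then run the top-3 + Others query."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE fruit_table (fruit TEXT)")
    rows = [(name,) for name, n in counts.items() for _ in range(n)]
    conn.executemany("INSERT INTO fruit_table VALUES (?)", rows)
    result = conn.execute(QUERY).fetchall()
    conn.close()
    return result

# Counts taken from the question.
sample = {"Apple": 6, "Mango": 3, "Grape": 8, "Pomegranate": 1, "Strawberry": 13}
for fruit, cnt in top3_with_others(sample):
    print(fruit, cnt)
```

The -1 sort key is what keeps 'Others' at the bottom even when its sum is larger than one of the top three counts.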
This solution should work for you. You can actually use aggregates in window functions:
WITH f1 AS (
SELECT fruit_name, COUNT(*) AS fruit_cnt, RANK() OVER ( ORDER BY COUNT(*) DESC ) AS fruit_rank
FROM fruits
GROUP BY fruit_name
)
SELECT fruit_name, fruit_cnt, fruit_rank
FROM f1
WHERE fruit_rank <= 3
UNION
SELECT 'Others' AS fruit_name, SUM(fruit_cnt) AS fruit_cnt, MAX(fruit_rank) AS fruit_rank
FROM f1
WHERE fruit_rank > 3
ORDER BY fruit_rank
I have a problem with a SQL query. It could probably be done in "plain" SQL, but I am fairly sure I need some kind of group concatenation (I can't use MySQL), so the Oracle dialect is fine, as the database will be Oracle. Let's say we have the following entity:
Table: Veterinarian visits
Visit_Id,
Animal_id,
Veterinarian_id,
Sickness_code
Let's say there are 100 visits (100 visit_id values) and each animal_id visits around 20 times.
I need to create a SELECT, grouped by Animal_id, with 3 columns:
the first is animal_id
the second shows the aggregated number of flu visits for this particular animal (let's say flu is sickness_code = 5)
the third shows the top three sickness codes for each animal (the 3 most frequent codes for this particular animal_id)
How do I do it? The first and second columns are easy, but the third? I know that I need to use LISTAGG from Oracle, OVER PARTITION BY, COUNT and RANK. I tried to tie them together, but it didn't work out as I expected :( What should this query look like?
Here is some sample data:
create table VET as
select
rownum+1 Visit_Id,
mod(rownum+1,5) Animal_id,
cast(NULL as number) Veterinarian_id,
trunc(10*dbms_random.value)+1 Sickness_code
from dual
connect by level <=100;
Query
Basically, the subqueries do the following:
aggregate the count and calculate the flu count (over all records of the animal)
calculate RANK (if you really need only 3 records, use ROW_NUMBER; see the discussion below)
filter the top 3 RANKs
LISTAGG the result
with agg as (
select Animal_id, Sickness_code, count(*) cnt,
sum(case when SICKNESS_CODE = 5 then 1 else 0 end) over (partition by animal_id) as cnt_flu
from vet
group by Animal_id, Sickness_code
), agg2 as (
select ANIMAL_ID, SICKNESS_CODE, CNT, cnt_flu,
rank() OVER (PARTITION BY ANIMAL_ID ORDER BY cnt DESC) rnk
from agg
), agg3 as (
select ANIMAL_ID, SICKNESS_CODE, CNT, CNT_FLU, RNK
from agg2
where rnk <= 3
)
select
ANIMAL_ID, max(CNT_FLU) CNT_FLU,
LISTAGG(SICKNESS_CODE||'('||CNT||')', ', ') WITHIN GROUP (ORDER BY rnk) as cnt_lts
from agg3
group by ANIMAL_ID
order by 1;
gives
ANIMAL_ID CNT_FLU CNT_LTS
---------- ---------- ---------------------------------------------
0 1 6(5), 1(4), 9(3)
1 1 1(5), 3(4), 2(3), 8(3)
2 0 1(5), 10(3), 4(3), 6(3), 7(3)
3 1 5(4), 2(3), 4(3), 7(3)
4 1 2(5), 10(4), 1(2), 3(2), 5(2), 7(2), 8(2)
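I intentionally show Sickness_code(visit count) to demonstrate that the top 3 can have ties, which you should handle.
Check the RANK function. Using ROW_NUMBER is not deterministic in this case, because the order among tied counts is arbitrary.
Note also that CNT_FLU, as written, only flags whether the animal has any flu visit (0 or 1): the window SUM runs over the grouped rows, one per sickness code. To count the visits themselves, put count(*) in place of the 1 inside the CASE.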
I think the most natural way uses two levels of aggregation, along with a dash of window functions here and there:
select vas.animal,
sum(case when sickness_code = 5 then cnt else 0 end) as numflu,
listagg(case when seqnum <= 3 then sickness_code end, ',') within group (order by seqnum) as top3sicknesses
from (select animal, sickness_code, count(*) as cnt,
row_number() over (partition by animal order by count(*) desc) as seqnum
from visits
group by animal, sickness_code
) vas
group by vas.animal;
This uses the fact that listagg() ignores NULL values.
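As a rough way to check the two-level aggregation idea, here is a sketch using Python's sqlite3 module. It is not Oracle: GROUP_CONCAT stands in for listagg() (it also skips NULLs, but does not guarantee element order), the counting is moved into its own subquery, and sickness_code is added as a tie-breaker so ROW_NUMBER is deterministic. The sample visits and the function name flu_and_top3 are invented for the example; SQLite 3.25+ is assumed.

```python
import sqlite3

# Hypothetical visits: (animal, sickness_code)
VISITS = [(1, 5), (1, 5), (1, 5), (1, 2), (1, 2), (1, 7), (1, 9),
          (2, 3), (2, 3), (2, 4)]

QUERY = """
SELECT animal,
       SUM(CASE WHEN sickness_code = 5 THEN cnt ELSE 0 END) AS numflu,
       GROUP_CONCAT(CASE WHEN seqnum <= 3 THEN sickness_code END) AS top3
FROM (
    SELECT animal, sickness_code, cnt,
           ROW_NUMBER() OVER (PARTITION BY animal
                              ORDER BY cnt DESC, sickness_code) AS seqnum
    FROM (SELECT animal, sickness_code, COUNT(*) AS cnt
          FROM visits
          GROUP BY animal, sickness_code)
) vas
GROUP BY animal
ORDER BY animal
"""

def flu_and_top3(visits):
    """Return {animal: (flu visit count, set of top-3 sickness codes)}."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE visits (animal INTEGER, sickness_code INTEGER)")
    conn.executemany("INSERT INTO visits VALUES (?, ?)", visits)
    out = {}
    for animal, numflu, top3 in conn.execute(QUERY):
        # GROUP_CONCAT returns a comma-separated string; NULLs are skipped.
        out[animal] = (numflu, {int(code) for code in top3.split(",")})
    conn.close()
    return out

print(flu_and_top3(VISITS))
```

The codes are collected into a set because the concatenation order is not guaranteed in SQLite; in Oracle, WITHIN GROUP (ORDER BY seqnum) fixes the order.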
I have a table with rows like:
id group_name_code
1 999
2 16
3 789
4 999
5 231
6 999
7 349
8 16
9 819
10 999
11 654
But I want output rows like this:
id group_name_code
1 999
2 16
3 789
4 231
5 349
6 819
7 654
Will this query help?
select id, distinct(group_name_code) from group_table;
You seem to want:
Distinct values for group_name_code and a sequential id ordered by minimum id per set of group_name_code.
Netezza has the DISTINCT key word, but not DISTINCT ON () (Postgres feature):
https://www.ibm.com/support/knowledgecenter/en/SSULQD_7.2.1/com.ibm.nz.dbu.doc/r_dbuser_select.html
You could:
SELECT DISTINCT group_name_code FROM group_table;
No parentheses: the DISTINCT keyword does not require them.
But you would not get the sequential id you show with this.
There are "analytic functions" a.k.a. window functions:
https://www.ibm.com/support/knowledgecenter/en/SSULQD_7.2.1/com.ibm.nz.dbu.doc/c_dbuser_overview_analytic_funcs.html
And there is also row_number():
https://www.ibm.com/support/knowledgecenter/en/SSULQD_7.2.1/com.ibm.nz.dbu.doc/r_dbuser_functions.html
So this should work:
SELECT row_number() OVER (ORDER BY min(id)) AS new_id, group_name_code
FROM group_table
GROUP BY group_name_code
ORDER BY min(id);
Or use a subquery if Netezza does not allow nesting aggregate and window functions:
SELECT row_number() OVER (ORDER BY id) AS new_id, group_name_code
FROM (
SELECT min(id) AS id, group_name_code
FROM group_table
GROUP BY group_name_code
) sub
ORDER BY id;
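I have no Netezza at hand, so here is a minimal sketch of the subquery variant run against SQLite (3.25+ assumed) through Python's sqlite3 module, using the rows from the question. The function name renumber_codes is made up for the example.

```python
import sqlite3

# group_name_code values from the question, for ids 1..11
CODES = [999, 16, 789, 999, 231, 999, 349, 16, 819, 999, 654]

QUERY = """
SELECT ROW_NUMBER() OVER (ORDER BY id) AS new_id, group_name_code
FROM (
    SELECT MIN(id) AS id, group_name_code
    FROM group_table
    GROUP BY group_name_code
) sub
ORDER BY id
"""

def renumber_codes(codes):
    """Keep the first occurrence of each code and number the survivors 1..n."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE group_table (id INTEGER, group_name_code INTEGER)")
    conn.executemany("INSERT INTO group_table VALUES (?, ?)",
                     list(enumerate(codes, start=1)))
    result = conn.execute(QUERY).fetchall()
    conn.close()
    return result

for new_id, code in renumber_codes(CODES):
    print(new_id, code)
```

With the question's data this prints the seven distinct codes in order of first appearance, numbered 1 to 7.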
If you do not mind losing data on id you can use an aggregate function on that column and group by group_name_code:
select min(id) as id, group_name_code
from group_table
group by group_name_code
order by id;
This way you pull unique values for group_name_code and the lowest id for each code.
If you don't need id in your output (it doesn't seem to correspond to the input table) and just want the unique codes, try this:
select group_name_code
from group_table
group by group_name_code
order by min(id);
This gets the codes you want. If you want id to be the row number, that will depend on which RDBMS you are using.
You can get that result using a CTE; replace #t with your table name and value with group_name_code:
; WITH tbl AS (
SELECT DISTINCT value FROM #t
)
SELECT ROW_NUMBER() OVER (ORDER BY value) AS id,* FROM tbl
I have a table with 2 fields:
ID Name
-- -------
1 Alpha
2 Beta
3 Beta
4 Beta
5 Charlie
6 Charlie
I want to group them by name, with 'count', and a row 'SUM'
Name Count
------- -----
Alpha 1
Beta 3
Charlie 2
SUM 6
How would I write a query to add SUM row below the table?
SELECT name, COUNT(name) AS count
FROM table
GROUP BY name
UNION ALL
SELECT 'SUM' name, COUNT(name)
FROM table
OUTPUT:
name count
-------------------------------------------------- -----------
alpha 1
beta 3
Charlie 2
SUM 6
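One caveat: without an ORDER BY, the position of the 'SUM' row is not guaranteed. A small sketch of one way to pin it to the bottom, run here on SQLite through Python's sqlite3 module; the table name people and the is_total sort column are my own additions, not part of the answer above.

```python
import sqlite3

NAMES = ["Alpha", "Beta", "Beta", "Beta", "Charlie", "Charlie"]

# is_total is a hypothetical helper column used only for sorting.
QUERY = """
SELECT name, cnt
FROM (
    SELECT name, COUNT(*) AS cnt, 0 AS is_total FROM people GROUP BY name
    UNION ALL
    SELECT 'SUM', COUNT(*), 1 FROM people
)
ORDER BY is_total, name
"""

def counts_with_total(names):
    """Per-name counts followed by a single 'SUM' row."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE people (id INTEGER PRIMARY KEY, name TEXT)")
    conn.executemany("INSERT INTO people (name) VALUES (?)",
                     [(n,) for n in names])
    result = conn.execute(QUERY).fetchall()
    conn.close()
    return result

for name, cnt in counts_with_total(NAMES):
    print(name, cnt)
```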
SELECT name, COUNT(name) AS count, SUM(COUNT(name)) OVER() AS total_count
FROM Table GROUP BY name
Since you did not specify which RDBMS you are using, here is a generic version:
SELECT Name, COUNT(1) as Cnt
FROM Table1
GROUP BY Name
UNION ALL
SELECT 'SUM' Name, COUNT(1)
FROM Table1
That said, I would recommend that the total be added by your presentation layer, and not by the database.
This is more of a SQL Server version, summarizing the data using ROLLUP:
SELECT CASE WHEN (GROUPING(NAME) = 1) THEN 'SUM'
ELSE ISNULL(NAME, 'UNKNOWN')
END Name,
COUNT(1) as Cnt
FROM Table1
GROUP BY NAME
WITH ROLLUP
Try this:
SELECT ISNULL(Name,'SUM'), count(*) as Count
FROM table_name
Group By Name
WITH ROLLUP
All of the solutions here are great, but they cannot necessarily be used on old MySQL servers (at least in my case). So you can use subqueries instead (I think they are less complicated).
select sum(t1.cnt) from
(SELECT column, COUNT(column) as cnt
FROM
table
GROUP BY
column
HAVING
COUNT(column) > 1) as t1 ;
Please run the following (the derived table needs an alias in MySQL and SQL Server):
Select sum(Count)
from (select Name,
count(Name) as Count
from YourTable
group by Name) t; -- 6
The way I interpreted this question is needing the subtotal value of each group of answers. Subtotaling turns out to be very easy, using PARTITION:
SUM(COUNT(0)) OVER (PARTITION BY [Grouping]) AS [MY_TOTAL]
This is what my full SQL call looks like:
SELECT MAX(GroupName) [name], MAX(AUX2)[type],
COUNT(0) [count], SUM(COUNT(0)) OVER(PARTITION BY GroupId) AS [total]
FROM [MyView]
WHERE Active=1 AND Type='APP' AND Completed=1
AND [Date] BETWEEN '01/01/2014' AND GETDATE()
AND Id = '5b9xxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx' AND GroupId IS NOT NULL
GROUP BY AUX2, GroupId
The data returned from this looks like:
name type count total
Training Group 2 Cancelation 1 52
Training Group 2 Completed 41 52
Training Group 2 No Show 6 52
Training Group 2 Rescheduled 4 52
Training Group 3 NULL 4 10535
Training Group 3 Cancelation 857 10535
Training Group 3 Completed 7923 10535
Training Group 3 No Show 292 10535
Training Group 3 Rescheduled 1459 10535
Training Group 4 Cancelation 2 27
Training Group 4 Completed 24 27
Training Group 4 Rescheduled 1 27
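For anyone who wants to experiment with the per-group subtotal, here is a small sketch on SQLite (3.25+ assumed) via Python's sqlite3 module. To stay on the safe side it does the counting in a subquery and applies SUM(...) OVER (PARTITION BY ...) to the result, rather than nesting SUM(COUNT()) directly. The table sessions, its columns and the sample rows are hypothetical.

```python
import sqlite3

# Hypothetical rows: (grp, type)
SESSIONS = [("G2", "Completed")] * 3 + [("G2", "No Show")] + [("G3", "Completed")] * 2

QUERY = """
SELECT grp, type, cnt,
       SUM(cnt) OVER (PARTITION BY grp) AS total
FROM (SELECT grp, type, COUNT(*) AS cnt
      FROM sessions
      GROUP BY grp, type)
ORDER BY grp, type
"""

def counts_with_group_total(rows):
    """Per (grp, type) counts, each carrying the subtotal of its grp."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE sessions (grp TEXT, type TEXT)")
    conn.executemany("INSERT INTO sessions VALUES (?, ?)", rows)
    result = conn.execute(QUERY).fetchall()
    conn.close()
    return result

for row in counts_with_group_total(SESSIONS):
    print(row)
```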
You can use UNION ALL to append the total row.
select Name, count(*) as Count from yourTable group by Name
union all
select 'SUM' as Name, count(*) as Count from yourTable
For SQL Server, you can try this one:
SELECT ISNULL([NAME],'SUM'),Count([NAME]) AS COUNT
FROM TABLENAME
GROUP BY [NAME] WITH CUBE
with cttmp
as
(
select Col_Name, count(*) as ctn from tab_name group by Col_Name having count(Col_Name)>1
)
select sum(ctn) from cttmp
You can use ROLLUP
select nvl(name, 'SUM'), count(*)
from table
group by rollup(name)
Use it as
select Name, count(Name) as Count from YourTable
group by Name
union
Select 'SUM' , COUNT(Name) from YourTable
I am using SQL server and the following should work for you:
select cast(name as varchar(16)) as 'Name', count(name) as 'Count'
from Table1
group by Name
union all
select 'Sum:', count(name)
from Table1
I also needed HAVING count(*) > 1, so I wrote my own query after referring to some of the queries above.
Syntax:
select sum(count) from (select count(`table_name`.`id`) as `count` from `table_name` where {some condition} group by {some_column} having count(`table_name`.`id`) > 1) as `tmp`;
Example:
select sum(count) from (select count(`table_name`.`id`) as `count` from `table_name` where `table_name`.`name` IS NOT NULL and `table_name`.`name` != '' group by `table_name`.`name` having count(`table_name`.`id`) > 1) as `tmp`;
You can try GROUP BY on name and count the ids in each group.
SELECT name, count(id) as COUNT FROM table group by name
After the query, run the statement below to get the row count:
select @@ROWCOUNT
select sum(s) from
(select count(Col_name) as s from Tab_name group by Col_name having count(*)>1)c
I tried this with solutions available online, but none worked for me.
Table :
Id rank
1 100
1 100
2 75
2 45
3 50
3 50
I want Ids 1 and 3 returned, because they have duplicates.
I tried something like
select * from A where rank in (
select rank from A group by rank having count(rank) > 1
)
This also returned ids without any duplicates. Please help.
Try this:
select id from table
group by id, rank
having count(*) > 1
select id, rank
from
(
select id, rank, count(*) cnt
from rank_tab
group by id, rank
having count(*) > 1
) t
This general idea should work:
SELECT id
FROM your_table
GROUP BY id
HAVING COUNT(*) > 1 AND COUNT(DISTINCT rank) = 1
In plain English: get every id that exists in multiple rows, but all these rows have the same value in rank.
If you want ids that have some duplicated ranks (but not necessarily all), something like this should work:
SELECT id
FROM your_table
GROUP BY id
HAVING COUNT(*) > COUNT(DISTINCT rank)
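To see the difference between the two HAVING conditions, here is a quick sketch on SQLite via Python's sqlite3 module. It uses the rows from the question plus one extra id (4) that I added so the two queries disagree; the column is quoted as "rank" only as a precaution.

```python
import sqlite3

ROWS = [(1, 100), (1, 100), (2, 75), (2, 45), (3, 50), (3, 50),
        (4, 10), (4, 10), (4, 20)]  # id 4 is a hypothetical extra case

# Every row of the id shares one rank, and there is more than one row.
ALL_SAME = """
SELECT id FROM A GROUP BY id
HAVING COUNT(*) > 1 AND COUNT(DISTINCT "rank") = 1
ORDER BY id
"""

# At least one rank value is repeated for the id.
SOME_DUP = """
SELECT id FROM A GROUP BY id
HAVING COUNT(*) > COUNT(DISTINCT "rank")
ORDER BY id
"""

def run(query):
    """Load ROWS into a fresh in-memory table and return the matching ids."""
    conn = sqlite3.connect(":memory:")
    conn.execute('CREATE TABLE A (id INTEGER, "rank" INTEGER)')
    conn.executemany("INSERT INTO A VALUES (?, ?)", ROWS)
    ids = [row[0] for row in conn.execute(query)]
    conn.close()
    return ids

print(run(ALL_SAME))
print(run(SOME_DUP))
```

Id 4 has ranks 10, 10, 20, so it shows up only under the second condition.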
I have a web site that collects high scores for a game. The sidebar shows the latest 10 scores (not necessarily the highest, just the latest 10). However, since a user can play multiple games quickly, they can dominate the latest 10 list. How can I write an SQL query to show the last 10 scores but limit it to one per user?
SELECT username, max(score)
FROM Sometable
GROUP BY username
ORDER BY Max(score) DESC
From that, select the top X depending on your DB platform, e.g. SELECT TOP (10) in MS SQL 2005+.
Edit: sorry, I see that you want things ordered by date.
Here's a working query for MS SQL 2005.
;
WITH CTE AS
(
SELECT ROW_NUMBER() OVER (PARTITION BY username ORDER BY dateadded DESC) AS 'RowNo',
username, score, dateadded FROM SomeTable
)
SELECT username, score, dateadded FROM CTE
WHERE RowNo = 1
Group by user... and either select the Max(Score), Max([Submission Date]) or whatever.
In SQL Server, you could use the RANK() OVER() with appropriate PARTITION and GROUP BY, but what platform are you using?
In the interest of providing another point of view you could just add a field "max score" to your user table and then use a simple query with an order by to get the top 10.
Your update will need to check whether the new score is higher than the current max score.
It does have the advantage of querying a table that will most probably have fewer rows than your score table.
Anyway, just another option to consider.
SELECT s2.*
FROM
(SELECT user_id, MAX(action_time) AS max_time
FROM scores s1 GROUP BY user_id
ORDER BY MAX(action_time) DESC LIMIT 10)s1
INNER JOIN scores s2 ON (s2.user_id = s1.user_id AND s2.action_time = s1.max_time)
This is MySQL syntax; for SQL Server you need to use SELECT TOP 10 ... instead of LIMIT 10.
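Here is a minimal sketch of that join approach, run on SQLite through Python's sqlite3 module since LIMIT works there too. The sample scores, the integer action_time values and the function name latest_per_user are invented for the example.

```python
import sqlite3

# Hypothetical rows: (user_id, score, action_time); larger time = more recent
SCORES = [("ann", 10, 1), ("bob", 20, 2), ("cat", 50, 3),
          ("bob", 15, 4), ("ann", 30, 5), ("ann", 5, 6)]

QUERY = """
SELECT s2.user_id, s2.score, s2.action_time
FROM (SELECT user_id, MAX(action_time) AS max_time
      FROM scores
      GROUP BY user_id
      ORDER BY max_time DESC
      LIMIT ?) latest
INNER JOIN scores s2
        ON s2.user_id = latest.user_id AND s2.action_time = latest.max_time
ORDER BY s2.action_time DESC
"""

def latest_per_user(scores, limit):
    """Most recent score of each user, newest first, at most `limit` users."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE scores (user_id TEXT, score INTEGER, action_time INTEGER)")
    conn.executemany("INSERT INTO scores VALUES (?, ?, ?)", scores)
    result = conn.execute(QUERY, (limit,)).fetchall()
    conn.close()
    return result

for row in latest_per_user(SCORES, 2):
    print(row)
```

If a user can have two scores with exactly the same action_time, the join returns both rows for that user, so a unique timestamp (or an extra tie-breaker) is assumed.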
Here is a working example that I built on SQL Server 2008
WITH MyTable AS
(
SELECT 1 as UserId, 10 as Score UNION ALL
SELECT 1 as UserId, 11 as Score UNION ALL
SELECT 1 as UserId, 12 as Score UNION ALL
SELECT 2 as UserId, 13 as Score UNION ALL
SELECT 2 as UserId, 14 as Score UNION ALL
SELECT 3 as UserId, 15 as Score UNION ALL
SELECT 3 as UserId, 16 as Score UNION ALL
SELECT 3 as UserId, 17 as Score UNION ALL
SELECT 4 as UserId, 18 as Score UNION ALL
SELECT 4 as UserId, 19 as Score UNION ALL
SELECT 5 as UserId, 20 as Score UNION ALL
SELECT 6 as UserId, 21 as Score UNION ALL
SELECT 7 as UserId, 22 as Score UNION ALL
SELECT 7 as UserId, 23 as Score UNION ALL
SELECT 7 as UserId, 24 as Score UNION ALL
SELECT 8 as UserId, 25 as Score UNION ALL
SELECT 8 as UserId, 26 as Score UNION ALL
SELECT 9 as UserId, 26 as Score UNION ALL
SELECT 10 as UserId, 20 as Score
),
MyTableNew AS
(
SELECT Row_Number() OVER (Order By UserId) Sequence, *
FROM MyTable
),
RankedUsers AS
(
SELECT *, Row_Number() OVER (Partition By UserId ORDER BY Sequence DESC) Ranks
FROM MyTableNew
)
SELECT *
FROM MyTableNew
WHERE Sequence IN
(
SELECT TOP 5 Sequence
FROM RankedUsers
WHERE Ranks = 1
ORDER BY Sequence DESC
)