running sum on group by - sql

I have this query
SELECT NAME, OTHER_NAME, COUNT(NAME)
FROM ETHNICITY
GROUP BY NAME,OTHER_NAME
and I would like to add a running sum of the counts, ordered by NAME or OTHER_NAME.
For instance, if there are 3 rows with name = "african american" and 2 rows with name = "other" and other_name = "jewish",
I want the counts 3 and 2, plus a total that accumulates as it goes through the rows (3, then 5).
Any ideas how I can augment this to add that? Thanks.

In Oracle, a running sum is easily done with the sum() ... over() window function:
select name
, other_name
, name_count
, sum(name_count) over(
order by name, other_name) as running
from (
select name
, other_name
, count(name) as name_count
from ethnicity
group by
name
, other_name
order by
name
, other_name
) subqueryalias

I prefer to do this using a subquery:
select t.name, t.other_name, t.cnt,
sum(cnt) over (order by name) as cumecnt
from (SELECT NAME, OTHER_NAME, COUNT(NAME) as cnt
FROM ETHNICITY
GROUP BY NAME,OTHER_NAME
) t
This assumes that you want a cumulative sum of count in the order of name.
The ORDER BY clause inside the analytic function is what makes the sum cumulative. This is standard syntax, and it is also supported by Postgres and SQL Server 2012.
The following might also work
select name, other_name, count(name) as cnt,
sum(count(name)) over (order by name)
from ethnicity
group by name, other_name
I find this harder to read (the sum(count()) is a bit jarring) and perhaps more prone to error. I haven't tried this syntax on Oracle; it does work in SQL Server 2012.
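To see how the ORDER BY inside OVER() shapes the running total, here is a minimal sketch run through Python's built-in sqlite3 module rather than Oracle. It assumes the bundled SQLite is 3.25 or newer (the first release with window functions). The sample rows and the "-" placeholder for other_name are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE ethnicity (name TEXT, other_name TEXT)")
# Hypothetical sample data: 3 + 2 + 1 rows across three (name, other_name) groups.
conn.executemany(
    "INSERT INTO ethnicity VALUES (?, ?)",
    [("African American", "-")] * 3
    + [("Other", "Jewish")] * 2
    + [("Other", "Irish")],
)

running = conn.execute("""
    SELECT name, other_name, cnt,
           SUM(cnt) OVER (ORDER BY name, other_name) AS by_both,
           SUM(cnt) OVER (ORDER BY name)             AS by_name_only
    FROM (SELECT name, other_name, COUNT(name) AS cnt
          FROM ethnicity
          GROUP BY name, other_name) t
    ORDER BY name, other_name
""").fetchall()

for row in running:
    print(row)
```

Ordering by both columns gives 3, 4, 6. Ordering by name alone treats the two "Other" rows as peers under the default window frame, so both show 6.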

Look at GROUPING SETS, which lets you aggregate totals at several levels in one query.
Not sure this is what you're after though...
SELECT NAME, OTHER_NAME, COUNT(NAME)
FROM ETHNICITY
GROUP BY GROUPING SETS ((NAME,OTHER_NAME), (Name), ())
Sorry, ID10T error... the grouping sets didn't require a second aggregate; the count will do it on its own.
So this data:
Name  Other_Name
A     B
A     C
A     D
B     E
B     F
B     G
C     H
C     I
C     J
Results in:
Name  Other_Name  COUNT(Name)
A     B           1
A     C           1
A     D           1
A                 3
B     E           1
B     F           1
B     G           1
B                 3
C     H           1
C     I           1
C     J           1
C                 3
                  9
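For engines without GROUPING SETS, the same three grouping levels can be emulated with UNION ALL. The sketch below uses Python's sqlite3 module (SQLite has no GROUPING SETS), so this is a swapped-in technique, not the Oracle syntax above. The is-NULL sort keys are an assumption added only to place each subtotal after its detail rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE ethnicity (name TEXT, other_name TEXT)")
conn.executemany(
    "INSERT INTO ethnicity VALUES (?, ?)",
    [("A", "B"), ("A", "C"), ("A", "D"),
     ("B", "E"), ("B", "F"), ("B", "G"),
     ("C", "H"), ("C", "I"), ("C", "J")],
)

rows = conn.execute("""
    SELECT name, other_name, cnt
    FROM (
        SELECT name, other_name, COUNT(name) AS cnt
        FROM ethnicity GROUP BY name, other_name       -- level (NAME, OTHER_NAME)
        UNION ALL
        SELECT name, NULL, COUNT(name)
        FROM ethnicity GROUP BY name                   -- level (NAME)
        UNION ALL
        SELECT NULL, NULL, COUNT(name) FROM ethnicity  -- level (): grand total
    )
    ORDER BY name IS NULL, name, other_name IS NULL, other_name
""").fetchall()

for row in rows:
    print(row)
```

The subtotal and grand-total rows carry NULL (None in Python) in the columns that were rolled up, matching the blank cells in the result table above.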

Related

Order by Counts in a Group

I have the data which looks like below:
group resource count
A X 5
A Y 8
A Z 2
B E 8
B F 10
B G 2
I want to order the data so that the group with the highest sum of count comes first, using a SQL statement, such as:
group resource count
B F 10
B E 8
B G 2
A Y 8
A X 5
A Z 2
I would also like to avoid using multiple SELECT statements. Any help would be appreciated. Thanks.
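Try using a window function to sum your count and order by that. Your DBMS was not listed, so the syntax might need minor tweaks: the version below is written for SQL Server (bracket-quoted identifiers and the alias = expression form), while most other DBMSs would use double-quoted identifiers and expression AS alias.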
SELECT [Group]
,[Resource]
,[Count]
,TotalCountOfGroup = SUM([Count]) OVER (PARTITION BY [Group])
FROM YourTable
ORDER BY TotalCountOfGroup DESC,[Count] DESC
Or, if you need to exclude the TotalCountOfGroup column, you can wrap it in a CTE:
WITH cte_YourData AS (
SELECT [Group],[Resource],[Count]
,TotalCountOfGroup = SUM([Count]) OVER (PARTITION BY [Group])
FROM YourTable
)
SELECT [Group],[Resource],[Count]
FROM cte_YourData
ORDER BY TotalCountOfGroup DESC,[Count] DESC
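The same idea can be checked against the question's data with a small sketch using Python's sqlite3 module (assuming SQLite 3.25+ for window functions). The table name usage_counts and the column names grp and cnt are hypothetical, chosen to avoid the reserved words GROUP and COUNT. The extra grp sort key is an addition that keeps groups together if two groups happen to have the same total.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE usage_counts (grp TEXT, resource TEXT, cnt INTEGER)")
conn.executemany(
    "INSERT INTO usage_counts VALUES (?, ?, ?)",
    [("A", "X", 5), ("A", "Y", 8), ("A", "Z", 2),
     ("B", "E", 8), ("B", "F", 10), ("B", "G", 2)],
)

ordered = conn.execute("""
    SELECT grp, resource, cnt
    FROM (SELECT grp, resource, cnt,
                 SUM(cnt) OVER (PARTITION BY grp) AS group_total
          FROM usage_counts)
    ORDER BY group_total DESC,  -- biggest group first
             grp,               -- tie-breaker between equal group totals
             cnt DESC           -- largest count first within a group
""").fetchall()

for row in ordered:
    print(row)
```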

how to achieve count(distinct) without group by in hive

I want to find the count(distinct column name) without using group by in hive.
My input is:
name id
a 2
a 3
a 4
b 1
c 4
c 4
d 7
d 9
my expected output is
name count
a 3
b 1
c 1
d 2
Can someone tell me how to achieve this without using group by? Please help.
A canonical solution with no explicit group by is select distinct with window functions:
select distinct name, count(distinct id) over (partition by name)
from t;
In your case, I strongly recommend the group by version:
select name, count(distinct id)
from t
group by name;
You can use a correlated subquery:
select distinct t.name,
(select count(distinct id) from table t1 where t1.name = t.name) as count
from table t;
However, GROUP BY is really the appropriate way to do this.
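As a quick check that the correlated-subquery form and the GROUP BY form agree on the question's data, here is a sketch using Python's sqlite3 module. It runs on SQLite, not Hive, so it only illustrates the logic. Whether Hive accepts this kind of subquery in the select list has not been verified here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (name TEXT, id INTEGER)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?)",
    [("a", 2), ("a", 3), ("a", 4), ("b", 1),
     ("c", 4), ("c", 4), ("d", 7), ("d", 9)],
)

# No GROUP BY: one correlated subquery per distinct name.
without_group_by = conn.execute("""
    SELECT DISTINCT t.name,
           (SELECT COUNT(DISTINCT t1.id)
            FROM t AS t1
            WHERE t1.name = t.name) AS cnt
    FROM t
    ORDER BY t.name
""").fetchall()

# The recommended GROUP BY version, for comparison.
with_group_by = conn.execute("""
    SELECT name, COUNT(DISTINCT id) AS cnt
    FROM t
    GROUP BY name
    ORDER BY name
""").fetchall()

print(without_group_by)
print(with_group_by)
```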
Just use the count aggregate function with the distinct keyword:
select name,count(distinct id) as cnt from table
group by name

Query 2 sum in 1 table

table a
column id : a a b b
column total : 1 2 1 3
How can I show the following in one result, without using COMPUTE?
a 3 7
b 4 7
Use GROUP BY to sum each id's total, and a sub-select to get the overall total:
select id,
sum(total) as total,
(select sum(total) from a) as totalall
from a
group by id
Using window functions with a distinct, it can be simply expressed like this:
select distinct id,
sum(Total) over(partition by id) total,
Sum(Total) over () total_all
from mytable
One way is to use OUTER APPLY. You could also set a variable to the sum of the table and call that variable.
select a.id, sum(a.total) as total, b.Grand as GrandTotal
from tablea a
outer apply
(select sum(total) as Grand from tablea) b
group by a.id
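The window-function variant can be tried on the question's four rows with a short sketch using Python's sqlite3 module (assuming SQLite 3.25+). The aliases id_total and total_all are hypothetical names, picked so that no alias collides with the total column.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE a (id TEXT, total INTEGER)")
conn.executemany(
    "INSERT INTO a VALUES (?, ?)",
    [("a", 1), ("a", 2), ("b", 1), ("b", 3)],
)

totals = conn.execute("""
    SELECT DISTINCT id,
           SUM(total) OVER (PARTITION BY id) AS id_total,   -- per-id sum
           SUM(total) OVER ()                AS total_all   -- whole-table sum
    FROM a
    ORDER BY id
""").fetchall()

print(totals)
```

The window functions are computed per input row first, and DISTINCT then collapses the repeated rows, which gives one row per id.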

Selecting the maximum count from a GROUP BY operation

Forgive my SQL knowledge, but I have a Person table with following data -
Id Name
---- ------
1 a
2 b
3 b
4 c
and I want the following result -
Name Total
------ ------
b 2
If I use the GROUP BY query -
SELECT Name, Total=COUNT(*) FROM Person GROUP BY Name
It gives me -
Name Total
------ ------
a 1
b 2
c 1
But I want only the one with maximum count. How do I get that?
If you want ties included:
SELECT top (1) with ties Name, COUNT(*) AS [count]
FROM Person
GROUP BY Name
ORDER BY count(*) DESC
The easiest way to do this in SQL Server would be to use the top syntax:
SELECT TOP 1 Name, COUNT(*) AS Total
FROM Person
GROUP BY Name
ORDER BY 2 DESC
The answer is:
WITH MaxGroup AS (
SELECT Name, COUNT(*) AS Total
FROM Person
GROUP BY Name)
SELECT Name, Total
FROM MaxGroup
WHERE Total = (SELECT MAX(Total) FROM MaxGroup)
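Try this (nesting aggregates as in max(COUNT(*)) is accepted by Oracle, but SQL Server rejects it, so there the inner query has to become a derived table or CTE):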
SELECT Name, COUNT(*)
FROM Person
GROUP BY Name
having COUNT(*)=( SELECT max(COUNT(*)) FROM Person GROUP BY Name) ;
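Another option that keeps ties and does not rely on TOP or nested aggregates is to rank the grouped counts. The sketch below uses Python's sqlite3 module (assuming SQLite 3.25+ for RANK()). The helper name most_common_names is hypothetical, and the second call adds an invented tie to show the behaviour.

```python
import sqlite3

def most_common_names(names):
    """Return (name, total) for every name that shares the highest count."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE person (id INTEGER PRIMARY KEY, name TEXT)")
    conn.executemany("INSERT INTO person (name) VALUES (?)", [(n,) for n in names])
    return conn.execute("""
        SELECT name, total
        FROM (SELECT name, total,
                     RANK() OVER (ORDER BY total DESC) AS rnk
              FROM (SELECT name, COUNT(*) AS total
                    FROM person
                    GROUP BY name))
        WHERE rnk = 1
        ORDER BY name
    """).fetchall()

print(most_common_names(["a", "b", "b", "c"]))       # the question's data
print(most_common_names(["a", "b", "b", "c", "c"]))  # invented tie between b and c
```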

SUM of grouped COUNT in SQL Query

I have a table with 2 fields:
ID Name
-- -------
1 Alpha
2 Beta
3 Beta
4 Beta
5 Charlie
6 Charlie
I want to group them by name, with 'count', and a row 'SUM'
Name Count
------- -----
Alpha 1
Beta 3
Charlie 2
SUM 6
How would I write a query to add SUM row below the table?
SELECT name, COUNT(name) AS count
FROM table
GROUP BY name
UNION ALL
SELECT 'SUM' name, COUNT(name)
FROM table
OUTPUT:
name count
-------------------------------------------------- -----------
alpha 1
beta 3
Charlie 2
SUM 6
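One caveat with the UNION ALL approach: without an ORDER BY, the position of the SUM row is not guaranteed. A minimal sketch of one way to pin it to the bottom, using Python's sqlite3 module; the table name people and the is_total sort column are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE people (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany(
    "INSERT INTO people (name) VALUES (?)",
    [("Alpha",), ("Beta",), ("Beta",), ("Beta",), ("Charlie",), ("Charlie",)],
)

with_sum_row = conn.execute("""
    SELECT name, cnt
    FROM (SELECT name, COUNT(name) AS cnt, 0 AS is_total
          FROM people
          GROUP BY name
          UNION ALL
          SELECT 'SUM', COUNT(name), 1   -- total row, flagged so it sorts last
          FROM people)
    ORDER BY is_total, name
""").fetchall()

for row in with_sum_row:
    print(row)
```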
SELECT name, COUNT(name) AS count, SUM(COUNT(name)) OVER() AS total_count
FROM Table GROUP BY name
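Note that this form adds the total as an extra column on every row rather than as a separate SUM row. A small sketch with Python's sqlite3 module (assuming SQLite 3.25+; the table name people is hypothetical) shows the shape of the result:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE people (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany(
    "INSERT INTO people (name) VALUES (?)",
    [("Alpha",), ("Beta",), ("Beta",), ("Beta",), ("Charlie",), ("Charlie",)],
)

# COUNT(name) is evaluated per group first; the window SUM then adds up
# those per-group counts across all groups.
per_name = conn.execute("""
    SELECT name,
           COUNT(name)             AS cnt,
           SUM(COUNT(name)) OVER () AS total_count
    FROM people
    GROUP BY name
    ORDER BY name
""").fetchall()

for row in per_name:
    print(row)
```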
Since you did not specify which RDBMS you are using, here is a generic version:
SELECT Name, COUNT(1) as Cnt
FROM Table1
GROUP BY Name
UNION ALL
SELECT 'SUM' Name, COUNT(1)
FROM Table1
That said, I would recommend that the total be added by your presentation layer, not by the database.
This is a more SQL Server-specific version that summarizes the data using ROLLUP:
SELECT CASE WHEN (GROUPING(NAME) = 1) THEN 'SUM'
ELSE ISNULL(NAME, 'UNKNOWN')
END Name,
COUNT(1) as Cnt
FROM Table1
GROUP BY NAME
WITH ROLLUP
Try this:
SELECT ISNULL(Name,'SUM'), count(*) as Count
FROM table_name
Group By Name
WITH ROLLUP
All of the solutions here are great, but they cannot necessarily be used on old MySQL servers (at least in my case). So you can use sub-queries instead (I think it is less complicated).
select sum(t1.cnt) from
(SELECT column, COUNT(column) as cnt
FROM
table
GROUP BY
column
HAVING
COUNT(column) > 1) as t1 ;
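Because of the HAVING clause, this query sums only the groups that occur more than once, so it is not the same as the overall total. A sketch against the question's data, using Python's sqlite3 module (the table name people is hypothetical), makes the difference visible:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE people (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany(
    "INSERT INTO people (name) VALUES (?)",
    [("Alpha",), ("Beta",), ("Beta",), ("Beta",), ("Charlie",), ("Charlie",)],
)

# Sum of the counts for names that appear more than once (Beta and Charlie).
repeated_total = conn.execute("""
    SELECT SUM(t1.cnt)
    FROM (SELECT name, COUNT(name) AS cnt
          FROM people
          GROUP BY name
          HAVING COUNT(name) > 1) AS t1
""").fetchone()[0]

# Same query without HAVING: the overall total, including Alpha.
overall_total = conn.execute("""
    SELECT SUM(t1.cnt)
    FROM (SELECT name, COUNT(name) AS cnt
          FROM people
          GROUP BY name) AS t1
""").fetchone()[0]

print(repeated_total, overall_total)
```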
Run it as below (SQL Server and MySQL require an alias on the derived table, so one is added here):
Select sum(Count)
from (select Name,
count(Name) as Count
from YourTable
group by Name) t; -- 6
The way I interpreted this question is needing the subtotal value of each group of answers. Subtotaling turns out to be very easy, using PARTITION:
SUM(COUNT(0)) OVER (PARTITION BY [Grouping]) AS [MY_TOTAL]
This is what my full SQL call looks like:
SELECT MAX(GroupName) [name], MAX(AUX2)[type],
COUNT(0) [count], SUM(COUNT(0)) OVER(PARTITION BY GroupId) AS [total]
FROM [MyView]
WHERE Active=1 AND Type='APP' AND Completed=1
AND [Date] BETWEEN '01/01/2014' AND GETDATE()
AND Id = '5b9xxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx' AND GroupId IS NOT NULL
GROUP BY AUX2, GroupId
The data returned from this looks like:
name type count total
Training Group 2 Cancelation 1 52
Training Group 2 Completed 41 52
Training Group 2 No Show 6 52
Training Group 2 Rescheduled 4 52
Training Group 3 NULL 4 10535
Training Group 3 Cancelation 857 10535
Training Group 3 Completed 7923 10535
Training Group 3 No Show 292 10535
Training Group 3 Rescheduled 1459 10535
Training Group 4 Cancelation 2 27
Training Group 4 Completed 24 27
Training Group 4 Rescheduled 1 27
You can use UNION ALL to append the total row:
select Name, count(*) as Count from yourTable group by Name
union all
select 'SUM' as Name, count(*) as Count from yourTable
For SQL Server you can try this one:
SELECT ISNULL([NAME],'SUM'),Count([NAME]) AS COUNT
FROM TABLENAME
GROUP BY [NAME] WITH CUBE
with cttmp
as
(
select Col_Name, count(*) as ctn from tab_name group by Col_Name having count(Col_Name)>1
)
select sum(ctn) from cttmp
You can use ROLLUP
select nvl(name, 'SUM'), count(*)
from table
group by rollup(name)
Use it like this (UNION ALL skips the duplicate-elimination step that plain UNION performs):
select Name, count(Name) as Count from YourTable
group by Name
union all
Select 'SUM' , COUNT(Name) from YourTable
I am using SQL server and the following should work for you:
select cast(name as varchar(16)) as 'Name', count(name) as 'Count'
from Table1
group by Name
union all
select 'Sum:', count(name)
from Table1
I also needed having count(*) > 1, so I wrote my own query after referring to some of the above queries.
SYNTAX:
select sum(count) from (select count(`table_name`.`id`) as `count` from `table_name` where {some condition} group by {some_column} having count(`table_name`.`id`) > 1) as `tmp`;
Example:
select sum(count) from (select count(`table_name`.`id`) as `count` from `table_name` where `table_name`.`name` IS NOT NULL and `table_name`.`name` != '' group by `table_name`.`name` having count(`table_name`.`id`) > 1) as `tmp`;
You can try GROUP BY on name and count the ids in each group.
SELECT name, count(id) as COUNT FROM table group by name
After the query, run the statement below to get the number of rows it returned (on SQL Server this is the number of groups, not the sum of the counts):
select @@ROWCOUNT
select sum(s) from
(select count(Col_name) as s from Tab_name group by Col_name having count(*)>1)c