Grouping by one column and counting from multiple columns

I am trying to write a query for the scenario below.
Example:
ColumnA ColumnB
A B
B C
Result:
ColumnA countB countC
A 1 0
B 0 1

From your sample data, no aggregation is needed:
SELECT columnA,
(CASE WHEN columnB = 'B' THEN 1 ELSE 0 END) as countB,
(CASE WHEN columnB = 'C' THEN 1 ELSE 0 END) as countC
FROM t;

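This should work, provided the table really has separate ColumnB and ColumnC columns (the sample shows only ColumnB):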
Select
columnA,
count(ColumnB) as ColumnB,
count(ColumnC) as ColumnC
From Table
Group By
columnA

There is not enough sample data to confirm it, but judging also by the title of your question, you seem to be looking for conditional aggregation:
SELECT
columnA,
COUNT(CASE WHEN columnB = 'B' THEN 1 END) countB,
COUNT(CASE WHEN columnB = 'C' THEN 1 END) countC
FROM mytable
GROUP BY columnA
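The conditional aggregation above can be tried out without a database server. Here is a minimal sketch that uses Python's built-in sqlite3 module as a stand-in engine; the function name count_by_value is made up for illustration, and mytable is the table name from the answer.

```python
import sqlite3


def count_by_value(rows):
    """Group by columnA and count the 'B' and 'C' values found in columnB."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE mytable (columnA TEXT, columnB TEXT)")
    conn.executemany("INSERT INTO mytable VALUES (?, ?)", rows)
    # CASE without ELSE yields NULL, and COUNT skips NULLs,
    # so each COUNT only sees the rows that match its condition.
    result = conn.execute(
        """
        SELECT columnA,
               COUNT(CASE WHEN columnB = 'B' THEN 1 END) AS countB,
               COUNT(CASE WHEN columnB = 'C' THEN 1 END) AS countC
        FROM mytable
        GROUP BY columnA
        ORDER BY columnA
        """
    ).fetchall()
    conn.close()
    return result


print(count_by_value([("A", "B"), ("B", "C")]))
# [('A', 1, 0), ('B', 0, 1)]
```

With only one row per columnA value this matches the non-aggregated answer; the difference shows up as soon as a columnA value repeats.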

Assuming that your real scenario is larger than that, I think it is better to use a pivot:
DECLARE @table TABLE(ColumnA CHAR(1), ColumnB CHAR(1))
INSERT @table VALUES ('A', 'B')
INSERT @table VALUES ('B', 'C')
SELECT ColumnA, A AS CountA, B AS CountB, C AS CountC, D AS CountD....
FROM
(
SELECT ColumnA, ColumnB
FROM @table
) AS Source
PIVOT
(
Count(ColumnB)
FOR ColumnB IN ([A], [B], [C], [D].....)
) AS Pvt

Related

How to create a pivot table in PostgreSQL

I am looking to essentially create a pivot view using PostgreSQL, such that the table below:
Column A Column B
Happy Sad
Sad Happy
Happy Sad
becomes
Count Column A Column B
Happy 2 1
Sad 1 2
I've been able to use case/when operators far enough such that I can see the counts under independent columns,
SELECT
COUNT(CASE WHEN column1 = 'Happy' THEN 1 END) AS column1_Happy_count,
COUNT(CASE WHEN column1 = 'Sad' THEN 1 END) AS column1_Sad_count,
COUNT(CASE WHEN column2 = 'Happy' THEN 1 END) AS column2_Happy_count,
COUNT(CASE WHEN column2 = 'Sad' THEN 1 END) AS column2_Sad_count,
COUNT(CASE WHEN column3 = 'Happy' THEN 1 END) AS column3_Happy_count,
COUNT(CASE WHEN column3 = 'Sad' THEN 1 END) AS column3_Sad_count
FROM your_table;
but I am missing the step that stacks each pair of columns vertically.
I'm unable to use extensions such as tablefunc and crosstab.

Try this:
CREATE TABLE my_table (
column_a varchar(10),
column_b varchar(10)
);
INSERT INTO my_table (column_a, column_b)
VALUES ('Happy', 'Sad'),
('Sad', 'Happy'),
('Happy', 'Sad'),
('Good', 'Bad');
WITH DataSource (col, val) AS
(
SELECT 'a', column_a
FROM my_table
UNION ALL
SELECT 'b', column_b
FROM my_table
)
SELECT uniq.val AS "Count"
,MAX(case when counts.col = 'a' then counts end) AS "Column A"
,MAX(case when counts.col = 'b' then counts end) AS "Column B"
FROM
(
SELECT DISTINCT val
FROM DataSource
) uniq
INNER JOIN
(
SELECT col
,val
,COUNT(*) counts
FROM DataSource
GROUP BY col
,val
) counts
ON uniq.val = counts.val
GROUP BY uniq.val
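This gives you one row per distinct value, with its count in each column (NULL where the value does not occur in that column).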

You may aggregate for ColumnA, aggregate for ColumnB then do a full join as the following:
select coalesce(A.ColumnA ,B.ColumnB) as "Count",
A.cnt as "Column A",
B.cnt as "Column B"
from
(
select ColumnA, count(*) cnt
from tbl_name
group by ColumnA
) A
full join
(
select ColumnB, count(*) cnt
from tbl_name
group by ColumnB
) B
on A.ColumnA = B.ColumnB
If the distinct values in ColumnA are the same as the distinct values of ColumnB then you can use join instead of the full join.
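A compact variant of the same idea is to stack the two columns with UNION ALL and then count conditionally per value. The sketch below runs it through Python's built-in sqlite3 module rather than PostgreSQL, so treat it as an illustration only; pivot_counts is a made-up helper name, and the SQL sticks to constructs PostgreSQL also accepts.

```python
import sqlite3


def pivot_counts(rows):
    """Return (value, count in column_a, count in column_b) for every distinct value."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE my_table (column_a TEXT, column_b TEXT)")
    conn.executemany("INSERT INTO my_table VALUES (?, ?)", rows)
    result = conn.execute(
        """
        WITH data_source (col, val) AS (
            -- Stack both columns vertically and remember where each value came from.
            SELECT 'a', column_a FROM my_table
            UNION ALL
            SELECT 'b', column_b FROM my_table
        )
        SELECT val,
               COUNT(CASE WHEN col = 'a' THEN 1 END) AS column_a_count,
               COUNT(CASE WHEN col = 'b' THEN 1 END) AS column_b_count
        FROM data_source
        GROUP BY val
        ORDER BY val
        """
    ).fetchall()
    conn.close()
    return result


print(pivot_counts([("Happy", "Sad"), ("Sad", "Happy"), ("Happy", "Sad")]))
# [('Happy', 2, 1), ('Sad', 1, 2)]
```

Unlike the join-based versions, a value missing from one column shows up here with a count of 0 instead of NULL.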

In oracle SQL , how to count the no of records based on conditions

I have a table with the structure below:
Col2
A
A
B
B
E
E
I want the SQL query to output the following:
Internal 4
External 2
Total 6
Logic: if the value in Col2 is A or B, it should be counted as Internal; if it is E, it should be counted as External.

To map your column values use DECODE, simply providing the list of original and new values for the column.
select decode(col2,'A','Internal','B','Internal','E','External') col from tab
To calculate the total you do not need to rescan the whole table (which would roughly double the work); use GROUP BY ROLLUP, which calculates the Total:
with t as (
select decode(col2,'A','Internal','B','Internal','E','External') col from tab)
select nvl(col,'Total') col, count(*) cnt
from t
group by rollup (col)
Result
COL CNT
-------- ----------
External 2
Internal 4
Total 6

select sum(case when col2 in ('A', 'B') then 1 else 0 end) as internal,
sum(case when col2 = 'E' then 1 else 0 end) as external,
count(col2) as total
from your_table

select 'Internal' "summed up as"
,sum(case when Col2 in ('A', 'B') then 1
else 0
end) "sum"
from test
union
select 'External' "summed up as"
,sum(case when Col2 = 'E' then 1
else 0
end) "sum"
from test
union
select 'Total' "summed up as"
, count(Col2) "sum"
from test;

Try the query below, which uses UNION ALL and a custom grouping expression:
select case when col2 in ('A','B') then 'Internal' else 'External' end,
count(*) as result
from table_name
group by case when col2 in ('A','B') then 'Internal' else 'External' end
union all
select 'total', count(*) from table_name
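To check the UNION ALL version against the sample data, here is a small sketch run through Python's built-in sqlite3 module instead of Oracle (SQLite has no ROLLUP, so only this variant is shown). The helper name summarize is hypothetical, and the label is spelled 'Total' to match the question.

```python
import sqlite3


def summarize(values):
    """Count Internal (A, B) and External (everything else) rows, plus a grand total."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE table_name (col2 TEXT)")
    conn.executemany("INSERT INTO table_name VALUES (?)", [(v,) for v in values])
    rows = conn.execute(
        """
        SELECT CASE WHEN col2 IN ('A', 'B') THEN 'Internal' ELSE 'External' END,
               COUNT(*)
        FROM table_name
        GROUP BY CASE WHEN col2 IN ('A', 'B') THEN 'Internal' ELSE 'External' END
        UNION ALL
        SELECT 'Total', COUNT(*) FROM table_name
        """
    ).fetchall()
    conn.close()
    # Row order of a UNION ALL is not guaranteed, so return a dict keyed by label.
    return dict(rows)


print(summarize(["A", "A", "B", "B", "E", "E"]))
```

Note that the ELSE branch counts any value other than A or B as External, not just E.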

select sum(Col2Count) as Internal from (SELECT Col2 as Col2, count( Col2 ) as Col2Count
FROM tablename group by Col2) where Col2 in ('A','B');
This will give you result as :
Internal
4

Is it possible to use a subselect in a CASE expression with db2?

I'm trying to run the below query on a db2 database.
SELECT
columnA,
(CASE
WHEN columnA NOT IN (SELECT DISTINCT columnC FROM table_2) THEN 1 ELSE 0
END) AS columnB
FROM
table_1;
This is producing the error SQL0115 - Comparison operator NOT not valid. There are no errors if I replace the subselect with values that I know it will produce.
SELECT
columnA,
(CASE
WHEN columnA NOT IN ('ABC', 'EFG') THEN 1 ELSE 0
END) AS columnB
FROM
table_1;
Is it possible to use a subselect for a CASE expression with db2?

Your code should be fine, but the distinct is not necessary:
SELECT columnA,
(CASE WHEN columnA NOT IN (SELECT columnC FROM table_2) THEN 1 ELSE 0
END) AS columnB
FROM table_1;
I would, however, write it using NOT EXISTS:
SELECT columnA,
(CASE WHEN NOT EXISTS (SELECT 1 FROM table_2 t2 WHERE t2.columnC = t1.columnA)
THEN 1 ELSE 0
END) AS columnB
FROM table_1 t1;
NOT IN will not work as you expect when any of the values returned by the subquery are NULL.
You can also move the logic to the FROM clause:
SELECT columnA,
(CASE WHEN t2.columnC IS NULL THEN 1 ELSE 0
END) AS columnB
FROM table_1 t1 LEFT JOIN
(SELECT DISTINCT columnC
FROM table_2
) t2
ON t2.columnC = t1.columnA
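The NULL pitfall with NOT IN is easy to reproduce. The sketch below uses Python's built-in sqlite3 module rather than DB2, on the assumption that both follow the standard three-valued logic here; flag_missing is a made-up helper name, and the table and column names come from the question.

```python
import sqlite3


def flag_missing(table_1_values, table_2_values):
    """Return (columnA, NOT IN flag, NOT EXISTS flag) for every row of table_1."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE table_1 (columnA TEXT)")
    conn.execute("CREATE TABLE table_2 (columnC TEXT)")
    conn.executemany("INSERT INTO table_1 VALUES (?)", [(v,) for v in table_1_values])
    conn.executemany("INSERT INTO table_2 VALUES (?)", [(v,) for v in table_2_values])
    result = conn.execute(
        """
        SELECT t1.columnA,
               CASE WHEN t1.columnA NOT IN (SELECT columnC FROM table_2)
                    THEN 1 ELSE 0 END AS not_in_flag,
               CASE WHEN NOT EXISTS (SELECT 1 FROM table_2 t2
                                     WHERE t2.columnC = t1.columnA)
                    THEN 1 ELSE 0 END AS not_exists_flag
        FROM table_1 t1
        ORDER BY t1.columnA
        """
    ).fetchall()
    conn.close()
    return result


# Without NULLs both forms agree.
print(flag_missing(["ABC", "XYZ"], ["ABC"]))        # [('ABC', 0, 0), ('XYZ', 1, 1)]
# One NULL in table_2 makes NOT IN evaluate to unknown, so its flag falls to 0.
print(flag_missing(["ABC", "XYZ"], ["ABC", None]))  # [('ABC', 0, 0), ('XYZ', 0, 1)]
```

In the second call 'XYZ' is clearly absent from table_2, yet only the NOT EXISTS flag reports it.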

Group BY on Condition basis

I have data in following way....
ColumnA ColumnB
7675 22838
7675 24907
7675 NULL
I want the results in following way.....
ColumnA ColumnB
7675 2 (need total count for Not Null value)
7675 0 (need count 0 for NULL value)

SELECT ColumnA, COUNT(ColumnB) ColumnB
FROM YourTable
GROUP BY ColumnA
UNION ALL
SELECT ColumnA, 0
FROM YourTable
WHERE ColumnB IS NULL
GROUP BY ColumnA

You could introduce a calculated column indicating whether ColumnB is null or not and use it as a grouping criterion together with ColumnA:
SELECT
t.ColumnA,
ColumnB = COUNT(t.ColumnB)
FROM
dbo.YourTable AS t
CROSS APPLY
(SELECT CASE WHEN t.ColumnB IS NULL THEN 1 ELSE 0 END) AS x (SubGroup)
GROUP BY
t.ColumnA,
x.SubGroup
ORDER BY
t.ColumnA,
x.SubGroup
;
The COUNT(t.ColumnB) expression always returns 0 for the null subgroup, because COUNT ignores NULL values; for the corresponding non-null subgroup it returns the number of non-null entries.
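The subgroup idea does not depend on CROSS APPLY, which is specific to SQL Server. As a rough sketch, the same grouping can be written with a plain CASE expression in the GROUP BY; it is run here through Python's built-in sqlite3 module, and count_by_null_subgroup is a hypothetical helper name.

```python
import sqlite3


def count_by_null_subgroup(rows):
    """Count non-NULL ColumnB values per ColumnA, with NULL rows as their own subgroup."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE YourTable (ColumnA INTEGER, ColumnB INTEGER)")
    conn.executemany("INSERT INTO YourTable VALUES (?, ?)", rows)
    result = conn.execute(
        """
        SELECT ColumnA,
               COUNT(ColumnB) AS ColumnB   -- COUNT ignores NULLs, so the NULL subgroup gets 0
        FROM YourTable
        GROUP BY ColumnA,
                 CASE WHEN ColumnB IS NULL THEN 1 ELSE 0 END
        ORDER BY ColumnA,
                 CASE WHEN ColumnB IS NULL THEN 1 ELSE 0 END
        """
    ).fetchall()
    conn.close()
    return result


print(count_by_null_subgroup([(7675, 22838), (7675, 24907), (7675, None)]))
# [(7675, 2), (7675, 0)]
```

A ColumnA value with no NULL rows produces a single line, since its null subgroup does not exist.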

select columnA,
count(columnB) as non_null_count,
sum(columnB is null) as null_count
from your_table
group by ColumnA

You could easily do this with a COUNT and a SUM, which may be faster when there are a lot of rows than selecting all of the rows twice with a UNION:
SELECT columna, columnb, SUM(mycount)
FROM
( SELECT *, COUNT(columnb) as mycount
FROM test
GROUP BY columnb
)t
GROUP BY mycount
ORDER BY CASE WHEN mycount = 0 THEN 1 ELSE 2 END DESC;

Cumulating value of current row + sum of previous rows

How would you transform a column in a table from this:
ColumnA ColumnB
2 a
3 b
4 c
5 d
1 a
to this:
ColumnA ColumnB
3 a
6(=3+3) b
10(=4+3+3) c
15(=5+4+3+3) d
I'm interested to see esp. what method you would pick.

Like this:
;WITH cte
AS
(
SELECT ColumnB, SUM(ColumnA) asum
FROM #t
gROUP BY ColumnB
), cteRanked AS
(
SELECT asum, ColumnB, ROW_NUMBER() OVER(ORDER BY ColumnB) rownum
FROM cte
)
SELECT (SELECT SUM(asum) FROM cteRanked c2 WHERE c2.rownum <= c1.rownum),
ColumnB
FROM cteRanked c1;
This should give you:
ColumnA ColumnB
3 a
6 b
10 c
15 d

I'd generally avoid trying to do so, but the following matches what you've asked for:
declare @T table (ColumnA int,ColumnB char(1))
insert into @T(ColumnA,ColumnB) values
(2 , 'a'),
(3 , 'b'),
(4 , 'c'),
(5 , 'd'),
(1, 'a')
;With Bs as (
select distinct ColumnB from @T
)
select
SUM(t.ColumnA),b.ColumnB
from
Bs b
inner join
@T t
on
b.ColumnB >= t.ColumnB
group by
b.ColumnB
Result:
ColumnB
----------- -------
3 a
6 b
10 c
15 d
For small data sets, this will be fine. But for larger data sets, note that the last row of the table relies on obtaining the SUM over the entire contents of the original table.
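To make the cost concrete, here is a rough sketch of the same self-join run through Python's built-in sqlite3 module (not SQL Server), which also reports how many joined rows feed the aggregation. The helper name running_total_by_join and the table name t are made up for illustration.

```python
import sqlite3


def running_total_by_join(rows):
    """Return (running totals per ColumnB, number of rows produced by the join)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE t (ColumnA INTEGER, ColumnB TEXT)")
    conn.executemany("INSERT INTO t VALUES (?, ?)", rows)
    # Every distinct ColumnB value is paired with all rows at or before it.
    join_sql = """
        FROM (SELECT DISTINCT ColumnB FROM t) AS b
        INNER JOIN t ON b.ColumnB >= t.ColumnB
    """
    totals = conn.execute(
        "SELECT SUM(t.ColumnA), b.ColumnB"
        + join_sql
        + "GROUP BY b.ColumnB ORDER BY b.ColumnB"
    ).fetchall()
    joined_rows = conn.execute("SELECT COUNT(*)" + join_sql).fetchone()[0]
    conn.close()
    return totals, joined_rows


totals, joined_rows = running_total_by_join(
    [(2, "a"), (3, "b"), (4, "c"), (5, "d"), (1, "a")]
)
print(totals)       # [(3, 'a'), (6, 'b'), (10, 'c'), (15, 'd')]
print(joined_rows)  # 14
```

Five input rows already expand to 14 joined rows (2 + 3 + 4 + 5), and that number grows roughly with the square of the number of distinct ColumnB values.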

Try the below script,
DECLARE @T TABLE(ColumnA INT, ColumnB VARCHAR(50));
INSERT INTO @T VALUES
(2, 'a'),
(3, 'b'),
(4, 'c'),
(5, 'd'),
(1, 'a');
SELECT SUM(ColumnA) OVER(ORDER BY ColumnB) AS ColumnA,ColumnB
FROM ( SELECT SUM(ColumnA) AS ColumnA,ColumnB
FROM @T GROUP BY ColumnB )T

Not sure if this is optimal, but how about:
SELECT x.A + COALESCE(SUM(y.A),0) ColumnA, x.ColumnB
FROM
(
SELECT SUM(ColumnA) A, ColumnB
FROM myTable
GROUP BY ColumnB
) x
LEFT OUTER JOIN
(
SELECT SUM(ColumnA) A, ColumnB
FROM myTable
GROUP BY ColumnB
) y ON y.ColumnB < x.ColumnB
GROUP BY x.ColumnB, x.A

create table #T
(
ID int primary key,
ColumnA int,
ColumnB char(1)
);
insert into #T
select row_number() over(order by ColumnB),
sum(ColumnA) as ColumnA,
ColumnB
from YourTable
group by ColumnB;
with C as
(
select ID,
ColumnA,
ColumnB
from #T
where ID = 1
union all
select T.ID,
T.ColumnA + C.ColumnA,
T.ColumnB
from #T as T
inner join C
on T.ID = C.ID + 1
)
select ColumnA,
ColumnB
from C
option (maxrecursion 0);
drop table #T;

Using SQL Server?
Say you have a table with three columns C_1, C_2, C_3, ordered by C_1.
Simply use OVER (ORDER BY C_1) to add a column with the running sum of C_3:
Select C_1, C_2, C_3, Sum(C_3) Over (Order By C_1)
If you want the row number too, do it the same way:
Select Row_Number() Over (Order By C_1), C_1, C_2, C_3, Sum(C_3) Over (Order By C_1)

If you are using SQL Server 2012 or greater then this will produce the required result.
DECLARE @t TABLE(
ColumnA int,
ColumnB varchar(50)
);
INSERT INTO @t VALUES
(2,'a'),
(3,'b'),
(4,'c'),
(5,'d'),
(1,'a');
SELECT
SUM(ColumnA) OVER (ORDER BY ColumnB ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS ColumnA,
ColumnB
FROM (
SELECT
ColumnB,
SUM(ColumnA) AS ColumnA
FROM @t
GROUP BY ColumnB
) DVTBL
ORDER BY ColumnB

DECLARE @t TABLE(ColumnA INT, ColumnB VARCHAR(50));
INSERT INTO @t VALUES
(2, 'a'),
(3 , 'b'),
(4 , 'c'),
(5 , 'd'),
(1 , 'a');
;WITH cte
AS
(
SELECT ColumnB, sum(ColumnA) value,ROW_NUMBER() OVER(ORDER BY ColumnB) sr_no FROM @t group by ColumnB
)
SELECT ColumnB
,SUM(value) OVER ( ORDER BY ColumnB ROWS BETWEEN UNBOUNDED PRECEDING AND 0 PRECEDING)
FROM cte c1;

The best solution (simplest and quickest) is to use an OVER(ORDER BY) clause.
I will explain my problem and the solution I found.
I have a table of annual transactions with the following columns:
Yearx INT
NoSeq INT
Amount DECIMAL(10,2)
Balance DECIMAL(10,2)
The first three columns have values; the Balance column is empty.
Problem
How do I fill in the Balance values, given that the opening balance on 1 January is 5000€?
Example
NoSeq Amount Balance
----- -------- ---------
1 120.00+ 5120.00+ <= 5000 + 120
2 16.00- 5104.00+ <= 5000 + 120 - 16
3 3000.00- 2104.00+ <= 5000 + 120 - 16 - 3000
4 640.00+ 2744.00+ <= 5000 + 120 - 16 - 3000 + 640
Solution (based on Abdul Rasheed's answer)
WITH
t AS
(
SELECT NoSeq
,Amount
FROM payements
WHERE Yearx = 2021
)
SELECT NoSeq
,Amount
,1179.18 + SUM(Amount) OVER(ORDER BY NoSeq
ROWS BETWEEN UNBOUNDED PRECEDING
AND CURRENT ROW
) AS Balance
FROM t
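Since the frame PostgreSQL uses by default when ORDER BY is present (RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) gives the same result as the ROWS frame above as long as NoSeq is unique, the previous SELECT can be reduced to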
WITH
t AS
(
SELECT NoSeq
,Amount
FROM payements
WHERE Yearx = 2021
)
SELECT NoSeq
,Amount
,1179.18 + SUM(Amount) OVER(ORDER BY NoSeq) as balance
FROM t
The first part (the WITH clause) defines the table to which OVER(ORDER BY) is applied in the final SELECT.
The second part computes the running sum using the temporary table t.
In my case, the WITH clause is not necessary and the SELECT can ultimately be reduced to the following SQL command
SELECT NoSeq
,Amount
,1179.18 + SUM(Amount) OVER(ORDER BY NoSeq) as balance
FROM payements
WHERE Yearx = 2021
I use this last SQL command in my VB.Net - PostgreSQL application.
To compute more than one year, knowing the Balance value on 1 January 2010, I use the following SQL command
SELECT Yearx
,NoSeq
,Amount
,-279.34 + SUM(Amount) OVER(ORDER BY Yearx,NoSeq) as balance
FROM payements
WHERE Yearx BETWEEN 2010 AND 2021
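The explicit ROWS frame and the shorter OVER(ORDER BY ...) form only agree while the ordering key is unique. The sketch below illustrates this with Python's built-in sqlite3 module instead of PostgreSQL, which assumes a SQLite build with window functions (3.25 or later). The helper name running_balance is made up, and amounts are stored as integers to keep the arithmetic exact.

```python
import sqlite3


def running_balance(opening, rows):
    """Return (NoSeq, balance with explicit ROWS frame, balance with default frame)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE payements (NoSeq INTEGER, Amount INTEGER)")
    conn.executemany("INSERT INTO payements VALUES (?, ?)", rows)
    result = conn.execute(
        """
        SELECT NoSeq,
               ? + SUM(Amount) OVER (ORDER BY NoSeq
                                     ROWS BETWEEN UNBOUNDED PRECEDING
                                              AND CURRENT ROW),
               ? + SUM(Amount) OVER (ORDER BY NoSeq)
        FROM payements
        ORDER BY NoSeq
        """,
        (opening, opening),
    ).fetchall()
    conn.close()
    return result


# Unique NoSeq values: both frames give the same running balance.
print(running_balance(5000, [(1, 120), (2, -16), (3, -3000), (4, 640)]))
# [(1, 5120, 5120), (2, 5104, 5104), (3, 2104, 2104), (4, 2744, 2744)]

# Duplicate NoSeq values: the default frame adds all tied rows at once.
print([row[2] for row in running_balance(0, [(1, 10), (2, 20), (2, 30), (3, 40)])])
# [10, 60, 60, 100]
```

So when the ordering columns can repeat, it is safer to keep the ROWS frame explicit or to add a tie-breaking column to the ORDER BY.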

You can do in this way also:
WITH grpAllData
AS
(
SELECT ColumnB, SUM(ColumnA) grpValue
FROM table_Name
gROUP BY ColumnB
)
SELECT g.ColumnB, sum(grpValue) OVER(ORDER BY ColumnB) desireValue
FROM grpAllData g
order by ColumnB
In the above query, we first aggregate all values in the same group, then apply a window function to that result in the final SELECT.

SELECT g.ColumnB as "ColumnB",
SUM(g.group_sum) over (ORDER BY g.ColumnB ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) as "ColumnA"
FROM (
SELECT SUM(ColumnA) as group_sum,
ColumnB
FROM cand
GROUP BY ColumnB) g
ORDER BY g.ColumnB
This groups by ColumnB with a SUM of ColumnA, then applies a window function to the grouped sums to generate the cumulative sum. The ORDER BY belongs inside the OVER clause, because the row order of a subquery is not guaranteed to carry over to the window.

That was my question too, and I used the answers here. With more research I found another solution which is more optimized and easier, and also more fun! This solution is based on window functions. Here it is:
--- creating table and inserting values of the question
CREATE TABLE #tmp ( ColumnA INT , ColumnB VARCHAR(1))
INSERT INTO #tmp
VALUES (2,'a'),(3,'b'),(4,'c'),(5,'d'),(1,'a')
---- my solution
SELECT
SUM(ColumnA) OVER (ORDER BY ColumnB ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) ColumnA
,ColumnB
FROM
(
SELECT SUM(ColumnA) ColumnA,ColumnB FROM #tmp GROUP BY ColumnB
) X
And the result is:
ColumnA ColumnB
----------- -------
3 a
6 b
10 c
15 d
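For anyone without SQL Server at hand, the same group-then-window pattern can be sketched with Python's built-in sqlite3 module (assuming a SQLite build with window functions, 3.25 or later). The helper name cumulative_by_group and the table name tmp are made up for illustration.

```python
import sqlite3


def cumulative_by_group(rows):
    """Sum ColumnA per ColumnB, then accumulate those sums in ColumnB order."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE tmp (ColumnA INTEGER, ColumnB TEXT)")
    conn.executemany("INSERT INTO tmp VALUES (?, ?)", rows)
    result = conn.execute(
        """
        SELECT SUM(ColumnA) OVER (ORDER BY ColumnB
                                  ROWS BETWEEN UNBOUNDED PRECEDING
                                           AND CURRENT ROW) AS ColumnA,
               ColumnB
        FROM (
            -- Step 1: collapse duplicate ColumnB values into one row each.
            SELECT SUM(ColumnA) AS ColumnA, ColumnB
            FROM tmp
            GROUP BY ColumnB
        ) AS x
        ORDER BY ColumnB
        """
    ).fetchall()
    conn.close()
    return result


print(cumulative_by_group([(2, "a"), (3, "b"), (4, "c"), (5, "d"), (1, "a")]))
# [(3, 'a'), (6, 'b'), (10, 'c'), (15, 'd')]
```

The inner GROUP BY is what merges the two 'a' rows (2 + 1) before the running sum starts.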

This computes a cumulative sum of one column within groups defined by other columns.
See the SQL below:
SELECT a.product,
a.product_group,
a.fiscal_year,
Sum(a.quantity) OVER ( partition BY a.fiscal_year, a.product_group ORDER BY a.posting_date rows UNBOUNDED PRECEDING) AS quantity
FROM report a
order by a.fiscal_year DESC

You can use below simple select statement for the same
SELECT COLUMN_A, COLUMN_B,
(SELECT SUM(COLUMN_B) FROM #TBL T2 WHERE T2.ID <= T1.ID) as SumofPreviousRow FROM #TBL T1;

We Keep Coding
