Getting count of records by Group by at 2 levels - sql

I have data as below:
custid date gender cust_type
25309 29/10/2018 M A
25310 09/11/2018 F B
25311 10/11/2018 O C
25312 18/09/2018 F D
25313 18/09/2018 O A
25314 18/09/2018 M B
25315 18/09/2018 F C
25316 18/09/2018 F D
25317 19/09/2018 M D
25318 19/09/2018 O B
My final output should be as below:
quarter total A M F O TOTAL B M F O TOTAL C M F O TOTAL D M F O
2 1 1 3 1 1 1 2 0 1 1 3 1 2 0
I need the count of distinct customers for each cust_type.
Within each cust_type, I need the count of M, F, O (gender).
The output should be calculated for each quarter based on the date column. I tried a few suggestions from this site, but they give me the wrong count when using SUM within a CASE statement.
At present I am running separate queries for each quarter to get the cust_type count and the gender count, as below:
SELECT CUST_TYPE,COUNT(DISTINCT CUST_ID)
FROM TOT_POP_DET
WHERE DATE < (TO_DATE('01-JAN-2020','DD-MON-YYYY'))
GROUP BY CUST_TYPE
SELECT GENDER,COUNT(DISTINCT CUST_ID)
FROM TOT_POP_DET
WHERE DATE < (TO_DATE('01-JAN-2020','DD-MON-YYYY'))
AND CUST_TYPE='OTHER'
GROUP BY GENDER
Seeking help here.

Group by DATEPART(QUARTER, [Date Column]), then use SUM(CASE ...) for each combination you need to count.
Below is an example using your sample data (SQL Server syntax).
Select
DATEPART(QUARTER, date) [Quarter],
SUM(CASE WHEN cust_type = 'A' THEN 1 ELSE 0 END) [Total A],
SUM(CASE WHEN cust_type = 'A' AND gender = 'M' THEN 1 ELSE 0 END) [A - Male],
SUM(CASE WHEN cust_type = 'A' AND gender = 'F' THEN 1 ELSE 0 END) [A - Female],
SUM(CASE WHEN cust_type = 'A' AND gender = 'O' THEN 1 ELSE 0 END) [A - Other],
SUM(CASE WHEN cust_type = 'B' THEN 1 ELSE 0 END) [Total B],
SUM(CASE WHEN cust_type = 'B' AND gender = 'M' THEN 1 ELSE 0 END) [B - Male],
SUM(CASE WHEN cust_type = 'B' AND gender = 'F' THEN 1 ELSE 0 END) [B - Female],
SUM(CASE WHEN cust_type = 'B' AND gender = 'O' THEN 1 ELSE 0 END) [B - Other],
SUM(CASE WHEN cust_type = 'C' THEN 1 ELSE 0 END) [Total C],
SUM(CASE WHEN cust_type = 'C' AND gender = 'M' THEN 1 ELSE 0 END) [C - Male],
SUM(CASE WHEN cust_type = 'C' AND gender = 'F' THEN 1 ELSE 0 END) [C - Female],
SUM(CASE WHEN cust_type = 'C' AND gender = 'O' THEN 1 ELSE 0 END) [C - Other],
SUM(CASE WHEN cust_type = 'D' THEN 1 ELSE 0 END) [Total D],
SUM(CASE WHEN cust_type = 'D' AND gender = 'M' THEN 1 ELSE 0 END) [D - Male],
SUM(CASE WHEN cust_type = 'D' AND gender = 'F' THEN 1 ELSE 0 END) [D - Female],
SUM(CASE WHEN cust_type = 'D' AND gender = 'O' THEN 1 ELSE 0 END) [D - Other]
FROM
TOT_POP_DET
WHERE
SNAPSHOT_DATE < '20200101' -- SQL Server has no TO_DATE; use a date literal instead
GROUP BY
DATEPART(QUARTER, date)
Oracle version below:
-- 'Q' is Oracle's quarter format element; DATE is a reserved word, so the column name is quoted
Select
TO_NUMBER(TO_CHAR("date", 'Q')) "Quarter",
SUM(CASE WHEN cust_type = 'A' THEN 1 ELSE 0 END) "Total A",
SUM(CASE WHEN cust_type = 'A' AND gender = 'M' THEN 1 ELSE 0 END) "A - Male",
SUM(CASE WHEN cust_type = 'A' AND gender = 'F' THEN 1 ELSE 0 END) "A - Female",
SUM(CASE WHEN cust_type = 'A' AND gender = 'O' THEN 1 ELSE 0 END) "A - Other",
SUM(CASE WHEN cust_type = 'B' THEN 1 ELSE 0 END) "Total B",
SUM(CASE WHEN cust_type = 'B' AND gender = 'M' THEN 1 ELSE 0 END) "B - Male",
SUM(CASE WHEN cust_type = 'B' AND gender = 'F' THEN 1 ELSE 0 END) "B - Female",
SUM(CASE WHEN cust_type = 'B' AND gender = 'O' THEN 1 ELSE 0 END) "B - Other",
SUM(CASE WHEN cust_type = 'C' THEN 1 ELSE 0 END) "Total C",
SUM(CASE WHEN cust_type = 'C' AND gender = 'M' THEN 1 ELSE 0 END) "C - Male",
SUM(CASE WHEN cust_type = 'C' AND gender = 'F' THEN 1 ELSE 0 END) "C - Female",
SUM(CASE WHEN cust_type = 'C' AND gender = 'O' THEN 1 ELSE 0 END) "C - Other",
SUM(CASE WHEN cust_type = 'D' THEN 1 ELSE 0 END) "Total D",
SUM(CASE WHEN cust_type = 'D' AND gender = 'M' THEN 1 ELSE 0 END) "D - Male",
SUM(CASE WHEN cust_type = 'D' AND gender = 'F' THEN 1 ELSE 0 END) "D - Female",
SUM(CASE WHEN cust_type = 'D' AND gender = 'O' THEN 1 ELSE 0 END) "D - Other"
FROM
TOT_POP_DET
WHERE
SNAPSHOT_DATE < (TO_DATE('01-JAN-2020','DD-MON-YYYY'))
GROUP BY
TO_NUMBER(TO_CHAR("date", 'Q'))
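To sanity-check the SUM(CASE ...) approach against the sample data without a SQL Server or Oracle instance, here is a minimal sketch using Python's built-in sqlite3 module. SQLite is a stand-in here, not the asker's database: it has neither DATEPART nor TO_CHAR, so the quarter is derived from the month, the dates are rewritten as ISO strings, and the column name snapshot_date is an assumption. Only a few of the count columns are shown.

```python
import sqlite3

# Sample rows from the question; dates rewritten as ISO strings so SQLite can parse them.
rows = [
    (25309, "2018-10-29", "M", "A"),
    (25310, "2018-11-09", "F", "B"),
    (25311, "2018-11-10", "O", "C"),
    (25312, "2018-09-18", "F", "D"),
    (25313, "2018-09-18", "O", "A"),
    (25314, "2018-09-18", "M", "B"),
    (25315, "2018-09-18", "F", "C"),
    (25316, "2018-09-18", "F", "D"),
    (25317, "2018-09-19", "M", "D"),
    (25318, "2018-09-19", "O", "B"),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE tot_pop_det "
    "(custid INTEGER, snapshot_date TEXT, gender TEXT, cust_type TEXT)"
)
conn.executemany("INSERT INTO tot_pop_det VALUES (?, ?, ?, ?)", rows)

# Quarter = (month + 2) / 3 with integer division (assumption: calendar quarters).
query = """
SELECT (CAST(strftime('%m', snapshot_date) AS INTEGER) + 2) / 3 AS quarter,
       SUM(CASE WHEN cust_type = 'A' THEN 1 ELSE 0 END) AS total_a,
       SUM(CASE WHEN cust_type = 'A' AND gender = 'M' THEN 1 ELSE 0 END) AS a_male,
       SUM(CASE WHEN cust_type = 'D' THEN 1 ELSE 0 END) AS total_d,
       SUM(CASE WHEN cust_type = 'D' AND gender = 'F' THEN 1 ELSE 0 END) AS d_female
FROM tot_pop_det
WHERE snapshot_date < '2020-01-01'
GROUP BY quarter
ORDER BY quarter
"""
result = conn.execute(query).fetchall()
print(result)  # [(3, 1, 0, 3, 2), (4, 1, 1, 0, 0)]
```

Note that every CASE needs its closing END; leaving it out is a syntax error in all three engines.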

You didn't specify the logic for the QUARTER column so I just assumed you meant quarter of the calendar year (Jan-Mar = Q1, Apr-Jun = Q2, etc.). If you group the information by quarter and customer type, you can then pivot that information to get it in the format that you want.
Setup
create table cust_table as
select 25309 as custid , to_date('29/10/2018','dd/mm/yyyy') as date_val, 'M' as gender, 'A' as cust_type from dual union all
select 25310 as custid , to_date('09/11/2018','dd/mm/yyyy') as date_val, 'F' as gender, 'B' as cust_type from dual union all
select 25311 as custid , to_date('10/11/2018','dd/mm/yyyy') as date_val, 'O' as gender, 'C' as cust_type from dual union all
select 25312 as custid , to_date('18/09/2018','dd/mm/yyyy') as date_val, 'F' as gender, 'D' as cust_type from dual union all
select 25313 as custid , to_date('18/09/2018','dd/mm/yyyy') as date_val, 'O' as gender, 'A' as cust_type from dual union all
select 25314 as custid , to_date('18/09/2018','dd/mm/yyyy') as date_val, 'M' as gender, 'B' as cust_type from dual union all
select 25315 as custid , to_date('18/09/2018','dd/mm/yyyy') as date_val, 'F' as gender, 'C' as cust_type from dual union all
select 25316 as custid , to_date('18/09/2018','dd/mm/yyyy') as date_val, 'F' as gender, 'D' as cust_type from dual union all
select 25317 as custid , to_date('19/09/2018','dd/mm/yyyy') as date_val, 'M' as gender, 'D' as cust_type from dual union all
select 25318 as custid , to_date('19/09/2018','dd/mm/yyyy') as date_val, 'O' as gender, 'B' as cust_type from dual;
Query
SELECT year,
quarter,
NVL (a_m_total, 0)
+ NVL (a_f_total, 0)
+ NVL (a_o_total, 0)
+ NVL (b_m_total, 0)
+ NVL (b_f_total, 0)
+ NVL (b_o_total, 0)
+ NVL (c_m_total, 0)
+ NVL (c_f_total, 0)
+ NVL (c_o_total, 0)
+ NVL (d_m_total, 0)
+ NVL (d_f_total, 0)
+ NVL (d_o_total, 0) AS quarter_total,
NVL (a_m_total, 0) + NVL (a_f_total, 0) + NVL (a_o_total, 0) AS a_total,
NVL (a_m_total, 0) AS a_m_total,
NVL (a_f_total, 0) AS a_f_total,
NVL (a_o_total, 0) AS a_o_total,
NVL (b_m_total, 0) + NVL (b_f_total, 0) + NVL (b_o_total, 0) AS b_total,
NVL (b_m_total, 0) AS b_m_total,
NVL (b_f_total, 0) AS b_f_total,
NVL (b_o_total, 0) AS b_o_total,
NVL (c_m_total, 0) + NVL (c_f_total, 0) + NVL (c_o_total, 0) AS c_total,
NVL (c_m_total, 0) AS c_m_total,
NVL (c_f_total, 0) AS c_f_total,
NVL (c_o_total, 0) AS c_o_total,
NVL (d_m_total, 0) + NVL (d_f_total, 0) + NVL (d_o_total, 0) AS d_total,
NVL (d_m_total, 0) AS d_m_total,
NVL (d_f_total, 0) AS d_f_total,
NVL (d_o_total, 0) AS d_o_total
FROM ( SELECT EXTRACT (YEAR FROM date_val) AS year,
CEIL (EXTRACT (MONTH FROM date_val) / 3) AS quarter,
cust_type,
SUM (CASE gender WHEN 'M' THEN 1 ELSE 0 END) AS total_m,
SUM (CASE gender WHEN 'F' THEN 1 ELSE 0 END) AS total_f,
SUM (CASE gender WHEN 'O' THEN 1 ELSE 0 END) AS total_o
FROM cust_table
GROUP BY EXTRACT (YEAR FROM date_val), CEIL (EXTRACT (MONTH FROM date_val) / 3), cust_type)
PIVOT (MAX (total_m) AS m_total, MAX (total_f) AS f_total, MAX (total_o) AS o_total
FOR cust_type
IN ('A' AS a, 'B' AS b, 'C' AS c, 'D' AS d));
Result
YEAR QUARTER QUARTER_TOTAL A_TOTAL A_M_TOTAL A_F_TOTAL A_O_TOTAL B_TOTAL B_M_TOTAL B_F_TOTAL B_O_TOTAL C_TOTAL C_M_TOTAL C_F_TOTAL C_O_TOTAL D_TOTAL D_M_TOTAL D_F_TOTAL D_O_TOTAL
_______ __________ ________________ __________ ____________ ____________ ____________ __________ ____________ ____________ ____________ __________ ____________ ____________ ____________ __________ ____________ ____________ ____________
2018 3 7 1 0 0 1 2 1 0 1 1 0 1 0 3 1 2 0
2018 4 3 1 1 0 0 1 0 1 0 1 0 0 1 0 0 0 0

I used "list" as the source table, then counted distinct "custid" values per gender with a PIVOT clause, assuming the quarter is formatted as "YYYY-Q". In the final query I summed the gender counts for each cust_type from the pivoted result set named "cg" to get the output you need.
with
list (custid, "date", gender, cust_type) as (
select 25309, to_date('29/10/2018', 'dd/mm/yyyy'), 'M', 'A' from dual union all
select 25310, to_date('09/11/2018', 'dd/mm/yyyy'), 'F', 'B' from dual union all
select 25311, to_date('10/11/2018', 'dd/mm/yyyy'), 'O', 'C' from dual union all
select 25312, to_date('18/09/2018', 'dd/mm/yyyy'), 'F', 'D' from dual union all
select 25313, to_date('18/09/2018', 'dd/mm/yyyy'), 'O', 'A' from dual union all
select 25314, to_date('18/09/2018', 'dd/mm/yyyy'), 'M', 'B' from dual union all
select 25315, to_date('18/09/2018', 'dd/mm/yyyy'), 'F', 'C' from dual union all
select 25316, to_date('18/09/2018', 'dd/mm/yyyy'), 'F', 'D' from dual union all
select 25317, to_date('19/09/2018', 'dd/mm/yyyy'), 'M', 'D' from dual union all
select 25318, to_date('19/09/2018', 'dd/mm/yyyy'), 'O', 'B' from dual
)
,cg as (
select * from (select custid, to_char("date", 'YYYY-Q') as quarter, cust_type, gender from list)
pivot (count(distinct custid) as gender for gender in('F' F, 'M' M, 'O' O))
)
select
quarter,
----------
sum(case when cust_type = 'A' then nvl(f_gender,0)+nvl(m_gender,0)+nvl(o_gender,0) else 0 end) as a_total,
sum(case when cust_type = 'A' then f_gender else 0 end) as a_f,
sum(case when cust_type = 'A' then m_gender else 0 end) as a_m,
sum(case when cust_type = 'A' then o_gender else 0 end) as a_o,
----------
sum(case when cust_type = 'B' then nvl(f_gender,0)+nvl(m_gender,0)+nvl(o_gender,0) else 0 end) as b_total,
sum(case when cust_type = 'B' then f_gender else 0 end) as b_f,
sum(case when cust_type = 'B' then m_gender else 0 end) as b_m,
sum(case when cust_type = 'B' then o_gender else 0 end) as b_o,
----------
sum(case when cust_type = 'C' then nvl(f_gender,0)+nvl(m_gender,0)+nvl(o_gender,0) else 0 end) as c_total,
sum(case when cust_type = 'C' then f_gender else 0 end) as c_f,
sum(case when cust_type = 'C' then m_gender else 0 end) as c_m,
sum(case when cust_type = 'C' then o_gender else 0 end) as c_o,
----------
sum(case when cust_type = 'D' then nvl(f_gender,0)+nvl(m_gender,0)+nvl(o_gender,0) else 0 end) as d_total,
sum(case when cust_type = 'D' then f_gender else 0 end) as d_f,
sum(case when cust_type = 'D' then m_gender else 0 end) as d_m,
sum(case when cust_type = 'D' then o_gender else 0 end) as d_o
from cg
group by quarter;
If your source table is named TOT_POP_DET and has the columns "custid", "date", "gender" and "cust_type", you can skip my data-preparation query "list" and start with "cg". Both "list" and "cg" are CTEs (common table expressions); a subquery would work as well.
That is, you can start with:
with cg as (
select * from (select custid, to_char("date", 'YYYY-Q') as quarter, cust_type, gender from TOT_POP_DET)
pivot (count(distinct custid) as gender for gender in('F' F, 'M' M, 'O' O))
)
,...
As you can see, "cg" holds the distinct counts by quarter and cust_type.
The PIVOT syntax is as follows:
SELECT * FROM (SELECT column1, column2, .. FROM table(s) WHERE condition(s))
PIVOT (aggregate_function(column2) FOR column2 IN ( expr1, expr2, ... expr_n))
ORDER BY expression [ ASC | DESC ];
First, I selected the needed columns from the source table, then used count(distinct custid) as the aggregate function, gender as the FOR column, and the list of genders in the IN clause. This gives the first result set, by quarter and cust_type:
QUARTER CUST_TYPE F_GENDER M_GENDER O_GENDER
2018-3 A 0 0 1
2018-3 B 0 1 1
2018-3 C 1 0 0
2018-3 D 2 1 0
2018-4 A 0 1 0
2018-4 B 1 0 0
2018-4 C 0 0 1
Then I grouped this result set and summed the counts with one CASE per cust_type to transpose the data into the final result set:
QUARTER A_TOTAL A_F A_M A_O B_TOTAL B_F B_M B_O C_TOTAL C_F C_M C_O D_TOTAL D_F D_M D_O
2018-4 1 0 1 0 1 1 0 0 1 0 0 1 0 0 0 0
2018-3 1 0 0 1 2 0 1 1 1 1 0 0 3 2 1 0
Additionally, if you change "YYYY-Q" to "YYYY" in the to_char call inside "cg" and run the query, you get the result by year, as below:
QUARTER A_TOTAL A_F A_M A_O B_TOTAL B_F B_M B_O C_TOTAL C_F C_M C_O D_TOTAL D_F D_M D_O
2018 2 0 1 1 3 1 1 1 2 1 0 1 3 2 1 0
I tried to explain it; I hope it helps.
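One possible reason SUM inside CASE gave the asker wrong counts is that a customer can appear in more than one row; that is a guess, since the real table is not shown. The sketch below uses Python's sqlite3 module as a stand-in for Oracle, with a hypothetical table cust_rows in which customer 25309 is duplicated, and compares row counting with COUNT(DISTINCT CASE ... THEN custid END), which counts customers the way the pivot does.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE cust_rows (custid INTEGER, quarter TEXT, gender TEXT, cust_type TEXT)"
)
# Hypothetical rows: customer 25309 appears twice in the same quarter.
conn.executemany(
    "INSERT INTO cust_rows VALUES (?, ?, ?, ?)",
    [
        (25309, "2018-4", "M", "A"),
        (25309, "2018-4", "M", "A"),
        (25310, "2018-4", "F", "B"),
    ],
)

query = """
SELECT quarter,
       -- counts rows, so the duplicate is counted twice
       SUM(CASE WHEN cust_type = 'A' AND gender = 'M' THEN 1 ELSE 0 END) AS a_m_rows,
       -- counts distinct customers; the CASE yields NULL for non-matching rows
       COUNT(DISTINCT CASE WHEN cust_type = 'A' AND gender = 'M' THEN custid END) AS a_m_customers
FROM cust_rows
GROUP BY quarter
"""
row = conn.execute(query).fetchone()
print(row)  # ('2018-4', 2, 1)
```

If custid is unique in the source table, both expressions return the same number.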

I hope this answer helps. You can find it in this fiddle.
The SQL version is Oracle 11g. The CTE derives two columns from your data, one for the year and one for the quarter. After that, you perform a COUNT grouped by the columns you want.
with cte AS(SELECT custid, gender, cust_type, EXTRACT(YEAR FROM cust_date) AS cust_year,
CASE WHEN EXTRACT(MONTH FROM cust_date)<4 THEN 1
WHEN EXTRACT(MONTH FROM cust_date)<7 THEN 2
WHEN EXTRACT(MONTH FROM cust_date)<10 THEN 3
ELSE 4
END AS quarter
FROM your_Table)
SELECT gender, cust_type, cust_year, quarter, COUNT(custid)
FROM cte
GROUP BY gender, cust_type, cust_year, quarter
ORDER BY cust_year, quarter, cust_type
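The CASE expression in the CTE maps months to calendar quarters with three thresholds. As a quick check, the small Python sketch below mirrors that CASE and compares it with the CEIL(month / 3) formula used in an earlier answer; the function names are made up for illustration.

```python
import math


def quarter_by_case(month: int) -> int:
    """Mirror the CASE expression: months 1-3 -> 1, 4-6 -> 2, 7-9 -> 3, else 4."""
    if month < 4:
        return 1
    if month < 7:
        return 2
    if month < 10:
        return 3
    return 4


def quarter_by_ceil(month: int) -> int:
    """Mirror CEIL(EXTRACT(MONTH FROM d) / 3)."""
    return math.ceil(month / 3)


quarters = {m: quarter_by_case(m) for m in range(1, 13)}
# Both formulations agree for every month of the year.
agree = all(quarter_by_ceil(m) == q for m, q in quarters.items())
print(quarters, agree)
```

Either form works; the CEIL version is shorter, while the CASE version is easier to adapt to a fiscal year that does not start in January.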

Related

select from table where a=1 and a=2

I have four requirements (four separate SELECTs would be fine) where I need to find, from a single table, whether a customer has:
a. apple and samsung
b. no_apple and no_samsung
c. apple and no_samsung
d. no_apple and samsung
My table looks like this:
cust_name device
john apple
john samsung
dave apple
tim samsung
patrick nokia
rick nokia
The expected output is:
a:- output ( both apple and samsung)
count(*)
1
b:-output (no_apple and no_samsung)
count(*)
2
c:-output (apple and no_samsung)
count(*)
1
d:-output (no_apple and samsung)
count(*)
1
You can do it all in a single query using conditional aggregation:
SELECT COUNT(CASE WHEN num_apple > 0 AND num_samsung > 0 THEN 1 END)
AS apple_and_samsung,
COUNT(CASE WHEN num_apple = 0 AND num_samsung > 0 THEN 1 END)
AS no_apple_and_samsung,
COUNT(CASE WHEN num_apple > 0 AND num_samsung = 0 THEN 1 END)
AS apple_and_no_samsung,
COUNT(CASE WHEN num_apple = 0 AND num_samsung = 0 THEN 1 END)
AS no_apple_and_no_samsung
FROM (
SELECT cust_name,
COUNT(CASE device WHEN 'apple' THEN 1 END) AS num_apple,
COUNT(CASE device WHEN 'samsung' THEN 1 END) AS num_samsung
FROM table_name
GROUP BY cust_name
)
Which, for the sample data:
CREATE TABLE table_name (cust_name, device) AS
SELECT 'john', 'apple' FROM DUAL UNION ALL
SELECT 'john', 'samsung' FROM DUAL UNION ALL
SELECT 'dave', 'apple' FROM DUAL UNION ALL
SELECT 'tim', 'samsung' FROM DUAL UNION ALL
SELECT 'patrick', 'nokia' FROM DUAL UNION ALL
SELECT 'rick', 'nokia' FROM DUAL;
Outputs:
APPLE_AND_SAMSUNG NO_APPLE_AND_SAMSUNG APPLE_AND_NO_SAMSUNG NO_APPLE_AND_NO_SAMSUNG
1 1 1 2
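The two-level conditional aggregation above uses only standard SQL, so it can be tried outside Oracle as well. Here is a minimal sketch with Python's sqlite3 module (SQLite is a swap-in, and the derived-table alias per_customer is added for illustration), loaded with the sample rows from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table_name (cust_name TEXT, device TEXT)")
conn.executemany(
    "INSERT INTO table_name VALUES (?, ?)",
    [
        ("john", "apple"),
        ("john", "samsung"),
        ("dave", "apple"),
        ("tim", "samsung"),
        ("patrick", "nokia"),
        ("rick", "nokia"),
    ],
)

# Inner query: one row per customer with per-device row counts.
# Outer query: count the customers that fall into each combination.
query = """
SELECT COUNT(CASE WHEN num_apple > 0 AND num_samsung > 0 THEN 1 END) AS apple_and_samsung,
       COUNT(CASE WHEN num_apple = 0 AND num_samsung > 0 THEN 1 END) AS no_apple_and_samsung,
       COUNT(CASE WHEN num_apple > 0 AND num_samsung = 0 THEN 1 END) AS apple_and_no_samsung,
       COUNT(CASE WHEN num_apple = 0 AND num_samsung = 0 THEN 1 END) AS no_apple_and_no_samsung
FROM (
    SELECT cust_name,
           COUNT(CASE device WHEN 'apple' THEN 1 END) AS num_apple,
           COUNT(CASE device WHEN 'samsung' THEN 1 END) AS num_samsung
    FROM table_name
    GROUP BY cust_name
) AS per_customer
"""
counts = conn.execute(query).fetchone()
print(counts)  # (1, 1, 1, 2)
```

Because the inner query tests for counts greater than zero, a customer with several rows for the same device is still counted once.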
You can also do it by PIVOTing twice:
SELECT *
FROM table_name
PIVOT (
COUNT(DISTINCT device) FOR device IN (
'apple' AS apple,
'samsung' AS samsung
)
)
PIVOT (
COUNT(cust_name) FOR (apple, samsung) IN (
(1, 1) AS apple_and_samsung,
(1, 0) AS apple_and_no_samsung,
(0, 1) AS no_apple_and_samsung,
(0, 0) AS no_apple_and_no_samsung
)
)
db<>fiddle here
You can also add a suitable HAVING clause for each case after grouping by the cust_name column, such as:
a)
SELECT SUM(COUNT(DISTINCT cust_name))
FROM t
GROUP BY cust_name
HAVING MAX(CASE WHEN device ='apple' THEN 1 ELSE 0 END)
* MAX(CASE WHEN device ='samsung' THEN 1 ELSE 0 END) = 1;
b)
SELECT SUM(COUNT(DISTINCT cust_name))
FROM t
GROUP BY cust_name
HAVING MIN(CASE WHEN device ='apple' THEN 0 ELSE 1 END)
* MIN(CASE WHEN device ='samsung' THEN 0 ELSE 1 END) = 1;
c)
SELECT SUM(COUNT(DISTINCT cust_name))
FROM t
GROUP BY cust_name
HAVING MIN(CASE WHEN device ='samsung' THEN 0 ELSE 1 END)
* MAX(CASE WHEN device ='apple' THEN 1 ELSE 0 END) = 1;
d)
SELECT SUM(COUNT(DISTINCT cust_name))
FROM t
GROUP BY cust_name
HAVING MIN(CASE WHEN device ='apple' THEN 0 ELSE 1 END)
* MAX(CASE WHEN device ='samsung' THEN 1 ELSE 0 END) = 1;
Demo

how do we have count of a specific values for multiple columns with table having a unique column

If I have a table like :
u_id A B C D
----------------------------------
jud 1 1 0 1
bud 0 0 1 0
cud 1 1 0 1
nud 0 0 1 0
dud 1 0 0 1
aud 0 1 1 0
fud 1 0 1 1
which sql is useful to get output like:
count 0 count 1
-----------------------
A 3 4
B 4 3
C 3 4
D 3 4
Doesn't matter row or columns just need count of a specific value count for multiple columns in a table.
Instead of 0's and 1's it can be specific string values as well as 'yes' or 'no'
Thank you
Use UNION ALL and aggregation. Assuming that the only possible values in the columns are 0 and 1:
SELECT 'A' col, COUNT(*) - SUM(A) count0, SUM(A) count1 FROM mytable
UNION ALL SELECT 'B', COUNT(*) - SUM(B), SUM(B) FROM mytable
UNION ALL SELECT 'C', COUNT(*) - SUM(C), SUM(C) FROM mytable
UNION ALL SELECT 'D', COUNT(*) - SUM(D), SUM(D) FROM mytable
Demo on DB Fiddle:
| col | count0 | count1 |
| --- | ------ | ------ |
| A | 3 | 4 |
| B | 4 | 3 |
| C | 3 | 4 |
| D | 3 | 4 |
If values other than 0/1 are possible, for example 'yes'/'no', change the SELECTs accordingly:
SELECT
'A' col,
SUM(CASE WHEN A = 'no' THEN 1 ELSE 0 END) count_no,
SUM(CASE WHEN A = 'yes' THEN 1 ELSE 0 END) count_yes
FROM mytable
GROUP BY col
UNION ALL SELECT
'B' col,
SUM(CASE WHEN B = 'no' THEN 1 ELSE 0 END),
SUM(CASE WHEN B = 'yes' THEN 1 ELSE 0 END)
FROM mytable
GROUP BY col
UNION ALL SELECT
'C' col,
SUM(CASE WHEN C = 'no' THEN 1 ELSE 0 END),
SUM(CASE WHEN C = 'yes' THEN 1 ELSE 0 END)
FROM mytable
GROUP BY col
UNION ALL SELECT
'D' col,
SUM(CASE WHEN D = 'no' THEN 1 ELSE 0 END),
SUM(CASE WHEN D = 'yes' THEN 1 ELSE 0 END)
FROM mytable
GROUP BY col
If you are okay with a single row, you can do:
select sum(a), sum(1-a), sum(b), sum(1-b), sum(c), sum(1-c), sum(d), sum(1-d)
from t;
The advantage of this approach is that t is read only once. This is even more true if it is a complex view.
With that in mind, you can unpivot this result:
select v.x,
(case when v.x = 'a' then a_0 end) as a_0,
(case when v.x = 'a' then a_1 end) as a_1,
(case when v.x = 'b' then b_0 end) as b_0,
(case when v.x = 'b' then b_1 end) as b_1,
(case when v.x = 'c' then c_0 end) as c_0,
(case when v.x = 'c' then c_1 end) as c_1,
(case when v.x = 'd' then d_0 end) as d_0,
(case when v.x = 'd' then d_1 end) as d_1
from (select sum(a) as a_1, sum(1-a) as a_0,
sum(b) as b_1, sum(1-b) as b_0,
sum(c) as c_1, sum(1-c) as c_0,
sum(d) as d_1, sum(1-d) as d_0
from t
) s cross join
(values ('a'), ('b'), ('c'), ('d')) v(x) -- may require a subquery
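As a rough check of the single-scan idea, the sketch below runs the one-row query against the sample table using Python's sqlite3 module (a stand-in, since the question does not name a database) and then reshapes the row into one entry per column in Python instead of with a cross join. The variable names are illustrative only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (u_id TEXT, a INTEGER, b INTEGER, c INTEGER, d INTEGER)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?, ?, ?, ?)",
    [
        ("jud", 1, 1, 0, 1),
        ("bud", 0, 0, 1, 0),
        ("cud", 1, 1, 0, 1),
        ("nud", 0, 0, 1, 0),
        ("dud", 1, 0, 0, 1),
        ("aud", 0, 1, 1, 0),
        ("fud", 1, 0, 1, 1),
    ],
)

# One pass over the table: for each column, SUM(x) counts the 1s and SUM(1 - x) the 0s.
totals = conn.execute(
    "SELECT SUM(a), SUM(1 - a), SUM(b), SUM(1 - b), "
    "SUM(c), SUM(1 - c), SUM(d), SUM(1 - d) FROM t"
).fetchone()

# Reshape the single row into {column: (count_0, count_1)}.
summary = {
    name: (totals[2 * i + 1], totals[2 * i])
    for i, name in enumerate(["A", "B", "C", "D"])
}
print(summary)  # {'A': (3, 4), 'B': (4, 3), 'C': (3, 4), 'D': (3, 4)}
```

This only works when the columns hold 0 and 1 exclusively; for values such as 'yes'/'no', a SUM(CASE ...) per value is needed instead.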
You don't mention the database you're using, but in Oracle you can use DECODE and COUNT together to make this reasonably clean:
SELECT 'A' AS FIELD_NAME,
COUNT(DECODE(A, 0, 0, NULL)) AS ZERO_COUNT,
COUNT(DECODE(A, 0, NULL, A)) AS NON_ZERO_COUNT
FROM TEST_TABLE UNION ALL
SELECT 'B', COUNT(DECODE(B, 0, 0, NULL)),
COUNT(DECODE(B, 0, NULL, B))
FROM TEST_TABLE UNION ALL
SELECT 'C', COUNT(DECODE(C, 0, 0, NULL)),
COUNT(DECODE(C, 0, NULL, C))
FROM TEST_TABLE UNION ALL
SELECT 'D', COUNT(DECODE(D, 0, 0, NULL)),
COUNT(DECODE(D, 0, NULL, D))
FROM TEST_TABLE
dbfiddle here

Counting columns with a where clause

Is there a way to count the number of columns that have a particular value in each row in Hive?
I have data that looks like 'Input', and I want to count how many columns have the value 'a' and how many columns have the value 'b', to get output like 'Output'.
Is there a way to accomplish this with Hive query?
One method in Hive is:
select ( (case when cl_1 = 'a' then 1 else 0 end) +
(case when cl_2 = 'a' then 1 else 0 end) +
(case when cl_3 = 'a' then 1 else 0 end) +
(case when cl_4 = 'a' then 1 else 0 end) +
(case when cl_5 = 'a' then 1 else 0 end)
) as count_a,
( (case when cl_1 = 'b' then 1 else 0 end) +
(case when cl_2 = 'b' then 1 else 0 end) +
(case when cl_3 = 'b' then 1 else 0 end) +
(case when cl_4 = 'b' then 1 else 0 end) +
(case when cl_5 = 'b' then 1 else 0 end)
) as count_b
from t;
To get the total count, I would suggest using a subquery and adding count_a and count_b.
Use lateral view with explode on the data and do the aggregations on it.
select id
,sum(cast(col='a' as int)) as cnt_a
,sum(cast(col='b' as int)) as cnt_b
,sum(cast(col in ('a','b') as int)) as cnt_total
from tbl
lateral view explode(array(ci_1,ci_2,ci_3,ci_4,ci_5)) tbl as col
group by id
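The per-row CASE sum is plain SQL, so it can be tried without a Hive cluster. The sketch below uses Python's sqlite3 module as a stand-in; the rows are hypothetical because the question's input is not shown, the id column is assumed, and the CASE expressions are generated with a small helper to keep the example short. It also adds the total through a subquery, as suggested above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE t (id INTEGER, cl_1 TEXT, cl_2 TEXT, cl_3 TEXT, cl_4 TEXT, cl_5 TEXT)"
)
# Hypothetical rows; the question's actual input is not available.
conn.executemany(
    "INSERT INTO t VALUES (?, ?, ?, ?, ?, ?)",
    [
        (1, "a", "b", "a", "c", "b"),
        (2, "b", "b", "b", "a", "c"),
    ],
)


def count_expr(value: str) -> str:
    """Build the per-row sum of CASE flags for one target value."""
    return " + ".join(
        f"(CASE WHEN cl_{i} = '{value}' THEN 1 ELSE 0 END)" for i in range(1, 6)
    )


query = f"""
SELECT id, count_a, count_b, count_a + count_b AS count_total
FROM (SELECT id, {count_expr('a')} AS count_a, {count_expr('b')} AS count_b FROM t) AS per_row
ORDER BY id
"""
per_row = conn.execute(query).fetchall()
print(per_row)  # [(1, 2, 2, 4), (2, 1, 3, 4)]
```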

Sum data for many different results for same field

I am trying to find a better way to write this SQL Server 2008 code. It works and the data is accurate. The reason I ask is that I will be asked to do this for several other reports, and I want to reduce the amount of code to maintain going forward.
How can I count the Yes/No/- (dash) values in each field without writing an individual SUM for each one, as I have in the code below? Each table holds one month of detail data, which I sum in a CTE. I change the table name for each month and use UNION ALL to put the data together. Is there a better way to do this? This is a small sample of the code. Thanks for the help.
WITH H AS (
SELECT 'August' AS Month_Name
, SUM(CASE WHEN G.FFS = '-' THEN 1 ELSE 0 END) AS FFS_Dash
, SUM(CASE WHEN G.FFS = 'Yes' THEN 1 ELSE 0 END) AS FFS_Yes
, SUM(CASE WHEN G.FFS = 'No' THEN 1 ELSE 0 END) AS FFS_No
, SUM(CASE WHEN G.DNA = '-' THEN 1 ELSE 0 END) AS DNA_Dash
, SUM(CASE WHEN G.DNA = 'Yes' THEN 1 ELSE 0 END) AS DNA_Yes
, SUM(CASE WHEN G.DNA = 'No' THEN 1 ELSE 0 END) AS DNA_No
FROM table08 G )
, G AS (
SELECT 'July' AS Month_Name
, SUM(CASE WHEN G.FFS = '-' THEN 1 ELSE 0 END) AS FFS_Dash
, SUM(CASE WHEN G.FFS = 'Yes' THEN 1 ELSE 0 END) AS FFS_Yes
, SUM(CASE WHEN G.FFS = 'No' THEN 1 ELSE 0 END) AS FFS_No
, SUM(CASE WHEN G.DNA = '-' THEN 1 ELSE 0 END) AS DNA_Dash
, SUM(CASE WHEN G.DNA = 'Yes' THEN 1 ELSE 0 END) AS DNA_Yes
, SUM(CASE WHEN G.DNA = 'No' THEN 1 ELSE 0 END) AS DNA_No
FROM table07 G )
select * from H
UNION ALL
select * from G
How about:
SELECT Month_Name,
SUM(CASE WHEN gh.FFS = '-' THEN 1 ELSE 0 END) AS FFS_Dash,
SUM(CASE WHEN gh.FFS = 'Yes' THEN 1 ELSE 0 END) AS FFS_Yes,
SUM(CASE WHEN gh.FFS = 'No' THEN 1 ELSE 0 END) AS FFS_No,
SUM(CASE WHEN gh.DNA = '-' THEN 1 ELSE 0 END) AS DNA_Dash,
SUM(CASE WHEN gh.DNA = 'Yes' THEN 1 ELSE 0 END) AS DNA_Yes,
SUM(CASE WHEN gh.DNA = 'No' THEN 1 ELSE 0 END) AS DNA_No
FROM ((select 'July' as Month_Name, G.*
from table07 G
) union all
(select 'August', H.*
from table08 H
)
) gh
GROUP BY Month_Name;
However, having tables with the same structure is usually a sign of poor database design. You should have a single table with a column representing the month.
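To illustrate the single-table design, here is a minimal sketch using Python's sqlite3 module (SQLite rather than SQL Server 2008, purely so it runs anywhere). The table name monthly_detail, its month_name column, and the rows are all hypothetical. With the month stored as a column, one GROUP BY replaces the per-month CTEs and the UNION ALL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical combined table: one row per detail record, with the month as a column.
conn.execute("CREATE TABLE monthly_detail (month_name TEXT, ffs TEXT, dna TEXT)")
conn.executemany(
    "INSERT INTO monthly_detail VALUES (?, ?, ?)",
    [
        ("July", "Yes", "-"),
        ("July", "No", "Yes"),
        ("August", "Yes", "Yes"),
        ("August", "-", "No"),
        ("August", "Yes", "No"),
    ],
)

query = """
SELECT month_name,
       SUM(CASE WHEN ffs = '-'   THEN 1 ELSE 0 END) AS ffs_dash,
       SUM(CASE WHEN ffs = 'Yes' THEN 1 ELSE 0 END) AS ffs_yes,
       SUM(CASE WHEN ffs = 'No'  THEN 1 ELSE 0 END) AS ffs_no,
       SUM(CASE WHEN dna = '-'   THEN 1 ELSE 0 END) AS dna_dash,
       SUM(CASE WHEN dna = 'Yes' THEN 1 ELSE 0 END) AS dna_yes,
       SUM(CASE WHEN dna = 'No'  THEN 1 ELSE 0 END) AS dna_no
FROM monthly_detail
GROUP BY month_name
ORDER BY month_name
"""
by_month = conn.execute(query).fetchall()
print(by_month)  # [('August', 1, 2, 0, 0, 1, 2), ('July', 0, 1, 1, 1, 1, 0)]
```

Adding a new month then means inserting rows, not editing the query.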

Oracle SQL dividing two self defined columns

If I have the following two COUNT(CASE ...) expressions in a SELECT:
COUNT(CASE WHEN STATUS ='Færdig' THEN 1 END) as completed_callbacks,
COUNT(CASE WHEN SOLVED_SECONDS /60 /60 <= 2 THEN 1 END) as completed_within_2hours
and I want to divide the two results by each other, how can I achieve this?
This is my attempt, which failed:
CASE(completed_callbacks / completed_within_2hours * 100) as Percentage
I know this is a rather simple question, but I haven't been able to find the answer anywhere.
You have to create a derived table:
SELECT completed_callbacks / completed_within_2hours * 100
FROM (SELECT Count(CASE
WHEN status = 'Færdig' THEN 1
END) AS completed_callbacks,
Count(CASE
WHEN solved_seconds / 60 / 60 <= 2 THEN 1
END) AS completed_within_2hours
FROM yourtable
WHERE ...)
Try this:
with x as (
select 'Y' as completed, 'Y' as completed_fast from dual
union all
select 'Y' as completed, 'N' as completed_fast from dual
union all
select 'Y' as completed, 'Y' as completed_fast from dual
union all
select 'N' as completed, 'N' as completed_fast from dual
)
select
sum(case when completed='Y' then 1 else 0 end) as count_completed,
sum(case when completed='N' then 1 else 0 end) as count_not_completed,
sum(case when completed='Y' and completed_fast='Y' then 1 else 0 end) as count_completed_fast,
case when (sum(case when completed='Y' then 1 else 0 end) = 0) then 0 else
((sum(case when completed='Y' and completed_fast='Y' then 1 else 0 end) / sum(case when completed='Y' then 1 else 0 end))*100)
end pct_completed_fast
from x;
Results:
"COUNT_COMPLETED" "COUNT_NOT_COMPLETED" "COUNT_COMPLETED_FAST" "PCT_COMPLETED_FAST"
3 1 2 66.66666666666666666666666666666666666667
The trick is to use SUM rather than COUNT, along with a decode or CASE.
select
COUNT(CASE WHEN STATUS ='Færdig' THEN 1 END)
/
COUNT(CASE WHEN SOLVED_SECONDS /60 /60 <= 2 THEN 1 END)
* 100
as
Percentage
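Dividing two counts fails with a divide-by-zero error in Oracle whenever the denominator is 0. A common guard is NULLIF(denominator, 0), which turns the result into NULL instead. The sketch below shows the derived-table form with that guard using Python's sqlite3 module; SQLite is a stand-in for Oracle, the rows are hypothetical, and 100.0 and 3600.0 are used because SQLite (unlike Oracle) does integer division on integers. The division order follows the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE yourtable (status TEXT, solved_seconds INTEGER)")
# Hypothetical rows for illustration only.
conn.executemany(
    "INSERT INTO yourtable VALUES (?, ?)",
    [
        ("Færdig", 3600),
        ("Færdig", 7200),
        ("Færdig", 10800),
        ("Åben", None),
    ],
)

# Inner query computes both counts once; outer query divides them.
# NULLIF(x, 0) returns NULL when x is 0, so the division yields NULL, not an error.
query = """
SELECT completed_callbacks,
       completed_within_2hours,
       100.0 * completed_callbacks / NULLIF(completed_within_2hours, 0) AS percentage
FROM (
    SELECT COUNT(CASE WHEN status = 'Færdig' THEN 1 END) AS completed_callbacks,
           COUNT(CASE WHEN solved_seconds / 3600.0 <= 2 THEN 1 END) AS completed_within_2hours
    FROM yourtable
) AS counts
"""
with_rows = conn.execute(query).fetchone()

# With no rows at all, both counts are 0 and the percentage is NULL (None in Python).
conn.execute("DELETE FROM yourtable")
empty = conn.execute(query).fetchone()
print(with_rows, empty)  # (3, 2, 150.0) (0, 0, None)
```

If the intent is the share of completed callbacks that were solved within two hours, the numerator and denominator would need to be swapped.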