SQL aggregate multiple values in a column then Pivot


I've got a table with a particular column of products. Let's say product A, B, and C.
This table also has a date column.
I'd like to build a pivot table by month, with one column for each combination of products that a user has.
So:
Just A
Just B
Just C
A and B
A and C
B and C
A, B and C.
I can do it without the combination values as follows:
Select * from
(Select product_type, date_month, person_id
From products_table) As src
PIVOT (
Count(person_id)
For product_type in
(
[A]
,[B]
,[C]
)
) As pivot_table;
So my question is, how do I build combinations of these values and then add them to the pivot? Do I need to build the combination columns first and then add them to the pivot somehow?
DATE_MONTH | A | B | C | A_B | A_C | B_C | A_B_C
01-01-2020 | 30 | 75 | 10 | 105 | 40 | 85| 115

I think you're looking for something like this. First it determines the distinct product types for each person_id. Then it aggregates the product types using STRING_AGG to create the groupings. Then it uses conditional aggregation to count the person_id's in each category by month.
Using STRING_AGG (SQL Server 2017+):
with
dist_prod_typ_cte(person_id, product_type) as (
select distinct person_id, product_type
from products_table),
prod_grp_cte(person_id, prod_grp) as (
select person_id, string_agg(product_type, '_') within group (order by product_type) p_type_grp
from dist_prod_typ_cte
group by person_id)
select pt.date_month,
count(case when pgc.prod_grp='A' then pt.person_id else null end) A,
count(case when pgc.prod_grp='B' then pt.person_id else null end) B,
count(case when pgc.prod_grp='C' then pt.person_id else null end) C,
count(case when pgc.prod_grp='A_B' then pt.person_id else null end) A_B,
count(case when pgc.prod_grp='A_C' then pt.person_id else null end) A_C,
count(case when pgc.prod_grp='B_C' then pt.person_id else null end) B_C,
count(case when pgc.prod_grp='A_B_C' then pt.person_id else null end) A_B_C
from products_table pt
join prod_grp_cte pgc on pt.person_id=pgc.person_id
group by pt.date_month;
Using STUFF and FOR XML (SQL Server 2016 and prior):
with
dist_prod_typ_cte(person_id, product_type) as (
select distinct person_id, product_type
from products_table),
prod_grp_cte(person_id, prod_grp) as (
select dptc.person_id, stuff((select '_' + cast(dptc_in.product_type as varchar(18))
from dist_prod_typ_cte dptc_in
where dptc.person_id = dptc_in.person_id
order by dptc_in.product_type
for xml path('')), 1, 1, '')
from dist_prod_typ_cte dptc
group by dptc.person_id)
select pt.date_month,
count(case when pgc.prod_grp='A' then pt.person_id else null end) A,
count(case when pgc.prod_grp='B' then pt.person_id else null end) B,
count(case when pgc.prod_grp='C' then pt.person_id else null end) C,
count(case when pgc.prod_grp='A_B' then pt.person_id else null end) A_B,
count(case when pgc.prod_grp='A_C' then pt.person_id else null end) A_C,
count(case when pgc.prod_grp='B_C' then pt.person_id else null end) B_C,
count(case when pgc.prod_grp='A_B_C' then pt.person_id else null end) A_B_C
from products_table pt
join prod_grp_cte pgc on pt.person_id=pgc.person_id
group by pt.date_month;
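
The same idea can be tried out without SQL Server. The following is only a sketch under stated assumptions: it runs on SQLite through Python's sqlite3 module, the sample rows are invented, and because SQLite does not guarantee the concatenation order of its string aggregate, the group label is built from per-product flags instead of STRING_AGG (a swapped-in technique, not the answer's exact method). It also counts distinct person_ids per month rather than rows, which is an assumption about the intended result.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Invented sample data; table and column names follow the question.
conn.executescript("""
CREATE TABLE products_table (person_id INTEGER, product_type TEXT, date_month TEXT);
INSERT INTO products_table VALUES
  (1, 'A', '2020-01-01'),
  (2, 'A', '2020-01-01'),
  (2, 'B', '2020-01-01'),
  (3, 'C', '2020-01-01'),
  (4, 'A', '2020-02-01'),
  (4, 'B', '2020-02-01'),
  (4, 'C', '2020-02-01');
""")

query = """
WITH flags AS (
  -- One row per person with a 0/1 flag for each product.
  SELECT person_id,
         MAX(CASE WHEN product_type = 'A' THEN 1 ELSE 0 END) AS has_a,
         MAX(CASE WHEN product_type = 'B' THEN 1 ELSE 0 END) AS has_b,
         MAX(CASE WHEN product_type = 'C' THEN 1 ELSE 0 END) AS has_c
  FROM products_table
  GROUP BY person_id
),
prod_grp AS (
  -- Build a label such as 'A_B'; SUBSTR(..., 2) drops the leading underscore.
  SELECT person_id,
         SUBSTR(CASE WHEN has_a = 1 THEN '_A' ELSE '' END ||
                CASE WHEN has_b = 1 THEN '_B' ELSE '' END ||
                CASE WHEN has_c = 1 THEN '_C' ELSE '' END, 2) AS prod_grp
  FROM flags
)
SELECT pt.date_month,
       COUNT(DISTINCT CASE WHEN pg.prod_grp = 'A'     THEN pt.person_id END) AS A,
       COUNT(DISTINCT CASE WHEN pg.prod_grp = 'B'     THEN pt.person_id END) AS B,
       COUNT(DISTINCT CASE WHEN pg.prod_grp = 'C'     THEN pt.person_id END) AS C,
       COUNT(DISTINCT CASE WHEN pg.prod_grp = 'A_B'   THEN pt.person_id END) AS A_B,
       COUNT(DISTINCT CASE WHEN pg.prod_grp = 'A_C'   THEN pt.person_id END) AS A_C,
       COUNT(DISTINCT CASE WHEN pg.prod_grp = 'B_C'   THEN pt.person_id END) AS B_C,
       COUNT(DISTINCT CASE WHEN pg.prod_grp = 'A_B_C' THEN pt.person_id END) AS A_B_C
FROM products_table pt
JOIN prod_grp pg ON pg.person_id = pt.person_id
GROUP BY pt.date_month
ORDER BY pt.date_month
"""

rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

In this sample, person 2 holds both A and B, so that person is counted once under A_B and not under A or B.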

Related

T-SQL: How do I flatten out a table like this? [duplicate]

I have a table I loaded that looks like this:
CUSTID VALUETYPE COST
1 A 123
1 B 456
1 C 789
2 B 222
And I need to flatten it out in the same table or insert into a new one to look like this:
CUSTID A B C
1 123 456 789
2 0 222 0
Each row has an identity column not shown.
What would this cursor look like?
Thank you.

I don't see that you need to sum the columns:
select
custid,
max(case when valuetype = 'A' then cost else 0 end) A,
max(case when valuetype = 'B' then cost else 0 end) B,
max(case when valuetype = 'C' then cost else 0 end) C
from tablename
group by custid

Use a query, such as conditional aggregation:
select custid,
sum(case when valuetype = 'A' then cost end) as a,
sum(case when valuetype = 'B' then cost end) as b,
sum(case when valuetype = 'C' then cost end) as c
from t
group by custid;
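
Without an ELSE branch, the query above returns NULL for a value type a customer does not have. A minimal sketch of one way to get 0 instead, assuming SQLite via Python's sqlite3 module and the sample rows from the question (the id column stands in for the identity column the question mentions):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Sample rows from the question; "id" plays the role of the identity column.
conn.executescript("""
CREATE TABLE t (id INTEGER PRIMARY KEY, custid INTEGER, valuetype TEXT, cost INTEGER);
INSERT INTO t (custid, valuetype, cost) VALUES
  (1, 'A', 123), (1, 'B', 456), (1, 'C', 789), (2, 'B', 222);
""")

rows = conn.execute("""
SELECT custid,
       -- SUM over no matching rows is NULL, so COALESCE turns it into 0.
       COALESCE(SUM(CASE WHEN valuetype = 'A' THEN cost END), 0) AS a,
       COALESCE(SUM(CASE WHEN valuetype = 'B' THEN cost END), 0) AS b,
       COALESCE(SUM(CASE WHEN valuetype = 'C' THEN cost END), 0) AS c
FROM t
GROUP BY custid
ORDER BY custid
""").fetchall()
print(rows)
```

Because the query groups only by custid, the identity column does not affect the result.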

Use CASE WHEN:
select custid , sum(case when valuetype='A' then cost else 0 end) A,
sum(case when valuetype='B' then cost else 0 end) B
,sum(case when valuetype='C' then cost else 0 end) C
from t group by custid

You could make use of a PIVOT
SELECT
CUSTID
,ISNULL(p.A,0) AS A
,ISNULL(p.B,0) AS B
,ISNULL(p.C,0) AS C
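FROM (SELECT CUSTID, VALUETYPE, COST FROM t) src -- only these columns, so the identity column does not create extra groups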
PIVOT (
SUM(COST) FOR VALUETYPE IN ([A],[B],[C])) p

If you don't mind NULL values in the Results
Select *
From YourTable
Pivot (sum(Cost) for ValueType in ([A],[B],[C])) pvt
Returns
CUSTID A B C
1 123 456 789
2 NULL 222 NULL
Otherwise, You Can Eliminate NULL Values
Select *
From (Select * From YourTable
Union All
Select A.CustID ,B.VALUETYPE,0
From (Select Distinct CustID from YourTable) A
Cross Join (Select Distinct VALUETYPE from YourTable) B
) src
Pivot (sum(Cost) for ValueType in ([A],[B],[C])) pvt
Returns
CUSTID A B C
1 123 456 789
2 0 222 0

Comparing values in two columns and returning values on conditional bases using sql hive

I have two columns, I want to get an output based on a comparative basis of both. My data is somewhat like:
Customer Id status
100 A
100 B
101 B
102 A
103 A
103 B
So a customer can have status A, B, or both. I have to segregate them by customer ID based on status: if a customer has both A and B, return happy; if only A, return Avg; and if only B, return Sad.

try the below query,
SELECT DISTINCT Customer_Id,
(CASE WHEN COUNT(*) OVER(PARTITION BY Customer_Id) > 1 THEN 'happy'
WHEN Tstatus = 'A' THEN 'Avg'
ELSE 'Sad'END) AS New_Status
FROM #table1
GROUP BY Customer_Id,Tstatus

If Customer Id and status form a unique combination, then:
STEP 1: use CASE to map each status to a number and average it per customer (A only gives 0, B only gives 2, both gives 1):
SELECT [customer id]
,avg(case when [status] ='A' then 0 else 2 end) AS status_avg
FROM [Your Table]
group by [customer id]
STEP 2: translate the average into the result, like this:
SELECT [customer id]
,CASE WHEN avg(case when [status] ='A' then 0 else 2 end) = 1 THEN 'happy'
WHEN avg(case when [status] ='A' then 0 else 2 end) = 0 THEN 'Avg'
ELSE 'Sad' END AS new_status
FROM [Your Table]
group by [customer id]

I would do this simply as:
select customer_id,
(case when min(status) <> max(status) then 'happy'
when min(status) = 'A' then 'avg'
else 'sad'
end)
from t
where status in ('A', 'B')
group by customer_id
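
To check the MIN/MAX approach against the sample data from the question, here is a small sketch that assumes SQLite via Python's sqlite3 module (the question targets Hive, so this only illustrates the logic, not Hive-specific behavior):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Sample rows from the question.
conn.executescript("""
CREATE TABLE t (customer_id INTEGER, status TEXT);
INSERT INTO t VALUES
  (100, 'A'), (100, 'B'), (101, 'B'), (102, 'A'), (103, 'A'), (103, 'B');
""")

rows = conn.execute("""
SELECT customer_id,
       CASE WHEN MIN(status) <> MAX(status) THEN 'happy'  -- both A and B present
            WHEN MIN(status) = 'A' THEN 'avg'             -- only A
            ELSE 'sad'                                    -- only B
       END AS new_status
FROM t
WHERE status IN ('A', 'B')
GROUP BY customer_id
ORDER BY customer_id
""").fetchall()
print(rows)
```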

how to Sum "case when" clause after inner join without duplication

I am a new member of Stack Overflow (and also new to coding SQL), so if I make any mistakes in my question, please advise me :)
I'm trying to get SUM of Amount in CASE WHEN Clause.
Here is my table
tableA
UserID transid Brand Amount
UserA 109974 MIX 960.00 --BrandMIX=A & B
UserB 109975 B 894.00
UserC 109976 C 350.00
UserC 109977 MIX 300.00 --BrandMIX=C & D
tableB
Row transid Brand
1 109974 A
2 109974 B
3 109975 B
4 109976 C
5 109977 C
6 109977 D
I tried an inner join of tableA and tableB on a.transid = b.transid,
and when I SUM it, the result was wrong due to duplication of transid in tableB.
Here is my code:
SELECT UserID
,CASE WHEN COUNT(DISTINCT(b.transid )) <> 0 THEN COUNT(DISTINCT(b.transid ))
ELSE NULL END AS 'Frequency'
,SUM(Amount) as 'TotalAmount'
,YEAR(transdatetime) AS 'Year'
,SUM (CASE [Brand] WHEN 'AA' THEN 1 ELSE 0 END) AS [BrandA]
,SUM (CASE [Brand] WHEN 'BB' THEN 1 ELSE 0 END) AS [BrandB]
,SUM (CASE [Brand] WHEN 'CC' THEN 1 ELSE 0 END) AS [BrandC]
,SUM (CASE [Brand] WHEN 'DD' THEN 1 ELSE 0 END) AS [BrandD]
,SUM (CASE [Brand] WHEN 'ZZ' THEN 1 ELSE 0 END) AS [BrandZ]
FROM tableA a
INNER JOIN tableB b ON a.transid = b.transid
WHERE is_paid = 'N'
GROUP BY UserID, YEAR(transdatetime)
Result that I got
Result that I want
I added ROW_NUMBER() to tableB. I want to SUM(Amount) only where the row number is the minimum for each transid; is that possible?
Please advise, thank you.

Based on your row_number i.e (row column of tableB) I have modified your query to take min(row) values. Please try this...If this does not work please post the complete tables structures.
SELECT UserID
,CASE WHEN COUNT(DISTINCT(b.transid )) <> 0 THEN COUNT(DISTINCT(b.transid ))
ELSE NULL END AS 'Frequency'
,SUM(Amount) as 'TotalAmount'
,YEAR(transdatetime) AS 'Year'
,SUM (CASE [Brand] WHEN 'AA' THEN 1 ELSE 0 END) AS [BrandA]
,SUM (CASE [Brand] WHEN 'BB' THEN 1 ELSE 0 END) AS [BrandB]
,SUM (CASE [Brand] WHEN 'CC' THEN 1 ELSE 0 END) AS [BrandC]
,SUM (CASE [Brand] WHEN 'DD' THEN 1 ELSE 0 END) AS [BrandD]
,SUM (CASE [Brand] WHEN 'ZZ' THEN 1 ELSE 0 END) AS [BrandZ]
FROM tableA a
INNER JOIN tableB b ON a.transid = b.transid
WHERE is_paid = 'N' AND b.row in (select min(row) from tableB group by transid)
GROUP BY UserID, YEAR(transdatetime)
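
The effect of the MIN(row) filter on the sum can be seen with the sample rows from the question. This is a sketch under assumptions: it uses SQLite via Python's sqlite3 module, renames the Row column to row_no (a hypothetical name chosen to avoid a keyword clash), and leaves out is_paid and transdatetime because the sample data does not show them.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Sample rows from the question; row_no is a stand-in name for the Row column.
conn.executescript("""
CREATE TABLE tableA (userid TEXT, transid INTEGER, brand TEXT, amount REAL);
CREATE TABLE tableB (row_no INTEGER, transid INTEGER, brand TEXT);
INSERT INTO tableA VALUES
  ('UserA', 109974, 'MIX', 960.00),
  ('UserB', 109975, 'B',   894.00),
  ('UserC', 109976, 'C',   350.00),
  ('UserC', 109977, 'MIX', 300.00);
INSERT INTO tableB VALUES
  (1, 109974, 'A'), (2, 109974, 'B'), (3, 109975, 'B'),
  (4, 109976, 'C'), (5, 109977, 'C'), (6, 109977, 'D');
""")

# Plain join: each MIX transaction matches two rows in tableB, so its amount is added twice.
naive = conn.execute("""
SELECT a.userid, SUM(a.amount)
FROM tableA a
INNER JOIN tableB b ON a.transid = b.transid
GROUP BY a.userid
ORDER BY a.userid
""").fetchall()

# Keep only the first tableB row per transid, so each amount is added once.
deduped = conn.execute("""
SELECT a.userid, SUM(a.amount)
FROM tableA a
INNER JOIN tableB b ON a.transid = b.transid
WHERE b.row_no IN (SELECT MIN(row_no) FROM tableB GROUP BY transid)
GROUP BY a.userid
ORDER BY a.userid
""").fetchall()

print(naive)
print(deduped)
```

Note that the filter also removes the other brand rows of each transaction, so per-brand counts computed in the same query would only see the first brand.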

Tuning oracle subquery in select statement

I have a master table and a reference table as below.
WITH MAS as (
SELECT 10 as CUSTOMER_ID, 1 PROCESS_ID, 44 PROCESS_TYPE, 200 as AMOUNT FROM DUAL UNION ALL
SELECT 10 as CUSTOMER_ID, 1 PROCESS_ID, 44 PROCESS_TYPE, 250 as AMOUNT FROM DUAL UNION ALL
SELECT 10 as CUSTOMER_ID, 2 PROCESS_ID, 45 PROCESS_TYPE, 300 as AMOUNT FROM DUAL UNION ALL
SELECT 10 as CUSTOMER_ID, 2 PROCESS_ID, 45 PROCESS_TYPE, 350 as AMOUNT FROM DUAL
), REFTAB as (
SELECT 44 PROCESS_TYPE, 'A' GROUP_ID FROM DUAL UNION ALL
SELECT 44 PROCESS_TYPE, 'B' GROUP_ID FROM DUAL UNION ALL
SELECT 45 PROCESS_TYPE, 'C' GROUP_ID FROM DUAL UNION ALL
SELECT 45 PROCESS_TYPE, 'D' GROUP_ID FROM DUAL
) SELECT ...
My first select statement which works correctly is this one:
SELECT CUSTOMER_ID,
SUM(AMOUNT) as AMOUNT1,
SUM(CASE WHEN PROCESS_TYPE IN (SELECT PROCESS_TYPE FROM REFTAB WHERE GROUP_ID = 'A')
THEN AMOUNT ELSE NULL END) as AMOUNT2,
COUNT(CASE WHEN PROCESS_TYPE IN (SELECT PROCESS_TYPE FROM REFTAB WHERE GROUP_ID = 'D')
THEN 1 ELSE NULL END) as COUNT1
FROM MAS
GROUP BY CUSTOMER_ID
However, to address a performance issue, I changed it to this select statement:
SELECT CUSTOMER_ID,
SUM(AMOUNT) as AMOUNT1,
SUM(CASE WHEN GROUP_ID = 'A' THEN AMOUNT ELSE NULL END) as AMOUNT2,
COUNT(CASE WHEN GROUP_ID = 'D' THEN 1 ELSE NULL END) as COUNT1
FROM MAS A
LEFT JOIN REFTAB B ON A.PROCESS_TYPE = B.PROCESS_TYPE
GROUP BY CUSTOMER_ID
For the AMOUNT2 and COUNT1 columns, the values stay the same. But for AMOUNT1, the value is multiplied because of the join with the reference table.
I know I can add 1 more left join with an additional join condition on GROUP_ID. But that won't be any different from using a subquery.
Any idea how to make the query work with just 1 left join while not multiplying the AMOUNT1 value?

"I know I can add 1 more left join with an additional join condition on GROUP_ID. But that won't be any different from using a subquery."
You'd be surprised. Having 2 left joins instead of subqueries in the SELECT gives the optimizer more ways of optimizing the query. I would still try it:
select m.customer_id,
sum(m.amount) as amount1,
sum(case when grpA.group_id is not null then m.amount end) as amount2,
count(grpD.group_id) as count1
from mas m
left join reftab grpA
on grpA.process_type = m.process_type
and grpA.group_id = 'A'
left join reftab grpD
on grpD.process_type = m.process_type
and grpD.group_id = 'D'
group by m.customer_id
You can also try this query, which uses the SUM() analytic function to calculate the amount1 value before the join to avoid the duplicate value problem:
select m.customer_id,
m.customer_sum as amount1,
sum(case when r.group_id = 'A' then m.amount end) as amount2,
count(case when r.group_id = 'D' then 'X' end) as count1
from (select customer_id,
process_type,
amount,
sum(amount) over (partition by customer_id) as customer_sum
from mas) m
left join reftab r
on r.process_type = m.process_type
group by m.customer_id,
m.customer_sum
You can test both options, and see which one performs better.

Starting off with your original query, simply replacing your IN queries with EXISTS statements should provide a significant boost. Also, be wary of summing NULLs, perhaps your ELSE statements should be 0?
SELECT CUSTOMER_ID,
SUM(AMOUNT) as AMOUNT1,
SUM(CASE WHEN EXISTS(SELECT 1 FROM REFTAB WHERE REFTAB.GROUP_ID = 'A' AND REFTAB.PROCESS_TYPE = MAS.PROCESS_TYPE)
THEN AMOUNT ELSE NULL END) as AMOUNT2,
COUNT(CASE WHEN EXISTS(SELECT 1 FROM REFTAB WHERE REFTAB.GROUP_ID = 'D' AND REFTAB.PROCESS_TYPE = MAS.PROCESS_TYPE)
THEN 1 ELSE NULL END) as COUNT1
FROM MAS
GROUP BY CUSTOMER_ID

The normal way is to aggregate the values before the join. You can also use conditional aggregation, if the rest of the query is correct:
SELECT CUSTOMER_ID,
SUM(CASE WHEN seqnum = 1 THEN AMOUNT END) as AMOUNT1,
SUM(CASE WHEN GROUP_ID = 'A' THEN AMOUNT ELSE NULL END) as AMOUNT2,
COUNT(CASE WHEN GROUP_ID = 'D' THEN 1 ELSE NULL END) as COUNT1
FROM MAS A LEFT JOIN
(SELECT B.*, ROW_NUMBER() OVER (PARTITION BY PROCESS_TYPE ORDER BY PROCESS_TYPE) as seqnum
FROM REFTAB B
) B
ON A.PROCESS_TYPE = B.PROCESS_TYPE
GROUP BY CUSTOMER_ID;
This ignores the duplicates created by the joins.
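
Another way to keep a single join without multiplying AMOUNT1 is to collapse the reference table to one row per PROCESS_TYPE before joining. The sketch below is an illustration rather than any of the answers' exact queries: it assumes SQLite via Python's sqlite3 module instead of Oracle, uses the sample rows from the question, and introduces the hypothetical flag columns in_a and in_d.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Sample rows from the question's WITH clause.
conn.executescript("""
CREATE TABLE mas (customer_id INTEGER, process_id INTEGER, process_type INTEGER, amount INTEGER);
CREATE TABLE reftab (process_type INTEGER, group_id TEXT);
INSERT INTO mas VALUES (10, 1, 44, 200), (10, 1, 44, 250), (10, 2, 45, 300), (10, 2, 45, 350);
INSERT INTO reftab VALUES (44, 'A'), (44, 'B'), (45, 'C'), (45, 'D');
""")

rows = conn.execute("""
SELECT m.customer_id,
       SUM(m.amount) AS amount1,
       SUM(CASE WHEN r.in_a = 1 THEN m.amount END) AS amount2,
       COUNT(CASE WHEN r.in_d = 1 THEN 1 END) AS count1
FROM mas m
LEFT JOIN (
    -- One row per process_type, so the join cannot duplicate mas rows.
    SELECT process_type,
           MAX(CASE WHEN group_id = 'A' THEN 1 ELSE 0 END) AS in_a,
           MAX(CASE WHEN group_id = 'D' THEN 1 ELSE 0 END) AS in_d
    FROM reftab
    GROUP BY process_type
) r ON r.process_type = m.process_type
GROUP BY m.customer_id
""").fetchall()
print(rows)
```

Whether this performs better than the subquery version on a real Oracle dataset would still need to be measured.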

multiple count conditions with single query

I have a table like below -
Student ID | History | Maths | Geography
1 A B B
2 C C E
3 D A B
4 E D A
How to find out how many students got A in history, B in maths and E in Geography with a single sql query ?

If you want to get number of students who got A in History in one column, number of students who got B in Maths in second column and number of students who got E in Geography in third then:
select
sum(case when [History] = 'A' then 1 else 0 end) as HistoryA,
sum(case when [Maths] = 'B' then 1 else 0 end) as MathsB,
sum(case when [Geography] = 'E' then 1 else 0 end) as GeographyE
from Table1
If you want to count students who got A in history, B in maths and E in Geography:
select count(*)
from Table1
where [History] = 'A' and [Maths] = 'B' and [Geography] = 'E'

If you want independent counts use:
SELECT SUM(CASE WHEN Condition1 THEN 1 ELSE 0 END) AS 'Condition1'
,SUM(CASE WHEN Condition2 THEN 1 ELSE 0 END) AS 'Condition2'
,SUM(CASE WHEN Condition3 THEN 1 ELSE 0 END) AS 'Condition3'
FROM YourTable
If you want multiple conditions for one count use:
SELECT COUNT(*)
FROM YourTable
WHERE Condition1
AND Condition2
AND Condition3
It sounds like you want multiple independent counts:
SELECT SUM(CASE WHEN History = 'A' THEN 1 ELSE 0 END) AS 'History A'
,SUM(CASE WHEN Maths = 'B' THEN 1 ELSE 0 END) AS 'Maths B'
,SUM(CASE WHEN Geography = 'E' THEN 1 ELSE 0 END) AS 'Geography E'
FROM YourTable

You can try to select from multiple select statements
SELECT t1.*, t2.*, t3.* FROM
(SELECT COUNT(*) AS h FROM students WHERE History = 'A') as t1,
(SELECT COUNT(*) AS m FROM students WHERE Maths = 'B') as t2,
(SELECT COUNT(*) AS g FROM students WHERE Geography = 'E') as t3
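
The difference between independent counts and one combined count shows up clearly on the sample table. A small sketch, assuming SQLite via Python's sqlite3 module and a table named students with the rows from the question:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Sample rows from the question.
conn.executescript("""
CREATE TABLE students (student_id INTEGER, history TEXT, maths TEXT, geography TEXT);
INSERT INTO students VALUES
  (1, 'A', 'B', 'B'),
  (2, 'C', 'C', 'E'),
  (3, 'D', 'A', 'B'),
  (4, 'E', 'D', 'A');
""")

# Three independent counts in a single pass over the table.
independent = conn.execute("""
SELECT SUM(CASE WHEN history = 'A' THEN 1 ELSE 0 END) AS history_a,
       SUM(CASE WHEN maths = 'B' THEN 1 ELSE 0 END) AS maths_b,
       SUM(CASE WHEN geography = 'E' THEN 1 ELSE 0 END) AS geography_e
FROM students
""").fetchone()

# One count of students who satisfy all three conditions at once.
combined = conn.execute("""
SELECT COUNT(*)
FROM students
WHERE history = 'A' AND maths = 'B' AND geography = 'E'
""").fetchone()[0]

print(independent, combined)
```

Each grade occurs once in its subject, but no single student has all three, so the combined count is zero.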
