SQL: alternatives and substitutions for GROUPING SETS and PIVOT - sql

I've got code like this:
SELECT id, YEAR(datek) AS YEAR, COUNT(*) AS NUM
FROM Orders
GROUP BY GROUPING SETS
(
(id, YEAR(datek)),
id,
YEAR(datek),
()
);
It gives me this output:
1 NULL 4
2 NULL 11
3 NULL 6
NULL NULL 21
1 2006 36
2 2006 56
3 2006 51
NULL 2006 143
1 2007 130
2 2007 143
3 2007 125
NULL 2007 398
1 2008 79
2 2008 116
3 2008 73
NULL 2008 268
NULL NULL 830
1 NULL 249
2 NULL 326
3 NULL 255
What I need to do is write it without "grouping sets" (nor cube or rollup) but with the same result. I thought about writing three different queries and join them with "union". I try something like "null" in group by settings but it does not work.
SELECT id, YEAR(datek) AS rok, COUNT(*) AS NUM
FROM Orders
GROUP BY id, YEAR(datek)
UNION
SELECT id, YEAR(datek) AS rok, COUNT(*) AS NUM
FROM Orders
GROUP BY id, null
order by id, YEAR(datek)
I also have a question about "PIVOT". What kind of syntax can replace query with "PIVOT"?
Thanks for your time and all the answers!

You are right in that you need separate queries, although you actually need 4, and rather than GROUP BY NULL, just group by the columns in the corresponding grouping set, and replace the column in the SELECT with NULL:
SELECT id, YEAR(datek) AS rok, COUNT(*) AS NUM
FROM Orders
GROUP BY id, YEAR(datek)
UNION ALL
SELECT id, NULL, COUNT(*) AS NUM
FROM Orders
GROUP BY id
UNION ALL
SELECT NULL, YEAR(datek), COUNT(*) AS NUM
FROM Orders
GROUP BY YEAR(datek)
UNION ALL
SELECT NULL, NULL, COUNT(*) AS NUM
FROM Orders
ORDER BY ID, Rok
With regard to a replacement for PIVOT I think the best alternative is to use a conditional aggregate, e.g. instead of:
SELECT pvt.SomeGroup,
pvt.[A],
pvt.[B],
pvt.[C]
FROM T
PIVOT (SUM(Val) FOR Col IN ([A], [B], [C])) AS pvt;
You would use:
SELECT T.SomeGroup,
[A] = SUM(CASE WHEN T.Col = 'A' THEN T.Val ELSE 0 END),
[B] = SUM(CASE WHEN T.Col = 'B' THEN T.Val ELSE 0 END),
[C] = SUM(CASE WHEN T.Col = 'C' THEN T.Val ELSE 0 END)
FROM T
GROUP BY T.SomeGroup;

Related

How put grouping variable to columns in SQL/

I have following dataset
and want to get this
How can I do it?
Using SQL Server, you can use a PIVOT, such as :
SELECT Time, [a],[b],[c]
FROM
(
SELECT time, [group],value
FROM dataset) d
PIVOT
(
SUM(value)
FOR [group] IN ([a],[b],[c])
) AS pvt
You can try it on the following fiddle.
Changed the column names to not conflict with reserved words. You would have to put them into single quotes otherwise.
WITH
-- the input
indata(grp,tm,val) AS (
SELECT 'a',1,44
UNION ALL SELECT 'a',2,22
UNION ALL SELECT 'a',3, 1
UNION ALL SELECT 'b',1, 1
UNION ALL SELECT 'b',2, 5
UNION ALL SELECT 'b',3, 6
UNION ALL SELECT 'c',1, 7
UNION ALL SELECT 'c',2, 8
UNION ALL SELECT 'c',3, 9
)
SELECT tm
, SUM(CASE grp WHEN 'a' THEN val END) AS a
, SUM(CASE grp WHEN 'b' THEN val END) AS b
, SUM(CASE grp WHEN 'c' THEN val END) AS c
FROM indata
GROUP BY tm
;
tm | a | b | c
----+----+---+---
1 | 44 | 1 | 7
2 | 22 | 5 | 8
3 | 1 | 6 | 9
select * from
(
select
time,[group],value
from yourTable
group by time,[group],value
)
as table
pivot
(
sum([value])
for [group] in ([a],[b],[c])
) as p
order by time
This is the result
for Vertica,
SELECT time
, SUM(value) FILTER (WHERE group = a) a
, SUM(value) FILTER (WHERE group = b) b
, SUM(value) FILTER (WHERE group = c) c
FROM yourTable
GROUP BY time

Find subsequent occurrence of a value in a table

I have a table which looks like shown below
ID SubmittedValue ApprovedValue
1 25.9 0
1 29 29
1 25.9 25.9
1 50 0
1 45 0
1 10 0
1 10 10
Expected result
ID SubsequentlyApproved(CNT) Total_Amt_sub_aprvd
1 2 35.9
We get the above result because 25.9+10 since it is repeated in the subsequent rows.
How to perform VLOOKUP like functionality for this scenario. I tried the subquery but it didn't work.
SELECT a.id,
SUM(CASE WHEN a.ApprovedValue=0 THEN 1 ELSE 0 END) AS SUB_COUNT
FROM myTable a
join (select id, sum( case when SubmittedValue=ApprovedValue then 1 end) as check_value from myTable) b
on b.id=a.id and SUB_COUNT=check_value
but this is not giving me the expected result.
You seem to want to count rows where the values are the same and the first value appears more than once. If so, you can use window functions and aggregation:
select id, count(*), sum(ApprovedValue)
from (select t.*, count(*) over (partition by id, SubmittedValue) as cnt
from t
) t
where cnt > 1 and SubmittedValue = ApprovedValue
group by id
Without window functions using a semi-join
select id, count(*), sum(submittedvalue)
from test t1
where submittedvalue=approvedvalue
and exists (select 1
from test t2
where t1.id=t2.id and t1.submittedvalue=t2.submittedvalue
group by id, submittedvalue
having count(*)>1)
group by id;

How do I create a pivot table based on a specific condition using SQL Server?

I need to make this particular pivot table using SQL Server, but can't get it to work.
What I have:
id name was_clicked times
1 CustomerA 0 654
1 CustomerA 1 24
1 CustomerB 0 121
1 CustomerB 1 12
1 CustomerC 0 1203
1 CustomerC 1 67
What I want:
id name views clicks
1 CustomerA 654 24
1 CustomerB 121 12
1 CustomerC 1203 67
Is it possible?
Thank you.
You can use an aggregate function with a CASE expression to get the result:
select id, name,
sum(case when was_clicked = 0 then times else 0 end) views,
sum(case when was_clicked = 1 then times else 0 end) clicks
from yourtable
group by id, name;
See SQL Fiddle with Demo
Or you can use the PIVOT function:
select id, name, views, clicks
from
(
select id, name,
case when was_clicked = 1 then 'clicks' else 'views' end vc,
times
from yourtable
) d
pivot
(
sum(times)
for vc in (clicks, views)
) piv
See SQL Fiddle with Demo
About as basic example of PIVOT as you could ask for:
declare #t table (id int, name varchar(10),was_clicked bit,times int)
insert into #t(id, name, was_clicked, times) values
( 1 ,'CustomerA', 0, 654 ),
( 1 ,'CustomerA', 1, 24 ),
( 1 ,'CustomerB', 0, 121 ),
( 1 ,'CustomerB', 1, 12 ),
( 1 ,'CustomerC', 0, 1203 ),
( 1 ,'CustomerC', 1, 67 )
select
id,
name,
[0] as views,
[1] as clicks
from #t t pivot (MAX(times) for was_clicked in ([0],[1])) as pt
You always have to include some aggregate in the PIVOT expression, since there could be multiple matching rows which should end up in the same column and row position in the result set. When you happen to know that only one is possible, the choice between MIN, MAX and SUM is somewhat arbitrary.
I tend to default to using MIN or MAX since its applicable to non-numeric types as well.

SQL union same number of columns, same data types, different data

I have two result sets that look approximately like this:
Id Name Count
1 Asd 1
2 Sdf 4
3 Dfg 567
4 Fgh 23
But the Count column data is different for the second one and I would like both to be displayed, about like this:
Id Name Count from set 1 Count from set two
1 Asd 1 15
2 Sdf 4 840
3 Dfg 567 81
4 Fgh 23 9
How can I do this in SQL (with union if possible)?
My current SQL, hope this will better explain what I want to do:
(SELECT Id, Name, COUNT(*) FROM Customers where X)
union
(SELECT Id, Name, COUNT(*) FROM Customers where Y)
select *
from
(
SELECT 'S1' as dataset, Id, Name, COUNT(*) as resultcount FROM Customers where X
union
SELECT 'S2',Id, Name, COUNT(*) FROM Customers where Y
) s
pivot
(sum(resultcount) for dataset in (s1,s2)) p
You can do something like:
;WITH Unioned
AS
(
SELECT 'Set1' FromWhat, Id, Name FROM Table1
UNION ALL
SELECT 'Set2', Id, Name FROM Table2
)
SELECT
Id,
Name,
SUM(CASE FromWhat WHEN 'Set1' THEN 1 ELSE 0 END) 'Count from set 1',
SUM(CASE FromWhat WHEN 'Set2' THEN 1 ELSE 0 END) 'Count from set 2'
FROM Unioned
GROUP BY Id, Name;
SQL Fiddle Demo

How to transpose recordset columns into rows

I have a query whose code looks like this:
SELECT DocumentID, ComplexSubquery1 ... ComplexSubquery5
FROM Document
WHERE ...
ComplexSubquery are all numerical fields that are calculated using, duh, complex subqueries.
I would like to use this query as a subquery to a query that generates a summary like the following one:
Field DocumentCount Total
1 dc1 s1
2 dc2 s2
3 dc3 s3
4 dc4 s4
5 dc5 s5
Where:
dc<n> = SUM(CASE WHEN ComplexSubquery<n> > 0 THEN 1 END)
s <n> = SUM(CASE WHEN Field = n THEN ComplexSubquery<n> END)
How could I do that in SQL Server?
NOTE: I know I could avoid the problem by discarding the original query and using unions:
SELECT '1' AS TypeID,
SUM(CASE WHEN ComplexSubquery1 > 0 THEN 1 END) AS DocumentCount
SUM(ComplexSubquery1) AS Total
FROM (SELECT DocumentID, BLARGH ... AS ComplexSubquery1) T
UNION ALL
SELECT '2' AS TypeID,
SUM(CASE WHEN ComplexSubquery2 > 0 THEN 1 END) AS DocumentCount
SUM(ComplexSubquery2) AS Total
FROM (SELECT DocumentID, BLARGH ... AS ComplexSubquery2) T
UNION ALL
...
But I want to avoid this route, because redundant code makes my eyes bleed. (Besides, there is a real possibility that the number of complex subqueries grow in the future.)
WITH Document(DocumentID, Field) As
(
SELECT 1, 1 union all
SELECT 2, 1 union all
SELECT 3, 2 union all
SELECT 4, 3 union all
SELECT 5, 4 union all
SELECT 6, 5 union all
SELECT 7, 5
), CTE AS
(
SELECT DocumentID,
Field,
(select 10) As ComplexSubquery1,
(select 20) as ComplexSubquery2,
(select 30) As ComplexSubquery3,
(select 40) as ComplexSubquery4,
(select 50) as ComplexSubquery5
FROM Document
)
SELECT Field,
SUM(CASE WHEN RIGHT(Query,1) = Field AND QueryValue > 1 THEN 1 END ) AS DocumentCount,
SUM(CASE WHEN RIGHT(Query,1) = Field THEN QueryValue END ) AS Total
FROM CTE
UNPIVOT (QueryValue FOR Query IN
(ComplexSubquery1, ComplexSubquery2, ComplexSubquery3,
ComplexSubquery4, ComplexSubquery5)
)AS unpvt
GROUP BY Field
Returns
Field DocumentCount Total
----------- ------------- -----------
1 2 20
2 1 20
3 1 30
4 1 40
5 2 100
I'm not 100% positive from your example, but perhaps the PIVOT operator will help you out here? I think if you selected your original query into a temporary table, you could pivot on the document ID and get the sums for the other queries.
I don't have much experience with it though, so I'm not sure how complex you can get with your subqueries - you might have to break it down.