SQL Pivot Columns with prefixes

New to SQL, struggling to fully understand the pivot clause. I have four fields (state, season, rain, snow) and am trying to pivot so that I have 5 fields (state, summer_rain, summer_snow, winter_rain, winter_snow). I'm not sure how to pivot two fields so that they are prefixed with another if that makes sense. Reprex below.
What I have now
What I'm after
My code (receiving an error when aggregating snow & rain within pivot clause):
CREATE TABLE #myTable ([state] VARCHAR(20), [season] VARCHAR(20), [rain] int, [snow] int)
INSERT INTO #myTable VALUES ('AL', 'summer', 1, 1)
INSERT INTO #myTable VALUES ('AK', 'summer', 3, 3)
INSERT INTO #myTable VALUES ('AZ', 'summer', 0, 1)
INSERT INTO #myTable VALUES ('AL', 'winter', 5, 4)
INSERT INTO #myTable VALUES ('AK', 'winter', 2, 2)
INSERT INTO #myTable VALUES ('AZ', 'winter', 1, 1)
INSERT INTO #myTable VALUES ('AL', 'summer', 6, 4)
INSERT INTO #myTable VALUES ('AK', 'summer', 3, 0)
INSERT INTO #myTable VALUES ('AZ', 'summer', 5, 1)
SELECT [state], [year], [month], [day]
FROM
(
SELECT * FROM #myTable
) t
PIVOT
(
sum([rain]), sum([snow]) FOR [season] IN ([summer], [winter])
) AS pvt

PIVOTs are great, but conditional aggregations offer a bit more flexibility and are often more performant.
PIVOT
Select *
From (
SELECT State
,B.*
FROM #myTable
Cross Apply (values (concat(season,'_rain'),rain)
,(concat(season,'_snow'),snow)
) B(Item,Value)
) src
Pivot ( sum(value) for Item in ([summer_rain],[summer_snow],[winter_rain],[winter_snow]) ) pvt
Conditional Aggregation
Select State
,[summer_rain] = sum(case when season='summer' then rain end)
,[summer_snow] = sum(case when season='summer' then snow end)
,[winter_rain] = sum(case when season='winter' then rain end)
,[winter_snow] = sum(case when season='winter' then snow end)
From #myTable
Group By State
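As a quick sanity check of the conditional aggregation, here is a minimal sketch that runs the same SUM(CASE ...) query shape against the sample rows from the question. It uses Python's built-in sqlite3 module with an in-memory database purely as a stand-in (an assumption, since the question targets SQL Server), and the table name my_table is hypothetical.

```python
import sqlite3

# In-memory SQLite database used only as a stand-in for SQL Server (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE my_table (state TEXT, season TEXT, rain INTEGER, snow INTEGER)"
)
conn.executemany(
    "INSERT INTO my_table VALUES (?, ?, ?, ?)",
    [
        ("AL", "summer", 1, 1),
        ("AK", "summer", 3, 3),
        ("AZ", "summer", 0, 1),
        ("AL", "winter", 5, 4),
        ("AK", "winter", 2, 2),
        ("AZ", "winter", 1, 1),
        ("AL", "summer", 6, 4),
        ("AK", "summer", 3, 0),
        ("AZ", "summer", 5, 1),
    ],
)

# One SUM(CASE ...) per output column: the season picks the rows,
# the THEN branch picks which measure (rain or snow) is summed.
query = """
SELECT state,
       SUM(CASE WHEN season = 'summer' THEN rain END) AS summer_rain,
       SUM(CASE WHEN season = 'summer' THEN snow END) AS summer_snow,
       SUM(CASE WHEN season = 'winter' THEN rain END) AS winter_rain,
       SUM(CASE WHEN season = 'winter' THEN snow END) AS winter_snow
FROM my_table
GROUP BY state
ORDER BY state
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

Each output column is just one more CASE expression, so adding another season or another measure does not require touching a PIVOT column list.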

Related

Convert rows to columns using pivot

I am trying to convert this procedure to use PIVOT, but I can't get it to work. Does anyone have a solution?
I have a table with the columns ItemID, StoreID, Stock.
I want to convert it to ItemID, Store1, Store2, Store3, ..., Stock,
summing the stock by ItemID and StoreID and then inserting the result as a row.
Many thanks
CREATE table #test222
([Id] int,[ItemID] INT, [storeid] int, [stock] decimal(18,2)
);
INSERT INTO #test222
([Id],[ItemID], [storeid], [stock])
VALUES
(1, 1, 3,10),
(2, 1,1, 20),
(3, 1,1, 30),
(4, 2,1, 40),
(5, 2,2,50),
(6, 2,2,60),
(7, 3,2,70),
(8, 4,2,80),
(9, 4,2,90),
(10, 5,2,100);
select * from #test222;
select ItemID, store1,store2,storeid3,storeid4,storeid5,storeid6,storeid7,storeid8,storeid9,storeid10 stock
from
(
select ItemID, storeid, stock
from #test222
) d
pivot
(
max(stock)
for storeid in (1,2,3,4,5,6,7,8,9,10)
) piv;
This gives the error:
Msg 102 Level 15 State 1 Line 9
Incorrect syntax near '1'.
Here is a simple PIVOT. Just remember to "feed" your pivot with only the required columns.
Example
Select *
From (
Select ItemID
,Col = 'store'+left(storeid,10)
,val = stock
From #test222
) src
Pivot ( max(val) for Col in ( [store1],[store2],[store3],[store4],[store5],[store6],[store7],[store8],[store9],[store10] ) ) pvt
Results
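For comparison, the same one-column-per-store shape can be sketched with conditional aggregation, where the column names are generated rather than typed out. This is only an illustration under assumptions: it uses Python's sqlite3 module as a stand-in for SQL Server, the table name test222 and the fixed store range 1 to 10 are hypothetical, and it keeps max(stock) as in the PIVOT above (swap in SUM if the stock should be totalled per store).

```python
import sqlite3

# In-memory SQLite database used only as a stand-in for SQL Server (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE test222 (Id INTEGER, ItemID INTEGER, storeid INTEGER, stock REAL)"
)
conn.executemany(
    "INSERT INTO test222 VALUES (?, ?, ?, ?)",
    [
        (1, 1, 3, 10), (2, 1, 1, 20), (3, 1, 1, 30), (4, 2, 1, 40),
        (5, 2, 2, 50), (6, 2, 2, 60), (7, 3, 2, 70), (8, 4, 2, 80),
        (9, 4, 2, 90), (10, 5, 2, 100),
    ],
)

# Build one MAX(CASE ...) column per store id. The ids come from a fixed
# range of integers, not from user input, so formatting them in is safe here.
store_ids = range(1, 11)
columns = ",\n       ".join(
    f"MAX(CASE WHEN storeid = {s} THEN stock END) AS store{s}" for s in store_ids
)
query = f"""
SELECT ItemID,
       {columns}
FROM test222
GROUP BY ItemID
ORDER BY ItemID
"""
pivot_rows = conn.execute(query).fetchall()
for row in pivot_rows:
    print(row)
```

Stores with no rows for an item come back as NULL (None in Python), which matches what PIVOT does for missing combinations.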

Referencing multiple aliases in a new field [duplicate]

I'm calculating 3 new fields where calculation #2 is dependent on calculation #1 and calculation #3 is dependent on calculation #2.
I'd like to alias these calculations to create a cleaner solution, but I'm not sure how to reference more than one alias. If I just had 2 calculations, I know I could create a subquery and reference my alias in the upper level. However, I'm not sure how to do this with 3 calculations. Would I join subqueries?
Reprex below (current code will encounter an error when attempting to reference an alias within an alias.)
CREATE TABLE #myTable ([state] VARCHAR(20), [season] VARCHAR(20), [rain] int, [snow] int, [ice] int)
INSERT INTO #myTable VALUES ('AL', 'summer', 1, 1, 1)
INSERT INTO #myTable VALUES ('AK', 'summer', 3, 3, 1)
INSERT INTO #myTable VALUES ('AZ', 'summer', 0, 1, 1)
INSERT INTO #myTable VALUES ('AL', 'winter', 5, 4, 2)
INSERT INTO #myTable VALUES ('AK', 'winter', 2, 2, 2)
INSERT INTO #myTable VALUES ('AZ', 'winter', 1, 1, 2)
INSERT INTO #myTable VALUES ('AL', 'summer', 6, 4, 3)
INSERT INTO #myTable VALUES ('AK', 'summer', 3, 0, 3)
INSERT INTO #myTable VALUES ('AZ', 'summer', 5, 1, 3)
select *,
ice + snow as cold_precipitation,
rain as warm_precipitation,
cold_precipitation + warm_precipitation as overall_precipitation,
cold_precipitation / sum(overall_precipitation) as cold_pct_of_total,
warm_precipitation / sum(overall_precipitation) as warm_pct_of_total
from #myTable
You can use CROSS APPLY(s) to stack expressions and reference aliases. However, you have an aggregate sum() without a GROUP BY, so your desired result is not clear.
I changed your sum() to a window function: sum() over(partition by state,season)
Example
select A.*,
cold_precipitation,
warm_precipitation,
overall_precipitation,
cold_precipitation / sum(overall_precipitation+0.0) over(partition by state,season) as cold_pct_of_total,
warm_precipitation / sum(overall_precipitation+0.0) over(partition by state,season) as warm_pct_of_total
from #myTable A
Cross Apply ( values ( ice + snow , rain ) ) B(cold_precipitation,warm_precipitation)
Cross Apply ( values ( cold_precipitation+warm_precipitation )) C(overall_precipitation)
Results
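The alias-stacking idea can also be sketched with chained CTEs, where each step may reference the aliases defined in the step before it. The example below is an illustration under assumptions: it runs on SQLite through Python's sqlite3 module (window functions need SQLite 3.25 or later), the names my_table, step1 and step2 are hypothetical, and CTEs stand in for CROSS APPLY, which SQLite does not support.

```python
import sqlite3

# In-memory SQLite database used only as a stand-in for SQL Server (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE my_table "
    "(state TEXT, season TEXT, rain INTEGER, snow INTEGER, ice INTEGER)"
)
conn.executemany(
    "INSERT INTO my_table VALUES (?, ?, ?, ?, ?)",
    [
        ("AL", "summer", 1, 1, 1), ("AK", "summer", 3, 3, 1),
        ("AZ", "summer", 0, 1, 1), ("AL", "winter", 5, 4, 2),
        ("AK", "winter", 2, 2, 2), ("AZ", "winter", 1, 1, 2),
        ("AL", "summer", 6, 4, 3), ("AK", "summer", 3, 0, 3),
        ("AZ", "summer", 5, 1, 3),
    ],
)

# step1 defines the first aliases, step2 builds on them, and the final
# SELECT uses both levels; "* 1.0" forces non-integer division.
query = """
WITH step1 AS (
    SELECT *, ice + snow AS cold_precipitation, rain AS warm_precipitation
    FROM my_table
),
step2 AS (
    SELECT *, cold_precipitation + warm_precipitation AS overall_precipitation
    FROM step1
)
SELECT state, season,
       cold_precipitation, warm_precipitation, overall_precipitation,
       cold_precipitation * 1.0
           / SUM(overall_precipitation) OVER (PARTITION BY state, season)
           AS cold_pct_of_total,
       warm_precipitation * 1.0
           / SUM(overall_precipitation) OVER (PARTITION BY state, season)
           AS warm_pct_of_total
FROM step2
ORDER BY state, season, overall_precipitation
"""
pct_rows = conn.execute(query).fetchall()
for row in pct_rows:
    print(row)
```

Within each state and season partition the cold and warm percentages add up to 1, which is an easy check that the partitioning matches what you intended.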

SQL Filter rows based on multiple distinct values of a column

Given the following table
CREATE TABLE #YourTable (id int, PLU int, Siteid int, description varchar(50))
INSERT #YourTable VALUES (1, 8972, 2, 'Beer')
INSERT #YourTable VALUES (2, 8972, 3, 'cider')
INSERT #YourTable VALUES (3, 8972, 4, 'Beer')
INSERT #YourTable VALUES (4, 8973, 2, 'Vodka')
INSERT #YourTable VALUES (5, 8973, 3, 'Vodka')
INSERT #YourTable VALUES (6, 8973, 4, 'Vodka')
I am trying to write a query that returns all rows whose PLU has more than one distinct description value.
So in the example above I would want to return rows 1, 2 and 3, as PLU 8972 has both a 'cider' value and a 'Beer' value.
I thought GROUP BY and HAVING were the way to go, but I can't seem to get them to work correctly.
SELECT P.PLU, P.Description
FROM #YourTable P
GROUP BY P.PLU, P.Description
HAVING COUNT(DISTINCT(P.DESCRIPTION)) > 1
Any help appreciated.
You shouldn't GROUP BY the description if you are doing a DISTINCT COUNT on it (then it will always be just 1). Try something like this:
SELECT P2.PLU, P2.Description
FROM #YourTable P2
WHERE P2.PLU in (
SELECT P.PLU
FROM #YourTable P
GROUP BY P.PLU
HAVING COUNT(DISTINCT(P.DESCRIPTION)) > 1
)
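To see the difference the grouping makes, here is a small sketch of the same subquery approach run against the sample rows. It uses Python's sqlite3 module as a stand-in for SQL Server (an assumption), the table name your_table is hypothetical, and id is added to the select list so the matching rows are easy to identify.

```python
import sqlite3

# In-memory SQLite database used only as a stand-in for SQL Server (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE your_table (id INTEGER, PLU INTEGER, Siteid INTEGER, description TEXT)"
)
conn.executemany(
    "INSERT INTO your_table VALUES (?, ?, ?, ?)",
    [
        (1, 8972, 2, "Beer"), (2, 8972, 3, "cider"), (3, 8972, 4, "Beer"),
        (4, 8973, 2, "Vodka"), (5, 8973, 3, "Vodka"), (6, 8973, 4, "Vodka"),
    ],
)

# The inner query groups by PLU only, so COUNT(DISTINCT description)
# can exceed 1; the outer query then returns every row for those PLUs.
query = """
SELECT t.id, t.PLU, t.description
FROM your_table AS t
WHERE t.PLU IN (
    SELECT PLU
    FROM your_table
    GROUP BY PLU
    HAVING COUNT(DISTINCT description) > 1
)
ORDER BY t.id
"""
mixed_rows = conn.execute(query).fetchall()
for row in mixed_rows:
    print(row)
```

Only the rows for PLU 8972 come back, because 8973 has a single description ('Vodka') across all of its rows.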

Show multiple table's data in one single cell in SQL Server

I want to show multiple data from tables into one single cell. Please find the script for my tables.
CREATE TABLE #Tab (code VARCHAR(10), name varchar(20), val1 INT)
INSERT INTO #Tab VALUES ('A', 'Test', 34)
INSERT INTO #Tab VALUES ('B', 'Test', 6)
CREATE TABLE #Tab1 (code VARCHAR(10), name varchar(20), val2 INT)
INSERT INTO #Tab1 VALUES ('A','Test', 178)
CREATE TABLE #Tab2 (code VARCHAR(10), name varchar(20), Total INT)
INSERT INTO #Tab2 VALUES ('A','Test', 180)
INSERT INTO #Tab2 VALUES ('B', 'Test', 10)
CREATE TABLE #Tab3 (code VARCHAR(10), name varchar(20), val1 INT)
INSERT INTO #Tab3 VALUES ('A', 'Test1', 56)
CREATE TABLE #Tab4 (code VARCHAR(10), name varchar(20), val2 INT)
INSERT INTO #Tab4 VALUES ('A','Test1', 87)
CREATE TABLE #Tab5 (code VARCHAR(10), name varchar(20), Total INT)
INSERT INTO #Tab5 VALUES ('A','Test1', 93)
I want to show the data in a single cell, in the format below:
Thanks
Hope this helps you.
With T (code, [name], val1, val2, total)
AS
(
-- Union All keeps duplicate rows, so the sums below are not undercounted
Select code, [name], val1, 0, 0 from #Tab
Union All
Select code, [name], 0, val2, 0 from #Tab1
Union All
Select code, [name], 0, 0, Total from #Tab2
Union All
Select code, [name], val1, 0, 0 from #Tab3
Union All
Select code, [name], 0, val2, 0 from #Tab4
Union All
Select code, [name], 0, 0, Total from #Tab5
),
T2 (code, [name], total, val1, val2)
AS
(
Select code, [name], total = sum(total), val1 = sum(val1), val2 = sum(val2)
from T group by code, [name]
)
Select code,
IsNull(Min( case [name] when 'Test' then CONCAT('Total ', total, '(val1:', val1, ',val2:', val2,')') end ), 'Total 0(val1:0,val2:0)') Test,
IsNull(Min( case [name] when 'Test1' then CONCAT('Total ', total, '(val1:', val1, ',val2:', val2,')') end ), 'Total 0(val1:0,val2:0)') Test1
from T2
Group By code

T-SQL: Paging WITH TIES

I am trying to implement a paging routine that's a little different.
For the sake of a simple example, let's assume that I have a table defined and populated as follows:
CREATE TABLE #Temp
(
ParentId INT,
[TimeStamp] DATETIME,
Value INT
);
INSERT INTO #Temp VALUES (1, '1/1/2013 00:00', 6);
INSERT INTO #Temp VALUES (1, '1/1/2013 01:00', 7);
INSERT INTO #Temp VALUES (1, '1/1/2013 02:00', 8);
INSERT INTO #Temp VALUES (2, '1/1/2013 00:00', 6);
INSERT INTO #Temp VALUES (2, '1/1/2013 01:00', 7);
INSERT INTO #Temp VALUES (2, '1/1/2013 02:00', 8);
INSERT INTO #Temp VALUES (3, '1/1/2013 00:00', 6);
INSERT INTO #Temp VALUES (3, '1/1/2013 01:00', 7);
INSERT INTO #Temp VALUES (3, '1/1/2013 02:00', 8);
TimeStamp will always be the same interval, e.g. daily data, 1 hour data, 1 minute data, etc. It will not be mixed.
For reporting and presentation purposes, I want to implement paging that:
Orders by TimeStamp
Starts out using a suggested pageSize (say 4), but will automatically adjust to include additional records matching on TimeStamp. In other words, if 1/1/2013 01:00 is included for one ParentId, the suggested pageSize will be overridden and all records for hour 01:00 will be included for all ParentId's. It's almost like the TOP WITH TIES option.
So running this query with pageSize of 4 would return 6 records. There are 3 hour 00:00 and 1 hour 01:00 by default, but because there are more hour 01:00's, the pageSize would be overridden to return all hour 00:00 and 01:00.
Here's what I have so far. I think I'm close, as it works for the first iteration, but subsequent queries for the next pageSize+ rows don't work.
WITH CTE AS
(
SELECT ParentId, [TimeStamp], Value,
RANK() OVER(ORDER BY [TimeStamp]) AS rnk,
ROW_NUMBER() OVER(ORDER BY [TimeStamp]) AS rownum
FROM #Temp
)
SELECT *
FROM CTE
WHERE (rownum BETWEEN 1 AND 4) OR (rnk BETWEEN 1 AND 4)
ORDER BY TimeStamp, ParentId
The ROW_NUMBER ensures the minimum pageSize is met, but the RANK will include additional ties.
create table #Temp ( ParentId Int, [TimeStamp] DateTime, [Value] Int );
insert into #Temp ( ParentId, [TimeStamp], [Value] ) values
(1, '1/1/2013 00:00', 6),
(1, '1/1/2013 01:00', 7),
(1, '1/1/2013 02:00', 8),
(2, '1/1/2013 00:00', 6),
(2, '1/1/2013 01:00', 7),
(2, '1/1/2013 02:00', 8),
(3, '1/1/2013 00:00', 6),
(3, '1/1/2013 01:00', 7),
(3, '1/1/2013 02:00', 8);
declare @PageSize as Int = 4;
declare @Page as Int = 1;
with Alpha as (
select ParentId, [TimeStamp], Value,
Rank() over ( order by [TimeStamp] ) as Rnk,
Row_Number() over ( order by [TimeStamp] ) as RowNum
from #Temp ),
Beta as (
select Min( Rnk ) as MinRnk, Max( Rnk ) as MaxRnk
from Alpha
where ( @Page - 1 ) * @PageSize < RowNum and RowNum <= @Page * @PageSize )
select A.*
from Alpha as A inner join
Beta as B on B.MinRnk <= A.Rnk and A.Rnk <= B.MaxRnk
order by [TimeStamp], ParentId;
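To try the rank-range idea outside SQL Server, here is a minimal sketch using Python's sqlite3 module (an assumption: SQLite is only a stand-in and needs version 3.25 or later for window functions). The helper fetch_page and the table name temp are hypothetical, and timestamps are stored as ISO text so that they sort correctly.

```python
import sqlite3

# In-memory SQLite database used only as a stand-in for SQL Server (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE temp (ParentId INTEGER, TimeStamp TEXT, Value INTEGER)")
conn.executemany(
    "INSERT INTO temp VALUES (?, ?, ?)",
    [
        (parent, f"2013-01-01 0{hour}:00", 6 + hour)
        for parent in (1, 2, 3)
        for hour in (0, 1, 2)
    ],
)


def fetch_page(connection, page, page_size):
    """Return one page, widened to include every row tied on TimeStamp."""
    query = """
    WITH alpha AS (
        SELECT ParentId, TimeStamp, Value,
               RANK() OVER (ORDER BY TimeStamp) AS rnk,
               ROW_NUMBER() OVER (ORDER BY TimeStamp) AS rownum
        FROM temp
    ),
    beta AS (
        -- Rank range covered by the nominal page of row numbers.
        SELECT MIN(rnk) AS min_rnk, MAX(rnk) AS max_rnk
        FROM alpha
        WHERE rownum > (:page - 1) * :page_size
          AND rownum <= :page * :page_size
    )
    SELECT a.ParentId, a.TimeStamp, a.Value
    FROM alpha AS a
    JOIN beta AS b ON a.rnk BETWEEN b.min_rnk AND b.max_rnk
    ORDER BY a.TimeStamp, a.ParentId
    """
    return connection.execute(
        query, {"page": page, "page_size": page_size}
    ).fetchall()


page1 = fetch_page(conn, 1, 4)
for row in page1:
    print(row)
```

With a page size of 4, the first page is widened to 6 rows so that every 00:00 and 01:00 row is included. Consecutive pages computed this way can share rows at their boundary, since each page is widened independently.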
EDIT:
An alternative query that assigns page numbers as it goes, so that next/previous page can be implemented without overlapping rows:
with Alpha as (
select ParentId, [TimeStamp], Value,
Rank() over ( order by [TimeStamp] ) as Rnk,
Row_Number() over ( order by [TimeStamp] ) as RowNum
from #Temp ),
Beta as (
select ParentId, [TimeStamp], Value, Rnk, RowNum, 1 as Page, 1 as PageRow
from Alpha
where RowNum = 1
union all
select A.ParentId, A.[TimeStamp], A.Value, A.Rnk, A.RowNum,
case when B.PageRow >= @PageSize and A.TimeStamp <> B.TimeStamp then B.Page + 1 else B.Page end,
case when B.PageRow >= @PageSize and A.TimeStamp <> B.TimeStamp then 1 else B.PageRow + 1 end
from Alpha as A inner join
Beta as B on B.RowNum + 1 = A.RowNum
)
select * from Beta
option ( MaxRecursion 0 )
Note that recursive CTEs often scale poorly.
I think your strategy of using row_number() and rank() is overcomplicating things.
Just pick the top 4 timestamps from the data. Then choose any timestamps that match those:
select *
from #temp
where [timestamp] in (select top 4 [timestamp] from #temp order by [TimeStamp])