Trying to Sum up Cross-Tab Data in SQL - sql

I have a table where every ID has one or more places, and each place comes with a count. Places can be repeated within IDs. It is stored in rows like so:
ID ColumnName DataValue
1 place1 ABC
1 count1 5
2 place1 BEC
2 count1 12
2 place2 CDE
2 count2 6
2 place3 BEC
2 count3 9
3 place1 BBC
3 count1 5
3 place2 BBC
3 count2 4
Ultimately, I want a table where every possible place name is its own column, and the count per place per ID is summed up, like so:
ID ABC BEC CDE BBC
1 5 0 0 0
2 0 21 6 0
3 0 0 0 9
I don't know the best way to go about this. There are around 50 different possible place names, so specifically listing them out in a query isn't ideal. I know I ultimately have to pivot the data, but I don't know if I should do it before or after I sum up the counts. And whether it's before or after, I haven't been able to figure out how to go about summing it up.
Any ideas/help would be greatly appreciated. At this point, I'm having a hard time finding where to even start. I've seen a few posts with similar problems, but nothing quite as convoluted as this.
EDIT:
Right now I'm working with this to pivot the table, but this leaves me with columns named place1, place2, .... count1, count2,...
and I don't know how to appropriately sum up the counts and make new columns with the place names.
DECLARE #cols NVARCHAR(MAX), #query NVARCHAR(MAX);
SET #cols = STUFF(
(
SELECT DISTINCT
','+QUOTENAME(c.[ColumnName])
FROM #temp c FOR XML PATH(''), TYPE
).value('.', 'nvarchar(max)'), 1, 1, '');
SET #query = 'SELECT [ID], '+#cols+'from (SELECT [ID],
[DataValue] AS [amount],
[ColumnName] AS [category]
FROM #temp
)x pivot (max(amount) for category in ('+#cols+')) p';
EXECUTE (#query);

Your table structure is pretty bad. You'll need to normalize your data before you can attempt to pivot it. Try this:
;WITH IDs AS
(
SELECT DISTINCT
id
,ColId = RIGHT(ColumnName, LEN(ColumnName) - 5)
,Place = datavalue
FROM #temp
WHERE ISNUMERIC(datavalue) = 0
)
,Counts AS
(
SELECT DISTINCT
id
,ColId = RIGHT(ColumnName, LEN(ColumnName) - 5)
,Cnt = CAST(datavalue AS INT)
FROM #temp
WHERE ISNUMERIC(datavalue) = 1
)
SELECT
piv.id
,ABC = ISNULL(piv.ABC, 0)
,BEC = ISNULL(piv.BEC, 0)
,CDE = ISNULL(piv.CDE, 0)
,BBC = ISNULL(piv.BBC, 0)
FROM (SELECT i.id, i.Place, c.Cnt FROM IDs i JOIN Counts c ON c.id = i.id AND c.ColId = i.ColId) src
PIVOT ( SUM(Cnt)
FOR Place IN ([ABC], [BEC], [CDE], [BBC])
) piv;
Doing it with dynamic SQL would yield the following:
SET #query =
';WITH IDs AS
(
SELECT DISTINCT
id
,ColId = RIGHT(ColumnName, LEN(ColumnName) - 5)
,Place = datavalue
FROM #temp
WHERE ISNUMERIC(datavalue) = 0
)
,Counts AS
(
SELECT DISTINCT
id
,ColId = RIGHT(ColumnName, LEN(ColumnName) - 5)
,Cnt = CAST(datavalue AS INT)
FROM #temp
WHERE ISNUMERIC(datavalue) = 1
)
SELECT [ID], '+#cols+'
FROM
(
SELECT i.id, i.Place, c.Cnt
FROM IDs i
JOIN Counts c ON c.id = i.id AND c.ColId = i.ColId
) src
PIVOT
(SUM(Cnt) FOR Place IN ('+#cols+')) piv;';
EXECUTE (#query);

Try this out:
SELECT id,
COALESCE(ABC, 0) AS ABC,
COALESCE(BBC, 0) AS BBC,
COALESCE(BEC, 0) AS BEC,
COALESCE(CDE, 0) AS CDE
FROM
(SELECT id,
MIN(CASE WHEN columnname LIKE 'place%' THEN datavalue END) AS col,
CAST(MIN(CASE WHEN columnname LIKE 'count%' THEN datavalue END) AS INT) AS val
FROM t
GROUP BY id, RIGHT(columnname, 1)
) src
PIVOT
(SUM(val)
FOR col in ([ABC], [BBC], [BEC], [CDE])) pvt
Tested here: http://rextester.com/XUTJ68690
In the src query, you need to re-format your data, so that you have a unique id and place in each row. From there a pivot will work.

If the count is always immediately after the place, the following query will generate a data set for pivoting.
The result data set before pivoting has the following columns:
id, placename, count
select placeTable.id, placeTable.datavalue, countTable.datavalue
from
(select *, row_number() over (order by id, %%physloc%%) as rownum
from test
where isnumeric(datavalue) = 1
) as countTable
join
(select *, row_number() over (order by id, %%physloc%%) as rownum
from test
where isnumeric(datavalue) <> 1
) as placeTable
on countTable.id = placeTable.id and
countTable.rownum = placeTable.rownum
Tested on sqlfidde mssqlserver: http://sqlfiddle.com/#!6/701c91/18

Here is one other approach using PIVOT operator with dynamic style
declare #Col varchar(2000) = '',
#Query varchar(2000) = ''
set #Col = stuff(
(select ','+QUOTENAME(DataValue)
from table where isnumeric(DataValue) = 0
group by DataValue for xml path('')),1,1,'')
set #Query = 'select id, '+#Col+' from
(
select id, DataValue,
cast((case when isnumeric(DataValue) = 1 then DataValue else lead(DataValue) over (order by id) end) as int) Value
from table
) as a
PIVOT
(
sum(Value) for DataValue in ('+#Col+')
)pvt'
EXECUTE (#Query)
Note : I have used lead() function to access next rows data if i found character string values and replace with numeric data values
Result :
id ABC BBC BEC CDE
1 5 NULL NULL NULL
2 NULL NULL 21 6
3 NULL 9 NULL NULL

Related

Convert Data in a Column to a row in SQL Server

Fairly new to SQL, so I do apologise!
Currently I have the following SQL Query:
select [data]
from Database1.dbo.tbl_Data d
join Database1.tbl_outbound o on d.session_id = o.session_id
where o.campaign_id = 1047
and d.session_id = 12
This returns ONE column which looks like this (and it can return different number of rows, depending on campaign_id and session_id!):
[data]
[1] Entry 1
[2] Entry 2
[3] Entry 3
[4] Entry 4
[5] Entry 5
.....
[98] Entry 98
[99] Entry 99
I would like to convert the data so they are displayed in 1 row and not 1 column, for example:
[data1] [data2] [data3] [data4] [data5] .... [data98] [data99]
[1] Entry 1 Entry 2 Entry 3 Entry 4 Entry 5 .... Entry 98 Entry 99
I hope I have explained that well enough! Thanks! :)
I have seen some information floating around about Pivot and Unpivot, but couldn't get it to play ball!
Try This Dynamic sql which helps your requirement
IF OBJECT_ID('tempdb..#Temp')IS NOT NULL
DROP TABLE #Temp
CREATE TABLE #Temp (data VARCHAR(100))
GO
IF OBJECT_ID('tempdb..#FormatedTable')IS NOT NULL
DROP TABLE #FormatedTable
Go
INSERT INTO #Temp(data)
SELECT 'Entry1' UNION ALL
SELECT 'Entry2' UNION ALL
SELECT 'Entry3' UNION ALL
SELECT 'Entry4' UNION ALL
SELECT 'Entry5'
SELECT ROW_NUMBER()OVER(ORDER BY Data) AS SeqId,
Data,
'Data'+CAST(ROW_NUMBER()OVER(ORDER BY Data) AS VARCHAR(100)) AS ReqColumn
INTO #FormatedTable
FROM #Temp
DECLARE #Sql nvarchar(max),
#DynamicColumn nvarchar(max),
#MaxDynamicColumn nvarchar(max)
SELECT #DynamicColumn = STUFF((SELECT ', '+QUOTENAME(ReqColumn)
FROM #FormatedTable FOR XML PATH ('')),1,1,'')
SELECT #MaxDynamicColumn = STUFF((SELECT ', '+'MAX('+(ReqColumn)+') AS '+QUOTENAME(CAST(ReqColumn AS VARCHAR(100)))
FROM #FormatedTable FOR XML PATH ('')),1,1,'')
SET #Sql=' SELECT ROW_NUMBER()OVER(ORDER BY (SELECT 1)) AS SeqId, '+ #MaxDynamicColumn+'
FROM
(
SELECT * FROM #FormatedTable
) AS src
PIVOT
(
MAX(Data) FOR [ReqColumn] IN ('+#DynamicColumn+')
) AS Pvt
'
EXEC (#Sql)
PRINT #Sql
Result
SeqId Data1 Data2 Data3 Data4 Data5
----------------------------------------------
1 Entry1 Entry2 Entry3 Entry4 Entry5
There is no really simple way. You can use pivot or conditional aggregation. I prefer the latter:
select max(case when left(data, 3) = '[1]' then data end) as data_001,
max(case when left(data, 3) = '[2]' then data end) as data_002,
max(case when left(data, 5) = '[100]' then data end) as data_100
from Database1.dbo.tbl_Data d join
Database1.tbl_outbound o
on d.session_id = o.session_id
where o.campaign_id = 1047 and d.session_id = 12;
Note that the columns are fixed, so you will always have 100 columns, regardless of the number of actual values in the data.
If you need a flexible number of columns, then you need dynamic pivoting, which requires constructing the query as a string and then executing the string.
The easiest way to do that is to utilize SQLCLR.
Check out the solution and explanation on An Easier Way of Transposing Query Result in SQL Server

Uneven pivot without aggregation (filling in the blanks)

I have the following table
Header: user, status, value
row1: u1,A,3
row2: u1,B,5
row3: u1,B,2
row4, u2,A,4
row5: u2,C,8
and I want the output to be a crosstab with NULL if there are not sufficient values from one user to another. In the example the output would be:
Header: status, u1, u2
row1: A,3,4
row2: B, 5, NULL
row3: B, 2, NULL
row4: C, NULL, 8
(I am using SQL Server 2016.)
Not clear if you needed Dynamic (i.e. user columns). Small matter if needed.
Example
Select [Status]
,u1 = max(case when [user]='u1' then Value end)
,u2 = max(case when [user]='u2' then Value end)
From (
Select *
,Grp = Value - Row_Number() over (Partition By [Status] Order by Value)
From YourTable
) A
Group By [Status],[Grp]
Order By 1,2,3
Returns
Status u1 u2
A 3 4
B 2 NULL
B 5 NULL
C NULL 8
EDIT - Dynamic Approach
Declare #SQL varchar(max) = Stuff((Select Distinct ',' + QuoteName([User]) From Yourtable Order by 1 For XML Path('')),1,1,'')
Select #SQL = '
Select [Status],' + #SQL + '
From (
Select *
,Grp = Value - Row_Number() over (Partition By [Status] Order by Value)
From YourTable
) A
Pivot (max([Value]) For [User] in (' + #SQL + ') ) p
Order By 1,2,3
'
Exec(#SQL);

Two dimensional rank using T-SQL

This is the data I'm dealing with:
I would like to find a way, in sql, of adding numbers to the yellow column which will rank the Names in such a way that I get the following.
note: This is the final pivoted result - in the sql table there is no need to pivot the data.
This ranking is decided via these rules:
The most recent week (ie Wk5 column) is the most important.
The next most recent week is next most important.
...so on to the left with the oldest week column "WK1" being the least important.
A data value that is small e.g. 1, is best. A data value that is high e.g. 7, is not good. A blank space is the worst and if at all possible should be located near the bottom of the page - but rules 1/2/3 always take precedence.
This is the data with a placeholder of 0 in the column Idx:
CREATE TABLE #values
(
Name varchar(5),
Idx int,
"Week" varchar(5),
Amount int
);
INSERT INTO #values
VALUES
('A',0,'WK1',3),
('T',0,'WK1',2),
('H',0,'WK1',1),
('P',0,'WK1',4),
('V',0,'WK1',6),
('N',0,'WK1',5),
('A',0,'WK2',2),
('F',0,'WK2',1),
('K',0,'WK2',3),
('P',0,'WK2',4),
('W',0,'WK2',7),
('V',0,'WK2',5),
('B',0,'WK2',6),
('A',0,'WK3',1),
('F',0,'WK3',2),
('T',0,'WK3',3),
('K',0,'WK3',4),
('W',0,'WK3',5),
('V',0,'WK3',6),
('N',0,'WK3',7),
('A',0,'WK4',2),
('F',0,'WK4',1),
('T',0,'WK4',5),
('K',0,'WK4',4),
('B',0,'WK4',6),
('A',0,'WK5',1),
('F',0,'WK5',2),
('T',0,'WK5',3),
('H',0,'WK5',4),
('K',0,'WK5',5);
This is my current attempt:
WITH
allData AS
(
SELECT Name,
"Week",
newRank = RANK() OVER (ORDER BY "Week" DESC,Amount)
FROM #values
)
,allData2 AS
(
SELECT *,
newRank2 = 1 / CONVERT(NUMERIC(18,10),newRank)
FROM allData
)
,allData3 AS
(
SELECT Name,
smRank = SUM(newRank2)
FROM allData2
GROUP BY Name
)
SELECT Name,
smRank,
rnk = RANK() OVER (ORDER BY smRank DESC)
INTO #RankA
FROM allData3;
UPDATE X
SET X.Idx = Y.rnk
FROM #values X
INNER JOIN #RankA Y ON
X.Name = Y.Name;
Unfortunately if I pivot the results, and then order by the Idx column it is not in the order I am aiming at.
This is based on two nested ROW_NUMBERs:
select *,
row_number()
over (order by "Week" desc, amount)
from
(
select *,
row_number()
over (partition by name
order by "Week" desc, amount) as rn
from #values
) as dt
where rn = 1 -- for each name find the latest week and it's lowest number
What if two names share the same week/amount? You might consider RANK or DENSE_RANK instead.
Using your #values table, here is how to pivot it (since the data you provided was not in the same table format) and then assign a value to the index based on your requirements.
select *
, ROW_NUMBER() OVER(ORDER BY CASE WHEN wk5 IS NULL THEN 1 ELSE 0 END, wk5, CASE WHEN wk4 IS NULL THEN 1 ELSE 0 END, wk4, CASE WHEN wk3 IS NULL THEN 1 ELSE 0 END,wk3, CASE WHEN wk2 IS NULL THEN 1 ELSE 0 END,wk2, CASE WHEN wk1 IS NULL THEN 1 ELSE 0 END, wk1) AS new_index
from (
select * from #values
) p
PIVOT (
MAX(Amount)
FOR [week] IN (wk1, wk2, wk3, wk4, wk5)) AS pvt
USING DYNAMIC FOR 52 WEEKS
DECLARE #COLS AS NVARCHAR(MAX),
#QUERY AS NVARCHAR(MAX)
SELECT #COLS = STUFF(( SELECT distinct ','+QUOTENAME(C.[week])
FROM #values AS C
FOR XML PATH('')), 1, 1, '')
SET #QUERY = '
select *
, ROW_NUMBER() OVER(ORDER BY CASE WHEN wk5 IS NULL THEN 1 ELSE 0 END, wk5, CASE WHEN wk4 IS NULL THEN 1 ELSE 0 END, wk4, CASE WHEN wk3 IS NULL THEN 1 ELSE 0 END,wk3, CASE WHEN wk2 IS NULL THEN 1 ELSE 0 END,wk2, CASE WHEN wk1 IS NULL THEN 1 ELSE 0 END, wk1) AS new_index
from (
select * from #values
) p
PIVOT (
MAX(Amount)
FOR [week] IN (' + #cols+ ')) AS pvt'
EXEC(#QUERY)

Dynamic SELECT statement, generate columns based on present and future values

Currently building a SELECT statement in SQL Server 2008 but would like to make this SELECT statement dynamic, so the columns can be defined based on values in a table. I heard about pivot table and cursors, but seems kind of hard to understand at my current level, here is the code;
DECLARE #date DATE = null
IF #date is null
set # date = GETDATE() as DATE
SELECT
Name,
value1,
value2,
value3,
value4
FROM ref_Table a
FULL OUTER JOIN (
SELECT
PK_ID ID,
sum(case when FK_ContainerType_ID = 1 then 1 else null) Box,
sum(case when FK_ContainerType_ID = 2 then 1 else null) Pallet,
sum(case when FK_ContainerType_ID = 3 then 1 else null) Bag,
sum(case when FK_ContainerType_ID = 4 then 1 else null) Drum
from
Packages
WHERE
#date between PackageStart AND PackageEnd
group by PK_ID ) b on a.Name = b.ID
where
Group = 0
The following works great for me , but PK_Type_ID and the name of the column(PackageNameX,..) are hard coded, I need to be dynamic and it can build itself based on present or futures values in the Package table.
Any help or guidance on the right direction would be greatly appreciated...,
As requested
ref_Table (PK_ID, Name)
1, John
2, Mary
3, Albert
4, Jane
Packages (PK_ID, FK_ref_Table_ID, FK_ContainerType_ID, PackageStartDate, PackageEndDate)
1 , 1, 4, 1JAN2014, 30JAN2014
2 , 2, 3, 1JAN2014, 30JAN2014
3 , 3, 2, 1JAN2014, 30JAN2014
4 , 4, 1, 1JAN2014, 30JAN2014
ContainerType (PK_ID, Type)
1, Box
2, Pallet
3, Bag
4, Drum
and the result should look like this;
Name Box Pallet Bag Drum
---------------------------------------
John 1
Mary 1
Albert 1
Jane 1
The following code like I said works great, the issue is the Container table is going to grow and I need to replicated the same report without hard coding the columns.
What you need to build is called a dynamic pivot. There are plenty of good references on Stack if you search out that term.
Here is a solution to your scenario:
IF OBJECT_ID('tempdb..##ref_Table') IS NOT NULL
DROP TABLE ##ref_Table
IF OBJECT_ID('tempdb..##Packages') IS NOT NULL
DROP TABLE ##Packages
IF OBJECT_ID('tempdb..##ContainerType') IS NOT NULL
DROP TABLE ##ContainerType
SET NOCOUNT ON
CREATE TABLE ##ref_Table (PK_ID INT, NAME NVARCHAR(50))
CREATE TABLE ##Packages (PK_ID INT, FK_ref_Table_ID INT, FK_ContainerType_ID INT, PackageStartDate DATE, PackageEndDate DATE)
CREATE TABLE ##ContainerType (PK_ID INT, [Type] NVARCHAR(50))
INSERT INTO ##ref_Table (PK_ID,NAME)
SELECT 1,'John' UNION
SELECT 2,'Mary' UNION
SELECT 3,'Albert' UNION
SELECT 4,'Jane'
INSERT INTO ##Packages (PK_ID, FK_ref_Table_ID, FK_ContainerType_ID, PackageStartDate, PackageEndDate)
SELECT 1,1,4,'2014-01-01','2014-01-30' UNION
SELECT 2,2,3,'2014-01-01','2014-01-30' UNION
SELECT 3,3,2,'2014-01-01','2014-01-30' UNION
SELECT 4,4,1,'2014-01-01','2014-01-30'
INSERT INTO ##ContainerType (PK_ID, [Type])
SELECT 1,'Box' UNION
SELECT 2,'Pallet' UNION
SELECT 3,'Bag' UNION
SELECT 4,'Drum'
DECLARE #DATE DATE, #PARAMDEF NVARCHAR(MAX), #COLS NVARCHAR(MAX), #SQL NVARCHAR(MAX)
SET #DATE = '2014-01-15'
SET #COLS = STUFF((SELECT DISTINCT ',' + QUOTENAME(T.[Type])
FROM ##ContainerType T
FOR XML PATH, TYPE).value('.', 'NVARCHAR(MAX)'),1,1,'')
SET #SQL = 'SELECT [Name], ' + #COLS + '
FROM (SELECT [Name], [Type], 1 AS Value
FROM ##ref_Table R
JOIN ##Packages P ON R.PK_ID = P.FK_ref_Table_ID
JOIN ##ContainerType T ON P.FK_ContainerType_ID = T.PK_ID
WHERE #DATE BETWEEN P.PackageStartDate AND P.PackageEndDate) X
PIVOT (COUNT(Value) FOR [Type] IN (' + #COLS + ')) P
'
PRINT #COLS
PRINT #SQL
SET #PARAMDEF = '#DATE DATE'
EXEC SP_EXECUTESQL #SQL, #PARAMDEF, #DATE=#DATE
Output:
Name Bag Box Drum Pallet
Albert 0 0 0 1
Jane 0 1 0 0
John 0 0 1 0
Mary 1 0 0 0
Static Query:
SELECT [Name],[Box],[Pallet],[Bag],[Drum] FROM
(
SELECT *
FROM
(
SELECT rf.Name,cnt.[Type], pk.PK_ID AS PKID, rf.PK_ID AS RFID
FROM ref_Table rf INNER JOIN Packages pk ON rf.PK_ID = pk.FK_ref_Table_ID
INNER JOIN ContanerType cnt ON cnt.PK_ID = pk.FK_ContainerType_ID
) AS SourceTable
PIVOT
(
COUNT(PKID )
FOR [Type]
IN ( [Box],[Pallet],[Bag],[Drum])
) AS PivotTable
) AS Main
ORDER BY RFID
Dynamic Query:
DECLARE #columnList nvarchar (MAX)
DECLARE #pivotsql nvarchar (MAX)
SELECT #columnList = STUFF(
(
SELECT ',' + '[' + [Type] + ']'
FROM ContanerType
FOR XML PATH( '')
)
,1, 1,'' )
SET #pivotsql =
N'SELECT [Name],' + #columnList + ' FROM
(
SELECT *
FROM
(
SELECT rf.Name,cnt.[Type], pk.PK_ID AS PKID, rf.PK_ID AS RFID
FROM ref_Table rf INNER JOIN Packages pk ON rf.PK_ID = pk.FK_ref_Table_ID
INNER JOIN ContanerType cnt ON cnt.PK_ID = pk.FK_ContainerType_ID
) AS SourceTable
PIVOT
(
COUNT(PKID )
FOR [Type]
IN ( ' + #columnList + ')
) AS PivotTable
) AS Main
ORDER BY RFID;'
EXEC sp_executesql #pivotsql
Following my tutorial below will help you to understand the PIVOT functionality:
We write sql queries in order to get different result sets like full, partial, calculated, grouped, sorted etc from the database tables. However sometimes we have requirements that we have to rotate our tables. Sounds confusing?
Let's keep it simple and consider the following two screen grabs.
SQL Table:
Expected Results:
Wow, that's look like a lot of work! That is a combination of tricky sql, temporary tables, loops, aggregation......, blah blah blah
Don't worry let's keep it simple, stupid(KISS).
MS SQL Server 2005 and above has a function called PIVOT. It s very simple to use and powerful. With the help of this function we will be able to rotate sql tables and result sets.
Simple steps to make it happen:
Identify all the columns those will be part of the desired result set.
Find the column on which we will apply aggregation(sum,ave,max,min etc)
Identify the column which values will be the column header.
Specify the column values mentioned in step3 with comma separated and surrounded by square brackets.
So, if we now follow above four steps and extract information from the above sales table, it will be as below:
Year, Month, SalesAmount
SalesAmount
Month
[Jan],[Feb] ,[Mar] .... etc
We are nearly there if all the above steps made sense to you so far.
Now we have all the information we need. All we have to do now is to fill the below template with required information.
Template:
Our SQL query should look like below:
SELECT *
FROM
(
SELECT SalesYear, SalesMonth,Amount
FROM Sales
) AS SourceTable
PIVOT
(
SUM(Amount )
FOR SalesMonth
IN ( [Jan],[Feb] ,[Mar],
[Apr],[May],[Jun] ,[Jul],
[Aug],[Sep] ,[Oct],[Nov] ,[Dec])
) AS PivotTable;
In the above query we have hard coded the column names. Well it's not fun when you have to specify a number of columns.
However, there is a work arround as follows:
DECLARE #columnList nvarchar (MAX)
DECLARE #pivotsql nvarchar (MAX)
SELECT #columnList = STUFF(
(
SELECT ',' + '[' + SalesMonth + ']'
FROM Sales
GROUP BY SalesMonth
FOR XML PATH( '')
)
,1, 1,'' )
SET #pivotsql =
N'SELECT *
FROM
(
SELECT SalesYear, SalesMonth,Amount
FROM Sales
) AS SourceTable
PIVOT
(
SUM(Amount )
FOR SalesMonth
IN ( ' + #columnList +' )
) AS PivotTable;'
EXEC sp_executesql #pivotsql
Hopefully this tutorial will be a help to someone somewhere.
Enjoy coding.

How can I query row data as columns?

I'm sure I'm missing something here.
I have a dataset like this:
FK RowNumber Value Type Status
1 1 aaaaa A New
1 2 bbbbb B Good
1 3 ccccc A Bad
1 4 ddddd C Good
1 5 eeeee B Good
2 1 fffff C Bad
2 2 ggggg A New
2 3 hhhhh C Bad
3 1 iiiii A Good
3 2 jjjjj A Good
I'd like to query the top 3 results and Pivot them as columns, so the end result set looks like this:
FK Value1 Type1 Status1 Value2 Type2 Status2 Value3 Type3 Status3
1 aaaaa A New bbbbb B Good ccccc A Bad
2 fffff C Bad ggggg A New hhhhh C Bad
3 iiiii A Good jjjjj A Good
How can I accomplish this in SQL Server 2005?
I have been attempting this using PIVOT, but I am still very unfamiliar with that keyword and cannot get it to work the way I want.
SELECT * --Id, [1], [2], [3]
FROM
(
SELECT Id, Value, Type, Status
, ROW_NUMBER() OVER (PARTITION BY Id ORDER Status, Type) as [RowNumber]
FROM MyTable
) as T
PIVOT
(
-- I know this section doesn't work. I'm still trying to figure out PIVOT
MAX(T.Value) FOR RowNumber IN ([1], [2], [3]),
MAX(T.Type) FOR RowNumber IN ([1], [2], [3]),
MAX(T.Status) FOR RowNumber IN ([1], [2], [3])
) AS PivotTable;
My actual data set is a bit more complex than this, and I need the top 10 records, not the top 3, so I don't want to simply do CASE WHEN RowNumber = X THEN... for each one.
Update
I tested all the answers below, and found most of them seem about the same with no apparent performance difference in smaller data sets (around 3k records), however there was a slight difference when running the queries against larger data sets.
Here are the results of my tests using 80,000 records and querying for 5 columns in the top 10 rows, so my end result set was 50 columns + the Id column. I'd suggest you test them on your own to decide which one works best for you and your environment.
bluefoot's answer of unpivoting and re-pivoting the data averaged the fastest at about 12 seconds. I also liked this answer because I found it easiest to read and maintain.
Aaron's answer and koderoid's answer both suggest using a MAX(CASE WHEN RowNumber = X THEN ...), and was close behind averaging at around 13 seconds.
Rodney's answer of using multiple PIVOT statements averaged around 16 seconds, although it might be faster with fewer PIVOT statements (my tests had 5).
And the first half of Aaron's answer that suggested using a CTE and OUTER APPLY was the slowest. I don't know how long it would take to run because I cancelled it after 2 minutes, and that was with around 3k records, 3 rows, and 3 columns instead of 80k records, 10 rows, and 5 columns.
You can do an UNPIVOT and then a PIVOT of the data. this can be done either statically or dynamically:
Static Version:
select *
from
(
select fk, col + cast(rownumber as varchar(1)) new_col,
val
from
(
select fk, rownumber, value, cast(type as varchar(10)) type,
status
from yourtable
) x
unpivot
(
val
for col in (value, type, status)
) u
) x1
pivot
(
max(val)
for new_col in
([value1], [type1], [status1],
[value2], [type2], [status2],
[value3], [type3])
) p
see SQL Fiddle with demo
Dynamic Version, this will get the list of columns to unpivot and then to pivot at run-time:
DECLARE #colsUnpivot AS NVARCHAR(MAX),
#query AS NVARCHAR(MAX),
#colsPivot as NVARCHAR(MAX)
select #colsUnpivot = stuff((select ','+quotename(C.name)
from sys.columns as C
where C.object_id = object_id('yourtable') and
C.name not in ('fk', 'rownumber')
for xml path('')), 1, 1, '')
select #colsPivot = STUFF((SELECT ','
+ quotename(c.name
+ cast(t.rownumber as varchar(10)))
from yourtable t
cross apply
sys.columns as C
where C.object_id = object_id('yourtable') and
C.name not in ('fk', 'rownumber')
group by c.name, t.rownumber
order by t.rownumber
FOR XML PATH(''), TYPE
).value('.', 'NVARCHAR(MAX)')
,1,1,'')
set #query
= 'select *
from
(
select fk, col + cast(rownumber as varchar(10)) new_col,
val
from
(
select fk, rownumber, value, cast(type as varchar(10)) type,
status
from yourtable
) x
unpivot
(
val
for col in ('+ #colsunpivot +')
) u
) x1
pivot
(
max(val)
for new_col in
('+ #colspivot +')
) p'
exec(#query)
see SQL Fiddle with Demo
Both will generate the same results, however the dynamic is great if you do not know the number of columns ahead of time.
The Dynamic version is working under the assumption that the rownumber is already a part of the dataset.
You can try to do the pivot in three separate pivot statements. Please give this a try:
SELECT Id
,MAX(S1) [Status 1]
,MAX(T1) [Type1]
,MAX(V1) [Value1]
--, Add other columns
FROM
(
SELECT Id, Value , Type, Status
, 'S' + CAST(ROW_NUMBER() OVER (PARTITION BY Id ORDER BY Status, Type) AS VARCHAR(10)) [Status_RowNumber]
, 'T' + CAST(ROW_NUMBER() OVER (PARTITION BY Id ORDER BY Status, Type) AS VARCHAR(10)) [Type_RowNumber]
, 'V' + CAST(ROW_NUMBER() OVER (PARTITION BY Id ORDER BY Status, Type) AS VARCHAR(10)) [Value_RowNumber]
FROM MyTable
) as T
PIVOT
(
MAX(Status) FOR Status_RowNumber IN ([S1], [S2], [S3],[S4],[S5],[S6],[S7],[S8],[S9],[S10])
)AS StatusPivot
PIVOT(
MAX(Type) FOR Type_RowNumber IN ([T1], [T2], [T3],[T4],[T5],[T6],[T7],[T8],[T9],[T10])
)AS Type_Pivot
PIVOT(
MAX(Value) FOR Value_RowNumber IN ([V1], [V2], [V3],[V4],[V5],[V6],[V7],[V8],[V9],[V10])
)AS Value_Pivot
GROUP BY Id
I don't know the full scope of the criteria for selecting the top ten records, but this produces and output that may get you closer to your answer.
SQL Fiddle Example
Rodney's muli-pivot is clever, that's for sure. Here are two other alternatives that are of course less appealing when you get into the 10X vs. 3X area.
;WITH a AS
(
SELECT Id, Value, Type, Status,
n = ROW_NUMBER() OVER (PARTITION BY Id ORDER BY [Status], [Type])
FROM dbo.MyTable
)
SELECT a.Id,
Value1 = a.Value, Type1 = a.[Type], Status1 = a.[Status],
Value2 = b.Value, Type2 = b.[Type], Status2 = b.[Status],
Value3 = c.Value, Type3 = c.[Type], Status3 = c.[Status]
FROM a
OUTER APPLY (SELECT * FROM a AS T2 WHERE n = a.n + 1 AND id = a.id) AS b
OUTER APPLY (SELECT * FROM a AS T2 WHERE n = b.n + 1 AND id = b.id) AS c
WHERE a.n = 1
ORDER BY a.Id;
-- or --
;WITH a AS
(
SELECT Id, Value, [Type], [Status],
n = ROW_NUMBER() OVER (PARTITION BY Id ORDER BY [Status], [Type])
FROM dbo.MyTable
)
SELECT Id,
Value1 = MAX(CASE WHEN n = 1 THEN Value END),
Type1 = MAX(CASE WHEN n = 1 THEN [Type] END),
Status1 = MAX(CASE WHEN n = 1 THEN [Status] END),
Value2 = MAX(CASE WHEN n = 2 THEN Value END),
Type2 = MAX(CASE WHEN n = 2 THEN [Type] END),
Status2 = MAX(CASE WHEN n = 2 THEN [Status] END),
Value3 = MAX(CASE WHEN n = 3 THEN Value END),
Type3 = MAX(CASE WHEN n = 3 THEN [Type] END),
Status3 = MAX(CASE WHEN n = 3 THEN [Status] END)
FROM a
GROUP BY Id
ORDER BY a.Id;
This might work for you, though it's not elegant.
select aa.FK_Id
, isnull(max(aa.Value1), '') as Value1
, isnull(max(aa.Type1), '') as Type1
, isnull(max(aa.Status1), '') as Status1
, isnull(max(aa.Value2), '') as Value2
, isnull(max(aa.Type2), '') as Type2
, isnull(max(aa.Status2), '') as Status2
, isnull(max(aa.Value3), '') as Value3
, isnull(max(aa.Type3), '') as Type3
, isnull(max(aa.Status3), '') as Status3
from
(
select FK_Id
, case when RowNumber = 1 then Value else null end as Value1
, case when RowNumber = 1 then [Type] else null end as Type1
, case when RowNumber = 1 then [Status] else null end as Status1
, case when RowNumber = 2 then Value else null end as Value2
, case when RowNumber = 2 then [Type] else null end as Type2
, case when RowNumber = 2 then [Status] else null end as Status2
, case when RowNumber = 3 then Value else null end as Value3
, case when RowNumber = 3 then [Type] else null end as Type3
, case when RowNumber = 3 then [Status] else null end as Status3
from Table1
) aa
group by aa.FK_Id
try something like this:
declare #rowCount int
set #rowCount = 10
declare #isNullClause varchar(4024)
set #isnullClause = ''
declare #caseClause varchar(4024)
set #caseClause = ''
declare #i int
set #i = 1
while(#i <= #rowCount) begin
set #isnullClause = #isNullClause +
' , max(aa.Value' + CAST(#i as varchar(3)) + ') as Value' + CAST(#i as varchar(3)) +
' , max(aa.Type' + CAST(#i as varchar(3)) + ') as Type' + CAST(#i as varchar(3)) +
' , max(aa.Status' + CAST(#i as varchar(3)) + ') as Status' + CAST(#i as varchar(3)) + ' ';
set #caseClause = #caseClause +
' , case when RowNumber = ' + CAST(#i as varchar(3)) + ' then Value else null end as Value' + CAST(#i as varchar(3)) +
' , case when RowNumber = ' + CAST(#i as varchar(3)) + ' then Type else null end as Type' + CAST(#i as varchar(3)) +
' , case when RowNumber = ' + CAST(#i as varchar(3)) + ' then Status else null end as Status' + CAST(#i as varchar(3)) + ' '
set #i = #i + 1;
end
declare #sql nvarchar(4000)
set #sql = 'select aa.FK_Id ' + #isnullClause + ' from ( select FK_Id '
+ #caseClause + ' from Table1) aa group by aa.FK_Id '
exec SP_EXECUTESQL #sql