Dynamic Pivot in MS SQL Server - sql

I am trying to do a dynamic pivot on the last two columns that I take from one table and join onto the contents of another table. I need the Name values to become the column headers and the Value values to fill in underneath them. This is my current query:
USE Innovate
DECLARE @DynamicPivotQuery AS NVARCHAR(MAX)
DECLARE @ColumnName AS NVARCHAR(MAX)
--Get distinct values of the PIVOT Column
SELECT @ColumnName= ISNULL(@ColumnName + ',','')
+ QUOTENAME(NAME)
FROM (SELECT DISTINCT NAME FROM Innovate.dbo.Table1 WHERE Name IS NOT NULL) AS ATTRIBUTE_NAME
WHERE Name LIKE 'Suture_-_2nd_Needle_Code'
OR Name LIKE 'Suture_-_Absorbable'
OR Name LIKE 'Suture_-_Antibacterial'
OR Name LIKE 'Suture_-_Armed'
OR Name LIKE 'Suture_-_Barbed'
OR Name LIKE 'Suture_-_Brand_Name'
OR Name LIKE 'Suture_-_C/R_2nd_Needle_Code'
OR Name LIKE 'Suture_-_C/R_Brand_Name'
OR Name LIKE 'Suture_-_C/R_length'
OR Name LIKE 'Suture_-_C/R_Needle_Code'
OR Name LIKE 'Suture_-_Coating'
OR Name LIKE 'Suture_-_Dyed'
OR Name LIKE 'Suture_-_Filament'
OR Name LIKE 'Suture_-_length_inches'
OR Name LIKE 'Suture_-_Looped'
OR Name LIKE 'Suture_-_Material'
OR Name LIKE 'Suture_-_Needle_Code'
OR Name LIKE 'Suture_-_Needle_Shape'
OR Name LIKE 'Suture_-_Needle_Style'
OR Name LIKE 'Suture_-_Noun'
OR Name LIKE 'Suture_-_pleget'
OR Name LIKE 'Suture_-_Popoff'
OR Name LIKE 'Suture_-_Suture_count'
OR Name LIKE 'Suture_-_Suture_size'
--Prepare the PIVOT query using the dynamic
SET @DynamicPivotQuery =
N'SELECT Table1.Primary_Key, Company_Name, Part_Number, Product_Desc, Innovate_Description, ' + @ColumnName + '
FROM Table1 AS P
LEFT JOIN Table2 AS A ON P.Primary_Key = A.Primary_Key
PIVOT(MAX(A.VALUE)
FOR A.NAME IN (' + @ColumnName + ')) AS PVTTable'
--Execute the Dynamic Pivot Query
EXECUTE sp_executesql @DynamicPivotQuery
;
And this is the Result for the Query that I keep getting: Msg 8156,
Level 16, State 1, Line 5 The column 'Primary_Key' was specified
multiple times for 'PVTTable'. Msg 4104, Level 16, State 1, Line 1 The
multi-part identifier "Table1.Primary_Key" could not be bound.
Can anyone help me pivot these columns without the error message? I only specified the Primary_Key in the code once so I do not know how I specified it multiple times and how it is unbound.

Try the script below:
SET @DynamicPivotQuery =
N'SELECT P.Primary_Key, Company_Name, Part_Number, Product_Desc, Innovate_Description, ' + @ColumnName + '
FROM Table1 AS P
LEFT JOIN Table2 AS A ON P.Primary_Key = A.Primary_Key
PIVOT(MAX(A.VALUE)
FOR A.NAME IN (' + @ColumnName + ')) AS PVTTable'
--Execute the Dynamic Pivot Query
EXECUTE sp_executesql @DynamicPivotQuery

You have a couple of problems with your PIVOT statement. First, you are referencing the table aliases for Table1 and Table2, but those aliases are not available in the final SELECT of a PIVOT. The PIVOT operator acts like an outer select, so the only alias available afterwards is the pivot alias.
Next, the PIVOT documentation states: "You can use the PIVOT and UNPIVOT relational operators to change a table-valued expression into another table" (https://technet.microsoft.com/en-us/library/ms177410(v=sql.105).aspx). Basically, that means you need to pass PIVOT a single table expression containing only the uniquely named columns you want involved in the pivot. The latter part is likely the issue: Table1 and Table2 probably both have a Primary_Key column, so PIVOT sees the same column name twice.
To fix this, you can either move the join of Table1 and Table2 into an inner select and alias it, or build a CTE and use that in your command. Here is the former way:
SET @DynamicPivotQuery =
N'SELECT * FROM
(
SELECT P.Primary_Key, Company_Name, Part_Number, Product_Desc, Innovate_Description, A.NAME, A.VALUE
FROM Table1 AS P
LEFT JOIN Table2 AS A ON P.Primary_Key = A.Primary_Key
) t
PIVOT(MAX(VALUE)
FOR NAME IN (' + @ColumnName + ')) AS PVTTable'
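To see the "single table expression" idea in isolation, here is a minimal static sketch. The temp tables #Product and #Attribute and their sample rows are made up for illustration and are not the asker's schema. The point is only that the join happens inside a derived table that exposes each column name once, and that PIVOT is applied to that derived table.

```sql
-- Hypothetical sample data; table and column names are illustrative only.
CREATE TABLE #Product (Primary_Key int, Part_Number varchar(20));
CREATE TABLE #Attribute (Primary_Key int, Name varchar(50), Value varchar(50));

INSERT INTO #Product VALUES (1, 'A-100'), (2, 'B-200');
INSERT INTO #Attribute VALUES
    (1, 'Suture_-_Dyed', 'Yes'),
    (1, 'Suture_-_Armed', 'No'),
    (2, 'Suture_-_Dyed', 'No');

SELECT PVT.Primary_Key, PVT.Part_Number, PVT.[Suture_-_Armed], PVT.[Suture_-_Dyed]
FROM (
    -- The derived table exposes Primary_Key exactly once,
    -- plus the two columns that PIVOT consumes (Name and Value).
    SELECT P.Primary_Key, P.Part_Number, A.Name, A.Value
    FROM #Product AS P
    LEFT JOIN #Attribute AS A ON A.Primary_Key = P.Primary_Key
) AS src
PIVOT (MAX(Value) FOR Name IN ([Suture_-_Armed], [Suture_-_Dyed])) AS PVT
ORDER BY PVT.Primary_Key;   -- only the pivot alias is visible out here

DROP TABLE #Product, #Attribute;
```

With this sample data the result should have one row per product, with NULL where a product has no value for an attribute. The dynamic version only differs in that the IN list is concatenated into the statement text.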

Those Name values are hardcoded when calculating the @ColumnName variable.
If the values are hardcoded anyway, you might as well run the pivot with a join directly, without building a SQL statement to execute.
SELECT A.Company_Name, A.Part_Number, A.Product_Desc, A.Innovate_Description, P.*
FROM (select Primary_Key, Name, Value from Innovate.dbo.Table1) T1
PIVOT(MAX(VALUE) FOR NAME IN (
[Suture_-_2nd_Needle_Code],
[Suture_-_Absorbable],
[Suture_-_Antibacterial],
[Suture_-_Armed],
[Suture_-_Barbed],
[Suture_-_Brand_Name],
[Suture_-_C/R_2nd_Needle_Code],
[Suture_-_C/R_Brand_Name],
[Suture_-_C/R_length],
[Suture_-_C/R_Needle_Code],
[Suture_-_Coating],
[Suture_-_Dyed],
[Suture_-_Filament],
[Suture_-_length_inches],
[Suture_-_Looped],
[Suture_-_Material],
[Suture_-_Needle_Code],
[Suture_-_Needle_Shape],
[Suture_-_Needle_Style],
[Suture_-_Noun],
[Suture_-_pleget],
[Suture_-_Popoff],
[Suture_-_Suture_count],
[Suture_-_Suture_size]
)
) P
LEFT JOIN Innovate.dbo.Table2 AS A ON (A.Primary_Key = P.Primary_Key);
Admittedly, this has a disadvantage: if [Primary_Key] needs to be the first column, the P.* has to be replaced by the literal column names. Or use the EXEC approach after all.
Anyway, the name list for the @ColumnName variable can be built without all the ORs:
DECLARE @ColumnName NVARCHAR(MAX);
--Get distinct values of the PIVOT Column
SELECT @ColumnName = ISNULL(@ColumnName + ',','') + QUOTENAME(NAME)
FROM Innovate.dbo.Table1
WHERE Name Like 'Suture_-_%'
AND SUBSTRING(Name,10,30) IN (
'2nd_Needle_Code',
'Absorbable',
'Antibacterial',
'Armed',
'Barbed',
'Brand_Name',
'C/R_2nd_Needle_Code',
'C/R_Brand_Name',
'C/R_length',
'C/R_Needle_Code',
'Coating',
'Dyed',
'Filament',
'length_inches',
'Looped',
'Material',
'Needle_Code',
'Needle_Shape',
'Needle_Style',
'Noun',
'pleget',
'Popoff',
'Suture_count',
'Suture_size')
GROUP BY Name;

Related

SQL Sort / Order By pivoted fields while COALESCE function

I have some rates for resources for all countries
The rows will be Resource IDs
Columns should be Country Codes
The challenge is that I cannot sort the country codes in ascending order.
I would be grateful for any help with this.
When I run the query, I get the list of country codes, but not sorted, e.g. USA,BRA,ARG. The expected result is ARG,BRA,USA as the columns of the pivot.
Here is my code:
DECLARE @idList nvarchar(MAX)
SELECT
@idList = COALESCE(@idList + ',', '') + CountryCodeISO3
FROM
(
SELECT
DISTINCT CountryCodeISO3
FROM
Published.RateCardsValues
WHERE
CardID = 55
) AS SRC
DECLARE @sqlToRun nvarchar(MAX)
SET
@sqlToRun = '
SELECT *
FROM (
SELECT
[ResourceCode]
,[TITLES]
,[MostRepresentativeTitle]
,[ABBR_RES_DESC]
,[TypicalJobGrade]
,[BidGridResourceCode]
,[OpUnit]
,[PSResType]
,[JobGradeORResCat]
,[CountryCodeISO3]
--,[CurrencyCode]
,[RateValue]
FROM
[Published].[RateCardsValues] rc
WHERE
CardID = 55) As src
PIVOT (
MAX(RateValue) FOR [CountryCodeISO3] IN (' + @idList + ')
) AS pvt'
EXEC (@sqlToRun)
As you have discovered, PIVOT in T-SQL requires you to know at development time what the values will be that you will be pivoting on.
This is limiting, because if you want something like "retrieve data for all the countries where Condition X is true, then pivot on their IDs!", you have to resort to dynamic SQL to do it.
If Condition X is constant -- I'm guessing that belonging to CardID = 55 doesn't change often -- you can look up the values, and hardcode them in your code.
If the CardID you're looking up is always 55 and you have relatively few countries in that category, I'd actually advise doing that.
But if your conditions for picking countries can change, or the number of columns you want can vary -- something like "all the countries where there were sales of product Y, for month Z!" -- then you can't predict them, which means that the T-SQL PIVOT can't be set up (without dynamic SQL.)
In that case, I'd strongly suggest that you have whatever app you plan to use the data in do the pivoting, not T-SQL. (SSRS and Excel can both do it themselves, and code can be written to do it in .NET languages.) T-SQL, as you have seen, does not lend itself to dynamic pivoting.
What you have will "work" in the sense that it will execute without errors, but there's another downside, in the next stage of your app: not only will the number of columns potentially change over time, the names of the columns will change, as countries move in and out of Card ID 55. That may cause problems for whatever app or destination you have in mind for this data.
So, my two suggestions would be: either hard-code your country codes, or have the next stage in your app (whatever executes the query) do the actual pivoting.
You need to sort the columns while creating the dynamic SQL
Also:
Do not use variable coalescing, use STRING_AGG or FOR XML instead
Use QUOTENAME to escape the column names
sp_executesql allows you to pass parameters to the dynamic query
DECLARE @idList nvarchar(MAX)
SELECT
@idList = STRING_AGG(QUOTENAME(CountryCodeISO3), ',') WITHIN GROUP (ORDER BY CountryCodeISO3)
FROM
(
SELECT
DISTINCT CountryCodeISO3
FROM
Published.RateCardsValues
WHERE
CardID = 55
) AS SRC;
DECLARE @sqlToRun nvarchar(MAX);
SET
@sqlToRun = '
SELECT *
FROM (
SELECT
[ResourceCode]
,[TITLES]
,[MostRepresentativeTitle]
,[ABBR_RES_DESC]
,[TypicalJobGrade]
,[BidGridResourceCode]
,[OpUnit]
,[PSResType]
,[JobGradeORResCat]
,[CountryCodeISO3]
--,[CurrencyCode]
,[RateValue]
FROM
[Published].[RateCardsValues] rc
WHERE
CardID = 55) As src
PIVOT (
MAX(RateValue) FOR [CountryCodeISO3] IN (' + @idList + ')
) AS pvt'
EXEC sp_executesql @sqlToRun;
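The point about sp_executesql accepting parameters can be sketched as follows. This is only an illustration under assumptions: the temp table #RateCardsValues and its rows stand in for Published.RateCardsValues, the variable @cardId is a hypothetical input, and STRING_AGG assumes SQL Server 2017 or later. Column names cannot be passed as parameters, so the IN list is still concatenated (after QUOTENAME), while the CardID filter becomes a real parameter.

```sql
-- Hypothetical sample data standing in for Published.RateCardsValues.
CREATE TABLE #RateCardsValues (
    CardID int, ResourceCode varchar(10), CountryCodeISO3 char(3), RateValue decimal(10, 2));
INSERT INTO #RateCardsValues VALUES
    (55, 'R1', 'USA', 100.00),
    (55, 'R1', 'BRA', 60.00),
    (55, 'R2', 'ARG', 40.00),
    (99, 'R1', 'FRA', 90.00);   -- different card, filtered out below

DECLARE @cardId int = 55;       -- assumed to be supplied by the caller
DECLARE @idList nvarchar(max);

-- Identifiers cannot be parameterized, so they are quoted, sorted and concatenated.
SELECT @idList = STRING_AGG(CAST(QUOTENAME(CountryCodeISO3) AS nvarchar(max)), N',')
                 WITHIN GROUP (ORDER BY CountryCodeISO3)
FROM (SELECT DISTINCT CountryCodeISO3
      FROM #RateCardsValues
      WHERE CardID = @cardId) AS c;

DECLARE @sql nvarchar(max) = N'
SELECT *
FROM (
    SELECT ResourceCode, CountryCodeISO3, RateValue
    FROM #RateCardsValues
    WHERE CardID = @cardId      -- a real parameter, not concatenated text
) AS src
PIVOT (MAX(RateValue) FOR CountryCodeISO3 IN (' + @idList + N')) AS pvt
ORDER BY ResourceCode;';

-- Second argument declares the parameters, the rest binds their values.
EXEC sp_executesql @sql, N'@cardId int', @cardId = @cardId;

DROP TABLE #RateCardsValues;
```

With the sample rows above, the pivoted columns should come out as ARG, BRA, USA, and the FRA row for the other card should not contribute a column.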
On earlier versions of SQL Server, you cannot use STRING_AGG. You need to hack it with FOR XML. You need to also use STUFF to strip off the first separator.
DECLARE @idList nvarchar(MAX)
DECLARE @separator nvarchar(20) = ',';
SET @idList =
STUFF(
(
SELECT
@separator + QUOTENAME(CountryCodeISO3)
FROM
Published.RateCardsValues
WHERE
CardID = 55
GROUP BY
CountryCodeISO3
ORDER BY
CountryCodeISO3
FOR XML PATH(''), TYPE
).value('text()[1]','nvarchar(max)'),
1, LEN(@separator), '')
;
DECLARE @sqlToRun nvarchar(MAX);
SET
@sqlToRun = '
SELECT *
FROM (
SELECT
[ResourceCode]
,[TITLES]
,[MostRepresentativeTitle]
,[ABBR_RES_DESC]
,[TypicalJobGrade]
,[BidGridResourceCode]
,[OpUnit]
,[PSResType]
,[JobGradeORResCat]
,[CountryCodeISO3]
--,[CurrencyCode]
,[RateValue]
FROM
[Published].[RateCardsValues] rc
WHERE
CardID = 55) As src
PIVOT (
MAX(RateValue) FOR [CountryCodeISO3] IN (' + @idList + ')
) AS pvt'
EXEC sp_executesql @sqlToRun;

How to use the dynamic column name from the table in a where clause

I am trying to get the column names of a table dynamically using INFORMATION_SCHEMA.COLUMNS. This is the query:
Select COLUMN_NAME into #TempTable
from INFORMATION_SCHEMA.COLUMNS
where TABLE_NAME = 'MyTable'
Result:
COLUMN_NAME
Person_ID
Person_Name
Person_Address
Wanting to Do:
Select * from MyTable where Person_ID = 1
What can be the ways to use the Person_ID from 1st query to the second query?
You can use dynamic SQL to execute this via the EXEC command.
Build a VARCHAR string for your query based on the dynamic column names you are getting from your first query, then EXEC on the string you have created.
You have not provided enough information on exactly what columns you need in your WHERE clause, or how you determine which ones, but dynamic SQL seems to be what you need here.
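As a rough sketch of what that could look like: the temp table #MyTable, its rows, and the variables @columnName and @value below are all hypothetical, since the question does not say how the column is chosen. The column name is checked against the catalog and wrapped in QUOTENAME before being concatenated, and the comparison value is passed as a parameter to sp_executesql rather than concatenated.

```sql
-- Hypothetical stand-in for MyTable.
CREATE TABLE #MyTable (Person_ID int, Person_Name varchar(50));
INSERT INTO #MyTable VALUES (1, 'Ann'), (2, 'Bob');

DECLARE @columnName sysname = N'Person_ID';  -- assumed: picked from the metadata query
DECLARE @value int = 1;                      -- assumed: the value to filter on

-- Only build the statement if the column really exists on the table.
IF EXISTS (SELECT 1
           FROM tempdb.sys.columns
           WHERE object_id = OBJECT_ID(N'tempdb..#MyTable')
             AND name = @columnName)
BEGIN
    DECLARE @sql nvarchar(max) =
        N'SELECT * FROM #MyTable WHERE ' + QUOTENAME(@columnName) + N' = @value;';

    -- The value travels as a parameter; only the identifier is concatenated.
    EXEC sp_executesql @sql, N'@value int', @value = @value;
END

DROP TABLE #MyTable;
```

For a permanent table you would query sys.columns (or INFORMATION_SCHEMA.COLUMNS) in the table's own database instead of tempdb.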
if you are trying to do something like this
select * from [table] where [col] = @param
then you can use query like below
declare @query nvarchar(max)
select
@query='select * from '+t.name +
' where '+c.name + ' ='+
case
when c.name ='Person_ID' then '1'
when c.name ='Someother_ID' then '10'
else c.name
end
from sys.tables t join sys.columns c
on c.object_id=t.object_id
and t.name ='MyTable'
exec( @query)

Dynamic Pivot - SQL Server

I have a test SQL database with the following query:
USE DataBase1
Select Data.MonthDate,
Data.AccountID,
Data.MonthID,
Data.Sales,
Data.AccountName
From Test1 as Data with(nolock)
That I need to pivot based off of the sales column. The problem is the months when I run this query will always change (though there will always be 4 of them) and they need to be ordered left-to-right/oldest-newest in the pivoted result based off of the MonthDate column. The initial return when the query is run looks like this:
And the final result needs to look like this:
I'm using Excel here to demonstrate and I highlighted the 0's because those are technically NULL values but I need them to come back as 0.
I'm using SQL Server Management Studio and the actual database I'll be running this against is over 200,000 rows.
Any thoughts?
Thanks,
Joshua
Use Dynamic Query.
DECLARE @col_list VARCHAR(max)='',
@sel_list VARCHAR(max)='',
@sql NVARCHAR(max)
SELECT DISTINCT @col_list += '[' + Isnull(MonthID, '') + '],'
FROM Test1
ORDER BY MonthID
SELECT @col_list = LEFT(@col_list, Len(@col_list) - 1)
SELECT DISTINCT @sel_list += 'Isnull([' + Isnull(MonthID, '') + '],0) ' + '['+ MonthID + '],'
FROM Test1
ORDER BY MonthID
SELECT @sel_list = LEFT(@sel_list, Len(@sel_list) - 1)
SET @sql ='select Data.AccountID,Data.AccountName,'+ @sel_list+ ' from (
Select
Data.AccountID,
Data.MonthID,
Data.Sales,
Data.AccountName
From Test1 as Data ) A
pivot (sum(Sales) for monthid in('+ @col_list + ')) piv'
--PRINT @sql
EXEC Sp_executesql @sql
Basically you need to dynamically build the PIVOT query and use sp_executesql to run it.
SQL Server, out of the box, has no support for dynamic ever-changing columns as the columns need to be defined in the PIVOT query.
Here's an example of how to accomplish this: http://sqlhints.com/tag/dynamic-pivot-column-names/
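For the two specific requirements in the question (columns ordered oldest to newest by MonthDate, and NULLs returned as 0), a self-contained sketch might look like the one below. The temp table #Test1, its column types and its rows are assumptions made up for the example, and STRING_AGG assumes SQL Server 2017 or later; on older versions the FOR XML PATH technique would be needed instead.

```sql
-- Hypothetical sample data; the real Test1 schema may differ.
CREATE TABLE #Test1 (
    MonthDate date, AccountID int, MonthID varchar(10), Sales money, AccountName varchar(50));
INSERT INTO #Test1 VALUES
    ('2015-01-01', 1, 'Jan-15', 100, 'Acme'),
    ('2015-02-01', 1, 'Feb-15', 150, 'Acme'),
    ('2015-02-01', 2, 'Feb-15', 75,  'Globex');

DECLARE @cols nvarchar(max), @sel nvarchar(max), @sql nvarchar(max);

-- Pivot column list, ordered by the date each MonthID belongs to.
SELECT @cols = STRING_AGG(CAST(QUOTENAME(MonthID) AS nvarchar(max)), N',')
               WITHIN GROUP (ORDER BY MonthDate)
FROM (SELECT MonthID, MIN(MonthDate) AS MonthDate FROM #Test1 GROUP BY MonthID) AS m;

-- Matching select list that turns NULL cells into 0.
SELECT @sel = STRING_AGG(
                  CAST(N'ISNULL(' + QUOTENAME(MonthID) + N', 0) AS ' + QUOTENAME(MonthID)
                       AS nvarchar(max)), N',')
              WITHIN GROUP (ORDER BY MonthDate)
FROM (SELECT MonthID, MIN(MonthDate) AS MonthDate FROM #Test1 GROUP BY MonthID) AS m;

SET @sql = N'
SELECT AccountID, AccountName, ' + @sel + N'
FROM (SELECT AccountID, AccountName, MonthID, Sales FROM #Test1) AS src
PIVOT (SUM(Sales) FOR MonthID IN (' + @cols + N')) AS piv
ORDER BY AccountID;';

EXEC sp_executesql @sql;

DROP TABLE #Test1;
```

With the sample rows, the Jan-15 column should appear before Feb-15, and the account with no January sales should show 0 there rather than NULL.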

Create table from query

I managed to apply the PIVOT statement you suggested to transpose the values of the records of a table as columns automatically:
DECLARE @PivotColumnas VARCHAR(MAX)
SELECT @PivotColumnas = COALESCE (@PivotColumnas + ',[' + IB_PDSBATCHATTRIBIDBI + ']', '[' + IB_PDSBATCHATTRIBIDBI + ']') FROM PDSBATCHATTRIB
DECLARE @PivotTablaSQL NVARCHAR(MAX)
SET @PivotTablaSQL = N' SELECT *
FROM (SELECT INVENTBATCHID, ITEMID, PDSBATCHATTRIB.IB_PDSBATCHATTRIBIDBI, PDSBATCHATTRIBVALUE FROM PDSBATCHATTRIBUTES
LEFT JOIN PDSBATCHATTRIB ON PDSBATCHATTRIBUTES.IB_PDSBATCHATTRIBIDBI = PDSBATCHATTRIB.IB_PDSBATCHATTRIBIDBI) AS TablaOrigen
PIVOT
(MIN(PDSBATCHATTRIBVALUE)
FOR IB_PDSBATCHATTRIBIDBI IN ('+ @PivotColumnas + ')) AS PivotTable'
EXECUTE (@PivotTablaSQL)
What I need is how to save the result as a query or create a table from this query. If I try to save the result as a query, I get the following error:
Incorrect syntax near the keyword 'DECLARE'.
Thanks!
Because it is dynamic SQL and you don't know the exact columns, you can use SELECT ... INTO #TempPivot. That creates a temporary table that you can use later. Alternatively, you could build a dynamic solution that reads the temp table's structure and creates a table from it, but that seems like overkill.
I found the solution, and it is very simple in fact!
I only had to add INTO NewTable to the SELECT statement of the pivot query, like this:
SET @PivotTablaSQL = N' SELECT * INTO NewTable
FROM (SELECT INVENTBATCHID, ITEMID, PDSBATCHATTRIB.IB_PDSBATCHATTRIBIDBI, PDSBATCHATTRIBVALUE FROM PDSBATCHATTRIBUTES
LEFT JOIN PDSBATCHATTRIB ON PDSBATCHATTRIBUTES.IB_PDSBATCHATTRIBIDBI = PDSBATCHATTRIB.IB_PDSBATCHATTRIBIDBI) AS TablaOrigen
PIVOT
(MIN(PDSBATCHATTRIBVALUE)
FOR IB_PDSBATCHATTRIBIDBI IN ('+ @PivotColumnas + ')) AS PivotTable'
This creates a new table in the SQL database with the pivot table results.
Thanks all!
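One practical detail when re-running a SELECT ... INTO: the statement fails if the target table already exists. A minimal sketch of a guard is below. The table name dbo.PivotDemo, the temp table #Attr and its rows are hypothetical, and the guard really does drop any existing table of that name, so pick a name that is safe in your database. The sketch writes to a regular table because a local #temp table created inside the dynamic batch would be dropped as soon as that batch ends.

```sql
-- Hypothetical source data for the pivot.
CREATE TABLE #Attr (BatchId int, AttrId varchar(20), AttrValue varchar(20));
INSERT INTO #Attr VALUES (1, 'Color', 'Red'), (1, 'Size', 'L'), (2, 'Color', 'Blue');

DECLARE @cols nvarchar(max) = N'[Color],[Size]';  -- normally built dynamically

-- SELECT ... INTO needs the target not to exist, so remove a previous run's table.
-- dbo.PivotDemo is an assumed, throwaway name.
IF OBJECT_ID(N'dbo.PivotDemo', N'U') IS NOT NULL
    DROP TABLE dbo.PivotDemo;

DECLARE @sql nvarchar(max) = N'
SELECT *
INTO dbo.PivotDemo
FROM (SELECT BatchId, AttrId, AttrValue FROM #Attr) AS src
PIVOT (MIN(AttrValue) FOR AttrId IN (' + @cols + N')) AS pvt;';

EXEC sp_executesql @sql;

-- The table outlives the dynamic batch and can be queried normally.
SELECT * FROM dbo.PivotDemo ORDER BY BatchId;

DROP TABLE #Attr;
```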

SQL query to find duplicate rows, in any table

I'm looking for a schema-independent query. That is, if I have a users table or a purchases table, the query should be equally capable of catching duplicate rows in either table without any modification (other than the from clause, of course).
I'm using T-SQL, but I'm guessing there should be a general solution.
I believe that this should work for you. Keep in mind that CHECKSUM() isn't 100% perfect - it's theoretically possible to get a false positive here (I think), but otherwise you can just change the table name and this should work:
;WITH cte AS (
SELECT
*,
CHECKSUM(*) AS chksum,
ROW_NUMBER() OVER(ORDER BY GETDATE()) AS row_num
FROM
My_Table
)
SELECT
*
FROM
CTE T1
INNER JOIN CTE T2 ON
T2.chksum = T1.chksum AND
T2.row_num <> T1.row_num
The ROW_NUMBER() is needed so that you have some way of distinguishing rows. It requires an ORDER BY and that can't be a constant, so GETDATE() was my workaround for that.
Simply change the table name in the CTE and it should work without spelling out the columns.
I'm still confused about what "detecting them might be" but I'll give it a shot.
Excluding them is easy
e.g.
SELECT DISTINCT * FROM USERS
However, if you wanted to include only the duplicates, and a duplicate means that all the fields match, then you have to do
SELECT
[Each and every field]
FROM
USERS
GROUP BY
[Each and every field]
HAVING COUNT(*) > 1
You can't get away with just using (*) because you can't GROUP BY *
so this requirement from your comments is difficult
a schema-independent means I don't want to specify all of the columns
in the query
Unless that is you want to use dynamic SQL and read the columns from sys.columns or information_schema.columns
For example
DECLARE @columns nvarchar(max)
SET @columns = ''
SELECT @columns = @columns + '[' + COLUMN_NAME +'], '
FROM INFORMATION_SCHEMA.columns
WHERE table_name = 'USERS'
SET @columns = left(@columns,len(@columns ) - 1)
DECLARE @SQL nvarchar(max)
SET @SQL = 'SELECT ' + @columns
+ ' FROM USERS' + ' GROUP BY '
+ @columns
+ ' Having Count(*) > 1'
exec sp_executesql @SQL
Please note that you should read "The Curse and Blessings of Dynamic SQL" if you haven't already.
I have done this using CTEs in SQL Server.
Here is a sample on how to delete dupes but you should be able to adapt it easily to find dupes:
WITH CTE (COl1, Col2, DuplicateCount)
AS
(
SELECT COl1,Col2,
ROW_NUMBER() OVER(PARTITION BY COl1,Col2 ORDER BY Col1) AS DuplicateCount
FROM DuplicateRcordTable
)
DELETE
FROM CTE
WHERE DuplicateCount > 1
GO
Here is a link to an article where I got the SQL:
http://blog.sqlauthority.com/2009/06/23/sql-server-2005-2008-delete-duplicate-rows/
I recently was looking into the same issue and noticed this question.
I managed to solve it using a stored procedure with some dynamic SQL. This way you only need to specify the table name. And it will get all the other relevant data from sys tables.
/*
This SP returns all duplicate rows (1 line for each duplicate) for any given table.
to use the SP:
exec [database].[dbo].[sp_duplicates]
@table = '[database].[schema].[table]'
*/
create proc dbo.sp_duplicates @table nvarchar(50) as
declare @query nvarchar(max)
declare @groupby nvarchar(max)
set @groupby = stuff((select ',' + quotename([name])
FROM sys.columns
WHERE object_id = OBJECT_ID(@table)
FOR xml path('')), 1, 1, '')
set @query = 'select *, count(*)
from '+@table+'
group by '+@groupby+'
having count(*) > 1'
exec (@query)
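To try the idea without creating a procedure, here is a small self-contained sketch of the same approach against a temp table. The table #Users and its rows are invented for the example, and STRING_AGG assumes SQL Server 2017 or later. Column names are read from the catalog, wrapped in QUOTENAME, and used for both the select list and the GROUP BY.

```sql
-- Hypothetical table with one fully duplicated row.
CREATE TABLE #Users (UserName varchar(50), City varchar(50));
INSERT INTO #Users VALUES
    ('ann', 'Oslo'),
    ('bob', 'Lima'),
    ('ann', 'Oslo');

DECLARE @cols nvarchar(max);

-- Temp tables live in tempdb, so their columns are listed in tempdb.sys.columns.
SELECT @cols = STRING_AGG(CAST(QUOTENAME(name) AS nvarchar(max)), N', ')
               WITHIN GROUP (ORDER BY column_id)
FROM tempdb.sys.columns
WHERE object_id = OBJECT_ID(N'tempdb..#Users');

DECLARE @sql nvarchar(max) =
    N'SELECT ' + @cols + N', COUNT(*) AS DuplicateCount
      FROM #Users
      GROUP BY ' + @cols + N'
      HAVING COUNT(*) > 1;';

EXEC sp_executesql @sql;

DROP TABLE #Users;
```

With these rows the query should return the single duplicated combination (ann, Oslo) with a count of 2. Tables containing column types that cannot be grouped, such as xml or text, would need those columns excluded or cast first.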