Dynamic Pivot table, how to remove NULL values without knowing column names?

OK, so I found myself needing a dynamic pivot table, and that part was fine.
I also needed a dynamically sized temporary table to hold the values, and eventually I worked that one out too. I called it #table.
Can I get rid of the NULLs after the insert without doing it dynamically? I really don't want even more horrific red stuff.
#table
Year CashflowID1 CashflowID2 CashflowID3........CashflowIDn
1 NULL -4 1.23............... etc
2 43 78 -34 ............... NULL
Each cashflow id might have data for a different set of years, resulting in a bunch of nulls.
Something simple like
SELECT ISNULL(*,0)
FROM #table
but, you know, that is clever and actually works. As an aside I tried setting up #table with default values and non nullable columns but that just broke the insert.
Shout if I have missed anything obvious, or failed to provide necessary info.
Cheers.

So, this gets a little messy, but here's the idea.
For this I'm querying out of the master table and pivoting on a variant of received (datetime), formatted as yyyy-mm.
declare @columns varchar(max)
declare @columnsisnull varchar(max)
declare @sql nvarchar(max)
SELECT @columns = STUFF(( SELECT DISTINCT TOP 100 PERCENT '],[' + CONVERT(VARCHAR(7), m1.received, 120)
FROM master m1 where m1.received between DATEADD(year, -1, getdate()) and GETDATE()
ORDER BY '],[' + CONVERT(VARCHAR(7), m1.received, 120) desc
FOR XML PATH('')), 1, 2, '') + ']'
SELECT @columnsisnull = STUFF(( SELECT DISTINCT TOP 100 PERCENT ', isnull([' + CONVERT(VARCHAR(7), m1.received, 120) + '],0) as [' + CONVERT(VARCHAR(7), m1.received, 120) + ']'
FROM master m1 where m1.received between DATEADD(year, -1, getdate()) and GETDATE()
--ORDER BY ', isnull([' + CONVERT(VARCHAR(7), m1.received, 120) + '],0)'
FOR XML PATH('')), 1, 2, '')
This looks basically like your code for getting the columns. The difference is @columnsisnull, where I wrap each column in isnull and alias it back to its original name.
Then for your @sql, the derived table has to expose received in the same yyyy-mm form that was used for the column names:
set @sql = N'SELECT name, ' + @columnsisnull + ' from (
select name, amount, CONVERT(VARCHAR(7), received, 120) as received from master ) p
pivot(sum(amount) for received in ( ' + @columns + ')) as pvt'
execute(@sql)
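If the column names really are unknown when the query is written, one option is to read them from the temp table's metadata rather than repeating the pivot logic. The sketch below is not taken from the answer above. It assumes the table is called #table as in the question, that the non-pivoted column is named Year, and that every other column is numeric. The sample rows are made up from the question's illustration.

```sql
-- Sample table standing in for the question's #table (types are assumptions).
CREATE TABLE #table
(
    [Year]      int           NOT NULL,
    CashflowID1 decimal(18,2) NULL,
    CashflowID2 decimal(18,2) NULL,
    CashflowID3 decimal(18,2) NULL
);

INSERT INTO #table ([Year], CashflowID1, CashflowID2, CashflowID3)
VALUES (1, NULL, -4, 1.23),
       (2, 43,   78, -34);

DECLARE @cols nvarchar(max), @sql nvarchar(max);

-- Build "ISNULL([col], 0) AS [col]" for every column except Year,
-- reading the names from tempdb's catalog in their defined order.
SELECT @cols = STUFF((
        SELECT ', ISNULL(' + QUOTENAME(c.name) + ', 0) AS ' + QUOTENAME(c.name)
        FROM tempdb.sys.columns AS c
        WHERE c.object_id = OBJECT_ID('tempdb..#table')
          AND c.name <> 'Year'
        ORDER BY c.column_id
        FOR XML PATH(''), TYPE).value('.', 'nvarchar(max)'), 1, 2, '');

SET @sql = N'SELECT [Year], ' + @cols + N' FROM #table;';
EXEC sp_executesql @sql;
```

This is still dynamic SQL, but it is one short statement that does not need to know how the table was built.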

I would recommend making one more attempt with default values; this feature should work fine for your case. It is a more sophisticated solution, since you will not depend on logic in your query and will not have to repeat the NULL replacement for every column.

Related

SQL Sort / Order By pivoted fields while COALESCE function

I have some rates for resources for all countries
The rows will be Resource IDs
Columns should be Country Codes
The challenge here is that I cannot sort the country codes in ascending order.
I would be grateful if you could help me with this.
When I query, I get the list of country codes, but not sorted. i.e., USA,BRA,ARG etc. But the expected result should be ARG,BRA,USA in columns of the pivot.
Here is my code:
DECLARE @idList nvarchar(MAX)
SELECT
@idList = COALESCE(@idList + ',', '') + CountryCodeISO3
FROM
(
SELECT
DISTINCT CountryCodeISO3
FROM
Published.RateCardsValues
WHERE
CardID = 55
) AS SRC
DECLARE @sqlToRun nvarchar(MAX)
SET
@sqlToRun = '
SELECT *
FROM (
SELECT
[ResourceCode]
,[TITLES]
,[MostRepresentativeTitle]
,[ABBR_RES_DESC]
,[TypicalJobGrade]
,[BidGridResourceCode]
,[OpUnit]
,[PSResType]
,[JobGradeORResCat]
,[CountryCodeISO3]
--,[CurrencyCode]
,[RateValue]
FROM
[Published].[RateCardsValues] rc
WHERE
CardID = 55) As src
PIVOT (
MAX(RateValue) FOR [CountryCodeISO3] IN (' + @idList + ')
) AS pvt'
EXEC (@sqlToRun)

As you have discovered, PIVOT in T-SQL requires you to know at development time what the values will be that you will be pivoting on.
This is limiting, because if you want something like "retrieve data for all the countries where Condition X is true, then pivot on their IDs!", you have to resort to dynamic SQL to do it.
If Condition X is constant -- I'm guessing that belonging to CardID = 55 doesn't change often -- you can look up the values, and hardcode them in your code.
If the CardID you're looking up is always 55 and you have relatively few countries in that category, I'd actually advise doing that.
But if your conditions for picking countries can change, or the number of columns you want can vary -- something like "all the countries where there were sales of product Y, for month Z!" -- then you can't predict them, which means that the T-SQL PIVOT can't be set up (without dynamic SQL.)
In that case, I'd strongly suggest that you have whatever app you plan to use the data in do the pivoting, not T-SQL. (SSRS and Excel can both do it themselves, and code can be written to do it in .NET languages.) T-SQL, as you have seen, does not lend itself to dynamic pivoting.
What you have will "work" in the sense that it will execute without errors, but there's another downside, in the next stage of your app: not only will the number of columns potentially change over time, the names of the columns will change, as countries move in and out of Card ID 55. That may cause problems for whatever app or destination you have in mind for this data.
So, my two suggestions would be: either hard-code your country codes, or have the next stage in your app (whatever executes the query) do the actual pivoting.
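For the hard-coded option, a minimal sketch might look like the following. It assumes the three country codes mentioned in the question (ARG, BRA, USA) are the full set for CardID 55, and it keeps only ResourceCode from the descriptive columns to stay short.

```sql
-- Static pivot: the column list is fixed at development time,
-- so the columns always come back in the order written here.
SELECT ResourceCode, [ARG], [BRA], [USA]
FROM (
    SELECT ResourceCode, CountryCodeISO3, RateValue
    FROM Published.RateCardsValues
    WHERE CardID = 55
) AS src
PIVOT (
    MAX(RateValue) FOR CountryCodeISO3 IN ([ARG], [BRA], [USA])
) AS pvt
ORDER BY ResourceCode;
```

Any country that joins the card later will not appear until it is added to both lists by hand, which is the trade-off of this approach.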

You need to sort the columns while creating the dynamic SQL
Also:
Do not use variable coalescing, use STRING_AGG or FOR XML instead
Use QUOTENAME to escape the column names
sp_executesql allows you to pass parameters to the dynamic query
DECLARE @idList nvarchar(MAX)
SELECT
@idList = STRING_AGG(QUOTENAME(CountryCodeISO3), ',') WITHIN GROUP (ORDER BY CountryCodeISO3)
FROM
(
SELECT
DISTINCT CountryCodeISO3
FROM
Published.RateCardsValues
WHERE
CardID = 55
) AS SRC;
DECLARE @sqlToRun nvarchar(MAX);
SET
@sqlToRun = '
SELECT *
FROM (
SELECT
[ResourceCode]
,[TITLES]
,[MostRepresentativeTitle]
,[ABBR_RES_DESC]
,[TypicalJobGrade]
,[BidGridResourceCode]
,[OpUnit]
,[PSResType]
,[JobGradeORResCat]
,[CountryCodeISO3]
--,[CurrencyCode]
,[RateValue]
FROM
[Published].[RateCardsValues] rc
WHERE
CardID = 55) As src
PIVOT (
MAX(RateValue) FOR [CountryCodeISO3] IN (' + @idList + ')
) AS pvt'
EXEC sp_executesql @sqlToRun;
STRING_AGG is only available from SQL Server 2017 onward. On earlier versions you have to emulate it with FOR XML PATH, and use STUFF to strip off the leading separator.
DECLARE @idList nvarchar(MAX)
DECLARE @separator nvarchar(20) = ',';
SET @idList =
STUFF(
(
SELECT
@separator + QUOTENAME(CountryCodeISO3)
FROM
Published.RateCardsValues
WHERE
CardID = 55
GROUP BY
CountryCodeISO3
ORDER BY
CountryCodeISO3
FOR XML PATH(''), TYPE
).value('text()[1]','nvarchar(max)'),
1, LEN(@separator), '')
;
DECLARE @sqlToRun nvarchar(MAX);
SET
@sqlToRun = '
SELECT *
FROM (
SELECT
[ResourceCode]
,[TITLES]
,[MostRepresentativeTitle]
,[ABBR_RES_DESC]
,[TypicalJobGrade]
,[BidGridResourceCode]
,[OpUnit]
,[PSResType]
,[JobGradeORResCat]
,[CountryCodeISO3]
--,[CurrencyCode]
,[RateValue]
FROM
[Published].[RateCardsValues] rc
WHERE
CardID = 55) As src
PIVOT (
MAX(RateValue) FOR [CountryCodeISO3] IN (' + @idList + ')
) AS pvt'
EXEC sp_executesql @sqlToRun;
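The point about sp_executesql accepting parameters can be sketched as follows. This is an illustration rather than part of the answer's code: the @cardId variable is a hypothetical name, CardID is assumed to be an int, and the source column list is trimmed to ResourceCode for brevity. It uses STRING_AGG, so it needs SQL Server 2017 or later.

```sql
DECLARE @cardId int = 55;          -- hypothetical parameter; type is an assumption
DECLARE @idList nvarchar(MAX);

-- Sorted, quoted column list. The CAST keeps the aggregate from being
-- limited to 4000 characters.
SELECT @idList = STRING_AGG(CAST(QUOTENAME(CountryCodeISO3) AS nvarchar(MAX)), ',')
                 WITHIN GROUP (ORDER BY CountryCodeISO3)
FROM (
    SELECT DISTINCT CountryCodeISO3
    FROM Published.RateCardsValues
    WHERE CardID = @cardId
) AS SRC;

-- Only the column list is concatenated into the statement;
-- the card ID travels as a real parameter.
DECLARE @sqlToRun nvarchar(MAX) = N'
SELECT *
FROM (
    SELECT ResourceCode, CountryCodeISO3, RateValue
    FROM Published.RateCardsValues
    WHERE CardID = @cardId
) AS src
PIVOT (
    MAX(RateValue) FOR CountryCodeISO3 IN (' + @idList + N')
) AS pvt;';

EXEC sp_executesql @sqlToRun, N'@cardId int', @cardId = @cardId;
```

Column names cannot be passed as parameters, which is why the IN list is still built by concatenation and protected with QUOTENAME.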

How can I run my custom function and query in a loop for different time frames?

I am writing a function to calculate the total number of seconds a user was online at my website. Afterwards, I convert the number of seconds to hh:mm:ss:
select * into #temp from [MyFunction](timestamp1, timestamp2);
select u.Name,
convert(varchar(8), t.Seconds / 3600) + ':'
+ right('0' + convert(varchar(2), t.Seconds % 3600 / 60), 2) + ':'
+ right('0' + convert(varchar(2), t.Seconds % 60), 2)
as [Total Time]
from #temp t left join Users u
on t.UserID = u.UserID;
Where an example timestamp is 2016-04-01 00:00:00.000. What I want now, is to see total time spent on my website, not on 1 range, but a sequence of ranges, for instance:
2016-01-01 to 2016-01-15
2016-01-16 to 2016-01-31
2016-02-01 to 2016-02-15
Is it possible to put my code in a dynamic query to calculate all of these ranges by running the same code every time?
The output of my code above is:
Name [Total Time]
--------------------
Anton 6:34:55
Bert 5:22:14
What I would like is an output such as
Name [Period_1] [Period_2] [Period_3] [Period_4]
---------------------------------------------------
Anton 6:34:55 5:00:22 null 10:44:32
Bert 5:22:14 null null 9:22:53
So each range, or loop over the code, should be a column.
I believe pivot() will help me here, but any help kickstarting me with the dynamic SQL (or any better solution) would be greatly appreciated.

Wrap your current code into a procedure with parameters, something like:
CREATE PROCEDURE dbo.CalcTime
@Period varchar(100) -- Name of the period
,@PeriodStart datetime -- Period starts
,@PeriodEnd datetime -- Period ends
and using appropriate datatypes.
Next, create a second procedure. Within this one, define another temporary table, like
CREATE TABLE #Results
(
Name varchar(100) not null -- Or however long it might get
,Period varchar(100) not null -- Ditto
,TotalTime int null -- *
)
Loop over every period you wish to define data for. For each period, call the "CalcTime" stored procedure, and dump the results into the temp table. Two ways to do this, use
INSERT #Results
execute dbo.CalcTime 'Period', 'Jan 1, 2016', 'Jan 15, 2016'
or, having defined the temp table in the calling procedure, you can reference it in the called procedure in a standard INSERT... SELECT... statement.
Also within the loop, build a comma-delimited string that lists all your period labels, e.g.
SET @AllPeriodLabels = isnull(@AllPeriodLabels + ',', '') + @ThisPeriodLabel
or,
SET @AllPeriodLabels = isnull(@AllPeriodLabels + ',', '') + '[' + @ThisPeriodLabel + ']' -- **
Use this to build the dynamic SQL pivot statement against the temp table, and you’re done. As mentioned in the comments, there are any number of SO posts on how to do that, and here are links to two: The first discusses building a dynamic pivot statement, and the second uses similar tactics for an unpivot statement.
* Avoid embedded spaces in object names, they will only give you pain.
** Ok, Sometimes you have to do it.
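The last step described above, pivoting the temp table with the collected labels, could look roughly like this. It is only a sketch: the sample rows are made up, the label string is set by hand where the real procedure would build it in the loop, and TotalTime is left in seconds rather than formatted.

```sql
-- Stand-in for the temp table filled by the loop (sample rows are made up).
CREATE TABLE #Results
(
     Name      varchar(100) NOT NULL
    ,Period    varchar(100) NOT NULL
    ,TotalTime int          NULL   -- seconds
);

INSERT #Results (Name, Period, TotalTime)
VALUES ('Anton', 'Period_1', 23695),
       ('Anton', 'Period_2', 18022),
       ('Bert',  'Period_1', 19334);

-- In the real procedure this string is built inside the loop.
DECLARE @AllPeriodLabels nvarchar(max) = N'[Period_1],[Period_2]';

-- The same label list is used for the output columns and for the IN (...) list.
DECLARE @sql nvarchar(max) = N'
SELECT Name, ' + @AllPeriodLabels + N'
FROM (SELECT Name, Period, TotalTime FROM #Results) AS src
PIVOT (MAX(TotalTime) FOR Period IN (' + @AllPeriodLabels + N')) AS pvt
ORDER BY Name;';

EXEC sp_executesql @sql;
```

A temp table created in the calling scope is visible inside the dynamic statement, so no extra plumbing is needed. Periods with no data for a user come back as NULL.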

Two pseudo tables:
persons:
personId int
lastname nvarchar(50)
visits:
personid int
created datetime
duration int -- (store things like duration in seconds)
First make a list of the columns, here I used a created column and converted it to a month period. So the result is something like: [201501],[201502],[201503],....
declare @cols nvarchar(max)
set @cols = STUFF((select ',' + quotename(convert(VARCHAR(6), created, 112))
from visits
group by convert(VARCHAR(6), created, 112)
order by convert(VARCHAR(6), created, 112)
for xml path(''), type).value('.', 'nvarchar(max)'), 1, 1, '')
Dynamic SQL is needed to fill in the variable number of columns. I suggest you start with a non-dynamic query and make it dynamic as the last step.
declare @sql nvarchar(max)
set @sql = N'
select *
-- lazy create a temp so you don't have to bother about the column definitions
-- into #temp
from (
select p.lastname, convert(VARCHAR(6), v.created, 112) as period, v.duration
-- this is optional to get a Grand Row total
-- ,(select sum(duration) from visits v where v.personId = p.personId) as total
from visits v
inner join persons p on v.personId = p.personId
) src
pivot (
sum(duration) for period in (' + @cols + ')
) pvt;
'
You can print this for verification, or run it:
exec sp_executesql @sql
You can make a twist by dumping the result in a temp table (created on the fly). That creates the opportunity to add extra columns for output, like an organization etc. etc..
alter table #temp add organization nvarchar(100)
Good luck !

Here is a working test code. Adapt it as you see fit.
Setup:
-- create test tables
CREATE TABLE Users
(
UserId INT,
UserName NVARCHAR(max)
)
CREATE TABLE Access
(
UserId INT,
StartTime DATETIME2,
EndTime DATETIME2
)
CREATE TABLE Periods
(
NAME NVARCHAR(max),
StartTime DATETIME2,
EndTime DATETIME2
)
go
-- function to format the time
CREATE FUNCTION ToTime(@SECONDS BIGINT)
returns NVARCHAR(max)
AS
BEGIN
RETURN CONVERT(VARCHAR(8), @SECONDS / 3600) + ':'
+ RIGHT('00'+CONVERT(VARCHAR(2), @SECONDS % 3600/60), 2)
+ ':'
+ RIGHT('00'+CONVERT(VARCHAR(2), @SECONDS % 60), 2)
END
go
-- populate values
INSERT INTO Users
VALUES (1, 'Anton'),
(2,'Bert')
DECLARE @I INT=100
DECLARE @D1 DATETIME2
DECLARE @D2 DATETIME2
WHILE ( @I > 0 )
BEGIN
SET @D1=Dateadd(second, Rand() * 8640000, Getdate())
SET @D2=Dateadd(second, Rand() * 1000, @D1)
INSERT INTO Access
VALUES (Floor(Rand() * 2 + 1), @D1, @D2);
SET @I=@I - 1
END
SET @I=1
SET @D1=Getdate()
WHILE ( @I < 6 )
BEGIN
SET @D2=Dateadd(day, 15, @D1)
INSERT INTO Periods
VALUES (Concat('Period_', @I),
@D1,
@D2);
SET @D1=@D2
SET @I=@I + 1
END
go
Working code:
-- Getting the values
DECLARE @COLS NVARCHAR(max)
SET @COLS = Stuff((SELECT ',' + Quotename(NAME)
FROM Periods
GROUP BY NAME
ORDER BY NAME
FOR xml path(''), type).value('.', 'nvarchar(max)'), 1, 1, ''
)
DECLARE @SQL NVARCHAR(max)
SET @SQL = N'SELECT * FROM
(
SELECT u.UserName,
p.Name AS Period,
dbo.Totime(Sum(Datediff(SECOND,a.StartTime,a.EndTime))) AS [Time]
FROM Access a
INNER JOIN Users u
ON a.UserId=u.UserId
INNER JOIN Periods p
ON p.StartTime<=a.StartTime
AND a.StartTime<p.EndTime
GROUP BY u.UserName,
p.Name ) x PIVOT ( Max([Time]) FOR Period IN (' + @COLS +')
) p;'
--PRINT @SQL
EXECUTE(@SQL)

Dynamic Pivot - SQL Server

I have a test SQL database with the following query:
USE DataBase1
Select Data.MonthDate,
Data.AccountID,
Data.MonthID,
Data.Sales,
Data.AccountName
From Test1 as Data with(nolock)
That I need to pivot based off of the sales column. The problem is the months when I run this query will always change (though there will always be 4 of them) and they need to be ordered left-to-right/oldest-newest in the pivoted result based off of the MonthDate column. The initial return when the query is run looks like this:
And the final result needs to look like this:
I'm using Excel here to demonstrate and I highlighted the 0's because those are technically NULL values but I need them to come back as 0.
I'm using SQL Server Management Studio and the actual database I'll be running this against is over 200,000 rows.
Any thoughts?
Thanks,
Joshua

Use a dynamic query.
DECLARE @col_list VARCHAR(max)='',
@sel_list VARCHAR(max)='',
@sql NVARCHAR(max)
-- DISTINCT goes in a derived table so that ORDER BY MonthID stays legal
SELECT @col_list += '[' + Isnull(MonthID, '') + '],'
FROM (SELECT DISTINCT MonthID FROM Test1) M
ORDER BY MonthID
SELECT @col_list = LEFT(@col_list, Len(@col_list) - 1)
SELECT @sel_list += 'Isnull([' + Isnull(MonthID, '') + '],0) ' + '['+ MonthID + '],'
FROM (SELECT DISTINCT MonthID FROM Test1) M
ORDER BY MonthID
SELECT @sel_list = LEFT(@sel_list, Len(@sel_list) - 1)
-- the outer query can only see the pivot result, so the Data alias is not used there
SET @sql ='select AccountID,AccountName,'+ @sel_list+ ' from (
Select
Data.AccountID,
Data.MonthID,
Data.Sales,
Data.AccountName
From Test1 as Data ) A
pivot (sum(Sales) for monthid in('+ @col_list + ')) piv'
--PRINT @sql
EXEC sp_executesql @sql
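The question asks for the month columns to run oldest to newest by MonthDate, which is not necessarily the same as sorting by MonthID. A sketch of that variation follows. It is an assumption-laden illustration: #Test1 is a hypothetical stand-in for Test1, the column types and sample rows are invented, and the lists are built with FOR XML PATH instead of variable concatenation.

```sql
-- Hypothetical stand-in for Test1; column types and rows are assumptions.
CREATE TABLE #Test1
(
    MonthDate   date,
    AccountID   int,
    MonthID     varchar(10),
    Sales       decimal(18, 2),
    AccountName varchar(50)
);

INSERT INTO #Test1 (MonthDate, AccountID, MonthID, Sales, AccountName)
VALUES ('2015-11-01', 1, 'Nov-15', 100, 'Acme'),
       ('2015-12-01', 1, 'Dec-15', 150, 'Acme'),
       ('2016-01-01', 1, 'Jan-16', 120, 'Acme'),
       ('2016-01-01', 2, 'Jan-16',  80, 'Globex');

DECLARE @col_list nvarchar(max), @sel_list nvarchar(max), @sql nvarchar(max);

-- Column names for the IN (...) list, oldest month first.
SELECT @col_list = STUFF((
        SELECT ',' + QUOTENAME(MonthID)
        FROM #Test1
        GROUP BY MonthID
        ORDER BY MIN(MonthDate)
        FOR XML PATH(''), TYPE).value('.', 'nvarchar(max)'), 1, 1, '');

-- The same columns wrapped in ISNULL so missing months show 0.
SELECT @sel_list = STUFF((
        SELECT ',ISNULL(' + QUOTENAME(MonthID) + ', 0) AS ' + QUOTENAME(MonthID)
        FROM #Test1
        GROUP BY MonthID
        ORDER BY MIN(MonthDate)
        FOR XML PATH(''), TYPE).value('.', 'nvarchar(max)'), 1, 1, '');

SET @sql = N'SELECT AccountID, AccountName, ' + @sel_list + N'
FROM (SELECT AccountID, AccountName, MonthID, Sales FROM #Test1) AS src
PIVOT (SUM(Sales) FOR MonthID IN (' + @col_list + N')) AS piv
ORDER BY AccountID;';

EXEC sp_executesql @sql;
```

With this sample data the month columns come out as Nov-15, Dec-15, Jan-16, where an alphabetical sort on MonthID would have put Dec-15 first.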

Basically you need to dynamically build the PIVOT query and use sp_executesql to run it.
SQL Server, out of the box, has no support for dynamic ever-changing columns as the columns need to be defined in the PIVOT query.
Here's an example of how to accomplish this: http://sqlhints.com/tag/dynamic-pivot-column-names/

Dynamic SQL: Grouping by one variable, counting another for column names

I am trying to do a dynamic sql query, similar to some that have appeared on this forum, but for the life of me, I cannot get it to work.
I am using SQL Server 2008. I have a table with a series of order_ref numbers. Each of these numbers has a varying number of advice_refs associated with it. advice_ref numbers are unique (they are a key from another table). There is at least one advice_ref for each order_ref. There are a bunch of columns that describe information for each advice_ref.
What I want to do is create a table with a row for each unique order_ref, with columns for each advice_ref, in ascending order. The columns would be Advice01, Advice02, ....Advice10, Advice11, etc. Not all the Advice# columns would be filled in for every order_ref and the number of advice# columns would depend on the order_ref with the greatest number of advice_refs.
The table would look like:
Order Advice01 Advice02 Advice03 Advice04.....
1 1 2 3
2 5 8 9 20
3 25
The code I've tried to use is:
DECLARE @SQL NVARCHAR(MAX)
DECLARE @PVT NVARCHAR(MAX)
SELECT @SQL = @SQL + ', COALESCE(' + QUOTENAME('Advice' + RowNum) + ', '''') AS ' + QUOTENAME('Advice' + RowNum),
@PVT = @PVT + ', ' + QUOTENAME('Advice' + RowNum)
FROM (SELECT case when RowNum2 < 10 then '0'+RowNum2 when RowNum2 >=10 then RowNum2 end [RowNum] From
( SELECT DISTINCT CONVERT(VARCHAR, ROW_NUMBER() OVER(PARTITION BY order_ref ORDER BY advice_ref)) [RowNum2]
FROM [ED_dups].[dbo].[NewEDDupsLongForm]
) rn2 ) rn
SET @SQL = 'SELECT order_ref' + @SQL + '
FROM ( SELECT order_ref,
advice_ref,
case when CONVERT(VARCHAR, ROW_NUMBER() OVER(PARTITION BY order_ref ORDER BY advice_ref)) < 10
then ''Advice0'' + CONVERT(VARCHAR, ROW_NUMBER() OVER(PARTITION BY order_ref ORDER BY advice_ref))
else ''Advice'' + CONVERT(VARCHAR, ROW_NUMBER() OVER(PARTITION BY order_ref ORDER BY advice_ref))
end [AdviceID]
FROM [ED_dups].[dbo].[NewEDDupsLongForm]
) data
PIVOT
( MAX(advice_ref)
FOR AdviceID IN (' + STUFF(@PVT, 1, 2, '') + ')
) pvt'
EXECUTE SP_EXECUTESQL @SQL
SQL server tells me that the query executed successfully, but there is no output. When I run snippets of the code, it seems that the problem either lies in the pivot statement, near
+ STUFF(@PVT, 1, 2, '') + ')
and/or in the select statement, near
''Advice0'' +
Thanks in advance for any help--I've been at this for days!

I think you have to initialize the variables, like
DECLARE @SQL NVARCHAR(MAX) = ''
DECLARE @PVT NVARCHAR(MAX) = ''
or
DECLARE @SQL NVARCHAR(MAX)
DECLARE @PVT NVARCHAR(MAX)
SELECT @SQL = '', @PVT = ''
Otherwise your @SQL would be NULL, because concatenating anything onto NULL yields NULL.
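The effect is easy to reproduce in isolation. The snippet below is a minimal sketch with made-up variable names, assuming the default CONCAT_NULL_YIELDS_NULL setting (ON):

```sql
DECLARE @uninitialized nvarchar(max);        -- starts out as NULL
DECLARE @initialized   nvarchar(max) = N'';  -- starts out as an empty string

SET @uninitialized = @uninitialized + N', [Advice01]';
SET @initialized   = @initialized   + N', [Advice01]';

-- The first column is NULL, the second contains ', [Advice01]'.
SELECT @uninitialized AS Uninitialized,
       @initialized   AS Initialized;
```

A NULL statement string would also fit the symptom in the question, where the batch reports success but returns no result set.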

The first thing that comes to my mind is: do you really need SQL to fetch a dataset with a dynamic number of columns? If you are writing an application, then your user interface, be it a web page or a desktop form, would be a much nicer place to transform your data into the desired structure.
If you really need to do it in SQL, you will make your life much easier if you do not try to do everything in one big and rather complicated query, but split it into smaller tasks done step by step. What I would do is use temporary tables to store working results, then use cursors to process order by order and advice by advice while inserting the data into a temporary table or tables, and in the end return the content of this table. Wrap everything in a stored procedure.
This method will also make debugging easier: you can check at every single step whether it has done what it was expected to do.
And a final piece of advice: share the definition of your NewEDDupsLongForm table, so that someone can write some code to help you out.
Cheers

SQL Dynamic Pivot - how to order columns

I'm working on a dynamic pivot query on a table that contains:
OID - OrderID
Size - size of the product
BucketNum - the order that the sizes should go
quantity - how many ordered
The size column contains different sizes depending upon the OID.
So, using the code found here, I put this together:
DECLARE @listCol VARCHAR(2000)
DECLARE @query VARCHAR(4000)
SELECT @listCol = STUFF(( SELECT distinct '], [' + [size]
FROM #t
FOR
XML PATH('')
), 1, 2, '') + ']'
SET @query = 'SELECT * FROM
(SELECT OID, [size], [quantity]
FROM #t
) src
PIVOT (SUM(quantity) FOR Size
IN (' + @listCol + ')) AS pvt'
EXECUTE ( @query )
This works great except that the column headers (the size labels) are not in the order given by the BucketNum column. They are ordered by the size values themselves.
I've tried the optional Order By after the pivot, but that is not working.
How do I control the order in which the columns appear?
Thank you

You need to fix this:
SELECT @listCol = STUFF(( SELECT distinct '], [' + [size]
FROM #t
FOR
XML PATH('')
), 1, 2, '') + ']'
To return the columns in the right order. You might have to do something like this instead of using DISTINCT:
SELECT [size]
FROM #t
GROUP BY [size]
ORDER BY MIN(BucketNum)
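Put together with the rest of the query, the suggestion above might look like the sketch below. The sample rows are invented, and QUOTENAME is used for the brackets instead of the hand-built '], [' separator. Otherwise the names follow the question.

```sql
-- Made-up sample data; table and column names follow the question.
CREATE TABLE #t (OID int, [size] varchar(10), BucketNum int, quantity int);

INSERT INTO #t (OID, [size], BucketNum, quantity)
VALUES (1, 'S', 1, 5),
       (1, 'M', 2, 3),
       (1, 'L', 3, 7),
       (2, 'M', 2, 4);

DECLARE @listCol nvarchar(max), @query nvarchar(max);

-- GROUP BY replaces DISTINCT so the list can be ordered by BucketNum.
SELECT @listCol = STUFF((
        SELECT ',' + QUOTENAME([size])
        FROM #t
        GROUP BY [size]
        ORDER BY MIN(BucketNum)
        FOR XML PATH(''), TYPE).value('.', 'nvarchar(max)'), 1, 1, '');
-- @listCol is now [S],[M],[L]

SET @query = N'SELECT OID, ' + @listCol + N'
FROM (SELECT OID, [size], quantity FROM #t) AS src
PIVOT (SUM(quantity) FOR [size] IN (' + @listCol + N')) AS pvt
ORDER BY OID;';

EXEC sp_executesql @query;
```

Listing the columns explicitly in the outer SELECT, rather than relying on SELECT *, makes the output order depend only on @listCol.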

SELECT @listCol = STUFF(
(SELECT ',' + QUOTENAME([size])
FROM #t
GROUP BY [size]
ORDER BY [size]
FOR XML PATH('')), 1, 1, '')

I saw this link just today, which uses a CTE to build the column list (which, presumably, you could order) on the fly without the need for dynamic sql:
http://blog.stevienova.com/2009/07/13/using-ctes-to-create-dynamic-pivot-tables-in-sql-20052008/

I had the same problem and tried the solution suggested above but, probably due to my level of understanding, couldn't get it to work. I found a simple hack was to create a Temp table with the column headers ordered correctly using Order by statements and then pull in that list to the variable that sets the dynamic pivot query column names.
e.g.
SELECT WeekNum INTO #T3
FROM #T2
GROUP BY WeekNum
ORDER BY MIN(WeekNum)
SELECT @ColumnName1 = ISNULL(@ColumnName1 + ',','') + QuoteName(WeekNum)
FROM (SELECT WeekNum From #T3) AS WeekNum
Worked a treat.
Hope that helps someone.
