Transpose Columns or unpivot SQL Query - sql

I am currently trying to transpose some data inside an SQL query, but I cannot find a solution using UNPIVOT. An example of the data I am working with:
SELECT * FROM (SELECT 'ACCOUNTS' AS Dept
, DATENAME(MONTH, GETDATE()) AS [Month]
, '3254' AS [1st Letter]
, '2544' AS [2nd Letter]
, '1254' AS [3rd Letter]
, '64' AS [4th Letter]
) AS t
I admit I don't fully understand PIVOT and UNPIVOT, and I cannot work out whether either will work for this query. The desired output would be
Dept |ACCOUNTS
Month |May
1st Letter |3254
2nd Letter |2544
3rd Letter |1254
4th Letter |64
I have seen a lot of solutions on Google, but none that match what I am looking for. Would an UNPIVOT produce the output above?

Yes. It just works.
declare @t table (Dept varchar(20), Month varchar(20), [1st letter] varchar(20), [2nd letter] varchar(20), [3rd letter] varchar(20), [4th letter] varchar(20))
insert @t
SELECT 'ACCOUNTS' AS Dept
, DATENAME(MONTH, GETDATE()) AS [Month]
, '3254' AS [1st Letter]
, '2544' AS [2nd Letter]
, '1254' AS [3rd Letter]
, '64' AS [4th Letter]
-- item holds the original column name, [value] holds what was stored in it
SELECT u.item, u.[value] FROM @t AS t
unpivot ([value] for item in (Dept, [Month], [1st letter],[2nd letter],[3rd letter],[4th letter])) u

SELECT
unpvt.[key],
unpvt.[value]
FROM
t
UNPIVOT
(
[value] FOR [key] IN ([Dept],[Month],[1st letter],[2nd letter],[3rd letter],[4th letter])
)
AS unpvt
The UNPIVOT effectively joins on a new table of two columns, in this case [key] and [value].
[key] is the string representation of the field name.
[value] is the value that was stored in that field.
The IN list allows you to specify which fields are being unpivoted.
NOTE: This does mean that you need to know the full list of field names in advance; it won't dynamically adjust to include more fields if you add them to the table. Also, take care when your fields have different data types: UNPIVOT requires every column in the IN list to have the same type and length, so cast them to a common type first (though this is not needed in your example).
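To illustrate the data-type caveat, here is a minimal sketch. The table variable `@mixed` and its columns are hypothetical and not part of the question; the idea is simply to cast every column to one common type in a derived table before unpivoting.

```sql
-- Hypothetical table with mixed column types (illustration only).
DECLARE @mixed TABLE (Dept varchar(20), Headcount int, Opened date);
INSERT INTO @mixed VALUES ('ACCOUNTS', 12, '2020-05-01');

SELECT u.[key], u.[value]
FROM (
    -- Cast everything to the same type AND length; UNPIVOT rejects a mix.
    SELECT
        CAST(Dept AS varchar(30))        AS Dept,
        CAST(Headcount AS varchar(30))   AS Headcount,
        CONVERT(varchar(30), Opened, 23) AS Opened   -- style 23 = yyyy-mm-dd
    FROM @mixed
) AS src
UNPIVOT ([value] FOR [key] IN (Dept, Headcount, Opened)) AS u;
```

Without the casts, SQL Server should reject the query because the column types in the UNPIVOT list conflict.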

Related

SQL Pivot and Distinct from new Columns

I have a table called ADMIN that originally looks like this
KEY VALUE
Version1 2019_RQK#2019
Version2 2019_RQK#2020
Version2 2019_RQK#2021
Version2 2019_RQK#2022
Version2 2020_TKA#2020
Version2 2020_TKA#2021
Version2 2020_TKA#2022
Version2 2020_TKA#2023
And I am trying to change it to look like this
VERSION YEAR1 YEAR2 YEAR3 YEAR4
2019_RQK 2019 2020 2021 2022
2020_TKA 2020 2021 2022 2023
I wrote some SQL to get the left and right parts of the [VALUE] column, but I don't know how to condense it so that each distinct left part appears only once. I tried DISTINCT, but it still returns the repeated entries. This is what I've written so far. I don't know whether the PIVOT function would work here; I tried a few things, but none came out right.
SELECT DISTINCT LEFT([VALUE], 8) AS VERSION, RIGHT([VALUE], 4) AS YEAR
FROM ADMIN
WHERE [KEY] LIKE '%VERSION%'
This just gives me the following, and I'm not sure how to change it within the same query
VERSION YEAR
2019_RQK 2019
2019_RQK 2020
2019_RQK 2021
2019_RQK 2022
2020_TKA 2020
2020_TKA 2021
2020_TKA 2022
2020_TKA 2023
So, yes, you need a PIVOT table to do that. You can learn all about them here, which has a pretty straightforward (and quick!) walkthrough to understand why it works magic.
To PIVOT this table, we need to add a column for YEAR1, YEAR2, etc., so they can be our headers/new columns. I'll do that with a basic ROW_NUMBER function. I know this example has at most 4 new columns per entry, so I'm hardcoding them, but the link above explains how you can dynamically generate the IN list if you don't know the maximum number of columns.
Please note, my test table was created with col1 and col2 because I am lazy. You should swap those for the actual column names.
SELECT * FROM (
-- we start with your basic table, as you provided
SELECT
LEFT(col2, 8) AS VERSION, -- the version prefix (e.g. 2019_RQK) is 8 characters long
RIGHT(col2, 4) AS YEAR,
ROW_NUMBER() OVER (partition by LEFT(col2, 8) order by RIGHT(col2, 4)) as YearNum /* sort these by the proper year, so we don't get out of order */
FROM ADMIN
WHERE col1 LIKE '%VERSION%'
) versionResults
PIVOT (
max([YEAR]) -- grab the year
for [YearNum] -- this column holds our new column headers
in ( /* these are the possible YearNum values, now our new column headers */
[1],
[2],
[3],
[4]
)
) as pivotResults
Demo here.
You also need to extract the first 4 characters as the "base year", then subtract the base year from the year (and add 1) to get an integer value (1-4), and use those integers as the PIVOT list.
Example Fiddle
The reason this is "difficult" is that you have 3 key values stored in 1 column. At least it's fixed width so easy enough to break apart consistently.
If the VALUE column contains differently formatted data, this won't work.
CREATE TABLE Admin
( Key1 char(8)
, Val char(13)
);
INSERT INTO Admin (Key1, Val)
VALUES
('Version1','2019_RQK#2019')
, ('Version2','2019_RQK#2020')
, ('Version2','2019_RQK#2021')
, ('Version2','2019_RQK#2022')
, ('Version2','2020_TKA#2020')
, ('Version2','2020_TKA#2021')
, ('Version2','2020_TKA#2022')
, ('Version2','2020_TKA#2023');
WITH Src AS (
SELECT
Version = SUBSTRING(Val,1,8)
, Year = CAST(SUBSTRING(Val,10,4) as int)
, YearCt = CAST(SUBSTRING(Val,10,4) as int) - CAST(SUBSTRING(Val,1,4) as int) + 1
FROM Admin
)
SELECT
pvt.Version
, Year1 = pvt.[1]
, Year2 = pvt.[2]
, Year3 = pvt.[3]
, Year4 = pvt.[4]
FROM Src
PIVOT (MAX(Year) FOR YearCt IN ([1],[2],[3],[4])) pvt;
Dynamic pivoting, as below, would be a better option: it picks up whatever years are currently in the table, so the query needs no changes as the data changes (note that STRING_AGG requires SQL Server 2017 or later).
DECLARE @cols AS NVARCHAR(MAX), @query AS NVARCHAR(MAX)
SET @cols = ( SELECT STRING_AGG(CONCAT('[year',[n],']'),',')
FROM (SELECT DISTINCT
ROW_NUMBER() OVER
(PARTITION BY LEFT([value], 8) ORDER BY [value]) AS [n]
FROM [admin] ) q );
SET @query =
N'SELECT *
FROM
(
SELECT DISTINCT LEFT([value], 8) AS version, RIGHT([value], 4) AS year,
CONCAT(''year'',ROW_NUMBER() OVER
(PARTITION BY LEFT([value], 8)
ORDER BY RIGHT([value], 4))) AS [n]
FROM [admin]
WHERE [key] LIKE ''%VERSION%'' ) q
PIVOT (
MAX([year]) FOR [n] IN (' + @cols + N')
) p
ORDER BY [version]';
EXEC sp_executesql @query;
Demo
Btw, as an alternative you could split the value at the # sign, so that no lengths have to be hardcoded as function arguments: SUBSTRING([value],1,CHARINDEX('#',[value])-1) extracts the version, and SUBSTRING([value],CHARINDEX('#',[value])+1,LEN([value])) extracts the year.
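As a quick check of the split-on-# idea, here is a small sketch run against a single sample value from the question. The variable `@v` is only for illustration.

```sql
-- Sample value taken from the question's data; @v is a hypothetical variable.
DECLARE @v varchar(20) = '2019_RQK#2021';

SELECT
    SUBSTRING(@v, 1, CHARINDEX('#', @v) - 1)       AS version,  -- everything before '#'
    SUBSTRING(@v, CHARINDEX('#', @v) + 1, LEN(@v)) AS [year];   -- everything after '#'
-- version = 2019_RQK, year = 2021
```

This assumes every value contains exactly one #. If a value has none, CHARINDEX returns 0 and the first SUBSTRING is given a length of -1, which raises an error.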

Unpivot Data with Multiple Columns - Syntax Help Please

I have the following data, which I would like to unpivot
I created this query to unpivot the 'Actual' rows but can't seem to figure out the syntax for unpivoting the 'Plan' and 'PriorYear' as well.
SELECT FiscalYear, Period, MetricName, ActualValue
FROM vw_ExecSummary
UNPIVOT
(ActualValue FOR MetricName IN ( [Net Revenue], [Total C.S. Salaries]
)) AS unpvt
WHERE [Type] = 'Actual'
The unpivoted data looks like this but I want to also add the Plan and PriorYear columns to the right of the ActualValue column below
Any help would be greatly appreciated. Thanks.
I can't test this at the moment, but I think it works like this. First, you need to UNPIVOT all the data:
SELECT fiscalYear, period, type, metricName, metricValue
FROM vw_ExecSummary
UNPIVOT (metricValue FOR metricName IN ([Net Revenue], [Total C.S. Salaries])) unpvt
Which should result in a table that looks something like this:
fiscalYear period type metricName metricValue
===================================================================
15 1 'Actual' 'Net Revenue' 3676798.98999997
15 1 'Actual' 'Total C.S. Salaries' 1463044.72
15 1 'Plan' 'Net Revenue' 3503920.077405
...................... (remaining rows omitted)
We could then PIVOT the rows as normal to get the new columns (that's what it's for):
SELECT fiscalYear, period, metricName,
[Actual] AS actualValue, [Plan] AS planValue, [PriorYear] AS priorYearValue
FROM <previous_data>
PIVOT (SUM(metricValue) FOR type IN ([Actual], [Plan], [PriorYear])) pvt
(the SUM(...) shouldn't actually do anything here, as presumably the other columns comprise a unique row, but we're required to use an aggregate function)
...which should yield something akin to the following:
fiscalYear period metricName actualValue planValue priorYearValue
======================================================================================
15 1 'Net Revenue' 3676798.98999997 3503920.077405 40436344.4499999
...................................... (remaining rows omitted)
So putting it together would look like this:
SELECT fiscalYear, period, metricName,
[Actual] AS actualValue, [Plan] AS planValue, [PriorYear] AS priorYearValue
FROM (SELECT fiscalYear, period, type, metricName, metricValue
FROM vw_ExecSummary
UNPIVOT (metricValue FOR metricName IN ([Net Revenue], [Total C.S. Salaries])) unpvt) unpvt
PIVOT (SUM(metricValue) FOR type IN ([Actual], [Plan], [PriorYear])) AS pvt
SQL Fiddle Example
I have just one concern, though: values like 3676798.98999997, 3503920.077405, etc., make me think those columns are floating point (i.e., REAL or FLOAT), but the values are named for monetary uses. If this is the case.... you are aware floating-point values can't store things like .1 exactly, right (i.e., you can't actually add a dime to a value)? And that, when values get large enough, you can't add 1 anymore either? Usually when dealing with monetary values you should be using something based on a fixed-point type, like DECIMAL or NUMERIC.
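The floating-point concern can be seen with a small sketch that adds a dime ten times. The variable names and the decimal(19,4) precision are arbitrary choices for illustration.

```sql
DECLARE @f float = 0, @d decimal(19,4) = 0, @i int = 0;

-- Add 0.1 ten times to a FLOAT and to a DECIMAL accumulator.
WHILE @i < 10
BEGIN
    SET @f += 0.1;
    SET @d += 0.1;
    SET @i += 1;
END;

-- 0.1 has no exact binary representation, so the FLOAT total is expected
-- to land just short of 1, while the DECIMAL total is exactly 1.0000.
SELECT
    CASE WHEN @f = 1.0 THEN 'equal to 1' ELSE 'not equal to 1' END AS float_total,
    CASE WHEN @d = 1.0 THEN 'equal to 1' ELSE 'not equal to 1' END AS decimal_total;
```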
This is a situation for Itzik Ben-Gan's cross apply values pivoting:
create table #data (FiscalYear smallint, Period tinyint, ValueType nvarchar(25), NetRevenue float, Salaries float);
insert into #data values
(15,1,N'Actual',3676798.98999,1463044.71999),
(15,1,N'Plan',3503920.977405,1335397.32878),
(15,1,N'PriorYear',4043634.449,1543866.89);
select d.FiscalYear, d.Period,
ActualNetRevenue = sum(v.ActualNetRevenue), ActualSalaries = sum(v.ActualSalaries),
PlanNetRevenue = sum(v.PlanNetRevenue), PlanSalaries = sum(v.PlanSalaries),
PriorYearNetRevenue = sum(v.PriorYearNetRevenue), PriorYearSalaries = sum(v.PriorYearSalaries)
from #data d
cross apply
(values
(N'Actual',d.NetRevenue,d.Salaries,0,0,0,0),
(N'Plan',0,0,d.NetRevenue,d.Salaries,0,0),
(N'PriorYear',0,0,0,0,d.NetRevenue,d.Salaries))
v (ValueType, ActualNetRevenue, ActualSalaries, PlanNetRevenue, PlanSalaries, PriorYearNetRevenue, PriorYearSalaries)
where d.ValueType = v.ValueType
group by d.FiscalYear, d.Period;
drop table #data;

Finding the number of concurrent days two events happen over the course of time using a calendar table

I have a table with a structure
(rx)
clmID int
patid int
drugclass char(3)
drugName char(25)
fillDate date
scriptEndDate date
strength int
And a query
;with PatientDrugList(patid, filldate,scriptEndDate,drugClass,strength)
as
(
select rx.patid,rx.fillDate,rx.scriptEndDate,rx.drugClass,rx.strength
from rx
)
,
DrugList(drugName)
as
(
select x.drugClass
from (values('h3a'),('h6h'))
as x(drugClass)
where x.drugClass is not null
)
SELECT PD.patid, C.calendarDate AS overlap_date
FROM PatientDrugList AS PD, Calendar AS C
WHERE drugClass IN ('h3a','h6h')
AND calendardate BETWEEN filldate AND scriptenddate
GROUP BY PD.patid, C.CalendarDate
HAVING COUNT(DISTINCT drugClass) = 2
order by pd.patid,c.calendarDate
The Calendar is simply a calendar table with all possible dates throughout the length of the study, with no other columns.
My query returns data that looks like
The overlap_date represents every day that a person was prescribed a drug in the two classes listed after the PatientDrugList CTE.
I would like to find the number of consecutive days that each person was prescribed both families of drugs. I can't use a simple max and min aggregate because that wouldn't tell me if someone stopped this regimen and then started again. What is an efficient way to find this out?
EDIT: The row constructor in the DrugList CTE should be a parameter for a stored procedure and was amended for the purposes of this example.
You are looking for consecutive sequences of dates. The key observation is that if you subtract a sequence of consecutive integers (a row number) from the dates, every run of consecutive dates yields the same constant date. That constant identifies a group of dates that are all in sequence, which can then be grouped.
select patid
,MIN(overlap_date) as start_overlap
,MAX(overlap_date) as end_overlap
from(select cte.*,(dateadd(day,-row_number() over(partition by patid order by overlap_Date),overlap_date)) as groupDate
from cte
)t
group by patid, groupDate
This code is untested, so it might have some typos.
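The following is a self-contained sketch of the same row-number technique, run against made-up data. The table variable `@overlap` stands in for the output of the asker's query, and the `consecutive_days` column is an addition that answers the "how many days" part of the question.

```sql
-- Hypothetical overlap dates: patient 1 has two separate runs, patient 2 has one day.
DECLARE @overlap TABLE (patid int, overlap_date date);
INSERT INTO @overlap VALUES
    (1, '2013-01-01'), (1, '2013-01-02'), (1, '2013-01-03'),
    (1, '2013-01-10'), (1, '2013-01-11'),
    (2, '2013-02-05');

SELECT patid,
       MIN(overlap_date) AS start_overlap,
       MAX(overlap_date) AS end_overlap,
       COUNT(*)          AS consecutive_days
FROM (
    SELECT patid, overlap_date,
           -- date minus row number is constant within a run of consecutive dates
           DATEADD(day,
                   -CAST(ROW_NUMBER() OVER (PARTITION BY patid ORDER BY overlap_date) AS int),
                   overlap_date) AS groupDate
    FROM @overlap
) AS t
GROUP BY patid, groupDate
ORDER BY patid, start_overlap;
-- patid 1: 2013-01-01 to 2013-01-03 (3 days) and 2013-01-10 to 2013-01-11 (2 days)
-- patid 2: 2013-02-05 to 2013-02-05 (1 day)
```

This relies on there being at most one row per patient per date, which the GROUP BY in the original query already guarantees.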
You need to pivot on something, and a MAX or MIN can work that out. Can you state, for each date, whether someone had both drugs? Then you would be limiting by date, if I understand your question correctly.
Example SQL:
declare @Temp table ( person varchar(8), dt date, drug varchar(8));
insert into @Temp values ('Brett','1-1-2013', 'h3a'),('Brett', '1-1-2013', 'h6h'),('Brett','1-2-2013', 'h3a'),('Brett', '1-2-2013', 'h6h'),('Joe', '1-1-2013', 'H3a'),('Joe', '1-2-2013', 'h6h');
with a as
(
select
person
, dt
, max(case when drug = 'h3a' then 1 else 0 end) as h3a
, max(case when drug = 'h6h' then 1 else 0 end) as h6h
from @Temp
group by person, dt
)
, b as
(
select *, case when h3a = 1 and h6h = 1 then 1 end as Logic
from a
)
select person, count(Logic) as DaysOnBothPrescriptions
from b
group by person

Inserting and transforming data from SQL table

I have a question which has been bugging me for a couple of days now. I have a table with:
Date
ID
Status_ID
Start_Time
End_Time
Status_Time(seconds) (How long they were in a certain status, in seconds)
I want to put this data in another table, that has the Status_ID grouped up as columns. This table has columns like this:
Date
ID
Lunch (in seconds)
Break(in seconds)
Vacation, (in seconds) etc.
So, Status_ID 2 and 3 might be grouped under vacation, Status_ID 1 lunch, etc.
I have thought of doing a Case nested in a while loop, to go through every row to insert into my other table. However, I cannot wrap my head around inserting this data from Status_ID in rows, to columns that they are now grouped by.
There's no need for a WHILE loop.
SELECT
date,
id,
SUM(CASE WHEN status_id = 1 THEN status_time ELSE 0 END) AS lunch,
SUM(CASE WHEN status_id = 2 THEN status_time ELSE 0 END) AS [break], -- BREAK is a reserved word in T-SQL, so it must be bracketed
SUM(CASE WHEN status_id = 3 THEN status_time ELSE 0 END) AS vacation
FROM
My_Table
GROUP BY
date,
id
Also, keeping the status_time in the table is a mistake (unless it's a non-persistent, calculated column). You are effectively storing the same data in two places in the database, which is going to end up resulting in inconsistencies. The same goes for pushing this data into another table with times broken out by status type. Don't create a new table to hold the data, use the query to get the data when you need it.
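A rough sketch of what that advice could look like: a computed (non-persisted) column for the duration, plus a view instead of a second table. The object names `dbo.StatusLogSketch` and `dbo.StatusSecondsByDay` and the column types are assumptions, not the asker's actual schema. The view also shows Status_ID 2 and 3 grouped under one column, as the question describes.

```sql
-- Hypothetical table: Status_Time is derived from the two timestamps,
-- so it can never disagree with them.
CREATE TABLE dbo.StatusLogSketch (
    [Date]      date         NOT NULL,
    ID          int          NOT NULL,
    Status_ID   int          NOT NULL,
    Start_Time  datetime2(0) NOT NULL,
    End_Time    datetime2(0) NOT NULL,
    Status_Time AS DATEDIFF(second, Start_Time, End_Time)  -- computed, not stored
);
GO  -- batch separator for SSMS/sqlcmd; CREATE VIEW must start its own batch

-- A view gives the "columns per status" shape on demand, with no second table.
CREATE VIEW dbo.StatusSecondsByDay AS
SELECT [Date], ID,
       SUM(CASE WHEN Status_ID = 1       THEN Status_Time ELSE 0 END) AS Lunch,
       SUM(CASE WHEN Status_ID IN (2, 3) THEN Status_Time ELSE 0 END) AS Vacation
FROM dbo.StatusLogSketch
GROUP BY [Date], ID;
```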
This type of query (one that transposes values from rows into columns) is called a pivot query (SQL Server) or a crosstab query (Access).
There are two types of pivot queries (generally speaking):
(1) With a fixed number of columns.
(2) With a dynamic number of columns.
SQL Server supports both types, but:
The Database Engine (query language: T-SQL) directly supports only pivot queries with a fixed number of columns (1); it supports a dynamic number of columns (2) only indirectly.
Analysis Services (query language: MDX) directly supports both types (1 and 2).
Also, you can query (in MDX) Analysis Services data sources from T-SQL using the OPENQUERY/OPENROWSET functions or using a linked server with four-part names.
T-SQL (only) solutions:
For the first type (1), starting with SQL Server 2005 you can use the PIVOT operator:
SELECT pvt.*
FROM
(
SELECT Date, Id, Status_ID, Status_Time
FROM Table
) src
PIVOT ( SUM(src.Status_Time) FOR src.Status_ID IN ([1], [2], [3]) ) pvt
or
SELECT pvt.Date, pvt.Id, pvt.[1] AS Lunch, pvt.[2] AS [Break], pvt.[3] Vacation
FROM
(
SELECT Date, Id, Status_ID, Status_Time
FROM Table
) src
PIVOT ( SUM(src.Status_Time) FOR src.Status_ID IN ([1], [2], [3]) ) pvt
For a dynamic number of columns (2), T-SQL offers only an indirect solution: dynamic queries. First, you must find all distinct values from Status_ID and the next move is to build the final query:
DECLARE @SQLStatement NVARCHAR(4000)
,@PivotValues NVARCHAR(4000);
SET @PivotValues = '';
SELECT @PivotValues = @PivotValues + ',' + QUOTENAME(src.Status_ID)
FROM
(
SELECT DISTINCT Status_ID
FROM Table
) src;
SET @PivotValues = SUBSTRING(@PivotValues,2,4000);
SELECT @SQLStatement =
'SELECT pvt.*
FROM
(
SELECT Date, Id, Status_ID, Status_Time
FROM Table
) src
PIVOT ( SUM(src.Status_Time) FOR src.Status_ID IN ('+@PivotValues+') ) pvt';
EXECUTE sp_executesql @SQLStatement;

SQL query ...multiple max value selection. Help needed

Business World 1256987 monthly 10 2009-10-28
Business World 1256987 monthly 10 2009-09-23
Business World 1256987 monthly 10 2009-08-18
Linux 4 U 456734 monthly 25 2009-12-24
Linux 4 U 456734 monthly 25 2009-11-11
Linux 4 U 456734 monthly 25 2009-10-28
I get this result with the query:
SELECT DISTINCT ljm.journelname,ljm. subscription_id,
ljm.frequency,ljm.publisher, ljm.price, ljd.receipt_date
FROM lib_journals_master ljm,
lib_subscriptionhistory
lsh,lib_journal_details ljd
WHERE ljd.journal_id=ljm.id
ORDER BY ljm.publisher
What I need is the latest date in each journal?
I tried this query:
SELECT DISTINCT ljm.journelname, ljm.subscription_id,
ljm.frequency, ljm.publisher, ljm.price,ljd.receipt_date
FROM lib_journals_master ljm,
lib_subscriptionhistory lsh,
lib_journal_details ljd
WHERE ljd.journal_id=ljm.id
AND ljd.receipt_date = (
SELECT max(ljd.receipt_date)
from lib_journal_details ljd)
But it gives me the maximum from the entire column. My needed result will have two dates (maximum of each magazine), but this query gives me only one?
You could change the WHERE clause to look up the last date for each journal:
AND ljd.receipt_date = (
SELECT max(subljd.receipt_date)
from lib_journal_details subljd
where subljd.journal_id = ljd.journal_id)
Make sure to give the table in the subquery a different alias from the table in the main query.
You should use Group By if you need the Max from date.
Should look something like this:
SELECT
ljm.journelname
, ljm.subscription_id
, ljm.frequency
, ljm.publisher
, ljm.price
, MAX(ljd.receipt_date) AS receipt_date
FROM
lib_journals_master ljm
, lib_subscriptionhistory lsh
, lib_journal_details ljd
WHERE
ljd.journal_id=ljm.id
GROUP BY
ljm.journelname
, ljm.subscription_id
, ljm.frequency
, ljm.publisher
, ljm.price
Something like this should work for you.
SELECT ljm.journelname
, ljm.subscription_id
, ljm.frequency
, ljm.publisher
, ljm.price
,md.max_receipt_date
FROM lib_journals_master ljm
, ( SELECT journal_id
, max(receipt_date) as max_receipt_date
FROM lib_journal_details
GROUP BY journal_id) md
WHERE ljm.id = md.journal_id
/
Note that I have removed the tables from the FROM clause which don't contribute anything to the query. You may need to put them back if you simplified your scenario for our benefit.
Separate this into two queries. The first gets each journal name and its latest date:
declare @table table (journalName varchar(100), saleDate datetime) -- the varchar length is a guess; match it to your column
insert into @table
select journalName,max(saleDate) from JournalTable group by journalName
The second selects all the fields you need from your table, joined to @table on journalName (and on the date, so that only the latest row per journal is kept).
Sounds like top of group. You can use a CTE in SQL Server:
;WITH journeldata AS
(
SELECT
ljm.journelname
,ljm.subscription_id
,ljm.frequency
,ljm.publisher
,ljm.price
,ljd.receipt_date
,ROW_NUMBER() OVER (PARTITION BY ljm.journelname ORDER BY ljd.receipt_date DESC) AS RowNumber
FROM
lib_journals_master ljm
,lib_subscriptionhistory lsh
,lib_journal_details ljd
WHERE
ljd.journal_id=ljm.id
AND ljm.subscription_id = ljm.subscription_id
)
SELECT
journelname
,subscription_id
,frequency
,publisher
,price
,receipt_date
FROM journeldata
WHERE RowNumber = 1
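For reference, here is a self-contained sketch of the same top-of-group pattern, written with an explicit JOIN and only the two tables that matter. The table variables `@journal` and `@detail` are stand-ins populated with the dates from the question, and the column lists are trimmed for brevity.

```sql
DECLARE @journal TABLE (id int, journelname varchar(50));
DECLARE @detail  TABLE (journal_id int, receipt_date date);

INSERT INTO @journal VALUES (1, 'Business World'), (2, 'Linux 4 U');
INSERT INTO @detail VALUES
    (1, '2009-10-28'), (1, '2009-09-23'), (1, '2009-08-18'),
    (2, '2009-12-24'), (2, '2009-11-11'), (2, '2009-10-28');

WITH ranked AS (
    SELECT j.journelname, d.receipt_date,
           -- number each journal's receipts, newest first
           ROW_NUMBER() OVER (PARTITION BY j.id ORDER BY d.receipt_date DESC) AS rn
    FROM @journal AS j
    JOIN @detail  AS d ON d.journal_id = j.id
)
SELECT journelname, receipt_date
FROM ranked
WHERE rn = 1          -- keep only the newest receipt per journal
ORDER BY journelname;
-- Business World  2009-10-28
-- Linux 4 U       2009-12-24
```

Joining only the tables that contribute columns avoids the row multiplication that an unconstrained comma join would otherwise introduce.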