SQL: break column into multiple columns by cell value - sql

I'm trying to build a module in VBA that will query 2 data sources and left join them on the common field of "Name". The problem is that the data are grained differently. The set I'm trying to change currently looks like this:
And I'm trying to make it look like this:
So that I can left join it with a query from my second dataset, which is grained by "name" rather than "item value".
FWIW, here's what the query I'm currently trying to run looks like. I'm using an implicit join to get "current value" and "past value" for each "name", but none of this refers to dataset 2:
with set1 as (SELECT NAME, ITEM_CODE, ITEM_VALUE_NUMBER as CURRENT_VALUE FROM [...] WHERE [...]),
with set2 as (SELECT NAME, ITEM_CODE, ITEM_VALUE_NUMBER as PAST_VALUE FROM [...] WHERE [...]),
SELECT set1.NAME [ITEM_1], [ITEM_2], [ITEM_3] from
(SELECT set1.NAME, set1.CURRENT_VALUE, set2.PAST_VALUE FROM set1, set2 WHERE set1.NAME=set2.NAME and set1.ITEM_CODE=set2.ITEM_CODE) AS SourceTable
PIVOT (max(CURRENT_VALUE) for ITEM_CODE in [ITEM_1], [ITEM_2], [ITEM_3]) AS PivotTable;
Thanks in advance for any guidance you can offer!

So, you have a couple options. First is trying to reword the pivot to get the results you want, which makes for a more complicated query than required. You can do this in the following way if you wish:
SELECT Name
, MAX(C1) Item1CurrentValue
, MAX(P1) Item1PastValue
, MAX(C2) Item2CurrentValue
, MAX(P2) Item2PastValue
, MAX(C3) Item3CurrentValue
, MAX(P3) Item3PastValue
FROM (
SELECT set1.Name
, 'C' + CAST(set1.ItemCode AS VARCHAR(255)) ItemCode1
, CurVal
, 'P' + CAST(set2.ItemCode AS VARCHAR(255)) ItemCode2
, PastVal
FROM (SELECT Name, ItemCode, ITEM_VALUE_NUMBER CurVal FROM myFirstQuery) set1
JOIN (SELECT Name, ItemCode, ITEM_VALUE_NUMBER PastVal FROM mySecondQuery) set2 ON set1.Name = set2.Name AND set1.ItemCode = set2.ItemCode) T
PIVOT (MAX(CurVal) FOR ItemCode1 IN (C1, C2, C3)) P1
PIVOT (MAX(PastVal) FOR ItemCode2 IN (P1, P2, P3)) P2
GROUP BY Name;
I don't really think CTEs make the query more readable so I just removed them, but you could use the CTEs in a similar fashion if you wish.
The better (and easier to understand, in my opinion) way is to just use conditional aggregation, like such:
SELECT Name
, MAX(CASE WHEN ItemCode = 1 THEN CurVal END) Item1CurrentValue
, MAX(CASE WHEN ItemCode = 1 THEN PastVal END) Item1PastValue
, MAX(CASE WHEN ItemCode = 2 THEN CurVal END) Item2CurrentValue
, MAX(CASE WHEN ItemCode = 2 THEN PastVal END) Item2PastValue
, MAX(CASE WHEN ItemCode = 3 THEN CurVal END) Item3CurrentValue
, MAX(CASE WHEN ItemCode = 3 THEN PastVal END) Item3PastValue
FROM (
SELECT set1.Name
, set1.ItemCode
, CurVal
, PastVal
FROM (SELECT Name, ItemCode, ITEM_VALUE_NUMBER CurVal FROM myFirstQuery) set1
JOIN (SELECT Name, ItemCode, ITEM_VALUE_NUMBER PastVal FROM mySecondQuery) set2 ON set1.Name = set2.Name AND set1.ItemCode = set2.ItemCode) T
GROUP BY Name;

Related

How to pivot two rows into two columns

I have the following SQL Query:
select
distinct
Equipment_Reserved.Equipment_Attached_To,
Equipment.Name
from
Equipment,
Studies,
Equipment_Reserved
where
Studies.Study = 'MAINT19-01'
and
Equipment.idEquipment = Equipment_Reserved.Equipment_idEquipment
and
Studies.idStudies = Equipment_Reserved.Studies_idStudies
and
Equipment.Type = 'Probe'
This query produces the following results:
Equipment_Attached_To Name
2297 R1-P1
2297 R1-P2
2299 R1-P3
I would like to change it to the following:
Equipment_Attached_To Name1 Name2
2297 R1-P1 R1-P2
2299 R1-P3 NULL
Thanks for your help!
I'd first change your query from the old, legacy JOIN syntax to an explicit join as it makes the query easier to understand:
SELECT
DISTINCT
Equipment_Reserved.Equipment_Attached_To,
Equipment.Name
FROM
Equipment
INNER JOIN Equipment_Reserved ON Equipment_Reserved.Equipment_idEquipment = Equipment.idEquipment
INNER JOIN Studies ON Studies.idStudies = Equipment_Reserved.Studies_idStudies
WHERE
Studies.Study = 'MAINT19-01'
AND
Equipment.Type = 'Probe'
I don't think you actually need a PIVOT - I think you can do this with a nested query with the ROW_NUMBER function. I've seen that PIVOT queries often have worse query execution plans than nested-queries.
Let's add ROW_NUMBER (which require an ORDER BY as it's a windowing-function) and a matching ORDER BY in the whole query to make it consistent). Let's also use PARTITION BY so it resets the row-number for each Equipment_Attached_To value:
SELECT
DISTINCT
Equipment_Reserved.Equipment_Attached_To,
Equipment.Name,
ROW_NUMBER() OVER (PARTITION BY Equipment_Attached_To ORDER BY [Name]) AS RowNumber
FROM
Equipment
INNER JOIN Equipment_Reserved ON Equipment_Reserved.Equipment_idEquipment = Equipment.idEquipment
INNER JOIN Studies ON Studies.idStudies = Equipment_Reserved.Studies_idStudies
WHERE
Studies.Study = 'MAINT19-01'
AND
Equipment.Type = 'Probe'
ORDER BY
Equipment_Attached_To,
[Name]
This will give output like this:
Equipment_Attached_To Name RowNumber
2297 R1-P1 1
2297 R1-P2 2
2299 R1-P3 1
This can then be split out into explicit columns like so below. The use of MAX() is arbitrary (we could use MIN() instead) and only because we're dealing with a GROUP BY and because the CASE WHEN... restricts the input set to just 1 row anyway.
SELECT
Equipment_Attached_To,
MAX( CASE WHEN RowNumber = 1 THEN [Name] END ) AS Name1,
MAX( CASE WHEN RowNumber = 2 THEN [Name] END ) AS Name2
FROM
(
-- the query from above
)
GROUP BY
Equipment_Attached_To
ORDER BY
Equipment_Attached_To,
Name1,
Name2
So the final query is:
SELECT
Equipment_Attached_To,
MAX( CASE WHEN RowNumber = 1 THEN [Name] END ) AS Name1,
MAX( CASE WHEN RowNumber = 2 THEN [Name] END ) AS Name2
FROM
(
SELECT
DISTINCT
Equipment_Reserved.Equipment_Attached_To,
Equipment.Name,
ROW_NUMBER() OVER (PARTITION BY Equipment_Attached_To ORDER BY [Name]) AS RowNumber
FROM
Equipment
INNER JOIN Equipment_Reserved ON Equipment_Reserved.Equipment_idEquipment = Equipment.idEquipment
INNER JOIN Studies ON Studies.idStudies = Equipment_Reserved.Studies_idStudies
WHERE
Studies.Study = 'MAINT19-01'
AND
Equipment.Type = 'Probe'
)
GROUP BY
Equipment_Attached_To
ORDER BY
Equipment_Attached_To,
Name1,
Name2
Let's start with some basics.
To facilitate reading the code, I added alias to the tables using their initials.
Then, I converted the old join syntax which is partly deprecated to use the standard syntax since 1992 (27 years and people still use the old syntax).
Finally, since there are only 2 possible values, we can use MIN and MAX to separate them in 2 columns.
And because we're using aggregate functions, we remove the DISTINCT and use GROUP BY
The code now looks like this:
SELECT er.Equipment_Attached_To,
--Gets the first row for the id
MIN( e.Name) AS Name1,
--If the MAX is equal to the MIN, returns a NULL. If not, it returns the second value.
NULLIF( MAX(e.Name), MIN( e.Name)) AS Name2
FROM Equipment e
JOIN Studies s ON s.idStudies = er.Studies_idStudies
JOIN Equipment_Reserved er ON e.idEquipment = er.Equipment_idEquipment
WHERE s.Study = 'MAINT19-01'
AND e.Type = 'Probe'
GROUP BY er.Equipment_Attached_To;

How to pivot multiple columns without aggregation

I use SqlServer and i have to admit that i'm not realy good with it ...
This might be and easy question for the advanced users (I hope)
I have two tables which look like this
First table (ID isn't the primary key)
ID IdCust Ref
1 300 123
1 300 124
2 302 345
And the second (ID isn't the primary key)
ID Ref Code Price
1 123 A 10
1 123 Y 15
2 124 A 14
3 345 C 18
In the second table, the column "Ref" is the foreign key of "Ref" in the first table
I'm trying to produce the following output:
[EDIT]
The column "Stock", "Code" and "Price" can have x values, so I don't know it, in advance...
I tried so many things like "PIVOT" but it didn't give me the right result, so i hope someone can solve my problem ...
Use row_number() function and do the conditional aggregation :
select id, IdCust, Ref,
max(case when Seq = 1 then stock end) as [Stock A], -- second table *id*
max(case when Seq = 1 then code end) as [Code 1],
max(case when Seq = 1 then price end) as [Price1],
max(case when Seq = 2 then stock end) as [Stock B], -- second table *id*
max(case when Seq = 2 then code end) as [Code 2],
max(case when Seq = 2 then price end) as [Price2]
from (select f.*, s.Id Stock, s.Code, s.Price,
row_number() over (partition by f.Ref order by s.id) as Seq
from first f
inner join second s on s.Ref = f.Ref
) t
group by id, IdCust, Ref;
However, this would go with known values else you would need go with dynamic solution for that.
#YogeshSharma's provided an excellent answer.
Here's the same done using Pivot; SQL Fiddle Demo.
Functionally there's no difference between the two answers. However, Yogesh's solution's simpler to understand, and performs better; so personally I'd opt for that... I included this answer only because you mention PIVOT in the question:
select ft.Id
, ft.IdCust
, ft.Ref
, x.Stock1
, x.Code1
, x.Price1
, x.Stock2
, x.Code2
, x.Price2
from FirstTable ft
left outer join (
select Ref
, max([Stock1]) Stock1
, max([Stock2]) Stock2
, max([Code1]) Code1
, max([Code2]) Code2
, max([Price1]) Price1
, max([Price2]) Price2
from
(
select Ref
, Id Stock
, Code
, Price
, ('Stock' + cast(Row_Number() over (partition by Ref order by Id, Code) as nvarchar)) StockLineNo
, ('Code' + cast(Row_Number() over (partition by Ref order by Id, Code) as nvarchar)) CodeLineNo
, ('Price' + cast(Row_Number() over (partition by Ref order by Id, Code) as nvarchar)) PriceLineNo
from SecondTable
) st
pivot (max(Stock) for StockLineNo in ([Stock1],[Stock2])) pvtStock
pivot (max(Code) for CodeLineNo in ([Code1],[Code2])) pvtCode
pivot (max(Price) for PriceLineNo in ([Price1],[Price2])) pvtPrice
Group by Ref
) x
on x.Ref = ft.Ref
order by ft.Ref
Like Yogesh's solution, this will only handle as many columns as you specify; it won't dynamically alter the number of columns to match the data. For that you'd need to do dynamic SQL. However; if you need to do that, it's more likely you're attempting to solve the problem in the wrong way... so consider your design / determine if you really need additional columns per result rather than additional rows / some alternate approach...
Here's a Dynamic SQL implementation based on #YogeshSharma's answer: DBFiddle
declare #sql nvarchar(max) = 'select id, IdCust, Ref'
select #sql = #sql + '
,max(case when Seq = 1 then stock end) as [Stock' + rowNumVarchar + ']
,max(case when Seq = 1 then code end) as [Code' + rowNumVarchar + ']
,max(case when Seq = 1 then price end) as [Price' + rowNumVarchar + ']
'
from
(
select distinct cast(row_number() over (partition by ref order by ref) as nvarchar) rowNumVarchar
from second s
) z
set #sql = #sql + '
from (select f.*, s.Id Stock, s.Code, s.Price,
row_number() over (partition by f.Ref order by s.id) as Seq
from first f
inner join second s on s.Ref = f.Ref
) t
group by id, IdCust, Ref;
'
print #sql --see what the SQL produced is
exec (#sql)
(Here's a SQL Fiddle link for this one; but it's not working despite the SQL being valid

How to merge two columns from CASE STATEMENT of DIFFERENT CONDITION

My expected result should be like
----invoiceNo----
T17080003,INV14080011
But right now, I've come up with following query.
SELECT AccountDoc.jobCode,AccountDoc.shipmentSyskey,AccountDoc.docType,
CASE AccountDoc.docType
WHEN 'M' THEN
JobInvoice.invoiceNo
WHEN 'I' THEN
(STUFF((SELECT ', ' + RTRIM(CAST(AccountDoc.docNo AS VARCHAR(20)))
FROM AccountDoc LEFT OUTER JOIN JobInvoice
ON AccountDoc.principalCode = JobInvoice.principalCode AND
AccountDoc.jobCode = JobInvoice.jobCode
WHERE (AccountDoc.isCancelledByCN = 0)
AND (AccountDoc.docType = 'I')
AND (AccountDoc.jobCode = #jobCode)
AND (AccountDoc.shipmentSyskey = #shipmentSyskey)
AND (AccountDoc.principalCode = #principalCode) FOR XML
PATH(''), TYPE).value('.','NVARCHAR(MAX)'),1,2,' '))
END AS invoiceNo
FROM AccountDoc LEFT OUTER JOIN JobInvoice
ON JobInvoice.principalCode = AccountDoc.principalCode AND
JobInvoice.jobCode = AccountDoc.jobCode
WHERE (AccountDoc.jobCode = #jobCode)
AND (AccountDoc.isCancelledByCN = 0)
AND (AccountDoc.shipmentSyskey = #shipmentSyskey)
AND (AccountDoc.principalCode = #principalCode)
OUTPUT:
----invoiceNo----
T17080003
INV14080011
Explanation:
I want to select docNo from table AccountDoc if AccountDoc.docType = I.
Or select invoiceNo from table JobInvoice if AccountDoc.docType = M.
The problem is what if under same jobCode there have 2 docType which are M and I, how I gonna display these 2 invoices?
You can achieve this by using CTE and FOR XML. below is the sample code i created using similar tables you have -
Create table #AccountDoc (
id int ,
docType char(1),
docNo varchar(10)
)
Create table #JobInvoice (
id int ,
invoiceNo varchar(10)
)
insert into #AccountDoc
select 1 , 'M' ,'M1234'
union all select 2 , 'M' ,'M2345'
union all select 3 , 'M' ,'M3456'
union all select 4 , 'I' ,'I1234'
union all select 5 , 'I' ,'I2345'
union all select 6 , 'I' ,'I3456'
insert into #JobInvoice
select 1 , 'INV1234'
union all select 2 , 'INV2345'
union all select 3 , 'INV3456'
select *
from #AccountDoc t1 left join #JobInvoice t2
on t1.id = t2.id
with cte as
(
select isnull( case t1.docType WHEN 'M' THEN t2.invoiceNo WHEN 'I' then
t1.docNo end ,'') invoiceNo
from #AccountDoc t1 left join #JobInvoice t2
on t1.id = t2.id )
select invoiceNo + ',' from cte For XML PATH ('')
You need to pivot your data if you have situations where there are two rows, and you want two columns. Your sql is a bit messy, particularly the bit where you put an entire select statement inside a case when in the select part of another query. These two queries are virtually the same, you should look for a more optimal way of writing them. However, you can wrap your entire sql in the following:
select
Jobcode, shipmentsyskey, [M],[I]
from(
--YOUR ENTIRE SQL GOES HERE BETWEEN THESE BRACKETS. Do not alter anything else, just paste your entire sql here
) yoursql
pivot(
max(invoiceno)
for docType in([M],[I])
)pvt

How to add a count/sum and group by in a CTE

Just a question on displaying a row on flight level and displaying a count on how many crew members on that flight.
I want to change the output so it will only display a single record at flight level and it will display two additional columns. One column (cabincrew) is the count of crew members that have the 'CREWTYPE' = 'F' and the other column (cockpitcrew) is the count of crew members that have the `'CREWTYPE' = 'C'.
So the query result should look like:
Flight DepartureDate DepartureAirport CREWBASE CockpitCrew CabinCrew
LS361 2016-05-19 BFS BFS 0 3
Can I have a little help tweaking the below query please:
WITH CTE AS (
SELECT cd.*, c.*, l.Carrier, l.FlightNumber, l.Suffix, l.ScheduledDepartureDate, l.ScheduledDepartureAirport
FROM
(SELECT *, ROW_NUMBER() OVER(PARTITION BY LegKey ORDER BY UpdateID DESC) AS RowNumber FROM Data.Crew) c
INNER JOIN
Data.CrewDetail cd
ON c.UpdateID = cd.CrewUpdateID
AND cd.IsPassive = 0
AND RowNumber = 1
INNER JOIN
Data.Leg l
ON c.LegKey = l.LegKey
)
SELECT
sac.Airline + CAST(sac.FlightNumber AS VARCHAR) + sac.Suffix AS Flight
, sac.DepartureDate
, sac.DepartureAirport
, sac.CREWBASE
, sac.CREWTYPE
, sac.EMPNO
, sac.FIRSTNAME
, sac.LASTNAME
, sac.SEX
FROM
Staging.SabreAssignedCrew sac
LEFT JOIN CTE cte
ON sac.Airline + CAST(sac.FlightNumber AS VARCHAR) + sac.Suffix = cte.Carrier + CAST(cte.FlightNumber AS VARCHAR) + cte.Suffix
AND sac.DepartureDate = cte.ScheduledDepartureDate
PLEASE TRY THIS.
SELECT Flight,
DepartureDate,
DepartureAirport,
CREWBASE,
SUM(CASE WHEN CREWTYPE = 'F' THEN 1 ELSE 0 END) AS CabinCrew ,
SUM(CASE WHEN CREWTYPE = 'C' THEN 1 ELSE 0 END) AS CockpitCrew
FROM #Table
GROUP BY Flight, DepartureDate, DepartureAirport, CREWBASE
Please Try This:
select Flight, DepartureDate, DepartureAirport,CREWBASE,
count(case when CREWTYPE='F' then 1 end ) as CabinCrew,count(case when CREWTYPE='C' then 1 end ) as CockpitCrew
from Staging.SabreAssignedCrew
group by Flight, DepartureDate, DepartureAirport,CREWBASE

Using Order By with Distinct on a Join (PLSQL)

I have written a join on some tables and I have ordered the data using two levels of ordering - one of which is the primary key of one table.
Now, with this data sorted I want to then exclude any duplicates from my data using an in-line view and the DISTINCT clause - and this is where I am coming unstuck.
I seem to be able to either sort the data OR distinct it, but never both at the same time. Is there a way around this or have I stumbled upon the SQL equivalent of the uncertainty principle?
This code returns the data sorted, but with duplicates
SELECT
ada.source_tab source_tab
, ada.source_col source_col
, ada.source_value source_value
, ada.ada_id ada_id
FROM
are_aud_data ada
, are_aud_exec_checks aec
, are_audit_elements ael
WHERE
aec.aec_id = ada.aec_id
AND ael.ano_id = aec.ano_id
AND aec.acn_id = 123456
AND ael.ael_type = 1
ORDER BY
CASE
WHEN source_tab = 'Tab type 1' THEN 1
WHEN source_tab = 'Tab type 2' THEN 2
ELSE 3
END
,ada.ada_id ASC;
This code removes the duplicates, but I lose the order...
SELECT DISTINCT source_tab, source_col, source_value FROM (
SELECT
ada.source_tab
, ada.source_col source_col
, ada.source_value source_value
, ada.ada_id ada_id
FROM
are_aud_data ada
, are_aud_exec_checks aec
, are_audit_elements ael
WHERE
aec.aec_id = ada.aec_id
AND ael.ano_id = aec.ano_id
AND aec.acn_id = 123456
AND ael.ael_type = 1
ORDER BY
CASE
WHEN source_tab = 'Tab type 1' THEN 1
WHEN source_tab = 'Tab type 2' THEN 2
ELSE 3
END
,ada.ada_id ASC
)
;
If I try and include 'ORDER BY ada_id' at the end of the outer select, I get the error message 'ORA-01791: not a SELECTed expression' which is infuriating me!!
Why don't you include ada_id at the selected fields of the outer query?
;WITH CTE AS
(
SELECT
ada.source_tab source_tab
, ada.source_col source_col
, ada.source_value source_value
, ada.ada_id ada_id
, ROW_NUMBER() OVER (PARTITION BY [COLUMNS_YOU_WANT TO BE DISTINCT]
ORDER BY [your_columns]) rn
FROM
are_aud_data ada
, are_aud_exec_checks aec
, are_audit_elements ael
WHERE
aec.aec_id = ada.aec_id
AND ael.ano_id = aec.ano_id
AND aec.acn_id = 356441
AND ael.ael_type = 1
ORDER BY
CASE
WHEN source_tab = 'Licensed Inventory' THEN 1
WHEN source_tab = 'CMDB' THEN 2
ELSE 3
END
,ada.ada_id ASC
)
select * from CTE WHERE rn<2
it seems that the ada_id is meaningless in the outer query.
you have removed all those values to boil it down to the distinct source_tab and source_col...
what would you expect the order to be?
you want maybe the minimum ada_id for each table and column set to be the driver for the order - (although the table name seems appropriate to me)
include the minimum ada_id in the inner query (you'll need a group by clause)
then reference that in the outer query and sort on it.