How to use unpivot operator to combine rows and columns? - sql

I need to group customers by GroupName. A customer can appear under more than one GroupName. Each GroupName has a unique number called "GroupCode" in table OCQG. The customer table (OCRD) has a separate column for each GroupCode. For example, customer C-0001 can belong to several groups. We can identify the GroupCodes for each customer from the values of the Group1, ..., Group64 columns (named QryGroup1, ..., QryGroup64 in the query below): a value of Y means the customer belongs to that group. Please help me.
I tried the following query, but it didn't work.
SELECT p.CardCode, REPLACE(p.QryGroup,'GROUP','') groupcode, ocqg.GroupName
FROM ocrd UNPIVOT
( value
FOR groupcode IN ([QryGroup1],[QryGroup2],[QryGroup3])
) as p,
ocqg
WHERE value = 'Y' and
ocqg.GroupCode = REPLACE(p.groupcode,'GROUP','')
order by p.CardCode
Table Structure as follows,

I recommend using APPLY for this purpose:
SELECT ocrd.CardCode, v.groupcode, ocqg.GroupName
FROM ocrd CROSS APPLY
(VALUES (1, QryGroup1),
(2, QryGroup2),
(3, QryGroup3),
. . .
) v(GroupCode, Value) JOIN
ocqg
ON ocqg.GroupCode = v.GroupCode
WHERE v.value = 'Y'
ORDER BY ocrd.CardCode;
UNPIVOT is bespoke syntax in SQL Server and Oracle that does one thing.
APPLY, on the other hand, implements "lateral joins". These are very powerful, much more powerful than UNPIVOT, and lateral joins are supported by more databases.
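For comparison, the UNPIVOT attempt from the question can also be made to work. That query referenced p.QryGroup, which does not exist after the unpivot, and stripped only 'GROUP' from the column name instead of the whole 'QryGroup' prefix. The following is a minimal sketch rather than a tested solution against the real schema: it assumes ocqg.GroupCode is an integer, it uses temporary tables (#ocrd, #ocqg) with made-up rows in place of the real tables, and the names Flag and GroupColumn are purely illustrative.

```sql
-- Hypothetical sample data standing in for OCRD and OCQG.
CREATE TABLE #ocqg (GroupCode int PRIMARY KEY, GroupName varchar(50));
CREATE TABLE #ocrd (CardCode varchar(20) PRIMARY KEY,
                    QryGroup1 char(1), QryGroup2 char(1), QryGroup3 char(1));

INSERT INTO #ocqg VALUES (1, 'Retail'), (2, 'Wholesale'), (3, 'Export');
INSERT INTO #ocrd VALUES ('C-0001', 'Y', 'N', 'Y'), ('C-0002', 'N', 'Y', 'N');

-- UNPIVOT puts the source column name (e.g. 'QryGroup3') into GroupColumn;
-- stripping the 'QryGroup' prefix leaves the numeric group code.
SELECT p.CardCode, g.GroupCode, g.GroupName
FROM #ocrd
UNPIVOT (Flag FOR GroupColumn IN (QryGroup1, QryGroup2, QryGroup3)) AS p
JOIN #ocqg AS g
  ON g.GroupCode = CAST(REPLACE(p.GroupColumn, 'QryGroup', '') AS int)
WHERE p.Flag = 'Y'
ORDER BY p.CardCode, g.GroupCode;
```

With the sample rows above, this should list C-0001 under groups 1 and 3 and C-0002 under group 2. Note that UNPIVOT needs every listed column to have the same data type, which is one reason the APPLY form is often easier to maintain across 64 columns.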

Related

Can I use string_split with enforcing combination of labels?

So I have the following table:
Id Name Label
---------------------------------------
1 FirstTicket bike|motorbike
2 SecondTicket bike
3 ThirdTicket e-bike|motorbike
4 FourthTicket car|truck
I want to use string_split function to identify rows that have both bike and motorbike labels.
So the desired output in my example will be just the first row:
Id Name Label
--------------------------------------
1 FirstTicket bike|motorbike
Currently, I am using the following query, but it returns rows 1, 2 and 3. I only want the first. Is it possible?
SELECT Id, Name, Label FROM tickets
WHERE EXISTS (
SELECT * FROM STRING_SPLIT(Label, '|')
WHERE value IN ('bike', 'motorbike')
)
You can use APPLY and aggregate:
SELECT t.id, t.Name, t.Label
FROM tickets t CROSS APPLY
STRING_SPLIT(t.Label, '|') t1
WHERE t1.value IN ('bike', 'motorbike')
GROUP BY t.id, t.Name, t.Label
HAVING COUNT(DISTINCT t1.value) = 2;
However, storing labels this way breaks normalization rules; you should have a separate table for the ticket labels.
You could just use string functions for this:
select t.*
from mytable t
where
'|' + label + '|' like '%|bike|%'
and '|' + label + '|' like '%|motorbike|%'
I would expect this to be more efficient than other methods that split and aggregate.
Please note, however, that you should really consider fixing your data model. Instead of storing delimited lists, you should have a separate table to represent the relation between tickets and labels, with one row per ticket/label tuple. Storing delimited lists in a database column is a well-known SQL antipattern that should be avoided wherever possible (hard to maintain, hard to query, hard to enforce data integrity, inefficient, ...). You can have a look at this famous SO post for more on this topic.
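As a rough illustration of the normalized design described above, here is a minimal sketch. The table name #ticket_labels and its columns are assumptions made for this example and are not part of the original schema.

```sql
-- Hypothetical junction table: one row per ticket/label pair.
CREATE TABLE #ticket_labels (
    TicketId int         NOT NULL,
    Label    varchar(50) NOT NULL,
    PRIMARY KEY (TicketId, Label)   -- prevents duplicate labels per ticket
);

INSERT INTO #ticket_labels (TicketId, Label)
VALUES (1, 'bike'), (1, 'motorbike'),
       (2, 'bike'),
       (3, 'e-bike'), (3, 'motorbike'),
       (4, 'car'), (4, 'truck');

-- Tickets that carry both labels. The primary key guarantees at most one
-- row per ticket and label, so two matching rows means both are present.
SELECT tl.TicketId
FROM #ticket_labels AS tl
WHERE tl.Label IN ('bike', 'motorbike')
GROUP BY tl.TicketId
HAVING COUNT(*) = 2;
```

With these rows only ticket 1 qualifies. No string splitting is needed, and an index on Label could support the filter.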
Yogesh beat me to it; my solution is similar but with a HUGE performance improvement worth pointing out. We'll start with this sample data:
SET NOCOUNT ON;
IF OBJECT_ID('tempdb..#tickets','U') IS NOT NULL DROP TABLE #tickets;
CREATE TABLE #tickets (Id INT, [Name] VARCHAR(50), Label VARCHAR(1000));
INSERT #tickets (Id, [Name], Label)
VALUES
(1,'FirstTicket' , 'bike|motorbike'),
(2,'SecondTicket', 'bike'),
(3,'ThirdTicket' , 'e-bike|motorbike'),
(4,'FourthTicket', 'car|truck'),
(5,'FifthTicket', 'motorbike|bike');
Now the original and much improved version:
-- Original
SELECT t.id, t.[Name], t.Label
FROM #tickets AS t
CROSS APPLY STRING_SPLIT(t.Label, '|') t1
WHERE t1.[value] IN ('bike', 'motorbike')
GROUP BY t.id, t.[Name], t.Label
HAVING COUNT(DISTINCT t1.[value]) = 2;
-- Improved Version Leveraging APPLY to avoid a sort
SELECT t.Id, t.[Name], t.Label
FROM #tickets AS t
CROSS APPLY
(
SELECT 1
FROM STRING_SPLIT(t.Label,'|') AS split
WHERE split.[value] IN ('bike','motorbike')
HAVING COUNT(*) = 2
) AS isMatch(TF);
Now the execution plans:
If you compare the costs, the "sortless" query is 4.36 times faster than the original. In reality the gap is larger because the first version is not just sorting, it is sorting on three columns: an int and two (n)varchars. Because sorting costs grow as N * LOG(N), the original query gets disproportionately slower the more rows you throw at it.

Pivot in SQL without Aggregate function

I have a scenario where I have a table like this:
Table View
and the output I want is:
If your argument is "I will only ever have one value or no values, therefore I don't want an aggregate", realise that there are several aggregates that, if they're only passed a single value to aggregate, will return that value back as their result. MIN and MAX come to mind. SUM also works for numeric data.
Therefore the solution to specifying a PIVOT without an aggregate is instead to specify such a "pass through" aggregate here.
Basically, PIVOT internally works much like GROUP BY, except that the grouping columns are all columns in the current result set other than the columns mentioned in the aggregate and FOR parts of the PIVOT specification. And just as with the rules for the SELECT clause when GROUP BY is used [1], every column either needs to be a grouping column or be contained in an aggregate.
[1] Grumble, grumble, older MySQL grumble. Although the defaults are more sensible from 5.7.5 up.
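To make the comparison with GROUP BY concrete, the following sketch writes a pivot by hand, using MAX as the "pass through" aggregate. The sample rows and the column names (productname, platformname, grade) are assumptions for illustration, since the original table was only shown as an image.

```sql
WITH src AS (
    SELECT 'abc' AS productname, 'Web' AS platformname, 'A' AS grade
    UNION ALL SELECT 'cde', 'Web', 'B'
    UNION ALL SELECT 'xyz', 'IOS', 'C'
)
SELECT productname,   -- the grouping column
       -- Each CASE sees at most one non-NULL grade per product, so MAX
       -- simply hands that value back (or NULL when there is none).
       MAX(CASE WHEN platformname = 'Web' THEN grade END) AS [Web],
       MAX(CASE WHEN platformname = 'IOS' THEN grade END) AS [IOS]
FROM src
GROUP BY productname;
```

This returns one row per product, with the grade under the matching platform column and NULL elsewhere, which is the same shape a PIVOT with MAX(grade) would produce.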
Try this:
Demo
with cte1 as
(
select 'Web' as platformname,'abc' as productname,'A' as grade
union all
select 'Web' ,'cde' ,'B'
union all
select 'IOS' ,'xyz' ,'C'
union all
select 'Mac' ,'cde' ,'D'
)
select productname,[Web], [IOS], [Android],[Universal],[Mac],[Win32]
from cte1 t
pivot
(
max(grade)
for platformname in ([Web], [IOS], [Android],[Universal],[Mac],[Win32])
) p
You can "pivot" such data using joins:
select p.productname,
t_win32.grade as win32,
t_universal.grade as universal,
. . .
from products p left join -- assumes you have such a table
t t_win32
on t_win32.productname = p.productname and t_win32.platform = 'Win32' left join
t t_universal
on t_universal.productname = p.productname and t_universal.platform = 'Universal' left join
. . .
If you don't have a table products, use a derived table instead:
from (select distinct productname from t) p left join
. . .

Modify my SQL Server query -- returns too many rows sometimes

I need to update the following query so that it only returns one child record (remittance) per parent (claim).
Table Remit_To_Activate contains exactly one date/timestamp per claim, which is what I wanted.
But when I join the full Remittance table to it, since some claims have multiple remittances with the same date/timestamps, the outermost query returns more than 1 row per claim for those claim IDs.
SELECT * FROM REMITTANCE
WHERE BILLED_AMOUNT>0 AND ACTIVE=0
AND REMITTANCE_UUID IN (
SELECT REMITTANCE_UUID FROM Claims_Group2 G2
INNER JOIN Remit_To_Activate t ON (
(t.ClaimID = G2.CLAIM_ID) AND
(t.DATE_OF_LATEST_REGULAR_REMIT = G2.CREATE_DATETIME)
)
where ACTIVE=0 and BILLED_AMOUNT>0
)
I believe the problem would be resolved if I included REMITTANCE_UUID as a column in Remit_To_Activate. That's the REAL issue. This is how I created the Remit_To_Activate table (trying to get the most recent remittance for a claim):
SELECT MAX(create_datetime) as DATE_OF_LATEST_REMIT,
MAX(claim_id) AS ClaimID
INTO Latest_Remit_To_Activate
FROM Claims_Group2
WHERE BILLED_AMOUNT>0
GROUP BY Claim_ID
ORDER BY Claim_ID
Claims_Group2 contains these fields:
REMITTANCE_UUID,
CLAIM_ID,
BILLED_AMOUNT,
CREATE_DATETIME
Here are the 2 rows that are currently giving me the problem. They're both remits for the SAME CLAIM, with the SAME TIMESTAMP. I only want one of them in the Remit_To_Activate table, so only ONE remittance will be "activated" per claim:
You can change your query like this:
SELECT
p.*, latest_remit.DATE_OF_LATEST_REMIT
FROM
Remittance AS p inner join
(SELECT MAX(create_datetime) as DATE_OF_LATEST_REMIT,
claim_id
FROM Claims_Group2
WHERE BILLED_AMOUNT>0
GROUP BY Claim_ID) as latest_remit
on latest_remit.claim_id = p.claim_id;
This is meant to give you only one row per claim. Untested (so please run it and make changes as needed).
Without more information on the structure of your database, especially the structure of Claims_Group2 and REMITTANCE and the relationship between them, it's not really possible to advise you on how to introduce a remittance UUID into Remit_To_Activate.
Since you are using SQL Server, however, it is possible to use a window function to introduce a synthetic means to choose among remittances having the same timestamp. For example, it looks like you could approach the problem something like this:
select *
from (
select
r.*,
row_number() over (partition by cg2.claim_id order by cg2.create_datetime desc) as rn
from
remittance r
join claims_group2 cg2
on r.remittance_uuid = cg2.remittance_uuid
where
r.active = 0
and r.billed_amount > 0
and cg2.active = 0
and cg2.billed_amount > 0
) t
where t.rn = 1
Note that this does not depend on your Remit_To_Activate table at all, since that table's logic has been subsumed into the inline view. Note also that this will introduce one extra column into your results, though you could avoid that by enumerating the columns of table remittance in the outer select clause.
It also seems odd to be filtering on two sets of active and billed_amount columns, but that appears to follow from what you were doing in your original queries. In that vein, I urge you to check the results carefully, as lifting the filter conditions on cg2 columns up to the level of the join to remittance yields a result that may return rows that the original query did not (but never more than one per claim_id).
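If it matters which of several same-timestamp remittances is picked, a tiebreaker can be added to the window's ORDER BY so that the choice is repeatable. This is a minimal sketch under assumed data: the temp table #cg2 and its rows are invented for the example, and using remittance_uuid as the tiebreaker is an arbitrary choice, not something the original schema dictates.

```sql
-- Hypothetical stand-in for Claims_Group2, reduced to three columns.
CREATE TABLE #cg2 (
    remittance_uuid uniqueidentifier NOT NULL,
    claim_id        int              NOT NULL,
    create_datetime datetime         NOT NULL
);

INSERT INTO #cg2 (remittance_uuid, claim_id, create_datetime)
VALUES (NEWID(), 1, '2018-08-15T13:07:50.933'),
       (NEWID(), 1, '2018-08-15T13:07:50.933'),  -- same claim, same timestamp
       (NEWID(), 2, '2017-12-31T10:00:00');

SELECT t.remittance_uuid, t.claim_id, t.create_datetime
FROM (
    SELECT c.*,
           ROW_NUMBER() OVER (PARTITION BY c.claim_id
                              ORDER BY c.create_datetime DESC,
                                       c.remittance_uuid) AS rn  -- tiebreaker
    FROM #cg2 AS c
) AS t
WHERE t.rn = 1;
```

This returns exactly one row per claim_id. Without the second ORDER BY term, SQL Server is free to number tied rows in any order, so the surviving remittance could differ between runs.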
A co-worker offered me this elegant demonstration of a solution. I'd never used "over" or "partition" before. Works great! Thank you John and Gaurasvsa for your input.
if OBJECT_ID('tempdb..#t') is not null
drop table #t
select *, ROW_NUMBER() over (partition by CLAIM_ID order by CLAIM_ID) as ROW_NUM
into #t
from
(
select '2018-08-15 13:07:50.933' as CREATE_DATE, 1 as CLAIM_ID, NEWID() as REMIT_UUID
union select '2018-08-15 13:07:50.933', 1, NEWID()
union select '2017-12-31 10:00:00.000', 2, NEWID()
) x
select *
from #t
order by CLAIM_ID, ROW_NUM
select CREATE_DATE, MAX(CLAIM_ID), MAX(REMIT_UUID)
from #t
where ROW_NUM = 1
group by CREATE_DATE

unpivot calculated columns value

SELECT
P.EmployeeId,
(P.Amount*1)*T.RegularHours as RegularEarnings
,(P.Amount*1.5)*T.OTHours as OvertimeEarnings
,(P.Amount*2)*T.DOTHours as DoubleOvertimeEarnings
,((P.Amount*1)*T.RegularHours)+((P.Amount*1.5)*T.OTHours )+((P.Amount*2)*T.DOTHours)+((SELECT ISNULL(SUM(AMOUNT),0) FROM EmployeeEarning WHERE EmployeeId=T.EmployeeId)) as GrossPay
FROM TimeSheet as T
INNER JOIN PayInformation as P ON T.EmployeeId=P.EmployeeId
Above is my SELECT statement, which returns the earnings as columns. I want to unpivot the result so that each EmployeeId has two columns: one for the earning type and one for the amount of the earning.
I have already done a simple unpivot:
select unpvt.car_id, unpvt.attribute, unpvt.value
from #cars c
unpivot (
value
for attribute in (Make, Model, Color)
) unpvt
But now I am stuck, as I have to unpivot values that are calculated in the SELECT statement.
I have attempted something more, and here is my latest code:
SELECT unpvt.EmployeeId, unpvt.Earning, unpvt.Amount
FROM
(SELECT
P.EmployeeId,
(P.Amount*1)*T.RegularHours as RegularEarnings
,(P.Amount*1.5)*T.OTHours as OvertimeEarnings
,(P.Amount*2)*T.DOTHours as DoubleOvertimeEarnings
,((P.Amount*1)*T.RegularHours)+((P.Amount*1.5)*T.OTHours )+((P.Amount*2)*T.DOTHours)+((SELECT ISNULL(SUM(AMOUNT),0) FROM EmployeeEarning WHERE EmployeeId=T.EmployeeId)) as GrossPay
FROM TimeSheet as T
INNER JOIN PayInformation as P ON T.EmployeeId=P.EmployeeId )as c
unpivot (
Amount
for Earning in (RegularEarnings,OvertimeEarnings,DoubleOvertimeEarnings)
)unpvt
and now I am getting this error:
The type of column "OvertimeEarnings" conflicts with the type of other columns specified in the UNPIVOT list.
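This error generally means that the columns listed in UNPIVOT do not all share exactly the same data type, length, and precision, which UNPIVOT requires. Expressions such as P.Amount*1 and P.Amount*1.5 can resolve to different types. One possible fix, sketched below, is to cast each expression to a common type inside the derived table. The temp tables, their column types, the sample row, and the choice of decimal(18, 2) are all assumptions for illustration, and the GrossPay column is left out for brevity.

```sql
-- Hypothetical stand-ins for PayInformation and TimeSheet.
CREATE TABLE #PayInformation (EmployeeId int, Amount money);
CREATE TABLE #TimeSheet (EmployeeId int, RegularHours decimal(5, 2),
                         OTHours decimal(5, 2), DOTHours decimal(5, 2));

INSERT INTO #PayInformation VALUES (1, 20);
INSERT INTO #TimeSheet VALUES (1, 40, 5, 2);

SELECT unpvt.EmployeeId, unpvt.Earning, unpvt.Amount
FROM (
    SELECT P.EmployeeId,
           -- Casting every column to one type satisfies UNPIVOT's requirement.
           CAST(P.Amount * 1   * T.RegularHours AS decimal(18, 2)) AS RegularEarnings,
           CAST(P.Amount * 1.5 * T.OTHours      AS decimal(18, 2)) AS OvertimeEarnings,
           CAST(P.Amount * 2   * T.DOTHours     AS decimal(18, 2)) AS DoubleOvertimeEarnings
    FROM #TimeSheet AS T
    INNER JOIN #PayInformation AS P ON T.EmployeeId = P.EmployeeId
) AS c
UNPIVOT (
    Amount FOR Earning IN (RegularEarnings, OvertimeEarnings, DoubleOvertimeEarnings)
) AS unpvt;
```

With the sample row this should yield three rows for employee 1, one per earning type. Keep in mind that UNPIVOT silently drops rows whose value is NULL, so an employee with NULL hours would lose that earning type from the output.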

SQL : Turning rows into columns

I need to turn the values of rows into columns. For example:
SELECT s.section_name,
s.section_value
FROM tbl_sections s
this outputs :
section_name section_value
-----------------------------
sectionI One
sectionII Two
sectionIII Three
desired output :
sectionI sectionII sectionIII
-----------------------------------------
One Two Three
This is probably better done client-side in the programming language of your choice.
You absolutely need to know the section names in advance to turn them into column names.
Updated answer for Oracle 11g (using the new PIVOT operator):
SELECT * FROM
(SELECT section_name, section_value FROM tbl_sections)
PIVOT (
MAX(section_value)
FOR section_name IN ('sectionI', 'sectionII', 'sectionIII')
)
For older versions, you could do some self-joins:
WITH data AS (
SELECT section_name, section_value FROM tbl_sections
)
SELECT
one.section_value AS "sectionI",
two.section_value AS "sectionII",
three.section_value AS "sectionIII"
FROM
(SELECT section_value FROM data WHERE section_name = 'sectionI') one
CROSS JOIN
(SELECT section_value FROM data WHERE section_name = 'sectionII') two
CROSS JOIN
(SELECT section_value FROM data WHERE section_name = 'sectionIII') three
or also use the MAX trick and "aggregate":
SELECT
MAX(DECODE(section_name, 'sectionI', section_value, '')) AS "sectionI",
MAX(DECODE(section_name, 'sectionII', section_value, '')) AS "sectionII",
MAX(DECODE(section_name, 'sectionIII', section_value, '')) AS "sectionIII"
FROM tbl_sections