How do I take the values of one record in a table and split it into multiple columns? - sql

I have two tables which contain the following fields I need to use:
Master Data: Master_ID (PK)
Item Data: Crate_ID, Master_ID (FK), Item_Type_ID, Item_Type_Description, Item_Date
The Item_Type_ID has several different numerical values, i.e. 10, 20, 30, 40, 50 ... 100 ... etc.
Each numerical value represents a type, i.e. Veggie, Fruit, Grains, Meat, etc.
The Item_Type_Description are things like: Fruit, Veggies, Grains, Meat, etc.
The Item_Date is a single date that identifies when that particular item (based upon Item_Type_ID) was added to the Crate.
Note that there can only ever be one unique Item_Type_ID per Master_ID. Meaning, Item_Type_ID '10' can only ever be related to Master_ID '1234' once. An Item_Type_ID can be related to many different Master_IDs, but for each of those Master_IDs, it can only be related once.
The issue I am having is that I can get the combined results, but for each Item_Type_ID, a distinct record/row is being created.
Here is the code I have generated thus far, which is giving me the incorrect Results:
USE Shipping
GO
BEGIN
SELECT
vmi.master_id
,CASE
WHEN vid.item_type_id = 10 THEN vid.item_date
ELSE NULL
END as 'Fruit_Item_Date'
,CASE
WHEN vid.item_type_id = 20 THEN vid.item_date
ELSE NULL
END as 'Veggie_Item_Date'
,CASE
WHEN vid.item_type_id = 30 THEN vid.item_date
ELSE NULL
END as 'Grains_Item_Date'
,CASE
WHEN vid.item_type_id = 40 THEN vid.item_date
ELSE NULL
END as 'Meat_Item_Date'
FROM v_master_data vmi
LEFT JOIN v_item_data vid ON vmi.master_id = vid.master_id
WHERE vid.item_type_id IN (10,20,30,40)
END
GO
Any input, pointers, assistance, direction, advice, is greatly appreciated.
Running SQL Server 2016, accessed via SQL Server Management Studio v18.

Perhaps this will give you a little nudge
PIVOT
Select Master_ID
,Fruit_Date = [10]
,Veggie_Date = [20]
,Grains_Date = [30]
,Meat_Date = [40]
From (
Select Master_ID
,Item_Type_ID
,Item_Date
From YourTable
) src
Pivot ( max(Item_Date) for Item_Type_ID in ( [10],[20],[30],[40] ) ) pvt
Conditional Aggregation
Select Master_ID
,Fruit_Date = max( case when Item_Type_ID = 10 then Item_Date end)
,Veggie_Date = max( case when Item_Type_ID = 20 then Item_Date end)
,Grains_Date = max( case when Item_Type_ID = 30 then Item_Date end)
,Meat_Date = max( case when Item_Type_ID = 40 then Item_Date end)
From YourTable
Group By Master_ID
A conditional aggregation offers a bit more flexibility and is often more performant.
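To see the conditional aggregation collapse the rows, here is a minimal sketch that runs the same query shape against an in-memory SQLite database through Python's sqlite3 module. SQLite is only standing in for SQL Server here, and the sample rows are made up for illustration; the table name YourTable and the column names follow the answer above.

```python
import sqlite3

# Hypothetical sample rows (Master_ID, Item_Type_ID, Item_Date); not from the question.
rows = [
    (1234, 10, "2020-01-05"),
    (1234, 20, "2020-01-07"),
    (1234, 40, "2020-01-09"),
    (5678, 30, "2020-02-01"),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE YourTable (Master_ID INTEGER, Item_Type_ID INTEGER, Item_Date TEXT)"
)
conn.executemany("INSERT INTO YourTable VALUES (?, ?, ?)", rows)

# One output row per Master_ID; each CASE picks out the date for one item type,
# and MAX ignores the NULLs produced for the other types.
query = """
SELECT Master_ID,
       MAX(CASE WHEN Item_Type_ID = 10 THEN Item_Date END) AS Fruit_Date,
       MAX(CASE WHEN Item_Type_ID = 20 THEN Item_Date END) AS Veggie_Date,
       MAX(CASE WHEN Item_Type_ID = 30 THEN Item_Date END) AS Grains_Date,
       MAX(CASE WHEN Item_Type_ID = 40 THEN Item_Date END) AS Meat_Date
FROM YourTable
GROUP BY Master_ID
ORDER BY Master_ID
"""
pivoted = conn.execute(query).fetchall()
for row in pivoted:
    print(row)
```

Item types that a Master_ID does not have simply come back as NULL (None in Python) in that column.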

Related

How to create a case statement which groups fields?

I am trying to understand how to group values together to add an indicator. I want to 'fix' the values and based on this, attribute an indicator.
The values I am trying to group are date, customer name and product type to create an indicator which captures what kind of order was placed (fruit only, fruit and vegetable, vegetable only). The goal is to calculate the total volume of each kind of order placed. The data is set out like this, and the column I am trying to create is the 'Order Type'.
What I have done so far:
I originally completed this analysis in Tableau, where I was able to use the 'Fixed' function and sum the value of indicators (for fruit or veggie) to determine each order type individually.
I have written case statements to identify the product type, with the idea that I could sum these to determine the order type (code below). However, this did not work, as I only need one instance of the indicator for each order. To solve this, I have written a case statement which partitions the fields and orders by date to get one instance of an indicator for each order.
Case Statements
CASE WHEN Product_Type = 'Fruit' THEN 1 ELSE 0 END AS Fruit_Indicator
, CASE WHEN Product_Type = 'Vegetable' THEN 1 ELSE 0 END AS Veg_Indicator
Case Statement with partition by and order by
, CASE WHEN ROW_NUMBER() OVER (PARTITION BY Order_Date, Customer ORDER BY Order_Date ASC) = 1 AND Product_Type = 'Fruit' THEN 1 ELSE NULL END AS Fruit_Ind
, CASE WHEN ROW_NUMBER() OVER (PARTITION BY Order_Date, Customer ORDER BY Order_Date ASC) = 1 AND Product_Type = 'Vegetable' THEN 1 ELSE NULL END AS Veg_Ind
I would appreciate any guidance on the right direction.
Thanks!
It APPEARS you are trying to get data grouped by date such as Mar 21, Mar 22, etc. So, you may want a secondary query to join the primary data to. The second query will be an aggregate by customer and date. If the date field is date/time oriented, you will have to adjust the group by to ignore the time component, for example by formatting the value as month/day/year or by using a function that returns just the date part. Then, joining your original data to the aggregate should get you what you need. Maybe something like:
select
yt.date,
yt.customer,
yt.product,
yt.productType,
case when PreQuery.IsFruit > 0 and PreQuery.IsVegetable > 0
then 'Fruit & Vegetable'
when PreQuery.IsFruit > 0 and PreQuery.IsVegetable = 0
then 'Fruit Only'
when PreQuery.IsFruit = 0 and PreQuery.IsVegetable > 0
then 'Vegetable Only' end OrderType
from
YourTable yt
JOIN
( select
yt2.customer,
yt2.date,
max( case when yt2.ProductType = 'Fruit'
then 1 else 0 end ) IsFruit,
max( case when yt2.ProductType = 'Vegetable'
then 1 else 0 end ) IsVegetable
from
YourTable yt2
-- if you want to restrict time period, add a where
-- clause here on the date range as to not query entire table
group by
yt2.customer,
yt2.date ) PreQuery
ON yt.customer = PreQuery.customer
AND yt.date = PreQuery.date
-- same here for your outer query: if you want to restrict the time period,
-- add a where clause on the date range so as not to query the entire table
order by
yt.date,
yt.customer,
yt.product
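For anyone who wants to try the pre-aggregate-and-join idea end to end, here is a small sketch using Python's sqlite3 module with an in-memory database. SQLite is only a stand-in for whatever engine you are on, and the table name orders, its column names, and the sample rows are all invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (order_date TEXT, customer TEXT, product TEXT, product_type TEXT)"
)
# Hypothetical sample data: one mixed order, one fruit-only, one vegetable-only.
conn.executemany("INSERT INTO orders VALUES (?, ?, ?, ?)", [
    ("2021-03-21", "Alice", "Apple", "Fruit"),
    ("2021-03-21", "Alice", "Carrot", "Vegetable"),
    ("2021-03-21", "Bob", "Pear", "Fruit"),
    ("2021-03-22", "Bob", "Leek", "Vegetable"),
])

# The inner query computes one flag pair per (date, customer);
# the outer query joins those flags back onto every detail row.
query = """
SELECT o.order_date, o.customer, o.product,
       CASE WHEN p.is_fruit = 1 AND p.is_veg = 1 THEN 'Fruit & Vegetable'
            WHEN p.is_fruit = 1 THEN 'Fruit Only'
            WHEN p.is_veg = 1 THEN 'Vegetable Only'
       END AS order_type
FROM orders o
JOIN (SELECT order_date, customer,
             MAX(CASE WHEN product_type = 'Fruit' THEN 1 ELSE 0 END) AS is_fruit,
             MAX(CASE WHEN product_type = 'Vegetable' THEN 1 ELSE 0 END) AS is_veg
      FROM orders
      GROUP BY order_date, customer) p
  ON o.order_date = p.order_date AND o.customer = p.customer
ORDER BY o.order_date, o.customer, o.product
"""
labelled = conn.execute(query).fetchall()
for row in labelled:
    print(row)
```

Every row of the same order gets the same label, which is the behaviour Tableau's 'Fixed' calculation gives.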

Subquery or CTE to add additional column

I have a large data set of truck tickets which generates two lines of output per ticket. This is because each ticket has an "out" and an "in" component. I want to generate one line of output but include information from both the "out" portion of the ticket and the "in" portion.
SELECT Ticket_number, Oil_volume, Facility_ID, Ticket_Type
FROM Truckticket T
JOIN TBATTERY TB
ON TB.Battery_ID = T.Battery_ID
My Output has two lines:
Ticket_number  Oil_volume  Facility_ID  Ticket_type
1              10          SK01         O
1              10          SK02         I
Now, what I want my output to be when I use a where clause on Facility_ID SK01:
Ticket_number  Oil_volume  Facility_ID  Facility_ID  Ticket_type
1              10          SK01         SK02         O
I know I have to do this with a subquery or CTE to get Facility_ID SK02 on the same line, but I'm stuck. I hope I have presented my question OK; this is my first time. Thanks!
Instead of a subquery or CTE, this looks like a good time to use aggregation and pivoting. With a small number of static types, a simple MAX(CASE... expression with a GROUP BY can pivot the rows to columns.
SELECT
Ticket_number,
MAX(CASE WHEN Ticket_Type = 'O' then Oil_volume else NULL END) Oil_volume_out,
MAX(CASE WHEN Ticket_Type = 'I' then Oil_volume else NULL END) Oil_volume_in,
MAX(CASE WHEN Ticket_Type = 'O' then Facility_ID else NULL END) Facility_ID_out,
MAX(CASE WHEN Ticket_Type = 'I' then Facility_ID else NULL END) Facility_ID_in
FROM Truckticket T
JOIN TBATTERY TB
ON TB.Battery_ID = T.Battery_ID
GROUP BY Ticket_number;
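One thing to watch with the "where clause on Facility_ID SK01" part of the question: a plain WHERE runs before the grouping and would throw away the 'I' row, so the filter has to be applied to the pivoted value instead, for example with HAVING. Below is a rough sketch of that, run against in-memory SQLite via Python's sqlite3 module. The single tickets table is a hypothetical stand-in for the joined Truckticket/TBATTERY result, and the second ticket is invented sample data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical flattened table standing in for the Truckticket/TBATTERY join.
conn.execute(
    "CREATE TABLE tickets (Ticket_number INTEGER, Oil_volume INTEGER, "
    "Facility_ID TEXT, Ticket_Type TEXT)"
)
conn.executemany("INSERT INTO tickets VALUES (?, ?, ?, ?)", [
    (1, 10, "SK01", "O"),
    (1, 10, "SK02", "I"),
    (2, 25, "SK03", "O"),
    (2, 25, "SK01", "I"),
])

# Pivot the two rows of each ticket into one, then keep only tickets
# whose "out" facility is SK01. HAVING filters after the grouping.
query = """
SELECT Ticket_number,
       MAX(CASE WHEN Ticket_Type = 'O' THEN Oil_volume END) AS Oil_volume_out,
       MAX(CASE WHEN Ticket_Type = 'O' THEN Facility_ID END) AS Facility_ID_out,
       MAX(CASE WHEN Ticket_Type = 'I' THEN Facility_ID END) AS Facility_ID_in
FROM tickets
GROUP BY Ticket_number
HAVING MAX(CASE WHEN Ticket_Type = 'O' THEN Facility_ID END) = 'SK01'
"""
one_line = conn.execute(query).fetchall()
print(one_line)
```

Ticket 2 is dropped because SK01 is only its "in" facility, while ticket 1 comes back on a single line with both facilities.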

Calculation of occurrence of strings

I have a table with 3 columns: id, name and vote. It is populated with many records. I need to return the record with the best balance of votes. The vote types are 'yes' and 'no'.
Yes -> Plus 1
No -> Minus 1
This column vote is a string column. I am using SQL SERVER.
Example:
It must return Ann for me
Use conditional Aggregation to tally the votes as Kannan suggests in his answer
If you really only want 1 record then you can do it like so:
SELECT TOP 1
name
,SUM(CASE WHEN vote = 'yes' THEN 1 ELSE -1 END) AS VoteTotal
FROM
#Table
GROUP BY
name
ORDER BY
VoteTotal DESC
This will not allow for ties, but you can use the method below, which ranks the totals: filter on RowNum to get only 1 result, or on RankNum to include ties.
;WITH cteVoteTotals AS (
SELECT
name
,SUM(CASE WHEN vote = 'yes' THEN 1 ELSE -1 END) AS VoteTotal
,ROW_NUMBER() OVER (PARTITION BY 1 ORDER BY SUM(CASE WHEN vote = 'yes' THEN 1 ELSE -1 END) DESC) as RowNum
,DENSE_RANK() OVER (PARTITION BY 1 ORDER BY SUM(CASE WHEN vote = 'yes' THEN 1 ELSE -1 END) DESC) as RankNum
FROM
#Table
GROUP BY
name
)
SELECT name, VoteTotal
FROM
cteVoteTotals
WHERE
RowNum = 1
--RankNum = 1 --if you want with ties use this line instead
Here is the test data used. In the future, do NOT just post an image of your test data; spend the 2 minutes to make a temp table or a table variable so that the people you are asking for help do not have to!
CREATE TABLE #Table (id INT, name VARCHAR(25), vote VARCHAR(4))
INSERT INTO #Table (id, name, vote)
VALUES (1, 'John','no'),(2, 'John','no'),(3, 'John','yes')
,(4, 'Ann','no'),(5, 'Ann','yes'),(6, 'Ann','yes')
,(9, 'Marie','no'),(8, 'Marie','no'),(7, 'Marie','yes')
,(10, 'Matt','no'),(11, 'Matt','yes'),(12, 'Matt','yes')
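If you want to check the tie handling without a SQL Server instance, here is a minimal sketch that loads the same test rows into in-memory SQLite through Python's sqlite3 module and keeps everyone with DENSE_RANK = 1. It assumes a SQLite build of 3.25 or later (needed for window functions); the table name votes and the two-step CTE layout are my own choices, not part of the answer above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE votes (id INTEGER, name TEXT, vote TEXT)")
conn.executemany("INSERT INTO votes VALUES (?, ?, ?)", [
    (1, "John", "no"), (2, "John", "no"), (3, "John", "yes"),
    (4, "Ann", "no"), (5, "Ann", "yes"), (6, "Ann", "yes"),
    (9, "Marie", "no"), (8, "Marie", "no"), (7, "Marie", "yes"),
    (10, "Matt", "no"), (11, "Matt", "yes"), (12, "Matt", "yes"),
])

# Step 1: tally +1 for 'yes' and -1 for anything else, per name.
# Step 2: rank the totals; DENSE_RANK gives tied names the same rank.
query = """
WITH totals AS (
    SELECT name,
           SUM(CASE WHEN vote = 'yes' THEN 1 ELSE -1 END) AS VoteTotal
    FROM votes
    GROUP BY name
), ranked AS (
    SELECT name, VoteTotal,
           DENSE_RANK() OVER (ORDER BY VoteTotal DESC) AS RankNum
    FROM totals
)
SELECT name, VoteTotal
FROM ranked
WHERE RankNum = 1
ORDER BY name
"""
winners = conn.execute(query).fetchall()
print(winners)
```

With this data Ann and Matt both end up on +1, so a TOP 1 or RowNum = 1 filter would return only one of them, and which one is not guaranteed.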
Use this code:
;with cte as (
select id, name, case when vote = 'yes' then 1 else -1 end as votenum from register
) select name, sum(votenum) as votetotal from cte group by name
You can then take the maximum or minimum from this result.
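This one gives the 'yes' rate for each person (the 1.0 forces decimal division; with integer operands SQL Server would truncate the result to 0 or 1):
SELECT Name, SUM(CASE WHEN Vote = 'Yes' THEN 1.0 ELSE 0 END)/COUNT(*) AS Rate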
FROM My_Table
GROUP BY Name

pivot table returns more than 1 row for the same ID

I have a sql code which I am using to do pivot. Code is as follows:
SELECT DISTINCT PersonID
,MAX(pivotColumn1)
,MAX(pivotColumn2) -- originally these were in 2 separate rows
FROM (SELECT srcID, PersonID, detailCode, detailValue FROM src) AS SrcTbl
PIVOT(MAX(detailValue) FOR detailCode IN ([pivotColumn1],[pivotColumn2])) pvt
GROUP BY PersonID
In the source data the ID has 2 separate rows due to each row having its own ID, which separates the values. I have now pivoted it and it's still giving me 2 separate rows for the ID even though I grouped it and used aggregation on the pivot columns. Any idea what's wrong with the code?
So I have all my possible detailCode listed in the IN clause. So I have null returned when the value is none but I want it all summarised in 1 row. See image below.
If those are all the options of detailCode, you can use conditional aggregation with a CASE expression instead of PIVOT:
SELECT t.personID,
MAX(CASE WHEN t.detailCode = 'cas' then t.detailValue END) as cas,
MAX(CASE WHEN t.detailCode = 'buy' then t.detailValue END) as buy,
MAX(CASE WHEN t.detailCode = 'sel' then t.detailValue END) as sel,
MAX(CASE WHEN t.detailCode = 'pla' then t.detailValue END) as pla
FROM YourTable t
GROUP BY t.personID
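As for why the extra rows show up: PIVOT implicitly groups by every column of its source query that is not the pivot column or the aggregated value, so a per-row key such as srcID keeps the rows apart. The sketch below imitates that with explicit GROUP BY clauses in in-memory SQLite (which has no PIVOT operator) via Python's sqlite3 module. The two sample rows and the use of only the 'cas' and 'buy' codes are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE src (srcID INTEGER, PersonID INTEGER, detailCode TEXT, detailValue TEXT)"
)
# Hypothetical data: one person, two detail rows, each with its own srcID.
conn.executemany("INSERT INTO src VALUES (?, ?, ?, ?)", [
    (1, 100, "cas", "A"),
    (2, 100, "buy", "B"),
])

agg = (
    "MAX(CASE WHEN detailCode = 'cas' THEN detailValue END) AS cas, "
    "MAX(CASE WHEN detailCode = 'buy' THEN detailValue END) AS buy"
)

# Grouping that still includes srcID, like a PIVOT whose source selects srcID.
with_src_id = conn.execute(
    f"SELECT PersonID, {agg} FROM src GROUP BY srcID, PersonID ORDER BY srcID"
).fetchall()

# Grouping by PersonID only, as in the conditional aggregation above.
person_only = conn.execute(
    f"SELECT PersonID, {agg} FROM src GROUP BY PersonID"
).fetchall()

print(with_src_id)
print(person_only)
```

The first result keeps two half-empty rows for the person and the second collapses them into one, which suggests that leaving srcID out of the PIVOT source query would likely fix the original as well.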

Constructing A Query In BigQuery With CASE Statements

So I'm trying to construct a query in BigQuery that I'm struggling with for a final part.
As of now I have:
SELECT
UNIQUE(Name) as SubscriptionName,
ID,
Interval,
COUNT(mantaSubscriptionIdmetadata) AS SubsPurchased,
SUM(RevenueGenerated) as RevenueGenerated
FROM (
SELECT
mantaSubscriptionIdmetadata,
planIdmetadata,
INTEGER(Amount) as RevenueGenerated
FROM
[sample_internal_data.charge0209]
WHERE
revenueSourcemetadata = 'new'
AND
Status = 'Paid'
GROUP BY
mantaSubscriptionIdmetadata,
planIdmetadata,
RevenueGenerated
)a
JOIN (
SELECT
id,
Name,
Interval
FROM
[sample_internal_data.subplans]
WHERE
id in ('150017','150030','150033','150019')
GROUP BY
id,
Name,
Interval )b
ON
a.planIdmetadata = b.id
GROUP BY
ID,
Interval,
Name
ORDER BY
Interval ASC
The resulting query looks like this
Which is exactly what I'm looking for up to that point.
Now here is what I'm stuck on. There is another column I need to add called SalesRepName. The resulting field will either be null or not null. If it's null, it means it was sold online. If it's not null, it means it was sold via telephone. What I want to do is create two additional columns that show how many were sold via telesales and how many via online. The sum total of the two columns will always equal the SubsPurchased total.
Can anyone help?
You can include case statements within aggregate functions. Here you could choose sum(case when SalesRepName is null then 1 else 0 end) as online and sum(case when SalesRepName is not null then 1 else 0 end) as telesales.
count(case when SalesRepName is null then 1 end) as online would give the same result. Using sum in these situations is simply my personal preference.
Note that omitting the else clause is equivalent to setting else null, and null isn't counted by count. This can be very useful in combination with exact_count_distinct, which has no equivalent in terms of sum.
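Here is a quick way to convince yourself that the sum and count forms agree and that the two new columns add up to the total. This is only a sketch run against in-memory SQLite through Python's sqlite3 module, not BigQuery, and the charges table, plan_id column and sample rows are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE charges (plan_id TEXT, SalesRepName TEXT)")
# Hypothetical rows: a NULL SalesRepName means the subscription was sold online.
conn.executemany("INSERT INTO charges VALUES (?, ?)", [
    ("150017", None),
    ("150017", "Dana"),
    ("150017", None),
    ("150030", "Lee"),
])

# SUM adds up explicit 1/0 flags; COUNT skips the NULL that a CASE
# without an ELSE branch produces, so both count the same rows.
query = """
SELECT plan_id,
       COUNT(*) AS SubsPurchased,
       SUM(CASE WHEN SalesRepName IS NULL THEN 1 ELSE 0 END) AS online,
       SUM(CASE WHEN SalesRepName IS NOT NULL THEN 1 ELSE 0 END) AS telesales,
       COUNT(CASE WHEN SalesRepName IS NULL THEN 1 END) AS online_via_count
FROM charges
GROUP BY plan_id
ORDER BY plan_id
"""
split = conn.execute(query).fetchall()
for row in split:
    print(row)
```

In each row online plus telesales equals SubsPurchased, and online_via_count matches online.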
Try below:
it assumes your SalesRepName field is in the [sample_internal_data.charge0209] table
and then it uses a "tiny version" of SUM(CASE WHEN ... THEN 1 ELSE 0 END), which works when you need 0 or 1 as the result to be SUM'ed:
SUM(SalesRepName IS NULL) AS onlinesales,
SUM(NOT SalesRepName IS NULL) AS telesales
SELECT
UNIQUE(Name) AS SubscriptionName,
ID,
Interval,
COUNT(mantaSubscriptionIdmetadata) AS SubsPurchased,
SUM(RevenueGenerated) AS RevenueGenerated,
SUM(SalesRepName IS NULL) AS onlinesales,
SUM(NOT SalesRepName IS NULL) AS telesales
FROM (
SELECT SalesRepName, mantaSubscriptionIdmetadata, planIdmetadata, INTEGER(Amount) AS RevenueGenerated
FROM [sample_internal_data.charge0209]
WHERE revenueSourcemetadata = 'new'
AND Status = 'Paid'
GROUP BY SalesRepName, mantaSubscriptionIdmetadata, planIdmetadata, RevenueGenerated
)a
JOIN (
SELECT id, Name, Interval
FROM [sample_internal_data.subplans]
WHERE id IN ('150017','150030','150033','150019')
GROUP BY id, Name, Interval
)b
ON a.planIdmetadata = b.id
GROUP BY ID, Interval, Name
ORDER BY Interval ASC