Constructing A Query In BigQuery With CASE Statements

I'm trying to construct a query in BigQuery and I'm struggling with the final part.
So far I have:
SELECT
UNIQUE(Name) as SubscriptionName,
ID,
Interval,
COUNT(mantaSubscriptionIdmetadata) AS SubsPurchased,
SUM(RevenueGenerated) as RevenueGenerated
FROM (
SELECT
mantaSubscriptionIdmetadata,
planIdmetadata,
INTEGER(Amount) as RevenueGenerated
FROM
[sample_internal_data.charge0209]
WHERE
revenueSourcemetadata = 'new'
AND
Status = 'Paid'
GROUP BY
mantaSubscriptionIdmetadata,
planIdmetadata,
RevenueGenerated
)a
JOIN (
SELECT
id,
Name,
Interval
FROM
[sample_internal_data.subplans]
WHERE
id in ('150017','150030','150033','150019')
GROUP BY
id,
Name,
Interval )b
ON
a.planIdmetadata = b.id
GROUP BY
ID,
Interval,
Name
ORDER BY
Interval ASC
The resulting output looks like this, which is exactly what I'm looking for up to that point.
Now here is what I'm stuck on. There is another column I need to add called SalesRepName. The resulting field will either be null or not null. If it's null, it means the subscription was sold online. If it's not null, it means it was sold via telephone. What I want to do is create two additional columns that show how many were sold via telesales and how many via online. The sum of the two columns will always equal the SubsPurchased total.
Can anyone help?

You can include case statements within aggregate functions. Here you could choose sum(case when SalesRepName is null then 1 else 0 end) as online and sum(case when SalesRepName is not null then 1 else 0 end) as telesales.
count(case when SalesRepName is null then 1 end) as online would give the same result. Using sum in these situations is simply my personal preference.
Note that omitting the else clause is equivalent to setting else null, and null isn't counted by count. This can be very useful in combination with exact_count_distinct, which has no equivalent in terms of sum.
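If you want to convince yourself that the sum and count forms agree, here is a minimal sketch using Python's built-in sqlite3 module rather than BigQuery. The charges table, its plan_id column, and the sample rows are made up for illustration; only SalesRepName comes from the question.

```python
import sqlite3

# Hypothetical miniature of the charges table: a NULL SalesRepName means an online sale.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE charges (plan_id TEXT, SalesRepName TEXT)")
conn.executemany(
    "INSERT INTO charges VALUES (?, ?)",
    [
        ("150017", None),
        ("150017", "Alice"),
        ("150017", None),
        ("150030", "Bob"),
    ],
)

rows = conn.execute(
    """
    SELECT plan_id,
           COUNT(*) AS subs_purchased,
           -- SUM form: every row contributes 1 or 0
           SUM(CASE WHEN SalesRepName IS NULL THEN 1 ELSE 0 END) AS online_sum,
           -- COUNT form: no ELSE means NULL, and COUNT skips NULLs
           COUNT(CASE WHEN SalesRepName IS NULL THEN 1 END) AS online_count,
           SUM(CASE WHEN SalesRepName IS NOT NULL THEN 1 ELSE 0 END) AS telesales
    FROM charges
    GROUP BY plan_id
    ORDER BY plan_id
    """
).fetchall()

for row in rows:
    print(row)
```

In each row the online and telesales columns add up to subs_purchased, which is the property the question asks for.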

Try the query below.
It assumes your SalesRepName field is in the [sample_internal_data.charge0209] table.
It uses a "tiny version" of SUM(CASE WHEN ...), which works when the value to be summed is 0 or 1:
SUM(SalesRepName IS NULL) AS onlinesales,
SUM(NOT SalesRepName IS NULL) AS telesales
SELECT
UNIQUE(Name) AS SubscriptionName,
ID,
Interval,
COUNT(mantaSubscriptionIdmetadata) AS SubsPurchased,
SUM(RevenueGenerated) AS RevenueGenerated,
SUM(SalesRepName IS NULL) AS onlinesales,
SUM(NOT SalesRepName IS NULL) AS telesales
FROM (
SELECT SalesRepName, mantaSubscriptionIdmetadata, planIdmetadata, INTEGER(Amount) AS RevenueGenerated
FROM [sample_internal_data.charge0209]
WHERE revenueSourcemetadata = 'new'
AND Status = 'Paid'
GROUP BY SalesRepName, mantaSubscriptionIdmetadata, planIdmetadata, RevenueGenerated
)a
JOIN (
SELECT id, Name, Interval
FROM [sample_internal_data.subplans]
WHERE id IN ('150017','150030','150033','150019')
GROUP BY id, Name, Interval
)b
ON a.planIdmetadata = b.id
GROUP BY ID, Interval, Name
ORDER BY Interval ASC

Related

How do I take the values of one record in a table and split it into multiple columns?

I have two tables which contain the following fields I need to use:
Master Data: Master_ID (PK)
Item Data: Crate_ID, Master_ID (FK), Item_Type_ID, Item_Type_Description, Item_Date
The Item_Type_ID has several different numerical values, i.e. 10, 20, 30, 40, 50 ... 100 ... etc.
Each numerical value represents a type, i.e. Veggie, Fruit, Grains, Meat, etc.
The Item_Type_Description are things like: Fruit, Veggies, Grains, Meat, etc.
The Item_Date is a single date that identifies when that particular item (based upon Item_ID) was added to the Crate.
Note that an Item_Type_ID can appear at most once per Master_ID. Meaning, Item_Type_ID '10' can only ever be related to Master_ID '1234' once. An Item_Type_ID can be related to many different Master_IDs, but to each of those Master_IDs it can only be related once.
The issue I am having is that I can get the combined results, but for each Item_Type_ID, a distinct record/row is being created.
Here is the code I have generated thus far, which is giving me the incorrect Results:
USE Shipping
GO
BEGIN
SELECT
vmi.master_id
,CASE
WHEN vid.item_type_id = 10 THEN vid.item_date
ELSE NULL
END as 'Fruit_Item_Date'
,CASE
WHEN vid.item_type_id = 20 THEN vid.item_date
ELSE NULL
END as 'Veggie_Item_Date'
,CASE
WHEN vid.item_type_id = 30 THEN vid.item_date
ELSE NULL
END as 'Grains_Item_Date'
,CASE
WHEN vid.item_type_id = 40 THEN vid.item_date
ELSE NULL
END as 'Meat_Item_Date'
FROM v_master_data vmi
LEFT JOIN v_item_data vid ON vmi.master_id = vid.master_id
WHERE vid.item_type_id IN (10,20,30,40)
END
GO
Any input, pointers, assistance, direction, advice, is greatly appreciated.
Running SQL Server 2016, accessed via SQL Server Management Studio v18.

Perhaps this will give you a little nudge
PIVOT
Select Master_ID
,Fruit_Date = [10]
,Veggie_Date = [20]
,Grains_Date = [30]
,Meat_Date = [40]
From (
Select Master_ID
,Item_Type_ID
,Item_Date
From YourTable
) src
Pivot ( max(Item_Date) for Item_Type_ID in ( [10],[20],[30],[40] ) ) pvt
Conditional Aggregation
Select Master_ID
,Fruit_Date = max( case when Item_Type_ID = 10 then Item_Date end)
,Veggie_Date = max( case when Item_Type_ID = 20 then Item_Date end)
,Grains_Date = max( case when Item_Type_ID = 30 then Item_Date end)
,Meat_Date = max( case when Item_Type_ID = 40 then Item_Date end)
From YourTable
Group By Master_ID
A conditional aggregation offers a bit more flexibility and is often more performant.
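To see the conditional aggregation collapse the per-type rows into one row per Master_ID, here is a small sketch run against SQLite through Python's sqlite3 module instead of SQL Server. The item_data table name and the sample dates are invented for the example; the column names follow the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE item_data (Master_ID INTEGER, Item_Type_ID INTEGER, Item_Date TEXT)"
)
# Made-up rows: Master_ID 1234 has three item types, 5678 has one.
conn.executemany(
    "INSERT INTO item_data VALUES (?, ?, ?)",
    [
        (1234, 10, "2020-01-05"),
        (1234, 20, "2020-01-07"),
        (1234, 40, "2020-02-01"),
        (5678, 30, "2020-03-15"),
    ],
)

pivoted = conn.execute(
    """
    SELECT Master_ID,
           -- each CASE yields the date for one type and NULL otherwise;
           -- MAX ignores the NULLs, leaving one value per group
           MAX(CASE WHEN Item_Type_ID = 10 THEN Item_Date END) AS Fruit_Date,
           MAX(CASE WHEN Item_Type_ID = 20 THEN Item_Date END) AS Veggie_Date,
           MAX(CASE WHEN Item_Type_ID = 30 THEN Item_Date END) AS Grains_Date,
           MAX(CASE WHEN Item_Type_ID = 40 THEN Item_Date END) AS Meat_Date
    FROM item_data
    GROUP BY Master_ID
    ORDER BY Master_ID
    """
).fetchall()

for row in pivoted:
    print(row)
```

A type that is missing for a Master_ID simply comes back as NULL (None in Python) in that column.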

How to create a case statement which groups fields?

I am trying to understand how to group values together to add an indicator. I want to 'fix' the values and based on this, attribute an indicator.
The values I am trying to group are date, customer name and product type, to create an indicator which captures what kind of order was placed (fruit only, fruit and vegetable, vegetable only). The goal is to calculate the total volume of each kind of order placed. The data is set out like this, and the column I am trying to create is the 'Order Type'.
What I have done so far:
I originally completed this analysis in Tableau, where I was able to use the 'Fixed' function and sum the value of indicators (for fruit or veggie) to determine each order type individually.
I have written case statements to identify the product type, with the idea that I could sum these to determine the order type (code below). However, this did not work, as I only need one instance of the indicator for each order. To solve this, I have written a case statement which partitions the fields and orders by date to get one instance of an indicator for each order.
Case Statements
CASE WHEN Product_Type = 'Fruit' THEN 1 ELSE 0 END AS Fruit_Indicator
, CASE WHEN Product_Type = 'Vegetable' THEN 1 ELSE 0 END AS Veg_Indicator
Case Statement with partition by and order by
, CASE WHEN ROW_NUMBER() OVER (PARTITION BY Order_Date, Customer ORDER BY Order_Date ASC) = 1 AND Product_Type = 'Fruit' THEN 1 ELSE NULL END AS Fruit_Ind
, CASE WHEN ROW_NUMBER() OVER (PARTITION BY Order_Date, Customer ORDER BY Order_Date ASC) = 1 AND Product_Type = 'Vegetable' THEN 1 ELSE NULL END AS Veg_Ind
I would appreciate any guidance on the right direction.
Thanks!

It APPEARS you are trying to get data grouped by date such as Mar 21, Mar 22, etc. So, you may want a secondary query to join the primary data to. The second query will be an aggregate by customer and date. If the date field is date/time oriented, you will have to adjust the group by so that it works on the date only (month/day/year) and ignores any time component. This might also be handled by a function that returns just the date part. Then, joining your original data to the aggregate should get you what you need. Maybe something like:
select
yt.date,
yt.customer,
yt.product,
yt.productType,
case when PreQuery.IsFruit > 0 and PreQuery.IsVegetable > 0
then 'Fruit & Vegetable'
when PreQuery.IsFruit > 0 and PreQuery.IsVegetable = 0
then 'Fruit Only'
when PreQuery.IsFruit = 0 and PreQuery.IsVegetable > 0
then 'Vegetable Only' end OrderType
from
YourTable yt
JOIN
( select
yt2.customer,
yt2.date,
max( case when yt2.ProductType = 'Fruit'
then 1 else 0 end ) IsFruit,
max( case when yt2.ProductType = 'Vegetable'
then 1 else 0 end ) IsVegetable
from
YourTable yt2
-- if you want to restrict time period, add a where
-- clause here on the date range as to not query entire table
group by
yt2.customer,
yt2.date ) PreQuery
ON yt.customer = PreQuery.customer
AND yt.date = PreQuery.date
-- same here for your outer query to limit just date range in question.
-- if you want to restrict time period, add a where
-- clause here on the date range as to not query entire table
order by
yt.date,
yt.customer,
yt.product
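
The join-to-an-aggregate idea above can be tried out quickly with Python's sqlite3 module. This is only a sketch under assumptions: the orders table, its column names, and the four sample rows are invented, and the dates are assumed to carry no time component.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (order_date TEXT, customer TEXT, product TEXT, product_type TEXT)"
)
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?, ?)",
    [
        ("2021-03-21", "Ann", "Apple", "Fruit"),
        ("2021-03-21", "Ann", "Carrot", "Vegetable"),
        ("2021-03-21", "Ben", "Pear", "Fruit"),
        ("2021-03-22", "Ann", "Leek", "Vegetable"),
    ],
)

labelled = conn.execute(
    """
    SELECT o.order_date, o.customer, o.product,
           CASE WHEN p.is_fruit = 1 AND p.is_veg = 1 THEN 'Fruit & Vegetable'
                WHEN p.is_fruit = 1 THEN 'Fruit Only'
                WHEN p.is_veg = 1 THEN 'Vegetable Only'
           END AS order_type
    FROM orders o
    JOIN (
        -- one row per order (customer + date) with a 0/1 flag per product type
        SELECT order_date, customer,
               MAX(CASE WHEN product_type = 'Fruit' THEN 1 ELSE 0 END) AS is_fruit,
               MAX(CASE WHEN product_type = 'Vegetable' THEN 1 ELSE 0 END) AS is_veg
        FROM orders
        GROUP BY order_date, customer
    ) p
      ON o.order_date = p.order_date AND o.customer = p.customer
    ORDER BY o.order_date, o.customer, o.product
    """
).fetchall()

for row in labelled:
    print(row)
```

Every line of the same order gets the same label, so counting orders per type afterwards only needs the aggregate subquery on its own.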

Check whether an employee is present on three consecutive days

I have a table called tbl_A with the following schema:
After insert, I have the following data in tbl_A:
Now the question is how to write a query for the following scenario:
Put (1) in front of any employee who was present on three consecutive days
Put (0) in front of any employee who was not present on three consecutive days
The output screenshot:
I think we should use a case statement, but I am not able to check for three consecutive days from the date. I hope someone can help with this.
Thank you

select name, case when max(cons_days) >= 3 then 1 else 0 end as presence
from (
select name, count(*) as cons_days
from tbl_A, (values (0),(1),(2)) as a(dd)
group by name, adate + dd
)x
group by name
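
The query above works by shifting every date forward by 0, 1 and 2 days and counting how often each shifted date occurs per name: a count of 3 can only happen when three consecutive days are present. Here is a sketch of the same trick in SQLite via Python's sqlite3 module. Several things are assumptions on my part: the available column and its 'Y'/'N' values (borrowed from the other answers), one row per employee per day, ISO date strings, and SQLite's date() function in place of adate + dd.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tbl_A (name TEXT, adate TEXT, available TEXT)")
conn.executemany(
    "INSERT INTO tbl_A VALUES (?, ?, ?)",
    [
        ("Ann", "2021-03-01", "Y"), ("Ann", "2021-03-02", "Y"), ("Ann", "2021-03-03", "Y"),
        ("Ben", "2021-03-01", "Y"), ("Ben", "2021-03-02", "N"), ("Ben", "2021-03-03", "Y"),
        ("Cy", "2021-03-01", "Y"), ("Cy", "2021-03-02", "Y"), ("Cy", "2021-03-04", "Y"),
    ],
)

presence = conn.execute(
    """
    SELECT name, CASE WHEN MAX(cons_days) >= 3 THEN 1 ELSE 0 END AS presence
    FROM (
        SELECT name, COUNT(*) AS cons_days
        FROM tbl_A
        -- three-row helper table standing in for (values (0),(1),(2))
        CROSS JOIN (SELECT 0 AS dd UNION ALL SELECT 1 UNION ALL SELECT 2) AS a
        WHERE available = 'Y'
        -- shift each present day by 0, 1 and 2 days and count the collisions
        GROUP BY name, date(adate, '+' || a.dd || ' day')
    ) AS x
    GROUP BY name
    ORDER BY name
    """
).fetchall()

for row in presence:
    print(row)
```

Only Ann has three days in a row; Ben's run is broken by an absent day and Cy's by a gap.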

With a self-join on name and available = 'Y', we create an inner table with the combinations of dates for a given name and count the entries in which the second instance's date is at most 2 days after the first's, i.e. for each date adate, it checks for entries with adate itself as well as adate + 1 and adate + 2. If all 3 entries are present, the count will be 3 and you will have a flag with value 1 for such names (this is done in the outer query). Try the query below:
SELECT Z.NAME,
CASE WHEN MAX(Z.CONSEQ_AVAIL) >= 3 THEN 1 ELSE 0 END AS YOUR_FLAG
FROM
(
-- one row per (name, start date): available days within [ADATE, ADATE + 2]
SELECT A.NAME, A.ADATE,
SUM(CASE WHEN B.ADATE >= A.ADATE AND B.ADATE <= A.ADATE + 2 THEN 1 ELSE 0 END) AS CONSEQ_AVAIL
FROM
tbl_A A INNER JOIN tbl_A B
ON A.NAME = B.NAME AND A.AVAILABLE = 'Y' AND B.AVAILABLE = 'Y'
GROUP BY A.NAME, A.ADATE
) Z
GROUP BY Z.NAME;
Due to the complexity of the problem, I have not been able to test it out. If something is really wrong, please let me know and I will be happy to take down my answer.

-- Below is my approach
select Name,
Case WHen Max_Count>=3 Then 1 else 0 end as Presence
from
(
Select Name,MAx(Coun) as Max_Count
from
(
select Name, (count(*) over (partition by Name,Ref_Date)) as Coun from
(
select Name,adate + row_number() over (partition by Name order by Adate desc) as Ref_Date
from temp
where available='Y'
)
) group by Name
);

select name as employee, case when max(run3) = 1 then 1 else 0 end as presence
from
(select id, name, Available, Adate,
-- flag a row when the date two rows ahead (same employee, ordered by date) is exactly 2 days later;
-- assumes one row per employee per day
case when datediff(day, Adate, lead(Adate, 2) over (partition by name order by Adate)) = 2 then 1 else 0 end as run3
from tbl_A
where Available = 'Y') A
group by name;

Is there a way to avoid columns from GROUP BY

My table has columns such as ID, Perdium and Location. I want to calculate the total Perdium given to each employee and the share of it that was given in NY. The issue I am facing is that the SQL Server engine throws an error stating that the Location column isn't present in the GROUP BY clause (which is how my use case needs it). If I include Location in the GROUP BY clause, I always get NYPerdiumShare as 1, which is not what I am expecting. Is there any workaround for this?
WITH CTE_Employee AS
(
SELECT ID,
SUM(Perdium) AS TotalPerdium,
CASE WHEN Location='NY' THEN SUM(Perdium) ELSE NULL END AS NYPerdium FROM EmployeePerdium
GROUP BY ID
)
SELECT ID,
TotalPerdium,
NYPerdium/TotalPerdium AS NYPerdiumShare
FROM CTE_Employee

You can eliminate the need to group by on anything other than ID by rewriting your query as follows to hide CASE inside an aggregate function:
WITH CTE_Employee AS (
SELECT
ID
, SUM(Perdium) AS TotalPerdium
, SUM(CASE WHEN Location='NY' THEN Perdium ELSE 0 END) AS NYPerdium
FROM EmployeePerdium
GROUP BY ID
)
SELECT
ID
, TotalPerdium
, NYPerdium/TotalPerdium AS NYPerdiumShare
FROM CTE_Employee
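
As a quick check that putting the CASE inside SUM gives one row per ID with the expected share, here is a sketch against SQLite using Python's sqlite3 module. The sample rows are made up, and Perdium is assumed to be a non-integer numeric column.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE EmployeePerdium (ID INTEGER, Perdium REAL, Location TEXT)")
conn.executemany(
    "INSERT INTO EmployeePerdium VALUES (?, ?, ?)",
    [
        (1, 100.0, "NY"),
        (1, 300.0, "LA"),
        (2, 50.0, "NY"),
        (2, 50.0, "NY"),
        (3, 80.0, "SF"),
    ],
)

shares = conn.execute(
    """
    SELECT ID,
           SUM(Perdium) AS TotalPerdium,
           -- the CASE runs per row, so Location never needs to be grouped
           SUM(CASE WHEN Location = 'NY' THEN Perdium ELSE 0 END) AS NYPerdium,
           SUM(CASE WHEN Location = 'NY' THEN Perdium ELSE 0 END) / SUM(Perdium) AS NYPerdiumShare
    FROM EmployeePerdium
    GROUP BY ID
    ORDER BY ID
    """
).fetchall()

for row in shares:
    print(row)
```

If Perdium were an integer column in SQL Server, the division would be integer division, so you would want to multiply by 1.0 or cast one side first.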

You don't need a cte here. Just use the sum window function.
SELECT DISTINCT
ID,
SUM(Perdium) OVER(PARTITION BY ID) AS TotalPerdium,
SUM(CASE WHEN Location='NY' THEN 1.0*Perdium ELSE 0 END) OVER(PARTITION BY ID)
/SUM(Perdium) OVER(PARTITION BY ID) AS NYPerdiumShare
FROM EmployeePerdium

pivot table returns more than 1 row for the same ID

I have a sql code which I am using to do pivot. Code is as follows:
SELECT DISTINCT PersonID
,MAX(pivotColumn1)
,MAX(pivotColumn2) -- originally these were in 2 separate rows
FROM (SELECT srcID, PersonID, detailCode, detailValue FROM src) AS SrcTbl
PIVOT(MAX(detailValue) FOR detailCode IN ([pivotColumn1],[pivotColumn2])) pvt
GROUP BY PersonID
In the source data the ID has 2 separate rows, because each row has its own ID, which separates the values. I have now pivoted it and it's still giving me 2 separate rows for the ID, even though I grouped it and used aggregation on the pivot columns. Any idea what's wrong with the code?
All my possible detailCode values are listed in the IN clause, so NULL is returned where there is no value, but I want it all summarised in 1 row. See image below.

If those are all the options of detailCode , you can use conditional aggregation with CASE EXPRESSION instead of Pivot:
SELECT t.personID,
MAX(CASE WHEN t.detailCode = 'cas' then t.detailValue END) as cas,
MAX(CASE WHEN t.detailCode = 'buy' then t.detailValue END) as buy,
MAX(CASE WHEN t.detailCode = 'sel' then t.detailValue END) as sel,
MAX(CASE WHEN t.detailCode = 'pla' then t.detailValue END) as pla
FROM YourTable t
GROUP BY t.personID
