Rolling sum to calculate YTD for each month, grouped by product, saved to separate columns using SQL

I have data like this:
Order_No Product Month Qty
3001 r33 1 8
3002 r34 1 11
3003 r33 1 17
3004 r33 2 3
3005 r34 2 11
3006 r34 3 1
3007 r33 3 -10
3008 r33 3 18
I'd like to calculate the total YTD quantity per product for each month and save each month's total to a separate column. This is what I want:
Product Qty_sum_jan Qty_sum_feb Qty_sum_mar
r33 25 28 36
r34 11 22 23
I know how to use window functions to calculate rolling sums, but I have no idea how to split them into separate columns. I currently use something like this:
case when Month = 1 then sum(Qty) over(partition by Product order by Month) else 0 end as Qty_sum_jan,
case when Month <=2 then sum(Qty) over(partition by Product order by Month) else 0 end as Qty_sum_feb,
case when Month <=3 then sum(Qty) over(partition by Product order by Month) else 0 end as Qty_sum_mar,
This gets me a rolling sum per order row, but how do I get to the product level shown above? If I use GROUP BY, it throws an error because Month is not in the GROUP BY clause. I also cannot just use MAX to get the last value, since Qty can be negative, so the last value may not be the maximum. I use Spark SQL, by the way.

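To my understanding, there is no need to use window functions. Each YTD column is just the sum of the quantities for all months up to that point, so conditional aggregation per product achieves your desired output: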
select
product,
sum(case when month = 1 then qty else 0 end) as sum_qty_jan,
sum(case when month <= 2 then qty else 0 end) as sum_qty_feb,
sum(case when month <= 3 then qty else 0 end) as sum_qty_mar
from your_table
group by 1;
Output:
product  sum_qty_jan  sum_qty_feb  sum_qty_mar
r33      25           28           36
r34      11           22           23
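With all twelve months, writing each CASE by hand gets repetitive. Here is a minimal sketch that generates the column list and checks it against the sample data. It runs on Python's built-in sqlite3 rather than Spark SQL, and the table name orders and the variable names are made up for the demo:

```python
import sqlite3

# SQLite stands in for Spark SQL here; table and column names are assumptions.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (order_no INTEGER, product TEXT, month INTEGER, qty INTEGER)"
)
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?, ?)",
    [
        (3001, "r33", 1, 8), (3002, "r34", 1, 11), (3003, "r33", 1, 17),
        (3004, "r33", 2, 3), (3005, "r34", 2, 11), (3006, "r34", 3, 1),
        (3007, "r33", 3, -10), (3008, "r33", 3, 18),
    ],
)

# One conditional sum per month: "month <= n" accumulates everything up to month n.
month_names = ["jan", "feb", "mar"]  # extend to all twelve months as needed
ytd_columns = ",\n       ".join(
    f"SUM(CASE WHEN month <= {n} THEN qty ELSE 0 END) AS sum_qty_{name}"
    for n, name in enumerate(month_names, start=1)
)
query = (
    f"SELECT product,\n       {ytd_columns}\n"
    "FROM orders\nGROUP BY product\nORDER BY product"
)

ytd_rows = conn.execute(query).fetchall()
print(query)
for row in ytd_rows:
    print(row)
```

The generated statement is plain conditional aggregation, so it should carry over to Spark SQL unchanged, but I have only run it on SQLite.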

Related

SQL Create Column Headers by Month ID

I am trying to extract itemised sales data for the past 12 months and build a dynamic table with column headers for each month ID. Extracting the data as below works, however when I get to the point of creating a SUM column for each month ID, I get stuck. I have tried to find similar questions but I'm not sure of the best approach.
Select Item, Qty, format([Transaction Date],'MMM-yy')
from Transactions
Data Extract:
Item  Qty  Month ID
A123  50   Apr-22
A123  30   May-22
A123  50   Jun-22
A321  50   Apr-22
A999  25   May-22
A321  10   Jun-22
Desired Output:
Item  Apr-22  May-22  Jun-22
A123  50      30      50
A321  50      Null    10
A999  Null    25      Null
Any advice would be greatly appreciated.
This is a typical pivot operation: you first filter each value according to its "Month_ID", then aggregate over the common "Item".
WITH cte AS (
SELECT Item, Qty, FORMAT([Transaction Date],'MMM-yy') AS Month_ID
FROM Transactions
)
SELECT Item,
MAX(CASE WHEN Month_ID = 'Apr-22' THEN Qty END) AS [Apr-22],
MAX(CASE WHEN Month_ID = 'May-22' THEN Qty END) AS [May-22],
MAX(CASE WHEN Month_ID = 'Jun-22' THEN Qty END) AS [Jun-22]
FROM cte
GROUP BY Item
Note: you don't need SUM as long as there is only one row for each <"Item", "Month_ID"> pair. If a pair can have several rows, replace MAX with SUM to get the monthly total.
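To see when the choice between MAX and SUM matters, here is a small sketch using Python's built-in sqlite3. The extra A123 / Apr-22 row with Qty 20 is hypothetical, added only to create a duplicate pair, and the table is assumed to already hold the formatted Month_ID:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE transactions (item TEXT, qty INTEGER, month_id TEXT)")
conn.executemany(
    "INSERT INTO transactions VALUES (?, ?, ?)",
    [
        ("A123", 50, "Apr-22"), ("A123", 30, "May-22"), ("A123", 50, "Jun-22"),
        ("A321", 50, "Apr-22"), ("A999", 25, "May-22"), ("A321", 10, "Jun-22"),
        ("A123", 20, "Apr-22"),  # hypothetical second sale in the same month
    ],
)

pivot_rows = conn.execute(
    """
    SELECT item,
           MAX(CASE WHEN month_id = 'Apr-22' THEN qty END) AS max_apr,  -- largest single row
           SUM(CASE WHEN month_id = 'Apr-22' THEN qty END) AS sum_apr,  -- monthly total
           SUM(CASE WHEN month_id = 'May-22' THEN qty END) AS sum_may
    FROM transactions
    GROUP BY item
    ORDER BY item
    """
).fetchall()

for row in pivot_rows:
    print(row)
```

For A123 the two Apr-22 rows give 50 with MAX but 70 with SUM. Months with no rows stay NULL (None in Python) either way, because the CASE has no ELSE branch.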

SQL Sum with conditions in two columns

I'm very new to SQL and VB.NET. I have an existing table called STOCK with the columns shown here, and I want to sum buy and sell to display current quantity.
Existing table:
ID     Date      BUY  SELL  Current quantity
1      01/01/22  88   0
2      03/01/22  22   0
94669  05/02/22  0    30
I want Current quantity to be calculated like this:
(the Current quantity in the row above + BUY - SELL)
I filled in the result manually below, but I want to do this automatically. Is that possible in SQL?
ID  Date      BUY  SELL  Current quantity
1   01/01/22  88   0     88
2   03/01/22  22   0     110
3   05/02/22  0    30    80
You can try this:
select a.*,
sum(net_sell) over (order by Curr_date ) as Current_quantity
from
(select s.*,
buy-sell as net_sell
from stock s) a ;
Dbfiddle link : https://dbfiddle.uk/?rdbms=postgres_11&fiddle=196a41a578d1e699ccaa3e878e261019
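If you want to try the running total locally, here is a minimal sketch on Python's built-in sqlite3 (window functions need SQLite 3.25 or later). The ISO dates assume the sample uses dd/mm/yy, and adding id as a tie-breaker in the ORDER BY is my own addition, so that two rows on the same date each get their own running value instead of sharing one:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE stock (id INTEGER, curr_date TEXT, buy INTEGER, sell INTEGER)"
)
conn.executemany(
    "INSERT INTO stock VALUES (?, ?, ?, ?)",
    [
        (1, "2022-01-01", 88, 0),   # dates assumed to be dd/mm/yy in the question
        (2, "2022-01-03", 22, 0),
        (3, "2022-02-05", 0, 30),
    ],
)

running_rows = conn.execute(
    """
    SELECT id, curr_date, buy, sell,
           -- cumulative net quantity up to and including the current row
           SUM(buy - sell) OVER (ORDER BY curr_date, id) AS current_quantity
    FROM stock
    ORDER BY curr_date, id
    """
).fetchall()

for row in running_rows:
    print(row)
```

The last column comes out as 88, 110, 80, matching the table in the question.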

SQL Troubleshooting Help on Table Structure

I'm attempting to calculate average number of days between a customer's 1st and 3rd purchase, but struggling to get the data ordered in a way that will allow me to calculate.
I currently have the below data table. (Note: Order sequence number refers to the number order for that customer.)
Order Date  Customer Number  Order Sequence Number
2020-09-20  1                1
2021-01-20  1                2
2021-01-21  1                3
2020-10-01  2                1
2020-08-06  3                1
2020-09-06  3                2
2020-09-09  3                3
I've been trying to get the data to look like the following table. [To then be able to calculate datediff on the last two columns.]
Customer Number  Order Count  First Order Date  Third Order Date
1                3            2020-09-20        2021-01-21
2                1            2020-10-01        Null
3                3            2020-08-06        2020-09-09
I've completely messed up the code, but here's what I've been trying.
CREATE TABLE X2 as
SELECT
customer_number,
max(order_sequence_number) as order_count,
CASE
WHEN order_sequence_number = 1 then order_date
ELSE null
END as first_order_date,
CASE
WHEN order_sequence_number = 3 then order_date
ELSE null
END as third_order_date
FROM X1
GROUP BY customer_number;
Can someone please tell me what I'm missing? Thanks in advance!
You are on the right track but you need aggregation functions:
SELECT customer_number,
max(order_sequence_number) as order_count,
MAX(CASE WHEN order_sequence_number = 1 THEN order_date END) as first_order_date,
MAX(CASE WHEN order_sequence_number = 3 THEN order_date END) as third_order_date
FROM X1
GROUP BY customer_number;
To get the difference in days, you would just subtract the two expressions using whatever date arithmetic is supported in your database.
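As one possible illustration of the date arithmetic, here is a sketch on Python's built-in sqlite3, where julianday() turns an ISO date into a day number. Other databases use different functions (DATEDIFF and similar), so treat the subtraction syntax as SQLite-specific:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE x1 (order_date TEXT, customer_number INTEGER, order_sequence_number INTEGER)"
)
conn.executemany(
    "INSERT INTO x1 VALUES (?, ?, ?)",
    [
        ("2020-09-20", 1, 1), ("2021-01-20", 1, 2), ("2021-01-21", 1, 3),
        ("2020-10-01", 2, 1),
        ("2020-08-06", 3, 1), ("2020-09-06", 3, 2), ("2020-09-09", 3, 3),
    ],
)

per_customer_sql = """
    SELECT customer_number,
           MAX(order_sequence_number) AS order_count,
           julianday(MAX(CASE WHEN order_sequence_number = 3 THEN order_date END))
         - julianday(MAX(CASE WHEN order_sequence_number = 1 THEN order_date END))
             AS days_first_to_third
    FROM x1
    GROUP BY customer_number
"""

per_customer = conn.execute(
    per_customer_sql + " ORDER BY customer_number"
).fetchall()

# AVG skips NULLs, so customers without a third order do not affect the average.
avg_days = conn.execute(
    f"SELECT AVG(days_first_to_third) FROM ({per_customer_sql})"
).fetchone()[0]

print(per_customer)
print(avg_days)
```

Customer 2 has no third order, so the difference is NULL for that row and the average covers only customers 1 and 3.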

How to bring corresponding data in the column

How can I bring the corresponding data into a new column by comparing the other attributes? The table below has two weeks of data along with Store ID and price type. If the price type is "Regular", the new column should hold the "Reduced" price for the same criteria (Year, Week, StoreID); if the price type is "Reduced", the new column should hold the "Regular" price for the same criteria.
Year  Week  StoreID  PriceType  Price
2021  10    S        Regular    200
2021  10    S        Reduced    150
2021  10    D        Regular    180
2021  10    D        Reduced    120
2021  9     S        Regular    35
2021  9     D        Reduced    40
It has to be changed to look like the table below. In the output table, the "Reduced/Regular" value is 150 in row 1 because 150 is the corresponding value for 200 under the criteria (2021, 10, S), and in row 2 the "Reduced/Regular" value is 200 because 200 is the corresponding value for 150 under the same criteria.
The last two rows, for week 9, give 0 because there is no corresponding row matching the criteria.
Year  Week  StoreID  PriceType  Price  Reduced/Regular
2021  10    S        Regular    200    150
2021  10    S        Reduced    150    200
2021  10    D        Regular    180    120
2021  10    D        Reduced    120    180
2021  9     S        Regular    35     0
2021  9     D        Reduced    40     0
Kindly help with this logic. Thanks in advance.
You can use window functions and conditional logic:
select t.*,
(case when priceType = 'Regular'
then max(case when priceType = 'Reduced' then price end) over (partition by year, week, storeId)
else max(case when priceType = 'Regular' then price end) over (partition by year, week, storeId)
end) as other_price
from t;
Happily, this is standard SQL and will work in any database that supports window functions.
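One detail: for the week 9 rows, which have no counterpart, the window expression yields NULL, while the question asks for 0. A minimal sketch, assuming you want 0 there, wraps the expression in COALESCE. It runs on Python's built-in sqlite3 (3.25 or later for window functions), and the table name t and lowercase column names are placeholders:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE t (year INTEGER, week INTEGER, storeId TEXT, priceType TEXT, price INTEGER)"
)
conn.executemany(
    "INSERT INTO t VALUES (?, ?, ?, ?, ?)",
    [
        (2021, 10, "S", "Regular", 200), (2021, 10, "S", "Reduced", 150),
        (2021, 10, "D", "Regular", 180), (2021, 10, "D", "Reduced", 120),
        (2021, 9, "S", "Regular", 35), (2021, 9, "D", "Reduced", 40),
    ],
)

rows = conn.execute(
    """
    SELECT week, storeId, priceType,
           COALESCE(
               CASE WHEN priceType = 'Regular'
                    THEN MAX(CASE WHEN priceType = 'Reduced' THEN price END)
                             OVER (PARTITION BY year, week, storeId)
                    ELSE MAX(CASE WHEN priceType = 'Regular' THEN price END)
                             OVER (PARTITION BY year, week, storeId)
               END,
               0  -- no counterpart row in this (year, week, store) group
           ) AS other_price
    FROM t
    """
).fetchall()

# Key the result by (week, store, price type) so row order does not matter.
other_price = {(week, store, ptype): value for week, store, ptype, value in rows}
for key, value in sorted(other_price.items()):
    print(key, value)
```

Whether 0 or NULL is the better marker for "no counterpart" depends on how the column is used later; 0 just matches the table in the question.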

Get multiple counts by 5 year increments

This is my table:
index_melanoma_yr Total_Melanoma Total_Virus
2000 700 12
2001 746 7
2002 724 12
2003 815 15
2004 893 16
2005 1020 22
I would like to count by 5 year increments. So, 2000-2004, 2005-2009, etc. I can hard code this, but since there are so many years, I'm wondering if there is a more efficient way.
Here's how I got the initial counts:
SELECT index_melanoma_yr,
COUNT(DISTINCT PersonID) AS Total_Melanoma,
SUM( CASE
WHEN index_virus_yr IS NOT NULL THEN
1
ELSE
0
END
) AS Total_Virus
FROM Asare_ViralMelanoma_IndexDates
GROUP BY index_melanoma_yr
ORDER BY index_melanoma_yr
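You can apply some simple integer arithmetic, index_melanoma_yr / 5 * 5, to the year column and then GROUP BY the result. This assumes the year column is an integer, so that the division truncates (for example, 2003 / 5 * 5 gives 2000).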
SELECT MIN(index_melanoma_yr) AS Year_Start,
MAX(index_melanoma_yr) AS Year_End,
COUNT(DISTINCT PersonID) AS Total_Melanoma,
SUM( CASE
WHEN index_virus_yr IS NOT NULL THEN
1
ELSE
0
END
) AS Total_Virus
FROM Asare_ViralMelanoma_IndexDates
GROUP BY index_melanoma_yr / 5 * 5
ORDER BY Year_Start
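Note that MIN and MAX report the first and last years that actually occur in each bucket, not the bucket boundaries. If you would rather show the nominal 5-year range, here is a sketch on Python's built-in sqlite3. The table name index_dates and the six rows are made up for the demo, and it relies on integer division truncating, which holds in SQLite for integer columns:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE index_dates (PersonID INTEGER, index_melanoma_yr INTEGER, index_virus_yr INTEGER)"
)
conn.executemany(
    "INSERT INTO index_dates VALUES (?, ?, ?)",
    [
        # hypothetical rows; None means no virus index year for that person
        (1, 2000, None), (2, 2003, 2004), (3, 2004, None),
        (4, 2005, 2006), (5, 2009, 2010), (6, 2011, None),
    ],
)

bucket_rows = conn.execute(
    """
    SELECT index_melanoma_yr / 5 * 5     AS bucket_start,  -- 2003 -> 2000
           index_melanoma_yr / 5 * 5 + 4 AS bucket_end,    -- 2003 -> 2004
           COUNT(DISTINCT PersonID)      AS Total_Melanoma,
           SUM(CASE WHEN index_virus_yr IS NOT NULL THEN 1 ELSE 0 END) AS Total_Virus
    FROM index_dates
    GROUP BY index_melanoma_yr / 5 * 5
    ORDER BY bucket_start
    """
).fetchall()

for row in bucket_rows:
    print(row)
```

In engines where / on integers returns a fractional result (MySQL, for example), you would need an explicit integer division or FLOOR instead.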