SQL get year-on-year quarter-to-date revenue - sql

Having the table below:
Year Quarter Month Revenue
2005 Q1 1 13
2006 Q1 1 10
2006 Q1 2 15
2006 Q1 3 35
2006 Q2 4 11
2006 Q2 5 15
2006 Q2 6 9
2007 Q1 1 6
2007 Q1 2 14
2007 Q1 3 7
2007 Q2 4 20
2007 Q2 5 6
2007 Q2 6 6
I need a query to calculate the year-on-year comparison of quarter-to-date revenue as below:
Year Quarter Month CurrentQTDRevenue PreviousQTDRevenue
2005 Q1 1 13
2006 Q1 1 10 13
2006 Q1 2 25 13
2006 Q1 3 60 13
2006 Q2 4 11
2006 Q2 5 26
2006 Q2 6 35
2007 Q1 1 6 10
2007 Q1 2 20 25
2007 Q1 3 27 60
2007 Q2 4 20 11
2007 Q2 5 26 26
2007 Q2 6 32 35
I've managed to get the current year quarter-to-date revenue
SELECT Year, Quarter, Month
, SUM(Revenue) OVER (PARTITION BY Year, Quarter ORDER BY Year, Quarter, Month)
AS CurrentYearQuarterToDateRevenue
FROM revenue
but how do I get to the second part? Note that I can't simply join quarters and months since, for example, 2005 has only one month for Q1, so Q1 for 2006 will have 13 for every month.

In the example, the prior-year revenue is applied inconsistently. If the quarter-to-date revenue were cumulative by quarter for 2007 vs 2006 as well as for 2006 vs 2005, then the value of 13 would carry forward into months 2 and 3 of Q1. Something like this:
with yqm_ytd_cte(Year, Quarter, Month, YQM_YTD_Revenue) as (
select Year, Quarter, Month,
sum(Revenue) over (partition by Year, Quarter order by Year, Quarter, Month)
from revenue)
select yy.*, isnull(yy_lag.YQM_YTD_Revenue, 0) as Prior_Year_YQM_YTD_Revenue
from yqm_ytd_cte yy
left join yqm_ytd_cte yy_lag on yy.Year=yy_lag.Year+1
and yy.Quarter=yy_lag.Quarter
and yy.Month=yy_lag.Month;

I think I would expand the data out and use window functions:
with yyyymm as (
select y.year, m.month, m.qtr, t.revenue, t.quarter
from (select distinct year from t) y cross join
(values (1, 1), (2, 1), . . . (12, 4)) m(month, qtr) left join
t
on t.year = y.year and t.month = m.month
)
select ym.*
from (select ym.*, lag(currentQTD, 12) over (order by year, month) as prevQTD
from (select ym.*,
sum(revenue) over (partition by year, qtr order by month) as currentQTD
from yyyymm ym
) ym
) ym
where quarter is not null;
You can also use apply:
select t.*,
sum(revenue) over (partition by year, quarter order by month) as currentQTD,
tt.prevQTD
from t outer apply
(select sum(revenue) as prevQTD
from t tt
where tt.year = t.year - 1 and
tt.quarter = t.quarter and
tt.month <= t.month
) tt;
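To check the apply-style approach against the expected output in the question, here is a minimal sketch. It assumes SQLite (3.25 or later, for window functions) driven from Python's sqlite3 module rather than SQL Server, so the OUTER APPLY is written as a correlated scalar subquery; the lower-case column names and the names current_qtd and previous_qtd are my own.

```python
import sqlite3

# Sample data copied from the question.
rows = [
    (2005, "Q1", 1, 13),
    (2006, "Q1", 1, 10), (2006, "Q1", 2, 15), (2006, "Q1", 3, 35),
    (2006, "Q2", 4, 11), (2006, "Q2", 5, 15), (2006, "Q2", 6, 9),
    (2007, "Q1", 1, 6), (2007, "Q1", 2, 14), (2007, "Q1", 3, 7),
    (2007, "Q2", 4, 20), (2007, "Q2", 5, 6), (2007, "Q2", 6, 6),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE revenue (year INT, quarter TEXT, month INT, revenue INT)")
conn.executemany("INSERT INTO revenue VALUES (?, ?, ?, ?)", rows)

query = """
SELECT t.year, t.quarter, t.month,
       -- running total inside the current year and quarter
       SUM(t.revenue) OVER (PARTITION BY t.year, t.quarter
                            ORDER BY t.month) AS current_qtd,
       -- same quarter of the previous year, up to the same month;
       -- SUM over zero rows is NULL, so quarters with no history stay empty
       (SELECT SUM(p.revenue)
          FROM revenue AS p
         WHERE p.year = t.year - 1
           AND p.quarter = t.quarter
           AND p.month <= t.month) AS previous_qtd
FROM revenue AS t
ORDER BY t.year, t.month
"""
result = conn.execute(query).fetchall()
for row in result:
    print(row)
```

Because the subquery uses month <= rather than month =, the single 2005 row (13) is reused for all three months of 2006 Q1, which is what the question asks for.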

Related

Sum of last 12 months

I have a table with 3 columns (Year, Month, Value) like this in Sql Server :
Year Month Value ValueOfLastTwelveMonths
2021 1 30 30
2021 2 24 54 (30 + 24)
2021 5 26 80 (54+26)
2021 11 12 92 (80+12)
2022 1 25 87 (SUM of values from 1 2022 TO 2 2021)
2022 2 40 103 (SUM of values from 2 2022 TO 3 2021)
2022 4 20 123 (SUM of values from 4 2022 TO 5 2021)
I need a SQL request to calculate ValueOfLastTwelveMonths.
SELECT Year,
       Month,
Value,
SUM (Value) OVER (PARTITION BY Year, Month)
FROM MyTable
This is much easier if you have a row for each month and year, and then (if needed) you can filter the NULL rows out. The reason it's easier is because then you know how many rows you need to look back at: 11.
If you make a dataset of the years and months, you can then LEFT JOIN to your data, aggregate, and then finally filter the data out:
SELECT *
INTO dbo.YourTable
FROM (VALUES(2021,1,30),
(2021,2,24),
(2021,5,26),
(2021,11,12),
(2022,1,25),
(2022,2,40),
(2022,4,20))V(Year,Month,Value);
GO
WITH YearMonth AS(
SELECT YT.Year,
V.Month
FROM (SELECT DISTINCT Year
FROM dbo.YourTable) YT
CROSS APPLY (VALUES(1),(2),(3),(4),(5),(6),(7),(8),(9),(10),(11),(12))V(Month)),
RunningTotal AS(
SELECT YM.Year,
YM.Month,
YT.Value,
SUM(YT.Value) OVER (ORDER BY YM.Year, YM.Month
ROWS BETWEEN 11 PRECEDING AND CURRENT ROW) AS Last12Months
FROM YearMonth YM
LEFT JOIN dbo.YourTable YT ON YM.Year = YT.Year
AND YM.Month = YT.Month)
SELECT Year,
Month,
Value,
Last12Months
FROM RunningTotal
WHERE Value IS NOT NULL;
GO
DROP TABLE dbo.YourTable;
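The same idea can be tried outside SQL Server. The following is a rough sketch, assuming SQLite 3.25+ through Python's sqlite3 module; the table name my_table and the column name last_12_months are placeholders of mine, and the month list is built with a recursive CTE instead of a VALUES list.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE my_table (year INT, month INT, value INT)")
conn.executemany(
    "INSERT INTO my_table VALUES (?, ?, ?)",
    [(2021, 1, 30), (2021, 2, 24), (2021, 5, 26), (2021, 11, 12),
     (2022, 1, 25), (2022, 2, 40), (2022, 4, 20)],
)

query = """
WITH RECURSIVE months(month) AS (
    SELECT 1 UNION ALL SELECT month + 1 FROM months WHERE month < 12
),
year_month AS (            -- one row per (year, month), gaps included
    SELECT y.year, m.month
    FROM (SELECT DISTINCT year FROM my_table) AS y
    CROSS JOIN months AS m
),
running AS (
    SELECT ym.year, ym.month, t.value,
           -- 11 preceding rows + current row = 12 calendar months,
           -- which only holds because every month has a row and
           -- the years in the data are consecutive
           SUM(t.value) OVER (ORDER BY ym.year, ym.month
                              ROWS BETWEEN 11 PRECEDING AND CURRENT ROW)
               AS last_12_months
    FROM year_month AS ym
    LEFT JOIN my_table AS t
           ON t.year = ym.year AND t.month = ym.month
)
SELECT year, month, value, last_12_months
FROM running
WHERE value IS NOT NULL    -- drop the filler months again
ORDER BY year, month
"""
last12 = conn.execute(query).fetchall()
for row in last12:
    print(row)
```

Note the caveat in the comment: if a whole year were missing from the data, the row-based frame would no longer line up with calendar months, and the year list would need to be generated as well.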

SQL - use only clients that are present in all months

I have a dataset with different clients and their sales counts. Over time, some clients get added to and deleted from the data. How do I make sure that, when I look at the sales counts, I am only using the clients that were in the data set the whole time? I.e. if a client doesn't have a record for 2018-03, then I don't want that client to be part of the query at all. If a client does not have a record for 2020-03, then I also do not want this client to be part of the query.
For example, the following query:
select DATE_PART (y, sold_date)as year, DATE_PART (mm, sold_date) as month, count(distinct(client))
from sales_data
where sold_date > '2018-01-01'
group by year, month
order by year,month
Yields
year month count
2018 1 78
2018 2 83
2018 3 80
2018 4 83
2018 5 84
2018 6 81
2018 7 83
2018 8 90
2018 9 89
2018 10 95
2018 11 94
2018 12 97
2019 1 102
2019 2 103
2019 3 102
2019 4 105
2019 5 103
2019 6 104
2019 7 104
2019 8 106
2019 9 106
2019 10 108
2019 11 109
2019 12 104
2020 1 104
2020 2 102
2020 3 103
2020 4 98
2020 5 97
2020 6 79
So I want to use only the clients that appear in every month. There should be no more than 78 of them, because there cannot be more such clients than in the month with the fewest clients (2018-1).
FYI, I am using Amazon Redshift here but I am OK with a query that's rdbms agnostic or works for SQL-Server/Oracle/MySQL/PostgreSQL, I am just interested in a pattern on how to solve this issue effectively.
If I'm understanding what you want correctly, and if this is just a one-off query, you could use a correlated subquery in the where clause:
SELECT
DATE_PART(y, s.sold_date) AS year,
DATE_PART(mm, s.sold_date) AS month,
COUNT(DISTINCT s.client)
FROM
sales_data AS s
WHERE
EXISTS (
SELECT sd.client FROM sales_data AS sd WHERE DATE_PART(y,
sd.sold_date) = 2018 AND DATE_PART(mm, sd.sold_date) = 1 AND
sd.client = s.client
) AND
s.sold_date > '2018-01-01'
GROUP BY
year,
month
ORDER BY
DATE_PART(y, s.sold_date),
DATE_PART(mm, s.sold_date)
Presence in all months can be checked with a two-step aggregation:
1. group the sales data by client, keeping only clients that have every month
2. group the sales data joined to (1) by year and month
Like this (the 12 can be a dynamic expression, depending on how much history you have):
with
stable_customers as (
select client
from sales_data
group by 1
having count(distinct date_trunc('month', sold_date)) = 12
)
select
DATE_PART (y, sold_date) as year
,DATE_PART (mm, sold_date) as month
,count(1)
from sales_data
join stable_customers
using (client)
where sold_date > '2018-01-01'
group by year, month
order by year,month
Use window functions. Unfortunately, SQL Server does not support count(distinct) as a window function. Fortunately, there is a simple work-around using dense_rank():
select year, month, count(distinct client)
from (select sd.*, year, month,
(dense_rank() over (order by year, month) +
dense_rank() over (order by year desc, month desc)
) as num_months,
(dense_rank() over (partition by client order by year, month) +
dense_rank() over (partition by client order by year desc, month desc)
) as num_months_client
from sales_data sd cross apply
(values (year(sold_date), month(sold_date))) v(year, month)
where sd.sold_date > '2018-01-01'
) sd
where num_months_client = num_months
group by year, month
order by year, month;
Note: This looks at all months that are in the data. If all clients are missing 2019-03, then that month is not considered at all.
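For a portable version of the "present in every month" pattern, here is a small sketch, assuming SQLite via Python's sqlite3 module and a made-up four-client sample. Instead of a hard-coded month count, it compares each client's distinct month count with the number of distinct months in the whole table; strftime is SQLite's date formatting function and stands in for DATE_PART.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales_data (client TEXT, sold_date TEXT)")
conn.executemany(
    "INSERT INTO sales_data VALUES (?, ?)",
    [
        ("a", "2018-01-05"), ("a", "2018-02-10"), ("a", "2018-03-15"),
        ("b", "2018-01-07"), ("b", "2018-03-20"),   # b misses 2018-02
        ("c", "2018-02-11"), ("c", "2018-03-02"),   # c misses 2018-01
        ("d", "2018-01-09"), ("d", "2018-02-14"),
        ("d", "2018-02-20"), ("d", "2018-03-30"),   # d has two sales in 2018-02
    ],
)

query = """
WITH stable_clients AS (
    -- step 1: clients whose distinct month count equals the
    -- number of distinct months present anywhere in the data
    SELECT client
    FROM sales_data
    GROUP BY client
    HAVING COUNT(DISTINCT strftime('%Y-%m', sold_date)) =
           (SELECT COUNT(DISTINCT strftime('%Y-%m', sold_date))
              FROM sales_data)
)
-- step 2: the monthly report, restricted to those clients
SELECT strftime('%Y', s.sold_date) AS year,
       strftime('%m', s.sold_date) AS month,
       COUNT(DISTINCT s.client)    AS clients
FROM sales_data AS s
JOIN stable_clients AS sc ON sc.client = s.client
GROUP BY year, month
ORDER BY year, month
"""
monthly = conn.execute(query).fetchall()
for row in monthly:
    print(row)
```

Only clients a and d survive the filter, so every month reports the same count. Like the dense_rank() version, this only considers months that occur somewhere in the data.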

Running Total in Oracle SQL - insert missing rows

Let's assume I have following set of data in Oracle SQL database:
Product Year Month Revenue
A 2016 1 7
A 2016 5 15
After creating running totals with following code
select Product, Year, Month, Revenue,
sum(Revenue) over (partition by Product, Year order by Month) Revenue_Running
from exemplary_table
I receive following result:
Product Year Month Revenue Revenue_Running
A 2016 1 7 7
A 2016 5 15 22
Is there any way that I can get this:
Product Year Month Revenue Revenue_Running
A 2016 1 7 7
A 2016 2 (null) 7
A 2016 3 (null) 7
A 2016 4 (null) 7
A 2016 5 15 22
You need a calendar table, cross joined with the distinct products and then left joined to your exemplary_table:
SELECT p.product,
c.year,
c.month,
COALESCE(revenue, 0),
Sum(revenue)OVER (partition BY p.product, c.year ORDER BY c.month) Revenue_Running
FROM calendar_table c
CROSS JOIN (SELECT DISTINCT product
FROM exemplary_table) p
LEFT JOIN exemplary_table e
ON e.product = p.product
AND c.year = e.year
AND e.month = c.month
WHERE c.dates >= --your start date
AND c.dates <= --your end date
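A quick way to see the calendar-table pattern produce the requested rows is the sketch below. It assumes SQLite via Python's sqlite3 module instead of Oracle, and a hypothetical calendar_table holding just (year, month) for January to May 2016; the join also matches on product so that revenue is not shared between products.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE exemplary_table (product TEXT, year INT, month INT, revenue INT)"
)
conn.executemany(
    "INSERT INTO exemplary_table VALUES (?, ?, ?, ?)",
    [("A", 2016, 1, 7), ("A", 2016, 5, 15)],
)

# Hypothetical calendar table: one row per month in the reporting range.
conn.execute("CREATE TABLE calendar_table (year INT, month INT)")
conn.executemany(
    "INSERT INTO calendar_table VALUES (?, ?)",
    [(2016, m) for m in range(1, 6)],
)

query = """
SELECT p.product, c.year, c.month, e.revenue,
       -- SUM skips NULLs, so gap months repeat the last running total
       SUM(e.revenue) OVER (PARTITION BY p.product, c.year
                            ORDER BY c.month) AS revenue_running
FROM calendar_table AS c
CROSS JOIN (SELECT DISTINCT product FROM exemplary_table) AS p
LEFT JOIN exemplary_table AS e
       ON e.product = p.product
      AND e.year = c.year
      AND e.month = c.month
ORDER BY p.product, c.year, c.month
"""
filled = conn.execute(query).fetchall()
for row in filled:
    print(row)
```

Here revenue is left as NULL for the inserted months, matching the (null) cells in the question; wrap it in COALESCE(e.revenue, 0) if zeros are preferred.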

Grouping data on SQL Server

I have this table in SQL Server:
Year Month Quantity
----------------------------
2015 January 10
2015 February 20
2015 March 30
2014 November 40
2014 August 50
How can I add two more columns: one that numbers each distinct year as a group, and one that numbers the months sequentially within each year, as in this example?
Year Month Quantity Group Subgroup
------------------------------------------------
2015 January 10 1 1
2015 February 20 1 2
2015 March 30 1 3
2014 November 40 2 1
2014 August 50 2 2
You can use DENSE_RANK to calculate the groups for you:
SELECT t1.*, DENSE_RANK() OVER (ORDER BY Year DESC) AS [Group],
DENSE_RANK() OVER (PARTITION BY Year ORDER BY DATEPART(month, Month + ' 01 2010')) AS [SubGroup]
FROM t1
ORDER BY 4, 5
To associate group and subgroup with a number you can do this:
WITH RankedTable AS (
SELECT year, month, quantity,
ROW_NUMBER() OVER (partition by year order by Month) AS rn
FROM yourtable)
SELECT year, month, quantity,
SUM (CASE WHEN rn = 1 THEN 1 ELSE 0 END) OVER (ORDER BY YEAR) as year_group,
rn AS subgroup
FROM RankedTable
Here the ROW_NUMBER() OVER clause numbers the rows within each year.
SUM() ... OVER then computes a running count of the rows with rn = 1, which increases by one at each new year.
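The DENSE_RANK approach can be tried with the sketch below, assuming SQLite 3.25+ via Python's sqlite3 module. Since SQLite has no DATEPART, a small lookup table month_names (my own addition) maps month names to numbers; grp and subgrp are placeholder column names.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t1 (year INT, month TEXT, quantity INT)")
conn.executemany(
    "INSERT INTO t1 VALUES (?, ?, ?)",
    [(2015, "January", 10), (2015, "February", 20), (2015, "March", 30),
     (2014, "November", 40), (2014, "August", 50)],
)

# Hypothetical lookup table so month names sort chronologically.
names = ["January", "February", "March", "April", "May", "June", "July",
         "August", "September", "October", "November", "December"]
conn.execute("CREATE TABLE month_names (name TEXT, num INT)")
conn.executemany(
    "INSERT INTO month_names VALUES (?, ?)",
    [(name, num) for num, name in enumerate(names, start=1)],
)

query = """
SELECT t1.year, t1.month, t1.quantity,
       -- newest year gets group 1
       DENSE_RANK() OVER (ORDER BY t1.year DESC) AS grp,
       -- months numbered in calendar order within each year
       DENSE_RANK() OVER (PARTITION BY t1.year ORDER BY m.num) AS subgrp
FROM t1
JOIN month_names AS m ON m.name = t1.month
ORDER BY grp, subgrp
"""
ranked = conn.execute(query).fetchall()
for row in ranked:
    print(row)
```

One difference from the sample output in the question: ordering by calendar month puts August before November in 2014, whereas the sample numbers them in the order the rows were listed.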

How to get month and year in single column and grouping the data for all the years and months?

For the below query (sdate is column name and table name is storedata)
WITH TotalMonths AS (SELECT T1.[Month], T2.[Year]
FROM ((SELECT DISTINCT Number AS [Month]
FROM MASTER.dbo.spt_values WHERE [Type] = 'p' AND Number BETWEEN 1 AND 12) T1 CROSS JOIN
(SELECT DISTINCT DATEPART(year, sdate) AS [Year]
FROM storedata) T2))
SELECT CTE.[Year], CTE.[Month], ISNULL(T3.[Sum], 0) areasum
FROM TotalMonths CTE LEFT OUTER JOIN (
SELECT SUM(areasft) [Sum], DATEPART(YEAR, sdate) [Year], DATEPART(MONTH, sdate) [Month]
FROM storedata
GROUP BY DATEPART(YEAR, sdate) ,DATEPART(MONTH, sdate)) T3
ON CTE.[Year] = T3.[Year] AND CTE.[Month] = T3.[Month] WHERE CTE.[Year]>'2007'
ORDER BY CTE.[Year], CTE.[Month]
I am getting result set like below.
YEAR MONTH AREASUM
2008 1 0
2008 2 1193
2008 3 4230
2008 4 350
2008 5 2200
2008 6 4660
2008 7 0
2008 8 6685
2008 9 0
2008 10 3051
2008 11 7795
2008 12 2940
2009 1 1650
2009 2 3235
2009 3 2850
2009 4 6894
2009 5 3800
2009 6 2250
2009 7 1000
2009 8 1800
2009 9 1550
2009 10 2350
2009 11 0
2009 12 1800
But I have to combine both month and year in a single column. The result set should look like below.
JAN/08 0
FEB/08 1193
.. ..
.. ..
DEC/09 1800
How can I modify my query? (I should display for all the years and months even if there is no area for a month)
Regards,
N.SRIRAM
Try:
SELECT UPPER(LEFT(DATENAME(MONTH, DATEADD(MONTH, CTE.[Month] - 1, 0)), 3)) + '/' + RIGHT(CTE.[Year], 2)
instead of using your first 2 columns from your SELECT.
You seem to be saying that you're getting the right data from your original query, but in the wrong format. So:
1. Make a view out of the query you originally posted.
2. Build a SELECT query based on that view to give you the format you want.
Let's say you do this:
CREATE VIEW wibble AS <your original query goes here>
Then you can just query wibble to correct the formatting.
select
case
when month = 1 then 'Jan/'
when month = 2 then 'Feb/'
when month = 3 then 'Mar/'
when month = 4 then 'Apr/'
when month = 5 then 'May/'
when month = 6 then 'Jun/'
when month = 7 then 'Jul/'
when month = 8 then 'Aug/'
when month = 9 then 'Sep/'
when month = 10 then 'Oct/'
when month = 11 then 'Nov/'
when month = 12 then 'Dec/'
else 'Err'
end + substring(cast(year as CHAR(4)), 3, 2) as yearmonth,
areasum from wibble;
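As a compact alternative to the twelve-branch CASE, the label can be cut out of a fixed string of month abbreviations. This is only a sketch, assuming SQLite via Python's sqlite3 module (where || is the concatenation operator and substr is 1-based) and a stand-in table named wibble holding three rows of the result set shown in the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Stand-in for the view: a few (year, month, areasum) rows from the question.
conn.execute("CREATE TABLE wibble (year INT, month INT, areasum INT)")
conn.executemany(
    "INSERT INTO wibble VALUES (?, ?, ?)",
    [(2008, 1, 0), (2008, 2, 1193), (2009, 12, 1800)],
)

query = """
SELECT upper(substr('JanFebMarAprMayJunJulAugSepOctNovDec',
                    (month - 1) * 3 + 1, 3))      -- 3-letter month name
       || '/' ||
       substr(CAST(year AS TEXT), 3, 2)           -- last two digits of year
           AS yearmonth,
       areasum
FROM wibble
ORDER BY year, month
"""
labels = conn.execute(query).fetchall()
for row in labels:
    print(row)
```

In SQL Server the same expression would use SUBSTRING, UPPER and + in place of substr, upper and ||.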