Get quarter on quarter growth with SQL for current quarter - sql

I'm trying to get the quarter-on-quarter revenue growth for only the current quarter from a dataset. I currently have a query that looks something like this:
Select
x.Year,
x.quarter,
x.product,
x.company_id,
y.company,
SUM(x.revenue)
FROM
company_directory y
LEFT JOIN (
SELECT
DATEPART(YEAR, #date) year,
DATEPART(QUARTER, #date) quarter,
product,
SUM(revenue)
FROM sales
WHERE year => 2018 ) x
ON y.company_id = x.company_id
The data I get is in this format
Year Quarter Product company_id company revenue
2020 Q2 Banana 1092 companyX $100
What I'm trying to do is get quarter on quarter growth for revenue if it's reporting the current quarter. So for example, in the above data, because we're in Q2-2020, I want an extra column to say QoQ is x% which will compare Q2 vs Q1 revenue. If the row is reporting Q1-2020 or Q2-2019 QoQ will be empty because neither of those are the current quarter based on today's date.
Expected result
Year Quarter Product company_id company revenue QoQ
2020 Q2 Banana 1092 companyX $100 20%
2020 Q1 Pear 1002 companyX $23 NULL
I'm not entirely sure how to go about this, haven't had much luck searching. Any idea how I can implement?

You can use window functions.
select
s.yr,
s.qt,
s.product,
s.company_id,
cd.company,
s.revenue,
1.0 * (
s.revenue
- lag(s.revenue) over(partition by s.product, s.company_id order by s.yr, s.qt)
) / lag(s.revenue) over(partition by s.product, s.company_id order by s.yr, s.qt) as QoQ
from company_directory cd
inner join (
select
datepart(year, sales_date) yr,
datepart(quarter, sales_date) qt,
product,
company_id,
sum(revenue) revenue
from sales
where sales_date >= '2018-01-01'
group by
datepart(year, sales_date),
datepart(quarter, sales_date),
product,
company_id
) s on s.company_id = cd.company_id
Notes:
your code is not valid MySQL; I assume that you are running SQL Server instead of MySQL
the use of a variable in the subquery does not make sense - I assume that you have a column called sales_date in table sales that holds the sales date
your group by clauses are inconsistent with your select clauses - I assume that you want the quarter to quarter sales growth per company and per product
you might need to adjust the QoQ computation to your actual definition of quarter-on-quarter growth
I don't see the point of a left join, so I used an inner join instead
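The query above returns QoQ for every quarter, while the question only wants it on the current quarter's row. A minimal sketch of one way to add that filter, run against SQLite through Python's sqlite3 module rather than SQL Server (an assumption made so the example is self-contained; window functions need SQLite 3.25 or later). The table quarterly and the function name are hypothetical, and the input is assumed to be already summed per quarter. Note that LAG picks the previous row in the partition, which is only the previous quarter if no quarter is missing.

```python
import sqlite3
from datetime import date


def qoq_for_current_quarter(rows, today):
    """rows: (year, quarter, product, company_id, revenue), one row per quarter."""
    cur_year = today.year
    cur_quarter = (today.month - 1) // 3 + 1  # months 1-3 -> Q1, 4-6 -> Q2, ...

    con = sqlite3.connect(":memory:")
    # Hypothetical table standing in for the aggregated subquery "s".
    con.execute(
        "CREATE TABLE quarterly "
        "(yr INTEGER, qt INTEGER, product TEXT, company_id INTEGER, revenue REAL)"
    )
    con.executemany("INSERT INTO quarterly VALUES (?, ?, ?, ?, ?)", rows)

    sql = """
        SELECT yr, qt, product, company_id, revenue,
               -- QoQ is only filled in for the row matching today's quarter
               CASE WHEN yr = :y AND qt = :q
                    THEN (revenue - prev_revenue) / prev_revenue
               END AS qoq
        FROM (
            SELECT q.*,
                   LAG(revenue) OVER (
                       PARTITION BY product, company_id ORDER BY yr, qt
                   ) AS prev_revenue
            FROM quarterly q
        )
        ORDER BY yr DESC, qt DESC
    """
    result = con.execute(sql, {"y": cur_year, "q": cur_quarter}).fetchall()
    con.close()
    return result
```

In SQL Server the same CASE could compare against DATEPART(year, GETDATE()) and DATEPART(quarter, GETDATE()) instead of bound parameters.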

You need the analytic window function LAG:
https://dbfiddle.uk/?rdbms=sqlserver_2019&fiddle=40a7ba897df766f913ebd99e2f2a0f4e

Related

Calculate average and standard deviation for pre defined number of values substituting missing rows with zeros

I have a simple table that contains a record of products and their total sales per day over a year (just 3 columns - Product, Date, Sales). So, for example, if product A is sold every single day, it'll have 365 records. Similarly, if product B is sold for only 50 days, the table will have just 50 rows for that product - one for each day of sale.
I need to calculate the daily average sales and standard deviation for the entire year, which means that, for product B, I need to have additional 365-50=315 entries with zero sales to be able to calculate the daily average and standard deviation for the year correctly.
Is there a way to do this efficiently and dynamically in SQL?
Thanks
We can generate 366 rows and join the sales data to it:
WITH rg(rn) AS (
SELECT 1 AS rn
UNION ALL
SELECT a.rn + 1 AS rn
FROM rg a
WHERE a.rn < 366
)
SELECT
*
FROM
rg
LEFT JOIN (
SELECT YEAR(saledate) as yr, DATEPART(dayofyear, saledate) as doy, count(*) as numsales
FROM sales
GROUP BY YEAR(saledate), DATEPART(dayofyear, saledate)
) s ON rg.rn = s.doy
OPTION (MAXRECURSION 370);
You can replace the nulls (where there is no sale data for that day) with zeros, e.g. AVG(COALESCE(numsales, 0)). You'll probably also need a WHERE clause to eliminate the 366th day in non-leap years (such as taking the year modulo 4 and only keeping 366 rows if it's 0).
If you're only doing a single year, you can use a WHERE clause in the sales subquery to return only the relevant records; the most efficient option is a range like WHERE saledate >= DATEFROMPARTS(YEAR(GetDate()), 1, 1) AND saledate < DATEFROMPARTS(YEAR(GetDate()) + 1, 1, 1), rather than calling a function on every sale date to extract the year and compare it to a constant. You can also drop YEAR(saledate) from the select/group by if there is only a single year.
If you're doing multiple years, you could make rg generate more rows, or (perhaps simpler) cross join it to a list of years so you get 366 rows multiplied by e.g. VALUES (2015),(2016),(2017),(2018),(2019),(2020) (and make the year from the sales part of the join too).
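A minimal sketch of the generate-and-join idea, using SQLite through Python's sqlite3 module instead of SQL Server (an assumption, to keep it runnable anywhere). The table layout (product, doy, sales), the function name, and the choice of population standard deviation are all assumptions for illustration; swap in statistics.stdev if you want the sample version.

```python
import calendar
import sqlite3
import statistics
from collections import defaultdict


def padded_daily_stats(sales_rows, year):
    """sales_rows: (product, day_of_year, sales) for days that had sales."""
    days = 366 if calendar.isleap(year) else 365

    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE sales (product TEXT, doy INTEGER, sales REAL)")
    con.executemany("INSERT INTO sales VALUES (?, ?, ?)", sales_rows)

    padded = con.execute(
        """
        WITH RECURSIVE rg(rn) AS (      -- one row per day of the year
            SELECT 1
            UNION ALL
            SELECT rn + 1 FROM rg WHERE rn < :days
        )
        SELECT p.product, COALESCE(s.sales, 0.0) AS sales
        FROM (SELECT DISTINCT product FROM sales) p
        CROSS JOIN rg
        LEFT JOIN sales s ON s.product = p.product AND s.doy = rg.rn
        """,
        {"days": days},
    ).fetchall()
    con.close()

    # Collect the zero-padded daily values per product.
    by_product = defaultdict(list)
    for product, value in padded:
        by_product[product].append(value)

    # (mean, population standard deviation) over every day of the year
    return {
        product: (statistics.mean(values), statistics.pstdev(values))
        for product, values in by_product.items()
    }
```

Using calendar.isleap avoids the modulo-4 shortcut, which gives the wrong answer for century years such as 1900.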
Find the first and last day of the year, then use datediff() to get the number of days in that year.
After that, don't use AVG on sales; use SUM(Sales) / days_in_year instead:
select *,
days_in_year = datediff(day, first_of_year, last_of_year) + 1
from (values (2019), (2020)) v(year)
cross apply
(
select first_of_year = dateadd(year, year - 1900, 0),
last_of_year = dateadd(year, year - 1900 + 1, -1)
) d
There's a different way to look at it - don't try to add additional empty rows, just divide by the number of days in a year. While the number of days a year isn't constant (a leap year will have 366 days), it can be calculated easily since the first day of the year is always January 1st and the last is always December 31st:
SELECT YEAR(date),
product,
SUM(sales) / DATEPART(dy, DATEFROMPARTS(YEAR(date), 12, 31))
FROM sales_table
GROUP BY YEAR(date), product
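The same trick extends to the standard deviation: missing days contribute zero to both the sum and the sum of squares, so neither needs padded rows. A small sketch of the arithmetic in Python, assuming the population standard deviation is what is wanted; the function name is hypothetical.

```python
import calendar
import math


def stats_without_padding(daily_sales, year):
    """daily_sales: sales figures for the days that actually had sales."""
    n = 366 if calendar.isleap(year) else 365   # days in the year
    total = sum(daily_sales)                    # zeros would add nothing
    total_sq = sum(x * x for x in daily_sales)  # nor to the sum of squares

    mean = total / n
    # Population variance: E[x^2] - (E[x])^2, clamped against rounding error.
    variance = max(total_sq / n - mean * mean, 0.0)
    return mean, math.sqrt(variance)
```

In SQL this corresponds to dividing SUM(sales) and SUM(sales * sales) by the day count instead of calling AVG or STDEVP on the existing rows.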

How to show all holiday types with the code

For my homework, I have to write sql code to show "Among all the orders in 2015, calculate the number of days for each holidaytype, and the average sales per day for each holidaytype. Exclude holidaytype=NULL. Sort the results by the average sales per day from high to low."
This is the code that I have been trying to use
select distinct holidaytype, sum(AvgSales) as AvgSales, sum(NumDays) as NumDays
from(
Select min(holidaytype) as holidaytype, count(order_date) as cnt, count(numholidays) as NumDays, avg(o.sales*o.quantity) as AvgSales
from orderline o, orders1 o1, calendar c
where o.Order_ID=o1.Order_ID and datepart(yyyy,order_date)= 2015 and holidaytype is not null) J
group by holidaytype
go
In the output, only 1 holiday type is showing but I am supposed to have 6 or 7 different holiday types.
The outer query groups by holidaytype, but the inner query wraps it in min() with no GROUP BY, which collapses every row into a single holiday type before the outer grouping happens. That isn't what you want, according to your question.
So instead of this:
Select min(holidaytype) as holidaytype,
You should do this:
Select holidaytype,
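Once holidaytype is selected without an aggregate, the query also has to GROUP BY it. A rough sketch of the grouped shape, run on SQLite through Python's sqlite3 module. The real schema isn't shown in the question, so the two tables (orders with order_date and sales, calendar with cal_date and holidaytype) and the join on the date are assumptions, and the day count here only covers holiday dates that have at least one order.

```python
import sqlite3


def holiday_sales_2015(orders, calendar_rows):
    """orders: (order_date, sales); calendar_rows: (cal_date, holidaytype)."""
    con = sqlite3.connect(":memory:")
    # Hypothetical simplified schema, dates stored as ISO 'YYYY-MM-DD' text.
    con.execute("CREATE TABLE orders (order_date TEXT, sales REAL)")
    con.execute("CREATE TABLE calendar (cal_date TEXT, holidaytype TEXT)")
    con.executemany("INSERT INTO orders VALUES (?, ?)", orders)
    con.executemany("INSERT INTO calendar VALUES (?, ?)", calendar_rows)

    rows = con.execute(
        """
        SELECT c.holidaytype,
               COUNT(DISTINCT c.cal_date) AS num_days,
               SUM(o.sales) / COUNT(DISTINCT c.cal_date) AS avg_sales_per_day
        FROM orders o
        JOIN calendar c ON c.cal_date = o.order_date
        WHERE c.holidaytype IS NOT NULL
          AND o.order_date >= '2015-01-01'
          AND o.order_date <  '2016-01-01'
        GROUP BY c.holidaytype          -- one output row per holiday type
        ORDER BY avg_sales_per_day DESC
        """
    ).fetchall()
    con.close()
    return rows
```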

SQL BigQuery : Calculate Value per time period

I'm new to SQL on BigQuery and I'm blocked on a project I have to compile.
I'm being asked to find the year-over-year growth of sales, as a percentage, on a database that doesn't even sum the revenues... I know I have to combine several queries but can't figure out how to calculate the growth of sales.
Here is where I am at :
Does anybody have any insight on how to do so?
Thanks a lot !
(1) Starting from what you have, group by product line to get this year and last year's revenue in each row:
#standardsql
with yearly_sales AS (
select year, product_line, sum(revenue) as revenue
from `dataset.sales`
group by product_line, year
),
year_on_year AS (
select product_line, year,
array_agg(struct(year, revenue))
OVER(partition by product_line ORDER BY year
RANGE BETWEEN 1 PRECEDING AND CURRENT ROW) AS data
from yearly_sales
)
select * from year_on_year
(2) Compute year-on-year growth from the two values you now have in each row
Below is for BigQuery Standard SQL
#standardSQL
SELECT product_line, year, revenue, prev_year_revenue,
ROUND(100 * (revenue - prev_year_revenue)/prev_year_revenue) year_over_year_growth_percent
FROM (
SELECT product_line, year, revenue,
LAG(revenue) OVER(PARTITION BY product_line ORDER BY year) prev_year_revenue
FROM (
SELECT product_line, year, SUM(revenue) revenue
FROM `project.dataset.table`
GROUP BY product_line, year
)
)
-- ORDER BY product_line, year
I tried with your information (plus my own made-up data for 2007) and arrived here:
SELECT
year,
sum(revenue) as year_sum
FROM
YearlyRevenue.SportCompany
GROUP BY
year
ORDER BY
year_sum
Whose result is:
R year year_sum
1 2005 1.159E9
2 2006 1.4953E9
3 2007 1.5708E9
Now the % growth should be added. Have a look here for inspiration.
Let me know if you don't succeed and I will try the hard part, with no guarantees.
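As a sketch of that last step, the LAG approach can be applied to the three yearly totals shown above (the 2007 figure is the made-up one). This runs on SQLite via Python's sqlite3 module rather than BigQuery, which is an assumption for the sake of a runnable example; the table name yearly and the function name are hypothetical.

```python
import sqlite3


def year_over_year(totals):
    """totals: (year, year_sum) pairs; returns (year, growth_percent) rows."""
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE yearly (year INTEGER, year_sum REAL)")
    con.executemany("INSERT INTO yearly VALUES (?, ?)", totals)
    rows = con.execute(
        """
        SELECT year,
               -- growth relative to the previous year's total, in percent
               ROUND(100 * (year_sum - LAG(year_sum) OVER (ORDER BY year))
                     / LAG(year_sum) OVER (ORDER BY year)) AS growth_percent
        FROM yearly
        ORDER BY year
        """
    ).fetchall()
    con.close()
    return rows


# The first year has no predecessor, so its growth is NULL (None in Python).
print(year_over_year([(2005, 1.159e9), (2006, 1.4953e9), (2007, 1.5708e9)]))
# prints [(2005, None), (2006, 29.0), (2007, 5.0)]
```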

Joining together 2 SQL queries to obtain weekday average and weekend average results in same query

I have two queries that work fine separately: one query extracts average sales of pens & pencils from a sales table for each salesperson for WEEKDAYS for each month of a year and the other query extracts average sales of pencils and pens from the same sales table for each salesperson for WEEKENDS for each month of a year. I can't figure out how to join the two queries together so that average sales for pencils and pens for weekends AND weekdays appear in the same result set.
My two separate working queries are:
++++++++++++++FIRST QUERY+++++++++++++++++
SELECT salesperson,
Avg([pencil_sales]) [pencil_salesAV],
Avg([pen_sales]) [pen_salesAV],
Month(the_date) Month,
Year(the_date) Year
FROM regionalsales
WHERE DATEPART(w,[the_date]) NOT IN (1,7)
GROUP BY GROUPING SETS((Month(the_date), Year(the_date), salesperson), (salesperson));
+++++++++++++++++SECONDQUERY++++++++++++++++
SELECT salesperson,
Avg([pencil_sales]) [pencil_salesAV],
Avg([pen_sales]) [pen_salesAV],
Month(the_date) Month,
Year(the_date) Year
FROM regionalsales
WHERE DATEPART(w,[the_date]) NOT IN (2,3,4,5,6)
GROUP BY GROUPING SETS((Month(the_date), Year(the_date), salesperson), (salesperson));
++++++++++++++++++++
I would appreciate advice on how to join these two queries into one result set. Thanks.
Use conditional aggregation:
SELECT salesperson,
Avg(CASE WHEN DATEPART(weekday, [the_date]) NOT IN (1, 7) THEN [pencil_sales] END) as pencil_salesAV_weekday,
Avg(CASE WHEN DATEPART(weekday, [the_date]) NOT IN (1, 7) THEN [pen_sales] END) as pen_salesAV_weekday,
Avg(CASE WHEN DATEPART(weekday, [the_date]) IN (1, 7) THEN [pencil_sales] END) as pencil_salesAV_weekend,
Avg(CASE WHEN DATEPART(weekday, [the_date]) IN (1, 7) THEN [pen_sales] END) as pen_salesAV_weekend,
Month(the_date) as Month,
Year(the_date) as Year
FROM regionalsales
GROUP BY GROUPING SETS((Month(the_date), Year(the_date), salesperson), (salesperson));
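Conditional aggregation works because AVG ignores NULLs, and a CASE without ELSE yields NULL for rows that don't match. A minimal sketch on SQLite through Python's sqlite3 module (an assumption; the original targets SQL Server). SQLite has no DATEPART, so strftime('%w', ...) is used instead, where '0' is Sunday and '6' is Saturday, and a plain GROUP BY replaces GROUPING SETS, which SQLite lacks. In SQL Server, keep in mind that DATEPART(weekday, ...) depends on the DATEFIRST setting.

```python
import sqlite3


def weekday_weekend_averages(rows):
    """rows: (salesperson, the_date 'YYYY-MM-DD', pencil_sales, pen_sales)."""
    con = sqlite3.connect(":memory:")
    con.execute(
        "CREATE TABLE regionalsales "
        "(salesperson TEXT, the_date TEXT, pencil_sales REAL, pen_sales REAL)"
    )
    con.executemany("INSERT INTO regionalsales VALUES (?, ?, ?, ?)", rows)
    result = con.execute(
        """
        SELECT salesperson,
               strftime('%Y', the_date) AS yr,
               strftime('%m', the_date) AS mon,
               -- CASE returns NULL for non-matching rows; AVG skips NULLs
               AVG(CASE WHEN strftime('%w', the_date) NOT IN ('0', '6')
                        THEN pencil_sales END) AS pencil_weekday,
               AVG(CASE WHEN strftime('%w', the_date) IN ('0', '6')
                        THEN pencil_sales END) AS pencil_weekend,
               AVG(CASE WHEN strftime('%w', the_date) NOT IN ('0', '6')
                        THEN pen_sales END) AS pen_weekday,
               AVG(CASE WHEN strftime('%w', the_date) IN ('0', '6')
                        THEN pen_sales END) AS pen_weekend
        FROM regionalsales
        GROUP BY salesperson, yr, mon
        ORDER BY salesperson, yr, mon
        """
    ).fetchall()
    con.close()
    return result
```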
Just treat day of week as a field to group the data:
SELECT
salesperson,
day_of_week,
month,
year,
Avg([pencil_sales]) [pencil_salesAV],
Avg([pen_sales]) [pen_salesAV]
FROM (
SELECT
salesperson,
pencil_sales,
pen_sales,
IF(DATEPART(w,[the_date]) IN (1,7), 'weekend', 'weekday') AS day_of_week,
Month(the_date) AS month,
Year(the_date) AS year
FROM regionalsales
) AS sales
GROUP BY GROUPING SETS((salesperson), (salesperson, day_of_week), (salesperson, day_of_week, month, year));
If the dialect of SQL you use doesn't have IF, you can always use CASE.
CASE WHEN DATEPART(w,[the_date]) IN (1,7) THEN 'weekend' ELSE 'weekday' END AS day_of_week

Order total sales by month

I am printing out the total sales for each state so long as they have over $6000 in sales for that month. The results need to be ordered by month(descending order). I noticed that I got 2 results for Texas for the same month. So what that tells me is it is just looking for individual sales over $6,000 even though I have a sum(so.total) which I thought would give the total sales for state ordered by the month. Not sure how to break down the sales by each individual month if the order_date field has day-month-year in it.(Using Oracle 10g)
Input:
SELECT c.state "State",to_char(so.order_date, 'Month')"Month", sum(so.total)"Sales"
FROM a_customer c
JOIN a_sales_order so ON c.customer_id=so.customer_id
GROUP BY c.state, so.order_date
HAVING sum(so.total)>6000
ORDER BY to_char(so.order_date, 'MM') desc
Output:
State Month Sales
----- --------- --------
TX September $8,400
CA July $8,377
TX March $7,700
TX March $8,540
CA February $46,370
MN February $6,400
CA February $24,650
I think there may be an issue in both your GROUP BY and ORDER BY clauses.
Try:
SELECT c.state AS "State",
to_char(so.order_date, 'Month') AS "Month",
sum(so.total) AS "Sales"
FROM a_customer c
JOIN a_sales_order so ON c.customer_id=so.customer_id
GROUP BY c.state,
to_char(so.order_date, 'Month')
HAVING sum(so.total)>6000
ORDER BY to_date(to_char(so.order_date, 'Month'), 'Month') desc
I don't have a working instance in front of me at the moment to try it, but you need to group on the same expression that you display. In your original example, you group on the full order_date but display only the month portion.
The order by in your example will order by the text of the month rather than the actual chronological order.
Hope it helps.
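To illustrate the grouping point, here is a small sketch on SQLite through Python's sqlite3 module rather than Oracle (an assumption for runnability), with strftime('%m', ...) standing in for to_char(..., 'MM'). The single flattened table sales_order, the function name, and the order dates are hypothetical; only the two Texas March amounts come from the output above. Grouping on the month alone also merges the same month across different years, so add the year to the GROUP BY if the data spans more than one.

```python
import sqlite3


def monthly_state_sales(orders, minimum=6000):
    """orders: (state, order_date 'YYYY-MM-DD', total) tuples."""
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE sales_order (state TEXT, order_date TEXT, total REAL)")
    con.executemany("INSERT INTO sales_order VALUES (?, ?, ?)", orders)
    rows = con.execute(
        """
        SELECT state,
               strftime('%m', order_date) AS month,
               SUM(total) AS sales
        FROM sales_order
        -- group on the displayed month, not the full date
        GROUP BY state, strftime('%m', order_date)
        HAVING SUM(total) > :minimum
        ORDER BY month DESC, state
        """,
        {"minimum": minimum},
    ).fetchall()
    con.close()
    return rows
```

With this grouping, the two Texas orders in March are summed into one row before the HAVING filter is applied.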