Calculating Year over Year Growth using PostgreSQL

I'm using a SQL dataset called Superstore and I want to figure out how to calculate year over year sales growth as a percentage. Here's the code I already have:
SELECT
EXTRACT(year FROM order_date) AS order_year,
SUM(sales) AS total_sales
FROM orders
WHERE order_date BETWEEN date '2016-01-01' and date '2019-12-01'
GROUP BY 1
ORDER BY 1
Any help would be welcomed. Thank you.

I would aggregate in a subquery, then use a window function:
SELECT *
FROM (SELECT (total_sales - lag(total_sales) OVER (ORDER BY order_year)) / lag(total_sales) OVER (ORDER BY order_year) * 100.0 AS yoy_growth_pct,
order_year
FROM (SELECT CAST (EXTRACT(year FROM order_date) AS integer) AS order_year,
SUM(sales) AS total_sales
FROM orders
GROUP BY 1) AS subq
) AS subq2
WHERE order_year = 2021;
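To make the "aggregate first, then window" idea concrete, here is a minimal sketch that can be run without a PostgreSQL server. It assumes SQLite (3.25 or later) through Python's sqlite3 module as a stand-in, so strftime replaces EXTRACT, and the four sample rows are made up for illustration.

```python
import sqlite3

# In-memory database with a tiny, made-up orders table (assumption: not the Superstore data).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (order_date TEXT, sales REAL)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [
        ("2016-03-01", 100.0),
        ("2017-05-10", 60.0),
        ("2017-09-20", 60.0),
        ("2018-07-04", 90.0),
    ],
)

# Step 1 (subquery): one row per year. Step 2 (outer query): compare each year with LAG().
query = """
SELECT order_year,
       total_sales,
       100.0 * (total_sales - LAG(total_sales) OVER (ORDER BY order_year))
             / LAG(total_sales) OVER (ORDER BY order_year) AS yoy_growth_pct
FROM (SELECT CAST(strftime('%Y', order_date) AS INTEGER) AS order_year,
             SUM(sales) AS total_sales
      FROM orders
      GROUP BY order_year) AS yearly
ORDER BY order_year
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

The first year has no predecessor, so its growth is NULL (None in Python). Note that if you filter to a single year, do it outside the windowed subquery; filtering before the window runs would leave LAG with nothing to look back at.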

Related

Calculate month on month growth rate of orders for the last 3 months for each country

I am trying to find the month on month growth rate of orders for the past 3 months for each country.
So far I have tried:
select date_part('month', order_date) as mnth,
country_id,
100 * (count() - lag(count(), 1) over (order by order_date)) / lag(count(), 1) over (order by order_date) as growth
from orders
and order_date >= DATEADD(DAY, -90, GETDATE())
group by country_id;
When we GROUP BY country_id, we produce a result of rows, one per country.
The aggregate COUNT will then operate on one group for each country and the subsequent window function (LAG) won't see more than one row for each country.
There's no way, in this context, LAG can be used to obtain data for a prior month for the same country.
GROUP BY country_id, date_part('month', order_date) is one approach that could be used. Be sure to LAG OVER PARTITIONs for each country, ordered by date.
Here's a small change in your SQL that might help (not tested and just a starting point).
Note: I used SQL Server to test below. Convert datepart to date_part as needed.
WITH cte AS (
SELECT *, datepart(month, order_date) AS mnth
FROM orders
WHERE order_date >= DATEADD(DAY, -90, GETDATE())
)
SELECT mnth
, country_id
, 100 * (COUNT(*) - LAG(COUNT(*)) OVER (PARTITION BY country_id ORDER BY mnth)) / LAG(COUNT(*)) OVER (PARTITION BY country_id ORDER BY mnth) AS growth
FROM cte
GROUP BY country_id, mnth
;
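As a rough illustration of why the PARTITION BY matters, here is a small sketch using SQLite through Python's sqlite3 module (an assumption, chosen only so it runs anywhere; the sample rows are invented). It keys each group on year plus month rather than month alone, which is my own tweak so that a 90-day window spanning December and January still sorts correctly.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (country_id INTEGER, order_date TEXT)")
# Made-up orders: country 1 has 2, 3, 3 orders per month; country 2 has 4, 2.
sample = (
    [(1, "2023-12-05")] * 2 + [(1, "2024-01-10")] * 3 + [(1, "2024-02-15")] * 3
    + [(2, "2023-12-07")] * 4 + [(2, "2024-01-20")] * 2
)
conn.executemany("INSERT INTO orders VALUES (?, ?)", sample)

# Count per (country, month) first, then let LAG look back within each country only.
query = """
SELECT country_id, mnth, n_orders,
       100.0 * (n_orders - LAG(n_orders) OVER (PARTITION BY country_id ORDER BY mnth))
             / LAG(n_orders) OVER (PARTITION BY country_id ORDER BY mnth) AS growth
FROM (SELECT country_id,
             strftime('%Y-%m', order_date) AS mnth,
             COUNT(*) AS n_orders
      FROM orders
      GROUP BY country_id, mnth)
ORDER BY country_id, mnth
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

Each country's first month comes back with a NULL growth, since there is no earlier row in its partition. Multiplying by 100.0 rather than 100 also avoids integer division truncating the percentage.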

How to retrieve last year data on line-by-line basis (main set grouped by year, month & aggregated on volume)?

Is there an easy way to retrieve last year's data while aggregating volume, grouped by year and month?
Sample code below (from BigQuery). It fails with this error in the subquery:
WHERE clause expression references t1.date which is neither grouped nor aggregated
SELECT
EXTRACT(YEAR FROM t1.date) AS year,
EXTRACT(MONTH FROM t1.date) AS month,
t1.ProductId AS product,
SUM(t1.Quantity) AS UnitsSold_TY,
(SELECT
SUM(Quantity)
FROM `new-project-jun21.sales.sales_info`
WHERE
EXTRACT(YEAR FROM date) = EXTRACT(YEAR FROM t1.date) - 1 AND
EXTRACT(MONTH FROM date) = EXTRACT(MONTH FROM t1.date) AND
ProductId = t1.ProductId
GROUP BY
EXTRACT(YEAR FROM date),
EXTRACT(MONTH FROM date),
EXTRACT(MONTH FROM t1.date),
ProductId) AS UnitsSold_LY
FROM `new-project-jun21.sales.sales_info` AS t1
GROUP BY
year,
month,
product
ORDER BY product, year, month
If you have data every month, then you can use lag(). I would recommend using date_trunc() instead of separating the year and month components. So:
SELECT ProductId,
DATE_TRUNC(date, MONTH) AS yyyymm,
SUM(Quantity) AS UnitsSold_TY,
LAG(SUM(Quantity), 12) OVER (PARTITION BY ProductId ORDER BY MIN(date)) AS yoy
FROM `new-project-jun21.sales.sales_info`
GROUP BY ProductId, yyyymm
ORDER BY ProductId, yyyymm;
If you have missing months for some products, then you can still use window functions, but the logic is a bit more complicated.
EDIT:
If you don't have data every month, then you can use a RANGE specification, but you need a month counter:
SELECT ProductId,
DATE_TRUNC(date, MONTH) AS yyyymm,
SUM(Quantity) AS UnitsSold_TY,
MAX(SUM(Quantity)) OVER (PARTITION BY ProductId
ORDER BY DATE_DIFF(MIN(date), DATE '2000-01-01', MONTH)
RANGE BETWEEN 12 PRECEDING AND 12 PRECEDING
) AS yoy
FROM `new-project-jun21.sales.sales_info`
GROUP BY ProductId, yyyymm
ORDER BY ProductId, yyyymm;
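Here is a small sketch of the month-counter plus RANGE idea on data with a missing month. It assumes SQLite (3.28 or later for RANGE with a numeric offset) through Python's sqlite3 module rather than BigQuery, so the month counter is built by hand as year * 12 + month, and the table, column names, and rows are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sales_info (product_id INTEGER, sale_date TEXT, quantity INTEGER)"
)
# Made-up data: product 1 has no sales in 2020-02.
conn.executemany(
    "INSERT INTO sales_info VALUES (?, ?, ?)",
    [
        (1, "2020-01-05", 4), (1, "2020-01-20", 6), (1, "2020-03-11", 5),
        (1, "2021-01-09", 15), (1, "2021-02-14", 7), (1, "2021-03-02", 6),
    ],
)

# month_idx is a running month number, so "12 PRECEDING" means "same month last year".
# The frame holds at most one row; MAX just picks it up, or yields NULL when it is empty.
query = """
SELECT product_id, yyyymm, units,
       MAX(units) OVER (PARTITION BY product_id
                        ORDER BY month_idx
                        RANGE BETWEEN 12 PRECEDING AND 12 PRECEDING) AS units_ly
FROM (SELECT product_id,
             strftime('%Y-%m', sale_date) AS yyyymm,
             CAST(strftime('%Y', sale_date) AS INTEGER) * 12
               + CAST(strftime('%m', sale_date) AS INTEGER) AS month_idx,
             SUM(quantity) AS units
      FROM sales_info
      GROUP BY product_id, yyyymm, month_idx)
ORDER BY product_id, month_idx
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

With a row-based LAG(units, 12) every value here would be NULL, because no product has 12 earlier rows. The RANGE frame matches on the month number instead, so 2021-01 and 2021-03 find their prior-year values and 2021-02 correctly gets NULL.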
Functionally, this should be equivalent, but I'd probably try to avoid the correlated subquery:
WITH xrows AS (
SELECT EXTRACT(YEAR FROM t1.date) AS year
, EXTRACT(MONTH FROM t1.date) AS month
, t1.ProductId AS product
, SUM(t1.Quantity) AS UnitsSold_TY
FROM `new-project-jun21.sales.sales_info` AS t1
GROUP BY year, month, product
)
SELECT t.*
, ( SELECT SUM(Quantity)
FROM `new-project-jun21.sales.sales_info`
WHERE EXTRACT(YEAR FROM date) = t.year - 1
AND EXTRACT(MONTH FROM date) = t.month
AND ProductId = t.product
) AS UnitsSold_LY
FROM xrows t
ORDER BY product, year, month
;
Adjusted to remove the unnecessary GROUP BY in the correlated subquery.

Extracting entries from a specific month in SQL

I'm trying to calculate the total revenue from each customer's purchases in March from a table called orders, where order_date is a datetime (e.g. 2019-03-04). However, I can't get the results with any of the following options:
WITH cte_orders AS
(
SELECT
cust_id,
SUM(order_quantity * order_cost) AS revenue
FROM
orders
WHERE
DATEPART('month', order_date) = 3
GROUP BY
cust_id
)
SELECT cust_id, revenue
FROM cte_orders;
also
WHERE EXTRACT(month from order_date) = '03'
or
WHERE MONTH(order_date) = 03
doesn't work either. What's wrong with my syntax here?
Thanks everyone for the input, finally figured out the right way to do this:
WITH cte_orders AS
(
SELECT
cust_id,
SUM(order_quantity * order_cost) AS revenue
FROM
orders
WHERE
EXTRACT('MONTH' FROM order_date :: TIMESTAMP) = 3
GROUP BY
cust_id
)
SELECT cust_id, revenue
FROM cte_orders;
With this, the date is cast to a timestamp and March is extracted as required.
What about date_part?
WITH cte_orders AS
(select cust_id
, SUM(order_quantity * order_cost) as revenue
from orders
WHERE date_part('month', order_date::timestamp) = 3
group by cust_id)
select cust_id, revenue from cte_orders;
Did you consider putting the WHERE clause outside of the CTE?
WITH cte_orders AS
(
SELECT
cust_id,
EXTRACT(MONTH FROM order_date) AS order_month,
SUM(order_quantity * order_cost) AS revenue
FROM
orders
GROUP BY
cust_id,
EXTRACT(MONTH FROM order_date)
)
SELECT cust_id, revenue
FROM cte_orders
WHERE order_month = 3;
Also, it would be helpful if you let us know what (if any) results you are getting. Are you getting an error message or is it just not returning the expected values/row count?

T-SQL query to summarize total per month per year, and cumulative amounts to date

I have a database table that captures every Sales Transaction:
Transactions
(
ID INT,
TransactionDate DATETIME,
SalesAmount MONEY
)
I want to write a T-SQL query that returns a report (snapshot sample below). The first column lists the month, the next column the total sales per month within the year, and the last column the cumulative amount for that year up to that month. Only for the year 2018.
Any thoughts or solutions? Thank you.
Try this:
;with cte as
(
Select
YEAR(TransactionDate) as [Year],
MONTH(TransactionDate) as [Month],
SUM (SalesAmount) as [MonthlySales],
MONTH(TransactionDate) as [MonthNumber]
from Transactions
group by YEAR(TransactionDate), MONTH(TransactionDate)
)
select
a.[Month], a.MonthlySales as [MonthlySales 2018], SUM(b.MonthlySales) as [Cumulative 2018]
from cte a inner join cte b on a.MonthNumber >= b.MonthNumber
WHERE (a.[Year]) = 2018 AND (b.[Year]) = 2018
group by a.[Month], a.MonthlySales
ORDER by a.[Month]
Try this one:
With Q
as
(
Select DatePart(yyyy,TransactionDate) 'Year',DatePart(m,TransactionDate) 'Month', sum(SalesAmount) 'Sales'
From Transactions
Group by DatePart(yyyy,TransactionDate),DatePart(m,TransactionDate)
)
Select q.Year,q.Month,( Select sum(q1.Sales)
From Q q1
Where q1.Year=q.Year
And q1.Month <= q.Month
) 'Cumulative Sale'
From Q q
Order by q.Year,q.Month
You would use aggregation and window functions:
select datename(month, TransactionDate) as mon,
sum(SalesAmount) as monthly_sales,
sum(sum(SalesAmount)) over (order by min(TransactionDate)) as running_amount
from Transactions t
where t.TransactionDate >= '2018-01-01' and
t.TransactionDate < '2019-01-01'
group by datename(month, TransactionDate)
order by min(TransactionDate);
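For anyone who wants to try the running-total pattern without a SQL Server instance, here is a hedged sketch using SQLite through Python's sqlite3 module (an assumption; strftime stands in for datename, and the rows are made up). It aggregates per month in a subquery and then applies SUM() OVER (ORDER BY ...) on top, which is the same idea as nesting the aggregate inside the window function.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Transactions (ID INTEGER, TransactionDate TEXT, SalesAmount REAL)"
)
# Made-up rows; the 2017 row should be excluded by the date filter.
conn.executemany(
    "INSERT INTO Transactions VALUES (?, ?, ?)",
    [
        (1, "2017-12-31", 999.0),
        (2, "2018-01-05", 100.0),
        (3, "2018-01-20", 50.0),
        (4, "2018-02-10", 200.0),
        (5, "2018-03-03", 25.0),
    ],
)

# Inner query: monthly totals for 2018 (half-open date range).
# Outer query: with ORDER BY in the window, SUM accumulates up to the current row.
query = """
SELECT mon,
       monthly_sales,
       SUM(monthly_sales) OVER (ORDER BY mon) AS running_amount
FROM (SELECT strftime('%Y-%m', TransactionDate) AS mon,
             SUM(SalesAmount) AS monthly_sales
      FROM Transactions
      WHERE TransactionDate >= '2018-01-01'
        AND TransactionDate < '2019-01-01'
      GROUP BY mon)
ORDER BY mon
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

Compared with the self-join and correlated-subquery versions, the window function reads the monthly totals once, which usually matters as the table grows.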

Find max value for each year

I have a question that is asking:
-List the max sales for each year?
I think I have the starter query but I can't figure out how to get all the years in my answer:
SELECT TO_CHAR(stockdate,'YYYY') AS year, sales
FROM sample_newbooks
WHERE sales = (SELECT MAX(sales) FROM sample_newbooks);
This query gives me the year with the max sales. I need max sales for EACH year. Thanks for your help!
Use group by and max if all you need is year and max sales of the year.
select
to_char(stockdate, 'yyyy') year,
max(sales) sales
from sample_newbooks
group by to_char(stockdate, 'yyyy')
If you need rows with all the columns with max sales for the year, you can use window function row_number:
select
*
from (
select
t.*,
row_number() over (partition by to_char(stockdate, 'yyyy') order by sales desc) rn
from sample_newbooks t
) t where rn = 1;
If you want to get the rows with ties on sales, use rank:
select
*
from (
select
t.*,
rank() over (partition by to_char(stockdate, 'yyyy') order by sales desc) rn
from sample_newbooks t
) t where rn = 1;
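To see the difference between row_number and rank on ties, here is a small sketch. It assumes SQLite through Python's sqlite3 module instead of Oracle (so strftime replaces to_char), and the title column, the helper top_per_year, and the four rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sample_newbooks (title TEXT, stockdate TEXT, sales INTEGER)")
# Made-up rows: books A and B tie for the top sales in 2019.
conn.executemany(
    "INSERT INTO sample_newbooks VALUES (?, ?, ?)",
    [
        ("A", "2019-02-01", 10),
        ("B", "2019-06-01", 10),
        ("C", "2019-09-01", 5),
        ("D", "2020-03-01", 7),
    ],
)


def top_per_year(ranking_fn):
    """Return the rows ranked first per year using ROW_NUMBER or RANK (hypothetical helper)."""
    if ranking_fn not in ("ROW_NUMBER", "RANK"):
        raise ValueError("ranking_fn must be ROW_NUMBER or RANK")
    query = f"""
    SELECT year, title, sales
    FROM (SELECT strftime('%Y', stockdate) AS year,
                 title,
                 sales,
                 {ranking_fn}() OVER (PARTITION BY strftime('%Y', stockdate)
                                      ORDER BY sales DESC) AS rn
          FROM sample_newbooks)
    WHERE rn = 1
    ORDER BY year, title
    """
    return conn.execute(query).fetchall()


by_row_number = top_per_year("ROW_NUMBER")  # exactly one row per year
by_rank = top_per_year("RANK")              # every row tied for first place
print(by_row_number)
print(by_rank)
```

With row_number, which of the tied 2019 books is returned is arbitrary unless you add a tie-breaker to the ORDER BY; rank returns both.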