calculate CAGR using SQL - sql

I have a dataset which looks like below
ADVERTISER YR REVENUE
---------------------------------
Altus Dental 2015 5560.00
Altus Dental 2016 48295.00
Altus Dental 2017 39920.00
I'm trying to find CAGR - year over year and taking an average of them, meaning
CAGR = (((REVENUE(2016)/REVENUE(2015)) - 1) + ((REVENUE(2017)/REVENUE(2016)) - 1) ) / 2
And Finally I will need an output something like this
ADVERTISER CAGR
--------------------
Altus Dental 3.75
How can I accomplish this in SQL? Please help me in providing an effective solution for this.

Calculate the CAGR (revenue/prev_revenue - 1) for each year and calculate the average CAGR (assume your dbms supports the LAG function)
select advertiser, avg(cagr) as CAGR
from
(
select advertiser, yr, revenue, revenue/prev_revenue - 1 as cagr
from
(select *, lag(revenue, 1) over
(partition by advertiser order by yr) as prev_revenue
from test ) t
) t1
group by advertiser

Here is one way:
select advertiser,
(((t16.revenue/t15.revenue) - 1) + ((t17.revenue/t16.revenue) - 1) ) / 2 as cagr
from t t15 join
t t16
on t15.advertiser = t16.advertiser and t15.yr = 2015 and t16.yr = 2016 join
t t17
on t15.advertiser = t17.advertiser and t17.yr = 2017

I'm assuming there won't be "holes" in the list of years. This should work for n years, and n advertisers:
SELECT advertiser,
SUM(revenue) / (COUNT(*) - 1) AS CAGR
FROM (SELECT advertiser,
COALESCE((revenue/revenue_old - 1), 0) as revenue
FROM (SELECT s.advertiser,
s.revenue,
LAG(s.revenue, 1) OVER(PARTITION BY s.advertiser
ORDER BY s.yr) AS revenue_old
FROM table_1 s))
GROUP BY advertiser;

Related

How to calculate value based on average of previous month and average of same month last year in SQL

I would like to calculate targets for opened rates and clicked rates based on actuals of the last month and the same month last year.
My table is aggregated at daily level and I have grouped it by month and year to get the monthly averages. I have then created a self-join to join my current dates on the results of the previous months. This works fine for all months except for January because SQL can't know that it's supposed to join 1 on 12. Is there a way to specify this in my join clause?
Essentially, the results for January 2021 shouldn't be null because I have December 2020 data.
This is my data and my query:
CREATE TABLE exasol_last_year_avg(
date_col date,
country text,
brand text,
category text,
delivered integer,
opened integer,
clicked integer
)
INSERT INTO exasol_last_year_avg
(date_col,country,brand,category,delivered,opened,clicked) VALUES
(2021-01-01,'AT','brand1','cat1',100,60,23),
(2021-01-01,'AT','brand1','cat2',200,50,45),
(2021-01-01,'AT','brand2','cat1',300,49,35),
(2021-01-01,'AT','brand2','cat2',400,79,57),
(2021-02-02,'AT','brand1','cat1',130,78,30),
(2021-02-02,'AT','brand1','cat2',260,65,59),
(2021-02-02,'AT','brand2','cat1',390,64,46),
(2021-02-02,'AT','brand2','cat2',520,103,74),
(2020-12-02,'AT','brand1','cat1',130,78,30),
(2020-12-02,'AT','brand1','cat2',260,65,59),
(2020-12-02,'AT','brand2','cat1',390,64,46),
(2020-12-02,'AT','brand2','cat2',520,103,74),
(2020-02-02,'AT','brand1','cat2',236,59,53),
(2020-02-02,'AT','brand2','cat1',355,58,41),
(2020-02-02,'AT','brand2','cat2',473,93,67),
(2020-02-02,'AT','brand1','cat1',118,71,27)
This is written in PostgresSQL because I think it's more accessible to most people, but my production database is Exasol!
select *
from
(Select month_col,
year_col,
t_campaign_cmcategory,
t_country,
t_brand,
(t2_clicktoopenrate + t3_clicktoopenrate)/2 as target_clicktoopenrate,
(t2_openrate + t3_openrate)/2 as target_openrate
from (
with CTE as (
select extract(month from date_col) as month_col,
extract(year from date_col) as year_col,
category as t_campaign_cmcategory,
country as t_country,
brand as t_brand,
round(sum(opened)/nullif(sum(delivered),0),3) as OpenRate,
round(sum(clicked)/nullif(sum(opened),0),3) as ClickToOpenRate
from public.exasol_last_year_avg
group by 1, 2, 3, 4, 5)
select t1.month_col,
t1.year_col,
t2.month_col as t2_month_col,
t2.year_col as t2_year_col,
t3.month_col as t3_month_col,
t3.year_col as t3_year_col,
t1.t_campaign_cmcategory,
t1.t_country,
t1.t_brand,
t1.OpenRate,
t1.ClickToOpenRate,
t2.OpenRate as t2_OpenRate,
t2.ClickToOpenRate as t2_ClickToOpenRate,
t3.OpenRate as t3_OpenRate,
t3.ClickToOpenRate as t3_ClickToOpenRate
from CTE t1
left join CTE t2
on t1.month_col = t2.month_col + 1
and t1.year_col = t2.year_col
and t1.t_campaign_cmcategory = t2.t_campaign_cmcategory
and t1.t_country = t2.t_country
and t1.t_brand = t2.t_brand
left join CTE t3
on t1.month_col = t3.month_col
and t1.year_col = t3.year_col + 1
and t1.t_campaign_cmcategory = t3.t_campaign_cmcategory
and t1.t_country = t3.t_country
and t1.t_brand = t3.t_brand) as target_base) as final_tbl
Start with an aggregation query:
select date_trunc('month', date_col), country, brand,
sum(opened) * 1.0 / nullif(sum(delivered), 0) as OpenRate,
sum(clicked) * 1.0 / nullif(sum(opened), 0) as ClickToOpenRate
from exasol_last_year_avg
group by 1, 2, 3;
Then, use window functions. Assuming you have a value for every month (with no gaps). you can just use lag(). I'm not sure what your final calculation is, but this brings in the data:
with mcb as (
select date_trunc('month', date_col) as yyyymm, country, brand,
sum(opened) * 1.0 / nullif(sum(delivered), 0) as OpenRate,
sum(clicked) * 1.0 / nullif(sum(opened), 0) as ClickToOpenRate
from exasol_last_year_avg
group by 1, 2, 3
)
select mcb.*,
lag(openrate, 1) over (partition by country, brand order by yyyymm) as prev_month_openrate,
lag(ClickToOpenRate, 1) over (partition by country, brand order by yyyymm) as prev_month_ClickToOpenRate,
lag(openrate, 12) over (partition by country, brand order by yyyymm) as prev_year_openrate,
lag(ClickToOpenRate, 12) over (partition by country, brand order by yyyymm) as prev_year_ClickToOpenRate
from mcb;
This works with a different join condition:
select *
from
(Select month_col,
year_col,
t_campaign_cmcategory,
t_country,
t_brand,
(t2_clicktoopenrate + t3_clicktoopenrate)/2 as target_clicktoopenrate,
(t2_openrate + t3_openrate)/2 as target_openrate
from (
with CTE as (
select extract(month from date_col) as month_col,
extract(year from date_col) as year_col,
category as t_campaign_cmcategory,
country as t_country,
brand as t_brand,
round(sum(opened)/nullif(sum(delivered),0),3) as OpenRate,
round(sum(clicked)/nullif(sum(opened),0),3) as ClickToOpenRate
from public.exasol_last_year_avg
group by 1, 2, 3, 4, 5)
select t1.month_col,
t1.year_col,
t2.month_col as t2_month_col,
t2.year_col as t2_year_col,
t3.month_col as t3_month_col,
t3.year_col as t3_year_col,
t1.t_campaign_cmcategory,
t1.t_country,
t1.t_brand,
t1.OpenRate,
t1.ClickToOpenRate,
t2.OpenRate as t2_OpenRate,
t2.ClickToOpenRate as t2_ClickToOpenRate,
t3.OpenRate as t3_OpenRate,
t3.ClickToOpenRate as t3_ClickToOpenRate
from CTE t1
left join CTE t2
-- adjusted join condition
on ((t1.month_col = (CASE WHEN t1.month_col = 1 then t2.month_col - 11 END) and t1.year_col = t2.year_col + 1)
or (t1.month_col = (CASE WHEN t1.month_col != 1 then t2.month_col + 1 END) and t1.year_col = t2.year_col))
and t1.t_campaign_cmcategory = t2.t_campaign_cmcategory
and t1.t_country = t2.t_country
and t1.t_brand = t2.t_brand
left join CTE t3
on t1.month_col = t3.month_col
and t1.year_col = t3.year_col + 1
and t1.t_campaign_cmcategory = t3.t_campaign_cmcategory
and t1.t_country = t3.t_country
and t1.t_brand = t3.t_brand) as target_base) as final_tbl

Postgresql - Aggregate queries inside aggregate queries

I'm working on building a select statement for a sales rep commission report that uses postgresql tables. I want it to show these columns:
-Customer No.
-Part No.
-Month-to-date Qty (MTD Qty)
-Year-to-date Qty (YTD Qty)
-Month-to-date Extended Selling Price (MTD Extended)
-Year-to-date Extended Selling Price (YTD Extended)
The data is in two tables:
Sales_History (one record per invoice and this table includes Cust. No. and Invoice Date)
Sales_History_Items (one record per part no. per invoice and this table includes Part No., Qty and Unit Price).
If I do a simple query that combines these two tables, this is what it looks like:
Date / Cust / Part / Qty / Unit Price
Apr 1 / ABC Co. / WIDGET / 5 / $11
Apr 4 / ABC Co. / WIDGET / 8 / $11.50
Apr 1 / ABC Co. / GADGET / 1 / $30
Apr 7 / XYZ Co. / WIDGET / 3 / $11.50
etc.
This is the final result I want (one line per customer per part):
Cust / Part / Qty / MTD Qty / MTD Sales / YTD Qty / YTD Sales
ABC Co. / WIDGET / 13 / $147 / 1500 / $16,975
ABC Co. / GADGET / 1 / $30 / 7 / $210
XYZ Co. / WIDGET / 3 / $34.50 / 18 / $203.40
I’ve been able to come up with this SQL statement so far, which does not get me the extended selling columns (committed_qty * unit_price) per line and then summarize them by cust no./part no., and that’s my problem:
with mtd as
(SELECT sales_history.cust_no, part_no, Sum(sales_history_items.committed_qty) AS MTDQty
FROM sales_history left JOIN sales_history_items
ON sales_history.invoice_no = sales_history_items.invoice_no where sales_history_items.part_no is not null and sales_history.invoice_date >= '2020-04-01' and sales_history.invoice_date <= '2020-04-30'
GROUP BY sales_history.cust_no, sales_history_items.part_no),
ytd as
(SELECT sales_history.cust_no, part_no, Sum(sales_history_items.committed_qty) AS YTDQty
FROM sales_history left JOIN sales_history_items
ON sales_history.invoice_no = sales_history_items.invoice_no where sales_history_items.part_no is not null and sales_history.invoice_date >= '2020-01-01' and sales_history.invoice_date <= '2020-12-31' GROUP BY sales_history.cust_no, sales_history_items.part_no),
mysummary as
(select MTDQty, YTDQty, coalesce(ytd.cust_no,mtd.cust_no) as cust_no,coalesce(ytd.part_no,mtd.part_no) as part_no
from ytd full outer join mtd on ytd.cust_no=mtd.cust_no and ytd.part_no=mtd.part_no)
select * from mysummary;
I believe that I have to nest another couple of aggregate queries in here that would group by cust_no, part_no, unit_price but then have those extended price totals (qty * unit_price) sum up by cust_no, part_no.
Any assistance would be greatly appreciated. Thanks!
Do this in one go with filter expressions:
with params as (
select '2020-01-01'::date as year, 4 as month
)
SELECT h.cust_no, i.part_no,
SUM(i.committed_qty) AS YTDQty,
SUM(i.committed_qty * i.unit_price) as YTDSales,
SUM(i.committed_qty) FILTER
(WHERE extract('month' from h.invoice_date) = p.month) as MTDQty,
SUM(i.committed_qty * i.unit_price) FILTER
(WHERE extract('month' from h.invoice_date) = p.month) as MTDSales
FROM params p
CROSS JOIN sales_history h
LEFT JOIN sales_history_items i
ON i.invoice_no = h.invoice_no
WHERE i.part_no is not null
AND h.invoice_date >= p.year
AND h.invoice_date < p.year + interval '1 year'
GROUP BY h.cust_no, i.part_no
If I follow you correctly, you can do conditional aggregation:
select sh.cust_no, shi.part_no,
sum(shi.qty) mtd_qty,
sum(shi.qty * shi.unit_price) ytd_sales,
sum(shi.qty) filter(where sh.invoice_date >= date_trunc('month', current_date) mtd_qty,
sum(shi.qty * shi.unit_price) filter(where sh.invoice_date >= date_trunc('month', current_date) mtd_sales
from sales_history sh
left join sales_history_items shi on sh.invoice_no = shi.invoice_no
where shi.part_no is not null and sh.invoice_date >= date_trunc('year', current_date)
group by sh.cust_no, shi.part_no
The logic is to filter on the current year, and use simple aggregation to compute the "year to date" figures. To get the "month to date" columns, we can just filter the aggregate functions.

SQL Year over year growth percentage from data same query

How do I calculate the percentage difference from 2 different columns, calculated in that same query? Is it even possible?
This is what I have right now:
SELECT
Year(OrderDate) AS [Year],
Count(OrderID) AS TotalOrders,
Sum(Invoice.TotalPrice) AS TotalRevenue
FROM
Invoice
INNER JOIN Order
ON Invoice.InvoiceID = Order.InvoiceID
GROUP BY Year(OrderDate);
Which produces this table
Now I'd like to add one more column with the YoY growth, so even when 2016 comes around, the growth should be there..
EDIT:
I should clarify that I'd like to have for example next to
2015,5,246.28 -> 346,15942029% ((R2015-R2014) / 2014 * 100)
If you save your existing query as qryBase, you can use it as the data source for another query to get what you want:
SELECT
q1.Year,
q1.TotalOrders,
q1.TotalRevenue,
IIf
(
q0.TotalRevenue Is Null,
Null,
((q1.TotalRevenue - q0.TotalRevenue) / q0.TotalRevenue) * 100
) AS YoY_growth
FROM
qryBase AS q1
LEFT JOIN qryBase AS q0
ON q1.Year = (q0.Year + 1);
Access may complain it "can't represent the join expression q1.Year = (q0.Year + 1) in Design View", but you can still edit the query in SQL View and it will work.
What you are looking for is something like this?
Year Revenue Growth
2014 55
2015 246 4.47
2016 350 1.42
You could wrap the original query a twice to get the number from both years.
select orders.year, orders.orders, orders.revenue,
(select (orders.revenue/subOrders.revenue)
from
(
--originalQuery or table link
) subOrders
where subOrders.year = (orders.year-1)
) as lastYear
from
(
--originalQuery or table link
) orders
here's a cheap union'd table example.
select orders.year, orders.orders, orders.revenue,
(select (orders.revenue/subOrders.revenue)
from
(
select 2014 as year, 2 as orders, 55.20 as revenue
union select 2015 as year, 2 as orders, 246.28 as revenue
union select 2016 as year, 7 as orders, 350.47 as revenue
) subOrders
where subOrders.year = (orders.year-1)
) as lastYear
from
(
select 2014 as year, 2 as orders, 55.20 as revenue
union select 2015 as year, 2 as orders, 246.28 as revenue
union select 2016 as year, 7 as orders, 350.47 as revenue
) orders

Display top n records and consolidate the rest of the rows

I am relatively new to writing SQL. I have a requirement where I have to display the top 5 records as it is and consolidate the rest as 1 single record and append it as the 6th record. I know top 5 selects the top 5 records, but I am finding it difficult to put together a logic to consolidate the rest of the records and append it at the bottom of the result set.
weekof sales year weekno
-------------------------------------------------------------
07/01 - 07/07 2 2012 26
07/08 - 07/14 2 2012 27
07/29 - 08/04 1 2012 30
08/05 - 08/11 1 2012 31
08/12 - 08/18 32 2012 32
08/26 - 09/01 2 2012 34
09/02 - 09/08 8 2012 35
09/09 - 09/15 46 2012 36
09/16 - 09/22 26 2012 37
I want this to be displayed as
weekof sales
----------------------
09/16 - 09/22 26
09/09 - 09/15 46
09/02 - 09/08 8
08/26 - 09/01 2
08/12 - 08/18 32
07/01 - 08/11 6
Except when weekof spans years, this will get the data you want and in the correct order:
;WITH x AS
(
SELECT weekof, sales,
rn = ROW_NUMBER() OVER (ORDER BY [year] DESC, weekno DESC)
FROM dbo.table_name
)
SELECT weekof, sales FROM x WHERE rn <= 5
UNION ALL
SELECT MIN(LEFT(weekof, 5)) + ' - ' + MAX(RIGHT(weekof, 5)), SUM(sales)
FROM x WHERE rn > 5
ORDER BY weekof DESC;
When the rows being returned span a year, you may have to return the rn as well (and just ignore it at the presentation layer):
;WITH x AS
(
SELECT weekof, sales,
rn = ROW_NUMBER() OVER (ORDER BY [year] DESC, weekno DESC)
FROM dbo.table_name
)
SELECT weekof, sales, rn FROM x WHERE rn <= 5
UNION ALL
SELECT MIN(LEFT(weekof, 5)) + ' - ' + MAX(RIGHT(weekof, 5)), SUM(sales), rn = 6
FROM x WHERE rn > 5
ORDER BY rn;
Here's an alternate version that displays in the correct order even when crossing years, and requires a single scan instead of two (important only if the amount of data is large). It will work in SQL 2005 and up:
WITH Sales AS (
SELECT
Seq = Row_Number() OVER (ORDER BY S.yr DESC, S.weekno DESC),
*
FROM dbo.SalesByWeek S
)
SELECT
weekof =
Substring(Min(Convert(varchar(11), S.yr) + Left(S.weekof, 5)), 5, 5)
+ ' - '
+ Substring(Max(Convert(varchar(11), S.yr) + Right(S.weekof, 5)), 5, 5),
sales = Sum(S.sales)
FROM
Sales S
OUTER APPLY (SELECT S.Seq WHERE S.Seq <= 5) G
GROUP BY G.Seq
ORDER BY Coalesce(G.Seq, 2147483647)
;
There's also a query possible that might perform even better for very large data sets, since it doesn't calculate row number (using TOP once instead). But this is all theoretical without real data and performance testing:
SELECT
weekof =
Substring(Min(Convert(varchar(11), S.yr) + Left(S.weekof, 5)), 5, 5)
+ ' - '
+ Substring(Max(Convert(varchar(11), S.yr) + Right(S.weekof, 5)), 5, 5),
sales = Sum(S.sales)
FROM
dbo.SalesByWeek S
LEFT JOIN (
SELECT TOP 5 *
FROM dbo.SalesByWeek
ORDER BY yr DESC, weekno DESC
) G ON S.yr = G.yr AND S.weekno = G.weekno
GROUP BY G.yr, G.weekno
ORDER BY Coalesce(G.yr, 9999), G.weekno
;
Ultimately, I really don't like the string manipulation to pull out the date ranges. It would be better if the table had the MinDate and MaxDate as separate columns, and then creating a label for them would be much simpler.

sql query to calculate monthly growth percentage

I need to build a query with 4 columns (sql 2005).
Column1: Product
Column2: Units sold
Column3: Growth from previous month (in %)
Column4: Growth from same month last year (in %)
In my table the year and months have custom integer values. For example, the most current month is 146 - but also the table has a year (eg 2011) column and month (eg 7) column.
Is it possible to get this done in one query or do i need to start employing temp tables etc??
Appreciate any help.
thanks,
KS
KS,
To do this on the fly, you could use subqueries.
SELECT product, this_month.units_sold,
(this_month.sales-last_month.sales)*100/last_month.sales,
(this_month.sales-last_year.sales)*100/last_year.sales
FROM (SELECT product, SUM(units_sold) AS units_sold, SUM(sales) AS sales
FROM product WHERE month = 146 GROUP BY product) AS this_month,
(SELECT product, SUM(units_sold) AS units_sold, SUM(sales) AS sales
FROM product WHERE month = 145 GROUP BY product) AS last_month,
(SELECT product, SUM(units_sold) AS units_sold, SUM(sales) AS sales
FROM product WHERE month = 134 GROUP BY product) AS this_year
WHERE this_month.product = last_month.product
AND this_month.product = last_year.product
If there's a case where a product was sold in one month but not another month, you will have to do a left join and check for null values, especially if last_month.sales or last_year.sales is 0.
I hope I got them all:
SELECT
Current_Month.product_name, units_sold_current_month,
units_sold_last_month * 100 / units_sold_current_month prc_last_month,
units_sold_last_year * 100 / units_sold_current_month prc_last_year
FROM
(SELECT product_id, product_name, sum(units_sold) units_sold_current_month FROM MyTable WHERE YEAR = 2011 AND MONTH = 7) Current_Month
JOIN
(SELECT product_id, product_name, sum(units_sold) units_sold_last_month FROM MyTable WHERE YEAR = 2011 AND MONTH = 6) Last_Month
ON Current_Month.product_id = Last_Month.product_id
JOIN
(SELECT product_id, product_name, sum(units_sold) units_sold_last_year FROM MyTable WHERE YEAR = 2010 AND MONTH = 7) Last_Year
ON Current_Month.product_id = Last_Year.product_id
I am slightly guessing as the structure of the table provided is the result table, right? You will need to do self-join on month-to-previous-month basis:
SELECT <growth computation here>
FROM SALES s1 LEFT JOIN SALES s2 ON (s1.month = s2.month-1) -- last month join
LEFT JOIN SALES s3 ON (s1.month = s3.month - 12) -- lat year join
where <growth computation here> looks like
((s1.sales - s2.sales)/s2.sales * 100),
((s1.sales - s3.sales)/s3.sales * 100)
I use LEFT JOIN for months that have no previous months. Change your join conditions based on actual relations in month/year columns.