Use a regular aggregate function (sum) alongside a window function - sql

I was reading this tutorial on how to calculate running totals.
Copying the suggested approach I have a query of the form:
select
date,
sum(sales) over (order by date rows unbounded preceding) as cumulative_sales
from sales_table;
This works fine and does what I want - a running total by date.
However, in addition to the running total, I'd also like to add daily sales:
select
date,
sum(sales),
sum(sales) over (order by date rows unbounded preceding) as cumulative_sales
from sales_table
group by 1;
This throws an error:
SYNTAX_ERROR: line 6:8: '"sum"("sales") OVER (ORDER BY "activity_date" ASC ROWS UNBOUNDED PRECEDING)' must be an aggregate expression or appear in GROUP BY clause
How can I calculate both daily total as well as running total?

You can try the following, which avoids grouping by the date field. Note that it repeats daily_sales on every row that shares the same date.
SELECT date,
SUM(sales) OVER (PARTITION BY date) AS daily_sales,
SUM(sales) OVER (ORDER BY date ROWS UNBOUNDED PRECEDING) AS cumulative_sales
FROM sales_table;

Presumably, you intend an aggregation query to begin with:
select date, sum(sales) as daily_sales,
sum(sum(sales)) over (order by date rows unbounded preceding) as cumulative_sales
from sales_table
group by date
order by date;
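To see the nested form in action, here is a minimal sketch that runs the same shape of query against SQLite through Python's sqlite3 module. SQLite is only a stand-in for whatever engine the question uses, the sample rows are made up, and it assumes a SQLite build with window function support (3.25 or later).

```python
import sqlite3

# Hypothetical sample data; table and column names follow the question.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales_table (date TEXT, sales INTEGER)")
conn.executemany(
    "INSERT INTO sales_table VALUES (?, ?)",
    [("2024-01-01", 10), ("2024-01-01", 5), ("2024-01-02", 20), ("2024-01-03", 7)],
)

# The inner SUM(sales) is the GROUP BY aggregate (the daily total).
# The outer SUM(...) OVER is a window function applied to those daily totals.
daily_and_running = conn.execute(
    """
    SELECT date,
           SUM(sales) AS daily_sales,
           SUM(SUM(sales)) OVER (ORDER BY date ROWS UNBOUNDED PRECEDING) AS cumulative_sales
    FROM sales_table
    GROUP BY date
    ORDER BY date
    """
).fetchall()

for row in daily_and_running:
    print(row)
```

Each output row holds the date, that day's total, and the running total up to and including that day.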

Related

SQL Query for YTD & MTD

I am trying to find YTD and MTD totals for total sales. The total sales is derived by multiplying "Order Quantity" and "Unit price" and subtracting "Discount_Applied".
Here is my query,
select orderdate,datename(month,orderdate) as Mnth,year(orderdate) as YR, sum((unit_price*order_quantity)-discount_applied) as Total_Sales,
sum((unit_price*order_quantity)-discount_applied) over (partition by year(orderdate) order by orderdate) as YTD,
sum((unit_price*order_quantity)-discount_applied) over (partition by year(orderdate),datename(month,orderdate) order by orderdate) as MTD
from sales
group by orderdate,datename(month,orderdate),year(orderdate)
However, when I run this query, it gives me an error saying:
"Column 'sales.Unit_Price' is invalid in the select list because it is not contained in either an aggregate function or the GROUP BY clause."
I guess it has something to do with the window function I am using, but I can't figure out the specific problem. Can someone help?
Window functions are evaluated after any GROUP BY aggregation. At that point the ungrouped columns (unit_price, order_quantity, discount_applied) no longer exist; only the grouping columns and the aggregates do.
Instead, you need to SUM the SUM, that is, take a windowed sum over the aggregate sum.
SELECT
s.orderdate,
DATENAME(month, s.orderdate) as Mnth,
YEAR(s.orderdate) as YR,
SUM((s.unit_price * s.order_quantity) - s.discount_applied) as Total_Sales,
SUM(SUM((s.unit_price * s.order_quantity) - s.discount_applied)) OVER
(PARTITION BY YEAR(s.orderdate) ORDER BY EOMONTH(s.orderdate), s.orderdate ROWS UNBOUNDED PRECEDING) as YTD,
SUM(SUM((s.unit_price * s.order_quantity) - s.discount_applied)) OVER
(PARTITION BY YEAR(s.orderdate), EOMONTH(s.orderdate) ORDER BY s.orderdate ROWS UNBOUNDED PRECEDING) as MTD
FROM sales s
GROUP BY
s.orderdate;
Note also that EOMONTH is a little more efficient than DATENAME, and that adding the month to the YTD ordering means it can use the same sort, without affecting the calculation.
Also ROWS UNBOUNDED PRECEDING is a little more efficient than the default RANGE UNBOUNDED PRECEDING.
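A rough way to check the YTD/MTD logic is to run the same pattern on a few rows. The sketch below uses SQLite through Python's sqlite3 module instead of SQL Server, so strftime replaces YEAR and EOMONTH as the partition expressions; that substitution and the sample rows are assumptions for illustration only.

```python
import sqlite3

# Hypothetical rows: (orderdate, unit_price, order_quantity, discount_applied).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sales (orderdate TEXT, unit_price INTEGER, "
    "order_quantity INTEGER, discount_applied INTEGER)"
)
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?, ?)",
    [
        ("2023-12-30", 10, 1, 0),
        ("2024-01-05", 5, 2, 1),
        ("2024-01-05", 3, 1, 0),
        ("2024-01-20", 4, 2, 0),
        ("2024-02-01", 10, 1, 2),
    ],
)

# Windowed SUM over the per-day aggregate SUM.
# strftime('%Y', ...) plays the role of YEAR(), and strftime('%Y-%m', ...)
# identifies the month the way EOMONTH() does in the answer.
ytd_mtd = conn.execute(
    """
    SELECT orderdate,
           SUM(unit_price * order_quantity - discount_applied) AS total_sales,
           SUM(SUM(unit_price * order_quantity - discount_applied)) OVER (
               PARTITION BY strftime('%Y', orderdate)
               ORDER BY orderdate ROWS UNBOUNDED PRECEDING) AS ytd,
           SUM(SUM(unit_price * order_quantity - discount_applied)) OVER (
               PARTITION BY strftime('%Y-%m', orderdate)
               ORDER BY orderdate ROWS UNBOUNDED PRECEDING) AS mtd
    FROM sales
    GROUP BY orderdate
    ORDER BY orderdate
    """
).fetchall()

for row in ytd_mtd:
    print(row)
```

The YTD column restarts when the year changes and the MTD column restarts when the month changes, while total_sales stays the plain per-day aggregate.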

Postgres - AVG calculation

Please refer to the below query
SELECT sum(sales) AS "Sales",
sum(discount) AS "discount",
year
FROM Sales_tbl
GROUP BY year
Now I also want to display a column with AVG(sales): the same value on every row, computed from the values in the total sales column.
Output
Please advise
Use AVG() as a window function:
WITH t AS (
SELECT
SUM(sales) AS sales, SUM(discount) AS discount, year
FROM tbl_sales
GROUP BY year
)
SELECT *,AVG(sales) OVER w_total
FROM t
WINDOW w_total AS (RANGE BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING)
ORDER BY year;
The frame RANGE BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING is pretty much optional in this case, but it is considered a good practice to be as explicit as possible in window functions. So you're also able to write the query like this:
WITH t AS (
SELECT
SUM(sales) AS sales, SUM(discount) AS discount, year
FROM tbl_sales
GROUP BY year
)
SELECT *,AVG(sales) OVER ()
FROM t
ORDER BY year;
Demo: db<>fiddle
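As a quick illustration of the empty OVER () form, this sketch runs the CTE query on SQLite via Python's sqlite3 module rather than Postgres. The sample rows are invented, and tbl_sales follows the table name used in the answer.

```python
import sqlite3

# Hypothetical rows: (year, sales, discount).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tbl_sales (year INTEGER, sales REAL, discount REAL)")
conn.executemany(
    "INSERT INTO tbl_sales VALUES (?, ?, ?)",
    [(2020, 100, 10), (2020, 50, 5), (2021, 300, 0), (2022, 150, 20)],
)

# The CTE aggregates per year; AVG(sales) OVER () then averages those
# yearly totals across the whole result set and repeats it on every row.
yearly_with_avg = conn.execute(
    """
    WITH t AS (
        SELECT year, SUM(sales) AS sales, SUM(discount) AS discount
        FROM tbl_sales
        GROUP BY year
    )
    SELECT year, sales, discount, AVG(sales) OVER () AS avg_sales
    FROM t
    ORDER BY year
    """
).fetchall()

for row in yearly_with_avg:
    print(row)
```

Note that the average is taken over the yearly totals, not over the individual rows of tbl_sales.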

SQL Rolling LTV (Lifetime Value)

I am trying to get a rolling calculation of customer lifetime value. The basic formula I am using would be 'SUM(revenue) / COUNT(DISTINCT CUSTOMERS)', but I am running into issues when trying to compute those numbers from a given day looking backward. The code below isn't correct; I also tried code using PARTITION, which didn't work either.
CREATE TEMP TABLE customer_revenue AS
(
SELECT TRUNC(timestamp) AS "order_date", COUNT(DISTINCT customer_email) AS "customers",
SUM(revenue)-SUM(discount)-SUM(shipping)-SUM(tax) AS "revenue"
FROM public.fact_shopify_orders
GROUP BY TRUNC(timestamp)
);
SELECT TRUNC(SO.timestamp) AS "date", SUM(CR.revenue) / COUNT(customers) AS "LTV"
FROM customer_revenue CR
LEFT JOIN public.fact_shopify_orders SO ON CR.order_date = SO.timestamp
WHERE CR.order_date <= SO.timestamp
GROUP BY TRUNC(SO.timestamp)
ORDER BY TRUNC(SO.timestamp) DESC
I think you want rolling sums and count(distinct). The latter is a little tricky but you can emulate it easily using a flag based on the first time the customer is seen:
SELECT date,
( SUM(SUM(net_revenue)) OVER (ORDER BY date ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) /
SUM(SUM( (seqnum = 1)::int )) OVER (ORDER BY date ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW)
) as LTV
FROM (SELECT so.*, TRUNC(SO.timestamp) as date,
(revenue - discount - shipping - tax) as net_revenue,
ROW_NUMBER() OVER (PARTITION BY customer_email ORDER BY timestamp) as seqnum
FROM public.fact_shopify_orders so
) so
GROUP BY date;
EDIT:
I think Redshift supports window functions with aggregation . . . but there is some database out there that does not. You can try this:
SELECT date,
( SUM(net_revenue) OVER (ORDER BY date ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) /
SUM(num_firsts) OVER (ORDER BY date ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW)
) as LTV
FROM (SELECT date, SUM(net_revenue) as net_revenue,
SUM( (seqnum = 1)::int ) as num_firsts
FROM (SELECT so.*, TRUNC(SO.timestamp) as date,
(revenue - discount - shipping - tax) as net_revenue,
ROW_NUMBER() OVER (PARTITION BY customer_email ORDER BY timestamp) as seqnum
FROM public.fact_shopify_orders so
) so
GROUP BY date
) so;
Here is a similar version running in Postgres.
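The first-seen flag trick can be sketched on a handful of rows as follows. This runs on SQLite via Python's sqlite3 module, not Redshift, so date(ts) stands in for TRUNC(timestamp) and the comparison seqnum = 1 is summed directly (SQLite evaluates it to 0 or 1, so no ::int cast is needed). The orders table, its columns, and the rows are hypothetical.

```python
import sqlite3

# Hypothetical rows: (ts, customer_email, net_revenue).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (ts TEXT, customer_email TEXT, net_revenue REAL)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [
        ("2024-01-01 09:00", "a@x", 10),
        ("2024-01-01 10:00", "b@x", 20),
        ("2024-01-02 09:00", "a@x", 30),
        ("2024-01-03 09:00", "c@x", 40),
        ("2024-01-03 12:00", "b@x", 20),
    ],
)

# Innermost query: number each customer's orders, so seqnum = 1 marks the first one.
# Middle query: per day, sum revenue and count first-time customers.
# Outer query: running revenue divided by running count of distinct customers.
rolling_ltv = conn.execute(
    """
    SELECT day,
           SUM(net_revenue) OVER (ORDER BY day ROWS UNBOUNDED PRECEDING)
           / SUM(num_firsts) OVER (ORDER BY day ROWS UNBOUNDED PRECEDING) AS ltv
    FROM (
        SELECT day,
               SUM(net_revenue) AS net_revenue,
               SUM(seqnum = 1) AS num_firsts
        FROM (
            SELECT date(ts) AS day,
                   net_revenue,
                   ROW_NUMBER() OVER (PARTITION BY customer_email ORDER BY ts) AS seqnum
            FROM orders
        )
        GROUP BY day
    )
    ORDER BY day
    """
).fetchall()

for row in rolling_ltv:
    print(row)
```

Because each customer contributes a 1 only on the day of their first order, the running sum of num_firsts equals the number of distinct customers seen so far.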

Finding sales growth from cumulative totals over period in SQL

SELECT CUST_ID, CONTACTS,
Sum("CONTACTS") Over (PARTITION by "CUST_ID" Order By "end_Period" ROWS UNBOUNDED PRECEDING) as RunningContacts,
"SALES",
Sum("SALES") Over (PARTITION by "CUST_ID" Order By "end_Period" ROWS UNBOUNDED PRECEDING) as RunningSales,
end_Period
FROM Table2
I currently compute the running growth column in Excel. The formula is (new RunningSales - previous RunningSales) / previous RunningSales.
How can I do the same in SQL? Any help is appreciated.
Are you looking for this?
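select x.*,
-- previous running total = RunningSales - Sales; nullif avoids division by zero in the first period
RunningSales / nullif(RunningSales - Sales, 0) - 1 as RunningGrowth
from (< your query here >) x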
The derived table can hold a query that aggregates sales by period, and you can join it to itself to compare each period to the prior period.
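For a concrete check of the growth arithmetic, here is a small sketch on SQLite via Python's sqlite3 module. It relies on the identity that the previous running total equals the current running total minus the current period's sales. The lowercase names, the NULLIF guard for the first period, and the sample rows are assumptions added for illustration.

```python
import sqlite3

# Hypothetical rows: (cust_id, end_period, sales).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table2 (cust_id INTEGER, end_period INTEGER, sales REAL)")
conn.executemany(
    "INSERT INTO table2 VALUES (?, ?, ?)",
    [(1, 1, 100), (1, 2, 50), (1, 3, 150)],
)

# running_sales - sales is the previous running total, so
# running_sales / (running_sales - sales) - 1 equals (new - previous) / previous.
# NULLIF turns the zero denominator of the first period into NULL.
growth = conn.execute(
    """
    SELECT cust_id, end_period, sales, running_sales,
           running_sales / NULLIF(running_sales - sales, 0) - 1 AS running_growth
    FROM (
        SELECT cust_id, end_period, sales,
               SUM(sales) OVER (PARTITION BY cust_id ORDER BY end_period
                                ROWS UNBOUNDED PRECEDING) AS running_sales
        FROM table2
    ) x
    ORDER BY cust_id, end_period
    """
).fetchall()

for row in growth:
    print(row)
```

The first period has no previous running total, so its growth comes back as NULL (None in Python) instead of raising a division error.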

How to group same multiple window functions as one and call by an alias wherever needed in the query?

How can I address the issue of having the same window function multiple times in a single SQL query for different aggregations? Is there any way I can alias it and call it multiple times as needed in the query.
I tried using 'Window' clause for the same but SQL Server currently doesn't support the 'Window' clause.
select empid, qty,
sum(qty) over (partition by empid order by month rows between unbounded preceding and current row) as running_sum,
avg(qty) over (partition by empid order by month rows between unbounded preceding and current row) as running_avg,
min(qty) over (partition by empid order by month rows between unbounded preceding and current row) as running_min,
max(qty) over (partition by empid order by month rows between unbounded preceding and current row) as running_max
from employee
Is there a way to remove the redundancy in the code?
Not in older versions of SQL Server. ANSI SQL supports a WINDOW clause for defining windows that can be reused, but SQL Server only added support for it in SQL Server 2022.
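I think you can slightly simplify your logic, because the default frame when ORDER BY is present is RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW. This gives the same results as long as month is unique within each empid: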
select empid, qty,
sum(qty) over (partition by empid order by month) as running_sum,
avg(qty) over (partition by empid order by month) as running_avg,
min(qty) over (partition by empid order by month) as running_min,
max(qty) over (partition by empid order by month) as running_max
from employee;
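For engines that do accept the standard WINDOW clause, the de-duplicated query might look like the sketch below. It runs on SQLite through Python's sqlite3 module purely as an example of such an engine; the window name w and the sample rows are made up.

```python
import sqlite3

# Hypothetical rows: (empid, month, qty).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employee (empid INTEGER, month INTEGER, qty INTEGER)")
conn.executemany(
    "INSERT INTO employee VALUES (?, ?, ?)",
    [(1, 1, 5), (1, 2, 3), (1, 3, 10)],
)

# The window is defined once as w and reused by all four functions.
running_stats = conn.execute(
    """
    SELECT empid, month, qty,
           SUM(qty) OVER w AS running_sum,
           AVG(qty) OVER w AS running_avg,
           MIN(qty) OVER w AS running_min,
           MAX(qty) OVER w AS running_max
    FROM employee
    WINDOW w AS (PARTITION BY empid ORDER BY month
                 ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW)
    ORDER BY empid, month
    """
).fetchall()

for row in running_stats:
    print(row)
```

Defining the window once keeps the partitioning, ordering, and frame in a single place, so a later change cannot leave the four columns out of sync.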