Finding sales growth from cumulative totals over period in SQL - sql

SELECT CUST_ID,CONTACTS
Sum("CONTACTS") Over (PARTITION by "CUST_ID" Order By "end_Period" ROWS UNBOUNDED PRECEDING) as RunningContacts,
"SALES",
Sum("SALES") Over (PARTITION by "CUST_ID" Order By "end_Period" ROWS UNBOUNDED PRECEDING) as RunningSales,
end_Period
FROM Table2
I have currently created the Running growth column in excel formula is (New Runningsales - Previous Running sales) / Previous RunningSales.
Any help here is appreciated.

Are you looking for this?
select t.*,
RunningSales / (Running - Sales) - 1
from (< your query here > ) x

The SQL derived table can hold a query to aggregate sales by period, and you an join such to itself to compare each period to the prior period.

Related

Postgres - AVG calculation

Please refer to the below query
SELECT sum(sales) AS "Sales",
sum(discount) AS "discount",
year
FROM Sales_tbl
WHERE Group by year
Now I want to also display a column for AVG(sales) that is the same value and based on the total of sales column
Output
Please advise
Use AVG() as a window function:
WITH t AS (
SELECT
SUM(sales) AS sales, SUM(discount) AS discount, year
FROM tbl_sales
GROUP BY year
)
SELECT *,AVG(sales) OVER w_total
FROM t
WINDOW w_total AS (RANGE BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING)
ORDER BY year;
The frame RANGE BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING is pretty much optional in this case, but it is considered a good practice to be as explicit as possible in window functions. So you're also able to write the query like this:
WITH t AS (
SELECT
SUM(sales) AS sales, SUM(discount) AS discount, year
FROM tbl_sales
GROUP BY year
)
SELECT *,AVG(sales) OVER ()
FROM t
ORDER BY year;
Demo: db<>fiddle

Faster alternative of MIN/MAX in SQL Server

I need the lowest/highest price of stocks for the past n days. The following query works really slow. I would appreciate faster alternative:
SELECT
*,
MIN(Close) OVER (PARTITION BY Ticker ORDER BY PriceDate ROWS BETWEEN 14 PRECEDING AND 1 PRECEDING) AS MinPrice14d,
MAX(Close) OVER (PARTITION BY Ticker ORDER BY PriceDate ROWS BETWEEN 14 PRECEDING AND 1 PRECEDING) AS MaxPrice14d
FROM
(SELECT CompanyID, Ticker, PriceDate, Close
FROM price.PriceHistoryDaily) a
I need the columns specified.
It is trailing, so I need it day by day.
As for period, I will limit it to one year.
Although it doesn't affect the performance, no subquery is needed. So start with the simpler version:
SELECT phd.CompanyID, phd.Ticker, phd.PriceDate, phd.Close,
min(Close) over (partition by Ticker
order by PriceDate
rows between 14 preceding and 1 preceding
) as MinPrice14d,
max(Close) over (partition by Ticker
order by PriceDate
rows between 14 preceding and 1 preceding
) as MaxPrice14d
FROM price.PriceHistoryDaily phd;
Then try adding an index: PriceHistoryDaily(Ticker, PriceDate).
Note: That this returns all rows from PriceHistoryDaily and -- depending on the size of the table -- that might be what is driving the performance.

PostgreSQL subquery - calculating average of lagged values

I am looking at Sales Rates by month, and was able to query the 1st table. I am quite new to PostgreSQL and am trying to figure out how I can query the second (I had to do the 2nd one in Excel for now)
I have the current Sales Rate and I would like to compare it to the Sales Rate 1 and 2 months ago, as an averaged rate.
I am not asking for an answer how exactly to solve it because this is not the point of getting better, but just for hints for functions to use that are specific to PostgreSQL. What I am trying to calculate is the 2 month average in the 2nd table based on the lagged values of the 2nd table. Thanks!
Here is the query for the 1st table:
with t1 as
(select date,
count(sales)::numeric/count(poss_sales) as SR_1M_before
from data
where date between '2019-07-01' and '2019-11-30'
group by 1),
t2 as
(select date,
count(sales)::numeric/count(poss_sales) as SR_2M_before
from data
where date between '2019-07-01' and '2019-10-31'
group by 1)
select t0.date,
count(t0.sales)::numeric/count(t0.poss_sales) as Sales_Rate
t1.SR_1M_before,
t2.SR_2M_before
from data as t0
left join t1 on t0.date=t1.date
left join t2 on t0.date=t1.date
where date between '2019-07-01' and '2019-12-31'
group by 1,3,4
order by 1;
As commented by a_horse_with_no_name, you can use window functions to take the average of the two previous monthes with a range clause:
select
date,
count(sales)::numeric/count(poss_sales) as Sales_Rate,
avg(count(sales)::numeric/count(poss_sales)) over(
order by date
rows between '2 month' preceding and '1 month' preceding
) Sales_Rate,
count(sales)::numeric/count(poss_sales) as Sales_Rate
- avg(count(sales)::numeric/count(poss_sales)) over(
order by date
rows between '2 month' preceding and '1 month' preceding
) PercentDeviation
from data
where date between '2019-07-01' and '2019-12-31'
group by date
order by date;
Your data is a bit confusing -- it would be less confusing if you had decimal places (that is, 58% being the average of 57% and 58% is not obvious).
Because you want to have NULL values on the first two rows, I'm going to calculate the values using sum() and count():
with q as (
<whatever generates the data you have shown>
)
select q.*,
(sum(sales_rate) over (order by date
rows between 2 preceding and 1 preceding
) /
nullif(count(*) over (order by date
rows between 2 preceding and 1 preceding
)
) as two_month_average
from q;
You could also express this using case and avg():
select q.*,
(case when row_number() over (order by date) > 2)
then avg(sales_rate) over (order by date
rows between 2 preceding and 1 preceding
)
end) as two_month_average
from q;

Forward Rolling Average in Hive Query

I want to calculate the rolling average looking forward on "4 day window" basis. Please find the details below
Create table stock(day int, time String, cost float);
Insert into stock values(1,"8 AM",3.1);
Insert into stock values(1,"9 AM",3.2);
Insert into stock values(1,"10 AM",4.5);
Insert into stock values(1,"11 AM",5.5);
Insert into stock values(2,"8 AM",5.1);
Insert into stock values(2,"9 AM",2.2);
Insert into stock values(2,"10 AM",1.5);
Insert into stock values(2,"11 AM",6.5);
Insert into stock values(3,"8 AM",8.1);
Insert into stock values(3,"9 AM",3.2);
Insert into stock values(3,"10 AM",2.5);
Insert into stock values(3,"11 AM",4.5);
Insert into stock values(4,"8 AM",3.1);
Insert into stock values(4,"9 AM",1.2);
Insert into stock values(4,"10 AM",0.5);
Insert into stock values(4,"11 AM",1.5);
I wrote the below query
select day, cost,sum(cost) over (order by day range between current row and 4 Following), avg(cost) over (order by day range between current row and 4 Following)
from stock
As you can see, I get 4 records for each day and I need to calculate rolling average on 4 day window. For this, I wrote the above window query, as I have the data only for 4 days each day containing 4 records, the sum for the first day will be the total of all the 16 records. Based on this the first record will have the sum of 56.20 which is correct and the average should be 56.20/4 (as there are 4 days), but it is doing 56.20/16 as there 16 records in total. How do I fix the average part of this?
Thanks
Raj
Is this what you want?
select t.*,
avg(cost) over (order by day range between current row and 4 following)
from t;
EDIT:
What you seem to want is:
select t.*,
(sum(cost) over (order by day range between current row and 3 following) /
count(distinct day) over (order by day range between current row and 3 following)
)
from t;
But, Hive does not support this. You can use a subquery for this purpose:
select t.*,
(sum(cost) over (order by day range between current row and 3 following) /
sum(case when seqnum = 1 then 1 else 0 end) over (order by day range between current row and 3 following)
)
from (select t.*
row_number() over (partition by day order by time) as seqnum
from t
)t

Use a regular aggregative function (sum) alongside a window function

I was reading this tutorial on how to calculate running totals.
Copying the suggested approach I have a query of the form:
select
date,
sum(sales) over (order by date rows unbounded preceding) as cumulative_sales
from sales_table;
This works fine and does what I want - a running total by date.
However, in addition to the running total, I'd also like to add daily sales:
select
date,
sum(sales),
sum(sales) over (order by date rows unbounded preceding) as cumulative_sales
from sales_table
group by 1;
This throws an error:
SYNTAX_ERROR: line 6:8: '"sum"("sales") OVER (ORDER BY "activity_date" ASC ROWS UNBOUNDED PRECEDING)' must be an aggregate expression or appear in GROUP BY clause
How can I calculate both daily total as well as running total?
I think you can try it, but it will repeat your daily_sales. In this way you don't need to group by your date field.
SELECT date,
SUM(sales) OVER (PARTITION BY DATE) as daily_sales
SUM(sales) OVER (ORDER BY DATE ROWS UNBOUNDED PRECEDING) as cumulative_sales
FROM sales_table;
Presumably, you intend an aggregation query to begin with:
select date, sum(sales) as daily_sales,
sum(sum(sales)) over (order by date rows unbounded preceding) as cumulative_sales
from sales_table
group by date
order by date;