SQL (Redshift) - Rolling average with all preceding values

I have a table like so:
Day Value
1 3
1 5
1 1
2 4
2 7
3 1
3 1
3 2
3 5
How do I create a rolling average that takes into account all previous days to produce a table like so:
Day Rolling_avg
1 3
2 4
3 3.22
Day1 = avg(day 1 values)
Day2 = avg(day1 + day2 values)
Day3 = avg(day1 + day2 + day3 values)
and so on. Thank you!

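First aggregate by day to get the sum and count of values for each day. Then use analytic functions to compute the rolling average from the running totals.
WITH cte AS (
    SELECT Day, SUM(Value) AS ValueSum, COUNT(*) AS Count
    FROM yourTable
    GROUP BY Day
)
-- Redshift requires an explicit frame clause when an aggregate window function has ORDER BY.
-- Multiplying by 1.0 avoids integer division when Value is an integer column.
SELECT Day,
       1.0 * SUM(ValueSum) OVER (ORDER BY Day ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) /
       SUM(Count) OVER (ORDER BY Day ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS Rolling_avg
FROM cte
ORDER BY Day;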
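To check the approach against the sample data, here is a minimal sketch in Python using the standard-library sqlite3 module. It assumes SQLite 3.25 or later as a stand-in for Redshift; the alias Cnt and the in-memory table are placeholders chosen for this illustration.

```python
import sqlite3

# In-memory SQLite database standing in for Redshift (assumption for this sketch).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE yourTable (Day INTEGER, Value INTEGER)")
conn.executemany(
    "INSERT INTO yourTable VALUES (?, ?)",
    [(1, 3), (1, 5), (1, 1), (2, 4), (2, 7), (3, 1), (3, 1), (3, 2), (3, 5)],
)

query = """
WITH cte AS (
    SELECT Day, SUM(Value) AS ValueSum, COUNT(*) AS Cnt
    FROM yourTable
    GROUP BY Day
)
SELECT Day,
       1.0 * SUM(ValueSum) OVER (ORDER BY Day ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW)
           / SUM(Cnt) OVER (ORDER BY Day ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW)
           AS Rolling_avg
FROM cte
ORDER BY Day
"""

# Round in Python for display; the query itself returns the unrounded average.
rolling = [(day, round(avg, 2)) for day, avg in conn.execute(query)]
for day, avg in rolling:
    print(day, avg)
```

Dividing the running sum by the running count, rather than averaging the daily averages, is what makes days with more rows weigh more, which is why day 3 comes out as 3.22.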

Related

SQL query for incoming and outgoing stocks, first and last

I need to write a query that shows sales and stocks (incoming and outgoing) for each model in October 2021.
To obtain the incoming and outgoing stocks, I need vt_stocks_cube_sz.qty for the first day of the month and for the last day of the month, respectively.
So far I have only written the sum of stocks (SUM(vt_stocks_cube_sz.qty) as stocks), which isn't correct.
Could you help me split the stocks according to the rule above? I cannot work out how to write the query correctly.
%%time
SELECT vt_sales_cube_sz.modc_barc2 model,
       SUM(vt_sales_cube_sz.qnt) sales,
       SUM(vt_stocks_cube_sz.qty) as stocks
FROM vt_sales_cube_sz
LEFT JOIN vt_date_cube2
    ON vt_sales_cube_sz.id_calendar_int = vt_date_cube2.id_calendar_int
LEFT JOIN vt_stocks_cube_sz ON
    vt_stocks_cube_sz.parent_modc_barc = vt_sales_cube_sz.modc_barc AND
    vt_stocks_cube_sz.id_stock = vt_sales_cube_sz.id_stock AND
    vt_stocks_cube_sz.id_calendar_int = vt_sales_cube_sz.id_calendar_int AND
    vt_stocks_cube_sz.vipusk_type = vt_sales_cube_sz.price_type
WHERE vt_date_cube2.wk_year_id = 2021
  AND vt_date_cube2.wk_MoY_id = 10
  AND vt_sales_cube_sz.id_stock IN
      (SELECT id_stock
       FROM vt_warehouse_cube
       WHERE channel = 'OffLine')
GROUP BY vt_sales_cube_sz.modc_barc2
If you're looking for a robust and generalizable approach I'd suggest using analytic functions such as FIRST_VALUE, LAST_VALUE or something slightly different with RANK or ROW_NUMBER.
A simple example follows, so you can rerun it on your side and adjust it to the specific tables/fields you're using.
N.B.: You might need tiebreakers if there are multiple entries for the same first/last day.
with dummy_table as (
    SELECT 1 as month, 1 as day, 10 as value UNION ALL
    SELECT 1 as month, 2 as day, 20 as value UNION ALL
    SELECT 1 as month, 3 as day, 30 as value UNION ALL
    SELECT 2 as month, 1 as day, 5 as value UNION ALL
    SELECT 2 as month, 3 as day, 15 as value UNION ALL
    SELECT 2 as month, 5 as day, 25 as value
)
SELECT
    month,
    day,
    case when day = first_day then 'first' else 'last' end as type,
    value  -- no trailing comma here: most dialects reject one before FROM
FROM (
    SELECT *
    , FIRST_VALUE(day) over (partition by month order by day ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) as first_day
    , LAST_VALUE(day) over (partition by month order by day ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) as last_day
    FROM dummy_table
) tmp
WHERE day = first_day OR day = last_day
Dummy table:
Row month day value
1 1 1 10
2 1 2 20
3 1 3 30
4 2 1 5
5 2 3 15
6 2 5 25
Result:
Row month day type value
1 1 1 first 10
2 1 3 last 30
3 2 1 first 5
4 2 5 last 25
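The ROW_NUMBER alternative mentioned above could look like the following minimal sketch, run here through Python's standard-library sqlite3 module (SQLite 3.25 or later assumed). Using value as the tiebreaker is a hypothetical choice for illustration; pick whatever column makes sense when several rows share the same first or last day.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE dummy_table (month INTEGER, day INTEGER, value INTEGER)")
conn.executemany(
    "INSERT INTO dummy_table VALUES (?, ?, ?)",
    [(1, 1, 10), (1, 2, 20), (1, 3, 30), (2, 1, 5), (2, 3, 15), (2, 5, 25)],
)

query = """
SELECT month, day,
       CASE WHEN rn_first = 1 THEN 'first' ELSE 'last' END AS type,
       value
FROM (
    SELECT *,
           -- number rows from the start and from the end of each month;
           -- value is an assumed tiebreaker so exactly one row gets rank 1
           ROW_NUMBER() OVER (PARTITION BY month ORDER BY day ASC, value ASC) AS rn_first,
           ROW_NUMBER() OVER (PARTITION BY month ORDER BY day DESC, value DESC) AS rn_last
    FROM dummy_table
) tmp
WHERE rn_first = 1 OR rn_last = 1
ORDER BY month, day
"""

first_last = conn.execute(query).fetchall()
for row in first_last:
    print(row)
```

Unlike the FIRST_VALUE/LAST_VALUE version, this returns exactly one 'first' and one 'last' row per month even when the boundary day has duplicates. A month with a single row is reported once, labelled 'first'.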

Running cumulative count group by month

I have a table with values like this:
count month-year
6 12-2020
5 12-2020
4 11-2020
3 11-2020
3 10-2020
2 10-2020
2 09-2020
1 09-2020
I want to group the data by the month and show the sum of the count for the current month and the months before it. I am expecting the following output:
count month-year
26 12-2020 <- month 12 count equal to month 12 sum + count start from month 9
15 11-2020 <- month 11 count equal to month 11 sum + count start from month 9
8 10-2020 <- month 10 count equal to month 9 sum + month 10
3 09-2020 <- assume month 9 is the launch month, count = sum count of month 9
You want to use SUM here twice, both as an aggregate and as an analytic function:
SELECT
    [month-year],
    SUM(SUM(count)) OVER (ORDER BY [month-year]) AS count
FROM yourTable
GROUP BY
    [month-year]
ORDER BY
    [month-year] DESC;
Another way to calculate the desired result is a window sum over all rows combined with DISTINCT:
select distinct [month-year],
       SUM(count) OVER (ORDER BY [month-year]) AS count
from yourTable
order by [month-year] desc
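Both queries can be checked against the sample data with a minimal sketch in Python's standard-library sqlite3 module (SQLite 3.25 or later assumed). The column names cnt and month_year are placeholders for count and [month-year]. Note the assumption that the MM-YYYY strings sort correctly: that holds here only because all rows fall in the same year, and a real date or a YYYY-MM string would be needed across years.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE yourTable (cnt INTEGER, month_year TEXT)")
conn.executemany(
    "INSERT INTO yourTable VALUES (?, ?)",
    [(6, "12-2020"), (5, "12-2020"), (4, "11-2020"), (3, "11-2020"),
     (3, "10-2020"), (2, "10-2020"), (2, "09-2020"), (1, "09-2020")],
)

# Inner SUM aggregates per month; outer SUM ... OVER accumulates the monthly totals.
grouped_query = """
SELECT month_year,
       SUM(SUM(cnt)) OVER (ORDER BY month_year) AS running
FROM yourTable
GROUP BY month_year
ORDER BY month_year DESC
"""

# With the default RANGE frame, rows of the same month are peers and share
# one running total, so DISTINCT collapses them to one row per month.
distinct_query = """
SELECT DISTINCT month_year,
       SUM(cnt) OVER (ORDER BY month_year) AS running
FROM yourTable
ORDER BY month_year DESC
"""

grouped = conn.execute(grouped_query).fetchall()
distinct = conn.execute(distinct_query).fetchall()
print(grouped)
print(distinct)
```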

sum values based on 7-day cycle in SQL Oracle

I have dates and values, and I would like to sum the values within 7-day cycles starting from the first date.
date value
01-01-2021 1
02-01-2021 1
05-01-2021 1
07-01-2021 1
10-01-2021 1
12-01-2021 1
13-01-2021 1
16-01-2021 1
18-01-2021 1
22-01-2021 1
23-01-2021 1
30-01-2021 1
This is my input data; it contains 4 groups, to show which groups the 7-day cycles create.
A group starts with the first date and sums all values within the 7 days after it, first date included.
A new group then starts at the next date and covers another 7 days (10-01 until 17-01), then another group runs from 18-01 until 25-01, and so on.
So the output will be:
group1 4
group2 4
group3 3
group4 1
With match_recognize this would be easy, using current_day < first_day + 7 as the pattern condition, but please don't use the match_recognize clause in the solution!
One approach is a recursive CTE:
with tt as (
    select dte, value, row_number() over (order by dte) as seqnum
    from t
),
cte (dte, value, seqnum, firstdte) as (
    select tt.dte, tt.value, tt.seqnum, tt.dte
    from tt
    where seqnum = 1
    union all
    select tt.dte, tt.value, tt.seqnum,
           (case when tt.dte < cte.firstdte + interval '7' day then cte.firstdte else tt.dte end)
    from cte join
         tt
         on tt.seqnum = cte.seqnum + 1
)
select firstdte, sum(value)
from cte
group by firstdte
order by firstdte;
This identifies the groups by the first date. You can use row_number() over (order by firstdte) if you want a number.
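The grouping rule the recursive CTE applies can be traced outside the database with a short Python sketch. This is only an illustration of the logic on the sample data, not a replacement for the SQL; the function name sum_by_cycle is made up for this example.

```python
from datetime import date, timedelta

# Sample rows from the question (dates are day-month-year in the original).
rows = [
    (date(2021, 1, 1), 1), (date(2021, 1, 2), 1), (date(2021, 1, 5), 1),
    (date(2021, 1, 7), 1), (date(2021, 1, 10), 1), (date(2021, 1, 12), 1),
    (date(2021, 1, 13), 1), (date(2021, 1, 16), 1), (date(2021, 1, 18), 1),
    (date(2021, 1, 22), 1), (date(2021, 1, 23), 1), (date(2021, 1, 30), 1),
]


def sum_by_cycle(rows, days=7):
    """Walk the rows in date order; a row joins the current group while its
    date is earlier than the group's first date plus `days`, otherwise it
    starts a new group (same test as the CASE expression in the CTE)."""
    groups = []  # each entry is [first_date, running_total]
    for dte, value in sorted(rows):
        if groups and dte < groups[-1][0] + timedelta(days=days):
            groups[-1][1] += value
        else:
            groups.append([dte, value])
    return [(first, total) for first, total in groups]


cycles = sum_by_cycle(rows)
for number, (first, total) in enumerate(cycles, start=1):
    print(f"group{number}", first.isoformat(), total)
```

Each group depends on where the previous one started, which is why the SQL needs recursion (or match_recognize) rather than a fixed bucket computed from the date alone.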

Estimation of Cumulative value every 3 months in SQL

I have a table like this:
ID Date Prod
1 1/1/2009 5
1 2/1/2009 5
1 3/1/2009 5
1 4/1/2009 5
1 5/1/2009 5
1 6/1/2009 5
1 7/1/2009 5
1 8/1/2009 5
1 9/1/2009 5
And I need to get the following result:
ID Date Prod CumProd
1 2009/03/01 5 15 ---Each 3 months
1 2009/06/01 5 30 ---Each 3 months
1 2009/09/01 5 45 ---Each 3 months
What could be the best approach to take in SQL?
You can try the following, using window functions:
select * from
(
    select *,
           sum(prod) over (order by DATEPART(qq, dateval)) as cum_sum,
           -- order descending so that rn = 1 is the last row of each quarter, as in the expected output
           row_number() over (partition by DATEPART(qq, dateval) order by dateval desc) as rn
    from t
) A
where rn = 1
How about just filtering on the month number?
select t.*
from (select id, date, prod,
             sum(prod) over (partition by id order by date) as running_prod
      from t
     ) t
where month(date) in (3, 6, 9, 12);
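The month-filter approach can be checked on the sample data with a minimal sketch in Python's standard-library sqlite3 module (SQLite 3.25 or later assumed). SQLite has no month() function, so strftime('%m', ...) stands in for it, and the date column is stored as ISO text under the placeholder name dt.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER, dt TEXT, prod INTEGER)")
# One row per month, January to September 2009, each with prod = 5.
conn.executemany(
    "INSERT INTO t VALUES (?, ?, ?)",
    [(1, f"2009-{month:02d}-01", 5) for month in range(1, 10)],
)

query = """
SELECT id, dt, prod, running_prod
FROM (
    SELECT id, dt, prod,
           SUM(prod) OVER (PARTITION BY id ORDER BY dt) AS running_prod
    FROM t
) s
-- keep only the quarter-end months; the running sum was computed before filtering
WHERE CAST(strftime('%m', dt) AS INTEGER) IN (3, 6, 9, 12)
ORDER BY id, dt
"""

quarterly = conn.execute(query).fetchall()
for row in quarterly:
    print(row)
```

The filter has to sit outside the subquery: applying it in the same SELECT as the window function would drop rows before the running sum sees them.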

sql command to find average count of user visits to a website from past 6 months

I have a table with 2 columns, Date and number of visits.
I need to calculate the average difference in visit counts by month over the past 6 months.
Date Number_of_Visits
2018-04-06 5
2018-02-06 6
2017-04-10 3
2017-02-10 9
The SQL should output:
Avg_count difference visits past 6 months
5 - 3 = 2
6 - 9 = -3
(-3 + 2) / 2 = -0.5
The SQL query output should be -0.5.
I am creating the SQL as below:
With cte as (
SELECT Year(v1.date) as Year, Month(v1.date) as Month, sum(v1.visits) as SumCount
FROM visits_table v1
group by Year(v1.date), Month(v1.date)
)
Do you want the average of the differences for the same month across years, i.e. a year-on-year comparison?
This gives you the result you want, -0.5:
;WITH cte AS
(
    SELECT Year(v1.date) AS Year, Month(v1.date) AS Month, SUM(v1.visits) AS SumCount
    FROM visits_table v1
    -- Date filter: it is applied before LAG, so widen or drop it if it would exclude
    -- the prior-year rows being compared (such as the 2017 rows in the sample data).
    WHERE v1.date >= DATEADD(MONTH, -6, GETDATE())
    GROUP BY Year(v1.date), Month(v1.date)
)
SELECT AVG(diff * 1.0)
FROM
(
    SELECT *, diff = SumCount
                   - LAG(SumCount) OVER (PARTITION BY Month
                                         ORDER BY Year)
    FROM cte
) d
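The LAG step can be checked on the sample data with a minimal sketch in Python's standard-library sqlite3 module (SQLite 3.25 or later assumed). The date filter is left out here because it depends on the current date, and the names dt, yr and mon are placeholders for this illustration; strftime stands in for SQL Server's Year() and Month().

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE visits_table (dt TEXT, visits INTEGER)")
conn.executemany(
    "INSERT INTO visits_table VALUES (?, ?)",
    [("2018-04-06", 5), ("2018-02-06", 6), ("2017-04-10", 3), ("2017-02-10", 9)],
)

query = """
WITH cte AS (
    SELECT CAST(strftime('%Y', dt) AS INTEGER) AS yr,
           CAST(strftime('%m', dt) AS INTEGER) AS mon,
           SUM(visits) AS SumCount
    FROM visits_table
    GROUP BY 1, 2
)
SELECT AVG(diff * 1.0)
FROM (
    -- LAG fetches the same month's total from the previous year in the data;
    -- the earliest year has no predecessor, so its diff is NULL and AVG skips it
    SELECT SumCount - LAG(SumCount) OVER (PARTITION BY mon ORDER BY yr) AS diff
    FROM cte
) d
"""

avg_diff = conn.execute(query).fetchone()[0]
print(avg_diff)
```

Partitioning by month and ordering by year pairs April 2018 with April 2017 and February 2018 with February 2017, giving the differences 2 and -3 from the question.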