Compare value of current row to average of all previous rows - sql

I'm looking to compare the value of each date to the average value of all previous dates and calculate the percent change. For example, in the source table below, I would want to compare the value of 100 from December 2022 to the average of November, October, and September ((75+60+75)/3) to bring back the 0.43 change.
Source Table
Date | Value
December 2022 | 100
November 2022 | 75
October 2022 | 60
September 2022 | 75
Desired Output
Date | Value | Comparison
December 2022 | 100 | 0.43
November 2022 | 75 | 0.11
October 2022 | 60 | -0.20
September 2022 | 75 | -

You need a windowed AVG with an OVER clause using the appropriate range of rows (ORDER BY [Date] ASC ROWS BETWEEN UNBOUNDED PRECEDING AND 1 PRECEDING):
Test data:
SELECT *
INTO Data
FROM (VALUES
(20221201, 100),
(20221101, 75),
(20221001, 60),
(20220901, 75)
) v ([Date], [Value])
Statement:
SELECT [Date], [Value], ([Value] - [Average]) / [Average] AS [Comparison]
FROM (
    SELECT
        *,
        -- [Value] is an integer column here, so multiply by 1.0: AVG over integers
        -- truncates (67 instead of 67.5 for November) and skews the result
        [Average] = AVG([Value] * 1.0) OVER (ORDER BY [Date] ASC ROWS BETWEEN UNBOUNDED PRECEDING AND 1 PRECEDING)
    FROM Data
) t
ORDER BY [Date] DESC
Result (rounded to six decimals):
Date | Value | Comparison
20221201 | 100 | 0.428571
20221101 | 75 | 0.111111
20221001 | 60 | -0.200000
20220901 | 75 | null
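As a cross-check of the expected numbers outside the database, here is a minimal Python sketch of the same calculation: each value is compared with the mean of all earlier values. The function name pct_change_vs_prior_mean is made up for this illustration, and the rows are assumed to be sorted oldest first.

```python
from typing import List, Optional


def pct_change_vs_prior_mean(values: List[float]) -> List[Optional[float]]:
    """Compare each value with the mean of all values before it.

    `values` must be ordered oldest first. The first entry has no
    predecessors, so its result is None (SQL would return NULL).
    """
    results: List[Optional[float]] = []
    running_sum = 0.0
    for count, value in enumerate(values):
        if count == 0 or running_sum == 0:
            # No previous rows, or a zero average that cannot be divided by.
            results.append(None)
        else:
            average = running_sum / count
            results.append((value - average) / average)
        running_sum += value
    return results


# September, October, November, December 2022 (oldest first)
changes = pct_change_vs_prior_mean([75, 60, 75, 100])
print([None if c is None else round(c, 2) for c in changes])
# prints [None, -0.2, 0.11, 0.43]
```

The rounded values agree with the desired output in the question, which is a quick way to confirm that the window frame should stop at 1 PRECEDING rather than at the current row.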

DROP TABLE IF EXISTS #t;
SELECT *
INTO #t
FROM
(
    VALUES (1, N'December 2022', 100.0)
         , (2, N'November 2022', 75.0)
         , (3, N'October 2022', 60.0)
         , (4, N'September 2022', 75.0)
) t (sort, col1, col2);
-- sort DESC puts the oldest row first, so the "preceding" rows are the earlier dates
SELECT col1, col2
     , (col2 - AVG(col2) OVER (ORDER BY sort DESC ROWS BETWEEN UNBOUNDED PRECEDING AND 1 PRECEDING))
       / NULLIF(AVG(col2) OVER (ORDER BY sort DESC ROWS BETWEEN UNBOUNDED PRECEDING AND 1 PRECEDING), 0) AS comparison
     , AVG(col2) OVER (ORDER BY sort DESC) AS running_avg
FROM #t
ORDER BY sort;
Something like this. Watch out for 0 values though: NULLIF turns a zero average into NULL instead of raising a divide-by-zero error.

Related

Sum of last 12 months

I have a table with 3 columns (Year, Month, Value) in SQL Server, and I want to compute a fourth, ValueOfLastTwelveMonths, like this:
Year | Month | Value | ValueOfLastTwelveMonths
2021 | 1 | 30 | 30
2021 | 2 | 24 | 54 (30 + 24)
2021 | 5 | 26 | 80 (54 + 26)
2021 | 11 | 12 | 92 (80 + 12)
2022 | 1 | 25 | 87 (SUM of values from 1 2022 TO 2 2021)
2022 | 2 | 40 | 103 (SUM of values from 2 2022 TO 3 2021)
2022 | 4 | 20 | 123 (SUM of values from 4 2022 TO 5 2021)
I need a SQL query to calculate ValueOfLastTwelveMonths.
SELECT Year,
       Month,
Value,
SUM (Value) OVER (PARTITION BY Year, Month)
FROM MyTable
This is much easier if you have a row for every month of every year; if needed, you can then filter the rows without a value back out. It is easier because you then know exactly how many rows you need to look back: 11.
If you build a dataset of the years and months, you can LEFT JOIN your data to it, aggregate, and finally filter out the months that have no data:
SELECT *
INTO dbo.YourTable
FROM (VALUES(2021,1,30),
(2021,2,24),
(2021,5,26),
(2021,11,12),
(2022,1,25),
(2022,2,40),
(2022,4,20))V(Year,Month,Value);
GO
WITH YearMonth AS(
SELECT YT.Year,
V.Month
FROM (SELECT DISTINCT Year
FROM dbo.YourTable) YT
CROSS APPLY (VALUES(1),(2),(3),(4),(5),(6),(7),(8),(9),(10),(11),(12))V(Month)),
RunningTotal AS(
SELECT YM.Year,
YM.Month,
YT.Value,
SUM(YT.Value) OVER (ORDER BY YM.Year, YM.Month
ROWS BETWEEN 11 PRECEDING AND CURRENT ROW) AS Last12Months
FROM YearMonth YM
LEFT JOIN dbo.YourTable YT ON YM.Year = YT.Year
AND YM.Month = YT.Month)
SELECT Year,
Month,
Value,
Last12Months
FROM RunningTotal
WHERE Value IS NOT NULL;
GO
DROP TABLE dbo.YourTable;
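To check the expected totals independently of the SQL, the following Python sketch computes the same trailing 12-month sum. It is only an illustration: the function name trailing_twelve_month_sums is hypothetical, and it assumes at most one row per (year, month). Instead of generating the missing months, it converts each (year, month) to a running month number and treats absent months as 0.

```python
from typing import List, Tuple


def trailing_twelve_month_sums(rows: List[Tuple[int, int, int]]) -> List[int]:
    """For each (year, month, value) row, sum the values of that month
    and of the 11 calendar months before it. Missing months count as 0."""
    # Map every (year, month) to a single running month number.
    by_month = {year * 12 + month: value for year, month, value in rows}
    sums = []
    for year, month, _ in rows:
        current = year * 12 + month
        window = range(current - 11, current + 1)  # 12 months, current included
        sums.append(sum(by_month.get(m, 0) for m in window))
    return sums


rows = [
    (2021, 1, 30), (2021, 2, 24), (2021, 5, 26), (2021, 11, 12),
    (2022, 1, 25), (2022, 2, 40), (2022, 4, 20),
]
print(trailing_twelve_month_sums(rows))
# prints [30, 54, 80, 92, 87, 103, 123]
```

The output matches the ValueOfLastTwelveMonths column in the question, including the rows where the window crosses a year boundary.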

PostgreSQL year-over-year growth

How can I calculate the year-over-year growth by country in PostgreSQL? I have a query which works reasonably well, but it also takes values from one country and compares it with those from another country, when the value for the first year should be null or zero.
Expected result:
year | country | value | yoy
2019 | A | 10 | -0.66
2018 | A | 20 | 0.05
2017 | A | 19 | null
2019 | B | 8 | -0.22
2018 | B | 10 | -0.66
2017 | B | 20 | null
Current result:
year | country | value | yoy
2019 | A | 10 | -0.66
2018 | A | 20 | 0.05
2017 | A | 19 | 0.81
2019 | B | 8 | -0.22
2018 | B | 10 | -0.66
2017 | B | 20 | null
Query:
SELECT *,
- 100.0 * (1 - LEAD(value) OVER (ORDER BY t.country) / value) AS Grown
FROM tbl AS t
ORDER BY t.country
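Partition the window by country and order it by year, so that a row is only ever compared with the previous year of the same country. Using LAG() here instead of LEAD() puts the change on the later year and leaves the first year of each country null, as in the expected result. The value is a fraction; multiply by 100.0 for a percentage:
SELECT *,
       (value - LAG(value) OVER (PARTITION BY t.country ORDER BY t.year)) * 1.0
       / LAG(value) OVER (PARTITION BY t.country ORDER BY t.year) AS Growth
FROM tbl AS t
ORDER BY t.country, t.year DESC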
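The effect of partitioning by country can be sketched in plain Python. This is an illustration rather than part of the answer: yoy_growth is a made-up name, and "previous year" here means the previous row of the same country, so a gap in the years would compare against the last year that is present. The figures are computed from the value column only, so they do not all reproduce the yoy numbers quoted in the question.

```python
from typing import Dict, List, Optional, Tuple


def yoy_growth(rows: List[Tuple[int, str, float]]) -> Dict[Tuple[str, int], Optional[float]]:
    """Return {(country, year): growth versus the previous row of the same country}."""
    growth: Dict[Tuple[str, int], Optional[float]] = {}
    previous: Dict[str, float] = {}  # last value seen per country
    # Sorting by (country, year) mirrors PARTITION BY country ORDER BY year.
    for year, country, value in sorted(rows, key=lambda r: (r[1], r[0])):
        prior = previous.get(country)
        if prior is None or prior == 0:
            growth[(country, year)] = None  # first year, or nothing to divide by
        else:
            growth[(country, year)] = (value - prior) / prior
        previous[country] = value
    return growth


rows = [
    (2019, "A", 10), (2018, "A", 20), (2017, "A", 19),
    (2019, "B", 8), (2018, "B", 10), (2017, "B", 20),
]
result = yoy_growth(rows)
print(result[("A", 2017)], result[("B", 2017)])  # prints None None
print(round(result[("A", 2018)], 2))             # prints 0.05
```

The first year of every country comes out as None because the lookup of the previous value never crosses into another country, which is exactly what the missing PARTITION BY caused in the original query.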
For monthly data, join each row to the row of the same item one year earlier (my_table is a placeholder name, since table is a reserved word):
SELECT cur.item_id,
       cur.date,
       (cur.count - prev.count) * 1.0 / NULLIF(prev.count, 0) AS count_year_over_year
FROM my_table AS cur
JOIN my_table AS prev
  ON prev.item_id = cur.item_id
 AND prev.date = cur.date - INTERVAL '1 year'
ORDER BY cur.date DESC

Selecting records that have low numbers consecutively

I have the following table (using BigQuery):
id | year | month | day | rating
111 | 2020 | 11 | 30 | 4
111 | 2020 | 12 | 01 | 4
112 | 2020 | 11 | 30 | 5
113 | 2020 | 11 | 30 | 5
Is there a way in which I can select ids that have ratings that are consecutively (two or more consecutive records) low (low as in both records' ratings less than 4.5)?
For example, my desired output is:
id | year | month | day | rating
111 | 2020 | 11 | 30 | 4
111 | 2020 | 12 | 01 | 4
If you want all rows, then you need to look at both the previous rating and the next rating:
SELECT t.*
FROM (SELECT t.*,
LAG(rating) OVER (PARTITION BY id ORDER BY year, month, day ASC) AS prev_rating,
LEAD(rating) OVER (PARTITION BY id ORDER BY year, month, day ASC) AS next_rating,
FROM dataset.table t
) t
WHERE (rating < 4.5 and prev_rating < 4.5) OR
(rating < 4.5 and next_rating < 4.5)
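Below is for BigQuery Standard SQL. Rows are grouped into runs of consecutive low ratings per id, and only runs longer than one row are kept:
select * except(grp, seq_len)
from (
  -- partition by id as well as grp, because grp values repeat across ids
  select *, sum(1) over(partition by id, grp) seq_len
  from (
    select *,
      -- grp increases at every rating >= 4.5, so consecutive low ratings of an id share one grp
      countif(rating >= 4.5) over(partition by id order by year, month, day) grp
    from `project.dataset.table`
  )
  where rating < 4.5
)
where seq_len > 1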
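The run-grouping idea behind both queries can be sketched in Python with itertools.groupby. This is an illustration only: consecutive_low_rows and the Row alias are hypothetical names, and rows are assumed to be (id, year, month, day, rating) tuples.

```python
from itertools import groupby
from typing import List, Tuple

Row = Tuple[int, int, int, int, float]  # (id, year, month, day, rating)


def consecutive_low_rows(rows: List[Row], threshold: float = 4.5) -> List[Row]:
    """Keep rows that belong to a run of two or more consecutive
    low ratings (rating < threshold) within the same id."""
    kept: List[Row] = []
    ordered = sorted(rows)  # tuple order: id, then year, month, day
    # groupby splits the ordered rows into runs that share (id, is_low).
    for (_, is_low), run in groupby(ordered, key=lambda r: (r[0], r[4] < threshold)):
        run_rows = list(run)
        if is_low and len(run_rows) > 1:
            kept.extend(run_rows)
    return kept


rows = [
    (111, 2020, 11, 30, 4), (111, 2020, 12, 1, 4),
    (112, 2020, 11, 30, 5), (113, 2020, 11, 30, 5),
]
print(consecutive_low_rows(rows))
# prints [(111, 2020, 11, 30, 4), (111, 2020, 12, 1, 4)]
```

Because the grouping key includes the id, a single low rating for one id is never merged with a low rating of a neighbouring id.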

Filling missing months when calculating year to date

I have a table of cumulative year-to-date quantities:
year month qty_ytd
2017 01 20
2017 02 30
2018 01 50
I need to fill the gaps for the missing months in the same year, up to December.
Example result:
year month qty_ytd
2017 01 20
2017 02 30
2017 03 30
.....
2017 07 30
2017 12 30
2018 01 50
2018 02 50
....
2018 12 50
How can I do this? I couldn't figure out how to fill in the missing months.
You can use cross join to generate the rows and outer apply to get the data:
select y.year, v.m as month, t.qty_ytd
from (select distinct year from t) y cross join
     (values (1), (2), (3), (4), (5), (6), (7), (8), (9), (10), (11), (12)) v(m) outer apply
     (select top (1) t.*
      from t
      where t.year = y.year and
            t.month <= v.m
      order by t.month desc
     ) t;
Assuming qty_ytd is non-decreasing within a year, it might be more performant to use window functions:
select y.year, v.m as month,
       max(t.qty_ytd) over (partition by y.year order by v.m) as qty_ytd
from (select distinct year from t) y cross join
     (values (1), (2), (3), (4), (5), (6), (7), (8), (9), (10), (11), (12)) v(m) left join
     t
     on t.year = y.year and
        t.month = v.m;
Another option is to compute the deltas, add dummy zero deltas for every month, and then restore the running total. I've changed the source data to show a more common case:
create table #t
(
year int,
month int,
qty_ytd int
);
insert #t(year, month, qty_ytd )
values
(2017, 01, 20),
(2017, 02, 30),
(2018, 04, 50) -- note month
;
select distinct year, month, sum(delta) over(partition by year order by month) as qty_ytd
from (
-- real delta
select year, month, delta = qty_ytd - isnull(lag(qty_ytd) over (partition by year order by month),0)
from #t
union all
-- tally dummy delta
select top(24) 2017 + (n-1)/12, n%12 + 1 , 0
from
( select row_number() over(order by a.n) n
from
(values (1),(2),(3),(4),(5),(6),(7),(8),(9),(10)) a(n),
(values (1),(2),(3),(4),(5),(6),(7),(8),(9),(10)) b(n)
) c
)d
order by year, month;
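For comparison, the forward-fill that all three queries perform can be written as a short Python sketch. The name fill_months is hypothetical, and the sketch assumes one row per (year, month); months before the first known month of a year are left as None instead of 0.

```python
from typing import Dict, List, Optional, Tuple


def fill_months(rows: List[Tuple[int, int, int]]) -> List[Tuple[int, int, Optional[int]]]:
    """Emit all 12 months for every year present in `rows`, carrying the
    last known qty_ytd of that year forward. Months before the first
    known month of a year get None."""
    known: Dict[Tuple[int, int], int] = {(y, m): q for y, m, q in rows}
    filled = []
    for year in sorted({y for y, _, _ in rows}):
        last: Optional[int] = None  # reset at the start of every year
        for month in range(1, 13):
            last = known.get((year, month), last)
            filled.append((year, month, last))
    return filled


filled = fill_months([(2017, 1, 20), (2017, 2, 30), (2018, 1, 50)])
print(filled[2], filled[11], filled[23])
# prints (2017, 3, 30) (2017, 12, 30) (2018, 12, 50)
```

Resetting the carried value at each new year corresponds to the partition by year in the window-function versions.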

How to shift a year-week field in bigquery

This question is about shifting values of a year-week field in bigquery.
run_id year_week value
0001 201451 13
0001 201452 6
0001 201503 3
0003 201351 8
0003 201352 5
0003 201403 1
Here, the week of each year can range from 01 to 53, depending on the year. For example, the last week of 2014 is 201452, but the last week of 2015 is 201553.
Now I want to shift the values for each year_week in each run_id by 5 weeks. Weeks that have no value are assumed to have a value of 0. For example, the output for the example table above should look like this:
run_id year_week value
0001 201504 13
0001 201505 6
0001 201506 0
0001 201507 0
0001 201508 3
0003 201404 8
0003 201405 5
0003 201406 0
0003 201407 0
0003 201408 1
Explanation of the output: In the table above for run_id 0001 the year_week 201504 has a value of 13 because in the input table we had a value of 13 for year_week 201451 which is 5 weeks before 201504.
I could create a table programmatically by creating a mapping from a year_week to a shifted year_week and then doing a join to get the output, but I was wondering if there is any other way to do it by just using sql.
#standardSQL
WITH `project.dataset.table` AS (
SELECT '001' run_id, 201451 year_week, 13 value UNION ALL
SELECT '001', 201452, 6 UNION ALL
SELECT '001', 201503, 3
), weeks AS (
SELECT 100 * year + week year_week
FROM UNNEST([2013, 2014, 2015, 2016, 2017]) year,
-- December 28 always falls in the last ISO week of its year, so its ISOWEEK is the week count (52 or 53)
UNNEST(GENERATE_ARRAY(1, EXTRACT(ISOWEEK FROM DATE(year, 12, 28)))) week
), temp AS (
SELECT i.run_id, w.year_week, d.year_week week2, value
FROM weeks w
CROSS JOIN (SELECT DISTINCT run_id FROM `project.dataset.table`) i
LEFT JOIN `project.dataset.table` d
USING(year_week, run_id)
)
SELECT * FROM (
SELECT run_id, year_week,
SUM(value) OVER(win) value
FROM temp
WINDOW win AS (
PARTITION BY run_id ORDER BY year_week ROWS BETWEEN 5 PRECEDING AND 5 PRECEDING
)
)
WHERE NOT value IS NULL
ORDER BY run_id, year_week
with the result:
Row run_id year_week value
1 001 201504 13
2 001 201505 6
3 001 201508 3
If you need to "preserve" zero rows, just change the portion below:
SELECT i.run_id, w.year_week, d.year_week week2, value
FROM weeks w
to
SELECT i.run_id, w.year_week, d.year_week week2, IFNULL(value, 0) value
FROM weeks w
or
SUM(value) OVER(win) value
FROM temp
to
SUM(IFNULL(value, 0)) OVER(win) value
FROM temp
If you have data in the table for all year-weeks, then you can do:
with yw as (
select year_week, row_number() over (order by year_week) as seqnum
from t
group by year_week
)
select t.*, yw5.year_week as new_year_week
from t join
yw
on t.year_week = yw.year_week left join
yw yw5
on yw5.seqnum = yw.seqnum + 5;
If you don't have a table of year weeks, then I would advise you to create such a table, so you can do such manipulations -- or a more general calendar table.
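If the shifted year_week only needs to be computed, for example to build the mapping table mentioned in the question, the arithmetic can be sketched in Python with the standard library. This assumes that year_week follows ISO week numbering, as the ISOWEEK-based answer above suggests; shift_year_week is a name made up for the example.

```python
from datetime import date, timedelta


def shift_year_week(year_week: int, weeks: int) -> int:
    """Shift a YYYYWW ISO year-week number by `weeks` weeks."""
    year, week = divmod(year_week, 100)
    # Monday of the given ISO week (date.fromisocalendar needs Python 3.8+).
    monday = date.fromisocalendar(year, week, 1)
    shifted = (monday + timedelta(weeks=weeks)).isocalendar()
    return shifted[0] * 100 + shifted[1]  # ISO year and ISO week


for yw in (201451, 201452, 201503, 201351):
    print(yw, "->", shift_year_week(yw, 5))
# prints:
# 201451 -> 201504
# 201452 -> 201505
# 201503 -> 201508
# 201351 -> 201404
```

Going through real dates means 52-week and 53-week years are handled by the calendar itself, so no list of weeks per year has to be maintained.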