PostgreSQL year-over-year growth - sql

How can I calculate year-over-year growth by country in PostgreSQL? I have a query that works reasonably well, but it also compares values from one country with those from another, whereas the value for a country's first year should be null or zero.
Expected result:
year | country | value | yoy
2019 A 10 -0.66
2018 A 20 0.05
2017 A 19 null
2019 B 8 -0.22
2018 B 10 -0.66
2017 B 20 null
Current result:
year | country | value | yoy
2019 A 10 -0.66
2018 A 20 0.05
2017 A 19 0.81
2019 B 8 -0.22
2018 B 10 -0.66
2017 B 20 null
Query:
SELECT *,
- 100.0 * (1 - LEAD(value) OVER (ORDER BY t.country) / value) AS Grown
FROM tbl AS t
ORDER BY t.country

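Partition by country, then take the previous year's value within each country, ordered by year:
SELECT t.*,
       -- 100.0 * (this year - previous year) / previous year; NULL for each country's first year
       100.0 * (value - LAG(value) OVER (PARTITION BY t.country ORDER BY t.year))
             / LAG(value) OVER (PARTITION BY t.country ORDER BY t.year) AS growth
FROM tbl AS t
ORDER BY t.country, t.year DESC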
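To try the per-country window on the sample rows from the question, here is a minimal sketch. It assumes SQLite (3.25 or later, via Python's sqlite3 module) as a stand-in for PostgreSQL, and it computes growth as a plain fraction, (value - previous) / previous, so the figures need not match the ones quoted in the question.

```python
import sqlite3

# Assumption: SQLite (3.25+) stands in for PostgreSQL; sample rows come from the question.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tbl (year INTEGER, country TEXT, value REAL)")
conn.executemany(
    "INSERT INTO tbl VALUES (?, ?, ?)",
    [(2017, "A", 19), (2018, "A", 20), (2019, "A", 10),
     (2017, "B", 20), (2018, "B", 10), (2019, "B", 8)],
)

# LAG() is evaluated inside each country, so A's rows never see B's values.
rows = conn.execute("""
    SELECT year, country, value,
           (value - LAG(value) OVER (PARTITION BY country ORDER BY year))
               / LAG(value) OVER (PARTITION BY country ORDER BY year) AS yoy
    FROM tbl
    ORDER BY country, year DESC
""").fetchall()

# Map (country, year) -> growth; the first year of each country has no predecessor.
yoy = {(country, year): growth for year, country, value, growth in rows}
for row in rows:
    print(row)
```

The first year of each country comes back as NULL (None in Python) because there is no earlier row in its partition.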

For monthly data, join each row to the row for the same item one year earlier:
SELECT cur.item_id, cur.date,
       -- cast to numeric so integer counts are not truncated by integer division
       (cur.count - prev.count)::numeric / prev.count AS count_year_over_year
FROM
  -- tbl stands in for the real table name (TABLE itself is a reserved word)
  (SELECT item_id, date, date - INTERVAL '1 year' AS year_ago_date, count FROM tbl) AS cur
JOIN
  (SELECT item_id, date, count FROM tbl) AS prev
  ON cur.item_id = prev.item_id
 AND cur.year_ago_date = prev.date
ORDER BY cur.date DESC

Related

Sum of last 12 months

I have a table with 3 columns (Year, Month, Value) in SQL Server. The fourth column below, ValueOfLastTwelveMonths, is the result I want:
Year  Month  Value  ValueOfLastTwelveMonths
2021  1      30     30
2021  2      24     54 (30 + 24)
2021  5      26     80 (54 + 26)
2021  11     12     92 (80 + 12)
2022  1      25     87 (SUM of values from 1 2022 TO 2 2021)
2022  2      40     103 (SUM of values from 2 2022 TO 3 2021)
2022  4      20     123 (SUM of values from 4 2022 TO 5 2021)
I need a SQL query to calculate ValueOfLastTwelveMonths.
SELECT Year,
       Month,
Value,
SUM (Value) OVER (PARTITION BY Year, Month)
FROM MyTable
This is much easier if you have a row for every month of every year; afterwards (if needed) you can filter the NULL rows out. It is easier because you then know exactly how many rows to look back at: 11.
If you build a dataset of the years and months, you can LEFT JOIN it to your data, aggregate, and finally filter the empty months out:
SELECT *
INTO dbo.YourTable
FROM (VALUES(2021,1,30),
            (2021,2,24),
            (2021,5,26),
            (2021,11,12),
            (2022,1,25),
            (2022,2,40),
            (2022,4,20))V(Year,Month,Value);
GO
WITH YearMonth AS(
    SELECT YT.Year,
           V.Month
    FROM (SELECT DISTINCT Year
          FROM dbo.YourTable) YT
         CROSS APPLY (VALUES(1),(2),(3),(4),(5),(6),(7),(8),(9),(10),(11),(12))V(Month)),
RunningTotal AS(
    SELECT YM.Year,
           YM.Month,
           YT.Value,
           SUM(YT.Value) OVER (ORDER BY YM.Year, YM.Month
                               ROWS BETWEEN 11 PRECEDING AND CURRENT ROW) AS Last12Months
    FROM YearMonth YM
         LEFT JOIN dbo.YourTable YT ON YM.Year = YT.Year
                                   AND YM.Month = YT.Month)
SELECT Year,
       Month,
       Value,
       Last12Months
FROM RunningTotal
WHERE Value IS NOT NULL;
GO
DROP TABLE dbo.YourTable;
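The calendar-then-window idea can also be tried outside SQL Server. The sketch below is only an approximation under stated assumptions: it uses SQLite (3.25 or later) through Python's sqlite3 module, swaps CROSS APPLY (VALUES ...) for a recursive CTE plus CROSS JOIN, and, like the query above, assumes the years in the data are contiguous.

```python
import sqlite3

# Assumption: SQLite stands in for SQL Server; data is the sample from the question.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE MyTable (Year INTEGER, Month INTEGER, Value INTEGER)")
conn.executemany(
    "INSERT INTO MyTable VALUES (?, ?, ?)",
    [(2021, 1, 30), (2021, 2, 24), (2021, 5, 26), (2021, 11, 12),
     (2022, 1, 25), (2022, 2, 40), (2022, 4, 20)],
)

rows = conn.execute("""
    WITH RECURSIVE Months(Month) AS (
        -- months 1..12
        SELECT 1 UNION ALL SELECT Month + 1 FROM Months WHERE Month < 12
    ),
    YearMonth AS (
        -- one row per month of every year that appears in the data
        SELECT Y.Year, M.Month
        FROM (SELECT DISTINCT Year FROM MyTable) AS Y CROSS JOIN Months AS M
    ),
    RunningTotal AS (
        -- with no gaps in the calendar, 11 preceding rows = 11 preceding months
        SELECT YM.Year, YM.Month, T.Value,
               SUM(T.Value) OVER (ORDER BY YM.Year, YM.Month
                                  ROWS BETWEEN 11 PRECEDING AND CURRENT ROW) AS Last12Months
        FROM YearMonth AS YM
        LEFT JOIN MyTable AS T ON T.Year = YM.Year AND T.Month = YM.Month
    )
    SELECT Year, Month, Value, Last12Months
    FROM RunningTotal
    WHERE Value IS NOT NULL
    ORDER BY Year, Month
""").fetchall()

for row in rows:
    print(row)
```

On the sample data the last column comes out as 30, 54, 80, 92, 87, 103, 123, which matches the ValueOfLastTwelveMonths column asked for in the question.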

Get Last Value of previous row partition in SQL

In my data set, each customer has orders on different dates.
For each customer and each month, I want to find the city of their last order in the previous month.
For example, here is the data for one customer:
customer  year  month  day  order id  city id
1544      2022  2      6    413       9
1544      2022  2      17   39        10
1544      2022  3      5    115       21
1544      2022  5      29   2153      4
1544      2022  5      30   955       9
The result should look like this:
customer  year  month  city of last order of prev month (prevCity)
1544      2022  2      null or 9
1544      2022  3      10
1544      2022  5      21
(The first row of the table above is not part of my question for now.)
I wrote my query using last_value, like this:
select customer,
year,
month,
last_value(City) over (partition by customer, year, month order by created_at desc
ROWS BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING) as prevCity
from table1
but the result is wrong.
How can I correct this?
Use the window function lag() over() in concert with the WITH TIES clause:
Select top 1 with ties
customer
,year
,month
,LastCityID = lag([city id],1) over (partition by customer order by year, month,day)
From YourTable
order by row_number() over (partition by customer,year,month order by year, month,day)
Or, a nudge more performant:
with cte as (
Select *
,LastCityID = lag([city id],1) over (partition by customer order by year, month,day)
,RN = row_number() over (partition by customer,year,month order by year, month,day)
From YourTable
)
Select customer
,year
,month
,LastCityID
From cte
Where RN =1
Results
customer year month LastCityID
1544 2022 2 NULL
1544 2022 3 10
1544 2022 5 21
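The lag() plus row_number() combination can be checked on the sample customer with the sketch below. It assumes SQLite (3.25 or later) via Python's sqlite3 module instead of SQL Server, and the table name orders and the column names order_id and city_id are placeholders chosen for the example.

```python
import sqlite3

# Assumption: SQLite stands in for SQL Server; "orders" is a placeholder table name.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (customer INTEGER, year INTEGER, month INTEGER, "
    "day INTEGER, order_id INTEGER, city_id INTEGER)"
)
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?, ?, ?, ?)",
    [(1544, 2022, 2, 6, 413, 9),
     (1544, 2022, 2, 17, 39, 10),
     (1544, 2022, 3, 5, 115, 21),
     (1544, 2022, 5, 29, 2153, 4),
     (1544, 2022, 5, 30, 955, 9)],
)

rows = conn.execute("""
    WITH cte AS (
        SELECT customer, year, month,
               -- city of the order placed just before this one, per customer
               LAG(city_id) OVER (PARTITION BY customer
                                  ORDER BY year, month, day) AS prev_city,
               -- 1 marks the first order of each customer/month
               ROW_NUMBER() OVER (PARTITION BY customer, year, month
                                  ORDER BY day) AS rn
        FROM orders
    )
    SELECT customer, year, month, prev_city
    FROM cte
    WHERE rn = 1
    ORDER BY customer, year, month
""").fetchall()

for row in rows:
    print(row)
```

The order just before the first order of a month is the last order of the most recent earlier month that has any orders. That is why month 5 reports city 21 from month 3 even though month 4 has no orders.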

Selecting records that have low numbers consecutively

I have a table as follows (using BigQuery):
id   year  month  day  rating
111  2020  11     30   4
111  2020  12     01   4
112  2020  11     30   5
113  2020  11     30   5
Is there a way to select ids whose ratings are consecutively low (two or more consecutive records, each with a rating below 4.5)?
For example, my desired output is:
id   year  month  day  rating
111  2020  11     30   4
111  2020  12     01   4
If you want all rows, then you need to look at both the previous rating and the next rating:
SELECT t.*
FROM (SELECT t.*,
LAG(rating) OVER (PARTITION BY id ORDER BY year, month, day ASC) AS prev_rating,
LEAD(rating) OVER (PARTITION BY id ORDER BY year, month, day ASC) AS next_rating,
FROM dataset.table t
) t
WHERE (rating < 4.5 and prev_rating < 4.5) OR
(rating < 4.5 and next_rating < 4.5)
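Below is for BigQuery Standard SQL:
select * except(grp, seq_len)
from (
  -- seq_len: number of low ratings in the same run; partition by id as well,
  -- otherwise runs of different ids that share a grp value are counted together
  select *, sum(1) over(partition by id, grp) seq_len
  from (
    select *,
      -- grp increases whenever a rating >= 4.5 appears, so consecutive low ratings share a grp
      countif(rating >= 4.5) over(partition by id order by year, month, day) grp
    from `project.dataset.table`
  )
  where rating < 4.5
)
where seq_len > 1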
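The run-counting approach can be tried locally with the sketch below. It assumes SQLite (3.25 or later) via Python's sqlite3 module rather than BigQuery, so countif is written as SUM(CASE ...). The row for id 114 is not in the question; it is a hypothetical extra row with a single low rating, added to show that a lone low rating is not reported when each run is counted per id (partitioning by both id and grp).

```python
import sqlite3

# Assumption: SQLite stands in for BigQuery; id 114 is a made-up extra row.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE ratings (id INTEGER, year INTEGER, month INTEGER, "
    "day INTEGER, rating REAL)"
)
conn.executemany(
    "INSERT INTO ratings VALUES (?, ?, ?, ?, ?)",
    [(111, 2020, 11, 30, 4), (111, 2020, 12, 1, 4),
     (112, 2020, 11, 30, 5), (113, 2020, 11, 30, 5),
     (114, 2020, 11, 30, 4)],
)

rows = conn.execute("""
    SELECT id, year, month, day, rating
    FROM (
        -- count the low ratings in each run, separately for every id
        SELECT *, COUNT(*) OVER (PARTITION BY id, grp) AS seq_len
        FROM (
            -- grp goes up by one at every rating >= 4.5, splitting the runs
            SELECT *,
                   SUM(CASE WHEN rating >= 4.5 THEN 1 ELSE 0 END)
                       OVER (PARTITION BY id ORDER BY year, month, day) AS grp
            FROM ratings
        )
        WHERE rating < 4.5
    )
    WHERE seq_len > 1
    ORDER BY id, year, month, day
""").fetchall()

for row in rows:
    print(row)
```

Only the two rows for id 111 come back. Counting by grp alone would lump id 114's single low rating in with id 111's run, because both have grp 0.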

fetching records for previous month

item loc year month quantity startdate
XYZ A 2020 1 3 23-06-2020
ABC B 2020 2 218 24-06-2020
SDC C 2020 6 107 25-06-2020
QWE D 2020 7 144 25-06-2020
XYZ A 2019 12 89 23-06-2020
ABC B 2019 11 218 24-06-2020
SDC C 2020 5 117 25-06-2020
QWE D 2020 6 144 25-06-2020
Given the table above, my output should look like this:
item loc year month quantity startdate
XYZ A 2020 1 89 23-06-2020
ABC B 2020 2 3 24-06-2020
SDC C 2020 6 117 25-06-2020
QWE D 2020 7 144 25-06-2020
As you can see, only the quantity values change: they are taken from the previous month, while the other columns stay as they are.
It looks like you want window function lag(). For your sample data, this would produce the desired results:
select *
from (
select
item,
loc,
year,
month,
lag(quantity) over(partition by item, loc order by year, month) quantity,
startdate
from mytable
) t
where quantity is not null
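One thing worth checking with lag() is that it returns the previous row for the item and location, which is not necessarily the previous calendar month. The sketch below shows both variants side by side. It assumes SQLite (3.25 or later) via Python's sqlite3 module as a stand-in engine, leaves out the startdate column for brevity, and uses year * 12 + month as an ad hoc month index to detect gaps.

```python
import sqlite3

# Assumption: SQLite as a stand-in engine; startdate is omitted for brevity.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE mytable (item TEXT, loc TEXT, year INTEGER, "
    "month INTEGER, quantity INTEGER)"
)
conn.executemany(
    "INSERT INTO mytable VALUES (?, ?, ?, ?, ?)",
    [("XYZ", "A", 2020, 1, 3), ("ABC", "B", 2020, 2, 218),
     ("SDC", "C", 2020, 6, 107), ("QWE", "D", 2020, 7, 144),
     ("XYZ", "A", 2019, 12, 89), ("ABC", "B", 2019, 11, 218),
     ("SDC", "C", 2020, 5, 117), ("QWE", "D", 2020, 6, 144)],
)

rows = conn.execute("""
    SELECT item, year, month,
           -- quantity of the previous row, whatever month it belongs to
           LAG(quantity) OVER (PARTITION BY item, loc ORDER BY year, month)
               AS prev_row_qty,
           -- same value, but only when the previous row is exactly one month earlier
           CASE WHEN (year * 12 + month)
                     - LAG(year * 12 + month)
                           OVER (PARTITION BY item, loc ORDER BY year, month) = 1
                THEN LAG(quantity) OVER (PARTITION BY item, loc ORDER BY year, month)
           END AS prev_month_qty
    FROM mytable
""").fetchall()

# Map (item, year, month) -> (prev_row_qty, prev_month_qty)
prev = {(item, year, month): (row_qty, month_qty)
        for item, year, month, row_qty, month_qty in rows}
print(prev[("ABC", 2020, 2)])
```

For ABC the previous row is November 2019, three months before February 2020, so the plain lag() returns 218 while the strict previous-month variant returns NULL.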
Consider this query, which works in an Access database:
SELECT Table1.*, (SELECT TOP 1 quantity FROM Table1 AS Dupe
WHERE Dupe.item = Table1.item AND Dupe.loc = Table1.loc
AND DateSerial(Dupe.[Year],Dupe.[Month],1)<DateSerial(Table1.[Year],Table1.[Month],1)
ORDER BY DateSerial(Dupe.[Year],Dupe.[Month],1) DESC) AS PrevQty
FROM Table1;
If you want to return 0 when there is a gap in month sequence, consider:
SELECT Table1.*, Nz((SELECT quantity FROM Table1 AS Dupe
WHERE Dupe.item = Table1.item AND Dupe.loc = Table1.loc
AND DateSerial(Dupe.[Year],Dupe.[Month],1)=DateAdd("m",-1,DateSerial(Table1.[Year],Table1.[Month],1))
ORDER BY DateSerial(Dupe.[Year],Dupe.[Month],1)),0) AS PrevQty
FROM Table1;
Or
SELECT Q1.*, Nz(Q2.quantity,0) AS PrevQty FROM (
SELECT Table1.*, DateSerial([Year],[Month],1) AS FD FROM Table1) AS Q1
LEFT JOIN (
SELECT Table1.*, DateAdd("m",+1,DateSerial([Year],[Month],1)) AS PD FROM Table1) AS Q2
ON Q1.FD=Q2.PD AND Q1.item=Q2.item and Q1.loc=Q2.loc;

Grouping data on SQL Server

I have this table in SQL Server:
Year Month Quantity
----------------------------
2015 January 10
2015 February 20
2015 March 30
2014 November 40
2014 August 50
How can I add two more columns, one that numbers each distinct year as a group and one that numbers the months within each year sequentially, as in this example?
Year Month Quantity Group Subgroup
------------------------------------------------
2015 January 10 1 1
2015 February 20 1 2
2015 March 30 1 3
2014 November 40 2 1
2014 August 50 2 2
You can use DENSE_RANK to calculate the groups for you:
SELECT t1.*, DENSE_RANK() OVER (ORDER BY Year DESC) AS [Group],
DENSE_RANK() OVER (PARTITION BY Year ORDER BY DATEPART(month, Month + ' 01 2010')) AS [SubGroup]
FROM t1
ORDER BY 4, 5
To associate group and subgroup with a number you can do this:
WITH RankedTable AS (
SELECT year, month, quantity,
ROW_NUMBER() OVER (partition by year order by Month) AS rn
FROM yourtable)
SELECT year, month, quantity,
SUM (CASE WHEN rn = 1 THEN 1 ELSE 0 END) OVER (ORDER BY YEAR) as year_group,
rn AS subgroup
FROM RankedTable
Here the ROW_NUMBER() OVER clause calculates the rank of each month within its year,
and SUM() ... OVER calculates a running sum over the rows with rank 1, which gives the group number.
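To see the group and subgroup numbering on the sample rows, here is a small sketch of the DENSE_RANK variant. It assumes SQLite (3.25 or later) via Python's sqlite3 module rather than SQL Server. Since SQLite has no DATEPART, a hypothetical helper table named months maps each month name to its number.

```python
import sqlite3

# Assumption: SQLite stands in for SQL Server; "months" is a made-up lookup table.
MONTH_NAMES = ["January", "February", "March", "April", "May", "June", "July",
               "August", "September", "October", "November", "December"]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t1 (year INTEGER, month TEXT, quantity INTEGER)")
conn.execute("CREATE TABLE months (name TEXT PRIMARY KEY, num INTEGER)")
conn.executemany(
    "INSERT INTO t1 VALUES (?, ?, ?)",
    [(2015, "January", 10), (2015, "February", 20), (2015, "March", 30),
     (2014, "November", 40), (2014, "August", 50)],
)
conn.executemany(
    "INSERT INTO months VALUES (?, ?)",
    [(name, i) for i, name in enumerate(MONTH_NAMES, start=1)],
)

rows = conn.execute("""
    SELECT t.year, t.month, t.quantity,
           -- newest year gets group 1
           DENSE_RANK() OVER (ORDER BY t.year DESC) AS grp,
           -- months are numbered in calendar order within each year
           DENSE_RANK() OVER (PARTITION BY t.year ORDER BY m.num) AS subgrp
    FROM t1 AS t
    JOIN months AS m ON m.name = t.month
    ORDER BY grp, subgrp
""").fetchall()

for row in rows:
    print(row)
```

Note that with calendar ordering August 2014 gets subgroup 1 and November 2014 gets subgroup 2, which is the reverse of the order shown in the question's example for 2014.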