Uniform distribution of monthly budget to date - sql

I have a monthly budget that I need to distribute evenly across the days of each month.
Datasource
Month Budget
Jan 31
Feb 56
I want to smooth it out to
Date Budget
01-Jan 1
02-Jan 1
... ...
01-Feb 2
02-Feb 2
... ...
How can I do this?

Assuming the month is really a date on the first day of the month, a pretty simple method uses a recursive CTE:
with cte as (
select month as day, budget
from t
union all
select dateadd(day, 1, day), budget
from cte
where day < eomonth(day)
)
select day, budget * 1.0 / day(eomonth(day))
from cte
order by day;
Here is a db<>fiddle.
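For readers who want to check the arithmetic outside the database, here is a minimal Python sketch of the same idea: divide each month's budget by the number of days in that month and emit one row per day. The function name spread_monthly_budget and the year 2021 are assumptions for illustration; the question does not state a year.

```python
import calendar
from datetime import date, timedelta


def spread_monthly_budget(month_start: date, budget: float) -> list[tuple[date, float]]:
    """Split one month's budget evenly over every calendar day of that month."""
    # monthrange returns (weekday of first day, number of days in month)
    days_in_month = calendar.monthrange(month_start.year, month_start.month)[1]
    per_day = budget / days_in_month
    return [(month_start + timedelta(days=i), per_day) for i in range(days_in_month)]


# Sample data from the question; the year is assumed.
rows = spread_monthly_budget(date(2021, 1, 1), 31) + spread_monthly_budget(date(2021, 2, 1), 56)
print(rows[0], rows[-1])
```

Because the divisor comes from the calendar, months of different lengths need no special handling.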

Just another option using an ad-hoc tally/numbers table
This assumes the source MONTH is a string and the desired year is the current year.
Example or dbFiddle
Declare @YourTable Table ([Month] varchar(50),[Budget] money)
Insert Into @YourTable Values
('Jan',31)
,('Feb',56)
Select Date = DateFromParts(year(D),month(D),N)
,Budget = Budget / day(D)
From @YourTable A
Cross Apply ( values (EOMonth(try_convert(date,concat('01-',Month,'-',year(getdate())))))) B(D)
Join (Select Top 31 N=Row_Number() Over (Order By (Select Null)) From master..spt_values n1) C
on N<=day(D)
Results
Date Budget
2021-01-01 1.00
2021-01-02 1.00
...
2021-01-30 1.00
2021-01-31 1.00
2021-02-01 2.00
...
2021-02-27 2.00
2021-02-28 2.00

Related

PARTITION BY with date between 2 date

I work on Azure SQL Database (SQL Server).
I am trying to build a result with one row per day, but the individual days are not stored in the table.
The example below shows what I mean:
TABLE STARTER: (Format Date: YYYY-MM-DD)
Date begin  Date End    Category  Value
2021-01-01  2021-01-03  1         0.2
2021-01-02  2021-01-03  1         0.1
2021-01-01  2021-01-02  2         0.3
For the result, I try to have this TABLE RESULT:
Date        Category  Value
2021-01-01  1         0.2
2021-01-01  2         0.3
2021-01-02  1         0.3 (0.2+0.1)
2021-01-02  2         0.3
2021-01-03  1         0.3 (0.2+0.1)
For each day, I want to sum the value if the day is between the beginning and the end of the date. I need to do that for each category.
In terms of SQL code I try to do something like that:
SELECT SUM(CAST(value as float)) OVER (PARTITION BY Date begin, Category) as value,
Date begin,
Category,
Value
FROM TABLE STARTER
This code only sums the values that share the same Date begin; it doesn't consider all the dates between Date begin and Date End.
So it doesn't calculate the sum for 02-01-2021 in Category 1, because that date never appears explicitly in the table (it lies between 01-01-2021 and 03-01-2021).
Is it possible to do that in SQL?
Thanks so much for your help!
You can use a recursive CTE to expand the date ranges into a list of separate days. Then it's a matter of joining and aggregating.
For example:
with
r as (
select category,
min(date_begin) as date_begin, max(date_end) as date_end
from starter
group by category
),
d as (
select category, date_begin as d from r
union all
select d.category, dateadd(day, 1, d.d)
from d
join r on r.category = d.category
where d.d < r.date_end
)
select d.d, d.category, sum(s.value) as value
from d
join starter s on s.category = d.category
and d.d between s.date_begin and s.date_end
group by d.category, d.d;
Result:
d category value
----------- --------- -----
2021-01-01 1 0.20
2021-01-01 2 0.30
2021-01-02 1 0.30
2021-01-02 2 0.30
2021-01-03 1 0.30
See running example at db<>fiddle.
Note: SQL Server 2022 adds a GENERATE_SERIES() function that can make this query much shorter.
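The expand-then-aggregate idea can also be sketched in plain Python, which may help when checking the expected totals by hand. This is only an illustration under assumptions: the name daily_totals is hypothetical, and the sketch expands each source row directly instead of building a per-category calendar first, which gives the same totals for days covered by at least one row.

```python
from collections import defaultdict
from datetime import date, timedelta

# (date_begin, date_end, category, value) rows from the question
starter = [
    (date(2021, 1, 1), date(2021, 1, 3), 1, 0.2),
    (date(2021, 1, 2), date(2021, 1, 3), 1, 0.1),
    (date(2021, 1, 1), date(2021, 1, 2), 2, 0.3),
]


def daily_totals(rows):
    """Sum value per (day, category) for every day inside each inclusive range."""
    totals = defaultdict(float)
    for begin, end, category, value in rows:
        day = begin
        while day <= end:  # both endpoints count, like BETWEEN
            totals[(day, category)] += value
            day += timedelta(days=1)
    return dict(sorted(totals.items()))


result = daily_totals(starter)
for (day, category), value in result.items():
    print(day, category, round(value, 2))
```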

Snowflake Split bigger time interval in monthly intervals

I have a table which has different interval invoice. Please see sample data below:
Invoice_Start_Date Invoice_End_Date Amount
1/1/2019 2/1/2019 12
1/1/2019 1/1/2020 84
1/1/2019 1/1/2021 140
I need to split this data into monthly invoices. In this case the first record stays as is. The second record should be split into 12 records with an amount of 84/12 each.
Third record should be split into 24 records with amount of 140/24 for each record.
Expected Output:
Invoice_Start_Date Invoice_End_Date Amount
1/1/2019 2/1/2019 12
1/1/2019 2/1/2019 7
2/1/2019 3/1/2019 7
3/1/2019 4/1/2019 7
4/1/2019 5/1/2019 7
.........etc
Can someone please advise. I was thinking of writing many union statements ( one for each month but I realized my interval can be 12 months or 24 months etc. so it won't work)
One method is a recursive CTE:
with recursive cte as (
select Invoice_Start_Date, Invoice_End_Date,
Amount / datediff(month, Invoice_Start_Date, Invoice_End_Date) as month_amount
from t
union all
select dateadd(month, 1, invoice_start_date), invoice_end_date,
month_amount
from cte
where dateadd(month, 1, invoice_start_date) < invoice_end_date
)
select invoice_start_date,
dateadd(month, 1, invoice_start_date) as invoice_end_date,
month_amount
from cte;
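As a cross-check of the expected output, the month-by-month split can be sketched in Python. The helper names add_months and split_monthly are hypothetical, and the sketch assumes that invoices start on a day of the month that exists in every month (such as the 1st) and cover at least one whole month.

```python
from datetime import date


def add_months(d: date, n: int) -> date:
    """Shift d by n months; assumes the day-of-month exists in the target month."""
    index = d.year * 12 + (d.month - 1) + n
    return date(index // 12, index % 12 + 1, d.day)


def split_monthly(start: date, end: date, amount: float):
    """Return one (start, end, amount) row per month between start and end."""
    months = (end.year - start.year) * 12 + (end.month - start.month)
    monthly = amount / months  # assumes months >= 1
    return [(add_months(start, i), add_months(start, i + 1), monthly) for i in range(months)]


pieces = split_monthly(date(2019, 1, 1), date(2020, 1, 1), 84)
print(len(pieces), pieces[0], pieces[-1])
```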
Recursive CTEs can be slow for large data sets. This produces one row per month for each invoice with a simple join:
With TEMPTBL as (
select round(sqrt(row_number() over (order by null)*2)) as rnum -- each value n appears exactly n times
from table(generator(rowcount => 10000)) --10000 Allows for up to 140 months difference.
order by 1 )
select Invoice_Start_Date, Invoice_End_Date,
datediff(month, Invoice_Start_Date, Invoice_End_Date) as month_diff,
Amount / datediff(month, Invoice_Start_Date, Invoice_End_Date) as month_amount
from INVOICE t, TEMPTBL y
-- a select-list alias such as month_diff is not a column of t, so repeat the expression here
where datediff(month, t.Invoice_Start_Date, t.Invoice_End_Date) = y.rnum;
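The round(sqrt(2 * row_number)) expression is the less obvious part of that query. A short Python check, offered as a sketch and not as part of the answer, suggests why it works: the sequence contains each value n exactly n times (1, 2, 2, 3, 3, 3, ...), so joining on the month difference repeats an invoice once per month. The name rnum mirrors the SQL alias.

```python
import math
from collections import Counter


def rnum(k: int) -> int:
    """Value produced for the k-th generated row (k starts at 1)."""
    return round(math.sqrt(2 * k))


# Count how often each value occurs among 10000 generated rows.
counts = Counter(rnum(k) for k in range(1, 10001))
print([rnum(k) for k in range(1, 11)])
print(counts[140], counts[141])
```

With 10000 rows, every value up to 140 is complete, while 141 is cut short, which agrees with the comment in the query about a 140-month limit.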

SQL/BIGQUERY Running Average with GAPs in Dates

I'm having trouble with a moving average in BigQuery/SQL. I have a table 'SCORES' and I need to compute a 30-day moving average per user. The problem is that my dates aren't sequential, i.e. there are gaps in them.
Below is my current code:
SELECT user, date,
AVG(score) OVER (PARTITION BY user ORDER BY date)
FROM SCORES;
I don't know how to add the date restrictions into that line or if this is even possible.
My current table looks like this, but of course with a lot more users:
user date score
AA 13/02/2018 2.00
AA 15/02/2018 3.00
AA 17/02/2018 4.00
AA 01/03/2018 5.00
AA 28/03/2018 6.00
Then I need it to become, this:
user date score 30D Avg
AA 13/02/2018 2.00 2.00
AA 15/02/2018 3.00 2.50
AA 17/02/2018 4.00 3.00
AA 01/03/2018 5.00 3.50
AA 28/03/2018 6.00 5.50
In the last row, the average only reaches back one row because of the date window (up to 30 days backwards). Is there any way to implement this in SQL, or am I asking for too much?
You want to use range between. For this, you need an integer, so:
select s.*,
avg(score) over (partition by user
order by days
range between 29 preceding and current row
) as avg_30day
from (select s.*, date_diff(s.date, date('2000-01-01'), day) as days
from scores s
) s;
An alternative to date_diff() is unix_date():
select s.*,
avg(score) over (partition by user
order by unix_days
range between 29 preceding and current row
) as avg_30day
from (select s.*, unix_date(s.date) as unix_days
from scores s
) s;
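To see what a RANGE frame of 29 PRECEDING means on the sample data, here is a small Python sketch. It handles a single user only, and the name rolling_avg is hypothetical. For each row it averages every score whose date lies within the current date and the 29 days before it, however many rows that is.

```python
from datetime import date, timedelta

# Scores for user AA from the question
scores = [
    (date(2018, 2, 13), 2.0),
    (date(2018, 2, 15), 3.0),
    (date(2018, 2, 17), 4.0),
    (date(2018, 3, 1), 5.0),
    (date(2018, 3, 28), 6.0),
]


def rolling_avg(rows, window_days=30):
    """Average of scores dated within the last window_days days, current day included."""
    out = []
    for day, _ in rows:
        earliest = day - timedelta(days=window_days - 1)  # 29 days back for a 30-day window
        window = [s for d, s in rows if earliest <= d <= day]
        out.append((day, sum(window) / len(window)))
    return out


averages = rolling_avg(scores)
for day, avg in averages:
    print(day, avg)
```

The window is defined by date distance, not by row count, which is why gaps in the dates do not distort the average.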
Below is for BigQuery Standard SQL
#standardSQL
SELECT *,
AVG(score) OVER (
PARTITION BY user
ORDER BY UNIX_DATE(PARSE_DATE('%d/%m/%Y', date))
RANGE BETWEEN 29 PRECEDING AND CURRENT ROW
) AS avg_30day
FROM `project.dataset.scores`
You can test / play with above using dummy data from your question
#standardSQL
WITH `project.dataset.scores` AS (
SELECT 'AA' user, '13/02/2018' date, 2.00 score UNION ALL
SELECT 'AA', '15/02/2018', 3.00 UNION ALL
SELECT 'AA', '17/02/2018', 4.00 UNION ALL
SELECT 'AA', '01/03/2018', 5.00 UNION ALL
SELECT 'AA', '28/03/2018', 6.00
)
SELECT *,
AVG(score) OVER (
PARTITION BY user
ORDER BY UNIX_DATE(PARSE_DATE('%d/%m/%Y', date))
RANGE BETWEEN 29 PRECEDING AND CURRENT ROW
) AS avg_30day
FROM `project.dataset.scores`
result
Row user date score avg_30day
1 AA 13/02/2018 2.0 2.0
2 AA 15/02/2018 3.0 2.5
3 AA 17/02/2018 4.0 3.0
4 AA 01/03/2018 5.0 3.5
5 AA 28/03/2018 6.0 5.5

Find From/To Dates across multiple rows - SQL Postgres

I want to be able to "book" within range of dates, but you can't book across gaps of days. So booking across multiple rates is fine as long as they are contiguous.
I am happy to change data structure/index, if there are better ways of storing start/end ranges.
So far I have a "rates" table which contains Start/End Periods of time with a daily rate.
e.g. Rates Table.
ID Price From To
1 75.00 2015-04-12 2016-04-15
2 100.00 2016-04-16 2016-04-17
3 50.00 2016-04-18 2016-04-30
For the above data I would want to return:
From To
2015-04-12 2016-04-30
For simplicity's sake, it is safe to assume that the ranges do not overlap. For contiguous ranges, To is always 1 day before the next From.
For the case there is only 1 row, I would want it to return the From/To of that single row.
Also to clarify if I had the following data:
ID Price From To
1 75.00 2015-04-12 2016-04-15
2 100.00 2016-04-17 2016-04-18
3 50.00 2016-04-19 2016-04-30
4 50.00 2016-05-01 2016-05-21
Meaning where there is a gap >= 1 day it would count as a separate range.
In which case I would expect the following:
From To
2015-04-12 2016-04-15
2016-04-17 2016-05-21
Edit 1
After playing around I have come up with the following SQL which seems to work. Although I'm not sure if there are better ways/issues with it?
WITH grouped_rates AS
(SELECT
from_date,
to_date,
SUM(grp_start) OVER (ORDER BY from_date, to_date) AS grp
FROM (SELECT
gite_id,
from_date,
to_date,
CASE WHEN (from_date - INTERVAL '1 DAY') = lag(to_date)
OVER (ORDER BY from_date, to_date)
THEN 0
ELSE 1
END grp_start
FROM rates
GROUP BY from_date, to_date) AS start_groups)
SELECT
min(from_date) from_date,
max(to_date) to_date
FROM grouped_rates
GROUP BY grp;
This is identifying contiguous overlapping groups in the data. One approach is to find where each group begins and then do a cumulative sum. The following query adds a flag indicating if a row starts a group:
select r.*,
(case when not exists (select 1
from rates r2
where (r2.from_date < r.from_date and r2.to_date >= r.from_date - 1) or
(r2.from_date = r.from_date and r2.id < r.id)
)
then 1 else 0 end) as StartFlag
from rates r;
The column names from_date and to_date follow the query in the question, since from and to are reserved words. The first condition treats an earlier range as connected when it overlaps the row or ends the day before it starts. The or in the correlation condition handles intervals in a group that share the same start date.
You can then do a cumulative sum on this flag and aggregate by that sum:
with r as (
select r.*,
(case when not exists (select 1
from rates r2
where (r2.from_date < r.from_date and r2.to_date >= r.from_date - 1) or
(r2.from_date = r.from_date and r2.id < r.id)
)
then 1 else 0 end) as StartFlag
from rates r
)
select min(from_date), max(to_date)
from (select r.*,
sum(r.StartFlag) over (order by r.from_date) as grp
from r
) r
group by grp;
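The start-flag and running-sum technique can be traced in a few lines of Python. This is a sketch under assumptions, with a hypothetical function name merge_contiguous: the ranges are inclusive, and a range belongs to the current group when it starts no later than one day after the latest end seen so far.

```python
from datetime import date, timedelta

# Second sample from the question: one gap between 2016-04-15 and 2016-04-17
rates = [
    (date(2015, 4, 12), date(2016, 4, 15)),
    (date(2016, 4, 17), date(2016, 4, 18)),
    (date(2016, 4, 19), date(2016, 4, 30)),
    (date(2016, 5, 1), date(2016, 5, 21)),
]


def merge_contiguous(ranges):
    """Group inclusive date ranges that overlap or touch, returning (from, to) per group."""
    groups = {}
    grp = 0
    latest_end = None
    for start, end in sorted(ranges):
        # A row starts a new group when it begins more than one day after the latest end so far.
        start_flag = latest_end is None or start > latest_end + timedelta(days=1)
        grp += start_flag  # running sum of the flags gives the group number
        first, last = groups.get(grp, (start, end))
        groups[grp] = (min(first, start), max(last, end))
        latest_end = end if latest_end is None else max(latest_end, end)
    return list(groups.values())


merged = merge_contiguous(rates)
print(merged)
```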
CREATE TABLE prices( id INTEGER NOT NULL PRIMARY KEY
, price MONEY
, date_from DATE NOT NULL
, date_upto DATE NOT NULL
);
-- some data (upper limit is EXCLUSIVE)
INSERT INTO prices(id, price, date_from, date_upto) VALUES
( 1, 75.00, '2015-04-12', '2016-04-16' )
,( 2, 100.00, '2016-04-17', '2016-04-19' )
,( 3, 50.00, '2016-04-19', '2016-05-01' )
,( 4, 50.00, '2016-05-01', '2016-05-22' )
;
-- SELECT * FROM prices;
-- Recursive query to "connect the dots"
WITH RECURSIVE rrr AS (
SELECT date_from, date_upto
, 1 AS nperiod
FROM prices p0
WHERE NOT EXISTS (SELECT * FROM prices nx WHERE nx.date_upto = p0.date_from) -- no preceding segment
UNION ALL
SELECT r.date_from, p1.date_upto
, 1+r.nperiod AS nperiod
FROM prices p1
JOIN rrr r ON p1.date_from = r.date_upto
)
SELECT * FROM rrr r
WHERE NOT EXISTS (SELECT * FROM prices nx WHERE nx.date_from = r.date_upto) -- no following segment
;
Result:
date_from | date_upto | nperiod
------------+------------+---------
2015-04-12 | 2016-04-16 | 1
2016-04-17 | 2016-05-22 | 3
(2 rows)

SQL spread month value into weeks

I have a table with values by month, and I want to spread these values across weeks. A week that spans two months needs to take part of the value from each month, weighted by the number of its days that fall in each month.
For example I have the table with a different price of steel by month
Product Month Price
------------------------------------
Steel 1/Jan/2014 100
Steel 1/Feb/2014 200
Steel 1/Mar/2014 300
I need to convert it into weeks as follows
Product Week Price
-------------------------------------------
Steel 06-Jan-14 100
Steel 13-Jan-14 100
Steel 20-Jan-14 100
Steel 27-Jan-14 128.57
Steel 03-Feb-14 200
Steel 10-Feb-14 200
Steel 17-Feb-14 200
As you see above, the week that overlaps between Jan and Feb needs to be calculated as follows
(100*5/7)+(200*2/7)
This takes into account that the week of the 27th has 5 days that fall in Jan and 2 in Feb.
Is there any possible way to create a query in SQL that would achieve this?
I tried the following
First attempt:
select
WD.week,
PM.PRICE,
DATEADD(m,1,PM.Month),
SUM(PM.PRICE/7) * COUNT(*)
from
( select '2014-1-1' as Month, 100 as PRICE
union
select '2014-2-1' as Month, 200 as PRICE
)PM
join
( select '2014-1-20' as week
union
select '2014-1-27' as week
union
select '2014-2-3' as week
)WD
ON WD.week>=PM.Month
AND WD.week < DATEADD(m,1,PM.Month)
group by
WD.week,PM.PRICE, DATEADD(m,1,PM.Month)
This gives me the following
week PRICE
2014-1-20 100 2014-02-01 00:00:00.000 14
2014-1-27 100 2014-02-01 00:00:00.000 14
2014-2-3 200 2014-03-01 00:00:00.000 28
I tried also the following
;with x as (
select price,
datepart(week,dateadd(day, n.n-2, t1.month)) wk,
dateadd(day, n.n-1, t1.month) dt
from
(select '2014-1-1' as Month, 100 as PRICE
union
select '2014-2-1' as Month, 200 as PRICE) t1
cross apply (
select datediff(day, t.month, dateadd(month, 1, t.month)) nd
from
(select '2014-1-1' as Month, 100 as PRICE
union
select '2014-2-1' as Month, 200 as PRICE)
t
where t1.month = t.month) ndm
inner join
(SELECT (a.Number * 256) + b.Number AS N FROM
(SELECT number FROM master..spt_values WHERE type = 'P' AND number <= 255) a (Number),
(SELECT number FROM master..spt_values WHERE type = 'P' AND number <= 255) b (Number)) n --numbers
on n.n <= ndm.nd
)
select min(dt) as week, cast(sum(price)/count(*) as decimal(9,2)) as price
from x
group by wk
having count(*) = 7
order by wk
This gives me the following
week price
2014-01-07 00:00:00.000 100.00
2014-01-14 00:00:00.000 100.00
2014-01-21 00:00:00.000 100.00
2014-02-04 00:00:00.000 200.00
2014-02-11 00:00:00.000 200.00
2014-02-18 00:00:00.000 200.00
Thanks
If you have a calendar table it's a simple join:
SELECT
product,
calendar_date - (day_of_week-1) AS week,
SUM(price / 7.0) AS price
FROM prices AS p
JOIN calendar AS c
ON c.calendar_date >= month
AND c.calendar_date < DATEADD(m,1,month)
GROUP BY product,
calendar_date - (day_of_week-1)
This could be further simplified by joining only to Mondays and then doing some more date arithmetic in a CASE to get 7 or fewer days.
Edit:
Your last query returned Jan 31st twice; you need to remove the = from the join condition so that it reads n.n < ndm.nd. And since you seem to work with ISO weeks, you had better change the DATEPART to avoid problems with different DATEFIRST settings.
Based on your last query I created a fiddle.
;with x as (
select price,
datepart(isowk,dateadd(day, n.n, t1.month)) wk,
dateadd(day, n.n-1, t1.month) dt
from
(select '2014-1-1' as Month, 100.00 as PRICE
union
select '2014-2-1' as Month, 200.00 as PRICE) t1
cross apply (
select datediff(day, t.month, dateadd(month, 1, t.month)) nd
from
(select '2014-1-1' as Month, 100.00 as PRICE
union
select '2014-2-1' as Month, 200.00 as PRICE)
t
where t1.month = t.month) ndm
inner join
(SELECT (a.Number * 256) + b.Number AS N FROM
(SELECT number FROM master..spt_values WHERE type = 'P' AND number <= 255) a (Number),
(SELECT number FROM master..spt_values WHERE type = 'P' AND number <= 255) b (Number)) n --numbers
on n.n < ndm.nd
) select min(dt) as week, cast(sum(price)/count(*) as decimal(9,2)) as price
from x
group by wk
having count(*) = 7
order by wk
Of course the dates might be from multiple years, so you need to GROUP BY the year, too.
Actually, you need to spread it over days, and then get the averages by week. To get the days we'll use the Numbers table.
;with x as (
select product, price,
datepart(week,dateadd(day, n.n-2, t1.month)) wk,
dateadd(day, n.n-1, t1.month) dt
from #t t1
cross apply (
select datediff(day, t.month, dateadd(month, 1, t.month)) nd
from #t t
where t1.month = t.month and t1.product = t.product) ndm
inner join numbers n on n.n <= ndm.nd
)
select product, min(dt) as week, cast(sum(price)/count(*) as decimal(9,2)) as price
from x
group by product, wk
having count(*) = 7
order by product, wk
The result of datepart(week,dateadd(day, n.n-2, t1.month)) expression depends on SET DATEFIRST so you might need to adjust accordingly.
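The spread-over-days-then-average-by-week approach can be verified against the expected output with a short Python sketch. It assumes weeks start on Monday (as in the question's sample) and keeps only weeks where all 7 days have a price, mirroring the HAVING COUNT(*) = 7 filter. The name weekly_prices is hypothetical.

```python
import calendar
from collections import defaultdict
from datetime import date, timedelta

# Monthly steel prices from the question, keyed by the first day of the month
monthly_price = {date(2014, 1, 1): 100.0, date(2014, 2, 1): 200.0}


def weekly_prices(prices):
    """Expand each month to days, then average the daily prices per Monday-based week."""
    by_week = defaultdict(list)
    for month_start, price in prices.items():
        ndays = calendar.monthrange(month_start.year, month_start.month)[1]
        for i in range(ndays):
            day = month_start + timedelta(days=i)
            monday = day - timedelta(days=day.weekday())  # weekday() is 0 on Monday
            by_week[monday].append(price)
    # Keep only complete weeks, like HAVING COUNT(*) = 7
    return {wk: round(sum(p) / 7, 2) for wk, p in sorted(by_week.items()) if len(p) == 7}


weeks = weekly_prices(monthly_price)
for wk, price in weeks.items():
    print(wk, price)
```

The week of 27 January collects five days at 100 and two days at 200, which gives (100*5 + 200*2)/7, or 128.57 after rounding.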