Add Missing Dates in Running Total - sql

If I have a table in SQL SERVER:
DATE ITEM Quantity_Received
1/1/2016 Hammer 20
1/3/2016 Hammer 50
1/5/2016 Hammer 100
...
And I want to output:
DATE ITEM Quantity_Running
1/1/2016 Hammer 20
1/2/2016 Hammer 20
1/3/2016 Hammer 70
1/4/2016 Hammer 70
1/5/2016 Hammer 120
Where Quantity_Running is just a cumulative sum. I am confused on how I would add the missing dates to the output table that do not exist in the initial table. Note, I would need to do this for many items and would probably have a temp-table with the dates I want.
Thank you!
EDIT
And is there a way to do it such that you use an inner join?
SELECT TD.Date,
FT1.Item,
SUM(FT2.QTY) AS Cumulative
FROM tempDates TD
LEFT JOIN FutureOrders FT1
ON TD.SETTLE_DATE = FT1.SETTLE_DATE
INNER JOIN FutureOrders FT2
ON FT1.Settle_Date < ft2.Settle_Date AND ft1.ITEM= ft2.ITEM
GROUP BY ft1.ITEM, ft1.Settle_Date
tempDates is a CTE that has the dates I want. Because I am then returning NULL values. And sorry, but to make it a bit more complicated I actually want:
DATE ITEM Quantity_Running
1/1/2016 Hammer 150
1/2/2016 Hammer 100
1/3/2016 Hammer 100
1/4/2016 Hammer 0
1/5/2016 Hammer 0
Thought I would be able to get the answer by myself based on my simpler question.

Generate all the dates (for a specified range of dates) using a recursive cte. Thereafter left join on the generated dates cte and get the running sum for missing dates as well.
with dates_cte as
(select cast('2016-01-01' as date) dt
union all
select dateadd(dd,1,dt) from dates_cte where dt < '2016-01-31' --change this date to the maximum date needed
)
select dt,item,sum(quantity) over(partition by item order by dt) quantity_running
from (select d.dt,i.item,
coalesce(t.quantity_received,0) quantity
from dates_cte d
cross join (select distinct item from t) i
left join t on t.dt=d.dt and i.item=t.item) x
order by 2,1
Edit: To get the reverse cumulative sum excluding the current row, use
with dates_cte as
(select cast('2016-01-01' as date) dt
union all
select dateadd(dd,1,dt) from dates_cte where dt < '2016-01-31' --change this date to the maximum date needed
)
select dt,item,
sum(quantity) over(partition by item order by dt desc rows between unbounded preceding and 1 preceding) quantity_running
from (select d.dt,i.item,
coalesce(t.quantity_received,0) quantity
from dates_cte d
cross join (select distinct item from t) i
left join t on t.dt=d.dt and i.item=t.item) x
order by 2,1

Related

SQL: How to create a daily view based on different time intervals using SQL logic?

Here is an example:
Id|price|Date
1|2|2022-05-21
1|3|2022-06-15
1|2.5|2022-06-19
Needs to look like this:
Id|Date|price
1|2022-05-21|2
1|2022-05-22|2
1|2022-05-23|2
...
1|2022-06-15|3
1|2022-06-16|3
1|2022-06-17|3
1|2022-06-18|3
1|2022-06-19|2.5
1|2022-06-20|2.5
...
Until today
1|2022-08-30|2.5
I tried using the lag(price) over (partition by id order by date)
But i can't get it right.
I'm not familiar with Azure, but it looks like you need to use a calendar table, or generate missing dates using a recursive CTE.
To get started with a recursive CTE, you can generate line numbers for each id (assuming multiple id values) in the source data ordered by date. These rows with row number equal to 1 (with the minimum date value for the corresponding id) will be used as the starting point for the recursion. Then you can use the DATEADD function to generate the row for the next day. To use the price values ​​from the original data, you can use a subquery to get the price for this new date, and if there is no such value (no row for this date), use the previous price value from CTE (use the COALESCE function for this).
For SQL Server query can look like this
WITH cte AS (
SELECT
id,
date,
price
FROM (
SELECT
*,
ROW_NUMBER() OVER (PARTITION BY id ORDER BY date) AS rn
FROM tbl
) t
WHERE rn = 1
UNION ALL
SELECT
cte.id,
DATEADD(d, 1, cte.date),
COALESCE(
(SELECT tbl.price
FROM tbl
WHERE tbl.id = cte.id AND tbl.date = DATEADD(d, 1, cte.date)),
cte.price
)
FROM cte
WHERE DATEADD(d, 1, cte.date) <= GETDATE()
)
SELECT * FROM cte
ORDER BY id, date
OPTION (MAXRECURSION 0)
Note that I added OPTION (MAXRECURSION 0) to make the recursion run through all the steps, since the default value is 100, this is not enough to complete the recursion.
db<>fiddle here
The same approach for MySQL (you need MySQL of version 8.0 to use CTE)
WITH RECURSIVE cte AS (
SELECT
id,
date,
price
FROM (
SELECT
*,
ROW_NUMBER() OVER (PARTITION BY id ORDER BY date) AS rn
FROM tbl
) t
WHERE rn = 1
UNION ALL
SELECT
cte.id,
DATE_ADD(cte.date, interval 1 day),
COALESCE(
(SELECT tbl.price
FROM tbl
WHERE tbl.id = cte.id AND tbl.date = DATE_ADD(cte.date, interval 1 day)),
cte.price
)
FROM cte
WHERE DATE_ADD(cte.date, interval 1 day) <= NOW()
)
SELECT * FROM cte
ORDER BY id, date
db<>fiddle here
Both queries produces the same results, the only difference is the use of the engine's specific date functions.
For MySQL versions below 8.0, you can use a calendar table since you don't have CTE support and can't generate the required date range.
Assuming there is a column in the calendar table to store date values ​​(let's call it date for simplicity) you can use the CROSS JOIN operator to generate date ranges for the id values in your table that will match existing dates. Then you can use a subquery to get the latest price value from the table which is stored for the corresponding date or before it.
So the query would be like this
SELECT
d.id,
d.date,
(SELECT
price
FROM tbl
WHERE tbl.id = d.id AND tbl.date <= d.date
ORDER BY tbl.date DESC
LIMIT 1
) price
FROM (
SELECT
t.id,
c.date
FROM calendar c
CROSS JOIN (SELECT DISTINCT id FROM tbl) t
WHERE c.date BETWEEN (
SELECT
MIN(date) min_date
FROM tbl
WHERE tbl.id = t.id
)
AND NOW()
) d
ORDER BY id, date
Using my pseudo-calendar table with date values ranging from 2022-05-20 to 2022-05-30 and source data in that range, like so
id
price
date
1
2
2022-05-21
1
3
2022-05-25
1
2.5
2022-05-28
2
10
2022-05-25
2
100
2022-05-30
the query produces following results
id
date
price
1
2022-05-21
2
1
2022-05-22
2
1
2022-05-23
2
1
2022-05-24
2
1
2022-05-25
3
1
2022-05-26
3
1
2022-05-27
3
1
2022-05-28
2.5
1
2022-05-29
2.5
1
2022-05-30
2.5
2
2022-05-25
10
2
2022-05-26
10
2
2022-05-27
10
2
2022-05-28
10
2
2022-05-29
10
2
2022-05-30
100
db<>fiddle here

MS-SQL how to add missing month in a table values

I have a table with the following entries,
ID
date
Frequency
1
'2012-04-30'
5
1
'2012-06-30'
4
1
'2012-07-31'
25
2
'2012-04-30'
7
2
'2012-05-31'
4
2
'2012-06-30'
1
2
'2012-07-31'
6
I need to add missing month and the date which gets added should be the last date of that month with frequency value as 0.
The expected output is
ID
date
Frequency
1
'2012-04-30'
5
1
'2012-05-31'
0
1
'2012-06-30'
4
1
'2012-07-31'
25
2
'2012-04-30'
7
2
'2012-05-31'
4
2
'2012-06-30'
1
2
'2012-07-31'
6
I need to add missing month and the date which gets added should be the last date of that
I would suggest recursive CTEs:
with cte as (
select id, date, frequency,
lead(date) over (partition by id order by date) as next_date
from t
union all
select id, eomonth(date, 1), 0, next_date
from cte
where eomonth(date, 1) < dateadd(day, -1, next_date)
)
select id, date, frequency
from cte
order by id, date;
The anchor part of the CTE calculates the end date for a given row. The recursive part then just keeps adding months to fill in the missing rows (and none if there are none). The use of eomonth(date, 1) is just a handy way of getting the last day of the next month.
Here is a db<>fiddle.
If you have all dates in the table, you can also use cross join to generate the rows and then left join to bring in the existing data:
select i.id, d.date, coalesce(t.frequency, 0) as frequency
from (select distinct id from t) i cross join
(select distinct date from t) d left join
t
on i.id = t.id and d.date = t.date
order by i.id, d.date;
If you have a large amount of data, you can compare performance. This may be a case where a recursive CTE is faster than alternative methods.

Find From/To Dates across multiple rows - SQL Postgres

I want to be able to "book" within range of dates, but you can't book across gaps of days. So booking across multiple rates is fine as long as they are contiguous.
I am happy to change data structure/index, if there are better ways of storing start/end ranges.
So far I have a "rates" table which contains Start/End Periods of time with a daily rate.
e.g. Rates Table.
ID Price From To
1 75.00 2015-04-12 2016-04-15
2 100.00 2016-04-16 2016-04-17
3 50.00 2016-04-18 2016-04-30
For the above data I would want to return:
From To
2015-04-12 2016-4-30
For simplicity sake it is safe to assume that dates are safely consecutive. For contiguous dates To is always 1 day before from.
For the case there is only 1 row, I would want it to return the From/To of that single row.
Also to clarify if I had the following data:
ID Price From To
1 75.00 2015-04-12 2016-04-15
2 100.00 2016-04-17 2016-04-18
3 50.00 2016-04-19 2016-04-30
4 50.00 2016-05-01 2016-05-21
Meaning where there is a gap >= 1 day it would count as a separate range.
In which case I would expect the following:
From To
2015-04-12 2016-04-15
2015-04-17 2016-05-21
Edit 1
After playing around I have come up with the following SQL which seems to work. Although I'm not sure if there are better ways/issues with it?
WITH grouped_rates AS
(SELECT
from_date,
to_date,
SUM(grp_start) OVER (ORDER BY from_date, to_date) group
FROM (SELECT
gite_id,
from_date,
to_date,
CASE WHEN (from_date - INTERVAL '1 DAY') = lag(to_date)
OVER (ORDER BY from_date, to_date)
THEN 0
ELSE 1
END grp_start
FROM rates
GROUP BY from_date, to_date) AS start_groups)
SELECT
min(from_date) from_date,
max(to_date) to_date
FROM grouped_rates
GROUP BY grp;
This is identifying contiguous overlapping groups in the data. One approach is to find where each group begins and then do a cumulative sum. The following query adds a flag indicating if a row starts a group:
select r.*,
(case when not exists (select 1
from rates r2
where r2.from < r.from and r2.to >= r.to or
(r2.from = r.from and r2.id < r.id)
)
then 1 else 0 end) as StartFlag
from rate r;
The or in the correlation condition is to handle the situation where intervals that define a group overlap on the start date for the interval.
You can then do a cumulative sum on this flag and aggregate by that sum:
with r as (
select r.*,
(case when not exists (select 1
from rates r2
where (r2.from < r.from and r2.to >= r.to) or
(r2.from = r.from and r2.id < r.id)
)
then 1 else 0 end) as StartFlag
from rate r
)
select min(from), max(to)
from (select r.*,
sum(r.StartFlag) over (order by r.from) as grp
from r
) r
group by grp;
CREATE TABLE prices( id INTEGER NOT NULL PRIMARY KEY
, price MONEY
, date_from DATE NOT NULL
, date_upto DATE NOT NULL
);
-- some data (upper limit is EXCLUSIVE)
INSERT INTO prices(id, price, date_from, date_upto) VALUES
( 1, 75.00, '2015-04-12', '2016-04-16' )
,( 2, 100.00, '2016-04-17', '2016-04-19' )
,( 3, 50.00, '2016-04-19', '2016-05-01' )
,( 4, 50.00, '2016-05-01', '2016-05-22' )
;
-- SELECT * FROM prices;
-- Recursive query to "connect the dots"
WITH RECURSIVE rrr AS (
SELECT date_from, date_upto
, 1 AS nperiod
FROM prices p0
WHERE NOT EXISTS (SELECT * FROM prices nx WHERE nx.date_upto = p0.date_from) -- no preceding segment
UNION ALL
SELECT r.date_from, p1.date_upto
, 1+r.nperiod AS nperiod
FROM prices p1
JOIN rrr r ON p1.date_from = r.date_upto
)
SELECT * FROM rrr r
WHERE NOT EXISTS (SELECT * FROM prices nx WHERE nx.date_from = r.date_upto) -- no following segment
;
Result:
date_from | date_upto | nperiod
------------+------------+---------
2015-04-12 | 2016-04-16 | 1
2016-04-17 | 2016-05-22 | 3
(2 rows)

SQL spread month value into weeks

I have a table where I have values by month and I want to spread these values by week, taking into account that weeks that spread into two month need to take part of the value of each of the month and weight on the number of days that correspond to each month.
For example I have the table with a different price of steel by month
Product Month Price
------------------------------------
Steel 1/Jan/2014 100
Steel 1/Feb/2014 200
Steel 1/Mar/2014 300
I need to convert it into weeks as follows
Product Week Price
-------------------------------------------
Steel 06-Jan-14 100
Steel 13-Jan-14 100
Steel 20-Jan-14 100
Steel 27-Jan-14 128.57
Steel 03-Feb-14 200
Steel 10-Feb-14 200
Steel 17-Feb-14 200
As you see above, the week that overlaps between Jan and Feb needs to be calculated as follows
(100*5/7)+(200*2/7)
This takes into account tha the week of the 27th has 5 days that fall into Jan and 2 into Feb.
Is there any possible way to create a query in SQL that would achieve this?
I tried the following
First attempt:
select
WD.week,
PM.PRICE,
DATEADD(m,1,PM.Month),
SUM(PM.PRICE/7) * COUNT(*)
from
( select '2014-1-1' as Month, 100 as PRICE
union
select '2014-2-1' as Month, 200 as PRICE
)PM
join
( select '2014-1-20' as week
union
select '2014-1-27' as week
union
select '2014-2-3' as week
)WD
ON WD.week>=PM.Month
AND WD.week < DATEADD(m,1,PM.Month)
group by
WD.week,PM.PRICE, DATEADD(m,1,PM.Month)
This gives me the following
week PRICE
2014-1-20 100 2014-02-01 00:00:00.000 14
2014-1-27 100 2014-02-01 00:00:00.000 14
2014-2-3 200 2014-03-01 00:00:00.000 28
I tried also the following
;with x as (
select price,
datepart(week,dateadd(day, n.n-2, t1.month)) wk,
dateadd(day, n.n-1, t1.month) dt
from
(select '2014-1-1' as Month, 100 as PRICE
union
select '2014-2-1' as Month, 200 as PRICE) t1
cross apply (
select datediff(day, t.month, dateadd(month, 1, t.month)) nd
from
(select '2014-1-1' as Month, 100 as PRICE
union
select '2014-2-1' as Month, 200 as PRICE)
t
where t1.month = t.month) ndm
inner join
(SELECT (a.Number * 256) + b.Number AS N FROM
(SELECT number FROM master..spt_values WHERE type = 'P' AND number <= 255) a (Number),
(SELECT number FROM master..spt_values WHERE type = 'P' AND number <= 255) b (Number)) n --numbers
on n.n <= ndm.nd
)
select min(dt) as week, cast(sum(price)/count(*) as decimal(9,2)) as price
from x
group by wk
having count(*) = 7
order by wk
This gimes me the following
week price
2014-01-07 00:00:00.000 100.00
2014-01-14 00:00:00.000 100.00
2014-01-21 00:00:00.000 100.00
2014-02-04 00:00:00.000 200.00
2014-02-11 00:00:00.000 200.00
2014-02-18 00:00:00.000 200.00
Thanks
If you have a calendar table it's a simple join:
SELECT
product,
calendar_date - (day_of_week-1) AS week,
SUM(price/7) * COUNT(*)
FROM prices AS p
JOIN calendar AS c
ON c.calendar_date >= month
AND c.calendar_date < DATEADD(m,1,month)
GROUP BY product,
calendar_date - (day_of_week-1)
This could be further simplified to join only to mondays and then do some more date arithmetic in a CASE to get 7 or less days.
Edit:
Your last query returned jan 31st two times, you need to remove the =from on n.n < ndm.nd. And as you seem to work with ISO weeks you better change the DATEPART to avoid problems with different DATEFIRST settings.
Based on your last query I created a fiddle.
;with x as (
select price,
datepart(isowk,dateadd(day, n.n, t1.month)) wk,
dateadd(day, n.n-1, t1.month) dt
from
(select '2014-1-1' as Month, 100.00 as PRICE
union
select '2014-2-1' as Month, 200.00 as PRICE) t1
cross apply (
select datediff(day, t.month, dateadd(month, 1, t.month)) nd
from
(select '2014-1-1' as Month, 100.00 as PRICE
union
select '2014-2-1' as Month, 200.00 as PRICE)
t
where t1.month = t.month) ndm
inner join
(SELECT (a.Number * 256) + b.Number AS N FROM
(SELECT number FROM master..spt_values WHERE type = 'P' AND number <= 255) a (Number),
(SELECT number FROM master..spt_values WHERE type = 'P' AND number <= 255) b (Number)) n --numbers
on n.n < ndm.nd
) select min(dt) as week, cast(sum(price)/count(*) as decimal(9,2)) as price
from x
group by wk
having count(*) = 7
order by wk
Of course the dates might be from multiple years, so you need to GROUP BY by the year, too.
Actually, you need to spred it over days, and then get the averages by week. To get the days we'll use the Numbers table.
;with x as (
select product, price,
datepart(week,dateadd(day, n.n-2, t1.month)) wk,
dateadd(day, n.n-1, t1.month) dt
from #t t1
cross apply (
select datediff(day, t.month, dateadd(month, 1, t.month)) nd
from #t t
where t1.month = t.month and t1.product = t.product) ndm
inner join numbers n on n.n <= ndm.nd
)
select product, min(dt) as week, cast(sum(price)/count(*) as decimal(9,2)) as price
from x
group by product, wk
having count(*) = 7
order by product, wk
The result of datepart(week,dateadd(day, n.n-2, t1.month)) expression depends on SET DATEFIRST so you might need to adjust accordingly.

Concatenation of adjacent dates in SQL

I would like to know how to make intersections or concatenations of adjacent date ranges in sql.
I have a list of customer start and end dates, for example (in dd/mm/yyyy format, where 31/12/9999 means the customer is still a current customer).
CustID | StartDate | Enddate |
1 | 01/08/2011|19/06/2012|
1 | 20/06/2012|07/03/2012|
1 | 03/05/2012|31/12/9999|
2 | 09/03/2009|16/08/2009|
2 | 16/01/2010|10/10/2010|
2 | 11/10/2010|31/12/9999|
3 | 01/08/2010|19/08/2010|
3 | 20/08/2010|26/12/2011|
Although the dates in different rows don't overlap, I would consider some of the ranges as a contigous period of time, e.g when the start date comes one day after an end date (for a given customer). Hence I would like to return a query that returns just the intersection of the dates,
CustID | StartDate | Enddate |
1 | 01/08/2011|07/03/2012|
1 | 03/05/2012|31/12/9999|
2 | 09/03/2009|16/08/2009|
2 | 16/01/2010|31/12/9999|
3 | 01/08/2010|26/12/2011|
I've looked at CTE tables, but I can't figure out how to return just one row for one contigous block of dates.
This should work in 2005 forward:
;WITH cte2 AS (SELECT 0 AS Number
UNION ALL
SELECT Number + 1
FROM cte2
WHERE Number < 10000)
SELECT CustID, Min(GroupStart) StartDate, MAX(EndDate) EndDate
FROM (SELECT *
, DATEADD(DAY,b.number,a.StartDate) GroupStart
, DATEADD(DAY,1- DENSE_RANK() OVER (PARTITION BY CustID ORDER BY DATEADD(DAY,b.number,a.StartDate)),DATEADD(DAY,b.number,a.StartDate)) GroupDate
FROM Table1 a
JOIN cte2 b
ON b.number <= DATEDIFF(d, startdate, EndDate)
) X
GROUP BY CustID, GroupDate
ORDER BY CustID, StartDate
OPTION (MAXRECURSION 0)
Demo: SQL Fiddle
You can build a quick table of numbers 0-something large enough to cover the spread of dates in your ranges to replace the cte so it doesn't run each time, indexed properly it will run quickly.
you can do this with recursive common table expression:
with cte as (
select t.CustID, t.StartDate, t.EndDate, t2.StartDate as NextStartDate
from Table1 as t
left outer join Table1 as t2 on t2.CustID = t.CustID and t2.StartDate = case when t.EndDate < '99991231' then dateadd(dd, 1, t.EndDate) end
), cte2 as (
select c.CustID, c.StartDate, c.EndDate, c.NextStartDate
from cte as c
where c.NextStartDate is null
union all
select c.CustID, c.StartDate, c2.EndDate, c2.NextStartDate
from cte2 as c2
inner join cte as c on c.CustID = c2.CustID and c.NextStartDate = c2.StartDate
)
select CustID, min(StartDate) as StartDate, EndDate
from cte2
group by CustID, EndDate
order by CustID, StartDate
option (maxrecursion 0);
sql fiddle demo
Quick performance tests:
Results on 750 rows, small periods of 2 days length:
sql fiddle demo
My query: 300 ms
Goat CO query with CTE: 10804 ms
Goat CO query with table of fixed numbers: 7 ms
Results on 5 rows, large periods:
sql fiddle demo
My query: 1 ms
Goat CO query with CTE: 700 ms
Goat CO query with table of fixed numbers: 36 ms