How can I construct this T-SQL query involving missing date ranges?

I'll try to keep the specific details of my problem out of this question and focus only on the pertinent issues.
Let's say I have an Assets table with a primary key of AssetID.
I have another table called ProcessedDates with primary key PID and with additional columns AssetID, StartDate, EndDate.
I want to run a process for a list of assets between a start date and end date. Before I can run this process, I need to know which assets and which date ranges have already been processed.
For example, there are 2 entries in ProcessedDates:
AssetID  StartDate  EndDate
---------------------------
Asset1   Day4       Day7
Asset1   Day10      Day12
I want to process Asset1 between Day2 and Day11. I don't want to waste time processing days that have already been done, so in this example I will only process Asset1 from Day2 to Day3 and from Day8 to Day9.
So what I need is a query that returns the gaps in the date ranges. In this case, the result set will be 2 lines:
AssetID  StartDate  EndDate
---------------------------
Asset1   Day2       Day3
Asset1   Day8       Day9
In my actual requirement I have many assetIDs. The ProcessedDates table may have multiple entries for each asset or none at all and each asset does not necessarily have the same processed dates as any other asset.
declare @StartDate date, @EndDate date -- assume these are given
--get distinct assets
select distinct AssetID into #Assets from (some query) q
--get the already processed date ranges
select p.AssetID, p.StartDate, p.EndDate
from ProcessedDates p inner join #Assets a on p.AssetID = a.AssetID
where p.StartDate between @StartDate and @EndDate
or p.EndDate between @StartDate and @EndDate
From here I have no clue how to proceed. How do I get it to return AssetID, StartDate, EndDate for all the gaps in between?

Something like this:
declare @StartDate date = '2015-01-01', @EndDate date = '2015-05-05'
declare @Assets table (AssetID varchar(50), StartDate date, EndDate date)
declare @AssetTypes table (AssetID varchar(50))
insert into @AssetTypes values
('Asset1'),
('Asset2')
insert into @Assets values
('Asset1', '2014-12-10', '2014-12-31'), -- Ignored
('Asset1', '2015-02-02', '2015-03-02'),
('Asset1', '2015-03-05', '2015-05-01'),
('Asset1', '2015-06-01', '2015-06-06') -- Ignored
;WITH Base AS (
SELECT AT.AssetID
, CASE WHEN A.AssetID IS NULL THEN 1 ELSE 0 END EmptyAsset
, A.StartDate
, A.EndDate
, ROW_NUMBER() OVER (PARTITION BY AT.AssetID ORDER BY StartDate) RN
FROM @AssetTypes AT
LEFT JOIN @Assets A ON A.AssetID = AT.AssetID
WHERE A.AssetID IS NULL -- case of totally missing asset
OR (StartDate <= @EndDate AND EndDate >= @StartDate)
)
-- first missing range, before the first row
SELECT AssetID, @StartDate StartDate, DATEADD(dd, -1, StartDate) EndDate
FROM Base
WHERE RN = 1 AND StartDate > @StartDate
UNION ALL
-- each row joined with the next one
SELECT B1.AssetID, DATEADD(dd, 1, B1.EndDate), ISNULL(DATEADD(dd, -1, B2.StartDate), @EndDate)
FROM Base B1
LEFT JOIN Base B2 ON B2.AssetID = B1.AssetID AND B2.RN = B1.RN + 1
WHERE B1.EmptyAsset = 0
AND (B2.AssetID IS NULL -- Last row case
OR DATEADD(dd, 1, B1.EndDate) < B2.StartDate) -- Other rows case
AND B1.EndDate < @EndDate -- If the range ends after @EndDate, nothing to do
UNION ALL
-- case of totally missing asset
SELECT AssetID, @StartDate, @EndDate
FROM Base
WHERE EmptyAsset = 1
The main idea is that each row is joined with the next one. A new range is generated (if necessary) between the current row's EndDate + 1 and the next row's StartDate - 1. There is special handling for the last row (B2.AssetID IS NULL and ISNULL(... @EndDate)). The first SELECT generates a row before the first range, and the last SELECT handles the special case where no ranges are present for an asset.
As I've written in the comments, it gets ugly quite quickly.
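To make the expected output of the query above easier to check outside SQL Server, here is a minimal Python sketch of the same gap-finding idea for a single asset. The function name missing_ranges and the variable names are hypothetical, and the sketch walks a cursor through the sorted ranges instead of joining each row with the next one, so it also tolerates overlapping ranges (which the query assumes do not occur).

```python
from datetime import date, timedelta

ONE_DAY = timedelta(days=1)

def missing_ranges(processed, window_start, window_end):
    """Return inclusive (start, end) gaps inside the window for ONE asset.

    processed: iterable of inclusive (start, end) date pairs already done.
    """
    gaps = []
    cursor = window_start  # first day not yet known to be processed
    for start, end in sorted(processed):
        if end < window_start or start > window_end:
            continue  # range lies entirely outside the window ("Ignored" rows)
        if start > cursor:
            gaps.append((cursor, start - ONE_DAY))
        cursor = max(cursor, end + ONE_DAY)
    if cursor <= window_end:
        gaps.append((cursor, window_end))  # tail gap, or the whole window if nothing was processed
    return gaps

# Sample data from the answer above, window 2015-01-01 .. 2015-05-05
asset1 = [
    (date(2014, 12, 10), date(2014, 12, 31)),
    (date(2015, 2, 2), date(2015, 3, 2)),
    (date(2015, 3, 5), date(2015, 5, 1)),
    (date(2015, 6, 1), date(2015, 6, 6)),
]
gaps_asset1 = missing_ranges(asset1, date(2015, 1, 1), date(2015, 5, 5))
gaps_asset2 = missing_ranges([], date(2015, 1, 1), date(2015, 5, 5))  # asset with no rows
```

For Asset1 this yields three gaps (before the first range, between the two ranges, and after the last one), and for Asset2 it yields the whole window, matching the three branches of the UNION ALL.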

Here's a simple version to get the result you want. I use integers as dates, and assume the min date is 0 and the max date is 999.
--DDL
create table Assets (AssetID integer, StartDate integer, EndDate integer);
insert into Assets values
(1,4,7),
(1,10,12),
(1,15,17),
(2,5,7),
(2,9,10);
with temp as(
select a1.AssetId,
a1.enddate+1 as StartDate,
coalesce(min(a2.startdate) - 1,999) as EndDate
from Assets a1
left join Assets a2
on a1.assetid = a2.assetid
and a1.enddate < a2.startdate
group by a1.assetid,a1.enddate
union all
select a.assetid,0,min(startdate) -1
from Assets a
group by a.assetid
)
select AssetId,
case when StartDate<2 then 2 else StartDate end as StartDate,
case when EndDate>11 then 11 else EndDate end as EndDate
from temp
where StartDate<=11 and EndDate>=2
order by AssetId,StartDate
The temp CTE computes the missing ranges. Filtering them to those that overlap Day2 through Day11, and clipping them to that window, gives the result you want:
AssetId  StartDate  EndDate
1        2          3
1        8          9
2        2          4
2        8          8
2        11         11
Here's the SqlFiddle Demo

Related

How to insert values based on another column value

Below is a subset of my table (for the first id)
date        id  value
01/01/2022  1   5
08/01/2022  1   2
For each id, the dates are not consecutive (e.g., for id 1, the min date is 01/01/2022 and the max date is 08/01/2022)--there are 7 days in between both dates. I want to insert rows to make the dates for each id consecutive and contiguous - the value for the value field/column to be filled with 0s so that the updated table looks like:
date        id  value
01/01/2022  1   5
02/01/2022  1   0
03/01/2022  1   0
04/01/2022  1   0
05/01/2022  1   0
06/01/2022  1   0
07/01/2022  1   0
08/01/2022  1   2
Any SQL code on how to implement this would be highly appreciated. I have a calendar table but am unsure how to join it with the above table so that I fill in missing dates dynamically for each id with 0s.
My calendar table looks like:
date
01/01/2022
02/01/2022
03/01/2022
04/01/2022
Considering you state you have a calendar table, it seems what you need to do is JOIN to it using the MIN and MAX dates from your other table, and then LEFT JOIN back to your table:
WITH MinMax AS(
SELECT ID,
MIN(date) AS MinDate,
MAX(date) AS MaxDate
FROM dbo.YourTable
GROUP BY ID),
Dates AS(
SELECT MM.ID,
C.CalendarDate AS [Date]
FROM MinMax MM
JOIN dbo.CalendarTable C ON MM.MinDate <= C.CalendarDate
AND MM.MaxDate >= C.CalendarDate)
SELECT D.ID,
D.[Date],
ISNULL(YT.[Value],0) AS [Value]
FROM Dates D
LEFT JOIN dbo.YourTable YT ON D.ID = YT.ID
AND D.[Date] = YT.[Date];
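As a cross-check of what the query above should return, here is a small Python sketch of the same fill-in logic: find each id's first and last date, walk every day in between, and default the value to 0 when no source row exists. The function name fill_missing_dates is hypothetical, and the dates are read as day/month/year, as in the question.

```python
from datetime import date, timedelta

def fill_missing_dates(rows):
    """rows: list of (day, id, value) tuples.

    Returns one row per day between each id's min and max date;
    days without a source row get value 0 (the ISNULL(..., 0) step).
    """
    by_id = {}
    for day, row_id, value in rows:
        by_id.setdefault(row_id, {})[day] = value
    filled = []
    for row_id in sorted(by_id):
        values = by_id[row_id]
        day, last = min(values), max(values)  # MinDate / MaxDate per id
        while day <= last:
            filled.append((day, row_id, values.get(day, 0)))
            day += timedelta(days=1)
    return filled

rows = [(date(2022, 1, 1), 1, 5), (date(2022, 1, 8), 1, 2)]
filled = fill_missing_dates(rows)
```

With the two sample rows, the result has eight rows for id 1, with the six inserted days carrying a value of 0.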
SET DATEFORMAT DMY
-- CREATE A TABLE WITH OUR INPUT DATA
DROP TABLE IF EXISTS #TheData
GO
CREATE TABLE #TheData
(TheDate DATE, id INT, TheValue INT)
INSERT INTO #TheData
(TheDate,id,Thevalue)
VALUES
('01/01/2022',1,5),
('08/01/2022',1,2),
('17/01/2022',2,7),
('25/01/2022',2,7),
('15/02/2022',2,7)
-- CREATE A CALENDAR CTE
DECLARE @StartDate date = '20210101';
DECLARE @CutoffDate date = DATEADD(DAY, -1, DATEADD(YEAR, 2, @StartDate));
;WITH DateSeq(TheDate) AS
(
SELECT @StartDate
UNION ALL
SELECT DATEADD(dd,1,TheDate) FROM DateSeq
WHERE TheDate < @CutoffDate
)
-- CROSS JOIN OUR CALENDAR CTE TO OUR SOURCE DATA. DERIVED TABLE TO GET FIRST AND LAST OF EACH RANGE TO USE FOR JOIN
SELECT
ds.*
,SourceDataRangesByID.ID
,ISNULL(td.TheValue,0) AS TheValue
FROM
DateSeq ds
CROSS JOIN
(
SELECT
d.ID
,MIN(d.TheDate) AS MinDatePerID
,MAX(d.TheDate) AS MaxDatePerID
FROM #TheData d
GROUP BY d.ID
) SourceDataRangesByID
LEFT JOIN #TheData td ON td.id = SourceDataRangesByID.ID AND td.TheDate = ds.TheDate
WHERE ds.TheDate >= SourceDataRangesByID.MinDatePerID
AND ds.TheDate <= SourceDataRangesByID.MaxDatePerID
OPTION (MAXRECURSION 0);
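Try generate_series to create a date table, then right join to it and use COALESCE to replace the NULL values. Note that the snippet below uses PostgreSQL syntax:
SELECT daily_series.day
FROM (SELECT generate_series('2016-01-01', -- series start date
'2018-06-30', -- series end date
'1 day'::interval)::date AS day) AS daily_series
-- then right join mytable to daily_series on its date column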
See Generate_Series for TSQL
https://dba.stackexchange.com/questions/255165/does-ms-sql-server-have-generate-series-function
(Sql server 2022)
https://learn.microsoft.com/en-us/sql/t-sql/functions/generate-series-transact-sql?view=sql-server-ver16

Generate data between two range date by some values

I have 2 dates, StartDate and EndDate:
Declare @StartDate date='2018/01/01', @Enddate date ='2018/12/31'
Then there is some data with a date and value in a mytable table:
----------------------------
ID date value
----------------------------
1 2018/02/14 4
2 2018/09/26 7
3 2017/09/20 2
The data may start before 2018. If a row exists before @StartDate, the first range should take that earlier value; otherwise it should be 0.
I'm looking to get a result that looks like this:
-----------------------------------
fromdate todate value
-----------------------------------
2018/01/01 2018/02/13 2
2018/02/14 2018/09/25 4
2018/09/26 2018/12/31 7
The first fromdate comes from @StartDate and the last todate comes from @Enddate; the other rows should be generated.
I'm hoping to get this in an SQL query. I'm using SQL Server 2016.
You could use a CTE to create your full range of dates, and then LEAD to create the ToDate column:
DECLARE @FromDate date = '20180101',
@ToDate date = '20181231';
WITH VTE AS(
SELECT ID,
CONVERT(date,[date]) [date], --This is why using keywords for column names is a bad idea
[value]
FROM (VALUES(1,'20180214',4),
(2,'20180926',7),
(3,'20170314',4))V(ID,[date],[value])),
Dates AS(
SELECT [date]
FROM VTE V
WHERE V.[date] BETWEEN @FromDate and @ToDate
UNION ALL
SELECT [date]
FROM (VALUES(@FromDate))V([date]))
SELECT D.[date] AS FromDate,
LEAD(DATEADD(DAY, -1,D.[date]),1,@ToDate) OVER (ORDER BY D.[date]) AS ToDate,
ISNULL(V.[value],0) AS [value]
FROM Dates D
LEFT JOIN VTE V ON D.[date] = V.[date];
db<>fiddle
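The LEAD step can be illustrated with a short Python sketch: each range ends one day before the next range starts, and the last range ends at the window end. Unlike the query above, which assigns 0 to the first range, this sketch also carries in the latest value recorded on or before the window start, as the question asks. The function name value_ranges is hypothetical, and the sample rows are the ones from the question.

```python
from datetime import date, timedelta

def value_ranges(rows, start, end):
    """rows: list of (day, value). Returns (from_date, to_date, value) tuples covering start..end."""
    rows = sorted(rows)
    # Value in force at the window start: the latest row on or before start, else 0
    initial = 0
    for day, value in rows:
        if day <= start:
            initial = value
    points = [(start, initial)] + [(d, v) for d, v in rows if start < d <= end]
    result = []
    for i, (day, value) in enumerate(points):
        # Equivalent of LEAD(..., 1, end): next start minus one day, or the window end
        if i + 1 < len(points):
            to_date = points[i + 1][0] - timedelta(days=1)
        else:
            to_date = end
        result.append((day, to_date, value))
    return result

rows = [(date(2018, 2, 14), 4), (date(2018, 9, 26), 7), (date(2017, 9, 20), 2)]
ranges = value_ranges(rows, date(2018, 1, 1), date(2018, 12, 31))
```

For the question's data this produces the three ranges from the desired output, with the 2017 row supplying the value 2 for the first range.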
with cte as
(
select 0 as row_num, @StartDate as start_date, 0 as val
UNION
select ROW_NUMBER() OVER(ORDER BY start_date) as row_num, * from input
)
select curr.start_date
, DATEADD(day,-1,ISNULL(nex.start_date,DATEADD(day,1,@Enddate))) as end_date
, curr.val
from cte curr
left join cte nex on curr.row_num = nex.row_num - 1;
You can find the simulation here: https://rextester.com/EIAXW23839

SQL Server 2008 - Enumerate multiple date ranges

How can I enumerate multiple date ranges in SQL Server 2008? I know how to do this if my table contains a single record
StartDate EndDate
2014-01-01 2014-01-03
;WITH DateRange
AS (
SELECT @StartDate AS [Date]
UNION ALL
SELECT DATEADD(d, 1, [Date])
FROM DateRange
WHERE [Date] < @EndDate
)
SELECT * FROM DateRange
OUTPUT
2014-01-01, 2014-01-02, 2014-01-03
I am however lost as how to do it if my table contains multiple records. I could possibly use the above logic in a cursor but want to know if there is a set based solution instead.
StartDate EndDate
2014-01-01 2014-01-03
2014-01-05 2014-01-06
DESIRED OUTPUT:
2014-01-01, 2014-01-02, 2014-01-03, 2014-01-05, 2014-01-06
Well, let's see. Define the ranges as a table. Then generate the full range of dates from the first to the last date. Finally, select the dates that are in the range:
with dateranges as (
select cast('2014-01-01' as date) as StartDate, cast('2014-01-03' as date) as EndDate union all
select '2014-01-05', '2014-01-06'
),
_dates as (
SELECT min(StartDate) AS [Date], max(EndDate) as enddate
FROM dateranges
UNION ALL
SELECT DATEADD(d, 1, [Date]), enddate
FROM _dates
WHERE [Date] < enddate
),
dates as (
select [date]
from _dates d
where exists (select 1 from dateranges dr where d.[date] >= dr.startdate and d.[date] <= dr.enddate)
)
select *
from dates
. . .
You can see this work here.
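The three steps in this answer (define the ranges, generate every date from the earliest start to the latest end, keep the dates that fall inside some range) can be sketched in Python as follows. The function name enumerate_ranges is hypothetical; the any() check plays the role of the EXISTS subquery.

```python
from datetime import date, timedelta

def enumerate_ranges(ranges):
    """Yield, in order, each date covered by at least one inclusive (start, end) range."""
    if not ranges:
        return
    day = min(start for start, _ in ranges)   # min(StartDate)
    last = max(end for _, end in ranges)      # max(EndDate)
    while day <= last:
        if any(start <= day <= end for start, end in ranges):
            yield day
        day += timedelta(days=1)

date_ranges = [
    (date(2014, 1, 1), date(2014, 1, 3)),
    (date(2014, 1, 5), date(2014, 1, 6)),
]
dates = list(enumerate_ranges(date_ranges))
```

With the two sample ranges, 2014-01-04 is generated by the loop but filtered out, leaving the five dates from the desired output.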
You could grab the min and max dates first, like so:
SELECT @startDate = MIN(StartDate), @endDate = MAX(EndDate)
FROM YourTable
WHERE ...
And then pass those variables into your date range enumerator.
Edit... Whoops, I missed an important requirement. See the accepted answer.
As GordonLinoff mentioned, you should:
1. Store your ranges in a table
2. Generate a range of dates that encompasses your ranges
3. Filter down to only those dates that fall within the range
The following query builds up a collection of numbers, and then uses that to quickly generate all of the dates that fall within each range.
-- Create a table of digits (0-9)
DECLARE @Digits TABLE (digit INT NOT NULL PRIMARY KEY);
INSERT INTO @Digits(digit)
VALUES (0),(1),(2),(3),(4),(5),(6),(7),(8),(9);
WITH
-- Store our ranges in a common table expression
CTE_DateRanges(StartDate, EndDate) AS (
SELECT '2014-01-01', '2014-01-03'
UNION ALL
SELECT '2014-01-05', '2014-01-06'
)
SELECT DATEADD(DAY, NUMBERS.num, RANGES.StartDate) AS Date
FROM
(
-- Create the list of all 3-digit numbers (0-999)
SELECT D3.digit * 100 + D2.digit * 10 + D1.digit AS num
FROM @Digits AS D1
CROSS JOIN @Digits AS D2
CROSS JOIN @Digits AS D3
-- Add more CROSS JOINs to @Digits if your ranges span more than 999 days
) NUMBERS
-- Join to our ranges table to generate the dates and filter them
-- down to those that fall within a range
INNER JOIN CTE_DateRanges RANGES
ON DATEADD(DAY, NUMBERS.num, RANGES.StartDate) <= RANGES.EndDate
ORDER BY
Date
The date creation is done by joining our number list with our date ranges, using the number as a number of days to add to the StartDate of the range. We then filter out any results where the generated date for a given range falls beyond that range's EndDate. Since we're adding a non-negative number of days to the StartDate to generate the date, we know that our date will always be greater-than-or-equal-to the StartDate of the range, so we don't need to include StartDate in the join condition.
This query will return DATETIME values. If you need a DATE value, rather than a DATETIME value, you can simply cast the value in the SELECT clause.
Credit goes to Itzik Ben-Gan for the digits table.
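The numbers-table technique can be mirrored in a few lines of Python, which may help when checking the output. This is only a sketch under the same assumption as the query (no range spans more than 999 days); the variable names are hypothetical. As in the query, overlapping ranges would produce duplicate dates.

```python
from datetime import date, timedelta
from itertools import product

# All numbers 0..999, built from a digits list like the cross-joined digits table
digits = range(10)
numbers = [d3 * 100 + d2 * 10 + d1 for d3, d2, d1 in product(digits, repeat=3)]

date_ranges = [
    (date(2014, 1, 1), date(2014, 1, 3)),
    (date(2014, 1, 5), date(2014, 1, 6)),
]

# Join every number to every range, keep offsets that stay within the range's EndDate
generated = sorted(
    start + timedelta(days=n)
    for start, end in date_ranges
    for n in numbers
    if start + timedelta(days=n) <= end
)
```

Only the upper bound is tested, for the reason given above: a non-negative offset can never produce a date before the range's StartDate.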

display list of dates by giving start and end date and get data against those dates

I have a SQL table with three columns:
id, balance, datetime
I want to get data from the table for a given time period. Suppose I want the data between 1/1/2013 and 1/15/2013. The data in the table looks like this:
#id Datetime Balance #
1 1/1/2013 1500
2 1/2/2013 2000
3 1/4/2013 1500
4 1/5/2013 2500
Now I want the output to be:
#id Datetime Balance #
1 1/1/2013 1500
2 1/2/2013 2000
3 1/3/2013 0
4 1/4/2013 1500
5 1/5/2013 2500
I want to display all the dates, and if there is no balance for a date, it should show 0 or a null value.
I would get rid of ID column as it is useless when you are adding additional rows and do something like this:
set dateformat mdy
declare @tmpTable table (dates date)
declare @startDate date = '1/1/2013'
declare @endDate date = '1/15/2013'
declare @currentDate date = @startDate -- loop variable, so @startDate keeps its value
while @currentDate <= @endDate
begin
insert into @tmpTable (dates) values (@currentDate)
set @currentDate = DATEADD(DAY, 1, @currentDate)
end
-- @tmpTable already holds only the requested dates, so no WHERE clause is needed;
-- filtering on yourTable columns would discard the dates that have no balance
select tmp.dates, yourTable.balance
from @tmpTable as tmp
left outer join yourTable on yourTable.[Datetime] = tmp.dates
I'm not sure what flavor of SQL you're using, but you can create a table of all dates and do an outer join to it. For example, if you have a table of all dates called 'Dates', then you could do the following (the filter is on the Dates table, so that dates without a balance are kept):
select t.id, d.dateTime, t.balance from table1 t RIGHT JOIN Dates d on t.dateTime = d.dateTime
where d.dateTime BETWEEN '1/1/2013' AND '1/5/2013'
You can do it by creating a cte with the dates between your daterange and outer joining it with your table.
with cte as
(
select start_date dt
union all
select dateadd(dd,1,dt) from cte where dt < end_date
)
select id, cte.dt, balance from cte left outer join yourtable b on cte.dt = b.date;

How to select a record that is between two dateranges and two input parameters StartDate and EndDate in SQL server?

In my database I have a group which can be active in a week with opening hours.
Group A Startdate: 22-10-2012 EndDate: 28-10-2012 (Startdate is always beginning of the week)
Group A Startdate: 29-10-2012 EndDate: 04-11-2012 (Always end of the week)
Group A has two different opening hours in this two weeks.
Now I have a stored procedure that returns the active group by looking at the input parameters StartDate and EndDate. If these two parameters fall between the start date and end date of ONE week, then I get the right opening hours.
But when the input parameters span TWO weeks (Begindate: 22-10-2012 and EndDate: 30-10-2012), I get the opening hours of both weeks.
declare @Begindate datetime
set @Begindate = '2012-10-22'
declare @Enddate datetime
set @Enddate = '2012-11-02'
SELECT Id, Date, ..., ...
FROM Table1 t1
INNER JOIN Combinations c ON t1.Id=c.Table1Id
INNER JOIN [Group] g ON c.GroupId=g.Id
WHERE (t1.Date <= @Enddate AND t1.Date >= @Begindate) -- gets the dates I need
AND g.BeginDate <= @Enddate AND g.Enddate >= @Begindate -- gets the active groups
Some table data:
Table1:
Id Date GroupId
1 2012-10-23 10
1 2012-10-29 10
Combinations: (holds the relation between Table1 and Group)
ID Table1Id GroupId
1 10 1
Group Table:
ID Name StartDate EndDate MondayOpen MondayClose ...
1 Group A 2012-10-22 2012-10-28 08:00 18:00
1 Group A 2012-10-29 2012-11-04 13:00 18:00
With this query I am getting the group twice because the input parameters span two weeks.
How can I get the correct opening hours for each date (from t1) based on the group's begin and end dates?
You are missing a condition between table1 and [group], which is the last line in the query below.
SELECT *
FROM Table1 t1
INNER JOIN Combinations c ON t1.Id=c.Table1Id
INNER JOIN [Group] g ON c.GroupId=g.Id
WHERE t1.Date<= @Enddate AND t1.Date>= @Begindate
AND g.BeginDate <=@Enddate and g.Enddate >= @Begindate
AND t1.Date between g.BeginDate and g.EndDate
SQLFiddle
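To see why the extra condition removes the duplicates, here is a small Python sketch of the same filter using the sample data from the question. The dictionary keys and the function name opening_hours are hypothetical. Each date is paired only with the group row whose own week contains it, which is the role of the final BETWEEN condition.

```python
from datetime import date

groups = [
    {"start": date(2012, 10, 22), "end": date(2012, 10, 28), "monday_open": "08:00", "monday_close": "18:00"},
    {"start": date(2012, 10, 29), "end": date(2012, 11, 4), "monday_open": "13:00", "monday_close": "18:00"},
]
table1_dates = [date(2012, 10, 23), date(2012, 10, 29)]

def opening_hours(dates, group_rows, begin, end):
    """Pair each date inside begin..end with the group row whose week contains it."""
    matches = []
    for d in dates:
        if not (begin <= d <= end):
            continue  # t1.Date must lie inside the requested window
        for g in group_rows:
            overlaps_window = g["start"] <= end and g["end"] >= begin
            contains_date = g["start"] <= d <= g["end"]  # the condition that was missing
            if overlaps_window and contains_date:
                matches.append((d, g["monday_open"], g["monday_close"]))
    return matches

result = opening_hours(table1_dates, groups, date(2012, 10, 22), date(2012, 11, 2))
```

Without the contains_date check, both group rows overlap the window, so each date would be returned twice.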