Determine time gaps in SQL - sql

I'm trying to teach myself the LAG and LEAD functions and thought I would try them on the following report, but I'm not having much luck. My goal is to take some on-call start and stop times for a department and create a report detailing the time gaps each day when there was no coverage. The department is assumed to have 24-hour coverage, 7 days a week. Is the only way to handle this to join to a datetime table that has a row for every minute of every date? Any suggestions would be greatly appreciated.
The expected outcome for the data below would be:
on 09/01/2020 dept 3042300031 had a time gap from 20:59 to 21:00 and a time gap from 22:59 to 23:59
on 09/02/2020 dept 3042300031 had a time gap from 00:00 to 00:05 and a time gap from 20:59 to 22:00 and a time gap from 22:50 to 23:59
on 09/03/2020 dept 3042300031 had a time gap from 00:00 to 23:59
on 09/04/2020 dept 3042300031 had a time gap from 20:59 to 23:59
IF OBJECT_ID('tempdb..#report') IS NOT NULL DROP TABLE #report
GO
CREATE TABLE #report (
Contact_Date date
,Line int
,Start_Instant_dttm smalldatetime
,End_Instant_dttm datetime
,Asgn_to_Role int
,Asgn_to_Team bigint
);
INSERT INTO #report
SELECT
'9/1/2020',1,'9/1/2020 00:00','9/1/2020 20:59',270,3042300031
UNION
SELECT
'9/1/2020',2,'9/1/2020 21:00','9/1/2020 22:59',270,3042300031
UNION
SELECT
'9/2/2020',1,'9/2/2020 00:05','9/2/2020 20:59',270,3042300031
UNION
SELECT
'9/2/2020',2,'9/2/2020 22:00','9/2/2020 22:59',270,3042300031
UNION
SELECT
'9/4/2020',1,'9/4/2020 00:00','9/4/2020 20:59',270,3042300031;

Finding the gaps is simple:
with cte as
(
select Asgn_to_Team, Start_Instant_dttm, End_Instant_dttm
,lag(End_Instant_dttm)
over (partition by Asgn_to_Team
order by Start_Instant_dttm) as prev_end
from #report
)
select Asgn_to_Team, prev_end as gap_start, Start_Instant_dttm as gap_end
from cte
where prev_end < Start_Instant_dttm
But splitting them into days is much harder; you need to join to a calendar table:
with cte as
(
select Asgn_to_Team, Start_Instant_dttm, End_Instant_dttm
,lag(End_Instant_dttm)
over (partition by Asgn_to_Team
order by Start_Instant_dttm) as prev_end
from #report
)
select Asgn_to_Team,
case when prev_end > cast(cal.cal_date as datetime)
then prev_end
else cast(cal.cal_date as datetime)
end as gap_start,
case when Start_Instant_dttm < cast(dateadd(day, 1, cal.cal_date) as datetime)
then Start_Instant_dttm
else cast(dateadd(day, 1, cal.cal_date) as datetime)
end as gap_end
from cte join cal -- one row for each date covered
on cast(cal.cal_date as datetime) <= Start_Instant_dttm
and cast(dateadd(day, 1, cal.cal_date) as datetime) > prev_end
where prev_end < Start_Instant_dttm
Hopefully I got the >/< right, see fiddle
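If it helps to check the logic outside the database, here is a rough Python sketch of the same two steps: compare each shift with the previous shift's end (what LAG does), then cut every gap at midnight (what the calendar join does). The function names are made up for illustration, and it is Python rather than T-SQL only so it can be run without a server. Like the queries above, it only finds gaps between rows, so it will not report a gap after the last shift of the data.

```python
from datetime import datetime, timedelta

def coverage_gaps(shifts):
    """Return (gap_start, gap_end) pairs between consecutive shifts.

    Mirrors LAG(): each shift is compared with the end of the previous
    shift when ordered by start time.
    """
    gaps = []
    prev_end = None
    for start, end in sorted(shifts):
        if prev_end is not None and prev_end < start:
            gaps.append((prev_end, start))
        prev_end = end
    return gaps

def split_by_day(gap_start, gap_end):
    """Cut one gap into pieces at every midnight it crosses."""
    pieces = []
    cursor = gap_start
    while cursor < gap_end:
        # Midnight at the start of the day after `cursor`.
        next_midnight = datetime.combine(cursor.date(), datetime.min.time()) + timedelta(days=1)
        piece_end = min(gap_end, next_midnight)
        pieces.append((cursor, piece_end))
        cursor = piece_end
    return pieces
```

With the sample rows from the question, the gap that starts on 9/2 at 22:59 and ends on 9/4 at 00:00 comes back as two pieces, one per calendar day.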

Related

SqlServer - How to display data of each month including months with no data

Can you please help me with SQL Server? I have a table from which I am getting date-wise data.
Table structure:
Date Amount
-----------------
2019-05-04 16128.00
2019-05-06 527008.00
2019-05-07 407608.00
2019-05-10 407608.00
I want to fill in the missing dates in the above result. My expected output is shown below:
Date Amount
-----------------
2019-05-04 16128.00
2019-05-05 00
2019-05-06 527008.00
2019-05-07 407608.00
2019-05-08 0
2019-05-09 0
2019-05-10 407608.00
Thanks in Advance
You can use a calendar helper table. Otherwise, declare a from date and a to date as in the code below; you can change both values as needed.
CREATE TABLE #amounttable
(
Dt DATE,
Amount BIGINT
);
INSERT into #amounttable(Dt, Amount) VALUES('2019-05-04',16128);
INSERT into #amounttable(Dt, Amount) VALUES('2019-05-06',527008);
INSERT into #amounttable(Dt, Amount) VALUES('2019-05-07',407608);
INSERT into #amounttable(Dt, Amount) VALUES('2019-05-10',407608);
DECLARE @fromdate DATE = '20190504', @todate DATE = '20190510';
SELECT c.d as Date, Amount = COALESCE(s.Amount,0)
FROM
(
SELECT TOP (DATEDIFF(DAY, @fromdate, @todate)+1)
DATEADD(DAY, ROW_NUMBER() OVER (ORDER BY number)-1, @fromdate)
FROM [master].dbo.spt_values
WHERE [type] = N'P' ORDER BY number
) AS c(d)
LEFT OUTER JOIN #amounttable AS s
ON c.d = s.Dt
WHERE c.d >= @fromdate
AND c.d < DATEADD(DAY, 1, @todate);
For more reference, check the answer here from Stack Exchange.
One pretty simple method is to use a recursive CTE to generate the dates and then a left join to bring in the days:
with cte as (
select min(t.dte) as dte, max(t.dte) as maxdate
from t
union all
select dateadd(day, 1, dte), maxdate
from cte
where dte < maxdate
)
select cte.dte, coalesce(t.amount, 0) as amount
from cte left join
t
on cte.dte = t.dte;
Here is a db<>fiddle.
Note that the default depth for recursion is 100, so for longer periods you should add OPTION (MAXRECURSION 0).
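For anyone who wants to sanity-check the expected output, the idea behind both answers (generate every date in the range, then look up the amount or fall back to 0) can be sketched in a few lines of Python. This is only an illustration; the function name and the dict input are assumptions, not part of either answer.

```python
from datetime import date, timedelta

def fill_missing_days(amounts):
    """amounts: dict mapping date -> amount.

    Returns a list of (date, amount) covering every day from the earliest
    to the latest key, using 0 for days that have no entry. This plays the
    role of the generated date list plus the LEFT JOIN / COALESCE.
    """
    first, last = min(amounts), max(amounts)
    total_days = (last - first).days
    result = []
    for offset in range(total_days + 1):
        day = first + timedelta(days=offset)
        result.append((day, amounts.get(day, 0)))
    return result
```

Fed the four rows from the question, this returns seven rows, with 0 for 2019-05-05, 2019-05-08 and 2019-05-09.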

How to calculate overlapping time and length of conflict

Employee table has four columns. Employee, Start, End, Diff. Diff column is the duration and calculated as End - Start.
I want to find the conflict between the Start and End time range.
For instance, Employee A has three rows:
The first row's start time is 01:02 and its end time is 01:05, but the second row's start time is 01:03, which conflicts with the first row.
Sample Data:
employee StartDate EndDate Start End Diff
A 04/08/2019 04/08/2019 01:02:00 01:05:00 3
A 04/08/2019 04/08/2019 01:03:00 01:08:00 5
A 04/08/2019 04/08/2019 01:14:00 01:21:00 7
B 04/08/2019 04/08/2019 02:00:00 02:17:00 17
I want to select only the rows for employee A whose start and end times overlap, and calculate the total length of the conflict in a new column using T-SQL. I'm a newbie and need help. Can anyone help, please?
SELECT TOP (100) a.ccx_employeename AS employee
,CONVERT(Date,[a].[ccx_starttime]) AS [Start Date],CONVERT(Date,[a].[ccx_endtime]) AS [End Date], CONVERT(time (0), a.ccx_starttime) AS StartTime
, CONVERT(time (0), a.ccx_endtime) AS EndTime
, CONVERT (time(0), (a.ccx_endtime - a.ccx_starttime)) AS Duration
FROM ccp_sim_MSCRM.dbo.Filteredccx_Recorded_Service as a
where CONVERT(time (0), a.ccx_starttime) BETWEEN CONVERT(time (0), a.ccx_starttime) And CONVERT(time (0), a.ccx_endtime)
Since the first and second rows conflict, I want to show those two rows. The conflict duration is 2 minutes in this example: the first row's end time is 01:05 but the second row's start time is 01:03, so the conflict duration is 01:05 - 01:03 = 2 minutes.
Desired Output
employee StartDate EndDate Start End Diff
A 04/08/2019 04/08/2019 01:02:00 01:05:00 3
A 04/08/2019 04/08/2019 01:03:00 01:08:00 5
duration of conflict : 2 mins
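I would join the table to itself, though it may not be the most efficient approach:
SELECT
e1.employee,
e1.[Start] as firstStart,
e1.[End] as firstEnd,
e2.[Start] as secondStart,
e2.[End] as secondEnd,
-- the overlap runs from the later start (e2.[Start]) to the earlier of the two ends
DATEDIFF(minute, e2.[Start], CASE WHEN e1.[End] < e2.[End] THEN e1.[End] ELSE e2.[End] END) as conflictDuration
FROM
Employee as e1 inner join
Employee as e2 on (
e1.employee = e2.employee and
e1.[Start] < e2.[Start] and -- report each pair once; a row never matches itself
e2.[Start] < e1.[End] and
e2.[End] > e1.[Start]
)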
There are a few parts to your question:
finding the conflicting rows
calculating the conflicting time
output in your desired format
The solution below only covers the first two parts and assumes a combined date and time field.
I have added some unique key to deduplicate the results and "sort" the rows for comparison. In the code below, it is "id".
declare @t table (id int identity,employee char(1), StartDateTime smalldatetime, EndDateTime smalldatetime, diff as DATEDIFF(minute,StartDateTime,EndDateTime))
insert into @t values('A','2019-04-08 01:02','2019-04-08 01:05')
insert into @t values('A','2019-04-08 01:03','2019-04-08 01:08')
insert into @t values('A','2019-04-08 01:14','2019-04-08 01:21')
insert into @t values('B','2019-04-08 02:00','2019-04-08 02:17')
SELECT T1.employee, T1.StartDateTime, T1.EndDateTime, T2.StartDateTime, T2.EndDateTime
, (T1.diff + T2.diff)
- DATEDIFF(minute, CASE WHEN T1.StartDateTime < T2.StartDateTime THEN T1.StartDateTime ELSE T2.StartDateTime END -- MIN(Start)
, CASE WHEN T1.EndDateTime > T2.EndDateTime THEN T1.EndDateTime ELSE T2.EndDateTime END) -- MAX(End)
AS "duration of conflict"
FROM @t AS T1
JOIN @t AS T2
ON T2.employee = T1.employee
AND T2.id > T1.id -- Each only once
AND T2.StartDateTime < T1.EndDateTime
AND T2.EndDateTime > T1.StartDateTime
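The arithmetic behind the conflict duration can be checked quickly in Python. A minimal sketch, assuming whole-minute intervals as in the sample data (the function name is made up): the overlap is the earlier end minus the later start, which for overlapping rows equals the sum of both durations minus the span from the earliest start to the latest end, as computed in the query above.

```python
from datetime import datetime

def overlap_minutes(start1, end1, start2, end2):
    """Minutes during which both intervals are active; 0 if they do not overlap."""
    latest_start = max(start1, start2)
    earliest_end = min(end1, end2)
    delta = earliest_end - latest_start
    # A negative delta means the intervals are disjoint.
    return max(0, int(delta.total_seconds() // 60))
```

For the first two rows of employee A (01:02 to 01:05 and 01:03 to 01:08) this gives 2, the same as 3 + 5 - 6 from the sum-minus-span formula.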
This feels like the perfect place to use the LEAD/LAG functions to me. Combine that with some sub queries and IIF statements and you can calculate the results you're looking for.
Example:
DECLARE @Employee TABLE
(
Employee VARCHAR(1),
startDate DATE,
endDate DATE,
[start] TIME,
[end] TIME,
diff AS DATEDIFF(MINUTE,[start],[end])
)
INSERT INTO @Employee (Employee, startDate, endDate, [start], [end])
VALUES
('A',CAST('2019-04-08' AS DATE),CAST('2019-04-08' AS DATE),'01:02:00','01:05:00'),
('A',CAST('2019-04-08' AS DATE),CAST('2019-04-08' AS DATE),'01:03:00','01:08:00'),
('A',CAST('2019-04-08' AS DATE),CAST('2019-04-08' AS DATE),'01:14:00','01:21:00'),
('B',CAST('2019-04-08' AS DATE),CAST('2019-04-08' AS DATE),'02:00:00','02:17:00')
SELECT
Employee.Employee,
Employee.startDate,
Employee.endDate,
Employee.[start],
Employee.[end],
diff,
(IIF(ISNULL(lagConflict,0)>0,ISNULL(lagConflict,0),0)+IIF(ISNULL(Employee.leadConflict,0)>0,ISNULL(Employee.leadConflict,0),0)) AS conflict
FROM
(
SELECT
Employee,
startDate,
endDate,
[start],
[end],
diff,
DATEDIFF
(
MINUTE,
[start],
LAG([end],1) OVER (PARTITION BY Employee, startDate, endDate ORDER BY [start], [end])
) AS lagConflict,
DATEDIFF
(
MINUTE,
[end],
LEAD([start],1) OVER (PARTITION BY Employee, startDate, endDate ORDER BY [start], [end])
)*-1 AS leadConflict
FROM
@Employee
) AS Employee
WHERE
Employee.leadConflict > 0
OR Employee.lagConflict > 0;
Microsoft SQL Docs: LAG

Creating a status log from rows of datetimes of status changes

I'm pulling down some data from a remote API to a local SQL Server table, which is formatted like so. (imagine it's sorted by StatusDT descending)
DriverID StatusDT Status
-------- -------- ------
b103 2019-03-05 05:42:52.000 D
b103 2019-03-03 23:45:42.000 SB
b103 2019-03-03 21:49:41.000 ON
What would be the best way to eventually get to a point where I can return a query showing the total amount of time spent in each status on each day for each driver?
Also, it's possible that there could be gaps of a whole day or more between status updates, in which case I'd need a row showing a continuation of the previous status from 00:00:00 to 23:59:59 for each skipped day. So, if I'm looping through this table to populate another with the structure below, the example above would need to wind up looking like this... (again, sorted descending by date)
DriverID StartDT EndDT Status
-------- --------------- -------------- ------
b103 2019-03-05 05:42:52 D
b103 2019-03-05 00:00:00 2019-03-05 05:42:51 SB
b103 2019-03-04 00:00:00 2019-03-04 23:59:59 SB
b103 2019-03-03 23:45:42 2019-03-03 23:59:59 SB
b103 2019-03-03 21:49:41 2019-03-03 23:45:41 ON
Does that make sense?
I wound up dumping the API data to a "work" table and running a cursor over it to add rows to another table, with the starting and ending date/time, but I'm curious if there's another way that might be more efficient.
Thanks very much.
I think this query is what you need. I couldn't test it, however, for syntax errors:
with x as (
select
DriverID,
StatusDT as StartDT,
lead(StatusDT) over(partition by DriverID order by StatusDT) as EndDT,
Status
from my_table
)
select -- start & end on the same day
DriverID,
StartDT,
EndDT,
Status
from x
where convert(date, StartDT) = convert(date, EndDT)
or EndDT is null
union all
select -- start & end on different days; first day up to midnight
DriverID,
StartDT,
dateadd(ms, -3, convert(datetime, convert(date, EndDT))) as EndDT, -- cast back to datetime: dateadd cannot add milliseconds to a date
Status
from x
where convert(date, StartDT) <> convert(date, EndDT)
and EndDT is not null
union all
select -- start & end on different days; next day from midnight
DriverID,
convert(date, EndDT) as StartDT,
EndDT,
Status
from x
where convert(date, StartDT) <> convert(date, EndDT)
and EndDT is not null
order by StartDT desc
Most of your answer is just using lead():
select driverid, status, statusdt,
lead(statusdt) over (partition by driverid order by statusdt) as enddte
from t;
This does not give the breaks by day, but you can add those. I think the easiest way is to add in the dates (using a recursive CTE) and compute the status at that time.
I would do the following:
use a recursive CTE to calculate the dates
"fill in" the statuses and union to the original table
use lead() to get the end date
This looks like:
with day_boundaries as (
select driverid, dateadd(day, 1, cast(min(statusdt) as date)) as statusdt, max(statusdt) as finaldt
from t
group by driverid
having datediff(day, min(statusdt), max(statusdt)) > 0
union all
select driverid, dateadd(day, 1, statusdt), finaldt
from day_boundaries
where dateadd(day, 1, statusdt) < finaldt -- stop before stepping past the last status change
),
unioned as (
select driverid, status, statusdt
from t
union all
select db.driverid, s.status, db.statusdt
from day_boundaries db cross apply
(select top (1) status
from t
where t.driverid = db.driverid -- only look at this driver's history
and t.statusdt < db.statusdt
order by t.statusdt desc
) s
)
select driverid, status, statusdt,
lead(statusdt) over (partition by driverid order by statusdt) as enddte
from unioned;
Note that this does not subtract any seconds from the end date. The end date matches the previous start date. Time is continuous. It makes no sense to have gaps for records that should snugly fit together.
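To see what the output should look like for the sample rows, here is a small Python sketch of the same idea: pair each status change with the next one (as lead() does) and cut the interval at every midnight in between. It is only an illustration with made-up names, and like the query above it keeps the end of one row equal to the start of the next rather than subtracting a second.

```python
from datetime import datetime, timedelta

def status_intervals(changes):
    """changes: list of (status_dt, status) for one driver.

    Returns (start, end, status) rows split at midnight. The last status
    is still open, so its end is None.
    """
    ordered = sorted(changes)
    rows = []
    for i, (start, status) in enumerate(ordered):
        if i + 1 == len(ordered):
            rows.append((start, None, status))
            continue
        end = ordered[i + 1][0]  # next status change, like lead()
        cursor = start
        while cursor < end:
            # Midnight at the start of the day after `cursor`.
            midnight = datetime.combine(cursor.date(), datetime.min.time()) + timedelta(days=1)
            piece_end = min(end, midnight)
            rows.append((cursor, piece_end, status))
            cursor = piece_end
    return rows
```

For driver b103 this produces five rows: one for ON, three for SB (the rest of 03-03, all of 03-04, and 03-05 up to 05:42:52), and the open D row.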

SQL calculate date segments within calendar year

What I need is to calculate the missing time periods within the calendar year given a table such as this in SQL:
DatesTable
|ID|DateStart |DateEnd |
1 NULL NULL
2 2015-1-1 2015-12-31
3 2015-3-1 2015-12-31
4 2015-1-1 2015-9-30
5 2015-1-1 2015-3-31
5 2015-6-1 2015-12-31
6 2015-3-1 2015-6-30
6 2015-7-1 2015-10-31
Expected return would be:
1 2015-1-1 2015-12-31
3 2015-1-1 2015-2-28
4 2015-10-1 2015-12-31
5 2015-4-1 2015-5-31
6 2015-1-1 2015-2-28
6 2015-11-1 2015-12-31
It's essentially work blocks. What I need to show is the part of the calendar year which was NOT worked. So for ID = 3, he worked from 3/1 through the rest of the year. But he did not work from 1/1 till 2/28. That's what I'm looking for.
You can do it using LEAD, LAG window functions available from SQL Server 2012+:
;WITH CTE AS (
SELECT ID,
LAG(DateEnd) OVER (PARTITION BY ID ORDER BY DateEnd) AS PrevEnd,
DateStart,
DateEnd,
LEAD(DateStart) OVER (PARTITION BY ID ORDER BY DateEnd) AS NextStart
FROM DatesTable
)
SELECT ID, DateStart, DateEnd
FROM (
-- Get interval right before current [DateStart, DateEnd] interval
SELECT ID,
CASE
WHEN DateStart IS NULL THEN '20150101'
WHEN DateStart > start THEN start
ELSE NULL
END AS DateStart,
CASE
WHEN DateStart IS NULL THEN '20151231'
WHEN DateStart > start THEN DATEADD(d, -1, DateStart)
ELSE NULL
END AS DateEnd
FROM CTE
CROSS APPLY (SELECT COALESCE(DATEADD(d, 1, PrevEnd), '20150101')) x(start)
-- If there is no next interval then get interval right after current
-- [DateStart, DateEnd] interval (up-to end of year)
UNION ALL
SELECT ID, DATEADD(d, 1, DateEnd) AS DateStart, '20151231' AS DateEnd
FROM CTE
WHERE DateStart IS NOT NULL -- Do not re-examine [Null, Null] interval
AND NextStart IS NULL -- There is no next [DateStart, DateEnd] interval
AND DateEnd < '20151231' -- Current [DateStart, DateEnd] interval
-- does not terminate on 31/12/2015
) AS t
WHERE t.DateStart IS NOT NULL
ORDER BY ID, DateStart
The idea behind the above query is simple: for every [DateStart, DateEnd] interval get 'not worked' interval right before it. If there is no interval following the current interval, then also get successive 'not worked' interval (if any).
Also note that I assume that if DateStart is NULL then DateEnd is also NULL for the same ID.
Demo here
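The same "not worked" logic can be sketched outside SQL to check the expected rows. This Python version is only an illustration under assumptions: the function name is made up, the blocks for one ID are passed as inclusive date pairs, and an ID with a NULL row is represented as an empty list.

```python
from datetime import date, timedelta

def unworked_periods(blocks, year_start, year_end):
    """blocks: list of inclusive (start, end) date pairs for one ID.

    Returns the inclusive date ranges inside [year_start, year_end]
    that are not covered by any block.
    """
    one_day = timedelta(days=1)
    gaps = []
    cursor = year_start  # first day not yet known to be worked
    for start, end in sorted(blocks):
        if start > cursor:
            gaps.append((cursor, start - one_day))
        cursor = max(cursor, end + one_day)
    if cursor <= year_end:
        gaps.append((cursor, year_end))
    return gaps
```

For ID 6 in the question this returns the period before 2015-3-1 and the period after 2015-10-31, and for ID 2 it returns nothing.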
If your data is not too big, this approach will work. It expands all the days and ids and then re-groups them:
with d as (
select cast('2015-01-01' as date) as d
union all
select dateadd(day, 1, d)
from d
where d < cast('2015-12-31' as date)
),
td as (
select *
from d cross join
(select distinct id from t) t
where not exists (select 1
from t t2
where t2.id = t.id and
d.d between t2.startdate and t2.enddate
)
)
select id, min(d) as startdate, max(d) as enddate
from (select td.*,
dateadd(day, - row_number() over (partition by id order by d), d) as grp
from td
) td
group by id, grp
order by id, grp
option (maxrecursion 0); -- a full year is more than the default 100 recursion levels
An alternative method relies on cumulative sums and similar functionality that is much easier to express in SQL Server 2012+.
Somewhat simpler approach I think.
Basically create a list of dates for all work block ranges (A). Then create a list of dates for the whole year for each ID (B). Then remove the A from B. Compile the remaining list of dates into date ranges for each ID.
DECLARE @startdate DATETIME, @enddate DATETIME
SET @startdate = '2015-01-01'
SET @enddate = '2015-12-31'
--Build date ranges from remaining date list
;WITH dateRange(ID, dates, Grouping)
AS
(
SELECT dt1.id, dt1.Dates, dt1.Dates + row_number() over (order by dt1.id asc, dt1.Dates desc) AS Grouping
FROM
(
--Remove (A) from (B)
SELECT distinct dt.ID, tmp.Dates FROM DatesTable dt
CROSS APPLY
(
--GET (B) here
SELECT DATEADD(DAY, number, @startdate) [Dates]
FROM master..spt_values
WHERE type = 'P' AND DATEADD(DAY, number, @startdate) <= @enddate
) tmp
left join
(
--GET (A) here
SELECT DISTINCT T.Id,
D.Dates
FROM DatesTable AS T
INNER JOIN master..spt_values as N on N.number between 0 and datediff(day, T.DateStart, T.DateEnd)
CROSS APPLY (select dateadd(day, N.number, T.DateStart)) as D(Dates)
WHERE N.type ='P'
) dr
ON dr.Id = dt.Id and dr.Dates = tmp.Dates
WHERE dr.id is null
) dt1
)
SELECT ID, CAST(MIN(Dates) AS DATE) DateStart, CAST(MAX(Dates) AS DATE) DateEnd
FROM dateRange
GROUP BY ID, Grouping
ORDER BY ID
Here's the code:
http://sqlfiddle.com/#!3/f3615/1
I hope this helps!

To club the rows for week days

I have data like below:
StartDate EndDate Duration
----------
41890 41892 3
41898 41900 3
41906 41907 2
41910 41910 1
StartDate and EndDate are respective ID values for any dates from calendar. I want to calculate the sum of duration for consecutive days. Here I want to include the days which are weekends. E.g. in the above data, let's say 41908 and 41909 are weekends, then my required result set should look like below.
I already have another proc that can return me the next working day, i.e. if I pass 41907 or 41908 or 41909 as DateID in that proc, it will return 41910 as the next working day. Basically I want to check if the DateID returned by my proc when I pass the above EndDateID is same as the next StartDateID from above data, then both the rows should be clubbed. Below is the data I want to get.
ID StartDate EndDate Duration
----------
278457 41890 41892 3
278457 41898 41900 3
278457 41906 41910 3
Please let me know in case the requirement is not clear, I can explain further.
My Date Table is like below:
DateId Date Day
----------
41906 09-04-2014 Thursday
41907 09-05-2014 Friday
41908 09-06-2014 Saturday
41909 09-07-2014 Sunday
41910 09-08-2014 Monday
Here is the SQL Code for setup:
CREATE TABLE Table1
(
StartDate INT,
EndDate INT,
LeaveDuration INT
)
INSERT INTO Table1
VALUES(41890, 41892, 3),
(41898, 41900, 3),
(41906, 41907, 3),
(41910, 41910, 1)
CREATE TABLE DateTable
(
DateID INT,
Date DATETIME,
Day VARCHAR(20)
)
INSERT INTO DateTable
VALUES(41907, '09-05-2014', 'Friday'),
(41908, '09-06-2014', 'Saturday'),
(41909, '09-07-2014', 'Sunday'),
(41910, '09-08-2014', 'Monday'),
(41911, '09-09-2014', 'Tuesday')
This is rather complicated. Here is an approach using window functions.
First, use the date table to enumerate the dates without weekends (you can also take out holidays if you want). Then, expand the periods into one day per row, by using a non-equijoin.
You can then use a trick to identify sequential days. This trick is to generate a sequential number for each id and subtract it from the sequential number for the dates. This is a constant for sequential days. The final step is simply an aggregation.
The resulting query is something like this:
with d as (
select d.*, row_number() over (order by date) as seqnum
from dates d
where day not in ('Saturday', 'Sunday')
)
select t.id, min(t.date) as startdate, max(t.date) as enddate, sum(duration)
from (select t.*, ds.seqnum, ds.date,
(ds.seqnum - row_number() over (partition by id order by ds.date) ) as grp
from table t join
d ds
on ds.date between t.startdate and t.enddate
) t
group by t.id, grp;
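The "subtract a row number" trick may be easier to see outside SQL. In this rough Python sketch (the function name is invented for illustration), the input is the list of working-day sequence numbers covered by one id's leave rows. Within a run of consecutive working days, the value minus its position in the sorted list stays constant, so that difference can serve as the group key.

```python
def group_consecutive(seq_numbers):
    """Group working-day sequence numbers into runs of consecutive values.

    Each value minus its position in the sorted list is the same for all
    members of a run, which is what seqnum - row_number() computes.
    """
    islands = {}
    for position, seq in enumerate(sorted(seq_numbers)):
        islands.setdefault(seq - position, []).append(seq)
    # Report each run as (first, last); dicts keep insertion order.
    return [(run[0], run[-1]) for run in islands.values()]
```

Because weekends are removed before numbering, a Friday and the following Monday get adjacent sequence numbers and land in the same run.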
EDIT:
The following is the version on this SQL Fiddle:
with d as (
select d.*, row_number() over (order by date) as seqnum
from datetable d
where day not in ('Saturday', 'Sunday')
)
select t.id, min(t.date) as startdate, max(t.date) as enddate, sum(duration)
from (select t.*, ds.seqnum, ds.date,
(ds.seqnum - row_number() over (partition by id order by ds.date) ) as grp
from (select t.*, 'abc' as id from table1 t) t join
d ds
on ds.dateid between t.startdate and t.enddate
) t
group by t.id, grp;
I believe this is working, but the date table doesn't have all the dates in it.