Date difference between two locations in same table with one date column - sql

A tag is physically placed at a client location and is moved from place to place. I need to find how long it stayed at each location. For example, if the tag is placed in location 1 at 10:00 am and moved to location 2 at 10:15, the time difference is 15 minutes. Here is the sample data I have:
create table #Tagm (tagname varchar(10),created_date datetime ,Loc int )
insert into #Tagm values ('AC1', '2018-07-01 09:35:37.370' ,56)
,( 'AC1', '2018-07-01 10:35:37.370' ,64),( 'AC1', '2018-07-01 10:55:37.370' ,84),( 'AC1', '2018-07-01 11:55:37.370' ,76)
I tried this, but it gives me the total time across all the locations:
select tagname ,DATEDIFF(MINUTE, min(created_date),max(created_date) )as totaltime
from #Tagm
group by tagname
The result I am looking for is shown below.
Any help will be appreciated.

Because you mentioned it is possible to have the same location several times in a row, you need to find the true start and end of the time at that location. You can do that with LAG, similar to one of the other answers. After finding the true start and end, you can take the difference. This can be done with fewer Common Table Expressions or as a subquery, but I have split it like this so you can see the logic a bit more easily.
I also added a second tagname, and gave both tags a case where the location doesn't change between consecutive rows.
create table #Tagm (tagname varchar(10),created_date datetime ,Loc int )
insert into #Tagm values ('AC1', '2018-07-01 09:35:37.370' ,56), ('AC1', '2018-07-01 09:40:37.370' ,56) ,( 'AC1', '2018-07-01 10:35:37.370' ,64),( 'AC1', '2018-07-01 10:55:37.370' ,84),( 'AC1', '2018-07-01 11:55:37.370' ,76)
insert into #Tagm values ('AC2', '2018-08-01 09:35:37.370' ,56), ('AC2', '2018-08-01 09:40:37.370' ,64) ,( 'AC2', '2018-08-01 10:35:37.370' ,64),( 'AC2', '2018-08-01 10:55:37.370' ,84),( 'AC2', '2018-08-01 11:55:37.370' ,76)
;WITH cte AS (
SELECT
*
,LAG(Loc) OVER (PARTITION BY tagname ORDER BY created_date) as PrevLoc
FROM
#Tagm
)
, cteLocationStart AS (
SELECT
*
,IIF(PrevLoc IS NULL or PrevLoc <> Loc, 1,0) as StartSequence
FROM
cte
)
SELECT
s.tagname
,s.Loc
,s.created_date as StartDateTime
,MIN(n.created_date) as EndDateTime
,DATEDIFF(MINUTE,s.created_date, MIN(n.created_date)) as TotalTime
FROM
cteLocationStart s
LEFT JOIN cteLocationStart n
ON s.tagname = n.tagname
AND s.created_date < n.created_date
AND n.StartSequence = 1
WHERE
s.StartSequence = 1
GROUP BY
s.tagname
,s.Loc
,s.created_date
ORDER BY
tagname
,StartDateTime
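The logic of the query above can also be sketched outside SQL. The following is a minimal Python illustration, not T-SQL; the function name `time_per_location` and the tuple layout are assumptions made for this example. It keeps only the rows where the location changes and measures the time until the next change, using the AC1 rows from the sample data.

```python
from datetime import datetime
from itertools import groupby

# Sample rows for tag AC1 from the answer above: (tagname, created_date, Loc)
rows = [
    ("AC1", datetime(2018, 7, 1, 9, 35, 37), 56),
    ("AC1", datetime(2018, 7, 1, 9, 40, 37), 56),
    ("AC1", datetime(2018, 7, 1, 10, 35, 37), 64),
    ("AC1", datetime(2018, 7, 1, 10, 55, 37), 84),
    ("AC1", datetime(2018, 7, 1, 11, 55, 37), 76),
]

def time_per_location(rows):
    """Return (tag, loc, start, end, minutes) for each unbroken stay at a location."""
    result = []
    ordered = sorted(rows, key=lambda r: (r[0], r[1]))
    for tag, tag_rows in groupby(ordered, key=lambda r: r[0]):
        # Keep only rows whose location differs from the previous row
        # (the StartSequence = 1 rows in the SQL).
        starts = []
        prev_loc = None
        for _, created, loc in tag_rows:
            if prev_loc is None or loc != prev_loc:
                starts.append((created, loc))
            prev_loc = loc
        # Each stay ends when the next stay starts; the last stay is open-ended.
        for i, (start, loc) in enumerate(starts):
            end = starts[i + 1][0] if i + 1 < len(starts) else None
            # Whole elapsed minutes; DATEDIFF(MINUTE, ...) counts minute
            # boundaries instead, which can differ by one in edge cases.
            minutes = None if end is None else int((end - start).total_seconds() // 60)
            result.append((tag, loc, start, end, minutes))
    return result

stays = time_per_location(rows)
for stay in stays:
    print(stay)
```

The two consecutive rows at location 56 collapse into one stay, and the last location has no end time, which corresponds to the NULL that the LEFT JOIN produces.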

I think you just want lead():
SELECT tagname,
DATEDIFF(MINUTE,
created_date,
LEAD(created_date) OVER (PARTITION BY tagname
ORDER BY created_date
)
) AS totaltime
FROM #Tagm t;

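-- number the rows per tag, newest first, so that rn + 1 is the row just before in time
with CTE as
(select row_number() over (partition by tagname order by created_date desc) as rn, created_date, tagname, loc from #Tagm
)
-- the difference is in minutes (mi), so the column is named accordingly
SELECT t1.loc, t1.created_date, t1.tagname, DATEDIFF(mi, t1.created_date, t2.created_date)
AS minutes FROM CTE t1
LEFT JOIN CTE t2
ON t1.tagname = t2.tagname AND t1.rn = t2.rn + 1 ORDER BY t1.tagname, t1.created_date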

Related

Query that returns date of first and last record for each day within a d

I have a table that records vehicle locations and I wish to query this to get the first and the last record for each vehicle for each day in a date range. The table looks like:
Registration Latitude Longitude dateOfRecord
A1 XBO 123.066 1.456 2019-08-01 00:04:19.000
A1 XBO 128.066 1.436 2019-08-01 22:04:19.000
A1 XBO 118.066 1.456 2019-08-01 23:45:00.000
There are multiple vehicles, with three weeks' worth of data (100,000 records) held in the table. It is written to an archive every night, which leaves 21 days of records that I wish to query. With my sample I would like to get:
Reg Day StartTime StartLat StartLong EndTime EndLat EndLong
A1 XBO 01-08-19 00:04 123.066 1.456 23:45 118.066 1.456
I have an existing query that gets the most recent records but this can't be used for my requirements as it uses the MAX(ID) within the query and I don't believe that you can mix both MAX and MIN in the same query. I could use this as the basis of a table in a stored procedure and then loop through the records and query each to get the first record in the date range but this would be a very resource greedy process! I have included this purely to show what I already have:
SELECT TOP (100) PERCENT m.Registration, m.Location, m.dateoffix,
m.Latitude, m.Longitude, MAX(m.ID) AS ID
FROM dbo.GPSPositions AS m
INNER JOIN
(SELECT Registration AS vr,
MAX(CONVERT(datetime, dateoffix, 103)) AS tdate
FROM dbo.GPSPositions
GROUP BY Registration) AS s ON m.Registration =
s.vr AND CONVERT(datetime, m.dateoffix, 103) = s.tdate
GROUP BY m.Registration, m.Location, m.dateoffix, m.Latitude, m.Longitude
ORDER BY m.Registration
You can mix Max and Min in the same query.
with firstLast (Registration, firstRec, lastRec) as
(
select [Registration], min([dateOfRecord]) as firstRec, max(dateOfRecord) as lastRec
from GPSPositions
group by [Registration], cast(dateOfRecord as Date)
)
select
fl.Registration as Reg,
Cast(gpsF.dateOfRecord as Date) as [Day],
Cast(gpsF.dateOfRecord as Time) as [StartTime],
gpsF.Latitude as StartLat,
gpsF.Longitude as StartLon,
Cast(gpsL.dateOfRecord as Time) as [EndTime],
gpsL.Latitude as EndLat,
gpsL.Longitude as EndLon
from firstLast fl
inner join GPSPositions gpsF on gpsF.Registration = fl.Registration and gpsF.dateOfRecord = fl.firstRec
inner join GPSPositions gpsL on gpsL.Registration = fl.Registration and gpsL.dateOfRecord = fl.lastRec;
Here is DBFiddle demo.
EDIT: If there could be entries for the same registration at the same time (ID is unique and increasing - ordered by dateOfRecord):
with firstLast (registration,firstRec, lastRec) as
(
select registration,min(id) as firstRec, max(id) as lastRec
from GPSPositions
group by [Registration], cast(dateOfRecord as Date)
)
select
fl.Registration as Reg,
Cast(gpsF.dateOfRecord as Date) as [Day],
Cast(gpsF.dateOfRecord as Time) as [StartTime],
gpsF.Latitude as StartLat,
gpsF.Longitude as StartLon,
Cast(gpsL.dateOfRecord as Time) as [EndTime],
gpsL.Latitude as EndLat,
gpsL.Longitude as EndLon
from firstLast fl
inner join GPSPositions gpsF on gpsF.Id = fl.firstRec
inner join GPSPositions gpsL on gpsL.ID = fl.lastRec;
You could use the APPLY operator and do something like:
DECLARE @t table
(
Registration varchar(10)
, Latitude decimal(6, 3)
, Longitude decimal(6, 3)
, dateOfRecord datetime
)
INSERT INTO @t
VALUES
('A1 XBO', 123.066, 1.456, '2019-08-01 00:04:19.000')
, ('A1 XBO', 128.066, 1.436, '2019-08-01 22:04:19.000')
, ('A1 XBO', 118.066, 1.456, '2019-08-01 23:45:00.000')
SELECT DISTINCT
Registration Reg
, CAST(dateOfRecord AS date) [Day]
, T_MIN.[Time] StartTime
, T_MIN.Latitude StartLat
, T_MIN.Longitude StartLong
, T_MAX.[Time] EndTime
, T_MAX.Latitude EndLat
, T_MAX.Longitude EndLong
FROM
@t T
OUTER APPLY
(
SELECT TOP 1
CAST(T_MIN.dateOfRecord AS time) [Time]
, Latitude
, Longitude
FROM @t T_MIN
WHERE
T_MIN.Registration = T.Registration
AND CAST(T_MIN.dateOfRecord AS date) = CAST(T.dateOfRecord AS date)
ORDER BY T_MIN.dateOfRecord
) T_MIN
OUTER APPLY
(
SELECT TOP 1
CAST(T_MAX.dateOfRecord AS time) [Time]
, Latitude
, Longitude
FROM @t T_MAX
WHERE
T_MAX.Registration = T.Registration
AND CAST(T_MAX.dateOfRecord AS date) = CAST(T.dateOfRecord AS date)
ORDER BY T_MAX.dateOfRecord DESC
) T_MAX
Edit
Based on @SMor's comment, you could also try something like:
DECLARE @t table
(
Registration varchar(10)
, Latitude decimal(6, 3)
, Longitude decimal(6, 3)
, dateOfRecord datetime
)
INSERT INTO @t
VALUES
('A1 XBO', 123.066, 1.456, '2019-08-01 00:04:19.000')
, ('A1 XBO', 128.066, 1.436, '2019-08-01 22:04:19.000')
, ('A1 XBO', 118.066, 1.456, '2019-08-01 23:45:00.000')
SELECT
Reg
, [Day]
, MIN([Time]) StartTime
-- take the coordinates from the first row (Mn = 1) and the last row (Mx = 1), not the smallest and largest values
, MAX(CASE WHEN Mn = 1 THEN Latitude END) StartLat
, MAX(CASE WHEN Mn = 1 THEN Longitude END) StartLong
, MAX([Time]) EndTime
, MAX(CASE WHEN Mx = 1 THEN Latitude END) EndLat
, MAX(CASE WHEN Mx = 1 THEN Longitude END) EndLong
FROM
(
SELECT
Registration Reg
, CAST(dateOfRecord AS date) [Day]
, CAST(dateOfRecord AS time) [Time]
, Latitude
, Longitude
, ROW_NUMBER() OVER (PARTITION BY Registration, CAST(dateOfRecord AS date) ORDER BY dateOfRecord) Mn
, ROW_NUMBER() OVER (PARTITION BY Registration, CAST(dateOfRecord AS date) ORDER BY dateOfRecord DESC) Mx
FROM @t T
) Q
WHERE
Mn = 1
OR Mx = 1
GROUP BY
Reg
, [Day]
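What all of these answers aim for is the whole first row and the whole last row per vehicle per day, so that latitude and longitude come from the same record as the time. As a rough illustration of that idea, here is a small Python sketch (not T-SQL); the function name `first_and_last_per_day` and the tuple layout are assumptions made for this example.

```python
from datetime import datetime

# (Registration, Latitude, Longitude, dateOfRecord) from the question's sample
records = [
    ("A1 XBO", 123.066, 1.456, datetime(2019, 8, 1, 0, 4, 19)),
    ("A1 XBO", 128.066, 1.436, datetime(2019, 8, 1, 22, 4, 19)),
    ("A1 XBO", 118.066, 1.456, datetime(2019, 8, 1, 23, 45, 0)),
]

def first_and_last_per_day(records):
    """Map (registration, day) to its (earliest record, latest record)."""
    summary = {}
    for rec in records:
        key = (rec[0], rec[3].date())
        if key not in summary:
            summary[key] = (rec, rec)
            continue
        first, last = summary[key]
        if rec[3] < first[3]:
            first = rec
        if rec[3] > last[3]:
            last = rec
        summary[key] = (first, last)
    return summary

summary = first_and_last_per_day(records)
for (reg, day), (first, last) in sorted(summary.items()):
    print(reg, day, first[3].time(), first[1], first[2],
          last[3].time(), last[1], last[2])
```

For the sample rows this yields a start of 00:04:19 at 123.066 / 1.456 and an end of 23:45:00 at 118.066 / 1.456, in line with the output the question asks for.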

SQL grouping data with overlapping timespans

I need to group data together that are related to each other by overlapping timespans based on the records start and end times. SQL-fiddle here: http://sqlfiddle.com/#!18/87e4b/1/0
The current query I have built is giving incorrect results. Callid 3 should give a callCount of 4. It does not because record 6 is not included since it does not overlap with 3, but should be included because it does overlap with one of the other related records. So I believe a recursive CTE may be in need but I am unsure how to write this.
Schema:
CREATE TABLE Calls
([callid] int, [src] varchar(10), [start] datetime, [end] datetime, [conf] varchar(5));
INSERT INTO Calls
([callid],[src],[start],[end],[conf])
VALUES
('1','5555550001','2019-07-09 10:00:00', '2019-07-09 10:10:00', '111'),
('2','5555550002','2019-07-09 10:00:01', '2019-07-09 10:11:00', '111'),
('3','5555550011','2019-07-09 11:00:00', '2019-07-09 11:10:00', '111'),
('4','5555550012','2019-07-09 11:00:01', '2019-07-09 11:11:00', '111'),
('5','5555550013','2019-07-09 11:01:00', '2019-07-09 11:15:00', '111'),
('6','5555550014','2019-07-09 11:12:00', '2019-07-09 11:16:00', '111'),
('7','5555550014','2019-07-09 15:00:00', '2019-07-09 15:01:00', '111');
Current query:
SELECT
detail_record.callid,
detail_record.conf,
MIN(related_record.start) AS sessionStart,
MAX(related_record.[end]) As sessionEnd,
COUNT(related_record.callid) AS callCount
FROM
Calls AS detail_record
INNER JOIN
Calls AS related_record
ON related_record.conf = detail_record.conf
AND ((related_record.start >= detail_record.start
AND related_record.start < detail_record.[end])
OR (related_record.[end] > detail_record.start
AND related_record.[end] <= detail_record.[end])
OR (related_record.start <= detail_record.start
AND related_record.[end] >= detail_record.[end])
)
WHERE
detail_record.start > '1/1/2019'
AND detail_record.conf = '111'
GROUP BY
detail_record.callid,
detail_record.start,
detail_record.conf
HAVING
MIN(related_record.start) >= detail_record.start
ORDER BY sessionStart DESC
Expected Results:
callid conf sessionStart sessionEnd callCount
7 111 2019-07-09T15:00:00Z 2019-07-09T15:01:00Z 1
3 111 2019-07-09T11:00:00Z 2019-07-09T11:15:00Z 4
1 111 2019-07-09T10:00:00Z 2019-07-09T10:11:00Z 2
This is a gaps-and-islands problem. It does not require a recursive CTE. You can use window functions:
select min(callid), conf, grouping, min([start]), max([end]), count(*)
from (select c.*,
sum(case when prev_end < [start] then 1 else 0 end) over (partition by conf order by [start]) as grouping
from (select c.*,
max([end]) over (partition by conf order by [start] rows between unbounded preceding and 1 preceding) as prev_end
from calls c
) c
) c
group by conf, grouping;
The innermost subquery calculates the previous end, taken as the latest end time among all earlier rows. The middle subquery compares this to the current start to determine which rows begin a new group. A cumulative sum of those flags then assigns the grouping.
Finally, the outer query aggregates to summarize information about each group.
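To make the running-maximum idea concrete, here is a minimal Python sketch of the same gaps-and-islands logic (not T-SQL). The function name `merge_sessions` and the dictionary keys are assumptions made for this example, and the reported callid is simply the first call by start time in each group.

```python
from datetime import datetime

# (callid, conf, start, end) from the question's schema
calls = [
    (1, "111", datetime(2019, 7, 9, 10, 0, 0), datetime(2019, 7, 9, 10, 10, 0)),
    (2, "111", datetime(2019, 7, 9, 10, 0, 1), datetime(2019, 7, 9, 10, 11, 0)),
    (3, "111", datetime(2019, 7, 9, 11, 0, 0), datetime(2019, 7, 9, 11, 10, 0)),
    (4, "111", datetime(2019, 7, 9, 11, 0, 1), datetime(2019, 7, 9, 11, 11, 0)),
    (5, "111", datetime(2019, 7, 9, 11, 1, 0), datetime(2019, 7, 9, 11, 15, 0)),
    (6, "111", datetime(2019, 7, 9, 11, 12, 0), datetime(2019, 7, 9, 11, 16, 0)),
    (7, "111", datetime(2019, 7, 9, 15, 0, 0), datetime(2019, 7, 9, 15, 1, 0)),
]

def merge_sessions(calls):
    """Group calls per conf whose spans overlap, directly or through other calls."""
    sessions = []
    current = {}  # conf -> the session that is still being extended
    for callid, conf, start, end in sorted(calls, key=lambda c: (c[1], c[2])):
        session = current.get(conf)
        # session["end"] plays the role of prev_end: the latest end seen so far.
        if session is None or session["end"] < start:
            session = {"callid": callid, "conf": conf,
                       "start": start, "end": end, "count": 1}
            sessions.append(session)
            current[conf] = session
        else:
            session["end"] = max(session["end"], end)
            session["count"] += 1
    return sessions

sessions = merge_sessions(calls)
for s in sessions:
    print(s["callid"], s["conf"], s["start"], s["end"], s["count"])
```

Call 6 does not overlap call 3 directly, but it starts before the running maximum end (11:15:00, from call 5), so it lands in the same group and the count for that group is 4.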
Here is a db<>fiddle.

What will be the best possible way to find date difference?

I have a table of operators in which I want to calculate the time difference between two statuses (10 and 20) over the whole day.
Here I want the time difference between "ActivityStatus" 10 and 20.
There are 3 pairs of 10-20 statuses in this picture. The last status 10 has no matching 20; in that case the last oa_createdDate (i.e. oa_id 230141) should be used.
My expected output for this operator is the date diff between cl_id 230096 and 230102, the date diff between cl_id 230103 and 230107, and the date diff between cl_id 230109 and cl_id 230141. Once I have these differences, I want to sum them to calculate the busy time for that operator.
Thanks in advance.
I have a sneaking suspicion that the DateDiff() function is the function that you seek
http://www.w3schools.com/sql/func_datediff.asp
There's an easy way to do what I assume you want done with outer apply, like so:
select tmin.*, t.oa_CreateDate oa_CreateDate_20
, datediff(minute, tmin.oa_CreateDate, t.oa_CreateDate) DiffInMinutes
from testtable t
cross apply
(select top 1 *
from testtable tmin
where tmin.oa_CreateDate < t.oa_CreateDate and tmin.oa_OperatorId = t.oa_OperatorId
order by tmin.oa_CreateDate asc) tmin
where t.ActivityStatus = 20
and t.oa_CreateDate < (select min(oa_CreateDate) from testtable where ActivityStatus = 10 and oa_OperatorId = 1960)
and t.oa_OperatorId = 1960
union all
select t.*
, coalesce(a.oa_CreateDate,ma.MaxDate) oa_CreateDate_20
, datediff(minute, t.oa_CreateDate, coalesce(a.oa_CreateDate,ma.MaxDate)) DiffInMinutes
from testtable t
outer apply
(select top 1 a.oa_CreateDate
from testtable a
where a.oa_OperatorId = t.oa_OperatorId and a.ActivityStatus = 20
and t.oa_CreateDate < a.oa_CreateDate order by a.oa_CreateDate asc) a
outer apply
(select max(a2.oa_CreateDate) maxDate
from testtable a2
where a2.oa_OperatorId = t.oa_OperatorId
and t.oa_CreateDate < a2.oa_CreateDate) ma
where oa_OperatorId = 1960
and ActivityStatus = 10
order by oa_CreateDate asc, oa_CreateDate_20 asc
You can see the fiddle here.
But of course, you have to give us the format / accuracy for the datediff comparison. And this assumes you will always have both Status 10 AND 20, and that their timestamp ranges never overlap.
EDIT: Updated the answer based on your comment, check the new script and fiddle. Now the script will find all Status 10 - 20 datediffs, and in case no Status 20 exists after the last 10, then the latest existing timestamp after that Status 10 will be used instead.
EDIT 2: Updated with your comment below. But at this point the script is getting rather ugly. Unfortunately I don't have the time to clean it up, so I ask that next time you post a question, please make it as clear cut and clean as possible, since there's a lot less effort involved to answer a question once instead of editing 3 different variations along the ride. :)
This should work anyhow, the new section before the UNION ALL in the script will return results only if there are any Status 20's without preceding 10's. Otherwise it'll return nothing, and move to the main portion of the script as before. Fiddle has been updated as well.
This is one way of doing it.
The first OUTER APPLY will retrieve the next row with a status of 20 that is after the current created datetime.
The second OUTER APPLY will retrieve the next row after the current created datetime where there is no status 20.
SELECT
o.*
, COALESCE(NextStatus.oa_CreateDate, NextStatusIsNull.oa_CreateDate) AS NextTimestamp
, COALESCE(NextStatus.ActivityStatus, NextStatusIsNull.ActivityStatus) AS NextStatus
, DATEDIFF(MINUTE, o.oa_CreateDate,
COALESCE(NextStatus.oa_CreateDate, NextStatusIsNull.oa_CreateDate))
AS DifferenceInMinutes
FROM
operators AS o
OUTER APPLY
(
SELECT TOP 1
oa_CreateDate
, ActivityStatus
FROM
operators
WHERE
ActivityStatus = 20
AND oa_CreateDate > o.oa_CreateDate
ORDER BY
oa_CreateDate
) AS NextStatus
OUTER APPLY
(
SELECT TOP 1
oa_CreateDate
, ActivityStatus
FROM
operators
WHERE
NextStatus.oa_CreateDate IS NULL
AND oa_CreateDate > o.oa_CreateDate
ORDER BY
oa_CreateDate
) AS NextStatusIsNull
WHERE
ActivityStatus = 10
I have used some different test data because you used a picture from which I was unable to cut and paste. This should be easy to convert to your table:
Note that this should also work when the start or end status is missing.
Also note that this was done without any joins, to optimize performance.
Test table and data:
DECLARE @t table(ActivityStatus int, oa_createdate datetime, oa_operatorid int)
INSERT @t values
(30, '2015-07-23 08:20', 1960),(20, '2015-07-23 08:24', 1960),
(10, '2015-07-23 08:30', 1960),(20, '2015-07-23 08:40', 1960),
(10, '2015-07-23 08:50', 1960),(50, '2015-07-23 09:40', 1960)
Query:
;WITH cte as
(
SELECT
ActivityStatus,
oa_createdate,
oa_operatorid
FROM @t
WHERE ActivityStatus in (10,20)
UNION ALL
SELECT 20, max(oa_createdate), oa_operatorid
FROM @t
GROUP BY oa_operatorid
HAVING
max(case when ActivityStatus = 20 then oa_createdate end) <
max(case when ActivityStatus = 10 then oa_createdate end)
UNION ALL
SELECT 10, min(oa_createdate), oa_operatorid
FROM @t
GROUP BY oa_operatorid
HAVING
min(case when ActivityStatus = 20 then oa_createdate end) <
min(case when ActivityStatus = 10 then oa_createdate else '2999-01-01' end)
)
SELECT
cast(cast(sum(case when activitystatus = 10 then -1 else 1 end
* cast(oa_createdate as float)) as datetime) as time(0)) as difference_in_time,
oa_operatorid
FROM cte
GROUP BY oa_operatorid
Result:
difference_in_time oa_operatorid
01:04:00 1960
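The arithmetic behind that result is a signed sum: every status 20 timestamp is added and every status 10 timestamp is subtracted, after a closing 20 or an opening 10 has been added where one is missing. A small Python sketch of the same calculation (not T-SQL, for a single operator; the function name `busy_time` is an assumption made for this example):

```python
from datetime import datetime, timedelta

# (ActivityStatus, oa_createdate) for operator 1960, from the test data above
events = [
    (30, datetime(2015, 7, 23, 8, 20)),
    (20, datetime(2015, 7, 23, 8, 24)),
    (10, datetime(2015, 7, 23, 8, 30)),
    (20, datetime(2015, 7, 23, 8, 40)),
    (10, datetime(2015, 7, 23, 8, 50)),
    (50, datetime(2015, 7, 23, 9, 40)),
]

def busy_time(events):
    """Signed sum of timestamps: +1 for each status 20, -1 for each status 10."""
    marks = [(status, ts) for status, ts in events if status in (10, 20)]
    tens = [ts for status, ts in marks if status == 10]
    twenties = [ts for status, ts in marks if status == 20]
    all_times = [ts for _, ts in events]
    # A trailing 10 with no later 20 is closed by the latest timestamp of any status.
    if tens and twenties and max(twenties) < max(tens):
        marks.append((20, max(all_times)))
    # A leading 20 with no earlier 10 is opened by the earliest timestamp of any status.
    if twenties and (not tens or min(twenties) < min(tens)):
        marks.append((10, min(all_times)))
    base = datetime(1900, 1, 1)  # arbitrary reference point; it cancels out
    seconds = sum((1 if status == 20 else -1) * (ts - base).total_seconds()
                  for status, ts in marks)
    return timedelta(seconds=seconds)

total = busy_time(events)
print(total)
```

With the test data the pairs are 08:20 to 08:24, 08:30 to 08:40 and 08:50 to 09:40, which is 4 + 10 + 50 = 64 minutes. Like the query, the sum is only meaningful when the number of 10s and 20s ends up equal.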
Data
create table #Table2 (oa_id int, oa_OperatorId int, ActivityStatus int, oa_CreateDate datetime)
insert into #Table2
values (1, 1960,10,'2015-08-10 10:55:12.317')
,(2, 1960,20,'2015-08-10 11:55:12.317')
,(3, 1960,30,'2015-08-10 14:55:12.317')
,(4, 1960,50,'2015-08-10 14:58:12.317')
,(5, 1960,10,'2015-08-10 15:55:12.317')
,(6, 1960,20,'2015-08-10 16:20:12.317')
,(7, 1960,10,'2015-08-10 16:30:12.317')
,(8, 1960,50,'2015-08-10 17:20:12.317')
Populate target table with the rows we are interested in
select oa_id,
oa_operatorid,
ActivityStatus,
oa_createDate,
rn = row_number() over (order by oa_id desc)
into #Table
from #Table2
where ActivityStatus in (10, 20)
insert #Table
select top 1
oa_id,
oa_operatorid,
ActivityStatus,
oa_createDate,
0
from #Table2
order by oa_id desc
select * into #Table10 from #Table where ActivityStatus = 10
select * into #Table20 from #Table where ActivityStatus = 20
union
select * from #Table where rn = 0 /*add the last record*/
except
select * from #Table where rn = (select max(rn) from #Table) /**discard the first "20" record*/
/*free time info*/
select datediff(second, t10.oa_createDate, t20.oa_createDate) secondssincelast10,
t20.*
from #Table10 t10 join #Table20 t20
on t10.rn = t20.rn + 1
and t10.oa_OperatorId = t20.oa_OperatorId
/*Summarized info per operator*/
select sum(datediff(second, t10.oa_createDate, t20.oa_createDate)) totalbusytime,
t20.oa_OperatorId
from #Table10 t10 join #Table20 t20
on t10.rn = t20.rn + 1
and t10.oa_OperatorId = t20.oa_OperatorId
group by t20.oa_OperatorId
The simplest way is:
DATEDIFF(expr1,expr2)
Example (this uses MySQL syntax; SQL Server's DATEDIFF takes a datepart as its first argument):
CREATE TABLE pins
(`id` int, `time` datetime)
;
INSERT INTO pins
(`id`, `time`)
VALUES
(1, '2013-11-15 05:25:25')
;
SELECT DATEDIFF(CURDATE(), `time`)
FROM `pins`

How to "group" events by difference between the row under specified duration between?

The issue I have is finding a way to group several events together. The only indicator I have is the time between two or three events. A person is doing some tasks (start/end), and everything within 14 hours is considered one working day. It can also run over midnight, so grouping by date is not an option.
I have built a query which gives me, in the first record, an indication of how many following records belong to it. (This is one approach to it.)
declare @MyTable table
(UserID int, StartDate datetime, FinishDate datetime, GroupCount int);
insert into @MyTable values
('6', '2014-03-18 10:20:00.000', '2014-03-18 13:10:00.000', '2'), --(should take StartDate from this row - and Enddate from next (2) row)
('6', '2014-03-18 13:35:00.000', '2014-03-18 16:25:00.000', '1'),
('6', '2014-03-19 12:05:00.000', '2014-03-19 14:55:00.000', '1'),
('21', '2014-03-14 14:50:00.000', '2014-03-14 15:40:00.000', '1'),
('21', '2014-03-18 13:35:00.000', '2014-03-18 16:55:00.000', '1'),
('99', '2014-03-10 08:05:00.000', '2014-03-10 10:55:00.000', '2'),
('99', '2014-03-10 11:20:00.000', '2014-03-10 14:10:00.000', '1'),
('99', '2014-03-11 10:20:00.000', '2014-03-11 13:10:00.000', '2'),
('99', '2014-03-11 13:50:00.000', '2014-03-11 16:40:00.000', '1');
select * from @MyTable
I need to find a way to group them together so that I get the "min" StartDate and "max" FinishDate.
In the end, it should look like this:
declare @MyResult table
(UserID int, StartDate datetime, FinishDate datetime);
insert into @MyResult values
('6', '2014-03-18 10:20:00.000', '2014-03-18 16:25:00.000'),
('6', '2014-03-19 12:05:00.000', '2014-03-19 14:55:00.000'),
('21', '2014-03-14 14:50:00.000', '2014-03-14 15:40:00.000'),
('21', '2014-03-18 13:35:00.000', '2014-03-18 16:55:00.000'),
('99', '2014-03-10 08:05:00.000', '2014-03-10 14:10:00.000'),
('99', '2014-03-11 10:20:00.000', '2014-03-11 16:40:00.000');
select UserID, StartDate, Finishdate, datediff (minute, StartDate, FinishDate) as Duration,
LEAD(startdate,1,NULL) over(partition by userid order by startdate) NextDuty,
DATEDIFF(minute,FinishDate,LEAD(StartDate,1,NULL) over(partition by userid order by StartDate)) as DifMin
from @MyResult
This also depends on the UserID. The GroupCount was just an idea, but I do not know how to jump ahead by 2 records to select the next start, the GroupCount field, etc.
A value of 2 would indicate that the current and next record belong together; 1 means only the current record.
There could also be 3 or 4 records belonging together.
All should be done in MS-SQL 2012.
Unfortunately, your condition seems to require iterating through the data, one row at a time. If the condition were "start a new row when there is an 8-hour gap at least", then there are some other possibilities. But, the logic has to start at the first row for each customer, assigning the group, and then using the logic to increment the group when all tasks within 14-hours have been identified.
The following approach uses a recursive CTE. The only other SQL alternative I can think of is a cursor.
with t as (
select t.*,
row_number() over (partition by UserId order by StartDate) as seqnum
from MyTable t
),
cte as (
select t.UserId, t.StartDate, t.FinishDate, seqnum,
1 as grp, t.StartDate as grp_start
from t
where seqnum = 1
union all
select t.UserId, t.StartDate, t.FinishDate, t.seqnum,
(case when t.StartDate - cte.grp_start <= 14.0/24
then cte.grp
else cte.grp + 1
end),
(case when t.StartDate - cte.grp_start <= 14.0/24
then cte.grp_start
else t.StartDate
end)
from cte join
t
on cte.UserId = t.UserId and
cte.seqnum = t.seqnum - 1
)
select userid, min(startdate), max(finishdate)
from cte
group by userid, grp
order by 1, 2;
You can see this work here.
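The row-by-row nature of the rule is easier to see in procedural form. Below is a minimal Python sketch of the same logic (not T-SQL): each user's tasks are walked in start order, and a new group begins whenever a task starts more than 14 hours after the start of the current group. The names `working_days` and `window` are assumptions made for this example.

```python
from datetime import datetime, timedelta
from itertools import groupby

def dt(text):
    """Parse a 'YYYY-MM-DD HH:MM' string."""
    return datetime.strptime(text, "%Y-%m-%d %H:%M")

# (UserID, StartDate, FinishDate) from the question's sample data
tasks = [
    (6, dt("2014-03-18 10:20"), dt("2014-03-18 13:10")),
    (6, dt("2014-03-18 13:35"), dt("2014-03-18 16:25")),
    (6, dt("2014-03-19 12:05"), dt("2014-03-19 14:55")),
    (21, dt("2014-03-14 14:50"), dt("2014-03-14 15:40")),
    (21, dt("2014-03-18 13:35"), dt("2014-03-18 16:55")),
    (99, dt("2014-03-10 08:05"), dt("2014-03-10 10:55")),
    (99, dt("2014-03-10 11:20"), dt("2014-03-10 14:10")),
    (99, dt("2014-03-11 10:20"), dt("2014-03-11 13:10")),
    (99, dt("2014-03-11 13:50"), dt("2014-03-11 16:40")),
]

def working_days(tasks, window=timedelta(hours=14)):
    """Return (user, first start, last finish) for each working day."""
    result = []
    ordered = sorted(tasks, key=lambda t: (t[0], t[1]))
    for user, user_tasks in groupby(ordered, key=lambda t: t[0]):
        grp_start = None
        for _, start, finish in user_tasks:
            if grp_start is None or start - grp_start > window:
                # This task opens a new working day (grp + 1 in the SQL).
                grp_start = start
                result.append([user, start, finish])
            else:
                # Still inside the 14-hour window: extend the current day.
                result[-1][2] = max(result[-1][2], finish)
    return [tuple(r) for r in result]

days = working_days(tasks)
for day in days:
    print(day)
```

Because the window is anchored to the start of the current group and not to the previous row, each decision depends on the one before it, which is why the SQL version needs recursion or a cursor.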
I know this question is old, but there's a better way to do this. Given that your problem is a bit more complex than standard islands, might I offer:
select t.UserID
, t.StartDate, isnull(b.FinishDate, t.FinishDate) as FinishDate
, datediff(minute, t.StartDate, isnull(b.FinishDate, t.FinishDate)) as Duration
, n.NextDuty
, datediff(minute, isnull(b.FinishDate, t.FinishDate), n.NextDuty) as DiffMin
from @MyTable t
outer apply (
select top 1 FinishDate
from @MyTable b
where b.UserID = t.UserID
and b.StartDate > t.StartDate
and datediff(hh, t.StartDate, b.StartDate) < 14
order by b.StartDate desc
) b
outer apply (
select top 1 StartDate as NextDuty
from @MyTable n
where n.UserID = t.UserID
and n.StartDate > t.StartDate
and datediff(hh, t.StartDate, n.StartDate) > 14
order by n.StartDate
) n
where not exists (
select top 1 1
from @MyTable p
where p.UserID = t.UserID
and p.StartDate < t.StartDate
and datediff(hh, p.StartDate, t.StartDate) < 14
)
On the real table, you'll want to ensure this index is in place:
CREATE INDEX IX_nameThisIndex ON <#MyTable> (UserId, StartDate, FinishDate)
This should get you reliable results, and I've tested with additional data on my side, but haven't done so exhaustively with large sets. The index will be needed for large sets.
Hope this helps.

Merge adjacent rows in SQL?

I'm doing some reporting based on the blocks of time employees work. In some cases, the data contains two separate records for what really is a single block of time.
Here's a basic version of the table and some sample records:
EmployeeID
StartTime
EndTime
Data:
EmpID Start End
----------------------------
#1001 10:00 AM 12:00 PM
#1001 4:00 PM 5:30 PM
#1001 5:30 PM 8:00 PM
In the example, the last two records are contiguous in time. I'd like to write a query that combines any adjacent records so the result set is this:
EmpID Start End
----------------------------
#1001 10:00 AM 12:00 PM
#1001 4:00 PM 8:00 PM
Ideally, it should also be able to handle more than 2 adjacent records, but that is not required.
This article provides quite a few possible solutions to your question
http://www.sqlmag.com/blog/puzzled-by-t-sql-blog-15/tsql/solutions-to-packing-date-and-time-intervals-puzzle-136851
This one seems like the most straightforward:
WITH StartTimes AS
(
SELECT DISTINCT username, starttime
FROM dbo.Sessions AS S1
WHERE NOT EXISTS
(SELECT * FROM dbo.Sessions AS S2
WHERE S2.username = S1.username
AND S2.starttime < S1.starttime
AND S2.endtime >= S1.starttime)
),
EndTimes AS
(
SELECT DISTINCT username, endtime
FROM dbo.Sessions AS S1
WHERE NOT EXISTS
(SELECT * FROM dbo.Sessions AS S2
WHERE S2.username = S1.username
AND S2.endtime > S1.endtime
AND S2.starttime <= S1.endtime)
)
SELECT username, starttime,
(SELECT MIN(endtime) FROM EndTimes AS E
WHERE E.username = S.username
AND endtime >= starttime) AS endtime
FROM StartTimes AS S;
If this is strictly about adjacent rows (not overlapping ones), you could try the following method:
Unpivot the timestamps.
Leave only those that have no duplicates.
Pivot the remaining ones back, coupling every Start with the directly following End.
Or, in Transact-SQL, something like this:
WITH unpivoted AS (
SELECT
EmpID,
event,
dtime,
count = COUNT(*) OVER (PARTITION BY EmpID, dtime)
FROM atable
UNPIVOT (
dtime FOR event IN (StartTime, EndTime)
) u
)
, filtered AS (
SELECT
EmpID,
event,
dtime,
rowno = ROW_NUMBER() OVER (PARTITION BY EmpID, event ORDER BY dtime)
FROM unpivoted
WHERE count = 1
)
, pivoted AS (
SELECT
EmpID,
StartTime,
EndTime
FROM filtered
PIVOT (
MAX(dtime) FOR event IN (StartTime, EndTime)
) p
)
SELECT *
FROM pivoted
;
There's a demo for this query at SQL Fiddle.
CTE with cumulative sum:
DECLARE @t TABLE(EmpId INT, Start TIME, Finish TIME)
INSERT INTO @t (EmpId, Start, Finish)
VALUES
(1001, '10:00 AM', '12:00 PM'),
(1001, '4:00 PM', '5:30 PM'),
(1001, '5:30 PM', '8:00 PM')
;WITH rowind AS (
SELECT EmpId, Start, Finish,
-- IIF returns 1 for each row that should generate a new row in the final result
IIF(Start = LAG(Finish, 1) OVER(PARTITION BY EmpId ORDER BY Start), 0, 1) newrow
FROM @t),
groups AS (
SELECT EmpId, Start, Finish,
-- Cumulative sum
SUM(newrow) OVER(PARTITION BY EmpId ORDER BY Start) csum
FROM rowind)
SELECT
EmpId,
MIN(Start) Start,
MAX(Finish) Finish
FROM groups
GROUP BY EmpId, csum
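The flag-then-cumulative-sum pattern used here can be traced step by step in a few lines of Python (a sketch of the same logic, not T-SQL; the function name `merge_adjacent` is an assumption made for this example):

```python
from datetime import time
from itertools import groupby

# (EmpId, Start, Finish) from the sample data above
shifts = [
    (1001, time(10, 0), time(12, 0)),
    (1001, time(16, 0), time(17, 30)),
    (1001, time(17, 30), time(20, 0)),
]

def merge_adjacent(shifts):
    """Merge rows whose Start equals the previous row's Finish for the same employee."""
    merged = []
    ordered = sorted(shifts, key=lambda s: (s[0], s[1]))
    for emp, emp_shifts in groupby(ordered, key=lambda s: s[0]):
        csum = 0            # cumulative sum of the newrow flags
        prev_finish = None  # what LAG(Finish) would return
        groups = {}         # csum -> (earliest start, latest finish)
        for _, start, finish in emp_shifts:
            newrow = 0 if start == prev_finish else 1
            csum += newrow
            lo, hi = groups.get(csum, (start, finish))
            groups[csum] = (min(lo, start), max(hi, finish))
            prev_finish = finish
        merged.extend((emp, s, f) for s, f in groups.values())
    return merged

blocks = merge_adjacent(shifts)
for block in blocks:
    print(block)
```

For the sample rows the flags are 1, 1, 0, so the cumulative sums are 1, 2, 2 and the last two rows share a group, giving 10:00 to 12:00 and 16:00 to 20:00.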
I have changed the names and types a little to make the example smaller, but this works, should be very fast, and has no limit on the number of records:
with cte as (
select
x1.id
,x1.t1
,x1.t2
,case when x2.t1 is null then 1 else 0 end as bef
,case when x3.t1 is null then 1 else 0 end as aft
from x x1
left join x x2 on x1.id=x2.id and x1.t1=x2.t2
left join x x3 on x1.id=x3.id and x1.t2=x3.t1
where x2.id is null
or x3.id is null
)
select
cteo.id
,cteo.t1
,isnull(z.t2,cteo.t2) as t2
from cte cteo
outer apply (select top 1 *
from cte ctei
where cteo.id=ctei.id and cteo.aft=0 and ctei.t1>cteo.t1
order by t1) z
where cteo.bef=1
and the fiddle for it : http://sqlfiddle.com/#!3/ad737/12/0
Option with Inline User-Defined Function AND CTE
CREATE FUNCTION dbo.Overlap
(
@availStart datetime,
@availEnd datetime,
@availStart2 datetime,
@availEnd2 datetime
)
RETURNS TABLE
RETURN
SELECT CASE WHEN @availStart > @availEnd2 OR @availEnd < @availStart2
THEN @availStart ELSE
CASE WHEN @availStart > @availStart2 THEN @availStart2 ELSE @availStart END
END AS availStart,
CASE WHEN @availStart > @availEnd2 OR @availEnd < @availStart2
THEN @availEnd ELSE
CASE WHEN @availEnd > @availEnd2 THEN @availEnd ELSE @availEnd2 END
END AS availEnd
;WITH cte AS
(
SELECT EmpID, Start, [End], ROW_NUMBER() OVER (PARTITION BY EmpID ORDER BY Start) AS Id
FROM dbo.TableName
), cte2 AS
(
SELECT Id, EmpID, Start, [End]
FROM cte
WHERE Id = 1
UNION ALL
SELECT c.Id, c.EmpID, o.availStart, o.availEnd
FROM cte c JOIN cte2 ct ON c.Id = ct.Id + 1
CROSS APPLY dbo.Overlap(c.Start, c.[End], ct.Start, ct.[End]) AS o
)
SELECT EmpID, Start, MAX([End])
FROM cte2
GROUP BY EmpID, Start
Demo on SQLFiddle