Duration overlap causing double counting - sql

I'm using SQL Server Management Studio 2008 for query creation. Reporting Services 2008 for report creation.
I have been trying to work this out over a couple of weeks and I have hit a brick wall. I’m hoping someone will be able to come up with the solution as right now my brain has turned to mush.
I am currently developing an SQL query that will be feeding data through to a Reporting Services report. The aim of the report is to show the percentage of availability for first aid providers within locations around the county we are based in. The idea is that there should be only one first aider providing cover at a time at each of our 20 locations.
This has all been working fine apart from the first aiders at one location have been overlapping their cover at the start and end of each period of cover.
Example of cover overlap:
| Location | start_date | end_date |
+----------+---------------------+---------------------+
| Wick | 22/06/2015 09:00:00 | 22/06/2015 19:00:00 |
| Wick | 22/06/2015 18:30:00 | 23/06/2015 09:00:00 |
| Wick | 23/06/2015 09:00:00 | 23/06/2015 18:30:00 |
| Wick | 23/06/2015 18:00:00 | 24/06/2015 09:00:00 |
+----------+---------------------+---------------------+
In a perfect world the database that they set their cover in wouldn’t allow them to do this but it’s an externally developed database that doesn’t allow us to make changes like that to it. We also aren’t allowed to create functions, stored procedures, tally tables etc…
The query itself should return the number of minutes that each location has had first aid cover for, then broken down into hours of the day. Any overlap in cover shouldn’t end up adding additional cover and should be merged. One person can be on at a time, if they overlap then it should only count as one person lot of cover.
Example Output:
+----------+---------------------+---------------------+----------+--------------+--------+-------+------+----------+
| Location | fromDt | toDt | TimeDiff | Availability | DayN | DayNo | Hour | DayCount |
+----------+---------------------+---------------------+----------+--------------+--------+-------+------+----------+
| WicK | 22/06/2015 18:00:00 | 22/06/2015 18:59:59 | 59 | 100 | Monday | 1 | 18 | 0 |
| WicK | 22/06/2015 18:30:00 | 22/06/2015 18:59:59 | 29 | 50 | Monday | 1 | 18 | 0 |
| WicK | 22/06/2015 19:00:00 | 22/06/2015 19:59:59 | 59 | 100 | Monday | 1 | 19 | 0 |
+----------+---------------------+---------------------+----------+--------------+--------+-------+------+----------+
Example Code:
DECLARE
#StartTime datetime,
#EndTime datetime,
#GivenDate datetime;
SET #GivenDate = '2015-06-22';
SET #StartTime = #GivenDate + ' 00:00:00';
SET #EndTime = '2015-06-23' + ' 23:59:59';
Declare #Sample Table
(
Location Varchar(50),
StartDate Datetime,
EndDate Datetime
)
Insert #Sample
Select
sta.location,
act.Start,
act.END
from emp,
con,
sta,
act
where
emp.ID = con.ID
and con.location = sta.location
and SUBSTRING(sta.ident,3,2) in ('51','22')
and convert(varchar(10),act.start,111) between #GivenDate and #EndTime
and act.ACT= 18
group by sta.location,
act.Start,
act.END
order by 2
;WITH Yak (location, fromDt, toDt, maxDt,hourdiff)
AS (
SELECT location,
StartDate,
/*check if the period of cover rolls onto the next hour */
convert(datetime,convert(varchar(21),
CONVERT(varchar(10),StartDate,111)+' '
+convert(varchar(2),datepart(hour,StartDate))+':59'+':59'))
,
EndDate
,dateadd(hour,1,dateadd(hour, datediff(hour, 0, StartDate), 0))-StartDate
FROM #Sample
UNION ALL
SELECT location,
dateadd(second,1,toDt),
dateadd(hour, 1, toDt),
maxDt,
hourdiff
FROM Yak
WHERE toDt < maxDt
) ,
TAB1 (location, FROMDATE,TODATE1,TODATE) AS
(SELECT
location,
#StartTime,
convert(datetime,convert(varchar(21),
CONVERT(varchar(10),#StartTime,120)+' '
+convert(varchar(2),datepart(hour,#StartTime))+':59'+':59.999')),
#EndTime
from #Sample
UNION ALL
SELECT
location,
(DATEADD(hour, 1,(convert(datetime,convert(varchar(21),
CONVERT(varchar(10),FROMDATE,120)+' '
+convert(varchar(2),datepart(hour,FROMDATE))+':00'+':00.000')))))ToDate,
(DATEADD(hour, 1,(convert(datetime,convert(varchar(21),
CONVERT(varchar(10),TODATE1,120)+' '
+convert(varchar(2),datepart(hour,TODATE1))+':59'+':59.999'))))) Todate1,
TODATE
FROM TAB1 WHERE TODATE1 < TODATE
),
/*CTE Tab2 adds zero values to all possible hours between start and end dates */
TAB2 AS
(SELECT location, FROMDATE,
CASE WHEN TODATE1 > TODATE THEN TODATE ELSE TODATE1 END AS TODATE
FROM TAB1)
SELECT location,
fromDt,
/* Display MaxDT as start time if cover period goes into next dat */
CASE WHEN toDt > maxDt THEN maxDt ELSE toDt END AS toDt,
/* If the end date is on the next day find out the minutes between the start date and the end of the day or find out the minutes between the next day and the end date */
Case When ToDt > Maxdt then datediff(mi,fromDt,maxDt) else datediff(mi,FromDt,ToDt) end as TimeDiff,
Case When ToDt > Maxdt then round(datediff(S,fromDt,maxDt)/3600.0*100,0) else round(datediff(S,FromDt,ToDt)/3600.0*100.0,0) end as Availability,
/*Display the name of the day of the week*/
CASE WHEN toDt > maxDt THEN datename(dw,maxDt) ELSE datename(dw,fromDt) END AS DayN,
CASE WHEN toDt > maxDt THEN case when datepart(dw,maxDt)-1 = 0 then 7 else datepart(dw,maxDt)-1 end ELSE case when datepart(dw,fromDt)-1 = 0 then 7 else datepart(dw,fromDt)-1 END end AS DayNo
,DATEPART(hour, fromDt) as Hour,
'0' as DayCount
FROM Yak
where Case When ToDt > Maxdt then datediff(mi,fromDt,maxDt) else datediff(mi,FromDt,ToDt) end <> 0
group by location,fromDt,maxDt,toDt
Union all
SELECT
tab2.location,
convert(varchar(19),Tab2.FROMDATE,120),
convert(varchar(19),Tab2.TODATE,120),
'0',
'0',
datename(dw,FromDate) DayN,
case when datepart(dw,FromDate)-1 = 0 then 7 else datepart(dw,FromDate)-1 end AS DayNo,
DATEPART(hour, fromDate) as Hour,
COUNT(distinct datename(dw,fromDate))
FROM TAB2
Where datediff(MINUTE,convert(varchar(19),Tab2.FROMDATE,120),convert(varchar(19),Tab2.TODATE,120)) > 0
group by location, TODATE, FROMDATE
Order by 2
option (maxrecursion 0)
I have tried the following forum entries but they haven't worked in my case:
http://forums.teradata.com/forum/general/need-help-merging-consecutive-and-overlapping-date-spans
Checking for time range overlap, the watchman problem [SQL]
Calculate Actual Downtime ignoring overlap in dates/times
Sorry for being so lengthy but I thought I would try to give you as much detail as possible. Any help will be really appreciated. Thank you.

So the solution I came up with uses temp tables, which you can easily change to be CTEs so you can avoid using a stored procedure.
I tried working with window functions to find overlapping records and get the min and max times, the issue is where you have overlap chaining e.g. 09:00 - 09:10, 09:05 - 09:15, 09:11 - 09:20, so all minutes from 09:00 to 09:20 are covered, but it's almost impossible to tell that 09:00 - 09:10 is related to 09:11 - 09:20 without recursing through the results until you get to the bottom of the chain. (Hopefully that makes sense).
So I exploded out all of the date ranges into every minute between the StartDate and EndDate, then you can use the ROW_NUMBER() window function to catch any duplicates, which in turn you can use to see how many different people covered the same minute.
CREATE TABLE dbo.dates
(
Location VARCHAR(64),
StartDate DATETIME,
EndDate DATETIME
);
INSERT INTO dbo.dates VALUES
('Wick','20150622 09:00:00','20150622 19:00:00'),
('Wick','20150622 18:30:00','20150624 09:00:00'),
('Wick','20150623 09:00:00','20150623 18:30:00'),
('Wick','20150623 18:00:00','20150624 09:00:00'),
('Wick','20150630 09:00:00','20150630 09:30:00'),
('Wick','20150630 09:00:00','20150630 09:45:00'),
('Wick','20150630 09:10:00','20150630 09:25:00'),
('Wick','20150630 09:35:00','20150630 09:55:00'),
('Wick','20150630 09:57:00','20150630 10:10:00');
SELECT ROW_NUMBER() OVER (PARTITION BY Location ORDER BY StartDate) [Id],
Location,
StartDate,
EndDate
INTO dbo.overlaps
FROM dbo.dates;
SELECT TOP 10000 N=IDENTITY(INT, 1, 1)
INTO dbo.Num
FROM master.dbo.syscolumns a CROSS JOIN master.dbo.syscolumns b;
SELECT 0 [N] INTO dbo.Numbers;
INSERT INTO dbo.Numbers SELECT * FROM dbo.Num;
SELECT [Location] = raw.Location,
[WorkedDate] = CAST([MinuteWorked] AS DATE),
[DayN] = DATENAME(WEEKDAY, [MinuteWorked]),
[DayNo] = DATEPART(WEEKDAY, [MinuteWorked]) -1,
[Hour] = DATEPART(HOUR, [MinuteWorked]),
[MinutesWorked] = SUM(IIF(raw.[Minutes] = 1, 1, 0)),
[MaxWorkers] = MAX(raw.[Minutes])
FROM
(
SELECT
o.Location,
DATEADD(MINUTE, n.N, StartDate) [MinuteWorked],
ROW_NUMBER() OVER (PARTITION BY o.Location, DATEADD(MINUTE, n.N, StartDate) ORDER BY DATEADD(MINUTE, n.N, StartDate)) [Minutes]
FROM dbo.overlaps o
INNER JOIN dbo.Numbers n ON n.N < DATEDIFF(MINUTE, StartDate, EndDate)
) raw
GROUP BY
raw.Location,
CAST([MinuteWorked] AS DATE),
DATENAME(WEEKDAY, [MinuteWorked]),
DATEPART(WEEKDAY, [MinuteWorked]) - 1,
DATEPART(HOUR, [MinuteWorked])
Here's a subset of the results:
Location WorkedDate DayN DayNo Hour MinutesWorked MaxWorkers
Wick 2015-06-24 Wednesday 3 8 60 2
Wick 2015-06-30 Tuesday 2 9 58 3
Wick 2015-06-30 Tuesday 2 10 10 1
Here's the fiddle

Related

Group time series by time intervals (e.g. days) with aggregate of duration

I have a table containing a time series with following information. Each record represents the event of "changing the mode".
Timestamp | Mode
------------------+------
2018-01-01 12:00 | 1
2018-01-01 18:00 | 2
2018-01-02 01:00 | 1
2018-01-02 02:00 | 2
2018-01-04 04:00 | 1
By using the LEAD function, I can create a query with the following result. Now each record contains the information, when and how long the "mode was active".
Please check the 2nd and the 4th record. They "belong" to multiple days.
StartDT | EndDT | Mode | Duration
------------------+------------------+------+----------
2018-01-01 12:00 | 2018-01-01 18:00 | 1 | 6:00
2018-01-01 18:00 | 2018-01-02 01:00 | 2 | 7:00
2018-01-02 01:00 | 2018-01-02 02:00 | 1 | 1:00
2018-01-02 02:00 | 2018-01-04 04:00 | 2 | 50:00
2018-01-04 04:00 | (NULL) | 1 | (NULL)
Now I would like to have a query that groups the data by day and mode and aggregates the duration.
This result table is needed:
Date | Mode | Total
------------+------+-------
2018-01-01 | 1 | 6:00
2018-01-01 | 2 | 6:00
2018-01-02 | 1 | 1:00
2018-01-02 | 2 | 23:00
2018-01-03 | 2 | 24:00
2018-01-04 | 2 | 04:00
I didn't known how to handle the records that "belongs" to multiple days. Any ideas?
create table ChangeMode ( ModeStart datetime2(7), Mode int )
insert into ChangeMode ( ModeStart, Mode ) values
( '2018-11-15T21:00:00.0000000', 1 ),
( '2018-11-16T17:18:19.1231234', 2 ),
( '2018-11-16T18:00:00.5555555', 1 ),
( '2018-11-16T18:00:01.1234567', 2 ),
( '2018-11-16T19:02:22.8888888', 1 ),
( '2018-11-16T20:00:00.9876543', 2 ),
( '2018-11-17T09:00:00.0000000', 1 ),
( '2018-11-17T23:23:23.0230450', 2 ),
( '2018-11-19T17:00:00.0172839', 1 ),
( '2018-11-20T03:07:00.7033077', 2 )
;
with
-- Determine the earliest and latest dates.
-- Cast to date to remove the time portion.
-- Cast results back to datetime because we're going to add hours later.
MinMaxDates
as
(select cast(min(cast(ModeStart as date))as datetime) as MinDate,
cast(max(cast(ModeStart as date))as datetime) as MaxDate from ChangeMode),
-- How many days have passed during that period
Dur
as
(select datediff(day,MinDate,MaxDate) as Duration from MinMaxDates),
-- Create a list of numbers.
-- These will be added to MinDate to get a list of dates.
NumList
as
( select 0 as Num
union all
select Num+1 from NumList,Dur where Num<Duration ),
-- Create a list of dates by adding those numbers to MinDate
DayList
as
( select dateadd(day,Num,MinDate)as ModeDate from NumList, MinMaxDates ),
-- Create a list of day periods
PeriodList
as
( select ModeDate as StartTime,
dateadd(day,1,ModeDate) as EndTime
from DayList ),
-- Use LEAD to get periods for each record
-- Final record would return NULL for ModeEnd
-- We replace that with end of last day
ModePeriodList
as
( select ModeStart,
coalesce( lead(ModeStart)over(order by ModeStart),
dateadd(day,1,MaxDate) ) as ModeEnd,
Mode from ChangeMode, MinMaxDates ),
ModeDayList
as
( select * from ModePeriodList, PeriodList
where ModeStart<=EndTime and ModeEnd>=StartTime
),
-- Keep the later of the mode start time, and the day start time
-- Keep the earlier of the mode end time, and the day end time
ModeDayPeriod
as
( select case when ModeStart>=StartTime then ModeStart else StartTime end as StartTime,
case when ModeEnd<=EndTime then ModeEnd else EndTime end as EndTime,
Mode from ModeDayList ),
SumDurations
as
( select cast(StartTime as date) as ModeDate,
Mode,
DateDiff_Big(nanosecond,StartTime,EndTime)
/3600000000000
as DurationHours from ModeDayPeriod )
-- List the results in order
-- Use MaxRecursion option in case there are more than 100 days
select ModeDate as [Date], Mode, sum(DurationHours) as [Total Duration Hours]
from SumDurations
group by ModeDate, Mode
order by ModeDate, Mode
option (maxrecursion 0)
Result is:
Date Mode Total Duration Hours
---------- ----------- ---------------------------------------
2018-11-15 1 3.00000000000000
2018-11-16 1 18.26605271947221
2018-11-16 2 5.73394728052777
2018-11-17 1 14.38972862361111
2018-11-17 2 9.61027137638888
2018-11-18 2 24.00000000000000
2018-11-19 1 6.99999519891666
2018-11-19 2 17.00000480108333
2018-11-20 1 3.11686202991666
2018-11-20 2 20.88313797008333
you could use a CTE to create a table of days then join the time slots to it
DECLARE #MAX as datetime2 = (SELECT MAX(CAST(Timestamp as date)) MX FROM process);
WITH StartEnd AS (select p1.Timestamp StartDT,
P2.Timestamp EndDT ,
p1.mode
from process p1
outer apply
(SELECT TOP 1 pOP.* FROM
process pOP
where pOP.Timestamp > p1.Timestamp
order by pOP.Timestamp asc) P2
),
CAL AS (SELECT (SELECT MIN(cast(StartDT as date)) MN FROM StartEnd) DT
UNION ALL
SELECT DATEADD(day,1,DT) DT FROM CAL WHERE CAL.DT < #MAX
),
TMS AS
(SELECT CASE WHEN S.StartDT > C.DT THEN S.StartDT ELSE C.DT END AS STP,
CASE WHEN S.EndDT < DATEADD(day,1,C.DT) THEN S.ENDDT ELSE DATEADD(day,1,C.DT) END AS STE
FROM StartEnd S JOIN CAL C ON NOT(S.EndDT <= C.DT OR S.StartDT>= DATEADD(day,1,C.dt))
)
SELECT *,datediff(MI ,TMS.STP, TMS.ste) as x from TMS
The following uses recursive CTE to build a list of dates (a calendar or number table works equally well). It then intersect the dates with date times so that missing dates are populated with matching data. The important bit is that for each row, if start datetime belongs to previous day then it is clamped to 00:00. Likewise for end datetime.
DECLARE #t TABLE (timestamp DATETIME, mode INT);
INSERT INTO #t VALUES
('2018-01-01 12:00', 1),
('2018-01-01 18:00', 2),
('2018-01-02 01:00', 1),
('2018-01-02 02:00', 2),
('2018-01-04 04:00', 1);
WITH cte1 AS (
-- the min and max dates in your data
SELECT
CAST(MIN(timestamp) AS DATE) AS mindate,
CAST(MAX(timestamp) AS DATE) AS maxdate
FROM #t
), cte2 AS (
-- build all dates between min and max dates using recursive cte
SELECT mindate AS day_start, DATEADD(DAY, 1, mindate) AS day_end, maxdate
FROM cte1
UNION ALL
SELECT DATEADD(DAY, 1, day_start), DATEADD(DAY, 2, day_start), maxdate
FROM cte2
WHERE day_start < maxdate
), cte3 AS (
-- pull end datetime from next row into current
SELECT
timestamp AS dt_start,
LEAD(timestamp) OVER (ORDER BY timestamp) AS dt_end,
mode
FROM #t
), cte4 AS (
-- join datetime with date using date overlap query
-- then clamp start datetime to 00:00 of the date
-- and clamp end datetime to 00:00 of next date
SELECT
IIF(dt_start < day_start, day_start, dt_start) AS dt_start_fix,
IIF(dt_end > day_end, day_end, dt_end) AS dt_end_fix,
mode
FROM cte2
INNER JOIN cte3 ON day_end > dt_start AND dt_end > day_start
)
SELECT dt_start_fix, dt_end_fix, mode, datediff(minute, dt_start_fix, dt_end_fix) / 60.0 AS total
FROM cte4
DB Fiddle
Thanks everybody!
The answer from Cato put me on the right track. Here my final solution:
DECLARE #Start AS datetime;
DECLARE #End AS datetime;
DECLARE #Interval AS int;
SET #Start = '2018-01-01';
SET #End = '2018-01-05';
SET #Interval = 24 * 60 * 60;
WITH
cteDurations AS
(SELECT [Timestamp] AS StartDT,
LEAD ([Timestamp]) OVER (ORDER BY [Timestamp]) AS EndDT,
Mode
FROM tblLog
WHERE [Timestamp] BETWEEN #Start AND #End
),
cteTimeslots AS
(SELECT #Start AS StartDT,
DATEADD(SECOND, #Interval, #Start) AS EndDT
UNION ALL
SELECT EndDT,
DATEADD(SECOND, #Interval, EndDT)
FROM cteTimeSlots WHERE StartDT < #End
),
cteDurationsPerTimesplot AS
(SELECT CASE WHEN S.StartDT > C.StartDT THEN S.StartDT ELSE C.StartDT END AS StartDT,
CASE WHEN S.EndDT < C.EndDT THEN S.EndDT ELSE C.EndDT END AS EndDT,
C.StartDT AS Slot,
S.Mode
FROM cteDurations S
JOIN cteTimeslots C ON NOT(S.EndDT <= C.StartDT OR S.StartDT >= C.EndDT)
)
SELECT Slot,
Mode,
SUM(DATEDIFF(SECOND, StartDT, EndDT)) AS Duration
FROM cteDurationsPerTimesplot
GROUP BY Slot, Mode
ORDER BY Slot, Mode;
With the variable #Interval you are able to define the size of the timeslots.
The CTE cteDurations creates a subresult with the durations of all necessary entries by using the TSQL function LEAD (available in MSSQL >= 2012). This will be a lot faster than an OUTER APPLY.
The CTE cteTimeslots generates a list of timeslots with start time and end time.
The CTE cteDurationsPerTimesplot is a subresult with a JOIN between cteDurations and cteTimeslots. This this the magic JOIN statement from Cato!
And finally the SELECT statement will do the grouping and sum calculation per Slot and Mode.
Once again: Thanks a lot to everybody! Especially to Cato! You saved my weekend!
Regards
Oliver

SQL how to count census points occurring between date records

I’m using MS-SQL-2008 R2 trying to write a script that calculates the Number of Hospital Beds occupied on any given day, at 2 census points: midnight, and 09:00.
I’m working from a data set of patient Ward Stays. Basically, each row in the table is a record of an individual patient's stay on a single ward, and records the date/time the patient is admitted onto the ward, and the date/time the patient leaves the ward.
A sample of this table is below:
Ward_Stay_Primary_Key | Ward_Start_Date_Time | Ward_End_Date_Time
1 | 2017-09-03 15:04:00.000 | 2017-09-27 16:55:00.000
2 | 2017-09-04 18:08:00.000 | 2017-09-06 18:00:00.000
3 | 2017-09-04 13:00:00.000 | 2017-09-04 22:00:00.000
4 | 2017-09-04 20:54:00.000 | 2017-09-08 14:30:00.000
5 | 2017-09-04 20:52:00.000 | 2017-09-13 11:50:00.000
6 | 2017-09-05 13:32:00.000 | 2017-09-11 14:49:00.000
7 | 2017-09-05 13:17:00.000 | 2017-09-12 21:00:00.000
8 | 2017-09-05 23:11:00.000 | 2017-09-06 17:38:00.000
9 | 2017-09-05 11:35:00.000 | 2017-09-14 16:12:00.000
10 | 2017-09-05 14:05:00.000 | 2017-09-11 16:30:00.000
The key thing to note here is that a patient’s Ward Stay can span any length of time, from a few hours to many days.
The following code enables me to calculate the number of beds at both census points for any given day, by specifying the date in the case statement:
SELECT
'05/09/2017' [Date]
,SUM(case when Ward_Start_Date_Time <= '05/09/2017 00:00:00.000' AND (Ward_End_Date_Time >= '05/09/2017 00:00:00.000' OR Ward_End_Date_Time IS NULL)then 1 else 0 end)[No. Beds Occupied at 00:00]
,SUM(case when Ward_Start_Date_Time <= '05/09/2017 09:00:00.000' AND (Ward_End_Date_Time >= '05/09/2017 09:00:00.000' OR Ward_End_Date_Time IS NULL)then 1 else 0 end)[No. Beds Occupied at 09:00]
FROM
WardStaysTable
And, based on the sample 10 records above, generates this output:
Date | No. Beds Occupied at 00:00 | No. Beds Occupied at 09:00
05/09/2017 | 4 | 4
To perform this for any number of days is obviously onerous, so what I’m looking to create is a query where I can specify a start/end date parameter (e.g. 1st-5th Sept), and for the query to then evaluate the Ward_Start_Date_Time and Ward_End_Date_Time variables for each record, and – grouping by the dates defined in the date parameter – count each time the 00:00:00.000 and 09:00:00.000 census points fall between these 2 variables, to give an output something along these lines (based on the above 10 records):
Date | No. Beds Occupied at 00:00 | No. Beds Occupied at 09:00
01/09/2017 | 0 | 0
02/09/2017 | 0 | 0
03/09/2017 | 0 | 0
04/09/2017 | 1 | 1
05/09/2017 | 4 | 4
I’ve approached this (perhaps naively) thinking that if I use a cte to create a table of dates (defined by the input parameters), along with associated midnight and 9am census date/time points, then I could use these variables to group and evaluate the dataset.
So, this code generates the grouping dates and census date/time points:
DECLARE
#StartDate DATE = '01/09/2017'
,#EndDate DATE = '05/09/2017'
,#0900 INT = 540
SELECT
DATEADD(DAY, nbr - 1, #StartDate) [Date]
,CONVERT(DATETIME,(DATEADD(DAY, nbr - 1, #StartDate))) [MidnightDate]
,DATEADD(mi, #0900,(CONVERT(DATETIME,(DATEADD(DAY, nbr - 1, #StartDate))))) [0900Date]
FROM
(
SELECT
ROW_NUMBER() OVER ( ORDER BY c.object_id ) AS nbr
FROM sys.columns c
) nbrs
WHERE nbr - 1 <= DATEDIFF(DAY, #StartDate, #EndDate)
The stumbling block I’ve hit is how to join the cte to the WardStays dataset, because there’s no appropriate key… I’ve tried a few iterations of using a subquery to make this work, but either I’m taking the wrong approach or I’m getting my syntax in a mess.
In simple terms, the logic I’m trying to create to get the output is something like:
SELECT
[Date]
,SUM (case when WST.Ward_Start_Date_Time <= [MidnightDate] AND (WST.Ward_End_Date_Time >= [MidnightDate] OR WST.Ward_End_Date_Time IS NULL then 1 else 0 end) [No. Beds Occupied at 00:00]
,SUM (case when WST.Ward_Start_Date_Time <= [0900Date] AND (WST.Ward_End_Date_Time >= [0900Date] OR WST.Ward_End_Date_Time IS NULL then 1 else 0 end) [No. Beds Occupied at 09:00]
FROM WardStaysTable WST
GROUP BY [Date]
Is the above somehow possible, or am I barking up the wrong tree and need to take a different approach altogether? Appreciate any advice.
I would expect something like this:
WITH dates as (
SELECT CAST(#StartDate as DATETIME) as dte
UNION ALL
SELECT DATEADD(DAY, 1, dte)
FROM dates
WHERE dte < #EndDate
)
SELECT dates.dte [Date],
SUM(CASE WHEN Ward_Start_Date_Time <= dte AND
Ward_END_Date_Time >= dte
THEN 1 ELSE 0
END) as num_beds_0000,
SUM(CASE WHEN Ward_Start_Date_Time <= dte + CAST('09:00' as DATETIME) AND
Ward_END_Date_Time >= dte + CAST('09:00' as DATETIME)
THEN 1 ELSE 0
END) as num_beds_0900
FROM dates LEFT JOIN
WardStaysTable wt
ON wt.Ward_Start_Date_Time <= DATEADD(day, 1, dates.dte) AND
wt.Ward_END_Date_Time >= dates.dte
GROUP BY dates.dte
ORDER BY dates.dte;
The cte is just creating the list of dates.
What a cool exercise. Here is what I came up with:
CREATE TABLE #tmp (ID int, StartDte datetime, EndDte datetime)
INSERT INTO #tmp values(1,'2017-09-03 15:04:00.000','2017-09-27 06:55:00.000')
INSERT INTO #tmp values(2,'2017-09-04 08:08:00.000','2017-09-06 18:00:00.000')
INSERT INTO #tmp values(3,'2017-09-04 13:00:00.000','2017-09-04 22:00:00.000')
INSERT INTO #tmp values(4,'2017-09-04 20:54:00.000','2017-09-08 14:30:00.000')
INSERT INTO #tmp values(5,'2017-09-04 20:52:00.000','2017-09-13 11:50:00.000')
INSERT INTO #tmp values(6,'2017-09-05 13:32:00.000','2017-09-11 14:49:00.000')
INSERT INTO #tmp values(7,'2017-09-05 13:17:00.000','2017-09-12 21:00:00.000')
INSERT INTO #tmp values(8,'2017-09-05 23:11:00.000','2017-09-06 07:38:00.000')
INSERT INTO #tmp values(9,'2017-09-05 11:35:00.000','2017-09-14 16:12:00.000')
INSERT INTO #tmp values(10,'2017-09-05 14:05:00.000','2017-09-11 16:30:00.000')
DECLARE
#StartDate DATE = '09/01/2017'
,#EndDate DATE = '10/01/2017'
, #nHours INT = 9
;WITH d(OrderDate) AS
(
SELECT DATEADD(DAY, n-1, #StartDate)
FROM (SELECT TOP (DATEDIFF(DAY, #StartDate, #EndDate) + 1)
ROW_NUMBER() OVER (ORDER BY [object_id]) FROM sys.all_objects) AS x(n)
)
, CTE AS(
select OrderDate, t2.*
from #tmp t2
cross apply(select orderdate from d ) d
where StartDte >= #StartDate and EndDte <= #EndDate)
select OrderDate,
SUM(CASE WHEN OrderDate >= StartDte and OrderDate <= EndDte THEN 1 ELSE 0 END) [No. Beds Occupied at 00:00],
SUM(CASE WHEN StartDTE <= DateAdd(hour,#nHours,CAST(OrderDate as datetime)) and DateAdd(hour,#nHours,CAST(OrderDate as datetime)) <= EndDte THEN 1 ELSE 0 END) [No. Beds Occupied at 09:00]
from CTE
GROUP BY OrderDate
This should allow you to check for any hour of the day using the #nHours parameter if you so choose. If you only want to see records that actually fall within your date range then you can filter the cross apply on start and end dates.

SQL SUM hours between two dates & GROUP BY Column Name?

I need to display the total amount of hours elapsed for an action within a month and the previous month before it like this:
___________________________________________
| Rank | Action | Month | Prev Month |
|------|----------|------------|------------|
| 1 | Action1 | 580.2 | 200.7 |
| 2 | Action8 | 412.5 | 550.2 |
| 3 | Action10 | 405.0 | 18.1 |
---------------------------------------------
I have a SQL table in the format of:
_____________________________________________________
| Action | StartTime | EndTime |
|---------|---------------------|---------------------|
| Action1 | 2015-02-03 06:01:53 | 2015-02-03 06:12:05 |
| Action1 | 2015-02-03 06:22:16 | 2015-02-03 06:25:33 |
| Action2 | 2015-02-03 06:36:07 | 2015-02-03 06:36:49 |
| Action1 | 2015-02-03 06:36:46 | 2015-02-03 06:48:10 |
| ..etc | 20..-..-.. ...etc | 20..-..-.. ...etc |
-------------------------------------------------------
What would the query look like?
EDIT:
A ツ's answer got me headed in the right direction however I solved the problem using a JOIN. See below for my solution.
i changed the values a bit since only one day is rather boring
INSERT INTO yourtable
([Action], [StartTime], [EndTime])
VALUES
('Action1', '2015-02-18 06:01:53', '2015-02-18 06:12:05'),
('Action1', '2015-02-18 06:22:16', '2015-02-18 06:25:33'),
('Action2', '2015-04-03 06:36:07', '2015-04-03 06:36:49'),
('Action1', '2015-03-19 06:36:46', '2015-03-19 06:48:10'),
('Action2', '2015-04-13 06:36:46', '2015-04-13 06:48:10'),
('Action2', '2015-04-14 06:36:46', '2015-04-14 06:48:10')
;
now define the date borders:
declare #dateEntry datetime = '2015-04-03';
declare #date1 date
, #date2 date
, #date3 date;
set #date1 = #dateEntry; -- 2015-04-03
set #date2 = dateadd(month,-1,#date1); -- 2015-03-03
set #date3 = dateadd(month,-1,#date2); -- 2015-02-03
the selected date will include all action which starts before 2015-04-03 00:00 and starts after 2015-02-03 00:00
select date1 = #date1
, date2 = #date2
, date3 = #date3
, [Action]
, thisMonth =
sum(
case when Starttime between #date2 and #date1
then datediff(second, starttime, endtime)/360.0
end)
, lastMonth =
sum(
case when Starttime between #date3 and #date2
then datediff(second, starttime, endtime)/360.0
end)
from yourtable
where starttime between #date3 and #date1
group by [Action]
http://sqlfiddle.com/#!6/35784/5
I'm just revisiting this question after researching a bit more about SQL Server.
A temporary table can be created from a query and then used inside another query - a nested query if you like.
This way the results can JOIN together like through any other normal table without nasty CASE statements. This is also useful to display other datas required of the first query like COUNT(DISTINCT ColumnName)
JOIN two SELECT statement results
SELECT TOP 10
t1.Action, t1.[Time],
COALESCE(t2.[Time],0) AS [Previous Period 'Time'],
COALESCE( ( ((t1.[Time]*1.0) - (t2.[Time]*1.0)) / (t2.[Time]*1.0) ), -1 ) AS [Change]
FROM
(
SELECT
Action,
SUM(DATEDIFF(SECOND, StartTime, EndTime)) AS [Time],
FROM Actions
WHERE StartTime BETWEEN #start AND #end
GROUP BY Action
) t1
LEFT JOIN
(
SELECT
Action,
SUM(DATEDIFF(SECOND, StartTime, EndTime)) AS [Time]
FROM Actions
WHERE StartTime BETWEEN #prev AND #start
GROUP BY Action
) t2
ON
t1.Action = t2.Action
ORDER BY t1.[Time] DESC
Hopefully this information is helpful for someone.
Probably you should check use grouping on preprocessed data, like this:
select Action, SUM(Hours)
from (select Action, DATEDIFF('hh',StartTime, EndTime) as Hours
FROM Actions)
group by Action
My assumption is that you're start - end time span is so short that you don't have to worry about dates spanning 2 months, so then you'll probably need something like this:
select
dense_rank() over (order by Month desc) as Rank,
action,
Month,
PrevMonth
from
(
select
action,
sum(case when StartTime >= #curMonth then hours else 0 end) as Month,
sum(case when StartTime >= #prevMonth and StartTime < #curMonth then hours else 0 end) as PrevMonth
from
(
select
action,
StartTime,
datediff(second, StartTime, EndTime) / 3600.0 as hours
from
yourtable
) T1
group by
action
) T2
This calculates the duration is seconds and then divides it with 3600 to get hours. The rank is based just on current month. This expects that you have 2 variables #curMonth and #prevMonth that have the dates for the limit, and that there is no data for future.
SQL Fiddle for testing: http://sqlfiddle.com/#!6/d64b7d/1

Combine consecutive date ranges

Using SQL Server 2008 R2,
I'm trying to combine date ranges into the maximum date range given that one end date is next to the following start date.
The data is about different employments. Some employees may have ended their employment and have rejoined at a later time. Those should count as two different employments (example ID 5). Some people have different types of employment, running after each other (enddate and startdate neck-to-neck), in this case it should be considered as one employment in total (example ID 30).
An employment period that has not ended has an enddate that is null.
Some examples is probably enlightening:
declare #t as table (employmentid int, startdate datetime, enddate datetime)
insert into #t values
(5, '2007-12-03', '2011-08-26'),
(5, '2013-05-02', null),
(30, '2006-10-02', '2011-01-16'),
(30, '2011-01-17', '2012-08-12'),
(30, '2012-08-13', null),
(66, '2007-09-24', null)
-- expected outcome
EmploymentId StartDate EndDate
5 2007-12-03 2011-08-26
5 2013-05-02 NULL
30 2006-10-02 NULL
66 2007-09-24 NULL
I've been trying different "islands-and-gaps" techniques but haven't been able to crack this one.
The strange bit you see with my use of the date '31211231' is just a very large date to handle your "no-end-date" scenario. I have assumed you won't really have many date ranges per employee, so I've used a simple Recursive Common Table Expression to combine the ranges.
To make it run faster, the starting anchor query keeps only those dates that will not link up to a prior range (per employee). The rest is just tree-walking the date ranges and growing the range. The final GROUP BY keeps only the largest date range built up per starting ANCHOR (employmentid, startdate) combination.
SQL Fiddle
MS SQL Server 2008 Schema Setup:
create table Tbl (
employmentid int,
startdate datetime,
enddate datetime);
insert Tbl values
(5, '2007-12-03', '2011-08-26'),
(5, '2013-05-02', null),
(30, '2006-10-02', '2011-01-16'),
(30, '2011-01-17', '2012-08-12'),
(30, '2012-08-13', null),
(66, '2007-09-24', null);
/*
-- expected outcome
EmploymentId StartDate EndDate
5 2007-12-03 2011-08-26
5 2013-05-02 NULL
30 2006-10-02 NULL
66 2007-09-24 NULL
*/
Query 1:
;with cte as (
select a.employmentid, a.startdate, a.enddate
from Tbl a
left join Tbl b on a.employmentid=b.employmentid and a.startdate-1=b.enddate
where b.employmentid is null
union all
select a.employmentid, a.startdate, b.enddate
from cte a
join Tbl b on a.employmentid=b.employmentid and b.startdate-1=a.enddate
)
select employmentid,
startdate,
nullif(max(isnull(enddate,'32121231')),'32121231') enddate
from cte
group by employmentid, startdate
order by employmentid
Results:
| EMPLOYMENTID | STARTDATE | ENDDATE |
-----------------------------------------------------------------------------------
| 5 | December, 03 2007 00:00:00+0000 | August, 26 2011 00:00:00+0000 |
| 5 | May, 02 2013 00:00:00+0000 | (null) |
| 30 | October, 02 2006 00:00:00+0000 | (null) |
| 66 | September, 24 2007 00:00:00+0000 | (null) |
SET NOCOUNT ON
DECLARE #T TABLE(ID INT,FromDate DATETIME, ToDate DATETIME)
INSERT INTO #T(ID,FromDate,ToDate)
SELECT 1,'20090801','20090803' UNION ALL
SELECT 2,'20090802','20090809' UNION ALL
SELECT 3,'20090805','20090806' UNION ALL
SELECT 4,'20090812','20090813' UNION ALL
SELECT 5,'20090811','20090812' UNION ALL
SELECT 6,'20090802','20090802'
SELECT ROW_NUMBER() OVER(ORDER BY s1.FromDate) AS ID,
s1.FromDate,
MIN(t1.ToDate) AS ToDate
FROM #T s1
INNER JOIN #T t1 ON s1.FromDate <= t1.ToDate
AND NOT EXISTS(SELECT * FROM #T t2
WHERE t1.ToDate >= t2.FromDate
AND t1.ToDate < t2.ToDate)
WHERE NOT EXISTS(SELECT * FROM #T s2
WHERE s1.FromDate > s2.FromDate
AND s1.FromDate <= s2.ToDate)
GROUP BY s1.FromDate
ORDER BY s1.FromDate
An alternative solution that uses window functions rather than recursive CTEs
SELECT
employmentid,
MIN(startdate) as startdate,
NULLIF(MAX(COALESCE(enddate,'9999-01-01')), '9999-01-01') as enddate
FROM (
SELECT
employmentid,
startdate,
enddate,
DATEADD(
DAY,
-COALESCE(
SUM(DATEDIFF(DAY, startdate, enddate)+1) OVER (PARTITION BY employmentid ORDER BY startdate ROWS BETWEEN UNBOUNDED PRECEDING AND 1 PRECEDING),
0
),
startdate
) as grp
FROM #t
) withGroup
GROUP BY employmentid, grp
ORDER BY employmentid, startdate
This works by calculating a grp value that will be the same for all consecutive rows. This is achieved by:
Determine totals days the span occupies (+1 as the dates are inclusive)
SELECT *, DATEDIFF(DAY, startdate, enddate)+1 as daysSpanned FROM #t
Cumulative sum the days spanned for each employment, ordered by startdate. This gives us the total days spanned by all the previous employment spans
We coalesce with 0 to ensure we dont have NULLs in our cumulative sum of days spanned
We do not include current row in our cumulative sum, this is because we will use the value against startdate rather than enddate (we cant use it against enddate because of the NULLs)
SELECT *, COALESCE(
SUM(daysSpanned) OVER (
PARTITION BY employmentid
ORDER BY startdate
ROWS BETWEEN UNBOUNDED PRECEDING AND 1 PRECEDING
)
,0
) as cumulativeDaysSpanned
FROM (
SELECT *, DATEDIFF(DAY, startdate, enddate)+1 as daysSpanned FROM #t
) inner1
Subtract the cumulative days from the startdate to get our grp. This is the crux of the solution.
If the start date increases at the same rate as the days spanned then the days are consecutive, and subtracting the two will give us the same value.
If the startdate increases faster than the days spanned then there is a gap and we will get a new grp value greater than the previous one.
Although grp is a date, the date itself is meaningless we are using just as a grouping value
SELECT *, DATEADD(DAY, -cumulativeDaysSpanned, startdate) as grp
FROM (
SELECT *, COALESCE(
SUM(daysSpanned) OVER (
PARTITION BY employmentid
ORDER BY startdate
ROWS BETWEEN UNBOUNDED PRECEDING AND 1 PRECEDING
)
,0
) as cumulativeDaysSpanned
FROM (
SELECT *, DATEDIFF(DAY, startdate, enddate)+1 as daysSpanned FROM #t
) inner1
) inner2
With the results
+--------------+-------------------------+-------------------------+-------------+-----------------------+-------------------------+
| employmentid | startdate | enddate | daysSpanned | cumulativeDaysSpanned | grp |
+--------------+-------------------------+-------------------------+-------------+-----------------------+-------------------------+
| 5 | 2007-12-03 00:00:00.000 | 2011-08-26 00:00:00.000 | 1363 | 0 | 2007-12-03 00:00:00.000 |
+--------------+-------------------------+-------------------------+-------------+-----------------------+-------------------------+
| 5 | 2013-05-02 00:00:00.000 | NULL | NULL | 1363 | 2009-08-08 00:00:00.000 |
+--------------+-------------------------+-------------------------+-------------+-----------------------+-------------------------+
| 30 | 2006-10-02 00:00:00.000 | 2011-01-16 00:00:00.000 | 1568 | 0 | 2006-10-02 00:00:00.000 |
+--------------+-------------------------+-------------------------+-------------+-----------------------+-------------------------+
| 30 | 2011-01-17 00:00:00.000 | 2012-08-12 00:00:00.000 | 574 | 1568 | 2006-10-02 00:00:00.000 |
+--------------+-------------------------+-------------------------+-------------+-----------------------+-------------------------+
| 30 | 2012-08-13 00:00:00.000 | NULL | NULL | 2142 | 2006-10-02 00:00:00.000 |
+--------------+-------------------------+-------------------------+-------------+-----------------------+-------------------------+
| 66 | 2007-09-24 00:00:00.000 | NULL | NULL | 0 | 2007-09-24 00:00:00.000 |
+--------------+-------------------------+-------------------------+-------------+-----------------------+-------------------------+
Finally we can GROUP BY grp to get the get rid of the consecutive days.
Use MIN and MAX to get the new startdate and endate
To handle the NULL enddate we give them a large value to get picked up by MAX then convert them back to NULL again
SELECT
employmentid,
MIN(startdate) as startdate,
NULLIF(MAX(COALESCE(enddate,'9999-01-01')), '9999-01-01') as enddate
FROM (
SELECT *, DATEADD(DAY, -cumulativeDaysSpanned, startdate) as grp
FROM (
SELECT *, COALESCE(
SUM(daysSpanned) OVER (
PARTITION BY employmentid
ORDER BY startdate
ROWS BETWEEN UNBOUNDED PRECEDING AND 1 PRECEDING
)
,0
) as cumulativeDaysSpanned
FROM (
SELECT *, DATEDIFF(DAY, startdate, enddate)+1 as daysSpanned FROM #t
) inner1
) inner2
) inner3
GROUP BY employmentid, grp
ORDER BY employmentid, startdate
To get the desired result
+--------------+-------------------------+-------------------------+
| employmentid | startdate | enddate |
+--------------+-------------------------+-------------------------+
| 5 | 2007-12-03 00:00:00.000 | 2011-08-26 00:00:00.000 |
+--------------+-------------------------+-------------------------+
| 5 | 2013-05-02 00:00:00.000 | NULL |
+--------------+-------------------------+-------------------------+
| 30 | 2006-10-02 00:00:00.000 | NULL |
+--------------+-------------------------+-------------------------+
| 66 | 2007-09-24 00:00:00.000 | NULL |
+--------------+-------------------------+-------------------------+
We can combine the inner queries to get the query at the start of this answer. Which is shorter, but less explainable
Limitations of all this required that
there are no overlaps of startdate and enddate for an employment. This could produce collisions in our grp.
startdate is not NULL. However this could be overcome by replacing NULL start dates with small date values
Future developers can decipher the window black magic you performed
A modified script for combining all overlapping periods. For example
01.01.2001-01.01.2010
05.05.2005-05.05.2015
will give one period:
01.01.2001-05.05.2015
tbl.enddate must be completed
;WITH cte
AS(
SELECT
a.employmentid
,a.startdate
,a.enddate
from tbl a
left join tbl c on a.employmentid=c.employmentid
and a.startdate > c.startdate
and a.startdate <= dateadd(day, 1, c.enddate)
WHERE c.employmentid IS NULL
UNION all
SELECT
a.employmentid
,a.startdate
,a.enddate
from cte a
inner join tbl c on a.startdate=c.startdate
and (c.startdate = dateadd(day, 1, a.enddate) or (c.enddate > a.enddate and c.startdate <= a.enddate))
)
select distinct employmentid,
startdate,
nullif(max(enddate),'31.12.2099') enddate
from cte
group by employmentid, startdate

Query to return all the days of a month

This problem is related to this, which has no solution in sight: here
I have a table that shows me all sessions of an area.
This session has a start date.
I need to get all the days of month of the start date of the session by specific area (in this case)
I have this query:
SELECT idArea, idSession, startDate FROM SessionsPerArea WHERE idArea = 1
idArea | idSession | startDate |
1 | 1 | 01-01-2013 |
1 | 2 | 04-01-2013 |
1 | 3 | 07-02-2013 |
And i want something like this:
date | Session |
01-01-2013 | 1 |
02-01-2013 | NULL |
03-01-2013 | NULL |
04-01-2013 | 1 |
........ | |
29-01-2013 | NULL |
30-01-2013 | NULL |
In this case, the table returns me all the days of January.
The second column is the number of sessions that occur on that day, because there may be several sessions on the same day.
Anyone can help me?
Please try:
DECLARE #SessionsPerArea TABLE (idArea INT, idSession INT, startDate DATEtime)
INSERT #SessionsPerArea VALUES (1,1,'2013-01-01')
INSERT #SessionsPerArea VALUES (1,2,'2013-01-04')
INSERT #SessionsPerArea VALUES (1,3,'2013-07-02')
DECLARE #RepMonth as datetime
SET #RepMonth = '01/01/2013';
WITH DayList (DayDate) AS
(
SELECT #RepMonth
UNION ALL
SELECT DATEADD(d, 1, DayDate)
FROM DayList
WHERE (DayDate < DATEADD(d, -1, DATEADD(m, 1, #RepMonth)))
)
SELECT *
FROM DayList t1 left join #SessionsPerArea t2 on t1.DayDate=startDate and t2.idArea = 1
This will work:
DECLARE #SessionsPerArea TABLE (idArea INT, idSession INT, startDate DATE)
INSERT #SessionsPerArea VALUES
(1,1,'2013-01-01'),
(1,2,'2013-01-04'),
(1,3,'2013-07-02')
;WITH t1 AS
(
SELECT startDate
, DATEADD(MONTH, DATEDIFF(MONTH, '1900-01-01', startDate), '1900-01-01') firstInMonth
, DATEADD(DAY, -1, DATEADD(MONTH, DATEDIFF(MONTH, '1900-01-01', startDate) + 1, '1900-01-01')) lastInMonth
, COUNT(*) cnt
FROM #SessionsPerArea
WHERE idArea = 1
GROUP BY
startDate
)
, calendar AS
(
SELECT DISTINCT DATEADD(DAY, c.number, t1.firstInMonth) d
FROM t1
JOIN master..spt_values c ON
type = 'P'
AND DATEADD(DAY, c.number, t1.firstInMonth) BETWEEN t1.firstInMonth AND t1.lastInMonth
)
SELECT d date
, cnt Session
FROM calendar c
LEFT JOIN t1 ON t1.startDate = c.d
It uses simple join on master..spt_values table to generate rows.
Just an example of calendar table. To return data for a month adjust the number of days between < 32, for a year to 365+1. You can calculate the number of days in a month or between start/end dates with query. I'm not sure how to do this in SQL Server. I'm using hardcoded values to display all dates in Jan-2013. You can adjust start and end dates for diff. month or to get start/end dates with queries...:
WITH data(r, start_date) AS
(
SELECT 1 r, date '2012-12-31' start_date FROM any_table --dual in Oracle
UNION ALL
SELECT r+1, date '2013-01-01'+r-1 FROM data WHERE r < 32 -- number of days between start and end date+1
)
SELECT start_date FROM data WHERE r > 1
/
START_DATE
----------
1/1/2013
1/2/2013
1/3/2013
...
...
1/31/2013