Summarising a table containing timestamps into 10-minute periods

I have a SQL-Server table called UserConnections that is structured like this:
ID  User   From                  To
1   Bob    31-jan-2023 09:00:00  31-jan-2023 10:00:00
2   Bob    31-jan-2023 12:00:00  31-jan-2023 15:00:00
3   Sally  31-jan-2023 14:00:00  31-jan-2023 16:00:00
and I want to create a summary table for the previous day specifying the number of users connected during each 10-minute period. So it would look something like this:
Period Start          User Count
31-jan-2023 00:00:00  0
31-jan-2023 00:10:00  0
...                   ...
31-jan-2023 09:00:00  1
31-jan-2023 09:10:00  1
...                   ...
31-jan-2023 12:00:00  1
31-jan-2023 12:10:00  1
...                   ...
31-jan-2023 14:00:00  2
31-jan-2023 14:10:00  2
...                   ...
31-jan-2023 15:00:00  1
31-jan-2023 15:10:00  1
...                   ...
31-jan-2023 16:00:00  0
31-jan-2023 16:10:00  0
So I need to get the start of each 10-minute period, and then count the number of connections where the [from] <= PeriodStart and [To] >= PeriodEnd
Given the start and end I can probably do the counting but I have no idea how to get the 10-minute periods (I am not experienced with complex SQL!).
I've looked at a few Date/Time functions but really don't know where to start.
I've also looked at this: MSSQL Start and End Time Grouped by 10 Minute, for Elapsed Time
which looks similar but I'm having difficulty seeing how to adjust it for my data.

The first trick here, which is really the basis for the entire solution, is to generate a list of all the 10-minute intervals for the day. To accomplish this I used a tally table, which I keep as a view on my system like this. I got my version from Jeff Moden, who has a great article on the topic.
create View [dbo].[cteTally] as
WITH
E1(N) AS (select 1 from (values (1),(1),(1),(1),(1),(1),(1),(1),(1),(1))dt(n)),
E2(N) AS (SELECT 1 FROM E1 a, E1 b), --10E+2 or 100 rows
E4(N) AS (SELECT 1 FROM E2 a, E2 b), --10E+4 or 10,000 rows max
cteTally(N) AS
(
SELECT ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) FROM E4
)
select N from cteTally
Next I need to create a table so I have something to work with on my machine.
declare @Something table
(
ID int
, [User] varchar(10)
, [From] datetime
, [To] datetime
);
insert @Something values
(1, 'Bob', '31-jan-2023 09:00:00', '31-jan-2023 10:00:00')
, (2, 'Bob', '31-jan-2023 12:00:00', '31-jan-2023 15:00:00')
, (3, 'Sally', '31-jan-2023 14:00:00', '31-jan-2023 16:00:00');
Now that I have the foundation set up, the query is fairly straightforward.
with DateVals as --used a cte here so I don't have to write out the date math over and over. ;)
(
select PeriodStart = dateadd(minute, (t.N - 1) * 10, convert(datetime, convert(date, getdate())))
from cteTally t
where t.N <= 144 --24 hours a day, 6 times an hour
)
select dv.PeriodStart
, UserCount = count(s.[From])
from DateVals dv
left join @Something s on s.[From] <= dv.PeriodStart and s.[To] > dv.PeriodStart
group by dv.PeriodStart
order by dv.PeriodStart
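The query above builds its periods from getdate(), so it reports on the current day, whereas the sample rows are dated 31-jan-2023 and the question asks about the previous day. A minimal, self-contained sketch of the same tally-plus-left-join idea with the day held in a variable might look like the following. The names @DayStart, @Conn, Digits and Periods are placeholders introduced here and are not part of the original answer; the inline 12 x 12 cross join only stands in for the cteTally view.

```sql
-- Assumption: a fixed day is used so that the three sample rows fall inside it.
declare @DayStart datetime = '20230131';

-- Hypothetical stand-in for the UserConnections table from the question.
declare @Conn table (ID int, [User] varchar(10), [From] datetime, [To] datetime);
insert @Conn values
  (1, 'Bob',   '20230131 09:00:00', '20230131 10:00:00')
, (2, 'Bob',   '20230131 12:00:00', '20230131 15:00:00')
, (3, 'Sally', '20230131 14:00:00', '20230131 16:00:00');

with Digits(d) as
(
    select d from (values (0),(1),(2),(3),(4),(5),(6),(7),(8),(9),(10),(11)) v(d)
),
Periods as
(
    -- 12 x 12 = 144 period numbers (0..143), one per 10-minute slot of the day
    select PeriodStart = dateadd(minute, (a.d * 12 + b.d) * 10, @DayStart)
    from Digits a
    cross join Digits b
)
select p.PeriodStart
     , UserCount = count(c.ID)          -- counts only matched rows, so empty periods give 0
from Periods p
left join @Conn c
       on c.[From] <= p.PeriodStart     -- connection started at or before the period start
      and c.[To]   >  p.PeriodStart     -- and had not yet ended at that moment
group by p.PeriodStart
order by p.PeriodStart;
```

To report on yesterday rather than a fixed date, @DayStart could instead be set to dateadd(day, -1, convert(datetime, convert(date, getdate()))).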

Summing field in other rows conditionally

I have a table in a form like the one below:
Pilot  Leg  Duration  Takeoff
John   1    60        9:00:00
John   2    60        9:00:00
John   3    30        9:00:00
Paul   1    60        12:00:00
Paul   2    30        12:00:00
Paul   3    30        12:00:00
Paul   4    60        12:00:00
What I am trying to figure out is a query to get the following:
Pilot  Leg  Duration  Takeoff   LegStart
John   1    60        9:00:00   9:00:00
John   2    60        9:00:00   10:00:00
John   3    30        9:00:00   10:30:00
Paul   1    60        12:00:00  12:00:00
Paul   2    30        12:00:00  13:00:00
Paul   3    30        12:00:00  13:30:00
Paul   4    60        12:00:00  14:00:00
So the 'LegStart' time is the 'TakeOff' time, plus the duration of prior legs for that pilot.
Now, to do this in SQL, I need to somehow add up the durations of the prior legs for the same pilot. But for the life of me, I cannot figure out how to do this: the pilots can have a variable number of legs, so joining doesn't get you anywhere.
You can use a cumulative sum. The trick is to include it in the date arithmetic:
select t.*,
sum(duration) over (partition by pilot order by leg) as running_duration,
datetime_add(takeoff,
interval (sum(duration) over (partition by pilot order by leg) - duration) minute
) as leg_start
from t;
Note: This assumes that takeoff is a datetime.
Try analytic SUM sum(duration) over (partition by pilot order by leg):
with mytable as (
select 'John' as pilot, 1 as leg, 60 as duration, time '9:00:00' as takeoff union all
select 'John', 2, 60, '9:00:00' union all
select 'John', 3, 30, '9:00:00' union all
select 'Paul', 1, 60, '12:00:00' union all
select 'Paul', 2, 30, '12:00:00' union all
select 'Paul', 3, 30, '12:00:00' union all
select 'Paul', 4, 60, '12:00:00'
)
select
*,
time_add(takeoff, interval ifnull(sum(duration) over (partition by pilot order by leg ROWS BETWEEN UNBOUNDED PRECEDING AND 1 PRECEDING), 0) minute) as legstart
from mytable
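Both answers above rely on functions such as datetime_add and time_add, which SQL Server does not have. For readers on SQL Server 2012 or later (an assumption, since the question does not name its database), a sketch of the same running-total idea using dateadd might look like this. The CTE name Legs and its column names are placeholders introduced here.

```sql
with Legs (Pilot, Leg, Duration, Takeoff) as
(
    select Pilot, Leg, Duration, cast(Takeoff as time(0))
    from (values
        ('John', 1, 60, '09:00:00'),
        ('John', 2, 60, '09:00:00'),
        ('John', 3, 30, '09:00:00'),
        ('Paul', 1, 60, '12:00:00'),
        ('Paul', 2, 30, '12:00:00'),
        ('Paul', 3, 30, '12:00:00'),
        ('Paul', 4, 60, '12:00:00')
    ) v (Pilot, Leg, Duration, Takeoff)
)
select Pilot, Leg, Duration, Takeoff
     , LegStart = dateadd(minute,
                          -- minutes flown in all earlier legs of the same pilot;
                          -- the frame is empty for leg 1, so the sum is NULL there
                          coalesce(sum(Duration) over (partition by Pilot
                                                       order by Leg
                                                       rows between unbounded preceding
                                                                and 1 preceding), 0),
                          Takeoff)
from Legs
order by Pilot, Leg;
```

With the durations as listed, John's legs come out at 09:00, 10:00 and 11:00, and Paul's at 12:00, 13:00, 13:30 and 14:00.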

How to Create Times Based on an Interval and Start Time?

I am performing some scheduling optimization in my database.
For the job schedule, many records appear like this
Time NextRunTime
Every 3 hours 2019-06-03 10:00:00
Every 3 hours 2019-05-28 20:00:00
Every 4 hours 2017-07-31 18:00:00
Every 1 hours 2019-06-03 14:00:00
Every 4 hours 2017-06-08 16:00:00
What is an efficient means to split the "every" records into separate records within the 24 hour day?
For example, for the first record (every 3 hours from 10:00), I need it to insert the following into a table.
Time
13:00:00
16:00:00
19:00:00
22:00:00
01:00:00
04:00:00
07:00:00
10:00:00
I need to repeat this for every record in the first table with "every".
Can anybody help?
If you want to do this correctly you will need a tally table. First let's look at the logic required to solve this.
DECLARE @starttime DATETIME = '2019-06-03 10:00:00', @hours INT = 3;
SELECT t.N, Tm = CAST(DATEADD(HOUR,t.N*@hours,@starttime) AS TIME)
FROM
(
SELECT TOP (24/@hours) ROW_NUMBER() OVER (ORDER BY (SELECT 1))
FROM (VALUES (1),(1),(1),(1),(1),(1),(1),(1),(1),(1)) AS a(x),
(VALUES (1),(1),(1),(1),(1),(1),(1),(1),(1),(1)) AS b(x)
) AS t(N);
Returns:
N Tm
------- ----------------
1 13:00:00.0000000
2 16:00:00.0000000
3 19:00:00.0000000
4 22:00:00.0000000
5 01:00:00.0000000
6 04:00:00.0000000
7 07:00:00.0000000
8 10:00:00.0000000
Now for some sample data and a record identifier (named "someId") so we can calculate this for all rows in your table.
-- Sample Data
DECLARE @yourTable TABLE (someId INT IDENTITY PRIMARY KEY, freq INT, NextRunTime DATETIME);
INSERT @yourTable(freq, NextRunTime) VALUES (3, '2019-06-03 10:00:00'),
(3, '2019-05-28 20:00:00'),(4, '2017-07-31 18:00:00'),
(1, '2019-06-03 14:00:00'),(4, '2017-06-08 16:00:00');
-- Solution
SELECT yt.someId, f.Tm
FROM @yourTable AS yt
CROSS APPLY
(
SELECT t.N, CAST(DATEADD(HOUR,t.N*yt.freq,yt.NextRunTime) AS TIME)
FROM
(
SELECT TOP (24/yt.freq) ROW_NUMBER() OVER (ORDER BY (SELECT 1))
FROM (VALUES (1),(1),(1),(1),(1),(1),(1),(1),(1),(1)) AS a(x),
(VALUES (1),(1),(1),(1),(1),(1),(1),(1),(1),(1)) AS b(x)
) AS t(N)
) AS f(N,Tm);
Returns:
someId Tm
----------- ----------------
1 13:00:00.0000000
1 16:00:00.0000000
1 19:00:00.0000000
1 22:00:00.0000000
1 01:00:00.0000000
1 04:00:00.0000000
1 07:00:00.0000000
1 10:00:00.0000000
2 23:00:00.0000000
2 02:00:00.0000000
2 05:00:00.0000000
2 08:00:00.0000000
2 11:00:00.0000000
2 14:00:00.0000000
2 17:00:00.0000000
2 20:00:00.0000000
3 22:00:00.0000000
3 02:00:00.0000000
3 06:00:00.0000000
3 10:00:00.0000000
3 14:00:00.0000000
3 18:00:00.0000000
4 15:00:00.0000000
4 16:00:00.0000000
4 17:00:00.0000000
4 18:00:00.0000000
4 19:00:00.0000000
4 20:00:00.0000000
4 21:00:00.0000000
4 22:00:00.0000000
4 23:00:00.0000000
4 00:00:00.0000000
4 01:00:00.0000000
4 02:00:00.0000000
4 03:00:00.0000000
4 04:00:00.0000000
4 05:00:00.0000000
4 06:00:00.0000000
4 07:00:00.0000000
4 08:00:00.0000000
4 09:00:00.0000000
4 10:00:00.0000000
4 11:00:00.0000000
4 12:00:00.0000000
4 13:00:00.0000000
4 14:00:00.0000000
5 20:00:00.0000000
5 00:00:00.0000000
5 04:00:00.0000000
5 08:00:00.0000000
5 12:00:00.0000000
5 16:00:00.0000000
I have always thought that a recursive cte is a nice way to solve your query:
-- the sample data
declare @data as table (freq varchar(100), tmst datetime, each_ int)
insert into @data
select s.freq,s.tmst,
convert(int,replace(replace(freq,'Every ',''),' hours','')) as each_
from (
select 'Every 4 hours' as freq, convert(datetime,'2019-06-02 10:00:00') as tmst union all
select 'Every 3 hours' as freq, convert(datetime,'2019-06-02 11:00:00') as tmst union all
select 'Every 2 hours' as freq, convert(datetime,'2019-06-02 10:00:00') as tmst
) s
-- the query
;with cte as (
select freq, tmst, each_, null t1 from @data
union all
select freq, tmst, each_, isnull(t1,datepart(hour,tmst)) + each_
from cte
where isnull(t1,datepart(hour,tmst)) + each_ <= 23
)
select freq,
isnull(convert(datetime, convert(varchar(8),tmst,112) + ' ' + (convert(varchar(100),t1) + ':00:00' ), 120),tmst)
from cte
order by 1, 2
With this second version you get all the times from hour 0 to hour 23 (the previous example only ran from the starting point up to hour 23):
-- the query
;with findfirst as (
select freq, tmst, datepart(hour,tmst) as fhour, datepart(hour,tmst) as init, each_ from @data
union all
select freq, tmst, fhour, init - each_, each_ from findfirst where init - each_ >= 0
),
cte as (
select min(init) as init, freq, tmst, each_, fhour from findfirst group by freq, tmst, each_, fhour
union all
select init + each_, freq, tmst, each_, fhour from cte where init + each_ <= 23
)
select freq,tmst,convert(time, right('0' + convert(varchar(2),init), 2) + ':00:00')
from cte order by freq,init,each_
Remember to continue using the @data table.
First drop any temp tables that have the names of the temp tables you will create:
IF OBJECT_ID(N'tempdb..#Interval', N'U') IS NOT NULL DROP TABLE #Interval
GO
IF OBJECT_ID(N'tempdb..#Interval2', N'U') IS NOT NULL DROP TABLE #Interval2
GO
IF OBJECT_ID(N'tempdb..#Runstart', N'U') IS NOT NULL DROP TABLE #Runstart
GO
Then create, and insert data into, your temp tables:
CREATE TABLE #Interval
(
_Time NVARCHAR(13),
NextRunTime DATETIME
)
GO
INSERT INTO #Interval VALUES ('Every 3 hours','2019-06-03 10:00:00')
INSERT INTO #Interval VALUES ('Every 3 hours','2019-05-28 20:00:00')
INSERT INTO #Interval VALUES ('Every 4 hours','2017-07-31 18:00:00')
INSERT INTO #Interval VALUES ('Every 1 hours','2019-06-03 14:00:00')
INSERT INTO #Interval VALUES ('Every 4 hours','2017-06-08 16:00:00')
GO
CREATE TABLE #Interval2
(
RunID INT IDENTITY(10001,1) NOT NULL PRIMARY KEY,
_Time INT NOT NULL,
NextRunTime DATETIME NOT NULL
)
GO
The code below extracts the interval for each run start. Note: if the number of hours is ever 10 or more, some extra code is needed here to make the number of digits selected depend on the length of the string that contains them; let me know if you need that code too.
INSERT INTO #Interval2 (_TIME,NextRunTime) SELECT SUBSTRING(_Time,7,1),NextRunTime FROM #Interval WHERE LEFT(_Time,5) = 'Every'
GO
CREATE TABLE #Runstart
(
StartID INT IDENTITY(10001,1) NOT NULL PRIMARY KEY,
RunID INT NOT NULL,
[Start_DTTM] DATETIME
)
GO
Next populate the #RUNSTART table using this loop:
DECLARE @RunID INT = 10001
DECLARE @RunTime INT = (SELECT _TIME FROM #Interval2 WHERE RunID = @RunID)
DECLARE @NextRun DATETIME = (SELECT NextRunTime FROM #Interval2 WHERE RunID = @RunID)
WHILE @RunID <= (SELECT MAX(RunID) FROM #Interval2)
BEGIN
WHILE @NextRun < (SELECT DATEADD(DD,1,NextRunTime) FROM #Interval2 WHERE RunID = @RunID)
BEGIN
INSERT INTO #Runstart (RunID,[Start_DTTM]) SELECT @RunID,DATEADD(HH,@RunTime,@NextRun)
SET @NextRun = (SELECT DATEADD(HH,@RunTime,@NextRun))
END
SET @RunID = @RunID+1
SET @RunTime = (SELECT _TIME FROM #Interval2 WHERE RunID = @RunID)
SET @NextRun = (SELECT NextRunTime FROM #Interval2 WHERE RunID = @RunID)
END
GO
SELECT StartID, RunID,CONVERT(VARCHAR,START_DTTM,108) AS Start_time FROM #RUNSTART

Calculate Every n record SQL

I have the following table:
oDateTime oValue
------------------------------------
2017-09-30 23:00:00 8
2017-09-30 23:15:00 7
2017-09-30 23:30:00 7
2017-09-30 23:45:00 7
2017-10-01 00:00:00 6
2017-10-01 00:15:00 5
2017-10-01 00:30:00 8
2017-10-01 00:45:00 7
2017-10-01 01:00:00 6
2017-10-01 01:15:00 9
2017-10-01 01:30:00 5
2017-10-01 01:45:00 6
2017-10-01 02:00:00 7
The table will have one record every 15 minutes. I want to SUM or Average those records once per hour, each time over the latest 5 records.
So, the result should be:
oDateTime Sum_Value Avg_Value
---------------------------------------------------
2017-10-01 00:00:00 35 7
2017-10-01 01:00:00 32 6.4
2017-10-01 02:00:00 33 6.6
The SUM for 2017-10-01 00:00:00 is taken from the 5 records up to and including it, and so on.
Does anyone know how to achieve this?
Thank you.
Here is one method in SQL Server 2008:
select t.oDateTime, tt.sum_value, tt.avg_value
from (select oDateTime
from t
where datepart(minute, oDateTime) = 0
) t outer apply
(select sum(oValue) as sum_value, avg(oValue) as avg_Value
from (select top 5 t2.*
from t t2
where t2.oDateTime <= t.oDateTime
order by t2.oDateTime desc
) tt
) tt;
In more recent versions of SQL Server, you can use window functions for this purpose.
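As a rough illustration of that window-function route, the sketch below assumes SQL Server 2012 or later and uses a hypothetical table variable @Readings in place of the real table. The RowsInFrame column is an addition made here so that hours with fewer than 5 readings behind them (such as 23:00 in the sample) are left out.

```sql
-- Hypothetical stand-in for the question's table.
declare @Readings table (oDateTime datetime, oValue int);
insert @Readings values
  ('20170930 23:00:00', 8), ('20170930 23:15:00', 7), ('20170930 23:30:00', 7)
, ('20170930 23:45:00', 7), ('20171001 00:00:00', 6), ('20171001 00:15:00', 5)
, ('20171001 00:30:00', 8), ('20171001 00:45:00', 7), ('20171001 01:00:00', 6)
, ('20171001 01:15:00', 9), ('20171001 01:30:00', 5), ('20171001 01:45:00', 6)
, ('20171001 02:00:00', 7);

with Rolling as
(
    select oDateTime
         -- frame = this row plus the 4 rows before it, in time order
         , Sum_Value   = sum(oValue)       over (order by oDateTime rows between 4 preceding and current row)
         -- multiply by 1.0 so the average is not truncated to an integer
         , Avg_Value   = avg(oValue * 1.0) over (order by oDateTime rows between 4 preceding and current row)
         , RowsInFrame = count(*)          over (order by oDateTime rows between 4 preceding and current row)
    from @Readings
)
select oDateTime, Sum_Value, Avg_Value
from Rolling
where datepart(minute, oDateTime) = 0    -- keep only the rows on the hour
  and RowsInFrame = 5                    -- skip hours without a full 5-row window
order by oDateTime;
```

On the sample data this yields sums of 35, 32 and 33 for 00:00, 01:00 and 02:00, matching the expected result in the question. The window functions sit in a CTE because they cannot be referenced directly in a WHERE clause.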
Just join the table to itself and group by the master timestamp.
The query below is easily adjusted to change how many minutes back you want to look. It also handles a change in frequency, i.e. it doesn't assume that 5 rows are wanted, so data arriving at 5-minute intervals is handled too.
select cast('2017-09-30 23:00:00' as datetime) t,8 o
into #a
union all
select '2017-09-30 23:15:00',7 union all
select '2017-09-30 23:30:00',7 union all
select '2017-09-30 23:45:00',7 union all
select '2017-10-01 00:00:00',6 union all
select '2017-10-01 00:15:00',5 union all
select '2017-10-01 00:30:00',8 union all
select '2017-10-01 00:45:00',7 union all
select '2017-10-01 01:00:00',6 union all
select '2017-10-01 01:15:00',9 union all
select '2017-10-01 01:30:00',5 union all
select '2017-10-01 01:45:00',6 union all
select '2017-10-01 02:00:00',7
select x.t,sum(x2.o),avg(cast(x2.o as float))
from #a x, #a x2
where x2.t between dateadd(mi,-60,x.t) and x.t
group by x.t

Number of rows per hour starting at a specific time

To get data for some reporting, I need to know how many rows have been inserted per hour in a table, starting at a specific hour on a specific day. I already found part of the solution in another question, but I didn't manage to adapt it to my case. This is the code I've written so far:
SELECT DATEADD(HOUR, DATEDIFF(HOUR, 0, t.mydatetime), 0) AS HOUR_CONCERNED,
COUNT(*) AS NB_ROWS
FROM mytable t
WHERE CONVERT(DATETIME, FLOOR(CONVERT(FLOAT, t.mydatetime))) = '2016-06-06'
GROUP BY DATEADD(HOUR, DATEDIFF(HOUR, 0, t.mydatetime), 0)
ORDER BY HOUR_CONCERNED;
It gives me the following results:
HOUR_CONCERNED NB_ROWS
------------------- --------
2016-06-06 10:00:00 2157
2016-06-06 11:00:00 60740
2016-06-06 12:00:00 66189
2016-06-06 13:00:00 77096
2016-06-06 14:00:00 90039
The problem is that I can't find a way to start my results at a specific time such as 9.30am and to get the number of rows per hour starting from this time. In other words, I'm looking for the number of rows between 9.30am and 10.30am, between 10.30am and 11.30am, etc. The results I'm looking for should look like this:
HOUR_CONCERNED NB_ROWS
------------------- --------
2016-06-06 09:30:00 3550
2016-06-06 10:30:00 33002
2016-06-06 11:30:00 42058
2016-06-06 12:30:00 55008
2016-06-06 13:30:00 72000
Is there an easy way to adapt my query and get those results ?
Given a specific starting time, you can get hour blocks by finding the number of minutes since your start time, and dividing by 60, then adding this number of hours back to the start time e.g.
DECLARE @StartTime DATETIME2(0) = '20160606 09:30';
WITH DummyData (mydatetime) AS
( SELECT TOP 200 DATEADD(MINUTE, ROW_NUMBER() OVER(ORDER BY [object_id]) - 1, @StartTime)
FROM sys.all_objects
)
SELECT HoursSinceStart = FLOOR(DATEDIFF(MINUTE, @StartTime, mydatetime) / 60.0),
Display = DATEADD(HOUR, FLOOR(DATEDIFF(MINUTE, @StartTime, mydatetime) / 60.0), @StartTime),
Records = COUNT(*)
FROM DummyData
WHERE myDateTime >= @StartTime
GROUP BY FLOOR(DATEDIFF(MINUTE, @StartTime, mydatetime) / 60.0)
ORDER BY Display;
Which gives:
HoursSinceStart Display Records
0 2016-06-06 09:30:00 60
1 2016-06-06 10:30:00 60
2 2016-06-06 11:30:00 60
3 2016-06-06 12:30:00 20
I have left the HoursSinceStart column in, to hopefully assist in deconstructing the logic contained in the Display column
The problem with this method is that it will only give you results for blocks that exist, if you also need those that don't you will need to generate all time blocks using a numbers table, then left join to your data:
You can quickly generate a series of numbers using this:
DECLARE @StartTime DATETIME2(0) = '20160606 09:30';
-- GENERATE 10 ROWS
WITH N1 AS (SELECT N FROM (VALUES (1),(1),(1),(1),(1),(1),(1),(1),(1),(1)) n (N)),
-- CROSS JOIN THE 10 ROWS TO GET 100 ROWS
N2 (N) AS (SELECT 1 FROM N1 AS N1 CROSS JOIN N1 AS N2),
--CROSS JOIN THE 100 ROWS TO GET 10,000 ROWS
N3 (N) AS (SELECT 1 FROM N2 AS N1 CROSS JOIN N2 AS N2),
--APPLY ROW_NUMBER TO GET A SET OF NUMBERS FROM 0 - 9,999
Numbers (N) AS (SELECT ROW_NUMBER() OVER(ORDER BY N) - 1 FROM N3)
SELECT *,
TimeStart = DATEADD(HOUR, N, @StartTime),
TimeEnd = DATEADD(HOUR, N + 1, @StartTime)
FROM Numbers;
Which gives something like:
N TimeStart TimeEnd
--------------------------------------------------
0 2016-06-06 09:30:00 2016-06-06 10:30:00
1 2016-06-06 10:30:00 2016-06-06 11:30:00
2 2016-06-06 11:30:00 2016-06-06 12:30:00
3 2016-06-06 12:30:00 2016-06-06 13:30:00
4 2016-06-06 13:30:00 2016-06-06 14:30:00
5 2016-06-06 14:30:00 2016-06-06 15:30:00
6 2016-06-06 15:30:00 2016-06-06 16:30:00
7 2016-06-06 16:30:00 2016-06-06 17:30:00
Then you can left join your data to this (you will probably need an end time too);
DECLARE @StartTime DATETIME2(0) = '20160606 09:30',
@EndTime DATETIME2(0) = '20160606 15:30';
WITH DummyData (mydatetime) AS
( SELECT TOP 200 DATEADD(MINUTE, ROW_NUMBER() OVER(ORDER BY [object_id]) - 1, @StartTime)
FROM sys.all_objects
),
-- GENERATE NUMBERS
N1 AS (SELECT N FROM (VALUES (1),(1),(1),(1),(1),(1),(1),(1),(1),(1)) n (N)),
N2 (N) AS (SELECT 1 FROM N1 AS N1 CROSS JOIN N1 AS N2),
N3 (N) AS (SELECT 1 FROM N2 AS N1 CROSS JOIN N2 AS N2),
Numbers (N) AS (SELECT ROW_NUMBER() OVER(ORDER BY N) - 1 FROM N3),
TimePeriods AS
( SELECT TimeStart = DATEADD(HOUR, N, @StartTime),
TimeEnd = DATEADD(HOUR, N + 1, @StartTime)
FROM Numbers
WHERE DATEADD(HOUR, N, @StartTime) < @EndTime
)
SELECT tp.TimeStart, tp.TimeEnd, Records = COUNT(dd.myDateTime)
FROM TimePeriods AS tp
LEFT JOIN DummyData AS dd
ON dd.mydatetime >= tp.TimeStart
AND dd.mydatetime < tp.TimeEnd
GROUP BY tp.TimeStart, tp.TimeEnd
ORDER BY tp.TimeStart;
Which will return 0 where there are no records:
TimeStart TimeEnd Records
---------------------------------------------------------
2016-06-06 09:30:00 2016-06-06 10:30:00 60
2016-06-06 10:30:00 2016-06-06 11:30:00 60
2016-06-06 11:30:00 2016-06-06 12:30:00 60
2016-06-06 12:30:00 2016-06-06 13:30:00 20
2016-06-06 13:30:00 2016-06-06 14:30:00 0
2016-06-06 14:30:00 2016-06-06 15:30:00 0
Try this:
SELECT DATEADD( MINUTE, 30, DATEADD(HOUR, DATEDIFF(HOUR, 0, DATEADD( MINUTE, -30, t.mydatetime)), 0)) AS HOUR_CONCERNED,
COUNT(*) AS NB_ROWS
FROM mytable t
WHERE CONVERT(DATETIME, FLOOR(CONVERT(FLOAT, t.mydatetime))) = '2016-06-06'
GROUP BY DATEADD(HOUR, DATEDIFF(HOUR, 0, DATEADD( MINUTE, -30, t.mydatetime)), 0)
ORDER BY HOUR_CONCERNED;
I added a 30 min offset into the GROUP BY function to treat 9:30 as 9:00, 10:30 as 10:00 and so on. In the select part I reverse this offset to give a proper interval.
The WHERE condition in your query needs to change though for performance reasons. Instead of truncating timestamps to a nearest day, you should filter by a range:
WHERE t.mydatetime >= CONVERT( DATETIME, '2016-06-06' ) AND t.mydatetime < CONVERT( DATETIME, '2016-06-07' )
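To see what the 30-minute offset does to individual timestamps, a small stand-alone check such as the one below can help. The four sample values and the alias v are made up for illustration; the Bucket expression is the one from the SELECT list above.

```sql
select v.ts
     , Bucket = dateadd(minute, 30,                              -- 3. shift the result forward again
                  dateadd(hour,
                          datediff(hour, 0, dateadd(minute, -30, v.ts)),  -- 1. shift back, 2. truncate to the hour
                          0))
from (values
        (cast('20160606 09:29:00' as datetime)),   -- falls in the 08:30 block
        (cast('20160606 09:30:00' as datetime)),   -- first moment of the 09:30 block
        (cast('20160606 10:29:00' as datetime)),   -- still in the 09:30 block
        (cast('20160606 10:30:00' as datetime))    -- first moment of the 10:30 block
     ) v (ts)
order by v.ts;
```

Each timestamp is moved back 30 minutes, truncated to the hour, and moved forward again, so block boundaries land on the half hour instead of the full hour.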
You need to have the time in the where clause, and set a greater than against the time you want to measure from? Also, you can use DATEPART to get the hours.
SELECT NB_ROWS = COUNT(*)
,HOUR_CONCERNED = DATEPART(HOUR, InsertedDate)
FROM table
WHERE InsertedDate = '20160531'
AND InsertedDate> time
GROUP BY DATEPART(HOUR, InsertedDate)

SQL and Temporal data

Given a table of appointments, like this:
User Start End
UserA 2016-01-15 12:00:00 2016-01-15 14:00:00
UserA 2016-01-15 15:00:00 2016-01-15 17:00:00
UserB 2016-01-15 13:00:00 2016-01-15 15:00:00
UserB 2016-01-15 13:32:00 2016-01-15 15:00:00
UserB 2016-01-15 15:30:00 2016-01-15 15:30:00
UserB 2016-01-15 15:45:00 2016-01-15 16:00:00
UserB 2016-01-15 17:30:00 2016-01-15 18:00:00
I want to create a list of distinct time intervals in which the same amount of people have an appointment:
Start End Count
2016-01-15 12:00:00 2016-01-15 13:00:00 1
2016-01-15 13:00:00 2016-01-15 14:00:00 2
2016-01-15 14:00:00 2016-01-15 15:45:00 1
2016-01-15 15:45:00 2016-01-15 16:00:00 2
2016-01-15 16:00:00 2016-01-15 17:00:00 1
2016-01-15 17:00:00 2016-01-15 17:30:00 0
2016-01-15 17:30:00 2016-01-15 18:00:00 1
How would I do this in SQL, preferably SQL Server 2008?
EDIT: To clarify: Manually, the result is obtained by making one row for each user, marking the blocked time, and then summing up the count of rows that have a mark:
Time   12  13  14  15  16  17
UserA  xxxxxxxx    xxxxxxxx
UserB      xxxxxxxx   x      xx
Count  1   2   1      21   0 1
That result set would start at the minimum time available, end at the maximum time available, and while the ASCII art has only a 15min resolution, I would require at least resolution to the minute. I guess you can leave the rows with "0" out of the result, if this is easier for you.
There's got to be an easier way than this, but at least you can probably follow each step individually:
declare @t table ([User] varchar(19) not null,Start datetime2 not null,[End] datetime2 not null)
insert into @t([User], Start, [End]) values
('UserA','2016-01-15T12:00:00','2016-01-15T14:00:00'),
('UserA','2016-01-15T15:00:00','2016-01-15T17:00:00'),
('UserB','2016-01-15T13:00:00','2016-01-15T15:00:00'),
('UserB','2016-01-15T13:32:00','2016-01-15T15:00:00'),
('UserB','2016-01-15T15:30:00','2016-01-15T15:30:00'),
('UserB','2016-01-15T15:45:00','2016-01-15T16:00:00'),
('UserB','2016-01-15T17:30:00','2016-01-15T18:00:00')
;With Times as (
select Start as Point from @t
union
select [End] from @t
), Ordered as (
select Point,ROW_NUMBER() OVER (ORDER BY Point) as rn
from Times
), Periods as (
select
o1.Point as Start,
o2.Point as [End]
from
Ordered o1
inner join
Ordered o2
on
o1.rn = o2.rn - 1
), UserCounts as (
select p.Start,p.[End],COUNT(distinct [User]) as Cnt,ROW_NUMBER() OVER (Order BY p.[Start]) as rn
from
Periods p
left join
@t t
on
p.Start < t.[End] and
t.Start < p.[End]
group by
p.Start,p.[End]
), Consolidated as (
select uc.*
from
UserCounts uc
left join
UserCounts uc_anti
on
uc.rn = uc_anti.rn + 1 and
uc.Cnt = uc_anti.Cnt
where
uc_anti.Cnt is null
union all
select c.Start,uc.[End],c.Cnt,uc.rn
from
Consolidated c
inner join
UserCounts uc
on
c.Cnt = uc.Cnt and
c.[End] = uc.Start
)
select
Start,MAX([End]) as [End],Cnt
from
Consolidated
group by
Start,Cnt
order by Start
The CTEs work as follows. Times: since any given start or end stamp can start or end a period in the final results, we collect them all in one column, so that Ordered can number them and Periods can then reassemble them into the smallest possible periods.
UserCounts then goes back to the original data and finds out how many users were overlapped by each calculated period.
Consolidated is the trickiest CTE to follow, but it basically merges periods that abut each other where the user count is equal.
Results:
Start End Cnt
--------------------------- --------------------------- -----------
2016-01-15 12:00:00.0000000 2016-01-15 13:00:00.0000000 1
2016-01-15 13:00:00.0000000 2016-01-15 14:00:00.0000000 2
2016-01-15 14:00:00.0000000 2016-01-15 15:45:00.0000000 1
2016-01-15 15:45:00.0000000 2016-01-15 16:00:00.0000000 2
2016-01-15 16:00:00.0000000 2016-01-15 17:00:00.0000000 1
2016-01-15 17:00:00.0000000 2016-01-15 17:30:00.0000000 0
2016-01-15 17:30:00.0000000 2016-01-15 18:00:00.0000000 1
(And I even got the zero row I was unsure I'd be able to conjure into existence)
This kind of query is much easier to write if you have a calendar table. But in this example I've built one on the fly using a recursive CTE. The CTE returns the appointment blocks, which we can then join to the appointment data. I couldn't determine the interval pattern in your sample data, so I've shown the results in blocks of one hour. You could modify this section, or define your own within a second table.
Sample Data
/* Table variables make sharing data easier
*/
DECLARE @Sample TABLE
(
[User] VARCHAR(50),
[Start] DATETIME,
[End] DATETIME
)
;
INSERT INTO @Sample
(
[User],
[Start],
[End]
)
VALUES
('UserA', '2016-01-15 12:00:00', '2016-01-15 14:00:00'),
('UserA', '2016-01-15 15:00:00', '2016-01-15 17:00:00'),
('UserB', '2016-01-15 13:00:00', '2016-01-15 15:00:00'),
('UserB', '2016-01-15 13:32:00', '2016-01-15 15:00:00'),
('UserB', '2016-01-15 15:30:00', '2016-01-15 15:30:00'),
('UserB', '2016-01-15 15:45:00', '2016-01-15 16:00:00'),
('UserB', '2016-01-15 17:30:00', '2016-01-15 18:00:00')
;
I've used two variables to limit the returned results to just those appointments that fall within the given start and end point.
/* Set a start and end point for the next query
*/
DECLARE @Start DATETIME = '2016-01-15 12:00:00';
DECLARE @End DATETIME = '2016-01-15 18:00:00';
WITH Calendar AS
(
/* Anchor returns start of first appointment
*/
SELECT
@Start AS [Start],
DATEADD(SECOND, -1, DATEADD(HOUR, 1, @Start)) AS [End]
UNION ALL
/* Recursion, keep adding new records until end of last appointment
*/
SELECT
DATEADD(HOUR, 1, [Start]) AS [Start],
DATEADD(HOUR, 1, [End]) AS [End]
FROM
Calendar
WHERE
[End] <= @End
)
SELECT
c.[Start],
c.[End],
COUNT(DISTINCT s.[User]) AS [Count]
FROM
Calendar AS c
LEFT OUTER JOIN @Sample AS s ON s.[Start] BETWEEN c.[Start] AND c.[End]
OR s.[End] BETWEEN c.[Start] AND c.[End]
GROUP BY
c.[Start],
c.[End]
;
Because an appointment can exceed one hour it may contribute to more than one row. This explains why 7 sample rows leads to a returned total of 9.