SQL calculates total down time minutes - sql

I'm working on a downtime management system that saves support tickets for problems in a database. My table has the following columns:
-ID
-DateOpen
-DateClosed
-Total
I want to obtain the sum of minutes in a day, taking into account that the tickets can be simultaneous, for example:
ID | DateOpen               | DateClosed          | Total
1  | 2019-04-01 08:00:00 AM | 2019-04-01 08:45:00 | 45
2  | 2019-04-01 08:10:00 AM | 2019-04-01 08:20:00 | 10
3  | 2019-04-01 09:06:00 AM | 2019-04-01 09:07:00 | 1
4  | 2019-04-01 09:06:00 AM | 2019-04-01 09:41:00 | 33
Can someone help me with this, please?
If I use SUM, it returns 89, but if you look at the dates you will see that the actual result must be 78, because tickets 2 and 3 were opened while another ticket was still open.
DECLARE @DateOpen date = '2019-04-01'
SELECT AlarmID, DateOpen, DateClosed, TDT FROM AlarmHistory
WHERE CONVERT(date,DateOpen) = @DateOpen

What you need to do is generate a sequence of integers and use that to generate times of the day. Join that sequence of times to your tickets wherever a time falls between the open and close dates, then count the number of distinct times.
Here is an example that will work with MySQL:
SET @row_num = 0;
SELECT COUNT(DISTINCT time_stamp)
-- this simulates your dateopen and dateclosed table
FROM (SELECT '2019-04-01 08:00:00' open_time, '2019-04-01 08:45:00' close_time
UNION SELECT '2019-04-01 08:10:00', '2019-04-01 08:20:00'
UNION SELECT '2019-04-01 09:06:00', '2019-04-01 09:07:00'
UNION SELECT '2019-04-01 09:06:00', '2019-04-01 09:41:00') times_used
JOIN (
-- generate sequence of minutes in day
SELECT TIME(sequence*100) time_stamp
FROM (
-- create sequence 1 - 10000
SELECT (@row_num:=@row_num + 1) AS sequence
FROM {table_with_10k+_records}
LIMIT 10000
) minutes
HAVING time_stamp IS NOT NULL
LIMIT 1440
) times ON (time_stamp >= TIME(open_time) AND time_stamp < TIME(close_time));
Since you are selecting only distinct times that are found in the result, minutes that overlap will not be counted.
NOTE: Depending on your database, there may be a better way to generate a sequence. MySQL does not have a generate-sequence function, so I did it this way to show the basic idea, which can easily be converted to work with whatever database you are using.
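As a cross-check of the idea above (count each covered minute only once), here is a minimal Python sketch. It is not part of the original answer: the function name `distinct_down_minutes` is hypothetical, and the tickets are treated as half-open intervals, matching the `>=` / `<` join condition in the query.

```python
from datetime import datetime, timedelta

# Sample tickets from the question as (open, close) pairs.
tickets = [
    (datetime(2019, 4, 1, 8, 0), datetime(2019, 4, 1, 8, 45)),
    (datetime(2019, 4, 1, 8, 10), datetime(2019, 4, 1, 8, 20)),
    (datetime(2019, 4, 1, 9, 6), datetime(2019, 4, 1, 9, 7)),
    (datetime(2019, 4, 1, 9, 6), datetime(2019, 4, 1, 9, 41)),
]


def distinct_down_minutes(tickets):
    """Count every minute covered by at least one ticket exactly once."""
    covered = set()
    for opened, closed in tickets:
        minute = opened
        while minute < closed:  # half-open: the closing minute is excluded
            covered.add(minute)
            minute += timedelta(minutes=1)
    return len(covered)


total = distinct_down_minutes(tickets)
print(total)  # 80
```

The set plays the role of COUNT(DISTINCT time_stamp): overlapping minutes are added twice but stored once.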

@drakin8564's answer, adapted for SQL Server, which I believe you're using:
;WITH Gen AS
(
SELECT TOP 1440
CONVERT(TIME, DATEADD(minute, ROW_NUMBER() OVER (ORDER BY (SELECT NULL)), '00:00:00')) AS t
FROM sys.all_objects a1
CROSS JOIN sys.all_objects a2
)
SELECT COUNT(DISTINCT t)
FROM incidents inci
JOIN Gen
ON Gen.t >= CONVERT(TIME, inci.DateOpen)
AND Gen.t < CONVERT(TIME, inci.DateClosed)
Your total for the last record is wrong: it says 33, but it should be 35, so the query returns 80, not 78.

By the way, just as MarcinJ told you, 41 - 6 is 35, not 33. So the answer is 80, not 78.
The following solution works even if the date range is longer than a single day (1,440 minutes). If the range is a month, or even a year, it still works.
Live demo: http://sqlfiddle.com/#!18/462ac/5
-- arranged the opening and closing downtime
with a as
(
select
DateOpen d, 1 status
from dt
union all
select
DateClosed, 2
from dt
)
-- don't compute the downtime from previous date
-- if the current date's status is opened
-- yet the previous status is closed
, downtime_minutes AS
(
select
*,
lag(status) over(order by d, status desc) as prev_status,
case when status = 1 and lag(status) over(order by d, status desc) = 2 then
null
else
datediff(minute, lag(d) over(order by d, status desc), d)
end as downtime
from a
)
select sum(downtime) as all_downtime from downtime_minutes;
Output:
| all_downtime |
|--------------|
| 80 |
How it works:
Each row's downtime is measured from the previous event. The downtime is not computed if the current row's status is open and the previous row's status is closed, because that means the current downtime is a new, non-overlapping one. Such rows are denoted by null.
A newly opened downtime therefore starts as null, and its minutes are accumulated on the succeeding rows until it is closed.
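The rule described above can be traced in a few lines of Python. This is only a sketch mirroring the query's logic, not the author's code: `total_downtime` is a hypothetical name, and the second data set (`nested`) is made up to probe one edge case.

```python
from datetime import datetime


def total_downtime(tickets):
    """Mirror of the lag()-based query: sum minutes between consecutive events."""
    events = []
    for opened, closed in tickets:
        events.append((opened, 1))  # status 1 = open
        events.append((closed, 2))  # status 2 = closed
    # Same ordering as "order by d, status desc".
    events.sort(key=lambda e: (e[0], -e[1]))
    total = 0
    prev = None
    for d, status in events:
        # Skip the first row and any open that directly follows a close.
        if prev is not None and not (status == 1 and prev[1] == 2):
            total += int((d - prev[0]).total_seconds() // 60)
        prev = (d, status)
    return total


def at(hour, minute):
    return datetime(2019, 4, 1, hour, minute)


sample = [(at(8, 0), at(8, 45)), (at(8, 10), at(8, 20)),
          (at(9, 6), at(9, 7)), (at(9, 6), at(9, 41))]
print(total_downtime(sample))  # 80

# Hypothetical nested data: two short tickets inside one long ticket.
nested = [(at(8, 0), at(9, 0)), (at(8, 10), at(8, 20)), (at(8, 30), at(8, 40))]
print(total_downtime(nested))  # 50
```

One caveat, shown by the second call: because only the immediately preceding event is inspected, a ticket that opens right after a shorter ticket closes, while a longer ticket is still open, is treated as a fresh start. The hypothetical nested data therefore yields 50 minutes although 08:00 to 09:00 is 60 minutes of downtime.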
We can make the code shorter by reversing the condition:
-- arranged the opening and closing downtime
with a as
(
select
DateOpen d, 1 status
from dt
union all
select
DateClosed, 2
from dt
-- order by d. postgres can do this?
)
-- don't compute the downtime from previous date
-- if the current date's status is opened
-- yet the previous status is closed
, downtime_minutes AS
(
select
*,
lag(status) over(order by d, status desc) as prev_status,
case when not ( status = 1 and lag(status) over(order by d, status desc) = 2 ) then
datediff(minute, lag(d) over(order by d, status desc), d)
end as downtime
from a
)
select sum(downtime) from downtime_minutes;
Not particularly proud of my original solution: http://sqlfiddle.com/#!18/462ac/1
As for the status desc in order by d, status desc: if a DateClosed is the same as another downtime's DateOpen, status desc sorts the DateClosed first.
For this data where 8:00 is present on both DateOpened and DateClosed:
INSERT INTO dt
([ID], [DateOpen], [DateClosed], [Total])
VALUES
(1, '2019-04-01 07:00:00', '2019-04-01 07:50:00', 50),
(2, '2019-04-01 07:45:00', '2019-04-01 08:00:00', 15),
(3, '2019-04-01 08:00:00', '2019-04-01 08:45:00', 45);
;
For equal timestamps (e.g., 8:00), if we do not sort the close before the open, the downtime that started at 7:00 is only computed up to 7:50 instead of up to 8:00, because the 8:00 open's downtime starts as null and the 8:00 close that follows it adds zero minutes. Without status desc, the total downtime comes out as only 95 minutes, which is wrong. It should be 105 minutes.
If we sort the DateClosed before the DateOpen (by using status desc) when they have the same timestamp, e.g., 8:00, the total downtime is 105 minutes, which is correct.
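The effect of the tie-break can be reproduced with a small Python sketch. It is an illustration only: the function `total_downtime` and its `closes_first` flag are hypothetical names that stand in for including or omitting status desc.

```python
from datetime import datetime


def total_downtime(tickets, closes_first=True):
    """Sum minutes between consecutive events, skipping an open that follows a close."""
    events = []
    for opened, closed in tickets:
        events.append((opened, 1))  # 1 = open
        events.append((closed, 2))  # 2 = closed
    if closes_first:
        events.sort(key=lambda e: (e[0], -e[1]))  # order by d, status desc
    else:
        events.sort(key=lambda e: (e[0], e[1]))   # ties put the open first
    total = 0
    prev = None
    for d, status in events:
        if prev is not None and not (status == 1 and prev[1] == 2):
            total += int((d - prev[0]).total_seconds() // 60)
        prev = (d, status)
    return total


def at(hour, minute):
    return datetime(2019, 4, 1, hour, minute)


# The three rows from the INSERT statement above.
rows = [(at(7, 0), at(7, 50)), (at(7, 45), at(8, 0)), (at(8, 0), at(8, 45))]

print(total_downtime(rows, closes_first=True))   # 105
print(total_downtime(rows, closes_first=False))  # 95
```

With the close sorted first, the 7:50 to 8:00 stretch is credited to the 8:00 close before the new open resets the chain. With the open first, those 10 minutes are lost.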

Another approach uses the gaps and islands technique. This answer is based on SQL Time Packing of Islands.
Live test: http://sqlfiddle.com/#!18/462ac/11
with gap_detector as
(
select
DateOpen, DateClosed,
case when
lag(DateClosed) over (order by DateOpen) is null
or lag(DateClosed) over (order by DateOpen) < DateOpen
then
1
else
0
end as gap
from dt
)
, downtime_grouper as
(
select
DateOpen, DateClosed,
sum(gap) over (order by DateOpen) as downtime_group
from gap_detector
)
-- group's open and closed detector. then computes the group's downtime
select
downtime_group,
min(DateOpen) as group_date_open,
max(DateClosed) as group_date_closed,
datediff(minute, min(DateOpen), max(DateClosed)) as group_downtime,
sum(datediff(minute, min(DateOpen), max(DateClosed)))
over(order by downtime_group) as downtime_running_total
from downtime_grouper
group by downtime_group
Output:
How it works
A DateOpen is the start of the series of downtime if it has no previous downtime (indicated by null lag(DateClosed)). A DateOpen is also a start of the series of downtime if it has a gap from the previous downtime's DateClosed.
with gap_detector as
(
select
lag(DateClosed) over (order by DateOpen) as previous_downtime_date_closed,
DateOpen, DateClosed,
case when
lag(DateClosed) over (order by DateOpen) is null
or lag(DateClosed) over (order by DateOpen) < DateOpen
then
1
else
0
end as gap
from dt
)
select *
from gap_detector
order by DateOpen;
Output:
After detecting the gap starters, we do a running total of the gap so we can group downtimes that are contiguous to each other.
with gap_detector as
(
select
DateOpen, DateClosed,
case when
lag(DateClosed) over (order by DateOpen) is null
or lag(DateClosed) over (order by DateOpen) < DateOpen
then
1
else
0
end as gap
from dt
)
select
DateOpen, DateClosed, gap,
sum(gap) over (order by DateOpen) as downtime_group
from gap_detector
order by DateOpen;
As we can see from the output above, we can now easily find each downtime group's earliest DateOpen and latest DateClosed by applying MIN(DateOpen) and MAX(DateClosed) while grouping on downtime_group. In downtime_group 1, the earliest DateOpen is 08:00 and the latest DateClosed is 08:45. In downtime_group 2, the earliest DateOpen is 09:06 and the latest DateClosed is 09:41. From that we can calculate the correct downtime even if there are simultaneous downtimes.
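The grouping steps can also be followed in a short Python sketch. It mirrors the query rather than replacing it: `downtime_groups` is a hypothetical name, and, like lag(DateClosed), it compares each row only with the row directly before it.

```python
from datetime import datetime


def downtime_groups(tickets):
    """Return (group_open, group_closed, minutes) per island of contiguous tickets."""
    rows = sorted(tickets, key=lambda t: t[0])  # order by DateOpen
    groups = []
    prev_closed = None  # plays the role of lag(DateClosed)
    for opened, closed in rows:
        if prev_closed is None or prev_closed < opened:
            groups.append([opened, closed])  # gap = 1: a new group starts here
        else:
            groups[-1][1] = max(groups[-1][1], closed)  # MAX(DateClosed)
        prev_closed = closed
    return [(o, c, int((c - o).total_seconds() // 60)) for o, c in groups]


def at(hour, minute):
    return datetime(2019, 4, 1, hour, minute)


sample = [(at(8, 0), at(8, 45)), (at(8, 10), at(8, 20)),
          (at(9, 6), at(9, 7)), (at(9, 6), at(9, 41))]

groups = downtime_groups(sample)
for group_open, group_closed, minutes in groups:
    print(group_open.time(), group_closed.time(), minutes)
print(sum(minutes for _, _, minutes in groups))  # 80
```

Because the rows are sorted by DateOpen, the first row of each group already holds MIN(DateOpen), so only the closing time needs to be updated.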
We can make the code shorter by eliminating the check for a null previous downtime (which occurs when the row being evaluated is the very first row in the table) by reversing the logic. Instead of detecting the gaps, we detect the islands (contiguous downtimes). A downtime is contiguous if the previous downtime's DateClosed overlaps the current downtime's DateOpen, denoted by 0. If it does not overlap, it is a gap, denoted by 1.
Here's the query:
Live test: http://sqlfiddle.com/#!18/462ac/12
with gap_detector as
(
select
DateOpen, DateClosed,
case when lag(DateClosed) over (order by DateOpen) >= DateOpen
then
0
else
1
end as gap
from dt
)
, downtime_grouper as
(
select
DateOpen, DateClosed,
sum(gap) over (order by DateOpen) as downtime_group
from gap_detector
)
-- group's open and closed detector. then computes the group's downtime
select
downtime_group,
min(DateOpen) as group_date_open,
max(DateClosed) as group_date_closed,
datediff(minute, min(DateOpen), max(DateClosed)) as group_downtime,
sum(datediff(minute, min(DateOpen), max(DateClosed)))
over(order by downtime_group) as downtime_running_total
from downtime_grouper
group by downtime_group
If you are using SQL Server 2012 or higher:
iif(lag(DateClosed) over (order by DateOpen) >= DateOpen, 0, 1) as gap

Related

Get Start and End date from multiple rows of dates, excluding weekends

I'm trying to figure out how to return the start date and end date based on data like that in the table below:
Name  Date From   Date To
----  ----------  ----------
A     2022-01-03  2022-01-03
A     2021-12-29  2021-12-31
A     2021-12-28  2021-12-28
A     2021-12-27  2021-12-27
A     2021-12-23  2021-12-24
A     2021-11-08  2021-11-09
The result I am after would show like this:
Name  Date From   Date To
----  ----------  ----------
A     2021-12-23  2022-01-03
A     2021-11-08  2021-11-09
The dates in first table will sometimes go over weekends with the Date From and Date To, but in cases where the row ends on a Friday and next row starts on following Monday it will need to be classified as the same "block", as presented in the second table. I was hoping to use DATEFIRST setting to cater for the weekends to avoid using a calendar table, as per How do I exclude Weekend days in a SQL Server query?, but if calendar table ends up being the easiest way out I'm happy to look into creating one.
In above example I only have 1 Name, but the table will have multiple names and it will need to be grouped by that.
The only examples of this I am seeing are using only 1 date column for records and I struggled changing their code around to cater for my example. The closest example I found doesn't work for me as it is based on datetime fields and the time differences - find start and stop date for contiguous dates in multiple rows
This is a Gaps & Island problem with the twist that you need to consider weekend continuity.
You can do:
select max(name) as name, min(date_from) as date_from, max(date_to) as date_to
from (
select *, sum(inc) over(order by date_to) as grp
from (
select *,
case when lag(ext_to) over(order by date_to) = date_from
then 0 else 1 end as inc
from (
select *,
case when (datepart(weekday, date_to) = 6)
then dateadd(day, 3, date_to)
else dateadd(day, 1, date_to) end as ext_to
from t
) x
) y
) z
group by grp
Result:
name date_from date_to
---- ---------- ----------
A 2021-11-08 2021-11-09
A 2021-12-23 2022-01-03
See running example at db<>fiddle #1.
Note: Your question doesn't mention it, but you probably want to segment per person. I didn't do it.
EDIT: Adding partition by name
Partitioning by name is quite easy actually. The following query does it:
select name, min(date_from) as date_from, max(date_to) as date_to
from (
select *, sum(inc) over(partition by name order by date_to) as grp
from (
select *,
case when lag(ext_to) over(partition by name order by date_to) = date_from
then 0 else 1 end as inc
from (
select *,
case when (datepart(weekday, date_to) = 6)
then dateadd(day, 3, date_to)
else dateadd(day, 1, date_to) end as ext_to
from t
) x
) y
) z
group by name, grp
order by name, grp
See running query at db<>fiddle #2.
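To check the weekend-bridging logic outside the database, here is a rough Python sketch of the same idea. It is an illustration under assumptions: `merge_blocks` is a hypothetical name, the rows for name B are invented to show the per-name partitioning, and ranges are assumed not to overlap. Note that Python's `weekday()` returns 4 for Friday regardless of locale, whereas `datepart(weekday, ...) = 6` depends on the DATEFIRST setting.

```python
from datetime import date, timedelta


def merge_blocks(rows):
    """Merge adjacent (name, date_from, date_to) rows, bridging Friday to Monday."""
    by_name = {}
    for name, date_from, date_to in rows:
        by_name.setdefault(name, []).append((date_from, date_to))
    blocks = []
    for name in sorted(by_name):  # partition by name
        prev_ext = None  # lag(ext_to) within the partition
        for date_from, date_to in sorted(by_name[name], key=lambda r: r[1]):
            if prev_ext is not None and prev_ext == date_from:
                blocks[-1][2] = max(blocks[-1][2], date_to)  # same block
            else:
                blocks.append([name, date_from, date_to])    # new block
            # A range ending on Friday is extended to the following Monday.
            step = 3 if date_to.weekday() == 4 else 1
            prev_ext = date_to + timedelta(days=step)
    return [tuple(b) for b in blocks]


rows = [
    ("A", date(2022, 1, 3), date(2022, 1, 3)),
    ("A", date(2021, 12, 29), date(2021, 12, 31)),
    ("A", date(2021, 12, 28), date(2021, 12, 28)),
    ("A", date(2021, 12, 27), date(2021, 12, 27)),
    ("A", date(2021, 12, 23), date(2021, 12, 24)),
    ("A", date(2021, 11, 8), date(2021, 11, 9)),
    # Hypothetical second name, not in the question.
    ("B", date(2021, 12, 31), date(2021, 12, 31)),
    ("B", date(2022, 1, 3), date(2022, 1, 4)),
]

result = merge_blocks(rows)
for block in result:
    print(block)
```

For name A this gives the two blocks the question asks for: 2021-11-08 to 2021-11-09 and 2021-12-23 to 2022-01-03.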
with extended as (
select name,
date_from,
case when datepart(weekday, date_to) = 6
then dateadd(day, 2, date_to) else date_to end as date_to
from t
), adjacent as (
select *,
case when dateadd(day, 1,
lag(date_to) over (partition by name order by date_from)) = date_from
then 0 else 1 end as brk
from extended
), blocked as (
select *, sum(brk) over (partition by name order by date_from) as grp
from adjacent
)
select name, min(date_from), max(date_to) from blocked
group by name, grp;
I'm assuming that ranges do not overlap and that all input dates fall on weekdays. While hammering this out on my cellphone I originally made two mistakes. For some reason I got the to and from dates reversed in my head, and then I was thinking that Friday is 5 (as with @@datefirst) rather than 6. (Of course this could otherwise vary with the regional setting anyway.) One advantage of using table expressions is to modularize and bury certain details in lower levels of the logic. In this case it would be very easy to adjust dates should some of these assumptions prove to be wrong.
https://dbfiddle.uk/?rdbms=sqlserver_2019&fiddle=42e0c452d57d474232bcf991d6d3c43c

Creating a status log from rows of datetimes of status changes

I'm pulling down some data from a remote API to a local SQL Server table, which is formatted like so. (imagine it's sorted by StatusDT descending)
DriverID StatusDT Status
-------- -------- ------
b103 2019-03-05 05:42:52.000 D
b103 2019-03-03 23:45:42.000 SB
b103 2019-03-03 21:49:41.000 ON
What would be the best way to eventually get to a point where I can return a query showing the total amount of time spent in each status on each day for each driver?
Also, it's possible that there could be gaps of a whole day or more between status updates, in which case I'd need a row showing a continuation of the previous status from 00:00:00 to 23:59:59 for each skipped day. So, if I'm looping through this table to populate another with the structure below, the example above would need to wind up looking like this... (again, sorted descending by date)
DriverID StartDT EndDT Status
-------- --------------- -------------- ------
b103 2019-03-05 05:42:52 D
b103 2019-03-05 00:00:00 2019-03-05 05:42:51 SB
b103 2019-03-04 00:00:00 2019-03-04 23:59:59 SB
b103 2019-03-03 23:45:42 2019-03-03 23:59:59 SB
b103 2019-03-03 21:49:41 2019-03-03 23:45:41 ON
Does that make sense?
I wound up dumping the API data to a "work" table and running a cursor over it to add rows to another table, with the starting and ending date/time, but I'm curious if there's another way that might be more efficient.
Thanks very much.
I think this query is what you need. I couldn't test it, however, for syntax errors:
with x as (
select
DriverID,
StatusDT as StartDT,
lead(StatusDT) over(partition by DriverID order by StatusDT) as EndDT,
Status
from my_table
)
select -- start & end on the same day
DriverID,
StartDT,
EndDT,
Status
from x
where convert(date, StartDT) = convert(date, EndDT)
or EndDT is null
union all
select -- start & end on different days; first day up to midnight
DriverID,
StartDT,
dateadd(ms, -3, convert(datetime, convert(date, EndDT))) as EndDT, -- dateadd(ms, ...) is not supported on the date type
Status
from x
where convert(date, StartDT) <> convert(date, EndDT)
and EndDT is not null
union all
select -- start & end on different days; next day from midnight
DriverID,
convert(date, EndDT) as StartDT,
EndDT,
Status
from x
where convert(date, StartDT) <> convert(date, EndDT)
and EndDT is not null
order by StartDT desc
Most of your answer is just using lead():
select driverid, status, statusdt,
lead(statusdt) over (partition by driverid order by statusdt) as enddte
from t;
This does not give the breaks by day, but you can add those. I think the easiest way is to add in the dates (using a recursive CTE) and compute the status at that time. I would do the following:
- use a recursive CTE to calculate the dates
- "fill in" the statuses and union to the original table
- use lead() to get the end date
This looks like:
with day_boundaries as (
select driverid, dateadd(day, 1, cast(min(statusdt) as date)) as statusdt, max(statusdt) as finaldt
from t
group by driverid
having datediff(day, min(statusdt), max(statusdt)) > 0
union all
select driverid, dateadd(day, 1, statusdt), finaldt
from day_boundaries
where statusdt < finaldt
),
unioned as (
select driverid, status, statusdt
from t
union all
select db.driverid, s.status, db.statusdt
from day_boundaries db cross apply
(select top (1) status
from t
where t.driverid = db.driverid -- look only at this driver's history
and t.statusdt < db.statusdt
order by t.statusdt desc
) s
)
select driverid, status, statusdt,
lead(statusdt) over (partition by driverid order by statusdt) as enddte
from unioned;
Note that this does not subtract any seconds from the end date. The end date matches the previous start date. Time is continuous. It makes no sense to have gaps for records that should snugly fit together.
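To make the day-splitting concrete, here is a small Python sketch for a single driver. It is only an illustration: `split_by_day` is a hypothetical name, and the segments are half-open (each end equals the next start), in line with the note above about not subtracting seconds.

```python
from datetime import datetime, timedelta


def split_by_day(changes):
    """Turn (timestamp, status) changes into (start, end, status) rows split at midnight."""
    ordered = sorted(changes)
    rows = []
    for i, (start, status) in enumerate(ordered):
        # lead(statusdt): the next change, or None for the current status.
        end = ordered[i + 1][0] if i + 1 < len(ordered) else None
        if end is None:
            rows.append((start, None, status))
            continue
        cursor = start
        while True:
            midnight = datetime.combine(cursor.date(), datetime.min.time()) + timedelta(days=1)
            if midnight >= end:
                rows.append((cursor, end, status))
                break
            rows.append((cursor, midnight, status))  # continue the status on the next day
            cursor = midnight
    return rows


# Status changes for driver b103 from the question.
changes = [
    (datetime(2019, 3, 5, 5, 42, 52), "D"),
    (datetime(2019, 3, 3, 23, 45, 42), "SB"),
    (datetime(2019, 3, 3, 21, 49, 41), "ON"),
]

log = split_by_day(changes)
for row in log:
    print(row)
```

For the sample this produces five rows: one ON row, three SB rows (the rest of 03-03, all of 03-04, and the start of 03-05), and the open-ended D row.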

SQL how to write a query that return missing date ranges?

I am trying to figure out how to write a query that looks at certain records and finds missing date ranges between today and 9999-12-31.
My data looks like below:
ID |start_dt |end_dt |prc_or_disc_1
10412 |2018-07-17 00:00:00.000 |2018-07-20 00:00:00.000 |1050.000000
10413 |2018-07-23 00:00:00.000 |2018-07-26 00:00:00.000 |1040.000000
So for this data I would want my query to return:
2018-07-10 | 2018-07-16
2018-07-21 | 2018-07-22
2018-07-27 | 9999-12-31
I'm not really sure where to start. Is this possible?
You can do that using the lag() function in MS SQL (available starting with SQL Server 2012).
with myData as
(
select *,
lag(end_dt,1) over (order by start_dt) as lagEnd
from myTable),
myMax as
(
select Max(end_dt) as maxDate from myTable
)
select dateadd(d,1,lagEnd) as StartDate, dateadd(d, -1, start_dt) as EndDate
from myData
where lagEnd is not null and dateadd(d,1,lagEnd) < start_dt
union all
select dateAdd(d,1,maxDate) as StartDate, cast('99991231' as Datetime) as EndDate
from myMax
where maxDate < '99991231';
If lag() is not available (e.g., in MS SQL 2008), you can mimic it with row_number() and a self-join.
select
CASE WHEN DATEDIFF(day, end_dt, ISNULL(LEAD(start_dt) over (order by ID), '99991231')) > 1 then end_dt +1 END as F1,
CASE WHEN DATEDIFF(day, end_dt, ISNULL(LEAD(start_dt) over (order by ID), '99991231')) > 1 then ISNULL(LEAD(start_dt) over (order by ID) - 1, '99991231') END as F2
from t
Working SQLFiddle example is -> Here
FOR 2008 VERSION
SELECT
X.end_dt + 1 as F1,
ISNULL(Y.start_dt-1, '99991231') as F2
FROM t X
LEFT JOIN (
SELECT
*
, (SELECT MAX(ID) FROM t WHERE ID < A.ID) as ID2
FROM t A) Y ON X.ID = Y.ID2
WHERE DATEDIFF(day, X.end_dt, ISNULL(Y.start_dt, '99991231')) > 1
Working SQLFiddle example is -> Here
This should work in 2008, it assumes that ranges in your table do not overlap. It will also eliminate rows where the end_date of the current row is a day before the start date of the next row.
with dtRanges as (
select start_dt, end_dt, row_number() over (order by start_dt) as rownum
from table1
)
select t2.end_dt + 1, coalesce(start_dt_next -1,'99991231')
FROM
( select dr1.start_dt, dr1.end_dt,dr2.start_dt as start_dt_next
from dtRanges dr1
left join dtRanges dr2 on dr2.rownum = dr1.rownum + 1
) t2
where
t2.end_dt + 1 <> coalesce(start_dt_next,'99991231')
http://sqlfiddle.com/#!18/65238/1
SELECT
*
FROM
(
SELECT
end_dt+1 AS start_dt,
LEAD(start_dt-1, 1, '9999-12-31')
OVER (ORDER BY start_dt)
AS end_dt
FROM
yourTable
)
gaps
WHERE
gaps.end_dt >= gaps.start_dt
I would, however, strongly urge you to use end dates that are "exclusive". That is, the range is everything up to but excluding the end_dt.
That way, a range of one day becomes '2018-07-09', '2018-07-10'.
It's really clear that my range is one day long, if you subtract one from the other you get a day.
Also, if you ever change to needing hour granularity or minute granularity you don't need to change your data. It just works. Always. Reliably. Intuitively.
If you search the web you'll find plenty of documentation on why inclusive-start and exclusive-end is a very good idea from a software perspective. (Then, in the query above, you can remove the wonky +1 and -1.)
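To illustrate why exclusive end dates simplify the gap search, here is a minimal Python sketch. It rests on assumptions that are not in the original: `missing_ranges` is a hypothetical name, the question's two rows are rewritten with exclusive ends (the stored end plus one day), and "today" is taken to be 2018-07-10 because that is where the expected output starts.

```python
from datetime import date


def missing_ranges(ranges, start, horizon):
    """Return the half-open gaps [gap_start, gap_end) not covered between start and horizon."""
    gaps = []
    cursor = start
    for range_start, range_end in sorted(ranges):
        if cursor < range_start:
            gaps.append((cursor, range_start))  # no +1 / -1 adjustments needed
        cursor = max(cursor, range_end)
    if cursor < horizon:
        gaps.append((cursor, horizon))
    return gaps


# IDs 10412 and 10413 from the question, with exclusive end dates.
ranges = [
    (date(2018, 7, 17), date(2018, 7, 21)),
    (date(2018, 7, 23), date(2018, 7, 27)),
]

gaps = missing_ranges(ranges, start=date(2018, 7, 10), horizon=date(9999, 12, 31))
for gap in gaps:
    print(gap)
```

Read as inclusive dates, the first two gaps are 2018-07-10 to 2018-07-16 and 2018-07-21 to 2018-07-22, and the last one starts on 2018-07-27, which matches the output the question asks for.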
This solves your case, but provide some sample data if there will ever be overlaps, fringe cases, etc.
Take one day after your end date and 1 day before the next line's start date.
DECLARE @t TABLE (ID int, start_dt DATETIME, end_dt DATETIME, prc VARCHAR(100))
INSERT INTO @t (id, start_dt, end_dt, prc)
VALUES
(10410, '2018-07-09 00:00:00.00','2018-07-12 00:00:00.000','1025.000000'),
(10412, '2018-07-17 00:00:00.00','2018-07-20 00:00:00.000','1050.000000'),
(10413, '2018-07-23 00:00:00.00','2018-07-26 00:00:00.000','1040.000000')
SELECT DATEADD(DAY, 1, end_dt)
, DATEADD(DAY, -1, LEAD(start_dt, 1, '9999-12-31') OVER(ORDER BY id) )
FROM @t
You may want to take a look at this:
http://sqlfiddle.com/#!18/3a224/1
You just have to edit the begin range to today and the end range to 9999-12-31.

trying to find the maximum number of occurrences over time T-SQL

I have data recording the StartDateTime and EndDateTime (both DATETIME2) of a process for all of the year 2013.
My task is to find the maximum number of instances of the process that were running at any one time during the year.
I have written some code that checks, for every minute or second, how many processes were running at that moment, but it takes a very long time and it would be impossible to let it run for the whole year.
Here is the code (in this case check every minute for the date 25/10/2013)
CREATE TABLE dbo.#Hit
(
ID INT IDENTITY (1,1) PRIMARY KEY,
Moment DATETIME2,
COUNT INT
)
DECLARE @moment DATETIME2
SET @moment = '2013-10-24 00:00:00'
WHILE @moment < '2013-10-25'
BEGIN
INSERT INTO #Hit ( Moment, COUNT )
SELECT @moment, COUNT(*)
FROM dbo.tblProcessTimeLog
WHERE ProcessFK IN (25)
AND @moment BETWEEN StartDateTime AND EndDateTime
AND DelInd = 0
PRINT @moment
SET @moment = DATEADD(MINute,1,@moment)
END
SELECT * FROM #Hit
ORDER BY COUNT DESC
Can anyone think how I could get a similar result (I just need the maximum number of processes running at any given time), but for the whole year?
Thanks
DECLARE @d DATETIME = '20130101'; -- the first day of the year you care about
;WITH m(m) AS
( -- all the minutes in a day
SELECT TOP (1440) ROW_NUMBER() OVER (ORDER BY number) - 1
FROM master..spt_values
),
d(d) AS
( -- all the days in *that* year (accounts for leap years vs. hard-coding 365)
SELECT TOP (DATEDIFF(DAY, @d, DATEADD(YEAR, 1, @d))) DATEADD(DAY, number, @d)
FROM master..spt_values WHERE type = N'P' ORDER BY number
),
x AS
( -- all the minutes in *that* year
SELECT moment = DATEADD(MINUTE, m.m, d.d) FROM m CROSS JOIN d
)
SELECT TOP (1) WITH TIES -- in case more than one at the top
x.moment, [COUNT] = COUNT(l.ProcessFK)
FROM x
INNER JOIN dbo.tblProcessTimeLog AS l
ON x.moment >= l.StartDateTime
AND x.moment <= l.EndDateTime
WHERE l.ProcessFK = 25 AND l.DelInd = 0
GROUP BY x.moment
ORDER BY [COUNT] DESC;
See this post for why I don't think you should use BETWEEN for range queries, even in cases where it does semantically do what you want.
Create a table T whose rows represent time segments. This table could well be a temporary table (depending on your case). Say:
row 1 - [from=00:00:00, to=00:00:01)
row 2 - [from=00:00:01, to=00:00:02)
row 3 - [from=00:00:02, to=00:00:03)
and so on.
Then just join from your main table (tblProcessTimeLog, I think) to this table based on the datetime values recorded in tblProcessTimeLog.
A year has only about half a million minutes (525,600), so that is not many rows to store in T.
I recently pulled some code from SO to solve the 'islands and gaps' problem, and the algorithm for that should help you solve your problem.
The idea is that you want to find the point in time that has the most started processes, much like figuring out the deepest nesting of parentheses in an expression:
( ( ( ) ( ( ( (deepest here, 6)))))
This sql will produce this result for you (I included a temp table with sample data):
/*
CREATE TABLE #tblProcessTimeLog
(
StartDateTime DATETIME2,
EndDateTime DATETIME2
)
-- delete from #tblProcessTimeLog
INSERT INTO #tblProcessTimeLog (StartDateTime, EndDateTime)
Values ('1/1/2012', '1/6/2012'),
('1/2/2012', '1/6/2012'),
('1/3/2012', '1/6/2012'),
('1/4/2012', '1/6/2012'),
('1/5/2012', '1/7/2012'),
('1/6/2012', '1/8/2012'),
('1/6/2012', '1/10/2012'),
('1/6/2012', '1/11/2012'),
('1/10/2012', '1/12/2012'),
('1/15/2012', '1/16/2012')
;
*/
with cteProcessGroups (EventDate, GroupId) as
(
select EVENT_DATE, (E.START_ORDINAL - E.OVERALL_ORDINAL) GROUP_ID
FROM
(
select EVENT_DATE, EVENT_TYPE,
MAX(START_ORDINAL) OVER (ORDER BY EVENT_DATE, EVENT_TYPE ROWS UNBOUNDED PRECEDING) as START_ORDINAL,
ROW_NUMBER() OVER (ORDER BY EVENT_DATE, EVENT_TYPE) AS OVERALL_ORDINAL
from
(
Select StartDateTime AS EVENT_DATE, 1 as EVENT_TYPE, ROW_NUMBER() OVER (ORDER BY StartDateTime) as START_ORDINAL
from #tblProcessTimeLog
UNION ALL
select EndDateTime, 0 as EVENT_TYPE, NULL
FROM #tblProcessTimeLog
) RAWDATA
) E
)
select Max(EventDate) as EventDate, count(GroupId) as OpenProcesses
from cteProcessGroups
group by (GroupId)
order by COUNT(GroupId) desc
Results:
EventDate OpenProcesses
2012-01-05 00:00:00.0000000 5
2012-01-06 00:00:00.0000000 4
2012-01-15 00:00:00.0000000 2
2012-01-10 00:00:00.0000000 2
2012-01-08 00:00:00.0000000 1
2012-01-07 00:00:00.0000000 1
2012-01-11 00:00:00.0000000 1
2012-01-06 00:00:00.0000000 1
2012-01-06 00:00:00.0000000 1
2012-01-06 00:00:00.0000000 1
2012-01-16 00:00:00.0000000 1
Note that the 'in-between' rows don't give anything meaningful. Basically this output is only tuned to tell you when the most activity was. Looking at the other rows in the output, there wasn't just 1 process running on 1/8 (there were actually 3). The way this code works is that, by grouping concurrent processes together, you can count the number of simultaneous processes. The date returned is when the maximum number of concurrent processes began. It doesn't tell you how long they were running, but you can solve that with an additional query. (Once you know the date when the most were occurring, you can find the specific process IDs by using a BETWEEN condition on the date.)
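The parenthesis-nesting idea can be expressed directly as a sweep over start and end events. The following Python sketch is a different, simpler formulation than the query above, not a translation of it: `peak_concurrency` is a hypothetical name, and intervals are assumed to be half-open, so a process ending at a moment is not counted as overlapping one that starts at the same moment.

```python
from datetime import datetime


def peak_concurrency(intervals):
    """Return (max number of simultaneously open intervals, when that peak began)."""
    events = []
    for start, end in intervals:
        events.append((start, 1))   # opening parenthesis
        events.append((end, -1))    # closing parenthesis
    events.sort()  # at equal timestamps, -1 sorts before +1 (ends first)
    depth = best = 0
    best_at = None
    for moment, delta in events:
        depth += delta
        if depth > best:
            best, best_at = depth, moment
    return best, best_at


def d(month, day):
    return datetime(2012, month, day)


# Sample rows from the commented-out INSERT above.
intervals = [
    (d(1, 1), d(1, 6)), (d(1, 2), d(1, 6)), (d(1, 3), d(1, 6)),
    (d(1, 4), d(1, 6)), (d(1, 5), d(1, 7)), (d(1, 6), d(1, 8)),
    (d(1, 6), d(1, 10)), (d(1, 6), d(1, 11)), (d(1, 10), d(1, 12)),
    (d(1, 15), d(1, 16)),
]

best, best_at = peak_concurrency(intervals)
print(best, best_at)  # 5 2012-01-05 00:00:00
```

This visits each start and end once after sorting, so it avoids probing every minute of the year. If end times should count as still running (as with BETWEEN), the tie-break would need to put starts before ends instead.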
Hope this helps.

Returning only one row for an aggregate function per a time period and user

I'm using SQL Server 2008.
I have table constructed the following way:
Date (datetime)
TimeIn (datetime)
TimeOut (datetime)
UserReference (nvarchar)
LocationID
My desired results are: for every hour between hour 7 (7am) and hour 18 (6pm), I want to know the user who had the highest (TimeIn - TimeOut) for every location (the last condition is optional).
So I've got an aggregated function which calculates the datediff in seconds between TimeOut and TimeIn aliased as Total
I want my results to look a bit like this:
Hour 7 | K1345 | 50 | Place #5
Hour 7 | K3456 | 10 | Place #4
Hour 8 | K3333 | 5 | Place #5
etc.
What I've tried so far:
A CTE using the ROW_NUMBER() function, partitioning by my aggregated column and ordering by it. This only returns one row.
A CTE where I do all my aggregations (including datepart(hour,date)) and use the max aggregation to get the highest total time in my outer query.
I know I have to do it with a CTE somehow, I'm just not exactly sure how to join the cte and my outer query.
Am I on the right track using a ROW_NUMBER() or Rank()?
Queries I've tried:
WITH cte as
(
SELECT * ,
rn = ROW_NUMBER() over (partition by datediff(second, [TimeIn], [TimeOut])order by datediff(second, [TimeIn], [TimeOut]) desc)
FROM TimeTable (nolock)
where DateCreated > '20131023 00:00:00' and DateCreated < '20131023 23:59:00'
)
SELECT datepart(hour,cte.DateCreated) as hour,cte.UserReference,(datediff(second, [TimeIn], [TimeOut])) as [Response Time],LocationID
from cte
where cte.rn = 1
and DATEPART(hh,datecreated) >= 7 and DATEPART(hh,datecreated) <= 18
order by hour asc
This only returns a few rows.
Something else I've tried:
with cte as
(
SELECT Datecreated as Date,
UserReference as [User],
datediff(second, [TimeIn], [TimeOut]) as Time,
LocationID as Location
FROM TimeTable
WHERE datecreated... --daterange
)
SELECT DATEPART(HOUR,date), cte.[User], MAX(Time), Location
FROM cte
WHERE DATEPART(hh,datecreated) >= 7 and DATEPART(hh,datecreated) <= 18
GROUP BY DATEPART(HOUR,date), cte.[User], Location
Row of sample data
Date UserRef TimeIn TimeOut locationid
2013-10-23 06:26:12.783 KF34334 2013-10-23 06:27:07.000 2013-10-23 06:27:08.000 10329
I hope this will help
WITH TotalTime AS (
SELECT
CAST(DateCreated AS DATE) as [date]
,DATEPART(hour,DateCreated) AS [hour]
,SUM(DATEDIFF(second,TimeIn,TimeOut)) AS Total
,UserReference
,locationid
FROM TimeTable
WHERE DATEPART(hour,DateCreated) >= 7 and DATEPART(hour,DateCreated) <= 18
GROUP BY UserReference,locationid,CAST(DateCreated AS DATE),DATEPART(hour,DateCreated)
)
, rn AS (
SELECT *, ROW_NUMBER() OVER (PARTITION BY [date],[hour],locationid ORDER BY Total DESC) AS row_num
FROM TotalTime
)
SELECT *
FROM rn
WHERE row_num = 1
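The two steps of the query above (total per user, then keep the top user per date, hour and location) can be sketched in Python as follows. This is an illustration only: `top_user_per_hour` is a hypothetical name and the rows are invented sample data shaped like the table in the question, apart from the one real sample row at 06:26, which falls outside the 7 to 18 window.

```python
from collections import defaultdict
from datetime import datetime


def top_user_per_hour(rows, first_hour=7, last_hour=18):
    """Map (date, hour, location) to the (user, total_seconds) with the highest total."""
    totals = defaultdict(int)
    for created, time_in, time_out, user, location in rows:
        if first_hour <= created.hour <= last_hour:
            key = (created.date(), created.hour, location, user)
            totals[key] += int((time_out - time_in).total_seconds())
    best = {}
    for (day, hour, location, user), total in totals.items():
        slot = (day, hour, location)
        # Equivalent of keeping row_num = 1 when ordering by Total DESC.
        if slot not in best or total > best[slot][1]:
            best[slot] = (user, total)
    return best


def t(hour, minute, second=0):
    return datetime(2013, 10, 23, hour, minute, second)


rows = [
    (t(6, 26, 12), t(6, 27, 7), t(6, 27, 8), "KF34334", 10329),  # outside 7-18
    (t(7, 5), t(7, 5), t(7, 5, 50), "K1345", 5),    # hypothetical
    (t(7, 20), t(7, 20), t(7, 20, 10), "K3456", 5),  # hypothetical
    (t(7, 40), t(7, 40), t(7, 40, 10), "K3456", 4),  # hypothetical
    (t(8, 10), t(8, 10), t(8, 10, 5), "K3333", 5),   # hypothetical
]

best = top_user_per_hour(rows)
for slot in sorted(best):
    print(slot, best[slot])
```

Ties are resolved by whichever user is seen first, just as ROW_NUMBER() picks an arbitrary row among equal totals.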