I am in way over my head with an SQL problem. I have a query that creates a temporary table, fills it with data from several other tables, performs some calculations and updates, and provides this data to an app. The final step is to calculate how many hours and minutes there are between two datetimes, split into dayHours, dayMins, nightHours and nightMins (the datetimes can be 20+ days apart). The following bullet points illustrate what I want to do:
Let's say night time is from 23:00 to 06:00.
We have DateTime1 = 20-04-2016 13:30.
We have DateTime2 = 21-04-2016 07:15.
NightTime: from 23:00 to 06:00 = 7 hours 0 minutes.
DayTime: from 13:30 to 23:00 (9h30m), and then again from 06:00 to 07:15(1h15m) the following day for a total of 10 hours 45 minutes.
I am providing a create table query, but I only need help with the calculation, so you can ignore my table and data. Note that I have removed almost all formatting to shorten the post, as it got really long.
CREATE TABLE [dbo].[myTestTable](
[JHID] [int] NULL, [ToDateTime] [datetime] NULL,
[startPayDateTime] [datetime] NULL, [opDayHour] [int] NULL,
[opDayMin] [int] NULL, [opNightHour] [int] NULL,
[opNightMin] [int] NULL ) ON [PRIMARY]
GO
Consider inserting this as test data. The relevant columns (for test purposes) are startPayDateTime and ToDateTime.
INSERT INTO [myTestTable]
([JHID],[ToDateTime],[startPayDateTime],[opDayHour],[opDayMin],[opNightHour],[opNightMin])
VALUES (301533,'14-03-2016 01:54','14-03-2016 04:54',1,1,1,1),
(302488,'14-03-2016 01:54','14-03-2016 08:31',0,0,0,0),
(302676,'14-03-2016 01:54','28-03-2016 08:11',1,1,1,1)
GO
So now I have to
UPDATE
SET opDayHour = (CASE WHEN ... THEN *value* ELSE 0 end),
opDayMin = (CASE WHEN ... THEN *value* ELSE 0 end),
opNightHour = (CASE WHEN ... THEN *value* ELSE 0 end),
opNightMin = (CASE WHEN ... THEN *value* ELSE 0 end)
How do I do that? Thank you for your consideration. If my question is not clear enough, leave a comment! :)
The idea is to detect the first day, a number of whole days (if present) and the last day (if not the same as the first). That way we only need a one-day-long tally table of minutes. The drawback is extra computation for those first/last intervals. When a computation involves a number of intermediate variables, CROSS APPLY is a handy tool.
Try this; you may need to adjust the +-1 logic to conform to your rules. This query computes minutes, which can easily be converted to hours + minutes.
with myMinutes as (
select rn
-- day time is from 6:00 to 23:00
, mday = case when rn between 6*60 and 23*60-1 then 1 else 0 end
, mnight = 1 - case when rn between 6*60 and 23*60-1 then 1 else 0 end
from (select top(24*60) rn=row_number () over (order by (select null))
from sys.all_objects s1, sys.all_objects s2) t
)
select dayMinutes=r1.dayMin + case holedays when 0 then 0 else r2.dayMin + (holedays-1)*(23*60 - 6*60) end
, nightMinutes=r1.nightMin + case holedays when 0 then 0 else r2.nightMin + (holedays-1)*(24*60 -(23*60 - 6*60)) end
, totalMinutes= datediff(MINUTE, [FromDateTime], [ToDateTime]) -- control
,[JHID],[JetReg],[ArrFltID],[DepFltID],[ArrDateTime],[FromDateTime],[ToDateTime]
-- more columns skipped
from [myTestTable]
cross apply (select fD = dateadd(DAY,datediff(DAY,'19000101',[FromDateTime]),'19000101')
,tD = dateadd(DAY,datediff(DAY,'19000101',[ToDateTime]),'19000101')
,holedays = datediff(DAY,[FromDateTime],[ToDateTime]) ) xd
cross apply (select fFirstMin = datediff(MINUTE, fd, [FromDateTime])
,fLastMin = case holedays when 0 then datediff(MINUTE, td,[ToDateTime]) else 24*60 end - 1
,tFirstMin = 1
,tLastMin = datediff(MINUTE, td, [ToDateTime])
) xb
cross apply (select dayMin = sum(mm.mday)
, nightMin = sum(mm.mnight)
from myminutes mm
where mm.rn between fFirstMin and fLastMin ) r1
cross apply (select dayMin = sum(mm.mday)
, nightMin = sum(mm.mnight)
from myminutes mm
where mm.rn between tFirstMin and tLastMin ) r2
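The conversion from minutes to hours + minutes mentioned above can be sketched with integer division and modulo. This is only an illustration: the variables @dayMinutes and @nightMinutes are hypothetical stand-ins for the dayMinutes and nightMinutes columns of the query, filled here with the totals from the question's 13:30 to 07:15 example.

```sql
-- Hypothetical inputs: minute totals for the question's example
-- (10h45m of day time, 7h00m of night time).
DECLARE @dayMinutes int = 645,
        @nightMinutes int = 420;

SELECT opDayHour   = @dayMinutes / 60,    -- integer division gives whole hours
       opDayMin    = @dayMinutes % 60,    -- modulo gives the leftover minutes
       opNightHour = @nightMinutes / 60,
       opNightMin  = @nightMinutes % 60;
```

Because both operands are int, the division truncates, so no rounding function is needed.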
You can use a recursive CTE for that count:
DECLARE
@DateTime1 datetime = '2016-04-20 13:30',
@DateTime2 datetime = '2016-04-21 07:15'
;WITH times AS(
SELECT @DateTime1 as d,
CASE WHEN DATEPART(hour,@DateTime1) between 6 and 22 then 'd' else 'n' end as a,
0 as m
UNION ALL
SELECT DATEADD(minute,1,d),
CASE WHEN DATEPART(hour,DATEADD(minute,1,d)) between 6 and 22 then 'd' else 'n' end as a,
DATEDIFF(minute,d,DATEADD(minute,1,d))
FROM times
WHERE DATEADD(minute,1,d) <= @DateTime2
)
SELECT CASE WHEN a = 'd' THEN 'DayTime' ELSE 'NightTime' END as TimePart,
sum(m)/60 as H,
sum(m) - (sum(m)/60)* 60 as M
FROM times
GROUP BY a
OPTION (MAXRECURSION 0)
The output looks like this:
TimePart H M
--------- ----------- -----------
DayTime 10 45
NightTime 7 0
(2 row(s) affected)
There's a table with three columns: start date, end date and task duration in hours. For example, something like that:
Id  StartDate   EndDate     Duration
1   07-11-2022  15-11-2022  40
2   02-09-2022  02-11-2022  122
3   10-10-2022  05-11-2022  52
And I want to get a table like that:
Id  Month  HoursPerMonth
1   11     40
2   09     56
2   10     62
2   11     4
3   10     42
3   11     10
Briefly, I want to know how many working hours fall in each month between the start and end dates, proportionally. How can I achieve that with an MS SQL query? The data is quite big, so query speed is important. Thanks in advance!
I've tried DATEDIFF and EOMONTH, but that solution doesn't work for tasks longer than 2 months, and I'm sure it is a bad approach anyway. I hope it can be done in a more elegant way.
Here is an option using an ad-hoc tally/calendar table.
I'm not sure I agree with your desired results.
Select ID
,Month = month(D)
,HoursPerMonth = (sum(1.0) / (1+max(datediff(DAY,StartDate,EndDate)))) * max(Duration)
From YourTable A
Join (
Select Top 75000 D=dateadd(day,Row_Number() Over (Order By (Select NULL)),0)
From master..spt_values n1, master..spt_values n2
) B on D between StartDate and EndDate
Group By ID,month(D)
Order by ID,Month
This answer uses CTE recursion.
This part just sets up a temp table with the OP's example data.
DECLARE @source
TABLE (
SOURCE_ID INT
,STARTDATE DATE
,ENDDATE DATE
,DURATION INT
)
;
INSERT
INTO
@source
VALUES
(1, '20221107', '20221115', 40 )
,(2, '20220902', '20221102', 122 )
,(3, '20221010', '20221105', 52 )
;
This part is the query based on the above data. The recursive CTE breaks the time period into months. The second CTE does the math. The final selection does some more math and presents the results the way you want to see them.
WITH CTE AS (
SELECT
SRC.SOURCE_ID
,SRC.STARTDATE
,SRC.ENDDATE
,SRC.STARTDATE AS 'INTERIM_START_DATE'
,CASE WHEN EOMONTH(SRC.STARTDATE) < SRC.ENDDATE
THEN EOMONTH(SRC.STARTDATE)
ELSE SRC.ENDDATE
END AS 'INTERIM_END_DATE'
,SRC.DURATION
FROM
@source SRC
UNION ALL
SELECT
CTE.SOURCE_ID
,CTE.STARTDATE
,CTE.ENDDATE
,CASE WHEN EOMONTH(CTE.INTERIM_START_DATE) < CTE.ENDDATE
THEN DATEADD( DAY, 1, EOMONTH(CTE.INTERIM_START_DATE) )
ELSE CTE.STARTDATE
END
,CASE WHEN EOMONTH(CTE.INTERIM_START_DATE, 1) < CTE.ENDDATE
THEN EOMONTH(CTE.INTERIM_START_DATE, 1)
ELSE CTE.ENDDATE
END
,CTE.DURATION
FROM
CTE
WHERE
CTE.INTERIM_END_DATE < CTE.ENDDATE
)
, CTE2 AS (
SELECT
CTE.SOURCE_ID
,CTE.STARTDATE
,CTE.ENDDATE
,CTE.INTERIM_START_DATE
,CTE.INTERIM_END_DATE
,CAST( DATEDIFF( DAY, CTE.INTERIM_START_DATE, CTE.INTERIM_END_DATE ) + 1 AS FLOAT ) AS 'MNTH_DAYS'
,CAST( DATEDIFF( DAY, CTE.STARTDATE, CTE.ENDDATE ) + 1 AS FLOAT ) AS 'TTL_DAYS'
,CAST( CTE.DURATION AS FLOAT ) AS 'DURATION'
FROM
CTE
)
SELECT
CTE2.SOURCE_ID AS 'Id'
,MONTH( CTE2.INTERIM_START_DATE ) AS 'Month'
,ROUND( CTE2.MNTH_DAYS/CTE2.TTL_DAYS * CTE2.DURATION, 0 ) AS 'HoursPerMonth'
FROM
CTE2
ORDER BY
CTE2.SOURCE_ID
,CTE2.INTERIM_END_DATE
;
My results agree with Mr. Cappelletti's, not the OP's. Perhaps some tweaking regarding the definition of a "Day" is needed. I don't know.
If time between start and end date is large (more than 100 months) you may want to specify OPTION (MAXRECURSION 0) at the end.
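As a minimal sketch of that hint: a recursive CTE stops with an error once it exceeds the default limit of 100 recursion levels, and OPTION (MAXRECURSION 0) removes the limit. The CTE name Months and the 150-month range below are made up purely for illustration.

```sql
-- Hypothetical example: walk 150 consecutive months with a recursive CTE.
-- Without the OPTION clause this fails once recursion level 100 is exceeded.
WITH Months AS (
    SELECT CAST('20000101' AS date) AS MonthStart, 1 AS n
    UNION ALL
    SELECT DATEADD(MONTH, 1, MonthStart), n + 1
    FROM Months
    WHERE n < 150
)
SELECT MonthCount = COUNT(*)
FROM Months
OPTION (MAXRECURSION 0);   -- 0 means no recursion limit
```

With no limit in place, a recursive member that never terminates will run until it is cancelled, so it is worth making sure the WHERE clause really ends the recursion.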
I have a union query that runs abysmally slow I believe mostly because there are two functions in the where clause of each union. I am pretty sure that there is no getting around the unions, but there may be a way to move the functions from the where of each. I won't post ALL of the union sections because I don't think it is necessary as they are all almost identical with the exception of one table in each. The first function was created by someone else but it takes a date, and uses the "frequency" value like "years, months, days, etc." and the "interval" value like 3, 4, 90 to calculate the new "Due Date". For instance, a date of today with a frequency of years, and an interval of 3, would produce the date 4/21/2025. Here is the actual function:
ALTER FUNCTION [dbo].[ReturnExpiration_IntervalxFreq](@Date datetime2, @int int, @freq int)
RETURNS datetime2
AS
BEGIN
declare @d datetime2;
-- @int selects the unit to add, @freq is how many of that unit
SELECT @d = case when @int = 1 then null -- '12-31-9999'
when @int = 2 then dateadd(day,@freq,@Date)
when @int = 3 then dateadd(week,@freq,@Date)
when @int = 4 then dateadd(month,@freq,@Date)
when @int = 5 then dateadd(quarter,@freq,@Date)
when @int = 6 then dateadd(year,@freq,@Date)
end
RETURN @d;
END
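To make the function's behaviour concrete, here is a small standalone sketch that evaluates the same CASE expression inline for the example described above (three years added to a date). The start date 2022-04-21 is an assumption, inferred from the stated result of 4/21/2025; the variable names simply mirror the function's parameters.

```sql
-- Assumed inputs: "today" taken as 2022-04-21, unit 6 (years), count 3.
DECLARE @Date datetime2 = '2022-04-21',
        @int  int = 6,
        @freq int = 3;

-- Same logic as the function body, written as a plain expression.
SELECT DueDate = CASE WHEN @int = 1 THEN NULL
                      WHEN @int = 2 THEN DATEADD(day,     @freq, @Date)
                      WHEN @int = 3 THEN DATEADD(week,    @freq, @Date)
                      WHEN @int = 4 THEN DATEADD(month,   @freq, @Date)
                      WHEN @int = 5 THEN DATEADD(quarter, @freq, @Date)
                      WHEN @int = 6 THEN DATEADD(year,    @freq, @Date)
                 END;
```

Under those assumed inputs the expression yields 21 April 2025.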
The query itself is supposed to find and identify records whose Due Date has passed or is within 90 days of the current date. Here is what each section of the union looks like:
SELECT
R.RequirementId
, EC.EmployeeCompanyId
, EC.CompanyId
, DaysOverdue =
CASE WHEN
R.DueDate IS NULL
THEN
CASE WHEN
EXISTS(SELECT 1 FROM tbl_Training_Requirement_Compliance RC WHERE RC.EmployeeCompanyId = EC.EmployeeCompanyId AND RC.RequirementId = R.RequirementId AND RC.Active = 1 AND ((DATEDIFF(DAY, R.DueDate, GETDATE()) > -91 OR R.DueDate Is Null ) OR (DATEDIFF(DAY, dbo.ReturnExpiration_IntervalxFreq(TRC.EffectiveDate, R.IntervalId, R.Frequency), GETDATE()) > -91)) OR R.IntervalId IS NULL)
THEN
DateDiff(day,ISNULL(dbo.ReturnExpiration_IntervalxFreq(TRC.EffectiveDate, R.IntervalId, R.Frequency), '12/31/9999'),getdate())
ELSE
0
END
ELSE
DATEDIFF(day,R.DueDate,getdate())
END
,CASE WHEN
EXISTS(SELECT 1 FROM tbl_Training_Requirement_Compliance RC WHERE RC.EmployeeCompanyId = EC.EmployeeCompanyId AND RC.RequirementId = R.RequirementId AND RC.Active=1 AND (GETDATE() > dbo.ReturnExpiration_IntervalxFreq(RC.EffectiveDate, R.IntervalId, R.Frequency) OR R.IntervalId IS NULL))
THEN
CONVERT(VARCHAR(12),dbo.ReturnExpiration_IntervalxFreq(TRC.EffectiveDate, R.IntervalId, R.Frequency), 101)
ELSE
CONVERT(VARCHAR(12),R.DueDate,101)
END As DateDue
FROM
#Employees AS EC
INNER JOIN dbo.tbl_Training_Requirement_To_Position TRP ON TRP.PositionId = EC.PositionId
INNER JOIN #CompanyReqs R ON R.RequirementId = TRP.RequirementId
LEFT OUTER JOIN tbl_Training_Requirement_Compliance TRC ON TRC.EmployeeCompanyId = EC.EmployeeCompanyId AND TRC.RequirementId = R.RequirementId AND TRC.Active = 1
WHERE
NOT EXISTS(SELECT 1
FROM tbl_Training_Requirement_Compliance RC
WHERE RC.EmployeeCompanyId = EC.EmployeeCompanyId
AND RC.RequirementId = R.RequirementId
AND RC.Active = 1
)
OR (
(DATEDIFF(DAY, R.DueDate, GETDATE()) > -91
OR R.DueDate Is Null )
OR (DATEDIFF(DAY, dbo.ReturnExpiration_IntervalxFreq(TRC.EffectiveDate, R.IntervalId, R.Frequency), GETDATE()) > -91))
UNION...
It is supposed to exclude records that either don't exist at all on the tbl_Training_Requirement_Compliance table, or if they do exist, once the frequency and intervals have been calculated, would have a new due date that is within 90 days of the current date. I am hoping that someone with much more experience and expertise in SQL Server can show me a way, if possible, to remove the functions from the WHERE clause and improve the performance of this stored procedure.
I have a simplified table called Bookings that has two columns BookDate and BookSlot. The BookDate column will have dates only (no time) and the BookSlot column will contain the time of the day in intervals of 30 minutes from 0 to 1410 inclusive. (i.e. 600 = 10:00am)
How can I find the first slot available in the future (not booked) without running through a loop?
Here is the table definition and test data:
Create Table Bookings(
BookDate DateTime Not Null,
BookSlot Int Not Null
)
Go
Insert Into Bookings(BookDate,BookSlot) Values('2014-07-01',0);
Insert Into Bookings(BookDate,BookSlot) Values('2014-07-01',30);
Insert Into Bookings(BookDate,BookSlot) Values('2014-07-01',60);
Insert Into Bookings(BookDate,BookSlot) Values('2014-07-01',630);
Insert Into Bookings(BookDate,BookSlot) Values('2014-07-02',60);
Insert Into Bookings(BookDate,BookSlot) Values('2014-07-02',90);
Insert Into Bookings(BookDate,BookSlot) Values('2014-07-02',120);
I want a way to return the first available slot that is not in the table and that is in the future (based on server time).
Based on above test data:
If the current server time was 1st Jul, 00:10am, the result should be 1st Jul, 90min (01:30am).
If the current server time was 2nd Jul, 01:05am, the result should be 2nd Jul, 150min (02:30am).
If there are no bookings in the future, the function would simply return the closest half-hour in the future.
--
SQL Fiddle for this is here:
http://sqlfiddle.com/#!6/0e93d/1
Below is one method that will allow bookings up to 256 days in the future, and allow for an empty Booking table. I assume you are using SQL Server 2005 since your BookDate is dateTime instead of date.
In any case, you might consider storing the slots as a complete datetime instead of separate columns. That will facilitate queries and improve performance.
DECLARE @now DATETIME = '2014-07-01 00:10:00';
WITH T4
AS (SELECT N
FROM (VALUES(0),
(0),
(0),
(0),
(0),
(0),
(0),
(0)) AS t(N)),
T256
AS (SELECT Row_number()
OVER(
ORDER BY (SELECT 0)) - 1 AS n
FROM T4 AS a
CROSS JOIN T4 AS b
CROSS JOIN T4 AS c),
START_DATE
AS (SELECT Dateadd(DAY, Datediff(DAY, '', @now), '') AS start_date),
START_TIME
AS (SELECT Dateadd(MINUTE, Datediff(MINUTE, '', @now) / 30 * 30, '') AS
start_time),
DAILY_INTERVALS
AS (SELECT N * 30 AS interval
FROM T256
WHERE N < 48)
SELECT TOP (1) Dateadd(DAY, future_days.N, START_DATE) AS BookDate,
DAILY_INTERVALS.INTERVAL AS BookSlot
FROM START_DATE
CROSS APPLY START_TIME
CROSS APPLY DAILY_INTERVALS
CROSS APPLY T256 AS future_days
WHERE Dateadd(MINUTE, DAILY_INTERVALS.INTERVAL,
Dateadd(DAY, future_days.N, START_DATE)) > START_TIME
AND NOT EXISTS(SELECT *
FROM DBO.BOOKINGS
-- compare against the candidate day, not only the start date
WHERE BOOKDATE = Dateadd(DAY, future_days.N, START_DATE)
AND BOOKSLOT = DAILY_INTERVALS.INTERVAL)
ORDER BY BOOKDATE,
BOOKSLOT;
See this SQL Fiddle
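The suggestion above to store each slot as a complete datetime can be sketched as follows. The column name SlotStart is hypothetical, and the three rows are a subset of the question's test data inlined with VALUES so the snippet runs on its own.

```sql
-- Combine the date-only BookDate with the minute offset in BookSlot
-- to get one datetime value per booked slot.
SELECT b.BookDate,
       b.BookSlot,
       SlotStart = DATEADD(MINUTE, b.BookSlot, b.BookDate)
FROM (VALUES
        (CAST('2014-07-01' AS datetime), 0),
        (CAST('2014-07-01' AS datetime), 630),
        (CAST('2014-07-02' AS datetime), 120)
     ) AS b(BookDate, BookSlot);
```

With a single datetime column, "in the future" becomes one comparison against the current time instead of a combined date and slot test.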
It's a bit complicated but try this:
WITH DATA
AS (SELECT *,
Row_number()
OVER (
ORDER BY BOOKDATE, BOOKSLOT) RN
FROM BOOKINGS)
SELECT CASE
WHEN T.BOOKSLOT = 1410 THEN Dateadd(DAY, 1, BOOKDATE)
ELSE BOOKDATE
END Book_Date,
CASE
WHEN T.BOOKSLOT = 1410 THEN 0
ELSE BOOKSLOT + 30
END Book_Slot
FROM (SELECT TOP 1 T1.*
FROM DATA T1
LEFT JOIN DATA t2
ON t1.RN = T2.RN - 1
WHERE t2.BOOKSLOT - t1.BOOKSLOT > 30
OR ( t1.BOOKDATE != T2.BOOKDATE
AND ( t2.BOOKSLOT != 0
OR t1.BOOKSLOT != 630 ) )
OR t2.BOOKSLOT IS NULL)T
Here is the SQL fiddle example.
Explanation
This solution contains 2 parts:
Comparing each line to the next and checking for a gap (can be done easier in SQL 2012)
Adding a half an hour to create the next slot, this includes moving to the next day if needed.
Edit
Added TOP 1 in the query so that only the first slot is returned as requested.
Update
Here is the updated version including 2 new elements (getting current date+ time and dealing with empty table):
DECLARE @Date DATETIME = '2014-07-01',
@Slot INT = 630
DECLARE @time AS TIME = Cast(Getdate() AS TIME)
SELECT @Slot = Datepart(HOUR, @time) * 60 + Round(Datepart(MINUTE, @time) / 30,
0) * 30
+ 30
SET @Date = Cast(Getdate() AS DATE)
;WITH DATA
AS (SELECT *,
Row_number()
OVER (
ORDER BY BOOKDATE, BOOKSLOT) RN
FROM BOOKINGS
WHERE BOOKDATE > @Date
OR ( BOOKDATE = @Date
AND BOOKSLOT >= @Slot ))
SELECT TOP 1 BOOK_DATE,
BOOK_SLOT
FROM (SELECT CASE
WHEN RN = 1
AND NOT (@Slot = BOOKSLOT
AND @Date = BOOKDATE) THEN @Date
WHEN T.BOOKSLOT = 1410 THEN Dateadd(DAY, 1, BOOKDATE)
ELSE BOOKDATE
END Book_Date,
CASE
WHEN RN = 1
AND NOT (@Slot = BOOKSLOT
AND @Date = BOOKDATE) THEN @Slot
WHEN T.BOOKSLOT = 1410 THEN 0
ELSE BOOKSLOT + 30
END Book_Slot,
1 AS ID
FROM (SELECT TOP 1 T1.*
FROM DATA T1
LEFT JOIN DATA t2
ON t1.RN = T2.RN - 1
WHERE t2.BOOKSLOT - t1.BOOKSLOT > 30
OR ( t1.BOOKDATE != T2.BOOKDATE
AND ( t2.BOOKSLOT != 0
OR t1.BOOKSLOT != 1410 ) )
OR t2.BOOKSLOT IS NULL)T
UNION
SELECT @Date AS bookDate,
@Slot AS BookSlot,
2 ID)X
ORDER BY X.ID
Play around with the SQL fiddle and let me know what you think.
In SQL Server 2012 and later, you can use the lead() function. The logic is a bit convoluted because of all the boundary conditions. I think this captures it:
select top 1
(case when BookSlot = 1410 then BookDate + 1 else BookDate end) as BookDate,
(case when BookSlot = 1410 then 0 else BookSlot + 30 end) as BookSlot
from (select b.*,
lead(BookDate) over (order by BookDate) as next_dt,
lead(BookSlot) over (partition by BookDate order by BookSlot) as next_bs
from bookings b
) b
where (next_bs is null and BookSlot < 1410 or
next_bs - BookSlot > 30 or
BookSlot = 1410 and (next_dt <> BookDate + 1 or next_dt = BookDate and next_bs <> 0)
)
order by BookDate, BookSlot;
Using a tally table to generate a list of originally available booking slots out 6 weeks (adjustable below):
declare @Date as date = getdate();
declare @slot as int = 30 * (datediff(n,@Date,getdate()) /30);
with
slots as (
select (ROW_NUMBER() over (order by s)-1) * 30 as BookSlot
from(
values (1),(1),(1),(1),(1),(1),(1),(1) -- 4 hour block
)slots(s)
cross join (
values (1),(1),(1),(1),(1),(1) -- 6 blocks of 4 hours each day
)QuadHours(t)
)
,days as (
select (ROW_NUMBER() over (order by s)-1) + getdate() as BookDate
from (
values (1),(1),(1),(1),(1),(1),(1) -- 7 days in a week
)dayList(s)
cross join (
-- set this to number of weeks out to allow bookings to be made
values (1),(1),(1),(1),(1),(1) -- allow 6 weeks of bookings at a time
)weeks(t)
)
,tally as (
select
cast(days.BookDate as date) as BookDate
,slots.BookSlot as BookSlot
from slots
cross join days
)
select top 1
tally.BookDate
,tally.BookSlot
from tally
left join #Bookings book
on tally.BookDate = book.BookDate
and tally.BookSlot = book.BookSlot
where book.BookSlot is null
and ( tally.BookDate > @Date or tally.BookSlot > @slot )
order by tally.BookDate,tally.BookSlot;
go
try this:
SELECT a.bookdate, ((a.bookslot/60.)+.5) * 60
FROM bookings a LEFT JOIN bookings b
ON a.bookdate=b.bookdate AND (a.bookslot/60.)+.50=b.bookslot/60.
WHERE b.bookslot IS null
Long time stalker, first time poster (and SQL beginner). My question is similar to this one SQL to find time elapsed from multiple overlapping intervals, except I'm able to use CTE, UDFs etc and am looking for more detail.
On a piece of large scale equipment I have a record of all faults that arise. Faults can arise on different sub-components of the system, some may take it offline completely (complete outage = yes), while others do not (complete outage = no). Faults can overlap in time, and may not have end times if the fault has not yet been repaired.
Outage_ID  StartDateTime   EndDateTime     CompleteOutage
1          07:00 3-Jul-13  08:55 3-Jul-13  Yes
2          08:30 3-Jul-13  10:00 4-Jul-13  No
3          12:00 4-Jul-13                  No
4          12:30 4-Jul-13  12:35 4-Jul-13  No
1 |---------|
2 |---------|
3 |--------------------------------------------------------------
4 |---|
I need to be able to work out, for a user-defined time period, how long the total system is fully functional (no faults), how long it is degraded (one or more non-complete outages) and how long it is inoperable (one or more complete outages). I also need to be able to work out which faults were on the system for any given time period. I was thinking of creating a "State Change" table with a row any time a fault is opened or closed, but I am stuck on the best way to do this. Any help on this, or better solutions, would be appreciated!
This isn't a complete solution (I leave that as an exercise :)) but should illustrate the basic technique. The trick is to create a state table (as you say). If you record a 1 for a "start" event and a -1 for an "end" event then a running total in event date/time order gives you the current state at that particular event date/time. The SQL below is T-SQL but should be easily adaptable to whatever database server you're using.
Using your data for partial outage as an example:
DECLARE @Faults TABLE (
StartDateTime DATETIME NOT NULL,
EndDateTime DATETIME NULL
)
INSERT INTO @Faults (StartDateTime, EndDateTime)
SELECT '2013-07-03 08:30', '2013-07-04 10:00'
UNION ALL SELECT '2013-07-04 12:00', NULL
UNION ALL SELECT '2013-07-04 12:30', '2013-07-04 12:35'
-- "Unpivot" the events and assign 1 to a start and -1 to an end
;WITH FaultEvents AS (
SELECT *, Ord = ROW_NUMBER() OVER(ORDER BY EventDateTime)
FROM (
SELECT EventDateTime = StartDateTime, Evt = 1
FROM @Faults
UNION ALL SELECT EndDateTime, Evt = -1
FROM @Faults
WHERE EndDateTime IS NOT NULL
) X
)
-- Running total of Evt gives the current state at each date/time point
, FaultEventStates AS (
SELECT A.Ord, A.EventDateTime, A.Evt, [State] = (SELECT SUM(B.Evt) FROM FaultEvents B WHERE B.Ord <= A.Ord)
FROM FaultEvents A
)
SELECT StartDateTime = S.EventDateTime, EndDateTime = F.EventDateTime
FROM FaultEventStates S
OUTER APPLY (
-- Find the nearest transition to the no-fault state
SELECT TOP 1 *
FROM FaultEventStates B
WHERE B.[State] = 0
AND B.Ord > S.Ord
ORDER BY B.Ord
) F
-- Restrict to start events transitioning from the no-fault state
WHERE S.Evt = 1 AND S.[State] = 1
If you are using SQL Server 2012 then you have the option to calculate the running total using a windowing function.
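A minimal sketch of that windowed running total, assuming SQL Server 2012 or later, is shown below. It reuses the partial-outage sample rows; the tie-breaker `Evt DESC` is my own assumption so that a fault starting at the same instant another one ends does not briefly show a state of 0.

```sql
DECLARE @Faults TABLE (
    StartDateTime DATETIME NOT NULL,
    EndDateTime   DATETIME NULL
);
INSERT INTO @Faults (StartDateTime, EndDateTime)
VALUES ('2013-07-03 08:30', '2013-07-04 10:00'),
       ('2013-07-04 12:00', NULL),
       ('2013-07-04 12:30', '2013-07-04 12:35');

-- +1 for each fault start, -1 for each fault end (open faults have no end event)
WITH FaultEvents AS (
    SELECT EventDateTime = StartDateTime, Evt = 1
    FROM @Faults
    UNION ALL
    SELECT EndDateTime, -1
    FROM @Faults
    WHERE EndDateTime IS NOT NULL
)
SELECT EventDateTime,
       Evt,
       -- running total = number of faults open right after this event
       [State] = SUM(Evt) OVER (ORDER BY EventDateTime, Evt DESC
                                ROWS UNBOUNDED PRECEDING)
FROM FaultEvents
ORDER BY EventDateTime, Evt DESC;
```

For the sample rows the state goes 1, 0, 1, 2, 1, which replaces the correlated subquery with a single pass over the events.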
The below is a rough guide to getting this working. It will compare against an interval table of dates and an interval table of 15 mins. It will then sum the outage events (1 event per interval), but not sum a partial outage if there is a full outage.
You could use a more granular time interval if you needed; I chose 15 mins for speed of coding.
I already had a date interval table set up "CAL.t_Calendar" so you would need to create one of your own to run this code.
Please note, this does not represent actual code you should use. It is only intended as a demonstration and to point you in a possible direction...
EDIT: I've just realised I haven't accounted for the null end dates. The code will need amending to check for NULL end dates and use @EndDate, or GETDATE() if @EndDate is in the future.
--drop table ##Events
CREATE TABLE #Events (OUTAGE_ID INT IDENTITY(1,1) PRIMARY KEY
,StartDateTime datetime
,EndDateTime datetime
, completeOutage bit)
INSERT INTO #Events VALUES ('2013-07-03 07:00','2013-07-03 08:55',1),('2013-07-03 08:30','2013-07-04 10:00',0)
,('2013-07-04 12:00',NULL,0),('2013-07-04 12:30','2013-07-04 12:35',0)
--drop table #FiveMins
CREATE TABLE #FiveMins (ID int IDENTITY(1,1) PRIMARY KEY, TimeInterval Time)
DECLARE @Time INT = 0
WHILE @Time < 1440 -- 96 fifteen-minute intervals per day * 15 minutes
BEGIN
INSERT INTO #FiveMins SELECT DATEADD(MINUTE , @Time, '00:00')
SET @Time = @Time + 15
END
SELECT * from #FiveMins
DECLARE @StartDate DATETIME = '2013-07-03'
DECLARE @EndDate DATETIME = '2013-07-04 23:59:59.999'
SELECT SUM(FullOutage) * 15 as MinutesFullOutage
,SUM(PartialOutage) * 15 as MinutesPartialOutage
,SUM(NoOutage) * 15 as MinutesNoOutage
FROM
(
SELECT DateAnc.EventDateTime
, CASE WHEN COUNT(OU.OUTAGE_ID) > 0 THEN 1 ELSE 0 END AS FullOutage
, CASE WHEN COUNT(OU.OUTAGE_ID) = 0 AND COUNT(pOU.OUTAGE_ID) > 0 THEN 1 ELSE 0 END AS PartialOutage
, CASE WHEN COUNT(OU.OUTAGE_ID) > 0 OR COUNT(pOU.OUTAGE_ID) > 0 THEN 0 ELSE 1 END AS NoOutage
FROM
(
SELECT CAL.calDate + MI.TimeInterval AS EventDateTime
FROM CAL.t_Calendar CAL
CROSS JOIN #FiveMins MI
WHERE CAL.calDate BETWEEN @StartDate AND @EndDate
) DateAnc
LEFT JOIN #Events OU
ON DateAnc.EventDateTime BETWEEN OU.StartDateTime AND OU.EndDateTime
AND OU.completeOutage = 1
LEFT JOIN #Events pOU
ON DateAnc.EventDateTime BETWEEN pOU.StartDateTime AND pOU.EndDateTime
AND pOU.completeOutage = 0
GROUP BY DateAnc.EventDateTime
) AllOutages
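The amendment for NULL end dates described in the EDIT note could look roughly like the sketch below. It is not part of the original answer: the column alias EffectiveEnd and the variable @Now are hypothetical, and two of the sample outages are inlined with VALUES so the snippet runs without the temp tables.

```sql
DECLARE @EndDate DATETIME = '2013-07-04 23:59:59';
DECLARE @Now     DATETIME = GETDATE();

-- An open fault (NULL end) is treated as ending at @EndDate,
-- or at the current time if @EndDate lies in the future.
SELECT e.OUTAGE_ID,
       e.StartDateTime,
       EffectiveEnd = COALESCE(e.EndDateTime,
                               CASE WHEN @EndDate > @Now THEN @Now
                                    ELSE @EndDate END)
FROM (VALUES
        (3, CAST('2013-07-04 12:00' AS datetime), CAST(NULL AS datetime)),
        (4, '2013-07-04 12:30', '2013-07-04 12:35')
     ) AS e(OUTAGE_ID, StartDateTime, EndDateTime);
```

The same COALESCE expression could then stand in for EndDateTime in the two BETWEEN join conditions.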
I want to count the number of 2 or more consecutive week periods that have negative values within a range of weeks.
Example:
Week | Value
201301 | 10
201302 | -5 <--| both weeks have negative values and are consecutive
201303 | -6 <--|
Week | Value
201301 | 10
201302 | -5
201303 | 7
201304 | -2 <-- negative but not consecutive to the last negative value in 201302
Week | Value
201301 | 10
201302 | -5
201303 | -7
201304 | -2 <-- 1st group of negative and consecutive values
201305 | 0
201306 | -12
201307 | -8 <-- 2nd group of negative and consecutive values
Is there a better way of doing this other than using a cursor and a reset variable and checking through each row in order?
Here is some of the SQL I have setup to try and test this:
IF OBJECT_ID('TempDB..#ConsecutiveNegativeWeekTestOne') IS NOT NULL DROP TABLE #ConsecutiveNegativeWeekTestOne
IF OBJECT_ID('TempDB..#ConsecutiveNegativeWeekTestTwo') IS NOT NULL DROP TABLE #ConsecutiveNegativeWeekTestTwo
CREATE TABLE #ConsecutiveNegativeWeekTestOne
(
[Week] INT NOT NULL
,[Value] DECIMAL(18,6) NOT NULL
)
-- I have a condition where I expect to see at least 2 consecutive weeks with negative values
-- TRUE : Week 201328 & 201329 are both negative.
INSERT INTO #ConsecutiveNegativeWeekTestOne
VALUES
(201327, 5)
,(201328,-11)
,(201329,-18)
,(201330, 25)
,(201331, 30)
,(201332, -36)
,(201333, 43)
,(201334, 50)
,(201335, 59)
,(201336, 0)
,(201337, 0)
SELECT * FROM #ConsecutiveNegativeWeekTestOne
WHERE Value < 0
ORDER BY [Week] ASC
CREATE TABLE #ConsecutiveNegativeWeekTestTwo
(
[Week] INT NOT NULL
,[Value] DECIMAL(18,6) NOT NULL
)
-- FALSE: The negative weeks are not consecutive
INSERT INTO #ConsecutiveNegativeWeekTestTwo
VALUES
(201327, 5)
,(201328,-11)
,(201329,20)
,(201330, -25)
,(201331, 30)
,(201332, -36)
,(201333, 43)
,(201334, 50)
,(201335, -15)
,(201336, 0)
,(201337, 0)
SELECT * FROM #ConsecutiveNegativeWeekTestTwo
WHERE Value < 0
ORDER BY [Week] ASC
My SQL fiddle is also here:
http://sqlfiddle.com/#!3/ef54f/2
First, would you please share the formula for calculating week number, or provide a real date for each week, or some method to determine if there are 52 or 53 weeks in any particular year? Once you do that, I can make my queries properly skip missing data AND cross year boundaries.
Now to queries: this can be done without a JOIN, which depending on the exact indexes present, may improve performance a huge amount over any solution that does use JOINs. Then again, it may not. This is also harder to understand so may not be worth it if other solutions perform well enough (especially when the right indexes are present).
Simulate a PREORDER BY windowing function (respects gaps, ignores year boundaries):
WITH Calcs AS (
SELECT
Grp =
[Week] -- comment out to ignore gaps and gain year boundaries
-- Row_Number() OVER (ORDER BY [Week]) -- swap with previous line
- Row_Number() OVER
(PARTITION BY (SELECT 1 WHERE Value < 0) ORDER BY [Week]),
*
FROM dbo.ConsecutiveNegativeWeekTestOne
)
SELECT
[Week] = Min([Week])
-- NumWeeks = Count(*) -- if you want the count
FROM Calcs C
WHERE Value < 0
GROUP BY C.Grp
HAVING Count(*) >= 2
;
See a Live Demo at SQL Fiddle (1st query)
And another way, simulating LAG and LEAD with a CROSS JOIN and aggregates (respects gaps, ignores year boundaries):
WITH Groups AS (
SELECT
Grp = T.[Week] + X.Num,
*
FROM
dbo.ConsecutiveNegativeWeekTestOne T
CROSS JOIN (VALUES (-1), (0), (1)) X (Num)
)
SELECT
[Week] = Min(C.[Week])
-- Value = Min(C.Value)
FROM
Groups G
OUTER APPLY (SELECT G.* WHERE G.Num = 0) C
WHERE G.Value < 0
GROUP BY G.Grp
HAVING
Min(G.[Week]) = Min(C.[Week])
AND Max(G.[Week]) > Min(C.[Week])
;
See a Live Demo at SQL Fiddle (2nd query)
And, my original second query, but simplified (ignores gaps, handles year boundaries):
WITH Groups AS (
SELECT
Grp = (Row_Number() OVER (ORDER BY T.[Week]) + X.Num) / 3,
*
FROM
dbo.ConsecutiveNegativeWeekTestOne T
CROSS JOIN (VALUES (0), (2), (4)) X (Num)
)
SELECT
[Week] = Min(C.[Week])
-- Value = Min(C.Value)
FROM
Groups G
OUTER APPLY (SELECT G.* WHERE G.Num = 2) C
WHERE G.Value < 0
GROUP BY G.Grp
HAVING
Min(G.[Week]) = Min(C.[Week])
AND Max(G.[Week]) > Min(C.[Week])
;
Note: The execution plan for these may be rated as more expensive than other queries, but there will be only 1 table access instead of 2 or 3, and while the CPU may be higher it is still respectably low.
Note: I originally was not paying attention to only producing one row per group of negative values, and so I produced this query as only requiring 2 table accesses (respects gaps, ignores year boundaries):
SELECT
T1.[Week]
FROM
dbo.ConsecutiveNegativeWeekTestOne T1
WHERE
Value < 0
AND EXISTS (
SELECT *
FROM dbo.ConsecutiveNegativeWeekTestOne T2
WHERE
T2.Value < 0
AND T2.[Week] IN (T1.[Week] - 1, T1.[Week] + 1)
)
;
See a Live Demo at SQL Fiddle (3rd query)
However, I have now modified it to perform as required, showing only each starting date (respects gaps, ignores year boundaries):
SELECT
T1.[Week]
FROM
dbo.ConsecutiveNegativeWeekTestOne T1
WHERE
Value < 0
AND EXISTS (
SELECT *
FROM
dbo.ConsecutiveNegativeWeekTestOne T2
WHERE
T2.Value < 0
AND T1.[Week] - 1 <= T2.[Week]
AND T1.[Week] + 1 >= T2.[Week]
AND T1.[Week] <> T2.[Week]
HAVING
Min(T2.[Week]) > T1.[Week]
)
;
See a Live Demo at SQL Fiddle (3rd query)
Last, just for fun, here is a SQL Server 2012 and up version using LEAD and LAG:
WITH Weeks AS (
SELECT
PrevValue = Lag(Value, 1, 0) OVER (ORDER BY [Week]),
SubsValue = Lead(Value, 1, 0) OVER (ORDER BY [Week]),
PrevWeek = Lag(Week, 1, 0) OVER (ORDER BY [Week]),
SubsWeek = Lead(Week, 1, 0) OVER (ORDER BY [Week]),
*
FROM
dbo.ConsecutiveNegativeWeekTestOne
)
SELECT @Week = [Week]
FROM Weeks W
WHERE
(
[Week] - 1 > PrevWeek
OR PrevValue >= 0
)
AND Value < 0
AND SubsValue < 0
AND [Week] + 1 = SubsWeek
;
See a Live Demo at SQL Fiddle (4th query)
I am not sure I am doing this the best way as I haven't used these much, but it works nonetheless.
You should do some performance testing of the various queries presented to you, and pick the best one, considering that code should be, in order:
Correct
Clear
Concise
Fast
Seeing that some of my solutions are anything but clear, other solutions that are fast enough and concise enough will probably win out in the competition of which one to use in your own production code. But... maybe not! And maybe someone will appreciate seeing these techniques, even if they can't be used as-is this time.
So let's do some testing and see what the truth is about all this! Here is some test setup script. It will generate the same data on your own server as it did on mine:
IF Object_ID('dbo.ConsecutiveNegativeWeekTestOne', 'U') IS NOT NULL DROP TABLE dbo.ConsecutiveNegativeWeekTestOne;
GO
CREATE TABLE dbo.ConsecutiveNegativeWeekTestOne (
[Week] int NOT NULL CONSTRAINT PK_ConsecutiveNegativeWeekTestOne PRIMARY KEY CLUSTERED,
[Value] decimal(18,6) NOT NULL
);
SET NOCOUNT ON;
DECLARE
@f float = Rand(5.1415926535897932384626433832795028842),
@Dt datetime = '17530101',
@Week int;
WHILE @Dt <= '20140106' BEGIN
INSERT dbo.ConsecutiveNegativeWeekTestOne
SELECT
Format(@Dt, 'yyyy') + Right('0' + Convert(varchar(11), DateDiff(day, DateAdd(year, DateDiff(year, 0, @Dt), 0), @Dt) / 7 + 1), 2),
Rand() * 151 - 76
;
SET @Dt = DateAdd(day, 7, @Dt);
END;
This generates 13,620 weeks, from 175301 through 201401. I modified all the queries to select the Week values instead of the count, in the format SELECT @Week = Expression ... so that tests are not affected by returning rows to the client.
I tested only the gap-respecting, non-year-boundary-handling versions.
Results
Query Duration CPU Reads
------------------ -------- ----- ------
ErikE-Preorder 27 31 40
ErikE-CROSS 29 31 40
ErikE-Join-IN -------Awful---------
ErikE-Join-Revised 46 47 15069
ErikE-Lead-Lag 104 109 40
jods 12 16 120
Transact Charlie 12 16 120
Conclusions
The reduced reads of the non-JOIN versions are not significant enough to warrant their increased complexity.
The table is so small that the performance almost doesn't matter. 261 years of weeks is insignificant, so a normal business operation won't see any performance problem even with a poor query.
I tested with an index on Week (which is more than reasonable). With that index, doing two separate JOINs with a seek was far, far superior to any device for getting the relevant related data in one swoop. Charlie and jods were spot on in their comments.
This data is not large enough to expose real differences between the queries in CPU and duration. The values above are representative, though at times the 31 ms were 16 ms and the 16 ms were 0 ms. Since the resolution is ~15 ms, this doesn't tell us much.
My tricky query techniques do perform better. They might be worth it in performance critical situations. But this is not one of those.
Lead and Lag may not always win. The presence of an index on the lookup value is probably what determines this. The ability to still pull prior/next values based on a certain order even when the order by value is not sequential may be one good use case for these functions.
You could use a combination of EXISTS checks.
Assuming you only want to know groups (series of consecutive weeks all negative)
--Find the potential start weeks
;WITH starts as (
SELECT [Week]
FROM #ConsecutiveNegativeWeekTestOne AS s
WHERE s.[Value] < 0
AND NOT EXISTS (
SELECT 1
FROM #ConsecutiveNegativeWeekTestOne AS p
WHERE p.[Week] = s.[Week] - 1
AND p.[Value] < 0
)
)
SELECT COUNT(*)
FROM
Starts AS s
WHERE EXISTS (
SELECT 1
FROM #ConsecutiveNegativeWeekTestOne AS n
WHERE n.[Week] = s.[Week] + 1
AND n.[Value] < 0
)
If you have an index on Week this query should even be moderately efficient.
You can replace LEAD and LAG with a self-join.
The counting idea is basically to count start of negative sequences rather than trying to consider each row.
SELECT COUNT(*)
FROM ConsecutiveNegativeWeekTestOne W
LEFT OUTER JOIN ConsecutiveNegativeWeekTestOne Prev
ON W.week = Prev.week + 1
INNER JOIN ConsecutiveNegativeWeekTestOne Next
ON W.week = Next.week - 1
WHERE W.value < 0
AND (Prev.value IS NULL OR Prev.value >= 0)
AND Next.value < 0
Note that I simply did "week + 1", which would not work when there is a year change.
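One way around the year-change limitation, sketched here under the assumption that the table has exactly one row per week with no missing weeks, is to compare row numbers instead of week values. The table variable @Weeks and its five sample rows are hypothetical and deliberately cross from 2013 into 2014.

```sql
DECLARE @Weeks TABLE (
    [Week]  INT NOT NULL PRIMARY KEY,
    [Value] DECIMAL(18,6) NOT NULL
);
INSERT INTO @Weeks ([Week], [Value])
VALUES (201351, 4), (201352, -3), (201401, -8), (201402, 6), (201403, -1);

WITH Numbered AS (
    -- Seq is gap-free even where the week number jumps at the year boundary
    SELECT [Week], [Value],
           Seq = ROW_NUMBER() OVER (ORDER BY [Week])
    FROM @Weeks
),
Islands AS (
    -- within a run of consecutive negative weeks, Seq minus its rank is constant
    SELECT [Week],
           Grp = Seq - ROW_NUMBER() OVER (ORDER BY Seq)
    FROM Numbered
    WHERE [Value] < 0
)
SELECT NegativeRuns = COUNT(*)
FROM (SELECT Grp
      FROM Islands
      GROUP BY Grp
      HAVING COUNT(*) >= 2) AS g;
```

For the sample rows this counts one run (201352 and 201401). If weeks can be missing from the table, the row-number sequence would wrongly treat the neighbours of a gap as consecutive, so real dates would be needed instead.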