SQL Audit Log Running Totals - sql

I have a table with an audit log:
BugId Timestamp Status
1 2010-06-24 10:00:00 open
2 2010-06-24 11:00:00 open
1 2010-06-25 12:00:00 closed
2 2010-06-26 13:00:00 closed
I want a running total of open and closed bugs like:
Timestamp # Status
2010-06-25 00:00:00 2 open
2010-06-26 00:00:00 1 open
2010-06-26 00:00:00 1 closed
2010-06-27 00:00:00 2 closed
How may I do this query (or similar) in Microsoft SQL Server 2000?
The output is intended to be used to feed a time series chart so I do not care if there are rows with 0 output since I will probably only select a timespan like the last month.

I think the output actually matches the sample data: on the 25th (12am), there are two open bugs. On the 26th, there is one open bug and one closed. And by the 27th, all bugs are closed.
It isn't clear how the main dates should be created. For my example, I pre-loaded the dates that I knew to be right but this could be accomplished in a variety of ways depending on the requirements of the user.
Anyway, the code is below. This should work for instances where a bug is opened and closed multiple times on the same day. It operates under the assumption that a bug cannot be opened and closed at the same time.
/** Setup the tables **/
IF OBJECT_ID('tempdb..#bugs') IS NOT NULL DROP TABLE #bugs
CREATE TABLE #bugs (
BugID INT,
[Timestamp] DATETIME,
[Status] VARCHAR(10)
)
IF OBJECT_ID('tempdb..#dates') IS NOT NULL DROP TABLE #dates
CREATE TABLE #dates (
[Date] DATETIME
)
/** Load the sample data. **/
INSERT #bugs
SELECT 1, '2010-06-24 10:00:00', 'open' UNION ALL
SELECT 2, '2010-06-24 11:00:00', 'open' UNION ALL
SELECT 1, '2010-06-25 12:00:00', 'closed' UNION ALL
SELECT 2, '2010-06-26 13:00:00', 'closed'
/** Build an arbitrary date table **/
INSERT #dates
SELECT '2010-06-24' UNION ALL
SELECT '2010-06-25' UNION ALL
SELECT '2010-06-26' UNION ALL
SELECT '2010-06-27'
/**
Subquery x:
For each date in the #dates table,
get the BugID and its last status.
This is for BugIDs that have been
opened and closed on the same day.
Subquery y:
Drawing from subquery x, get the
date, BugID, and Status of its
last status for that day
Main query:
For each date, get the count
of the most recent statuses for
that date. This will give the
running totals of open and
closed bugs for each date
**/
SELECT
[Date],
COUNT(*) AS [#],
[Status]
FROM (
SELECT
Date,
x.BugID,
b.[Status]
FROM (
SELECT
[Date],
BugID,
MAX([Timestamp]) AS LastStatus
FROM #dates d
INNER JOIN #bugs b
ON d.[Date] > b.[Timestamp]
GROUP BY
[Date],
BugID
) x
INNER JOIN #bugs b
ON x.BugID = b.BugID
AND x.LastStatus = b.[Timestamp]
) y
GROUP BY [Date], [Status]
ORDER BY [Date], CASE WHEN [Status] = 'Open' THEN 1 ELSE 2 END
Results:
Date # Status
----------------------- ----------- ----------
2010-06-25 00:00:00.000 2 open
2010-06-26 00:00:00.000 1 open
2010-06-26 00:00:00.000 1 closed
2010-06-27 00:00:00.000 2 closed
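For readers who want to check the logic without a SQL Server instance, here is a minimal Python sketch of the same idea: for each date, take each bug's latest status strictly before that date, then count per status. The names `log`, `dates` and `running_totals` are made up for this illustration and are not part of the query above.

```python
from collections import Counter
from datetime import datetime

# Sample audit log rows from the question: (BugId, Timestamp, Status)
log = [
    (1, datetime(2010, 6, 24, 10), "open"),
    (2, datetime(2010, 6, 24, 11), "open"),
    (1, datetime(2010, 6, 25, 12), "closed"),
    (2, datetime(2010, 6, 26, 13), "closed"),
]


def running_totals(log, dates):
    """For each date, count bugs by their latest status strictly before it."""
    result = []
    for d in dates:
        latest = {}  # BugId -> (Timestamp, Status) of the newest entry before d
        for bug_id, ts, status in log:
            if ts < d and (bug_id not in latest or ts > latest[bug_id][0]):
                latest[bug_id] = (ts, status)
        counts = Counter(status for _, status in latest.values())
        # Emit "open" before "closed", like the ORDER BY in the query
        for status in ("open", "closed"):
            if counts[status]:
                result.append((d, counts[status], status))
    return result


dates = [datetime(2010, 6, day) for day in (24, 25, 26, 27)]
totals = running_totals(log, dates)
for row in totals:
    print(row)
```

Like the query, this produces no row for the 24th, because nothing was logged before midnight on that date.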

use tempdb
go
create table audit_log
(
BugID integer not null
, dt_entered_utc datetime not null default ( getutcdate () )
, [status] varchar(10) not null
);
INSERT INTO audit_log ( BugID, dt_entered_utc, [status] ) VALUES ( 1, '2010-06-24 10:00', 'open' );
INSERT INTO audit_log ( BugID, dt_entered_utc, [status] ) VALUES ( 2, '2010-06-24 11:00', 'open' );
INSERT INTO audit_log ( BugID, dt_entered_utc, [status] ) VALUES ( 1, '2010-06-25 12:00', 'closed' );
INSERT INTO audit_log ( BugID, dt_entered_utc, [status] ) VALUES ( 2, '2010-06-26 13:00', 'closed' );
SELECT
[Date] = CAST ( CONVERT ( varchar, a.dt_entered_utc, 101 ) as datetime )
, [#] = COUNT ( 1 )
, [Status] = a.status
FROM audit_log a
GROUP BY CAST ( CONVERT ( varchar, a.dt_entered_utc, 101 ) as datetime ), a.status
ORDER by [Date] ASC
Date # Status
2010-06-24 00:00:00.000 2 open
2010-06-25 00:00:00.000 1 closed
2010-06-26 00:00:00.000 1 closed

Related

Finding the largest time overlap in T-SQL

I'm trying to do this on SQL Server 2008 R2.
I have a table with 4 columns:
parent_id INT
child_id INT
start_time TIME
end_time TIME
You should look at the children as sub-processes that run for the parent program. All these sub-processes are run once every day, and each child runs within its given time span. I want to find the largest overlap of time intervals for each parent based on the times of its children, i.e. I want to know the longest possible overlap where all the sub-processes are running. The fact that each time span is repeated every day means that even if a child's time interval spans midnight (e.g. 23:00-10:00), it can overlap with a child that only runs in the morning (e.g. 07:00-09:00), because even if they don't overlap on "the first day", they will overlap on all subsequent days.
The output should look like this:
parent_id INT
start_time TIME
end_time TIME
valid BIT
Where valid = 1 if an overlap was found and valid = 0 if no overlap was found.
A couple of important pieces of information:
A time interval can span midnight, i.e. start_time = 23:00 and end_time = 03:00, which is a time interval of 4 hours.
Two time intervals may overlap in two different places, i.e. start_time1 = 13:00, end_time1 = 06:00, start_time2 = 04:00, end_time2 = 14:00. This would give the largest overlap as 04:00 - 06:00 = 2 hours.
There may be no overlap common to the children of a given parent, in which case the output for that parent would be start_time = NULL, end_time = NULL and valid = 0.
If a child interval spans the whole day, then start_time = NULL and end_time = NULL. This was chosen to avoid having a day as 00:00-24:00, which would slice overlaps crossing midnight in two, i.e. parent 3 below would end up having two overlaps (23:00-24:00 and 00:00-04:00), instead of one (23:00-04:00).
An overlap is only an overlap if the time interval is shared by all the children of a parent.
The time span of one child can never be longer than 24 hours.
Take this example:
parent_id child_id start_time end_time
1 1 06:00 14:00
1 2 13:00 09:00
1 3 07:00 09:00
2 1 12:00 17:00
2 2 09:00 11:00
3 1 NULL NULL
3 2 23:00 04:00
4 1 NULL NULL
4 2 NULL NULL
10 1 06:11 14:00
10 2 06:00 09:00
10 3 05:00 08:44
11 1 11:38 17:00
11 2 09:02 12:11
These data would produce this result set:
parent_id start_time end_time valid
1 07:00 09:00 1
2 NULL NULL 0
3 23:00 04:00 1
4 NULL NULL 1
10 06:11 08:44 1
11 11:38 12:11 1
The overlap for a parent is the time interval that is shared by all its children. So the overlap for parent 10 is found by finding the overlap where all 3 children share time:
Child 1 (06:11-14:00) and 2 (06:00-09:00) overlap from 06:11 to 09:00. This overlap time interval is then applied to child 3 (05:00-08:44), which gives an overlap of 06:11 to 08:44, since this interval is the only interval where all 3 children share common time.
I hope this makes sense.
I can do it with a cursor, but I would really prefer to avoid cursors. I have been wracking my brain about how to do it without cursors, but I have come up short. Is there any way of doing it without cursors?
EDIT: Expanded the text for clause 4, to explain the decision of having a full day be NULL to NULL, instead of 00:00 to 00:00.
EDIT: Expanded the examples with two more cases. The new cases have parent ID 10 and 11.
EDIT: Inserted explanation of how the overlap for parent 10 is found.
EDIT: Clarified clause 3. Added clauses 5 and 6. Went into detail about what this is all about.
Based on your question, I think your output should be:
parent_id start_time end_time valid
1 07:00 09:00 1
2 NULL NULL 0
3 23:00 04:00 1
4 NULL NULL 1
10 06:11 08:44 1
11 11:38 12:11 1
And here is a set-based solution:
DECLARE @Times TABLE
(
parent_id INT
,child_id INT
,start_time TIME
,end_time TIME
);
INSERT INTO @Times
VALUES
(1, 1, '06:00', '14:00')
,(1, 2, '13:00', '09:00')
,(1, 3, '07:00', '09:00')
,(2, 1, '12:00', '17:00')
,(2, 2, '09:00', '11:00')
,(3, 1, NULL, NULL)
,(3, 2, '23:00', '04:00')
,(4, 1, NULL, NULL)
,(4, 2, NULL, NULL)
,(10, 1, '06:11', '14:00')
,(10, 2, '06:00', '09:00')
,(10, 3, '05:00', '08:44')
,(11, 1, '11:38', '17:00')
,(11, 2, '09:02', '12:11');
DECLARE @Parents TABLE
(
parent_id INT PRIMARY KEY
,ChildCount INT
)
INSERT INTO @Parents
SELECT
parent_id
,COUNT(DISTINCT child_id) AS ChildCount
FROM
@Times
GROUP BY
parent_id
DECLARE @StartTime DATETIME2 = '00:00'
DECLARE @MinutesInTwoDays INT = 2880
DECLARE @Minutes TABLE(ThisMinute DATETIME2 PRIMARY KEY);
WITH
MinutesCTE AS
(
SELECT
1 AS MinuteNumber
,@StartTime AS ThisMinute
UNION ALL
SELECT
NextMinuteNumber
,NextMinute
FROM MinutesCTE
CROSS APPLY (VALUES(MinuteNumber+1,DATEADD(MINUTE,1,ThisMinute))) NextDates(NextMinuteNumber,NextMinute)
WHERE
NextMinuteNumber <= @MinutesInTwoDays
)
INSERT INTO @Minutes
SELECT ThisMinute FROM MinutesCTE M OPTION (MAXRECURSION 2880);
DECLARE @SharedMinutes TABLE
(
ThisMinute DATETIME2
,parent_id INT
,UNIQUE(ThisMinute,parent_id)
);
WITH TimesCTE AS
(
SELECT
Times.parent_id
,Times.child_id
,CAST(ISNULL(Times.start_time,'00:00') AS datetime2) AS start_time
,
DATEADD
(
DAY
,
CASE
WHEN Times.end_time IS NULL THEN 2
WHEN Times.start_time > Times.end_time THEN 1
ELSE 0
END
,CAST(ISNULL(Times.end_time,'00:00') AS datetime2)
) as end_time
FROM
@Times Times
UNION ALL
SELECT
Times.parent_id
,Times.child_id
,DATEADD(DAY,1,CAST(Times.start_time as datetime2)) AS start_time
,DATEADD(DAY,1,CAST(Times.end_time AS datetime2)) AS end_time
FROM
@Times Times
WHERE
start_time < end_time
)
--Get minutes shared by all children of each parent
INSERT INTO @SharedMinutes
SELECT
M.ThisMinute
,P.parent_id
FROM
@Minutes M
JOIN
TimesCTE T
ON
M.ThisMinute BETWEEN start_time AND end_time
JOIN
@Parents P
ON T.parent_id = P.parent_id
GROUP BY
M.ThisMinute
,P.parent_id
,P.ChildCount
HAVING
COUNT(DISTINCT T.child_id) = P.ChildCount
--get results
SELECT
parent_id
,CAST(CASE WHEN start_time = '1900-01-01' AND end_time = '1900-01-02 23:59' THEN NULL ELSE start_time END AS TIME) AS start_time
,CAST(CASE WHEN start_time = '1900-01-01' AND end_time = '1900-01-02 23:59' THEN NULL ELSE end_time END AS TIME) AS end_time
,valid
FROM
(
SELECT
P.parent_id
,MIN(ThisMinute) AS start_time
,MAX(ThisMinute) AS end_time
,CASE WHEN MAX(ThisMinute) IS NOT NULL THEN 1 ELSE 0 END AS valid
FROM
@Parents P
LEFT JOIN
@SharedMinutes SM
ON P.parent_id = SM.parent_id
GROUP BY
P.parent_id
) Results
You may find that the iterative algorithm you have outlined in your question would be more efficient. But I would use a WHILE loop instead of a cursor if you take that approach.
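To make the minute-based idea easier to follow outside SQL, here is a rough Python sketch. It is not a translation of the query above: it treats the day as a circle of 1440 minutes, intersects the minutes covered by each child, and then picks the longest circular run, so it also handles two children that overlap in two places. The function names are invented for this sketch, and intervals are assumed to be half-open (the end minute itself is not covered).

```python
MINUTES_PER_DAY = 24 * 60


def to_minutes(hhmm):
    hours, minutes = hhmm.split(":")
    return int(hours) * 60 + int(minutes)


def to_hhmm(minute):
    return f"{minute // 60:02d}:{minute % 60:02d}"


def minutes_covered(start, end):
    """Minutes of the day covered by [start, end); None/None is the whole day."""
    if start is None and end is None:
        return set(range(MINUTES_PER_DAY))
    s, e = to_minutes(start), to_minutes(end)
    length = (e - s) % MINUTES_PER_DAY  # also works when crossing midnight
    return {(s + i) % MINUTES_PER_DAY for i in range(length)}


def largest_common_overlap(children):
    """Return (start_time, end_time, valid) for a list of (start, end) pairs."""
    shared = set(range(MINUTES_PER_DAY))
    for start, end in children:
        shared &= minutes_covered(start, end)
    if not shared:
        return (None, None, 0)
    if len(shared) == MINUTES_PER_DAY:
        return (None, None, 1)  # every child runs all day
    # Start scanning right after a minute that is NOT shared, so a run that
    # crosses midnight is never split in two.
    gap = min(set(range(MINUTES_PER_DAY)) - shared)
    best_start, best_len, run_start, run_len = 0, 0, 0, 0
    for offset in range(1, MINUTES_PER_DAY + 1):
        minute = (gap + offset) % MINUTES_PER_DAY
        if minute in shared:
            if run_len == 0:
                run_start = minute
            run_len += 1
            if run_len > best_len:
                best_start, best_len = run_start, run_len
        else:
            run_len = 0
    end = (best_start + best_len) % MINUTES_PER_DAY
    return (to_hhmm(best_start), to_hhmm(end), 1)


print(largest_common_overlap([("06:00", "14:00"), ("13:00", "09:00"), ("07:00", "09:00")]))
print(largest_common_overlap([(None, None), ("23:00", "04:00")]))
```

Working at one-minute resolution is an assumption carried over from the answer above; times with seconds would need a finer grid.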
This might be a very verbose method of achieving the desired results, but it works for the given dataset, although it should be tested with larger data.
I've simply joined the table to itself where the parent_id matches and the child_id is different to get all of the combinations of times that might overlap and then performed some DATEDIFF's to calculate the difference, before filtering and grouping the output.
You can run the below in isolation to test and tweak if required:
-- setup initial table
CREATE TABLE #OverlapTable
(
[parent_id] INT ,
[child_id] INT ,
[start_time] TIME ,
[end_time] TIME
);
-- insert dummy data
INSERT INTO #OverlapTable
( [parent_id], [child_id], [start_time], [end_time] )
VALUES ( 1, 1, '06:00', '14:00' ),
( 1, 2, '13:00', '09:00' ),
( 1, 3, '07:00', '09:00' ),
( 2, 1, '12:00', '17:00' ),
( 2, 2, '09:00', '11:00' ),
( 3, 1, NULL, NULL ),
( 3, 2, '23:00', '04:00' ),
( 4, 1, NULL, NULL ),
( 4, 2, NULL, NULL );
-- insert all combinations into a new temp table #Results with overlap calculations
SELECT *
INTO #Results
FROM ( SELECT t1.parent_id ,
t1.start_time ,
t1.end_time ,
t2.start_time AS t2_start_time ,
t2.end_time AS t2_end_time ,
CASE WHEN t1.start_time IS NULL
AND t1.end_time IS NULL THEN 0
WHEN t1.start_time BETWEEN t2.start_time
AND t2.end_time
THEN DATEDIFF(HOUR, t1.start_time, t2.end_time)
WHEN t1.end_time BETWEEN t2.start_time AND t2.end_time
THEN DATEDIFF(HOUR, t2.start_time, t1.end_time)
ELSE NULL
END AS Overlap
FROM #OverlapTable t1
INNER JOIN #OverlapTable t2 ON t2.parent_id = t1.parent_id
AND t2.child_id != t1.child_id
) t
-- SELECT * FROM #Results -- this shows intermediate results
-- filter and group results with the largest overlaps and handle other cases
SELECT DISTINCT
r.parent_id ,
CASE WHEN r.Overlap IS NULL THEN NULL
ELSE CASE WHEN r.start_time IS NULL THEN r.t2_start_time
ELSE r.start_time
END
END start_time ,
CASE WHEN r.Overlap IS NULL THEN NULL
ELSE CASE WHEN r.end_time IS NULL THEN r.t2_end_time
ELSE r.end_time
END
END end_time ,
CASE WHEN r.Overlap IS NULL THEN 0
ELSE 1
END Valid
FROM #Results r
WHERE EXISTS ( SELECT parent_id ,
MAX(Overlap)
FROM #Results
WHERE r.parent_id = parent_id
GROUP BY parent_id
HAVING MAX(Overlap) = r.Overlap
OR ( MAX(Overlap) IS NULL
AND r.Overlap IS NULL
) )
DROP TABLE #Results
DROP TABLE #OverlapTable
Hope that helps.

Returning a set of the most recent rows from a table

I'm trying to retrieve the latest set of rows from a source table containing a foreign key, a date and other fields present. A sample set of data could be:
create table #tmp (primaryId int, foreignKeyId int, startDate datetime,
otherfield varchar(50))
insert into #tmp values (1, 1, '1 jan 2010', 'test 1')
insert into #tmp values (2, 1, '1 jan 2011', 'test 2')
insert into #tmp values (3, 2, '1 jan 2013', 'test 3')
insert into #tmp values (4, 2, '1 jan 2012', 'test 4')
The form of data that I'm hoping to retrieve is:
foreignKeyId maxStartDate otherfield
------------ ----------------------- -------------------------------------------
1 2011-01-01 00:00:00.000 test 2
2 2013-01-01 00:00:00.000 test 3
That is, just one row per foreignKeyId showing the latest start date and associated other fields - the primaryId is irrelevant.
I've managed to come up with:
select t.foreignKeyId, t.startDate, t.otherField from #tmp t
inner join (
select foreignKeyId, max(startDate) as maxStartDate
from #tmp
group by foreignKeyId
) s
on t.foreignKeyId = s.foreignKeyId and s.maxStartDate = t.startDate
but (a) this uses inner queries, which I suspect may lead to performance issues, and (b) it gives repeated rows if two rows in the original table have the same foreignKeyId and startDate.
Is there a query that will return just the first match for each foreign key and start date?
Depending on your sql server version, try the following:
select *
from (
select *, rnum = ROW_NUMBER() over (
partition by #tmp.foreignKeyId
order by #tmp.startDate desc)
from #tmp
) t
where t.rnum = 1
If you wanted to fix your attempt as opposed to re-engineering it then
select t.foreignKeyId, t.startDate, t.otherField from #tmp t
inner join (
select foreignKeyId, max(startDate) as maxStartDate, max(PrimaryId) as Latest
from #tmp
group by foreignKeyId
) s
on t.primaryId = s.latest
would have done the job, assuming PrimaryID increases over time.
Qualms about inner query would have been laid to rest as well assuming some indexes.
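As a cross-check of what "one row per foreignKeyId" should return, here is a small Python sketch of the same selection. The tie-breaker (keep the lower primaryId when two rows share a startDate) is an assumption made for this example; the ROW_NUMBER query above leaves ties arbitrary unless you add a second ORDER BY column.

```python
from datetime import date

# (primaryId, foreignKeyId, startDate, otherfield), as in the question
rows = [
    (1, 1, date(2010, 1, 1), "test 1"),
    (2, 1, date(2011, 1, 1), "test 2"),
    (3, 2, date(2013, 1, 1), "test 3"),
    (4, 2, date(2012, 1, 1), "test 4"),
]


def latest_per_key(rows):
    """Keep one row per foreignKeyId: the one with the greatest startDate."""
    latest = {}
    for primary_id, fk, start, other in rows:
        current = latest.get(fk)
        # On equal startDate the lower primaryId wins (assumed tie-breaker).
        if current is None or (start, -primary_id) > (current[2], -current[0]):
            latest[fk] = (primary_id, fk, start, other)
    return [(fk, row[2], row[3]) for fk, row in sorted(latest.items())]


result = latest_per_key(rows)
for row in result:
    print(row)
```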

Select one row per specific time

I have a table that looks like this:
ID UserID DateTime TypeID
1 1 1/1/2010 10:00:00 1
2 2 1/1/2010 10:01:50 1
3 1 1/1/2010 10:02:50 1
4 1 1/1/2010 10:03:50 1
5 1 1/1/2010 11:00:00 1
6 2 1/1/2010 11:00:50 1
I need to query all users where their typeID is 1, but have only one row per 15 mins
For example, the result should be:
1 1 1/1/2010 10:00:00 1
2 2 1/1/2010 10:01:50 1
5 1 1/1/2010 11:00:00 1
6 2 1/1/2010 11:00:50 1
IDs 3 & 4 are not shown because 15 minutes have not passed since the last record for that userID.
IDs 1 & 5 are shown because at least 15 minutes have passed between them for that userID.
The same goes for IDs 2 & 6.
How can I do it?
Thanks
Try this:
select * from
(
select ID, UserID,
Max(DateTime) as UpperBound,
Min(DateTime) as LowerBound,
TypeID
from the_table
where TypeID=1
group by ID,UserID,TypeID
) t
where datediff(mi,LowerBound,UpperBound)>=15
EDIT: Since my attempt above was wrong, I'm adding one more approach, using a SQL table-valued function that does not require recursion, since that is understandably a big concern.
Step 1: Create a table-type as follows (LoginDate is the DateTime column in Shay's example - DateTime name conflicts with a SQL data type and I think it's wise to avoid these conflicts)
CREATE TYPE [dbo].[TVP] AS TABLE(
[ID] [int] NOT NULL,
[UserID] [int] NOT NULL,
[LoginDate] [datetime] NOT NULL,
[TypeID] [int] NOT NULL
)
GO
Step 2: Create the following Function:
CREATE FUNCTION [dbo].[fnGetLoginFreq]
(
-- notice: TVP is the type (declared above)
@TVP TVP readonly
)
RETURNS
@Table_Var TABLE
(
-- This will be our result set
ID int,
UserId int,
LoginTime datetime,
TypeID int,
RowNumber int
)
AS
BEGIN
--We will insert records in this table as we go through the rows in the
--table passed in as parameter and decide that we should add an entry because
--15' had elapsed between logins
DECLARE @temp table
(
ID int,
UserId int,
LoginTime datetime,
TypeID int
)
-- seems silly, but is not because we need to add a row_number column to help
-- in our iteration and table-valued parameters cannot be modified inside the function
insert into @Table_var
select ID,UserID,Logindate,TypeID,row_number() OVER(ORDER BY UserID,LoginDate) AS [RowNumber]
from @TVP order by UserID asc,LoginDate desc
declare @Index int,@End int,@CurrentLoginTime datetime, @NextLoginTime datetime, @CurrentUserID int , @NextUserID int
select @Index=1,@End=count(*) from @Table_var
while(@Index<=@End)
begin
select @CurrentLoginTime=LoginTime,@CurrentUserID=UserID from @Table_var where RowNumber=@Index
select @NextLoginTime=LoginTime,@NextUserID=UserID from @Table_var where RowNumber=(@Index+1)
if(@CurrentUserID=@NextUserID)
begin
if( abs(DateDiff(mi,@CurrentLoginTime,@NextLoginTime))>=15)
begin
insert into @temp
select ID,UserID,LoginTime,TypeID
from @Table_var
where RowNumber=@Index
end
END
else
BEGIN
insert into @temp
select ID,UserID,LoginTime,TypeID
from @Table_var
where RowNumber=@Index and UserID=@CurrentUserID
END
if(@Index=@End)--last element?
begin
insert into @temp
select ID,UserID,LoginTime,TypeID
from @Table_var
where RowNumber=@Index and not
abs((select datediff(mi,@CurrentLoginTime,max(LoginTime)) from @temp where UserID=@CurrentUserID))<=14
end
select @Index=@Index+1
end
delete from @Table_var
insert into @Table_var
select ID, UserID ,LoginTime ,TypeID ,row_number() OVER(ORDER BY UserID,LoginTime) AS 'RowNumber'
from @temp
return
END
Step 3: Give it a spin
declare @TVP TVP
INSERT INTO @TVP
select ID,UserId,[DateTime],TypeID from Shays_table where TypeID=1 --AND any other date restriction you want to add
select * from fnGetLoginFreq(@TVP) order by LoginTime asc
My tests returned this:
ID UserId LoginTime TypeID RowNumber
2 2 2010-01-01 10:01:50.000 1 3
4 1 2010-01-01 10:03:50.000 1 1
5 1 2010-01-01 11:00:00.000 1 2
6 2 2010-01-01 11:00:50.000 1 4
How about this? It's fairly straightforward and gives you the result you need:
SELECT ID, UserID, [DateTime], TypeID
FROM Users
WHERE Users.TypeID = 1
AND NOT EXISTS (
SELECT TOP 1 1
FROM Users AS U2
WHERE U2.ID <> Users.ID
AND U2.UserID = Users.UserID
AND U2.[DateTime] BETWEEN DATEADD(MI, -15, Users.[DateTime]) AND Users.[DateTime]
AND U2.TypeID = 1)
The NOT EXISTS restricts the results to records that have no other record within the 15 minutes before them, so you will see the first record in each block rather than one every 15 minutes.
Edit: Since you want to see one every 15mins this should do without using recursion:
SELECT Users.ID, Users.UserID, Users.[DateTime], Users.TypeID
FROM
(
SELECT MIN(ID) AS ID, UserID,
DATEADD(minute, DATEDIFF(minute,0,[DateTime]) / 15 * 15, 0) AS [DateTime]
FROM Users
GROUP BY UserID, DATEADD(minute, DATEDIFF(minute,0,[DateTime]) / 15 * 15, 0)
) AS Dates
INNER JOIN Users AS Users ON Users.ID = Dates.ID
WHERE Users.TypeID = 1
AND NOT EXISTS (
SELECT TOP 1 1
FROM
(
SELECT MIN(ID) AS ID, UserID,
DATEADD(minute, DATEDIFF(minute,0,[DateTime]) / 15 * 15, 0) AS [DateTime]
FROM Users
GROUP BY UserID, DATEADD(minute, DATEDIFF(minute,0,[DateTime]) / 15 * 15, 0)
) AS Dates2
INNER JOIN Users AS U2 ON U2.ID = Dates2.ID
WHERE U2.ID <> Users.ID
AND U2.UserID = Users.UserID
AND U2.[DateTime] BETWEEN DATEADD(MI, -15, Users.[DateTime]) AND Users.[DateTime]
AND U2.TypeID = 1
)
ORDER BY Users.DateTime
If this doesn't work please post more sample data so that I can see what is missing.
Edit 2: Same as directly above, but using a CTE for better readability and maintainability. I also marked where you would restrict the Dates CTE by whatever DateTime range you apply to the main query:
WITH Dates(ID, UserID, [DateTime])
AS
(
SELECT MIN(ID) AS ID, UserID,
DATEADD(minute, DATEDIFF(minute,0,[DateTime]) / 15 * 15, 0) AS [DateTime]
FROM Users
WHERE Users.TypeID = 1
--AND Users.[DateTime] BETWEEN @StartDateTime AND @EndDateTime
GROUP BY UserID, DATEADD(minute, DATEDIFF(minute,0,[DateTime]) / 15 * 15, 0)
)
SELECT Users.ID, Users.UserID, Users.[DateTime], Users.TypeID
FROM Dates
INNER JOIN Users ON Users.ID = Dates.ID
WHERE Users.TypeID = 1
--AND Users.[DateTime] BETWEEN @StartDateTime AND @EndDateTime
AND NOT EXISTS (
SELECT TOP 1 1
FROM Dates AS Dates2
INNER JOIN Users AS U2 ON U2.ID = Dates2.ID
WHERE U2.ID <> Users.ID
AND U2.UserID = Users.UserID
AND U2.[DateTime] BETWEEN DATEADD(MI, -15, Users.[DateTime]) AND Users.[DateTime]
AND U2.TypeID = 1
)
ORDER BY Users.DateTime
Also, as a performance note: whenever you are dealing with something that might end up being recursive, as this potentially could be (judging from the other answers), you should consider straight away whether you can restrict the main query by a date range, even if it is a whole year or longer.
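The DATEADD(minute, DATEDIFF(minute,0,[DateTime]) / 15 * 15, 0) expression in the queries above rounds a timestamp down to the start of its 15-minute bucket. A rough Python equivalent, for illustration only (the function name is made up, and it assumes the bucket width divides 60 evenly):

```python
from datetime import datetime, timedelta


def floor_to_bucket(ts, minutes=15):
    """Round ts down to the start of its bucket; assumes `minutes` divides 60."""
    whole_minutes = ts.replace(second=0, microsecond=0)
    return whole_minutes - timedelta(minutes=whole_minutes.minute % minutes)


print(floor_to_bucket(datetime(2010, 1, 1, 10, 3, 50)))
print(floor_to_bucket(datetime(2010, 1, 1, 10, 29, 59)))
```

Note that fixed buckets are not the same thing as "15 minutes since the previous row": 10:14 and 10:16 land in different buckets even though they are only two minutes apart, which is why the queries above add the NOT EXISTS check on top of the grouping.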
You can use a recursive CTE for this though I would also evaluate a cursor if the result set is at all large as it may work out more efficient.
I've left out the ID column in my answer. If you really need it it would be possible to add it. It just makes the anchor part of the recursive CTE a bit more unwieldy.
DECLARE @T TABLE
(
ID INT PRIMARY KEY,
UserID INT,
[DateTime] DateTime,
TypeID INT
)
INSERT INTO @T
SELECT 1,1,'20100101 10:00:00', 1 union all
SELECT 2,2,'20100101 10:01:50', 1 union all
SELECT 3,1,'20100101 10:02:50', 1 union all
SELECT 4,1,'20100101 10:03:50', 1 union all
SELECT 5,1,'20100101 11:00:00', 1 union all
SELECT 6,2,'20100101 11:00:50', 1;
WITH RecursiveCTE
AS (SELECT UserID,
MIN([DateTime]) As [DateTime],
1 AS TypeID
FROM @T
WHERE TypeID = 1
GROUP BY UserID
UNION ALL
SELECT UserID,
[DateTime],
TypeID
FROM (
--Can't use TOP directly
SELECT T.*,
rn = ROW_NUMBER() OVER (PARTITION BY T.UserID ORDER BY
T.[DateTime])
FROM @T T
JOIN RecursiveCTE R
ON R.UserID = T.UserID
AND T.[DateTime] >=
DATEADD(MINUTE, 15, R.[DateTime])) R
WHERE R.rn = 1)
-- A CTE must be followed by a statement that uses it
SELECT UserID, [DateTime], TypeID
FROM RecursiveCTE
ORDER BY UserID, [DateTime]
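The rule the recursive CTE implements (keep a user's first row, then keep the next row that is at least 15 minutes after the last kept one, and so on) is easier to see as a plain loop. Here is a Python sketch of it; `one_row_per_window` is a name made up for this example, and unlike the CTE it keeps the ID column.

```python
from datetime import datetime, timedelta

# (ID, UserID, DateTime, TypeID), as in the question
rows = [
    (1, 1, datetime(2010, 1, 1, 10, 0, 0), 1),
    (2, 2, datetime(2010, 1, 1, 10, 1, 50), 1),
    (3, 1, datetime(2010, 1, 1, 10, 2, 50), 1),
    (4, 1, datetime(2010, 1, 1, 10, 3, 50), 1),
    (5, 1, datetime(2010, 1, 1, 11, 0, 0), 1),
    (6, 2, datetime(2010, 1, 1, 11, 0, 50), 1),
]


def one_row_per_window(rows, window=timedelta(minutes=15), type_id=1):
    """Keep a row only if `window` has passed since that user's last kept row."""
    last_kept = {}  # UserID -> DateTime of the last row kept for that user
    kept = []
    for row in sorted(rows, key=lambda r: (r[2], r[0])):
        row_id, user_id, ts, row_type = row
        if row_type != type_id:
            continue
        previous = last_kept.get(user_id)
        if previous is None or ts - previous >= window:
            kept.append(row)
            last_kept[user_id] = ts
    return kept


kept_ids = [row[0] for row in one_row_per_window(rows)]
print(kept_ids)
```

This is the same row-by-row logic a cursor would follow, which is why a cursor may well be competitive here on larger result sets.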

How could I "auto-rotate" appended records in SQL (1 goes to 2, 2 goes 3, 3 goes to 4, 4 goes back to 1)?

I'm working on a system (ASP.NET/MSSQL/C#) for scheduling restaurant employees.
The problem I'm having is I need to "auto-rotate" the shift "InTimes" every week.
The user needs to be able to copy one day's schedule to the same day next week with all the employee shift times rotated one slot.
For example, in the table below, Monica has the 10:30am shift this Monday, so she would have the 11:00am next week, and Adam would go from 12:00pm to 10:30am.
The time between shifts is not constant, nor is the number of employees on each shift.
Any ideas on how to do this (ideally with SQL statements) would be greatly appreciated.
Please keep in mind I'm a relative novice.
RecordID EmpType Date Day Meal ShiftOrder InTime EmployeeID
1 Server 29-Aug-11 Monday Lunch 1 10:30:00 AM Monica
2 Server 29-Aug-11 Monday Lunch 2 11:00:00 AM Sofia
3 Server 29-Aug-11 Monday Lunch 3 11:30:00 AM Jenny
4 Server 29-Aug-11 Monday Lunch 4 12:00:00 PM Adam
5 Server 29-Aug-11 Monday Dinner 1 4:30:00 PM Adam
6 Server 29-Aug-11 Monday Dinner 2 4:45:00 PM Jenny
7 Server 29-Aug-11 Monday Dinner 3 5:00:00 PM Shauna
8 Server 29-Aug-11 Monday Dinner 4 5:15:00 PM Sofia
10 Server 29-Aug-11 Monday Dinner 5 5:30:00 PM Monica
Somehow an employee would need to get his last (few) shifts
SELECT TOP 3 * FROM shift WHERE EmployeeID LIKE 'monica' ORDER BY [date] DESC
Next, he or she would need to enter the time and date offset to work next week, relative to that earlier shift.
INSERT INTO shift SELECT
recordID
,[date]
,CASE
WHEN [Intime] BETWEEN '00:00' AND '10:00' THEN 'Breakfast'
WHEN [Intime] BETWEEN '10:01' AND '16:29' THEN 'Lunch'
WHEN [Intime] BETWEEN '16:30' AND '23:59' THEN 'Dinner'
END as Meal
,No_idea_how_to_generate_this AS ShiftOrder
,[Intime]
,EmployeeID
FROM (SELECT
NULL as recordID
,DATEADD(DAY, 7+@dateoffset, ls.[date]) as [date]
,CAST(DATEADD(MINUTE, @timeoffset, ls.[InTime]) AS TIME) as [Intime]
,EmployeeId
FROM Shift AS ls WHERE recordID = @recordID ) AS subselect
Here:
- @recordID is the record the employee chose as the starting point for the new appointment.
- @dateoffset is the number of days to add to the starting record
- @timeoffset is the number of minutes to add to the starting record
All the rest is determined by the row the user used as the starting point.
Here's what I came up with:
CREATE TABLE #tmp
(
[RecordID] INT ,
[EmpType] VARCHAR(20) ,
[Date] DATE ,
[Day] VARCHAR(10) ,
[Meal] VARCHAR(10) ,
[ShiftOrder] INT ,
[InTime] TIME ,
[EmployeeID] VARCHAR(50)
)
INSERT INTO [#tmp]
( [RecordID] ,
[EmpType] ,
[Date] ,
[Day] ,
[Meal] ,
[ShiftOrder] ,
[InTime] ,
[EmployeeID]
)
VALUES (1,'Server','29-Aug-11','Monday','Lunch',1,'10:30:00 AM','Monica'),
(2,'Server','29-Aug-11','Monday','Lunch',2,'11:00:00 AM','Sofia'),
(3,'Server','29-Aug-11','Monday','Lunch',3,'11:30:00 AM','Jenny'),
(4,'Server','29-Aug-11','Monday','Lunch',4,'12:00:00 PM','Adam'),
(5,'Server','29-Aug-11','Monday','Dinner',1,'4:30:00 PM','Adam'),
(6,'Server','29-Aug-11','Monday','Dinner',2,'4:45:00 PM','Jenny'),
(7,'Server','29-Aug-11','Monday','Dinner',3,'5:00:00 PM','Shauna'),
(8,'Server','29-Aug-11','Monday','Dinner',4,'5:15:00 PM','Sofia'),
(10,'Server','29-Aug-11','Monday','Dinner',5,'5:30:00 PM','Monica');
WITH CountByShift AS (SELECT *, COUNT(1) OVER (PARTITION BY EmpType, [Day], [Meal]) AS [CountByShiftByDayByEmpType]
FROM [#tmp]
),
NewShiftOrder AS (
SELECT *, ([ShiftOrder] + 1) % [CountByShiftByDayByEmpType] AS [NewShiftOrder]
FROM [CountByShift]
)
SELECT [RecordID] ,
[EmpType] ,
[Date] ,
[Day] ,
[Meal] ,
[ShiftOrder] ,
CASE WHEN [NewShiftOrder] = 0 THEN [CountByShiftByDayByEmpType] ELSE [NewShiftOrder] END AS [NewShiftOrder],
[InTime] ,
[EmployeeID]
FROM NewShiftOrder
ORDER BY [RecordID]
You need a table with all of the shifts in it:
create table dbo.Shifts (
[Day] varchar(9) not null,
Meal varchar(6) not null,
ShiftOrder integer not null,
InTime time not null,
constraint PK__dbo_Shifts primary key ([Day], Meal, ShiftOrder)
);
If that table is properly populated you can then run this to get a map of the current Day, Meal, ShiftOrder n-tuple to the next in that Day, Meal pair:
with numbers_per_shift as (
select [Day], Meal, max(ShiftOrder) as ShiftOrderCount
from dbo.Shifts s
group by [Day], Meal
)
select s.[Day], s.Meal, s.ShiftOrder,
s.ShiftOrder % n.ShiftOrderCount + 1 as NextShiftOrder
from dbo.Shifts as s
inner join numbers_per_shift as n
on s.[Day] = n.[Day]
and s.Meal = n.Meal;
For the table to be properly populated each of the shift orders would have to begin with one and increase by one with no skipping or repeating within a Day, Meal pair.
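The ShiftOrder % ShiftOrderCount + 1 expression is what does the wrap-around: every order moves up by one and the last order goes back to 1. A short Python sketch of that mapping applied to the Monday lunch shifts from the question (`rotate_shifts` is a made-up name, and it relies on the same assumption that orders run 1..n without gaps):

```python
def rotate_shifts(assignments):
    """assignments: list of (shift_order, in_time, employee) for one Day/Meal pair.

    Assumes shift orders are exactly 1..n with no gaps or repeats.
    """
    count = len(assignments)
    in_time_by_order = {order: in_time for order, in_time, _ in assignments}
    rotated = []
    for order, _, employee in assignments:
        next_order = order % count + 1  # n wraps around to 1
        rotated.append((next_order, in_time_by_order[next_order], employee))
    return sorted(rotated)


lunch = [
    (1, "10:30", "Monica"),
    (2, "11:00", "Sofia"),
    (3, "11:30", "Jenny"),
    (4, "12:00", "Adam"),
]
next_week = rotate_shifts(lunch)
for row in next_week:
    print(row)
```

This matches the example in the question: Monica moves from 10:30 to 11:00 and Adam wraps from 12:00 back to 10:30.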
Borrowing most of the #tmp table definition from @Ben Thul, assuming you have an identity field, not assuming you are storing dates and times as dates and times...this should run well over and over, copying the latest date into the following week:
CREATE TABLE #tmp
(
[RecordID] INT ,
[EmpType] VARCHAR(20) ,
[Date] VARCHAR(9) ,
[Day] VARCHAR(10) ,
[Meal] VARCHAR(10) ,
[ShiftOrder] INT ,
[InTime] VARCHAR(11) ,
[EmployeeID] VARCHAR(50)
)
INSERT INTO [#tmp]
( [RecordID] ,
[EmpType] ,
[Date] ,
[Day] ,
[Meal] ,
[ShiftOrder] ,
[InTime] ,
[EmployeeID]
)
VALUES (1,'Server','29-Aug-11','Monday','Lunch',1,'10:30:00 AM','Monica'),
(2,'Server','29-Aug-11','Monday','Lunch',2,'11:00:00 AM','Sofia'),
(3,'Server','29-Aug-11','Monday','Lunch',3,'11:30:00 AM','Jenny'),
(4,'Server','29-Aug-11','Monday','Lunch',4,'12:00:00 PM','Adam'),
(5,'Server','29-Aug-11','Monday','Dinner',1,' 4:30:00 PM','Adam'),
(6,'Server','29-Aug-11','Monday','Dinner',2,' 4:45:00 PM','Jenny'),
(7,'Server','29-Aug-11','Monday','Dinner',3,' 5:00:00 PM','Shauna'),
(8,'Server','29-Aug-11','Monday','Dinner',4,' 5:15:00 PM','Sofia'),
(10,'Server','29-Aug-11','Monday','Dinner',5,' 5:30:00 PM','Monica');
with
Shifts as (
select EmpType, [Day], Meal, ShiftOrder, InTime
from #tmp
where [Date] = (select max(cast([Date] as datetime)) from #tmp)
),
MaxShifts as (
select EmpType, [Day], Meal, max(ShiftOrder) as MaxShiftOrder
from #tmp
where [Date] = (select max(cast([Date] as datetime)) from #tmp)
group by EmpType, [Day], Meal
)
insert into #tmp (EmpType, [Date], [Day], Meal, ShiftOrder, InTime, EmployeeID)
select s.EmpType
, replace(convert(varchar(11), dateadd(dd, 7, cast(a.[Date] as datetime)), 6), ' ', '-') as [Date]
, s.Day
, s.Meal
, s.ShiftOrder
, s.InTime
, a.EmployeeID
from #tmp as a
join MaxShifts as m on a.EmpType = m.EmpType
and a.[Day] = m.[Day]
and a.Meal = m.Meal
join Shifts as s on a.EmpType = s.EmpType
and a.[Day] = s.[Day]
and a.Meal = s.Meal
and 1 + a.ShiftOrder % m.MaxShiftOrder = s.ShiftOrder
where a.[Date] = (select max(cast([Date] as datetime)) from #tmp)
I'm assuming in the answer below that the schedule is really tied to a meal and weekday.
Also I would like to note that ShiftOrder and Day columns should not be columns. Day is obviously determined by Date so it is a total waste of space (computed column OR determine it on the UI side) and ShiftOrder is determined by Date and InTime columns (probably easy to calculate in a query with RANK() function or on the UI side). That said it will make this query a bit easier :)
declare @dt date = cast('29-Aug-11' as date)
/* note: the date above may be passed from the UI, or it may be calculated with getdate() and dateadd or something like that */
INSERT INTO [table] (EmpType,Date,Day,Meal,ShiftOrder,InTime,EmployeeID)
SELECT t1.EmpType, dateadd(day, 7, t1.date), t1.day, t1.meal, t2.ShiftOrder, t2.InTime, t1.EmployeeID
FROM [table] t1
INNER JOIN [table] t2
ON (t1.Date = t2.Date
and t1.Meal = t2.Meal
and (
/* each employee (t1) takes the next slot (t2) */
t2.ShiftOrder = t1.ShiftOrder + 1
or
(
t1.ShiftOrder = (select max(shiftOrder) from [table] where meal = t1.meal and date =t1.date)
and
t2.ShiftOrder = (select min(shiftOrder) from [table] where meal = t1.meal and date =t1.date)
)
)
)
WHERE t1.Date = @dt
This is a pretty straight-forward set-oriented problem. Aggregations (count(*) and max()) and lookup tables are unnecessary. You can do it with one SQL statement.
The first step (set) is to identify those employees who simply slide down in the schedule.
The next step (set) is to identify those employees who need to "wrap around" to the head of the schedule.
Here's what I came up with:
/* Set up the temp table for demo purposes */
DROP TABLE #tmp
CREATE TABLE #tmp
(
[RecordID] INT ,
[EmpType] VARCHAR(20) ,
[Date] DATE ,
[Day] VARCHAR(10) ,
[Meal] VARCHAR(10) ,
[ShiftOrder] INT ,
[InTime] TIME,
[EmployeeID] VARCHAR(50)
)
INSERT INTO [#tmp]
( [RecordID] ,
[EmpType] ,
[Date] ,
[Day] ,
[Meal] ,
[ShiftOrder] ,
[InTime] ,
[EmployeeID]
)
VALUES (1,'Server','29-Aug-11','Monday','Lunch',1,'10:30:00 AM','Monica'),
(2,'Server','29-Aug-11','Monday','Lunch',2,'11:00:00 AM','Sofia'),
(3,'Server','29-Aug-11','Monday','Lunch',3,'11:30:00 AM','Jenny'),
(4,'Server','29-Aug-11','Monday','Lunch',4,'12:00:00 PM','Adam'),
(5,'Server','29-Aug-11','Monday','Dinner',1,' 4:30:00 PM','Adam'),
(6,'Server','29-Aug-11','Monday','Dinner',2,' 4:45:00 PM','Jenny'),
(7,'Server','29-Aug-11','Monday','Dinner',3,' 5:00:00 PM','Shauna'),
(8,'Server','29-Aug-11','Monday','Dinner',4,' 5:15:00 PM','Sofia'),
(10,'Server','29-Aug-11','Monday','Dinner',5,' 5:30:00 PM','Monica');
/* the "fills" CTE will find those employees who "wrap around" */
;WITH fills AS (
SELECT
[d2].[EmpType],
[d2].[Date],
[d2].[Day],
[d2].[Meal],
1 AS [ShiftOrder],
[d2].[InTime],
[d2].[EmployeeID]
FROM
[#tmp] d1
RIGHT OUTER JOIN
[#tmp] d2 ON
([d1].[Meal] = [d2].[Meal])
AND ([d1].[ShiftOrder] = [d2].[ShiftOrder] + 1)
WHERE
[d1].[EmployeeID] IS NULL
)
INSERT INTO [table] (EmpType,Date,Day,Meal,ShiftOrder,InTime,EmployeeID)
SELECT
[d1].[EmpType],
DATEADD(DAY, 7, [d1].[Date]) AS [Date],
DATENAME(dw,(DATEADD(DAY, 7, [d1].[Date]))) AS [Day],
[d1].[Meal],
[d1].[ShiftOrder],
[d1].[InTime],
ISNULL([d2].[EmployeeID], [f].[EmployeeID]) AS [EmployeeID]
FROM
[#tmp] d1
LEFT OUTER JOIN
[#tmp] d2 ON
([d1].[Meal] = [d2].[Meal]) AND ([d1].[ShiftOrder] = [d2].[ShiftOrder] + 1)
LEFT OUTER JOIN
[fills] f ON
([d1].[Meal] = [f].[Meal]) AND ([d1].[ShiftOrder] = [f].[ShiftOrder])
You can use a subquery (for a tutorial on subqueries, see http://www.databasejournal.com/features/mssql/article.php/3464481/Using-a-Subquery-in-a-T-SQL-Statement.htm) to get the last shift time.
After this, it's trivial addition and modular division (in case you don't know what that is, have a look at this).
Hope this helped. I'm a bit tired right now, so I can't provide you with an example.
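For what it's worth, the addition and modular division mentioned above can be sketched in a few lines. This is Python rather than T-SQL, purely to show the arithmetic. The function name `rotate_shift_order` is made up for illustration, and the sketch assumes shift positions are numbered 1..N with no gaps and that everyone slides down one slot each week, with the last person wrapping around to the head.

```python
def rotate_shift_order(order, shift_count):
    """Move a 1-based shift position down one slot; the last slot wraps to 1.

    Assumes positions run from 1 to shift_count without gaps.
    """
    return order % shift_count + 1


# Monday lunch positions taken from the sample schedule in this thread.
lunch = {"Monica": 1, "Sofia": 2, "Jenny": 3, "Adam": 4}

# Positions for the following week under the assumed rotation direction.
next_week = {
    name: rotate_shift_order(position, len(lunch))
    for name, position in lunch.items()
}
```

The same expression should carry over to T-SQL as `ShiftOrder % @n + 1`, where `@n` is a hypothetical variable holding the number of shifts for that meal.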
I've been a SQL programmer and DBA for 20 years now. With that said, business logic this complex should be in the C# part of the system. Then the TDD-built application can handle the inevitable changes and still be refactorable and correct.
My recommendation is to push back. Your response should be something along the lines of "This isn't just some look-up/fill-in-the-blank logic. This kind of complex business logic belongs in the app." It belongs in something that can be unit tested, and will be unit tested every time it's changed.
Sometimes the right answer is 'No'; this is one of those times.
How about using a pivot table for all employees and then adding shift timings as rows? Order the names based on shift for the initial day.
Something like this:
Date_time Shift_Order Monica Sofia Jenny Adam Shauna
08/29/11 1 10:30AM 11:00AM 11:30AM 12:00PM NULL
08/29/11 2 5:30PM 5:15PM 4:45PM 4:30PM 5:00PM

Find conflicted date intervals using SQL

Suppose I have the following table in SQL Server 2008:
ItemId StartDate EndDate
1 NULL 2011-01-15
2 2011-01-16 2011-01-25
3 2011-01-26 NULL
As you can see, this table has StartDate and EndDate columns. I want to validate the data in these columns: intervals cannot conflict with each other. So the table above is valid, but the next table is invalid, because the first row has an EndDate greater than the StartDate in the second row.
ItemId StartDate EndDate
1 NULL 2011-01-17
2 2011-01-16 2011-01-25
3 2011-01-26 NULL
NULL means infinity here.
Could you help me to write a script for data validation?
[The second task]
Thanks for the answers.
I have a complication. Let's assume, I have such table:
ItemId IntervalId StartDate EndDate
1 1 NULL 2011-01-15
2 1 2011-01-16 2011-01-25
3 1 2011-01-26 NULL
4 2 NULL 2011-01-17
5 2 2011-01-16 2011-01-25
6 2 2011-01-26 NULL
Here I want to validate intervals within each IntervalId group, not across the whole table. So Interval 1 will be valid, but Interval 2 will be invalid.
Also, is it possible to add a constraint to the table to prevent such invalid records?
[Final Solution]
I created a function to check whether an interval conflicts with an existing one:
CREATE FUNCTION [dbo].[fnIntervalConflict]
(
@intervalId INT,
@originalItemId INT,
@startDate DATETIME,
@endDate DATETIME
)
RETURNS BIT
AS
BEGIN
-- NULL means "unbounded": substitute the smallest and largest DATETIME values
SET @startDate = ISNULL(@startDate,'1/1/1753 12:00:00 AM')
SET @endDate = ISNULL(@endDate,'12/31/9999 11:59:59 PM')
DECLARE @conflict BIT = 0
SELECT TOP 1 @conflict = 1
FROM Items
WHERE IntervalId = @intervalId
AND ItemId <> @originalItemId
-- Two ranges overlap when each one starts on or before the other ends.
-- This also catches an existing range that fully contains the new one,
-- which a test of the existing range's endpoints alone would miss.
AND ISNULL(StartDate,'1/1/1753 12:00:00 AM') <= @endDate
AND ISNULL(EndDate,'12/31/9999 11:59:59 PM') >= @startDate
RETURN @conflict
END
And then I added 2 constraints to my table:
ALTER TABLE dbo.Items ADD CONSTRAINT
CK_Items_Dates CHECK (StartDate IS NULL OR EndDate IS NULL OR StartDate <= EndDate)
GO
and
ALTER TABLE dbo.Items ADD CONSTRAINT
CK_Items_ValidInterval CHECK (([dbo].[fnIntervalConflict]([IntervalId], ItemId,[StartDate],[EndDate])=(0)))
GO
I know the second constraint slows down insert and update operations, but that is not very important for my application.
Also, I can now call the function fnIntervalConflict from my application code before inserting or updating data in the table.
Something like this should give you all overlapping periods:
SELECT
*
FROM
mytable t1
JOIN mytable t2 ON t1.EndDate>t2.StartDate AND t1.StartDate < t2.StartDate
Edited per Adrian's comment below.
This will give you the rows that are incorrect.
I added ROW_NUMBER() because I didn't know whether all entries were in order.
-- Testdata
declare @date datetime = '2011-01-17'
;with yourTable(itemID, startDate, endDate)
as
(
SELECT 1, NULL, @date
UNION ALL
SELECT 2, dateadd(day, -1, @date), DATEADD(day, 10, @date)
UNION ALL
SELECT 3, DATEADD(day, 60, @date), NULL
)
-- End testdata
,tmp
as
(
select *
,ROW_NUMBER() OVER(order by startDate) as rowno
from yourTable
)
select *
from tmp t1
left join tmp t2
on t1.rowno = t2.rowno - 1
where t1.endDate > t2.startDate
EDIT:
As for the updated question:
Just add a PARTITION BY clause to the ROW_NUMBER() query and alter the join.
-- Testdata
declare @date datetime = '2011-01-17'
;with yourTable(itemID, startDate, endDate, intervalID)
as
(
SELECT 1, NULL, @date, 1
UNION ALL
SELECT 2, dateadd(day, 1, @date), DATEADD(day, 10, @date),1
UNION ALL
SELECT 3, DATEADD(day, 60, @date), NULL, 1
UNION ALL
SELECT 4, NULL, @date, 2
UNION ALL
SELECT 5, dateadd(day, -1, @date), DATEADD(day, 10, @date),2
UNION ALL
SELECT 6, DATEADD(day, 60, @date), NULL, 2
)
-- End testdata
,tmp
as
(
select *
,ROW_NUMBER() OVER(partition by intervalID order by startDate) as rowno
from yourTable
)
select *
from tmp t1
left join tmp t2
on t1.rowno = t2.rowno - 1
and t1.intervalID = t2.intervalID
where t1.endDate > t2.startDate
declare @T table (ItemId int, IntervalID int, StartDate datetime, EndDate datetime)
insert into @T
select 1, 1, NULL, '2011-01-15' union all
select 2, 1, '2011-01-16', '2011-01-25' union all
select 3, 1, '2011-01-26', NULL union all
select 4, 2, NULL, '2011-01-17' union all
select 5, 2, '2011-01-16', '2011-01-25' union all
select 6, 2, '2011-01-26', NULL
select T1.*
from @T as T1
inner join @T as T2
on coalesce(T1.StartDate, '1753-01-01') < coalesce(T2.EndDate, '9999-12-31') and
coalesce(T1.EndDate, '9999-12-31') > coalesce(T2.StartDate, '1753-01-01') and
T1.IntervalID = T2.IntervalID and
T1.ItemId <> T2.ItemId
Result:
ItemId IntervalID StartDate EndDate
----------- ----------- ----------------------- -----------------------
5 2 2011-01-16 00:00:00.000 2011-01-25 00:00:00.000
4 2 NULL 2011-01-17 00:00:00.000
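To make the overlap test in that join easier to reason about, here is a rough Python sketch of the same predicate, run against the same six sample rows. It is only an illustration, not a replacement for the query: the `overlaps` helper is a made-up name, and `date.min` / `date.max` stand in for the '1753-01-01' and '9999-12-31' sentinels.

```python
from datetime import date
from itertools import combinations


def overlaps(a_start, a_end, b_start, b_end):
    """Strict overlap test; None on either side means an unbounded interval."""
    a_start = a_start or date.min
    b_start = b_start or date.min
    a_end = a_end or date.max
    b_end = b_end or date.max
    return a_start < b_end and a_end > b_start


# (ItemId, IntervalID, StartDate, EndDate), same values as the sample table.
rows = [
    (1, 1, None, date(2011, 1, 15)),
    (2, 1, date(2011, 1, 16), date(2011, 1, 25)),
    (3, 1, date(2011, 1, 26), None),
    (4, 2, None, date(2011, 1, 17)),
    (5, 2, date(2011, 1, 16), date(2011, 1, 25)),
    (6, 2, date(2011, 1, 26), None),
]

# Compare every pair of rows that share an IntervalID.
conflicted = set()
for first, second in combinations(rows, 2):
    if first[1] == second[1] and overlaps(first[2], first[3], second[2], second[3]):
        conflicted.update((first[0], second[0]))

conflicted_ids = sorted(conflicted)
```

Because the comparison is symmetric, a single test per pair is enough here; the SQL join reports both rows of a conflicting pair because it visits each pair in both directions.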
Not directly related to the OP, but since Adrian's expressed an interest: here's a table that SQL Server maintains the integrity of, ensuring that only one valid value is present at any time. In this case, I'm dealing with a current/history table, but the example can be modified to work with future data also (although in that case, you can't have the indexed view, and you need to write the merges directly, rather than maintaining through triggers).
In this particular case, I'm dealing with a link table that I want to track the history of. First, the tables that we're linking:
create table dbo.Clients (
ClientID int IDENTITY(1,1) not null,
Name varchar(50) not null,
/* Other columns */
constraint PK_Clients PRIMARY KEY (ClientID)
)
go
create table dbo.DataItems (
DataItemID int IDENTITY(1,1) not null,
Name varchar(50) not null,
/* Other columns */
constraint PK_DataItems PRIMARY KEY (DataItemID),
constraint UQ_DataItem_Names UNIQUE (Name)
)
go
Now, if we were building a normal table, we'd have the following (Don't run this one):
create table dbo.ClientAnswers (
ClientID int not null,
DataItemID int not null,
IntValue int not null,
Comment varchar(max) null,
constraint PK_ClientAnswers PRIMARY KEY (ClientID,DataItemID),
constraint FK_ClientAnswers_Clients FOREIGN KEY (ClientID) references dbo.Clients (ClientID),
constraint FK_ClientAnswers_DataItems FOREIGN KEY (DataItemID) references dbo.DataItems (DataItemID)
)
But, we want a table that can represent a complete history. In particular, we want to design the structure such that overlapping time periods can never appear in the database. We always know which record was valid at any particular time:
create table dbo.ClientAnswerHistories (
ClientID int not null,
DataItemID int not null,
IntValue int null,
Comment varchar(max) null,
/* Temporal columns */
Deleted bit not null,
ValidFrom datetime2 null,
ValidTo datetime2 null,
constraint UQ_ClientAnswerHistories_ValidFrom UNIQUE (ClientID,DataItemID,ValidFrom),
constraint UQ_ClientAnswerHistories_ValidTo UNIQUE (ClientID,DataItemID,ValidTo),
constraint CK_ClientAnswerHistories_NoTimeTravel CHECK (ValidFrom < ValidTo),
constraint FK_ClientAnswerHistories_Clients FOREIGN KEY (ClientID) references dbo.Clients (ClientID),
constraint FK_ClientAnswerHistories_DataItems FOREIGN KEY (DataItemID) references dbo.DataItems (DataItemID),
constraint FK_ClientAnswerHistories_Prev FOREIGN KEY (ClientID,DataItemID,ValidFrom)
references dbo.ClientAnswerHistories (ClientID,DataItemID,ValidTo),
constraint FK_ClientAnswerHistories_Next FOREIGN KEY (ClientID,DataItemID,ValidTo)
references dbo.ClientAnswerHistories (ClientID,DataItemID,ValidFrom),
constraint CK_ClientAnswerHistory_DeletionNull CHECK (
Deleted = 0 or
(
IntValue is null and
Comment is null
)),
constraint CK_ClientAnswerHistory_IntValueNotNull CHECK (Deleted=1 or IntValue is not null)
)
go
That's a lot of constraints. The only way to maintain this table is through merge statements (see examples below, and try to reason about why yourself). We're now going to build a view that mimics the ClientAnswers table defined above:
create view dbo.ClientAnswers
with schemabinding
as
select
ClientID,
DataItemID,
ISNULL(IntValue,0) as IntValue,
Comment
from
dbo.ClientAnswerHistories
where
Deleted = 0 and
ValidTo is null
go
create unique clustered index PK_ClientAnswers on dbo.ClientAnswers (ClientID,DataItemID)
go
And we have the PK constraint we originally wanted. We've also used ISNULL to reinstate the not null-ness of the IntValue column (even though the check constraints already guarantee this, SQL Server is unable to derive this information). If we're working with an ORM, we let it target ClientAnswers, and the history gets automatically built. Next, we can have a function that lets us look back in time:
create function dbo.ClientAnswers_At (
@At datetime2
)
returns table
with schemabinding
as
return (
select
ClientID,
DataItemID,
ISNULL(IntValue,0) as IntValue,
Comment
from
dbo.ClientAnswerHistories
where
Deleted = 0 and
(ValidFrom is null or ValidFrom <= @At) and
(ValidTo is null or ValidTo > @At)
)
go
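The ValidFrom/ValidTo comparison in that function treats each row as valid from ValidFrom up to, but not including, ValidTo. A small Python sketch of the same filter may make the boundary behaviour clearer. The two history rows are made-up sample values, `answers_at` is a hypothetical name, and the Deleted flag is left out to keep it short.

```python
from datetime import datetime

# (ClientID, DataItemID, IntValue, ValidFrom, ValidTo); None means open-ended.
# Made-up sample: the answer changed from 10 to 20 on 2011-01-10.
history = [
    (1, 1, 10, None, datetime(2011, 1, 10)),
    (1, 1, 20, datetime(2011, 1, 10), None),
]


def answers_at(rows, at):
    """Return the rows whose [ValidFrom, ValidTo) range contains `at`."""
    return [
        row for row in rows
        if (row[3] is None or row[3] <= at)
        and (row[4] is None or row[4] > at)
    ]


before = answers_at(history, datetime(2011, 1, 9))
on_boundary = answers_at(history, datetime(2011, 1, 10))
```

At the exact instant of the change only the newer row matches, so a lookup never sees two values for the same ClientID and DataItemID.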
And finally, we need the triggers on ClientAnswers that build this history. We need to use merge statements, since we need to simultaneously insert new rows, and update the previous "valid" row to end date it with a new ValidTo value.
create trigger T_ClientAnswers_I
on dbo.ClientAnswers
instead of insert
as
set nocount on
;with Dup as (
select i.ClientID,i.DataItemID,i.IntValue,i.Comment,CASE WHEN cah.ClientID is not null THEN 1 ELSE 0 END as PrevDeleted,t.Dupl
from
inserted i
left join
dbo.ClientAnswerHistories cah
on
i.ClientID = cah.ClientID and
i.DataItemID = cah.DataItemID and
cah.ValidTo is null and
cah.Deleted = 1
cross join
(select 0 union all select 1) t(Dupl)
)
merge into dbo.ClientAnswerHistories cah
using Dup on cah.ClientID = Dup.ClientID and cah.DataItemID = Dup.DataItemID and cah.ValidTo is null and Dup.Dupl = 0 and Dup.PrevDeleted = 1
when matched then update set ValidTo = SYSDATETIME()
when not matched and Dup.Dupl=1 then insert (ClientID,DataItemID,IntValue,Comment,Deleted,ValidFrom)
values (Dup.ClientID,Dup.DataItemID,Dup.IntValue,Dup.Comment,0,CASE WHEN Dup.PrevDeleted=1 THEN SYSDATETIME() END);
go
create trigger T_ClientAnswers_U
on dbo.ClientAnswers
instead of update
as
set nocount on
;with Dup as (
select i.ClientID,i.DataItemID,i.IntValue,i.Comment,t.Dupl
from
inserted i
cross join
(select 0 union all select 1) t(Dupl)
)
merge into dbo.ClientAnswerHistories cah
using Dup on cah.ClientID = Dup.ClientID and cah.DataItemID = Dup.DataItemID and cah.ValidTo is null and Dup.Dupl = 0
when matched then update set ValidTo = SYSDATETIME()
when not matched then insert (ClientID,DataItemID,IntValue,Comment,Deleted,ValidFrom)
values (Dup.ClientID,Dup.DataItemID,Dup.IntValue,Dup.Comment,0,SYSDATETIME());
go
create trigger T_ClientAnswers_D
on dbo.ClientAnswers
instead of delete
as
set nocount on
;with Dup as (
select d.ClientID,d.DataItemID,t.Dupl
from
deleted d
cross join
(select 0 union all select 1) t(Dupl)
)
merge into dbo.ClientAnswerHistories cah
using Dup on cah.ClientID = Dup.ClientID and cah.DataItemID = Dup.DataItemID and cah.ValidTo is null and Dup.Dupl = 0
when matched then update set ValidTo = SYSDATETIME()
when not matched then insert (ClientID,DataItemID,Deleted,ValidFrom)
values (Dup.ClientID,Dup.DataItemID,1,SYSDATETIME());
go
Obviously, I could have built a simpler table (not a join table), but this is my standard go-to example (although it took me a while to reconstruct it; I forgot the set nocount on statements for a while). But the strength here is that the base table, ClientAnswerHistories, is incapable of storing overlapping time ranges for the same ClientID and DataItemID values.
Things get more complex when you need to deal with temporal foreign keys.
Of course, if you don't want any real gaps, then you can remove the Deleted column (and associated checks), make the not null columns really not null, modify the insert trigger to do a plain insert, and make the delete trigger raise an error instead.
I've always taken a slightly different approach to the design if I have data that is never to have overlapping intervals... namely don't store intervals, but only start times. Then, have a view that helps with displaying the intervals.
CREATE TABLE intervalStarts
(
ItemId int,
IntervalId int,
StartDate datetime
)
GO
CREATE VIEW intervals
AS
with cte as (
select ItemId, IntervalId, StartDate,
row_number() over(partition by IntervalId order by isnull(StartDate,'1753-01-01')) row
from intervalStarts
)
select c1.ItemId, c1.IntervalId, c1.StartDate,
dateadd(dd,-1,c2.StartDate) as 'EndDate'
from cte c1
left join cte c2 on c1.IntervalId=c2.IntervalId
and c1.row=c2.row-1
So, sample data might look like:
INSERT INTO intervalStarts
select 1, 1, null union
select 2, 1, '2011-01-16' union
select 3, 1, '2011-01-26' union
select 4, 2, null union
select 5, 2, '2011-01-26' union
select 6, 2, '2011-01-14'
and a simple SELECT * FROM intervals yields:
ItemId | IntervalId | StartDate | EndDate
1 | 1 | null | 2011-01-15
2 | 1 | 2011-01-16 | 2011-01-25
3 | 1 | 2011-01-26 | null
4 | 2 | null | 2011-01-13
6 | 2 | 2011-01-14 | 2011-01-25
5 | 2 | 2011-01-26 | null
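The idea behind the view (each interval ends the day before the next one in the same group starts) can also be sketched outside the database. The Python below is only an illustration of that derivation using the same sample rows. `derive_intervals` is a made-up name, and it assumes at most one NULL start per IntervalId.

```python
from datetime import date, timedelta
from itertools import groupby

# (ItemId, IntervalId, StartDate); None means "from the beginning of time".
starts = [
    (1, 1, None),
    (2, 1, date(2011, 1, 16)),
    (3, 1, date(2011, 1, 26)),
    (4, 2, None),
    (5, 2, date(2011, 1, 26)),
    (6, 2, date(2011, 1, 14)),
]


def derive_intervals(rows):
    """Attach an EndDate to each row: the day before the next start in its group."""
    # Sort by group, then by start date, with None sorting first.
    ordered = sorted(rows, key=lambda r: (r[1], r[2] or date.min))
    result = []
    for _, group in groupby(ordered, key=lambda r: r[1]):
        members = list(group)
        # Pair each row with its successor; the last row has no successor.
        for current, following in zip(members, members[1:] + [None]):
            end = following[2] - timedelta(days=1) if following else None
            result.append((current[0], current[1], current[2], end))
    return result


intervals = derive_intervals(starts)
```

Since the end dates are computed rather than stored, there is nothing to validate: two intervals in the same group cannot overlap by construction.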