How to design database table for working hours? - sql

Hi guys, I'm trying to help my friend design database tables for a system that tracks workers' hours in a factory by reading card info from card readers. Each time a worker logs in or out, a record is saved.
My problem is: how can I calculate each worker's working time (in minutes) for each workday? A worker may work from 8:00 to 20:00, or from 20:00 to 8:00 the next day.
Can anyone help me?
Thanks!
You guys did give me a lot of help.
The previous design was a table in which each row was either an in-record or an out-record. It was hard for me to work out which ones belonged to the same work span. I now use another table in which each record holds both the in-time and the out-time: an insert saves the in-time and an update saves the out-time, which makes it easy to calculate the total minutes between the two.

SELECT datediff(hh,'2011-08-30 04:47','2011-08-30 05:48') as [Hour(s) Worked]
Hour(s) Worked
--------------
1

A simple example with two tables:
[TblUsers]
User_id PK
FirstName
LastName
[TblSchedule]
Schedule_id PK
User_id FK
Date_From
Date_To
To get a daily work grid with times, you can write something like:
SELECT
u.FirstName + ' ' + u.LastName AS [username],
-- strip the time part; a shift that crosses midnight is counted on its start date
CAST(FLOOR(CAST(s.Date_From AS float)) AS datetime) AS [date],
SUM(DATEDIFF(minute, s.Date_From, s.Date_To)) AS [workMinutes]
FROM
[TblSchedule] s
INNER JOIN [TblUsers] u ON s.User_id = u.User_id
GROUP BY
u.FirstName + ' ' + u.LastName,
CAST(FLOOR(CAST(s.Date_From AS float)) AS datetime)
ORDER BY
[date], [username];

Just calculate the minutes between each IN record and the following OUT record for the same worker. If you want it for a whole day, fetch the relevant records and sum up the differences.
The more complex part is when a worker forgets to stamp. Your program has to be prepared for such cases.
Also be aware of things like daylight saving time. Time calculations can be very complicated.
I think I would do the calculation at the application level and not in SQL in this case.
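As a rough illustration of doing this at the application level, here is a minimal Python sketch that pairs each IN stamp with the next OUT stamp of the same worker and sets aside stamps that cannot be paired. The tuple layout and the name `minutes_per_shift` are assumptions made for this example, and the naive datetimes deliberately ignore daylight saving time.

```python
from datetime import datetime

def minutes_per_shift(stamps):
    """stamps: iterable of (worker_id, kind, timestamp), kind is 'IN' or 'OUT'.

    Returns (shifts, problems): shifts are (worker_id, in_time, minutes),
    problems are stamps that had no partner (forgotten IN or OUT).
    """
    shifts, problems = [], []
    open_in = {}  # worker_id -> IN timestamp still waiting for its OUT
    for worker, kind, ts in sorted(stamps, key=lambda s: s[2]):
        if kind == "IN":
            if worker in open_in:  # the previous OUT was never stamped
                problems.append((worker, "IN", open_in[worker]))
            open_in[worker] = ts
        else:
            start = open_in.pop(worker, None)
            if start is None:      # OUT without a preceding IN
                problems.append((worker, "OUT", ts))
            else:
                minutes = int((ts - start).total_seconds() // 60)
                shifts.append((worker, start, minutes))
    # anything left over is a worker who is still clocked in
    problems.extend((w, "IN", ts) for w, ts in open_in.items())
    return shifts, problems

stamps = [
    (1, "IN",  datetime(2011, 8, 30, 20, 0)),
    (1, "OUT", datetime(2011, 8, 31, 8, 0)),   # night shift crossing midnight
    (2, "IN",  datetime(2011, 8, 30, 8, 0)),
    (2, "IN",  datetime(2011, 8, 31, 8, 0)),   # worker 2 forgot to stamp out
    (2, "OUT", datetime(2011, 8, 31, 20, 0)),
]
shifts, problems = minutes_per_shift(stamps)
print(shifts)    # two 720-minute shifts
print(problems)  # the unmatched IN stamp of worker 2
```

Because the subtraction works on full timestamps, a shift from 20:00 to 8:00 the next day needs no special handling.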

DATEDIFF can give you some surprising results.
For example, take these two DATETIME2 values (I presume you have SQL Server 2008), which are 5 minutes apart:
SELECT DATEDIFF(hh,'2011-01-01 04:59:00','2011-01-01 05:04:00')
Results
-----------
1
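The result is somewhat strange: 1 hour. Strange, because the difference in minutes is 5 but the difference in hours is 1, and we know that 1 hour = 60 minutes. The reason is that DATEDIFF counts the datepart boundaries crossed between the two values (here the 05:00 hour boundary), not the elapsed time. Please read this article to see the explanations.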
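To make the boundary-counting behaviour concrete, here is a small Python sketch that imitates it by truncating both values to the unit before subtracting. The function names are made up for this illustration; it is an emulation of the documented behaviour, not SQL Server's implementation.

```python
from datetime import datetime

def datediff_hour(start, end):
    """Hour boundaries crossed between start and end, like DATEDIFF(hh, start, end)."""
    s = start.replace(minute=0, second=0, microsecond=0)
    e = end.replace(minute=0, second=0, microsecond=0)
    return int((e - s).total_seconds() // 3600)

def datediff_minute(start, end):
    """Minute boundaries crossed, like DATEDIFF(mi, start, end)."""
    s = start.replace(second=0, microsecond=0)
    e = end.replace(second=0, microsecond=0)
    return int((e - s).total_seconds() // 60)

def elapsed_minutes(start, end):
    """Whole minutes that actually elapsed."""
    return int((end - start).total_seconds() // 60)

a, b = datetime(2011, 1, 1, 4, 59), datetime(2011, 1, 1, 5, 4)
print(datediff_hour(a, b))     # 1: the 05:00 boundary was crossed
print(datediff_minute(a, b))   # 5

c, d = datetime(2011, 1, 1, 8, 0, 59), datetime(2011, 1, 1, 16, 0, 5)
print(datediff_minute(c, d))   # 480 minute boundaries
print(elapsed_minutes(c, d))   # 479 whole minutes elapsed
```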
Solutions:
1) Instead of DATEDIFF(hh,...) use DATEDIFF(mi,...)
Ex:
SELECT DATEDIFF(mi,'2011-01-01 07:55:00','2011-01-01 16:02:00') [Minutes]
,DATEDIFF(mi,'2011-01-01 07:55:00','2011-01-01 16:02:00')/60 [Hours]
--8 hours
,DATEDIFF(mi,'2011-01-01 07:55:00','2011-01-01 16:02:00')%60 [Additional minute]
--7 minute
But:
SELECT DATEDIFF(mi,'2011-01-01 08:00:59','2011-01-01 16:00:05') [Minutes]
--480
,DATEDIFF(ss,'2011-01-01 08:00:59','2011-01-01 16:00:05')/60 [Seconds/60]
--479
2) Instead of using DATEDIFF function (with DATETIME[2][OFFSET] data types) use DATETIME values with the - operator:
DECLARE @Test TABLE
(
TestId INT IDENTITY(1,1) PRIMARY KEY
,[Enter] DATETIME NOT NULL
,[Exit] DATETIME NOT NULL
);
INSERT @Test
VALUES ('2011-01-01 07:55:00','2011-01-01 16:02:02')
,('2011-01-01 08:00:59','2011-01-01 16:00:05');
SELECT *
,t.[Exit] - t.[Enter] AS MyDateDiff
,DATEPART(hh,t.[Exit] - t.[Enter]) [Hours]
,DATEPART(mi,t.[Exit] - t.[Enter]) [Additional minutes]
,DATEPART(ss,t.[Exit] - t.[Enter]) [Additional seconds]
FROM @Test t
Results:
TestId Enter Exit MyDateDiff Hours Additional minute Additional seconds
----------- ----------------------- ----------------------- ----------------------- ----------- ----------------- ------------------
1 2011-01-01 07:55:00.000 2011-01-01 16:02:02.000 1900-01-01 08:07:02.000 8 7 2
2 2011-01-01 08:00:59.000 2011-01-01 16:00:05.000 1900-01-01 07:59:06.000 7 59 6

Related

Calculating working time with overlapping events (SQL)

I have found similar queries on StackOverflow (e.g. Finding simultaneous events in a database between times) but nothing that matches exactly what I am after as far as I can tell so thought it OK to add as a new question.
I have a table that logs jobs (or "Activities"), with a start/end time for the job. I need to calculate working time (you can disregard non-working days, break times etc. as I have that covered). The complication is an individual can work on simultaneous jobs, overlapping at different points (the assumption is equal effort on simultaneous jobs), and the working time needs to reflect that. Minute accuracy is all that is required, not to the second.
Based on other suggestions I have this query, implemented as a table-valued function. It will look at each minute that activity is running, and if any other activities are running in the same period for the same person, and make calculations based on that. It works, but is very inefficient - taking over a minute to execute. Any ideas how I can do this more efficiently?
Running SQL 2005. I have done the obvious such as to add indexes on foreign keys by the way.
CREATE FUNCTION [dbo].[WorkActivity_WorkTimeCalculations] (@StartDate smalldatetime, @EndDate smalldatetime)
RETURNS @retActivity TABLE
(
ActivityID bigint PRIMARY KEY NOT NULL,
WorkMins decimal NOT NULL
)
/********************************************************************
Summary: Calculates the WORKING time on each activity running in a given date/time range
Remarks: Takes into account staff working simultaneously on jobs
(evenly distributes working time across simultaneous jobs)
Input Params: @StartDate - the start of the period to calculate
@EndDate - the end of the period to calculate
Output Params:
Returns: Recordset of activities and associated working time (minutes)
********************************************************************/
AS
BEGIN
-- any work activities still running use the overall end date as the activity's end date for the purpose of calculating
-- simultaneous jobs running
-- POPULATE A TEMP TABLE WITH EVERY MINUTE IN THE DATE RANGE
DECLARE @Minutes TABLE (MinuteDateTime smalldatetime NOT NULL)
;WITH cte AS (
SELECT @StartDate AS myDate
UNION ALL
SELECT DATEADD(minute,1,myDate)
FROM cte
WHERE DATEADD(minute,1,myDate) <= @EndDate
)
INSERT INTO @Minutes (MinuteDateTime)
SELECT myDate FROM cte
OPTION (MAXRECURSION 0)
-- POPULATE A TEMP TABLE WITH WORKLOAD PER EMPLOYEE PER MINUTE
DECLARE @JobsRunningByStaff TABLE (StaffID smallint NOT NULL, MinuteDateTime smalldatetime NOT NULL, JobsRunning decimal NOT NULL)
INSERT INTO @JobsRunningByStaff (StaffID, MinuteDateTime, JobsRunning)
SELECT wka_StaffID, MinuteDateTime, COUNT(DISTINCT wka_ItemID) JobsRunning
FROM dbo.WorkActivities
INNER JOIN @Minutes ON (MinuteDateTime BETWEEN wka_StartTime AND DATEADD(minute,-1,ISNULL(wka_EndTime,@EndDate)))
GROUP BY wka_StaffID, MinuteDateTime
-- FINALLY MAKE THE CALCULATIONS FOR EACH ACTIVITY
INSERT INTO @retActivity
SELECT wka_ActivityID, SUM(1/JobsRunning) WorkMins
FROM dbo.WorkActivities
INNER JOIN @JobsRunningByStaff ON (wka_StaffID = StaffID AND MinuteDateTime BETWEEN wka_StartTime AND DATEADD(minute,-1,ISNULL(wka_EndTime,@EndDate)))
GROUP BY wka_ActivityID
RETURN
END
Some example data (sorry for the poor formatting!)...
Source Data from WorkActivities table:
ACTIVITY ID | START TIME | END TIME | STAFF ID
1 | 03/03/2016 10:30 | 03/03/2016 10:50 | 1
2 | 03/03/2016 10:40 | 03/03/2016 11:00 | 1
And the desired results for a function call of SELECT * FROM dbo.WorkActivity_WorkTimeCalculations ('03-Mar-2016 10:30','03-Mar-2016 11:30'):
ACTIVITY ID | WORKMINS
1 | 15
2 | 15
So, the results take into account that between 10:40 and 10:50 two jobs are happening simultaneously, so 5 mins of working time is counted on each over that period (plus the 10 mins each job runs on its own).
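The equal-effort rule can be checked without generating a row per minute. The Python sketch below, with a hypothetical `split_work_minutes` helper, cuts each staff member's timeline at every start and end time and divides each slice evenly among the activities running in it. It is a simplification that assumes every activity has an end time and ignores the reporting date range.

```python
from collections import defaultdict
from datetime import datetime

def split_work_minutes(activities):
    """activities: list of (activity_id, staff_id, start, end), end exclusive.

    Returns {activity_id: minutes}, sharing overlapping time equally
    between a staff member's simultaneous activities.
    """
    by_staff = defaultdict(list)
    for act in activities:
        by_staff[act[1]].append(act)

    minutes = defaultdict(float)
    for acts in by_staff.values():
        # every start/end time is a point where the set of running jobs can change
        bounds = sorted({t for _, _, s, e in acts for t in (s, e)})
        for lo, hi in zip(bounds, bounds[1:]):
            running = [a for a, _, s, e in acts if s <= lo and hi <= e]
            if running:
                share = (hi - lo).total_seconds() / 60 / len(running)
                for a in running:
                    minutes[a] += share
    return dict(minutes)

sample = [
    (1, 1, datetime(2016, 3, 3, 10, 30), datetime(2016, 3, 3, 10, 50)),
    (2, 1, datetime(2016, 3, 3, 10, 40), datetime(2016, 3, 3, 11, 0)),
]
print(split_work_minutes(sample))  # {1: 15.0, 2: 15.0}
```

For the two sample activities each one gets 10 minutes alone plus half of the shared 10 minutes, which is 15 minutes apiece.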
As suggested by posters, indexing made a significant difference - creating an index with wka_StartTime and wka_EndTime sorted it.
(sorry, couldn't see how to mark the comments made by others as an answer!)

Join Table variable with a Table - SQL

This question might look simple and repeated, but since I am a beginner in SQL, I am stuck on this problem.
I have created a table variable to store hour range in a 24 hr format. Here is the code
DECLARE @TIMERANGE TABLE ([TIME] NVARCHAR(MAX))
;with hrs (time)
AS
(
SELECT 0
UNION ALL
SELECT time+1
FROM hrs WHERE time<23
)
INSERT INTO @TIMERANGE select
RIGHT ('0000' + CONVERT(VARCHAR, time), 4) + '-' + RIGHT('0000' + CONVERT(VARCHAR, time + 1), 4) AS [TIME]
from hrs
output for this table is:
TIME
0000-0001
0001-0002
0002-0003
0003-0004
0004-0005
0005-0006
0006-0007
0007-0008
0008-0009
0009-0010
0010-0011
0011-0012
0012-0013
0013-0014
0014-0015
0015-0016
0016-0017
0017-0018
0018-0019
0019-0020
0020-0021
0021-0022
0022-0023
0023-0024
I want to join this with my real table on a specific condition:
Id Date Time Score
1 2008-01-01 00:05 15
2 2008-01-01 00:15 20
3 2008-01-02 10:15 05
4 2008-01-02 11:00 55
I want to find the sum of scores in each time range. E.g., 00:15 falls in time range 0000-0001.
Desired output is:
Time Range Score
0000-0001 35
........ ..
........ ..
Please help.
I hope I understood the requirements. I see why you did the CTE: I've done that to support graphs, so that every hour in the day is represented, with or without resulting data.
I refactored the query to produce the following:
declare @tmp TABLE (MyDate DATE, MyTime TIME,Score INT)
INSERT INTO @tmp VALUES('2008-01-01','00:05',15),
('2008-01-01','00:15',20),
('2008-01-02','10:15',05),
('2008-01-02','11:00',55)
SELECT SUM(Score) Score,datepart(hour,MyTime) TimeRange FROM @tmp Group By datepart(hour,MyTime)
The result will show the SUM (or average or whatever you need) by hour. If you still need to graph the result, then join back into your CTE on the hour component of the time.
Hope this helps.
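For comparison, the same grouping can be sketched in Python: sum the scores by the hour component, then emit all 24 range labels so that empty hours still appear. The label format copies the question's `0000-0001` style, and `hourly_totals` is a name invented for this example.

```python
from collections import defaultdict
from datetime import time

def hourly_totals(rows):
    """rows: iterable of (date_text, time_of_day, score).

    Returns {range_label: total_score} for all 24 hourly ranges.
    """
    by_hour = defaultdict(int)
    for _date, t, score in rows:
        by_hour[t.hour] += score  # same key as DATEPART(hour, MyTime)
    # one label per hour so empty ranges are reported as 0
    return {f"{h:04d}-{h + 1:04d}": by_hour.get(h, 0) for h in range(24)}

rows = [
    ("2008-01-01", time(0, 5), 15),
    ("2008-01-01", time(0, 15), 20),
    ("2008-01-02", time(10, 15), 5),
    ("2008-01-02", time(11, 0), 55),
]
totals = hourly_totals(rows)
print(totals["0000-0001"])  # 35
print(totals["0010-0011"])  # 5
```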

Find total time worked with multiple jobs / orders with overlap / overlapping times on each worker and job / order

I searched night and day back when I was first starting out in the sql world for an answer to this question. Could not find anything similar to this for my needs so I decided to ask and answer my own question in case others need help like I did.
Here is an example of the data I have. For simplicity, it is all from the Job table. Each JobID has its own start and end time; these are basically random and can overlap, have gaps, start and end at the same time as other jobs, etc.
--Available--
JobID WorkerID JobStart JobEnd
1 25 '2012-11-17 16:00' '2012-11-17 17:00'
2 25 '2012-11-18 16:00' '2012-11-18 16:50'
3 25 '2012-11-19 18:00' '2012-11-19 18:30'
4 25 '2012-11-19 17:30' '2012-11-19 18:10'
5 26 '2012-11-18 16:00' '2012-11-18 17:10'
6 26 '2012-11-19 16:00' '2012-11-19 16:50'
What I wanted the result of the query to show would be:
WorkerID TotalTime(in Mins)
25 170
26 120
EDIT: Forgot to mention that the overlaps need to be ignored. Basically this is supposed to treat these workers and their jobs like you would an hourly employee and not a contractor. Like if I worked two jobIDs and started and finished them both from 12:00pm to 12:30pm, as an employee I would only get paid for 30 mins, whereas a contractor would likely get paid 60 mins, since their jobs are treated individually and get paid per job. The point of this query is to analyze jobs in a database that are tied to a worker, and need to find out if that worker was treated as an employee, what would his total hours worked in a given set of time come out to be.
EDIT2: won't let me answer my own question for 7 hours, will move it there later.
Ok, Answering Question now. Basically, I use temp table to build each minute between the min and max datetime of the jobs I am looking up.
IF OBJECT_ID('tempdb..#time') IS NOT NULL
BEGIN
drop table #time
END
DECLARE @FromDate AS DATETIME,
@ToDate AS DATETIME,
@Current AS DATETIME
SET @FromDate = '2012-11-17 16:00'
SET @ToDate = '2012-11-19 18:30'
create table #time (cte_start_date datetime)
set @current = @FromDate
while (@current < @ToDate)
begin
insert into #time (cte_start_date)
values (@current)
set @current = DATEADD(n, 1, @current)
end
Now I have all the mins in a temp table. Now I need to join all the Job table info into it and select out what I need in one go.
SELECT J.WorkerID
,COUNT(DISTINCT t.cte_start_date) AS TotalTime
FROM #time AS t
INNER JOIN Job AS J ON t.cte_start_date >= J.JobStart AND t.cte_start_date < J.JobEnd --Thanks ErikE
GROUP BY J.WorkerID --Thanks Martin Parkin
drop table #time
That is the very simplified answer and is good to get someone started.
This query does the job as well. Its performance is very good (while the execution plan looks not so great, the actual CPU and IO beat many other queries).
See it working in a Sql Fiddle.
WITH Times AS (
SELECT DISTINCT
H.WorkerID,
T.Boundary
FROM
dbo.JobHistory H
CROSS APPLY (VALUES (H.JobStart), (H.JobEnd)) T (Boundary)
), Groups AS (
SELECT
WorkerID,
T.Boundary,
Grp = Row_Number() OVER (PARTITION BY T.WorkerID ORDER BY T.Boundary) / 2
FROM
Times T
CROSS JOIN (VALUES (1), (1)) X (Dup)
), Boundaries AS (
SELECT
G.WorkerID,
TimeStart = Min(Boundary),
TimeEnd = Max(Boundary)
FROM
Groups G
GROUP BY
G.WorkerID,
G.Grp
HAVING
Count(*) = 2
)
SELECT
B.WorkerID,
WorkedMinutes = Sum(DateDiff(minute, 0, B.TimeEnd - B.TimeStart))
FROM
Boundaries B
WHERE
EXISTS (
SELECT *
FROM dbo.JobHistory H
WHERE
B.WorkerID = H.WorkerID
AND B.TimeStart < H.JobEnd
AND B.TimeEnd > H.JobStart
)
GROUP BY
WorkerID
;
With a clustered index on WorkerID, JobStart, JobEnd, JobID, and with the 7 sample rows from the above fiddle used as a template for new worker/job data, repeated enough times to yield a table with 14,336 rows, here are the performance results. I've included the other working/correct answers on the page (so far):
Author CPU Elapsed Reads Scans
------ --- ------- ------ -----
Erik 157 166 122 2
Gordon 375 378 106964 53251
I did a more exhaustive test from a different (slower) server (where each query was run 25 times, the best and worst values for each metric were thrown out, and the remaining 23 values were averaged) and got the following:
Query CPU Duration Reads Notes
-------- ---- -------- ------ ----------------------------------
Erik 1 215 231 122 query as above
Erik 2 326 379 116 alternate technique with no EXISTS
Gordon 1 578 682 106847 from j
Gordon 2 584 673 106847 from dbo.JobHistory
I thought the alternate technique was sure to improve things. Well, it saved 6 reads, but cost a lot more CPU (which makes sense). Instead of carrying the start/end statistics of each timeslice through to the end, it is best to just recalculate which slices to keep with the EXISTS against the original data. It may be that a different profile, such as few workers with many jobs, would change the performance statistics for the different queries.
In case anyone wants to try it, use the CREATE TABLE and INSERT statements from my fiddle and then run this 11 times:
INSERT dbo.JobHistory
SELECT
H.JobID + A.MaxJobID,
H.WorkerID + A.WorkerCount,
DateAdd(minute, Elapsed + 45, JobStart),
DateAdd(minute, Elapsed + 45, JobEnd)
FROM
dbo.JobHistory H
CROSS JOIN (
SELECT
MaxJobID = Max(JobID),
WorkerCount = Max(WorkerID) - Min(WorkerID) + 1,
Elapsed = DateDiff(minute, Min(JobStart), Min(JobEnd))
FROM dbo.JobHistory
) A
;
I built two other solutions to this query but the best one with about double the performance had a fatal flaw (not correctly handling fully enclosed time ranges). The other had very high/bad statistics (which I knew but had to try).
Explanation
Using all the endpoint times from each row, build up a distinct list of all possible time ranges of interest by duplicating each endpoint time and then grouping in such a way as to pair each time with the next possible time. Sum the elapsed minutes of these ranges wherever they coincide with any actual worker's working time.
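The same idea can be sketched outside SQL. The Python below is an illustration of the technique rather than a translation of the query: it sorts each worker's distinct endpoints, pairs each endpoint with the next one, and adds up the slices that fall inside at least one job. The function name `worked_minutes` is an assumption for this example.

```python
from collections import defaultdict
from datetime import datetime

def worked_minutes(jobs):
    """jobs: list of (worker_id, start, end). Overlapping time counts once."""
    by_worker = defaultdict(list)
    for worker, start, end in jobs:
        by_worker[worker].append((start, end))

    totals = {}
    for worker, spans in by_worker.items():
        # distinct endpoints, each paired with the next possible time
        bounds = sorted({t for span in spans for t in span})
        total = 0
        for lo, hi in zip(bounds, bounds[1:]):
            # keep the slice only if some job covers it (the EXISTS test)
            if any(s < hi and e > lo for s, e in spans):
                total += int((hi - lo).total_seconds() // 60)
        totals[worker] = total
    return totals

D = datetime
jobs = [
    (25, D(2012, 11, 17, 16, 0), D(2012, 11, 17, 17, 0)),
    (25, D(2012, 11, 18, 16, 0), D(2012, 11, 18, 16, 50)),
    (25, D(2012, 11, 19, 18, 0), D(2012, 11, 19, 18, 30)),
    (25, D(2012, 11, 19, 17, 30), D(2012, 11, 19, 18, 10)),
    (26, D(2012, 11, 18, 16, 0), D(2012, 11, 18, 17, 10)),
    (26, D(2012, 11, 19, 16, 0), D(2012, 11, 19, 16, 50)),
]
print(worked_minutes(jobs))  # {25: 170, 26: 120}
```

On the sample rows from the question this gives 170 minutes for worker 25 and 120 for worker 26, since the two overlapping jobs on 2012-11-19 merge into a single 60-minute span.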
A query such as the following should provide the answer you are looking for:
SELECT WorkerID,
SUM(DATEDIFF(minute, JobStart, JobEnd)) AS TotalTime
FROM Job
GROUP BY WorkerID
Apologies that it is untested (I have no SQL Server to test it here) but it should do the trick.
This is a complicated query. Explanation follows.
with j as (
select j.*,
(select 1
from jobs j2
where j2.workerid = j.workerid and
j2.starttime < j.endtime and
j2.starttime > j.starttime
) as HasOverlap
from jobs j
)
select workerId,
sum(datediff(minute, periodStart, PeriodEnd)) as NumMinutes
from (select workerId, min(startTime) as periodStart, max(endTime) as PeriodEnd
from (select j.*,
(select min(starttime)
from j j2
where j2.workerid = j.workerid and
j2.starttime >= j.starttime and
j2.HasOverlap is null
) as thegroup
from j
) j
group by workerId, thegroup
) j
group by workerId;
The key to understanding this approach is to understand the "overlap" logic. One time period overlaps with the next when the next start time is before the previous end time. By assigning an overlap flag to each record, we know if it overlaps with the "next" record. The above logic is using the start time for this. It might be better to use the JobId, especially if two jobs for the same worker could start at the same time.
The calculation of the overlap flag uses a correlated subquery (this is j in the with clause).
Then, for each record we go back and find the first record afterwards where the overlap value is NULL. This provides a grouping key for all records in a given overlap set.
The rest, then, is just to aggregate the results, first at the workerId/group level and then at the workerId level to get the final results.
I have not run this SQL, so it might have syntax errors.

Find rows in a database with no time in a datetime column

During testing I have failed to notice an incorrect date/time entry into the database on certain orders. Instead of entering the date and time I have only been entering the date. I was using the correct time stamp createodbcdatetime(now()) however I was using cfsqltype="cf_sql_date" to enter it into the database.
I am lucky enough to have the order date/time correctly recorded, meaning I can use the time from the order date/time field.
My question is: can I filter for all rows in the table that have only a date entered? My data is below:
Table Name: tbl_orders
uid_orders dte_order_stamp
2000 02/07/2012 03:02:52
2001 03/07/2012 01:24:21
2002 03/07/2012 08:34:00
Table Name: tbl_payments
uid_payment dte_pay_paydate uid_pay_orderid
1234 02/07/2012 03:02:52 2000
1235 03/07/2012 2001
1236 03/07/2012 2002
I need to be able to select all payments with no time entered from tbl_payments. I can then loop over the results, grabbing the time from my order table, adding it to the date from my payment table and updating the field with the new date/time.
I can pretty much handle re-inserting the date/time. It's just selecting the rows with no time that I'm not sure about.
Any help would be appreciated.
The following are the select statements for both orders and payments, and for the two joined, in case they are needed (just FYI).
SQL Server 2008, ColdFusion 9
SELECT
dbo.tbl_orders.uid_orders,
dbo.tbl_orders.dte_order_stamp,
dbo.tbl_payment.dte_pay_paydate,
dbo.tbl_payment.uid_pay_orderid
FROM
dbo.tbl_orders
INNER JOIN dbo.tbl_payment ON (dbo.tbl_orders.uid_orders = dbo.tbl_payment.uid_pay_orderid)
SELECT
dbo.tbl_orders.uid_orders,
dbo.tbl_orders.dte_order_stamp
FROM dbo.tbl_orders
SELECT
uid_paymentid,
uid_pay_orderid,
dte_pay_paydate
FROM
dbo.tbl_payment
Select the records where the hours, minutes, seconds and millisecond value is zero.
select *
from table
where datePart(hour, datecolumn) = 0
and datePart(minute, datecolumn) = 0
and datePart(second, datecolumn) = 0
and datePart(millisecond, datecolumn) = 0
You can probably get those values by casting to time and checking for 0:
SELECT * FROM table WHERE CAST(datetimecolumn AS TIME) = '00:00'
That may not be particularly efficient though, depending on how smart SQL Server's indexes are.
Something like this should work:
....
WHERE CAST(CONVERT(VARCHAR, dbo.tbl_payment.dte_pay_paydate, 101) AS DATETIME) =
dbo.tbl_payment.dte_pay_paydate
This will return all rows where the time is missing.
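All three answers rest on the same test: a value stored without a time has a time-of-day of exactly midnight. A minimal Python sketch of that check, using the sample payment rows and assuming the dates are in dd/mm/yyyy order (the helper name `has_no_time` is made up here):

```python
from datetime import datetime, time

def has_no_time(stamp):
    """True when the time-of-day part is exactly 00:00:00.000."""
    return stamp.time() == time(0, 0)

payments = [
    (1234, datetime(2012, 7, 2, 3, 2, 52)),
    (1235, datetime(2012, 7, 3)),  # date only, so the time defaults to midnight
    (1236, datetime(2012, 7, 3)),
]
missing = [pid for pid, paid in payments if has_no_time(paid)]
print(missing)  # [1235, 1236]
```

Note that a payment genuinely made at exactly midnight would also match, which is a limitation shared by the SQL versions.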

How do I analyse time periods between records in SQL data without cursors?

The root problem: I have an application which has been running for several months now. Users have been reporting that it's been slowing down over time (so in May it was quicker than it is now). I need to get some evidence to support or refute this claim. I'm not interested in precise numbers (so I don't need to know that a login took 10 seconds), I'm interested in trends - that something which used to take x seconds now takes of the order of y seconds.
The data I have is an audit table which stores a single row each time the user carries out any activity - it includes a primary key, the user id, a date time stamp and an activity code:
create table AuditData (
AuditRecordID int identity(1,1) not null,
DateTimeStamp datetime not null,
DateOnly datetime null,
UserID nvarchar(10) not null,
ActivityCode int not null)
(Notes: DateOnly (datetime) is the DateTimeStamp with the time stripped off to make group by for daily analysis easier - it's effectively duplicate data to make querying faster.
Also, for the sake of ease you can assume that the ID is assigned in date time order, that is 1 will always be before 2 which will always be before 3 - if this isn't true I can make it so.)
ActivityCode is an integer identifying the activity which took place, for instance 1 might be user logged in, 2 might be user data returned, 3 might be search results returned and so on.
Sample data for those who like that sort of thing...:
1, 01/01/2009 12:39, 01/01/2009, P123, 1
2, 01/01/2009 12:40, 01/01/2009, P123, 2
3, 01/01/2009 12:47, 01/01/2009, P123, 3
4, 01/01/2009 13:01, 01/01/2009, P123, 3
User data is returned (Activity Code 2) immediately after login (Activity Code 1), so this can be used as a rough benchmark of how long the login takes (as I said, I'm interested in trends, so as long as I'm measuring the same thing for May as for July it doesn't matter so much if this isn't the whole login process - it takes in enough of it to give a rough idea).
(Note: User data can also be returned under other circumstances so it's not a one to one mapping).
So what I'm looking to do is select the average time between login (say ActivityCode 1) and the first instance after that, for that user on that day, of user data being returned (say ActivityCode 2).
I can do this by going through the table with a cursor, getting each login instance and then, for each one, doing a select to get the minimum user data return following it for that user on that day, but that's obviously not optimal and is slow as hell.
My question is (finally) - is there a "proper" SQL way of doing this using self joins or similar, without using cursors or some similar procedural approach? I can create views and whatever to my heart's content; it doesn't have to be a single select.
I can hack something together but I'd like to make the analysis I'm doing a standard product function so would like it to be right.
SELECT TheDay, AVG(TimeTaken) AvgTimeTaken
FROM (
SELECT
CONVERT(DATE, logins.DateTimeStamp) TheDay
, DATEDIFF(SS, logins.DateTimeStamp,
(SELECT TOP 1 userinfo.DateTimeStamp
FROM AuditData userinfo
WHERE userinfo.UserID=logins.UserID
and userinfo.ActivityCode=2
and userinfo.DateTimeStamp > logins.DateTimeStamp
ORDER BY userinfo.DateTimeStamp )
)TimeTaken
FROM AuditData logins
WHERE
logins.ActivityCode = 1
) LogInTimes
GROUP BY TheDay
This might be dead slow in real world though.
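If the analysis can run outside the database, the pairing itself is cheap once each user's "data returned" timestamps are sorted. The Python sketch below is only an illustration of that logic: for every login it takes the first code 2 row strictly after it for the same user on the same day, then averages per day. The row layout and the name `average_login_seconds` are assumptions for this example.

```python
from bisect import bisect_right
from collections import defaultdict
from datetime import datetime

def average_login_seconds(audit_rows):
    """audit_rows: list of (user_id, timestamp, activity_code).

    Returns {date: average seconds from login (1) to next user data (2)}.
    """
    data_back = defaultdict(list)  # user -> timestamps of activity code 2
    for user, ts, code in audit_rows:
        if code == 2:
            data_back[user].append(ts)
    for stamps in data_back.values():
        stamps.sort()

    per_day = defaultdict(list)
    for user, ts, code in audit_rows:
        if code != 1:
            continue
        stamps = data_back.get(user, [])
        i = bisect_right(stamps, ts)  # first code 2 strictly after the login
        if i < len(stamps) and stamps[i].date() == ts.date():
            per_day[ts.date()].append((stamps[i] - ts).total_seconds())
    return {day: sum(v) / len(v) for day, v in per_day.items()}

rows = [
    ("P123", datetime(2009, 1, 1, 12, 39), 1),
    ("P123", datetime(2009, 1, 1, 12, 40), 2),
    ("P123", datetime(2009, 1, 1, 12, 47), 3),
    ("P123", datetime(2009, 1, 1, 13, 1), 3),
]
print(average_login_seconds(rows))  # one entry for 2009-01-01 with 60.0 seconds
```

Logins with no matching code 2 row on the same day are simply left out of the average.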
In Oracle this would be a cinch, because of analytic functions. In this case, LAG() makes it easy to find the matching pairs of activity codes 1 and 2 and also to calculate the trend. As you can see, things got worse on 2nd JAN and improved quite a bit on the 3rd (I'm working in seconds rather than minutes).
SQL> select DateOnly
, elapsed_time
, elapsed_time - lag (elapsed_time) over (order by DateOnly) as trend
from
(
select DateOnly
, avg(databack_time - prior_login_time) as elapsed_time
from
( select DateOnly
, databack_time
, ActivityCode
, lag(login_time) over (order by DateOnly,UserID, AuditRecordID, ActivityCode) as prior_login_time
from
(
select a1.AuditRecordID
, a1.DateOnly
, a1.UserID
, a1.ActivityCode
, to_number(to_char(a1.DateTimeStamp, 'SSSSS')) as login_time
, 0 as databack_time
from AuditData a1
where a1.ActivityCode = 1
union all
select a2.AuditRecordID
, a2.DateOnly
, a2.UserID
, a2.ActivityCode
, 0 as login_time
, to_number(to_char(a2.DateTimeStamp, 'SSSSS')) as databack_time
from AuditData a2
where a2.ActivityCode = 2
)
)
where ActivityCode = 2
group by DateOnly
)
/
DATEONLY ELAPSED_TIME TREND
--------- ------------ ----------
01-JAN-09 120
02-JAN-09 600 480
03-JAN-09 150 -450
SQL>
Like I said in my comment I guess you're working in MSSQL. I don't know whether that product has any equivalent of LAG().
If the assumptions are that:
Users will perform various tasks in no mandated order, and
That the difference between any two activities reflects the time it takes for the first of those two activities to execute,
Then why not create a table with two timestamps, the first column containing the activity start time, the second column containing the next activity start time. Thus the difference between these two will always be total time of the first activity. So for the logout activity, you would just have NULL for the second column.
So it would be kind of weird and interesting: for each activity (other than logging in and logging out), the time stamp would be recorded in two different rows, once for the previous activity (as the time "completed") and again in a new row (as the time started). You would end up with a Jacob's ladder of sorts, but finding the data you are after would be much simpler.
In fact, to get really wacky, you could have each row have the time that the user started activity A and the activity code, and the time started activity B and the time stamp (which, as mentioned above, gets put down again for the following row). This way each row will tell you the exact difference in time for any two activities.
Otherwise, you're stuck with a query that says something like
SELECT TIME_IN_SEC(row2-timestamp) - TIME_IN_SEC(row1-timestamp)
which would be pretty slow, as you have already suggested. By swallowing the redundancy, you end up just querying the difference between the two columns. You probably would have less need of knowing the user info as well, since you'd know that any row shows both activity codes, thus you can just query the average for all users on any given day and compare it to the next day (unless you are trying to find out which users are having the problem as well).
This is a faster query: each row ends up holding both its own datetime value and that of the row before it, after which you can use DATEDIFF ( datepart , startdate , enddate ). I use @DammyVariable and DammyField because, as I remember, there is a problem if the first assignment in the UPDATE statement is not of the form @variable = Field.
SELECT *, Cast(NULL AS DateTime) LastRowDateTime, Cast(NULL As INT) DammyField INTO #T FROM AuditData
GO
CREATE CLUSTERED INDEX IX_T ON #T (AuditRecordID)
GO
DECLARE @LastRowDateTime DateTime
DECLARE @DammyVariable INT
SET @LastRowDateTime = NULL
SET @DammyVariable = 1
UPDATE #T SET
@DammyVariable = DammyField = @DammyVariable
, LastRowDateTime = @LastRowDateTime
, @LastRowDateTime = DateTimeStamp
option (maxdop 1)