I have a data set containing a customer id, a join time, and a leave time. For each date, I want to count the customers present in each hour.
Here is a sample data set:
My expected output is listed after the query below.
Here is the code I tried. I first build the 24 hourly time spans, then join them to the log table and aggregate to get the expected result. This works for the current date, but I need it to work dynamically for any date.
select logdate as date,timespan,count(customer_id)
from
(
SELECT userid,cast(joinTime as date) as logdate,customer_id
,starttime,endtime,timespan
FROM login_out_logs AS logTable
left join
(select '00:00:00 - 01:00:00' timespan,DATEadd(hh,0,cast(dateadd(dd,-1,getdate()))) starttime,dateadd(hh,1,cast(dateadd(dd,-1,getdate()))) endtime
union
select '01:00:00 - 02:00:00', dateadd(hh,1,cast(dateadd(dd,-1,getdate()))),dateadd(hh,2,cast(dateadd(dd,-1,getdate())))
union
select '02:00:00 - 03:00:00', dateadd(hh,2,cast(dateadd(dd,-1,getdate()))),dateadd(hh,3,cast(dateadd(dd,-1,getdate())))
union
select '03:00:00 - 04:00:00', dateadd(hh,3,cast(dateadd(dd,-1,getdate()))),dateadd(hh,4,cast(dateadd(dd,-1,getdate())))
union
select '04:00:00 - 05:00:00', dateadd(hh,4,cast(dateadd(dd,-1,getdate()))),dateadd(hh,5,cast(dateadd(dd,-1,getdate())))
union
select '05:00:00 - 06:00:00',dateadd(hh,5,cast(dateadd(dd,-1,getdate()))),dateadd(hh,6,cast(dateadd(dd,-1,getdate())))
union
select '06:00:00 - 07:00:00',dateadd(hh,6,cast(dateadd(dd,-1,getdate()))),dateadd(hh,7,cast(dateadd(dd,-1,getdate())))
union
select '07:00:00 - 08:00:00',dateadd(hh,7,cast(dateadd(dd,-1,getdate()))),dateadd(hh,8,cast(dateadd(dd,-1,getdate())))
union
select '08:00:00 - 09:00:00',dateadd(hh,8,cast(dateadd(dd,-1,getdate()))),dateadd(hh,9,cast(dateadd(dd,-1,getdate())))
union
select '09:00:00 - 10:00:00',dateadd(hh,9,cast(dateadd(dd,-1,getdate()))),dateadd(hh,10,cast(dateadd(dd,-1,getdate())))
union
select '10:00:00 - 11:00:00',dateadd(hh,10,cast(dateadd(dd,-1,getdate()))),dateadd(hh,11,cast(dateadd(dd,-1,getdate())))
union
select '11:00:00 - 12:00:00',dateadd(hh,11,cast(dateadd(dd,-1,getdate()))),dateadd(hh,12,cast(dateadd(dd,-1,getdate())))
union
select '12:00:00 - 13:00:00',dateadd(hh,12,cast(dateadd(dd,-1,getdate()))),dateadd(hh,13,cast(dateadd(dd,-1,getdate())))
union
select '13:00:00 - 14:00:00',dateadd(hh,13,cast(dateadd(dd,-1,getdate()))),dateadd(hh,14,cast(dateadd(dd,-1,getdate())))
union
select '14:00:00 - 15:00:00',dateadd(hh,14,cast(dateadd(dd,-1,getdate()))),dateadd(hh,15,cast(dateadd(dd,-1,getdate())))
union
select '15:00:00 - 16:00:00',dateadd(hh,15,cast(dateadd(dd,-1,getdate()))),dateadd(hh,16,cast(dateadd(dd,-1,getdate())))
union
select '16:00:00 - 17:00:00',dateadd(hh,16,cast(dateadd(dd,-1,getdate()))),dateadd(hh,17,cast(dateadd(dd,-1,getdate())))
union
select '17:00:00 - 18:00:00',dateadd(hh,17,cast(dateadd(dd,-1,getdate()))),dateadd(hh,18,cast(dateadd(dd,-1,getdate())))
union
select '18:00:00 - 19:00:00',dateadd(hh,18,cast(dateadd(dd,-1,getdate()))),dateadd(hh,19,cast(dateadd(dd,-1,getdate())))
union
select '19:00:00 - 20:00:00',dateadd(hh,19,cast(dateadd(dd,-1,getdate()))),dateadd(hh,20,cast(dateadd(dd,-1,getdate())))
union
select '20:00:00 - 21:00:00',dateadd(hh,20,cast(dateadd(dd,-1,getdate()))),dateadd(hh,21,cast(dateadd(dd,-1,getdate())))
union
select '21:00:00 - 22:00:00',dateadd(hh,21,cast(dateadd(dd,-1,getdate()))),dateadd(hh,22,cast(dateadd(dd,-1,getdate())))
union
select '22:00:00 - 23:00:00',dateadd(hh,22,cast(dateadd(dd,-1,getdate()))),dateadd(hh,23,cast(dateadd(dd,-1,getdate())))
union
select '23:00:00 - 00:00:00',dateadd(hh,23,cast(dateadd(dd,-1,getdate()))),dateadd(hh,23,dateadd(mi,59,cast(dateadd(dd,-1,getdate())))))a
on starttime between jointime and leaveTime
or endtime between jointime and leaveTime
or jointime>=starttime and jointime<endtime
) as T
group by logdate,timespan
Date Hour customer_count
2018-01-01 8-9 1
2018-01-01 9-10 1
2018-01-01 10-11 1
2018-01-01 11-12 1
2018-01-01 12-13 1
2018-01-01 13-14 1
2018-01-01 14-15 1
2018-01-01 15-16 1
2018-01-01 16-17 1
2018-01-01 17-18 1
2018-01-01 18-19 1
2018-01-01 19-20 1
2018-01-01 20-21 2
2018-01-01 21-22 3
2018-01-01 22-23 2
2018-01-01 23-00 1
Here is an approach that may already solve your problem. I designed it to work with any number of days between join and leave. However, I can't say anything about performance on larger sets: I tested with your example only, and evaluating all relevant hours might take noticeably longer on bigger data sets.
I used a recursive CTE to enumerate all hours between join and leave, and then grouped by date and hour:
DECLARE @Cust TABLE(
customer_id INT,
joinTime DATETIME,
leaveTime DATETIME
)
INSERT INTO @Cust VALUES
(536, '2018-01-01 08:05:00', '2018-01-01 18:31:00'),
(344, '2018-01-01 19:37:00', '2018-01-01 20:16:00'),
(344, '2018-01-01 19:49:00', '2018-01-01 20:00:00'),
(899, '2018-01-01 20:49:00', '2018-01-01 21:14:00'),
(2336, '2018-01-01 21:02:00', '2018-01-01 21:03:00'),
(335, '2018-01-01 21:03:00', '2018-01-01 23:43:00'),
(2336, '2018-01-01 21:03:00', '2018-01-02 00:06:00'),
(899, '2018-01-01 21:18:00', '2018-01-01 22:24:00'),
(345, '2018-01-01 21:21:00', '2018-01-01 21:39:00'),
(345, '2018-01-01 21:53:00', '2018-01-02 00:13:00');
;WITH cte AS(
SELECT c.customer_id,
c.joinTime,
c.leaveTime,
c.joinTime x
FROM @Cust c
UNION ALL
SELECT c.customer_id,
c.joinTime,
c.leaveTime,
DATEADD(HOUR, 1, x) x
FROM cte c
WHERE DATEADD(HOUR, 1, x) <= CASE WHEN DATEPART(MINUTE, x) < DATEPART(MINUTE, c.leaveTime) THEN c.leaveTime ELSE DATEADD(HOUR, 1, c.leaveTime) END
)
SELECT CONVERT(DATE, x) AS cDate, DATEPART(HOUR, x) AS cHour, COUNT(*) AS cCount
FROM cte
GROUP BY CONVERT(DATE, x), DATEPART(HOUR, x)
ORDER BY 1,2
OPTION (MAXRECURSION 0)
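If you want to sanity-check the numbers outside the database, the same idea (expand each session into the hours it touches, then group by date and hour) can be sketched in Python. This is only an illustration under assumptions: the function name `hourly_counts` is made up, the two sample sessions are invented, and it counts sessions rather than distinct customers.

```python
from collections import Counter
from datetime import datetime, timedelta

def hourly_counts(sessions):
    """Count sessions per (date, hour) for every hour a session touches.

    `sessions` is an iterable of (customer_id, join_time, leave_time).
    """
    counts = Counter()
    for _customer_id, join_time, leave_time in sessions:
        # Start at the top of the hour that contains join_time.
        slot = join_time.replace(minute=0, second=0, microsecond=0)
        while slot <= leave_time:
            counts[(slot.date(), slot.hour)] += 1
            slot += timedelta(hours=1)
    return counts

# Invented sample sessions; the second one crosses midnight.
sessions = [
    (536, datetime(2018, 1, 1, 8, 5), datetime(2018, 1, 1, 10, 31)),
    (345, datetime(2018, 1, 1, 23, 53), datetime(2018, 1, 2, 0, 13)),
]
result = hourly_counts(sessions)
for (day, hour), n in sorted(result.items()):
    print(day, f"{hour}-{(hour + 1) % 24}", n)
```

Because the loop walks hour by hour from join to leave, a session that crosses midnight simply lands in buckets of both dates.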
Try this:
;WITH hourlist(starthour) AS (
SELECT 0 -- Seed Row
UNION ALL
SELECT starthour + 1 -- Recursion
FROM hourlist
where starthour+1<=23
)
SELECT
day
,convert(nvarchar,starthour)+'-'+convert(nvarchar,case when starthour+1=24 then 0 else starthour+1 end) hourtitle
,count(distinct customer_id) 'customer count'
FROM
hourlist h -- list of all hours
cross join
(
select distinct dateadd(day,datediff(day,0, joinTime),0) from #login_out_logs
union
select distinct dateadd(day,datediff(day,0,leaveTime),0) from #login_out_logs
)q10(day) -- list of all days of jointime and leavetime
inner join #login_out_logs l on -- log counts for a specific day/hour if it starts before the hour ends and ends at or after the hour starts
l.joinTime <dateadd(hour,starthour+1,q10.day)
and
l.leaveTime>=dateadd(hour,starthour ,q10.day)
group by day,starthour
order by day,starthour
Note: this only works when joinTime and leaveTime fall on the same day or on adjacent days, not when they are 2 or more days apart.
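The join condition in that query is the usual interval-overlap test: a session touches an hour slot when it starts before the slot ends and ends at or after the slot starts. A minimal Python sketch of that test, counting distinct customers per slot, might look like this. The helper name `distinct_customers_in_slot` is hypothetical, and the three sessions are taken from the sample rows used earlier in this thread.

```python
from datetime import datetime, timedelta

def distinct_customers_in_slot(sessions, slot_start):
    """Return the number of distinct customers present during a one-hour slot."""
    slot_end = slot_start + timedelta(hours=1)
    present = {
        customer_id
        for customer_id, join_time, leave_time in sessions
        # Same test as the SQL join: joinTime < slot end AND leaveTime >= slot start.
        if join_time < slot_end and leave_time >= slot_start
    }
    return len(present)

sessions = [
    (344, datetime(2018, 1, 1, 19, 37), datetime(2018, 1, 1, 20, 16)),
    (344, datetime(2018, 1, 1, 19, 49), datetime(2018, 1, 1, 20, 0)),
    (899, datetime(2018, 1, 1, 20, 49), datetime(2018, 1, 1, 21, 14)),
]
print(distinct_customers_in_slot(sessions, datetime(2018, 1, 1, 19)))  # only customer 344
print(distinct_customers_in_slot(sessions, datetime(2018, 1, 1, 20)))  # customers 344 and 899
```

Using a set mirrors `count(distinct customer_id)`: customer 344 has two sessions in the 19-20 slot but is counted once.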
I have a log with fingerprint timestamps as follows:
Usr TimeStamp
-------------------------
1 2015-07-01 08:01:00
2 2015-07-01 08:05:00
3 2015-07-01 08:07:00
1 2015-07-01 10:05:00
3 2015-07-01 11:00:00
1 2015-07-01 12:01:00
2 2015-07-01 13:03:00
2 2015-07-01 14:02:00
1 2015-07-01 16:03:00
2 2015-07-01 18:04:00
I would like an output of workers per hour (rounding to the nearest hour).
The theoretical output should be:
7:00 0
8:00 3
9:00 3
10:00 2
11:00 1
12:00 2
13:00 1
14:00 2
15:00 2
16:00 1
17:00 1
18:00 0
19:00 0
Can anyone think of how to approach this in SQL or, if there is no other way, in T-SQL?
Edit: The timestamps are logins and logouts of the different users. So at 8am 3 users logged in and the same 3 are still working at 9am. One of them leaves at 10am. etc
To start with, you can use DATEPART to get the hour of each timestamp, as follows, and then group by that hour:
SELECT DATEPART(HOUR, GETDATE());
SELECT Convert(varchar(5),DATEPART(HOUR, timestamp)) + ':00' as time,
count(usr) as users
from tbl
group by DATEPART(HOUR, timestamp)
You need a datetime hour table to do this.
Note: This is just an example showing how the query should work for one day. Replace the CTE with a datetime hour table in which every date starts with the 07:00:00 hour and ends with the 19:00:00 hour.
When you want to do this for more than one day, you may have to include Cast(dt.date_time AS DATE) in the select and group by to tell which day each hour belongs to.
WITH datetime_table
AS (SELECT '2015-07-01 07:00:00' AS date_time
UNION ALL
SELECT '2015-07-01 08:00:00'
UNION ALL
SELECT '2015-07-01 09:00:00'
UNION ALL
SELECT '2015-07-01 10:00:00'
UNION ALL
SELECT '2015-07-01 11:00:00'
UNION ALL
SELECT '2015-07-01 12:00:00'
UNION ALL
SELECT '2015-07-01 13:00:00'
UNION ALL
SELECT '2015-07-01 14:00:00'
UNION ALL
SELECT '2015-07-01 15:00:00'
UNION ALL
SELECT '2015-07-01 16:00:00'
UNION ALL
SELECT '2015-07-01 17:00:00'
UNION ALL
SELECT '2015-07-01 18:00:00'
UNION ALL
SELECT '2015-07-01 19:00:00')
SELECT Datepart(hour, dt.date_time),
Hour_count=Count(t.id)
FROM datetime_table dt
LEFT OUTER JOIN Yourtable t
ON Cast(t.dates AS DATE) = Cast(dt.date_time AS DATE)
AND Datepart(hour, t.dates) =
Datepart(hour, dt.date_time)
GROUP BY Datepart(hour, dt.date_time)
You just need to group by hour and date. Check the query below; I hope it helps:
Create table #t1
(
usr int,
timelog datetime
)
Insert into #t1 values(1, '2015-07-01 08:01:00')
Insert into #t1 values(2, '2015-07-01 08:05:00')
Insert into #t1 values(3, '2015-07-01 08:07:00')
Insert into #t1 values(1, '2015-07-01 10:05:00')
Insert into #t1 values(3, '2015-07-01 11:00:00')
Insert into #t1 values(1, '2015-07-01 12:01:00')
Insert into #t1 values(2, '2015-07-01 13:03:00')
Insert into #t1 values(2, '2015-07-01 14:02:00')
Insert into #t1 values(1, '2015-07-01 16:03:00')
Insert into #t1 values(2, '2015-07-01 18:04:00')
Select cast(timelog as varchar(11)) as LogDate, Datepart(hour, timelog) as LogTime, count(usr) as UserCount from #t1
Group by Datepart(hour, timelog), cast(timelog as varchar(11))
The harder part is creating the zeros where data is missing. The usual approach is to generate a list of all possible "slots" and then do an outer join to the actual data. I'm assuming that you only want to run this for a single day at a time.
My approach, which is just an example, works because it does a cross join of two tables with 6 and 4 rows respectively and 6 times 4 is 24.
select f1.d * 6 + f0.d as hr, coalesce(data.cnt, 0) as cnt
from
(
select 0 as d union all select 1 union all select 2 union all
select 3 union all select 4 union all select 5
) as f0
cross join -- explicit cross join so that f0 is visible in the ON clause below
(
select 0 as d union all select 1 union all
select 2 union all select 3
) as f1
left outer join
(
select
datepart(hh, TimeStamp) as hr, -- keep the hour as an integer so it compares with the slot number
count(*) as cnt
from LOG
group by datepart(hh, TimeStamp)
) as data
on data.hr = f1.d * 6 + f0.d
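The slot-list idea can be illustrated in Python as well: build all 24 hour slots from the 6 x 4 cross product, then look each slot up in the grouped data and fall back to zero when nothing is there. The sample timestamps and the variable names are assumptions made for this sketch.

```python
from collections import Counter
from datetime import datetime
from itertools import product

# Invented sample log entries.
timestamps = [
    datetime(2015, 7, 1, 8, 1),
    datetime(2015, 7, 1, 8, 5),
    datetime(2015, 7, 1, 10, 5),
]

# Equivalent of the grouped subquery: number of rows per hour of day.
per_hour = Counter(ts.hour for ts in timestamps)

# Equivalent of the cross join: f1 in 0..3 and f0 in 0..5 give 4 * 6 = 24 slots.
slots = sorted(f1 * 6 + f0 for f1, f0 in product(range(4), range(6)))

# Equivalent of the outer join with COALESCE: hours without data become 0.
filled = [(hour, per_hour.get(hour, 0)) for hour in slots]
print(filled[7:11])  # [(7, 0), (8, 2), (9, 0), (10, 1)]
```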
First you need to round the time to the nearest hour:
DATEADD(HOUR, DATEDIFF(HOUR, 0, DATEADD(MI, 30, TimeStamp)), 0)
As you can see, we first add 30 minutes to the original time (DATEADD(MI, 30, TimeStamp)) and then truncate the result to the hour.
This approach rounds 08:04 down to 08:00 and 07:58 up to 08:00.
I assume some workers may start working a little bit early.
SELECT DATEADD(HOUR, DATEDIFF(HOUR, 0, DATEADD(MI, 30, TimeStamp)), 0) As FingertipTime
FROM Fingertips
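To see what the rounding expression does, here is the same "add 30 minutes, then truncate to the hour" step as a small Python sketch. The function name `round_to_nearest_hour` is made up for illustration.

```python
from datetime import datetime, timedelta

def round_to_nearest_hour(ts):
    """Add 30 minutes, then drop everything below the hour."""
    shifted = ts + timedelta(minutes=30)
    return shifted.replace(minute=0, second=0, microsecond=0)

print(round_to_nearest_hour(datetime(2015, 7, 1, 8, 4)))   # 2015-07-01 08:00:00
print(round_to_nearest_hour(datetime(2015, 7, 1, 7, 58)))  # 2015-07-01 08:00:00
print(round_to_nearest_hour(datetime(2015, 7, 1, 8, 30)))  # 2015-07-01 09:00:00
```

Note that exactly half past the hour rounds up, which is also what the T-SQL expression does.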
You can create a computed column if you use the rounded timestamp often:
ALTER TABLE Fingertips ADD RoundedTimeStamp AS (DATEADD(HOUR, DATEDIFF(HOUR, 0, DATEADD(MI, 30, TimeStamp)), 0));
To compare the timestamps with the work hours, there are different methods. I will use a table variable in which I generate the work hours for the current day.
Then, using LEFT JOIN and GROUP BY, we get the number of timestamps per work hour:
DECLARE @WorkHours TABLE(WorkHour DATETIME)
INSERT INTO @WorkHours (WorkHour) VALUES
('2015-07-01 07:00'),
('2015-07-01 08:00'),
('2015-07-01 09:00'),
('2015-07-01 10:00'),
('2015-07-01 11:00'),
('2015-07-01 12:00'),
('2015-07-01 13:00'),
('2015-07-01 14:00'),
('2015-07-01 15:00'),
('2015-07-01 16:00'),
('2015-07-01 17:00'),
('2015-07-01 18:00'),
('2015-07-01 19:00')
SELECT wh.Workhour
, COUNT(ft.TimeStamp) As Quantity
FROM @WorkHours wh
LEFT JOIN Fingertips ft ON ft.RoundedTimeStamp = wh.WorkHour
GROUP BY wh.WorkHour
Several separate parts have to be glued together to get this done.
First, rounding: this is easily done by taking the hour part of the date plus 30 minutes. Then determine the start and end records. If there are no fields to indicate this, and assuming the first occurrence of a day is the login or start, you can use row_number and treat the odd numbers as start records.
Then start and end have to be coupled; in SQL Server 2012 and higher this can easily be done with the LEAD function.
To get the missing hours, a sequence has to be created with all the hours. There are several options for this, but I like the approach of using row_number on a table that is sure to contain enough rows (with a proper column for the order by), such as sys.all_objects. That way hours 7 to 19 could be created as: select top 13 ROW_NUMBER() over (order by object_id) + 6 [Hour] from sys.all_objects
If there is only one date to check, the query can simply left join on the hour of the timestamps. If there are more dates, a second sequence could be created and cross applied to the times to get all dates. Assuming one date, the final code would be:
declare @t table(Usr int, [timestamp] datetime)
insert @t values
(1 , '2015-07-01 08:01:00'),
(2 , '2015-07-01 08:05:00'),
(3 , '2015-07-01 08:07:00'),
(1 , '2015-07-01 10:05:00'),
(3 , '2015-07-01 11:00:00'),
(1 , '2015-07-01 12:01:00'),
(2 , '2015-07-01 13:03:00'),
(2 , '2015-07-01 14:02:00'),
(1 , '2015-07-01 16:03:00'),
(2 , '2015-07-01 18:04:00'),
(2 , '2015-07-01 18:04:00')
;with usrHours as
(
select Usr, datepart(hour, DATEADD(minute,30, times.timestamp)) [Hour] --convert all times to the rounded hour (rounding by adding 30 minutes)
, ROW_NUMBER() over (partition by usr order by [timestamp] ) rnr
from @t times --@t should be your logging table
), startend as --get next (end) hour by using lead
(
select Usr, [hour] StartHour , LEAD([Hour]) over (partition by usr order by rnr) NextHour ,rnr
from usrHours
),hours as --sequence of hours 7 to 19
(
select top 13 ROW_NUMBER() over (order by object_id) + 6 [Hour] from sys.all_objects
)
select cast([Hour] as varchar) + ':00' [Hour], COUNT(startend.usr) Users
from hours --sequence is leading
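left join startend on hours.Hour >= startend.StartHour and hours.Hour < startend.NextHour --start hour inclusive, end hour exclusive, so a worker is not counted in the hour of logging out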
and rnr % 2 = 1 --every odd row number is a start time
group by Hours.hour
Here is my final working code:
create table tsts(id int, dates datetime)
insert tsts values
(1 , '2015-07-01 08:01:00'),
(2 , '2015-07-01 08:05:00'),
(3 , '2015-07-01 08:07:00'),
(1 , '2015-07-01 10:05:00'),
(3 , '2015-07-01 11:00:00'),
(1 , '2015-07-01 12:01:00'),
(2 , '2015-07-01 13:03:00'),
(2 , '2015-07-01 14:02:00'),
(1 , '2015-07-01 16:03:00'),
(2 , '2015-07-01 18:04:00')
select horas.hora, isnull(sum(math) over(order by horas.hora rows unbounded preceding),0) as Employees from
(
select 0 as hora union all
select 1 as hora union all
select 2 as hora union all
select 3 as hora union all
select 4 as hora union all
select 5 as hora union all
select 6 as hora union all
select 7 as hora union all
select 8 as hora union all
select 9 as hora union all
select 10 as hora union all
select 11 as hora union all
select 12 as hora union all
select 13 as hora union all
select 14 as hora union all
select 15 as hora union all
select 16 as hora union all
select 17 as hora union all
select 18 as hora union all
select 19 as hora union all
select 20 as hora union all
select 21 as hora union all
select 22 as hora union all
select 23
) as horas
left outer join
(
select hora, sum(math) as math from
(
select id, hora, iif(rowid%2 = 1,1,-1) math from
(
select row_number() over (partition by id order by id, dates) as rowid, id, datepart(hh,dateadd(mi, 30, dates)) as hora from tsts
) as Q1
) as Q2
group by hora
) as Q3
on horas.hora = Q3.hora
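The +1/-1 running total used in that query can be checked with a short Python sketch. It assumes a single day, that each user's timestamps strictly alternate between login and logout, and that rounding is done by adding 30 minutes; the function name `workers_per_hour` is hypothetical.

```python
from collections import defaultdict
from datetime import datetime, timedelta
from itertools import accumulate

def workers_per_hour(events):
    """events: iterable of (user_id, timestamp); per user, odd rows are logins, even rows logouts."""
    by_user = defaultdict(list)
    for user_id, ts in events:
        by_user[user_id].append(ts)

    delta = [0] * 24
    for stamps in by_user.values():
        for position, ts in enumerate(sorted(stamps)):
            hour = (ts + timedelta(minutes=30)).hour  # round to the nearest hour
            # position 0, 2, 4, ... are logins (+1); 1, 3, 5, ... are logouts (-1).
            delta[hour] += 1 if position % 2 == 0 else -1

    # Running total of the +1/-1 values gives the head count at each hour.
    return list(accumulate(delta))

events = [
    (1, datetime(2015, 7, 1, 8, 1)), (2, datetime(2015, 7, 1, 8, 5)),
    (3, datetime(2015, 7, 1, 8, 7)), (1, datetime(2015, 7, 1, 10, 5)),
    (3, datetime(2015, 7, 1, 11, 0)), (1, datetime(2015, 7, 1, 12, 1)),
    (2, datetime(2015, 7, 1, 13, 3)), (2, datetime(2015, 7, 1, 14, 2)),
    (1, datetime(2015, 7, 1, 16, 3)), (2, datetime(2015, 7, 1, 18, 4)),
]
counts = workers_per_hour(events)
print(counts[7:20])  # hours 7 to 19
```

For the sample log this reproduces the theoretical output from the question (0, 3, 3, 2, 1, 2, 1, 2, 2, 1, 1, 0, 0 for hours 7 to 19).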
From the list of start time and end times from a select query, I need to find out the total time excluding overlapping time and breaks.
StartTime EndTime
2014-10-01 10:30:00.000 2014-10-01 12:00:00.000 -- 90 mins
2014-10-01 10:40:00.000 2014-10-01 12:00:00.000 --0 since its overlapped with previous
2014-10-01 10:42:00.000 2014-10-01 12:20:00.000 -- 20 mins excluding overlapped time
2014-10-01 10:40:00.000 2014-10-01 13:00:00.000 -- 40 mins
2014-10-01 10:44:00.000 2014-10-01 12:21:00.000 -- 0 previous ones have already covered this time range
2014-10-13 15:50:00.000 2014-10-13 16:00:00.000 -- 10 mins
So the total should be 160 mins in this case.
I don't want to use a lot of loops for this; I'm looking for a simple solution.
DECLARE @table TABLE (StartTime DateTime2, EndTime DateTime2)
INSERT INTO @table SELECT '2014-10-01 10:30:00.000', '2014-10-01 12:00:00.000'
INSERT INTO @table SELECT '2014-10-01 10:40:00.000', '2014-10-01 12:00:00.000'
INSERT INTO @table SELECT '2014-10-01 10:42:00.000', '2014-10-01 12:20:00.000'
INSERT INTO @table SELECT '2014-10-01 10:40:00.000', '2014-10-01 13:00:00.000'
INSERT INTO @table SELECT '2014-10-01 10:44:00.000', '2014-10-01 12:21:00.000'
INSERT INTO @table SELECT '2014-10-13 15:50:00.000', '2014-10-13 16:00:00.000'
;WITH addNR AS ( -- Add row numbers
SELECT StartTime, EndTime, ROW_NUMBER() OVER (ORDER BY StartTime, EndTime) AS RowID
FROM @table AS T
), createNewTable AS ( -- Recreate table according to overlap time
SELECT StartTime, EndTime, RowID
FROM addNR
WHERE RowID = 1
UNION ALL
SELECT
CASE
WHEN a.StartTime <= AN.StartTime AND AN.StartTime <= a.EndTime THEN a.StartTime
ELSE AN.StartTime END AS StartTime,
CASE WHEN a.StartTime <= AN.EndTime AND AN.EndTime <= a.EndTime THEN a.EndTime
ELSE AN.EndTime END AS EndTime,
AN.RowID
FROM addNR AS AN
INNER JOIN createNewTable AS a
ON a.RowID + 1 = AN.RowID
), getMinutes AS ( -- Get difference in minutes
SELECT DATEDIFF(MINUTE,StartTime,MAX(EndTime)) AS diffMinutes
FROM createNewTable
GROUP BY StartTime
)
SELECT SUM(diffMinutes) AS Result
FROM getMinutes
And the result is 160
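As a cross-check of the 160, the same total can be computed by sorting the ranges and merging the ones that overlap. This Python sketch is a different technique from the recursive CTE above (a plain sort-and-merge), and the function name `covered_minutes` is made up for illustration.

```python
from datetime import datetime

def covered_minutes(intervals):
    """Total minutes covered by the union of (start, end) intervals."""
    total = 0
    current_start = current_end = None
    for start, end in sorted(intervals):
        if current_end is None or start > current_end:
            # Gap before this interval: close the previous run and start a new one.
            if current_end is not None:
                total += (current_end - current_start).total_seconds() // 60
            current_start, current_end = start, end
        else:
            # Overlaps (or touches) the current run: extend it if needed.
            current_end = max(current_end, end)
    if current_end is not None:
        total += (current_end - current_start).total_seconds() // 60
    return int(total)

intervals = [
    (datetime(2014, 10, 1, 10, 30), datetime(2014, 10, 1, 12, 0)),
    (datetime(2014, 10, 1, 10, 40), datetime(2014, 10, 1, 12, 0)),
    (datetime(2014, 10, 1, 10, 42), datetime(2014, 10, 1, 12, 20)),
    (datetime(2014, 10, 1, 10, 40), datetime(2014, 10, 1, 13, 0)),
    (datetime(2014, 10, 1, 10, 44), datetime(2014, 10, 1, 12, 21)),
    (datetime(2014, 10, 13, 15, 50), datetime(2014, 10, 13, 16, 0)),
]
print(covered_minutes(intervals))  # 160
```

The first five ranges merge into 10:30 to 13:00 (150 minutes), and the last one adds 10 minutes.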
To get the result with the data you gave, I assume that the end time is not included (otherwise it would be 91 minutes for the first run). With that in mind, this will give you the result you want with no cursors or loops. If the times span multiple days, the logic will need to be adjusted.
--Create sample data
CREATE TABLE TimesToCheck
([StartTime] datetime, [EndTime] datetime)
;
INSERT INTO TimesToCheck
([StartTime], [EndTime])
VALUES
('2014-10-01 10:30:00', '2014-10-01 12:00:00'),
('2014-10-01 10:40:00', '2014-10-01 12:00:00'),
('2014-10-01 10:42:00', '2014-10-01 12:20:00'),
('2014-10-01 10:40:00', '2014-10-01 13:00:00'),
('2014-10-01 10:44:00', '2014-10-01 12:21:00'),
('2014-10-13 15:50:00', '2014-10-13 16:00:00')
;--Now the solution.
;WITH
E1(N) AS (
SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL
SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL
SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1
), -- 1*10^1 or 10 rows
E2(N) AS (SELECT 1 FROM E1 a, E1 b), -- 1*10^2 or 100 rows
E4(N) AS (SELECT 1 FROM E2 a, E2 b), -- 1*10^4 or 10,000 rows
N AS (SELECT TOP (1440) ROW_NUMBER() OVER (ORDER BY (SELECT NULL))-1 AS Number FROM E4), -- 1,440 minutes in a day
TimeList AS (SELECT CAST(DATEADD(minute,n.number,0) as time) AS m FROM N),
--We really only need the Timelist table. If it is already created, we can start here.
ActiveTimes AS (SELECT DISTINCT t.m FROM TimeList T
INNER JOIN TimesToCheck C ON t.m BETWEEN CAST(c.StartTime as time) AND CAST(DATEADD(minute,-1,c.EndTime) as time))
SELECT COUNT(*) FROM ActiveTimes
Is there an easy way to count how many 7am and 7pm times fall fully between two datetimes? By fully between, I mean don't count the 7am or 7pm datetimes that are equal to the start or end time.
I imagine this can be done using the unix timestamp in seconds and a bit of algebra, but I can't figure it out.
I'm happy to use something in PLSQL or plain SQL.
Examples:
start end num_7am_7pm_between_dates
2012-06-16 05:00 2012-06-16 08:00 1
2012-06-16 16:00 2012-06-16 20:00 1
2012-06-16 05:00 2012-06-16 07:00 0
2012-06-16 07:00 2012-06-16 19:00 0
2012-06-16 08:00 2012-06-16 15:00 0
2012-06-16 05:00 2012-06-16 19:01 2
2012-06-16 05:00 2012-06-18 20:00 6
I think this could be reduced further but I don't have Oracle at my disposal to completely test this Oracle SQL:
SELECT StartDate
, EndDate
, CASE WHEN TRUNC(EndDate) - TRUNC(StartDate) < 1
AND TO_CHAR(EndDate, 'HH24') > 19
AND TO_CHAR(StartDate, 'HH24') < 7
THEN 2
WHEN TRUNC(EndDate) - TRUNC(StartDate) < 1
AND (TO_CHAR(EndDate, 'HH24') > 19
OR TO_CHAR(StartDate, 'HH24') < 7)
THEN 1
WHEN TRUNC(EndDate) - TRUNC(StartDate) > 0
AND TO_CHAR(EndDate, 'HH24') > 19
AND TO_CHAR(StartDate, 'HH24') < 7
THEN 2 + ((TRUNC(EndDate) - TRUNC(StartDate)) * 2)
WHEN TRUNC(EndDate) - TRUNC(StartDate) > 0
AND TO_CHAR(EndDate, 'HH24') > 19
OR TO_CHAR(StartDate, 'HH24') < 7
THEN 1 + ((TRUNC(EndDate) - TRUNC(StartDate)) * 2)
ELSE 0
END
FROM MyTable;
Thanks to @A.B.Cade for the Fiddle; it looks like my CASE logic can be condensed further to:
SELECT SDate
, EDate
, CASE WHEN TO_CHAR(EDate, 'HH24') > 19
AND TO_CHAR(SDate, 'HH24') < 7
THEN 2 + ((TRUNC(EDate) - TRUNC(SDate)) * 2)
WHEN TO_CHAR(EDate, 'HH24') > 19
OR TO_CHAR(SDate, 'HH24') < 7
THEN 1 + ((TRUNC(EDate) - TRUNC(SDate)) * 2)
ELSE 0
END AS MyCalc2
FROM MyTable;
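For testing any of these queries against the examples in the question, a brute-force reference in Python may help. It is not a translation of the CASE expressions above; it simply steps through every 07:00 and 19:00 from the start date onward and counts the ones strictly between start and end. The function name `count_7am_7pm_between` is hypothetical.

```python
from datetime import datetime, timedelta

def count_7am_7pm_between(start, end):
    """Count 07:00 and 19:00 instants strictly between start and end."""
    # First candidate: 07:00 on the start date, then step in 12-hour increments.
    candidate = start.replace(hour=7, minute=0, second=0, microsecond=0)
    count = 0
    while candidate < end:
        if candidate > start:
            count += 1
        candidate += timedelta(hours=12)
    return count

print(count_7am_7pm_between(datetime(2012, 6, 16, 5), datetime(2012, 6, 16, 8)))       # 1
print(count_7am_7pm_between(datetime(2012, 6, 16, 7), datetime(2012, 6, 16, 19)))      # 0
print(count_7am_7pm_between(datetime(2012, 6, 16, 5), datetime(2012, 6, 16, 19, 1)))   # 2
print(count_7am_7pm_between(datetime(2012, 6, 16, 5), datetime(2012, 6, 18, 20)))      # 6
```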
I had fun writing the following solution:
with date_range as (
select min(sdate) as sdate, max(edate) as edate
from t
),
all_dates as (
select sdate + (level-1)/24 as hour
from date_range
connect by level <= (edate-sdate) * 24 + 1
),
counts as (
select t.id, count(*) as c
from all_dates, t
where to_char(hour, 'HH') = '07'
and hour > t.sdate and hour < t.edate
group by t.id
)
select t.sdate, t.edate, nvl(counts.c, 0)
from t, counts
where t.id = counts.id(+)
order by t.id;
I added an id column to the table in case the date ranges aren't unique.
http://www.sqlfiddle.com/#!4/5fa19/13
This may not have the best performance but might work for you:
select sdate, edate, count(*)
from (select distinct edate, sdate, sdate + (level / 24) hr
from t
connect by sdate + (level / 24) <= edate )
where to_char(hr, 'hh') = '07'
group by sdate, edate
UPDATE: As per @FlorinGhita's comment, I fixed the query to include zero occurrences:
select sdate, edate, sum( decode(to_char(hr, 'hh'), '07',1,0))
from (select distinct edate, sdate, sdate + (level / 24) hr
from t
connect by sdate + (level / 24) <= edate )
group by sdate, edate
You can do it like this (in SQL Server):
declare @table table ( start datetime, ends datetime)
insert into @table select'2012-06-16 05:00','2012-06-16 08:00' --1
insert into @table select'2012-06-16 16:00','2012-06-16 20:00' --1
insert into @table select'2012-06-16 05:00','2012-06-16 07:00' --0
insert into @table select'2012-06-16 07:00','2012-06-16 19:00' --0
insert into @table select'2012-06-16 08:00','2012-06-16 15:00' --0
insert into @table select'2012-06-16 05:00','2012-06-16 19:01' --2
insert into @table select'2012-06-16 05:00','2012-06-18 20:00' --6
insert into @table select'2012-06-16 07:00','2012-06-18 07:00' --3
Declare @From DATETIME
Declare @To DATETIME
select @From = MIN(start) from @table
select @To = max(ends) from @table
;with CTE AS
(
SELECT distinct
DATEADD(DD,DATEDIFF(D,0,start),0)+'07:00' AS AimTime
FROM @table
),CTE1 AS
(
Select AimTime
FROM CTE
UNION ALL
Select DATEADD(hour, 12, AimTime)
From CTE1
WHERE AimTime< @To
)
select start,ends, count(AimTime)
from CTE1 right join @table t
on t.start < CTE1.AimTime and t.ends > CTE1.AimTime
group by start,ends
group by start,ends
I have the following data being returned from a query. I am putting it in a temp table that I can query (obviously there is a lot more data in real life; this is just an example):
EmpId Date
1 2011-01-01
1 2011-01-02
1 2011-01-03
2 2011-02-03
3 2011-03-01
4 2011-03-02
5 2011-01-02
I need to return only the EmpIds that have 30 or more consecutive days in the Date column, along with the day count for each such run. An employee could have 2 or more separate runs of 30 or more consecutive days; in that case I would like to return multiple rows. So if an employee has dates from 2011-01-01 to 2011-02-20, return this range and its count in one row. If the same employee also has dates from 2011-05-01 to 2011-07-01, return this in another row. Essentially, every break in consecutive days starts a separate record.
Using DENSE_RANK should do the trick:
;WITH sampledata
AS (SELECT 1 AS id, DATEADD(day, -0, GETDATE())AS somedate
UNION ALL SELECT 1, DATEADD(day, -1, GETDATE())
UNION ALL SELECT 1, DATEADD(day, -2, GETDATE())
UNION ALL SELECT 1, DATEADD(day, -3, GETDATE())
UNION ALL SELECT 1, DATEADD(day, -4, GETDATE())
UNION ALL SELECT 1, DATEADD(day, -5, GETDATE())
UNION ALL SELECT 1, DATEADD(day, -10, GETDATE())
UNION ALL SELECT 1, '2011-01-01 00:00:00'
UNION ALL SELECT 1, '2010-12-31 00:00:00'
UNION ALL SELECT 1, '2011-02-01 00:00:00'
UNION ALL SELECT 1, DATEADD(day, -10, GETDATE())
UNION ALL SELECT 2, DATEADD(day, 0, GETDATE())
UNION ALL SELECT 2, DATEADD(day, -1, GETDATE())
UNION ALL SELECT 2, DATEADD(day, -2, GETDATE())
UNION ALL SELECT 2, DATEADD(day, -6, GETDATE())
UNION ALL SELECT 3, DATEADD(day, 0, GETDATE())
UNION ALL SELECT 4, DATEADD(day, 0, GETDATE())
UNION ALL SELECT 5, DATEADD(day, 0, GETDATE()))
, ranking
AS (SELECT *, DENSE_RANK()OVER(PARTITION BY id ORDER BY DATEDIFF(day, 0, somedate)) - DATEDIFF(day, 0, somedate)AS dategroup
FROM sampledata)
SELECT id
, MIN(somedate)AS range_start
, MAX(somedate)AS range_end
, DATEDIFF(day, MIN(somedate), MAX(somedate)) + 1 AS consecutive_days
FROM ranking
GROUP BY id, dategroup
--HAVING DATEDIFF(day, MIN(somedate), MAX(somedate)) + 1 >= 30 --change as needed
ORDER BY id, range_start
Something like this might be a starting point; I haven't tested it. Note that it counts pairs of adjacent days per employee, so it does not separate multiple runs for the same employee.
SELECT
a.empid
, count(*) as consecutive_count
, min(a.mydate) as startdate
FROM logins a
INNER JOIN logins b
ON a.empid = b.empid AND datediff(day,a.mydate,b.mydate) = 1
GROUP BY a.empid
HAVING count(*) > 30
This is a good case for a recursive CTE. I stole the data table from @Davin:
with data AS --sample data
( SELECT 1 as id ,DATEADD(DD,-0,GETDATE()) as date UNION ALL
SELECT 1 as id ,DATEADD(DD,-1,GETDATE()) as date UNION ALL
SELECT 1 as id ,DATEADD(DD,-2,GETDATE()) as date UNION ALL
SELECT 1 as id ,DATEADD(DD,-3,GETDATE()) as date UNION ALL
SELECT 1 as id ,DATEADD(DD,-4,GETDATE()) as date UNION ALL
SELECT 1 as id ,DATEADD(DD,-5,GETDATE()) as date UNION ALL
SELECT 1 as id ,DATEADD(DD,-10,GETDATE()) as date UNION ALL
SELECT 1 as id ,'2011-01-01 00:00:00.000' as date UNION ALL
SELECT 1 as id ,'2010-12-31 00:00:00.000' as date UNION ALL
SELECT 1 as id ,'2011-02-01 00:00:00.000' as date UNION ALL
SELECT 1 as id ,DATEADD(DD,-10,GETDATE()) as date UNION ALL
SELECT 2 as id ,DATEADD(DD,0,GETDATE()) as date UNION ALL
SELECT 2 as id ,DATEADD(DD,-1,GETDATE()) as date UNION ALL
SELECT 2 as id ,DATEADD(DD,-2,GETDATE()) as date UNION ALL
SELECT 2 as id ,DATEADD(DD,-6,GETDATE()) as date UNION ALL
SELECT 3 as id ,DATEADD(DD,0,GETDATE()) as date UNION ALL
SELECT 4 as id ,DATEADD(DD,0,GETDATE()) as date UNION ALL
SELECT 5 as id ,DATEADD(DD,0,GETDATE()) as date )
,CTE AS
(
SELECT id, CAST(date as date) Date, Consec = 1
FROM data
UNION ALL
SELECT t.id, CAST(t.date as DATE) Date, Consec = (c.Consec + 1)
FROM data T
INNER JOIN CTE c
ON T.id = c.id
AND CAST(t.date as date) = CAST(DATEADD(day, 1, c.date) as date)
)
SELECT id, MAX(consec)
FROM CTE
GROUP BY id
ORDER BY id
Basically this generates a lot of rows per person, and measures how many days in a row each date represents.
Assuming there are no duplicate dates for the same employee:
;WITH ranged AS (
SELECT
EmpId,
Date,
RangeId = DATEDIFF(DAY, 0, Date)
- ROW_NUMBER() OVER (PARTITION BY EmpId ORDER BY Date)
FROM atable
)
SELECT
EmpId,
StartDate = MIN(Date),
EndDate = MAX(Date),
DayCount = DATEDIFF(DAY, MIN(Date), MAX(Date)) + 1
FROM ranged
GROUP BY EmpId, RangeId
HAVING DATEDIFF(DAY, MIN(Date), MAX(Date)) + 1 >= 30
ORDER BY EmpId, MIN(Date)
DATEDIFF turns the dates into integers (the difference of days between the 0 date (1900-01-01) and Date). If the dates are consecutive, the integers are consecutive too. Using the data sample in the question as an example, the DATEDIFF results will be:
EmpId Date DATEDIFF
----- ---------- --------
1 2011-01-01 40542
1 2011-01-02 40543
1 2011-01-03 40544
2 2011-02-03 40575
3 2011-03-01 40601
4 2011-03-02 40602
5 2011-01-02 40543
Now, if you take each employee's rows, assign row numbers to them in the order of dates, and get the difference between the numeric representations and row numbers, you will find that the difference stays the same for consecutive numbers (and, therefore, consecutive dates). Using a slightly different sample for better illustration, it will look like this:
Date DATEDIFF RowNum RangeId
---------- -------- ------ -------
2011-01-01 40542 1 40541
2011-01-02 40543 2 40541
2011-01-03 40544 3 40541
2011-01-05 40546 4 40542
2011-01-07 40548 5 40543
2011-01-08 40549 6 40543
2011-01-09 40550 7 40543
The specific value of RangeId is not important, only the fact that it remains the same for consecutive dates matters. Based on that fact, you can use it as a grouping criterion to count the dates in the group and get the range bounds.
The above query uses DATEDIFF(DAY, MIN(Date), MAX(Date)) + 1 to count the days, but you could also simply use COUNT(*) instead.
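The RangeId trick is not specific to SQL. As a rough Python sketch of the same idea, `date.toordinal()` plays the role of DATEDIFF(DAY, 0, Date) and `enumerate` plays the role of ROW_NUMBER(); dates are de-duplicated first, since the approach assumes no duplicates. The function name `consecutive_runs` and the `min_days` parameter are assumptions made for the illustration.

```python
from datetime import date
from itertools import groupby

def consecutive_runs(dates, min_days=1):
    """Return (start, end, day_count) for each run of consecutive dates."""
    unique_sorted = sorted(set(dates))  # duplicates would break the row-number trick
    runs = []
    # toordinal() minus the row number is constant within a run, like RangeId.
    keyed = enumerate(unique_sorted, start=1)
    for _, group in groupby(keyed, key=lambda pair: pair[1].toordinal() - pair[0]):
        days = [d for _, d in group]
        if len(days) >= min_days:
            runs.append((days[0], days[-1], len(days)))
    return runs

# Same dates as the second illustration table above.
sample = [date(2011, 1, d) for d in (1, 2, 3, 5, 7, 8, 9)]
for start, end, n in consecutive_runs(sample):
    print(start, end, n)
```

Calling `consecutive_runs(sample, min_days=3)` keeps only the two three-day runs, which corresponds to the HAVING filter in the query (there with a threshold of 30).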