Merge consecutive time records in SQL Server 2008

Say I have data like this, where the timeslots are 30 minutes apart.
Note there is a gap on 2021-12-24 between 15:30 and 16:30.
calender_date  timeslot  timeslot_end
-------------  --------  ------------
2021-12-24     14:00:00  14:30:00
2021-12-24     14:30:00  15:00:00
2021-12-24     15:00:00  15:30:00
2021-12-24     16:30:00  17:00:00
2021-12-24     17:00:00  17:30:00
2021-12-24     17:30:00  18:00:00
2021-12-30     09:00:00  09:30:00
2021-12-30     09:30:00  10:00:00
I want to merge rows where timeslot_end equals the next row's timeslot on the same day, so the data would look like this:
calender_date  timeslot  timeslot_end
-------------  --------  ------------
2021-12-24     14:00:00  15:30:00
2021-12-24     16:30:00  18:00:00
2021-12-30     09:00:00  10:00:00
I have tried row numbering with a self join:
WITH cte AS
(
SELECT
calender_date
,timeslot
,timeslot_end
,ROW_NUMBER()OVER(ORDER BY [timeslot])rn
FROM #tmp_leave tl
)
SELECT
MIN(a.timeslot) OVER(PARTITION BY a.calender_date, DATEDIFF(minute,a.timeslot_end,
ISNULL(b.timeslot, a.timeslot_end))) AS 'StartTime',
MAX(a.timeslot_end ) OVER(PARTITION BY a.calender_date,
DATEDIFF(minute,a.timeslot_end, ISNULL(b.timeslot, a.timeslot_end))) AS 'EndTime'
FROM cte a
LEFT JOIN cte b
ON a.rn + 1 = b.rn AND a.timeslot_end = b.timeslot
ORDER BY calender_date
But the result isn't quite right: it ignores the gap on 2021-12-24 and returns the following.
calender_date  timeslot  timeslot_end
-------------  --------  ------------
2021-12-24     14:00:00  18:00:00
2021-12-30     09:00:00  10:00:00
I have been searching and trying to solve it for a while now, please help, any help is very very appreciated!!

This is a gaps and islands problem. We can approach this by creating pseudo groups for each island of continuous dates/times.
WITH cte AS (
SELECT *, LAG(timeslot_end) OVER
(ORDER BY calendar_date, timeslot) timeslot_end_lag
FROM yourTable
),
cte2 AS (
SELECT *, COUNT(CASE WHEN timeslot_end_lag <> timeslot THEN 1 END)
OVER (ORDER BY calendar_date, timeslot) AS grp
FROM cte
)
SELECT calendar_date,
MIN(timeslot) AS timeslot,
MAX(timeslot_end) AS timeslot_end
FROM cte2
GROUP BY calendar_date, grp
ORDER BY calendar_date;
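As a rough illustration of the same gaps-and-islands idea outside SQL, here is a minimal Python sketch. It mirrors the two CTEs: remember the previous row's end time (the LAG), bump a running group counter whenever it differs from the current start (the conditional COUNT), then take the minimum start and maximum end per (date, group). The function name `merge_slots` and the in-memory `rows` list are assumptions made for this example only.

```python
# Sample rows from the question: (calender_date, timeslot, timeslot_end).
rows = [
    ("2021-12-24", "14:00:00", "14:30:00"),
    ("2021-12-24", "14:30:00", "15:00:00"),
    ("2021-12-24", "15:00:00", "15:30:00"),
    ("2021-12-24", "16:30:00", "17:00:00"),
    ("2021-12-24", "17:00:00", "17:30:00"),
    ("2021-12-24", "17:30:00", "18:00:00"),
    ("2021-12-30", "09:00:00", "09:30:00"),
    ("2021-12-30", "09:30:00", "10:00:00"),
]


def merge_slots(rows):
    """Merge rows whose end time equals the next row's start time on the same day."""
    islands = {}      # (date, grp) -> (min start, max end)
    grp = 0           # running count of gaps seen so far (cte2's grp)
    prev_end = None   # previous row's timeslot_end (cte's LAG)
    # Zero-padded ISO strings sort chronologically, so a plain sort is enough.
    for day, start, end in sorted(rows):
        if prev_end is not None and prev_end != start:
            grp += 1  # a gap starts a new island
        prev_end = end
        lo, hi = islands.get((day, grp), (start, end))
        islands[(day, grp)] = (min(lo, start), max(hi, end))
    return [(day, lo, hi) for (day, _), (lo, hi) in sorted(islands.items())]


for merged_row in merge_slots(rows):
    print(merged_row)
```

Grouping by the date as well as the counter keeps slots on different days apart even when one day's last end time happens to equal the next day's first start time.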

Related

PostgreSQL Attendance and night shift

I have the following table:
dt                   type
-------------------  -----------
2022-09-12 21:36:26  WORK_START
2022-09-13 02:00:00  BREAK_START
2022-09-20 06:00:00  WORK_START
2022-09-20 10:00:00  BREAK_START
2022-09-20 10:27:00  BREAK_END
2022-09-20 13:00:00  WORK_END
2022-09-13 06:00:00  WORK_END
2022-09-13 02:30:00  BREAK_END
and this query:
SELECT g.tempDatum::date as datum,
MAX(att.dt::time) FILTER (WHERE att.type = 'WORK_START') as work_start
, MAX(att.dt::time) FILTER (WHERE att.type = 'BREAK_START') as break_start
, MAX(att.dt::time) FILTER (WHERE att.type = 'BREAK_END') as break_end
, MAX(att.dt::time) FILTER (WHERE att.type = 'WORK_END') as work_end
FROM generate_series( '2022-09-01','2022-09-30', '1 day'::interval) AS g(tempDatum)
LEFT JOIN att ON att.dt::date = g.tempDatum::date group by g.tempDatum order by
g.tempDatum;
The result is pretty good, except for 2022-09-12, because that is a night shift. I want to move the break start, break end and work end to 2022-09-12 so the attendance log reads better.
How can I achieve this? Many thanks for any help.
By treating each set of four events (work start, break start, break end, work end) as one group, we can use crosstab to pivot it, using the first date in each group as the date for the whole shift, as requested.
select *
from crosstab(
'select min(dte) over(partition by grp), type, tme from
(
select dt::date as dte
,dt::time as tme
,type
,row_number() over(order by dt,type)-case when row_number() over(order by dt,type) <= 4 then row_number() over(order by dt,type) else row_number() over(order by dt,type)-4 end as grp
from t
) t' )
as ct(dt date, WORK_START time, BREAK_START time, BREAK_END time, WORK_END time)
dt          work_start  break_start  break_end  work_end
----------  ----------  -----------  ---------  --------
2022-09-12  21:36:26    02:00:00     02:30:00   06:00:00
2022-09-20  06:00:00    10:00:00     10:27:00   13:00:00
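The grouping step can also be sketched in Python. Note that this is a variation, not the answer's exact method: instead of cutting the ordered rows into fixed groups of four, it starts a new shift at every WORK_START and files later events under that shift's date, which is an assumption about how the data behaves. The name `attendance_by_shift` is hypothetical.

```python
from datetime import date, datetime, time

# Rows from the question, in their original (unsorted) order.
events = [
    ("2022-09-12 21:36:26", "WORK_START"),
    ("2022-09-13 02:00:00", "BREAK_START"),
    ("2022-09-20 06:00:00", "WORK_START"),
    ("2022-09-20 10:00:00", "BREAK_START"),
    ("2022-09-20 10:27:00", "BREAK_END"),
    ("2022-09-20 13:00:00", "WORK_END"),
    ("2022-09-13 06:00:00", "WORK_END"),
    ("2022-09-13 02:30:00", "BREAK_END"),
]


def attendance_by_shift(events):
    """Map each shift's start date to {event type: time of day}."""
    shifts = {}
    current = None  # date of the most recent WORK_START
    parsed = sorted((datetime.fromisoformat(dt), kind) for dt, kind in events)
    for dt, kind in parsed:
        if kind == "WORK_START":
            current = dt.date()
            shifts[current] = {}
        if current is None:
            continue  # event before any WORK_START: nothing to attach it to
        shifts[current][kind] = dt.time()
    return shifts


log = attendance_by_shift(events)
print(log[date(2022, 9, 12)])
```

Two shifts that start on the same calendar date would overwrite each other in this sketch, so it only suits data with at most one WORK_START per day.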

Passing data from one table to a block code - Oracle

I have a table dates_2019 with all the weekday dates for 2019, as below:
TS_RANGE_BEGIN |TS_RANGE_END
2019-01-01 17:00:00 |2019-01-02 17:00:00
2019-01-02 17:00:00 |2019-01-03 17:00:00
2019-01-03 17:00:00 |2019-01-04 17:00:00
2019-01-04 17:00:00 |2019-01-07 17:00:00
2019-01-07 17:00:00 |2019-01-08 17:00:00
2019-01-08 17:00:00 |2019-01-09 17:00:00
My insert query is below:
insert into report_2019(ab,app_name,status,sub_count,category,last_modified_timestamp)
with T as (
select id,ab,app_name,status,trunc(last_modified_timestamp),
row_number() over(partition by id, ab order by p_message_id desc, message_id desc) lastest_status_order_id
,p_message_id
,LAST_MODIFIED_TIMESTAMP,reporting_purpose
from (
select
id,
ab
, app_name,
status
,p.message_id p_message_id
,s.message_id
,s.LAST_MODIFIED_TIMESTAMP, reporting_purpose
from table_a s, table_b d, table_c k, table_d t,
table_e p
where s.LAST_MODIFIED_TIMESTAMP > to_timestamp('2019-01-01 17', 'YYYY-MM-DD HH24')
and s.LAST_MODIFIED_TIMESTAMP <= to_timestamp('2019-01-02 17', 'YYYY-MM-DD HH24')
and .....
) a
)
select * from (
select ab, app_name, status, count(*) subtotal,
'Reporting' as v_category,trunc(last_modified_timestamp)
from T where lastest_status_order_id = 1
and LAST_MODIFIED_TIMESTAMP <= to_timestamp('2019-01-02 17', 'YYYY-MM-DD HH24')
group by ab,app_name,status,trunc(last_modified_timestamp))a order by ab, app_name;
The original idea was to join the tables dates_2019 and T and get the results for the whole year, as below:
where s.LAST_MODIFIED_TIMESTAMP > c.ts_range_begin
and s.LAST_MODIFIED_TIMESTAMP <= c.ts_range_end
....
However, the temp tablespace is low and I got the errors below:
[Error Code: 12801, SQL State: 72000] ORA-12801: error signaled in
parallel query server P004 ORA-01555: snapshot too old: rollback
segment number 30 with name "_SYSSMU30_326584413$" too small
The option now is to insert the data one day at a time in a block.
Could you please help me with the solution?

Number of rows per hour starting at a specific time

In order to get data for some reporting, I have to know how many rows have been inserted per hour into a table, starting at a specific time on a specific day. I already found part of the solution in another question, but I didn't manage to adapt it to my case. This is the code I've written so far:
SELECT DATEADD(HOUR, DATEDIFF(HOUR, 0, t.mydatetime), 0) AS HOUR_CONCERNED,
COUNT(*) AS NB_ROWS
FROM mytable t
WHERE CONVERT(DATETIME, FLOOR(CONVERT(FLOAT, t.mydatetime))) = '2016-06-06'
GROUP BY DATEADD(HOUR, DATEDIFF(HOUR, 0, t.mydatetime), 0)
ORDER BY HOUR_CONCERNED;
It gives me the following results:
HOUR_CONCERNED NB_ROWS
------------------- --------
2016-06-06 10:00:00 2157
2016-06-06 11:00:00 60740
2016-06-06 12:00:00 66189
2016-06-06 13:00:00 77096
2016-06-06 14:00:00 90039
The problem is that I can't find a way to start my results at a specific time such as 9.30am and to get the number of rows per hour starting from this time. In other words, I'm looking for the number of rows between 9.30am and 10.30am, between 10.30am and 11.30am, etc. The results I'm looking for should look like this:
HOUR_CONCERNED NB_ROWS
------------------- --------
2016-06-06 09:30:00 3550
2016-06-06 10:30:00 33002
2016-06-06 11:30:00 42058
2016-06-06 12:30:00 55008
2016-06-06 13:30:00 72000
Is there an easy way to adapt my query and get those results ?
Given a specific starting time, you can get hour blocks by finding the number of minutes since your start time, and dividing by 60, then adding this number of hours back to the start time e.g.
DECLARE @StartTime DATETIME2(0) = '20160606 09:30';
WITH DummyData (mydatetime) AS
(   SELECT TOP 200 DATEADD(MINUTE, ROW_NUMBER() OVER(ORDER BY [object_id]) - 1, @StartTime)
    FROM sys.all_objects
)
SELECT HoursSinceStart = FLOOR(DATEDIFF(MINUTE, @StartTime, mydatetime) / 60.0),
       Display = DATEADD(HOUR, FLOOR(DATEDIFF(MINUTE, @StartTime, mydatetime) / 60.0), @StartTime),
       Records = COUNT(*)
FROM DummyData
WHERE mydatetime >= @StartTime
GROUP BY FLOOR(DATEDIFF(MINUTE, @StartTime, mydatetime) / 60.0)
ORDER BY Display;
Which gives:
HoursSinceStart  Display              Records
---------------  -------------------  -------
0                2016-06-06 09:30:00  60
1                2016-06-06 10:30:00  60
2                2016-06-06 11:30:00  60
3                2016-06-06 12:30:00  20
I have left the HoursSinceStart column in, to hopefully assist in deconstructing the logic contained in the Display column
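The arithmetic behind the Display column can be checked with a small Python sketch. It applies the same steps: minutes since the start time, integer-divided by 60, added back to the start time as hours. The helper name `hour_block` is made up for this illustration, and the 200 one-minute rows stand in for the DummyData CTE.

```python
from collections import Counter
from datetime import datetime, timedelta


def hour_block(ts, start):
    """Start of the one-hour block, anchored at `start`, that contains `ts`."""
    minutes = int((ts - start).total_seconds() // 60)  # DATEDIFF(MINUTE, ...)
    return start + timedelta(hours=minutes // 60)      # FLOOR(... / 60.0), then DATEADD


start = datetime(2016, 6, 6, 9, 30)
# 200 timestamps one minute apart, beginning at the start time.
timestamps = [start + timedelta(minutes=i) for i in range(200)]

counts = Counter(hour_block(ts, start) for ts in timestamps)
for block, n in sorted(counts.items()):
    print(block, n)
```

Because the division is anchored at the start time rather than at midnight, the blocks begin at 09:30, 10:30 and so on instead of on the hour.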
The problem with this method is that it only gives you results for blocks that contain rows. If you also need the empty blocks, you will need to generate all time blocks using a numbers table, then left join your data to them.
You can quickly generate a series of numbers using this:
DECLARE @StartTime DATETIME2(0) = '20160606 09:30';
-- GENERATE 10 ROWS
WITH N1 AS (SELECT N FROM (VALUES (1),(1),(1),(1),(1),(1),(1),(1),(1),(1)) n (N)),
-- CROSS JOIN THE 10 ROWS TO GET 100 ROWS
N2 (N) AS (SELECT 1 FROM N1 AS N1 CROSS JOIN N1 AS N2),
-- CROSS JOIN THE 100 ROWS TO GET 10,000 ROWS
N3 (N) AS (SELECT 1 FROM N2 AS N1 CROSS JOIN N2 AS N2),
-- APPLY ROW_NUMBER TO GET A SET OF NUMBERS FROM 0 - 9,999
Numbers (N) AS (SELECT ROW_NUMBER() OVER(ORDER BY N) - 1 FROM N3)
SELECT *,
       TimeStart = DATEADD(HOUR, N, @StartTime),
       TimeEnd = DATEADD(HOUR, N + 1, @StartTime)
FROM Numbers;
Which gives something like:
N TimeStart TimeEnd
--------------------------------------------------
0 2016-06-06 09:30:00 2016-06-06 10:30:00
1 2016-06-06 10:30:00 2016-06-06 11:30:00
2 2016-06-06 11:30:00 2016-06-06 12:30:00
3 2016-06-06 12:30:00 2016-06-06 13:30:00
4 2016-06-06 13:30:00 2016-06-06 14:30:00
5 2016-06-06 14:30:00 2016-06-06 15:30:00
6 2016-06-06 15:30:00 2016-06-06 16:30:00
7 2016-06-06 16:30:00 2016-06-06 17:30:00
Then you can left join your data to this (you will probably need an end time too);
DECLARE @StartTime DATETIME2(0) = '20160606 09:30',
        @EndTime DATETIME2(0) = '20160606 15:30';
WITH DummyData (mydatetime) AS
(   SELECT TOP 200 DATEADD(MINUTE, ROW_NUMBER() OVER(ORDER BY [object_id]) - 1, @StartTime)
    FROM sys.all_objects
),
-- GENERATE NUMBERS
N1 AS (SELECT N FROM (VALUES (1),(1),(1),(1),(1),(1),(1),(1),(1),(1)) n (N)),
N2 (N) AS (SELECT 1 FROM N1 AS N1 CROSS JOIN N1 AS N2),
N3 (N) AS (SELECT 1 FROM N2 AS N1 CROSS JOIN N2 AS N2),
Numbers (N) AS (SELECT ROW_NUMBER() OVER(ORDER BY N) - 1 FROM N3),
TimePeriods AS
(   SELECT TimeStart = DATEADD(HOUR, N, @StartTime),
           TimeEnd = DATEADD(HOUR, N + 1, @StartTime)
    FROM Numbers
    WHERE DATEADD(HOUR, N, @StartTime) < @EndTime
)
SELECT tp.TimeStart, tp.TimeEnd, Records = COUNT(dd.mydatetime)
FROM TimePeriods AS tp
    LEFT JOIN DummyData AS dd
        ON dd.mydatetime >= tp.TimeStart
        AND dd.mydatetime < tp.TimeEnd
GROUP BY tp.TimeStart, tp.TimeEnd
ORDER BY tp.TimeStart;
Which will return 0 where there are no records:
TimeStart TimeEnd Records
---------------------------------------------------------
2016-06-06 09:30:00 2016-06-06 10:30:00 60
2016-06-06 10:30:00 2016-06-06 11:30:00 60
2016-06-06 11:30:00 2016-06-06 12:30:00 60
2016-06-06 12:30:00 2016-06-06 13:30:00 20
2016-06-06 13:30:00 2016-06-06 14:30:00 0
2016-06-06 14:30:00 2016-06-06 15:30:00 0
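The generate-then-left-join pattern can be sketched in Python as well: build every one-hour block between the start and end times first, then count the rows that fall into each half-open block, so empty blocks still appear with a count of 0. The function name `count_per_block` is an assumption for this example.

```python
from datetime import datetime, timedelta


def count_per_block(timestamps, start, end):
    """Return (block_start, block_end, count) for every hour block in [start, end)."""
    blocks = []
    block_start = start
    while block_start < end:                 # same filter as TimePeriods
        block_end = block_start + timedelta(hours=1)
        # Half-open interval, like ">= TimeStart AND < TimeEnd" in the join.
        n = sum(1 for ts in timestamps if block_start <= ts < block_end)
        blocks.append((block_start, block_end, n))
        block_start = block_end
    return blocks


start = datetime(2016, 6, 6, 9, 30)
end = datetime(2016, 6, 6, 15, 30)
timestamps = [start + timedelta(minutes=i) for i in range(200)]

for block in count_per_block(timestamps, start, end):
    print(*block)
```

The nested scan is quadratic, which is fine for a sketch; the SQL version leaves the equivalent work to the join.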
Try this:
SELECT DATEADD( MINUTE, 30, DATEADD(HOUR, DATEDIFF(HOUR, 0, DATEADD( MINUTE, -30, t.mydatetime)), 0)) AS HOUR_CONCERNED,
COUNT(*) AS NB_ROWS
FROM mytable t
WHERE CONVERT(DATETIME, FLOOR(CONVERT(FLOAT, t.mydatetime))) = '2016-06-06'
GROUP BY DATEADD(HOUR, DATEDIFF(HOUR, 0, DATEADD( MINUTE, -30, t.mydatetime)), 0)
ORDER BY HOUR_CONCERNED;
I added a 30 min offset into the GROUP BY function to treat 9:30 as 9:00, 10:30 as 10:00 and so on. In the select part I reverse this offset to give a proper interval.
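To make the offset trick concrete, here is a minimal Python sketch of the same three steps: shift the timestamp back 30 minutes, truncate to the hour, then shift forward 30 minutes again. The name `half_past_block` is hypothetical.

```python
from datetime import datetime, timedelta


def half_past_block(ts):
    """Start of the hh:30-to-hh:30 block that contains `ts`."""
    shifted = ts - timedelta(minutes=30)                             # 09:30 is treated as 09:00
    truncated = shifted.replace(minute=0, second=0, microsecond=0)   # group by the hour
    return truncated + timedelta(minutes=30)                         # undo the offset for display


print(half_past_block(datetime(2016, 6, 6, 10, 29)))
print(half_past_block(datetime(2016, 6, 6, 10, 30)))
```

A timestamp just after midnight, such as 00:10, lands in the block that starts at 23:30 on the previous day, which is worth keeping in mind when the query also filters on a single date.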
The WHERE condition in your query needs to change though for performance reasons. Instead of truncating timestamps to a nearest day, you should filter by a range:
WHERE t.mydatetime >= CONVERT( DATETIME, '2016-06-06' ) AND t.mydatetime < CONVERT( DATETIME, '2016-06-07' )
You need to include the time in the WHERE clause and filter for values greater than the time you want to measure from. You can also use DATEPART to get the hours.
SELECT NB_ROWS = COUNT(*)
,HOUR_CONCERNED = DATEPART(HOUR, InsertedDate)
FROM table
WHERE InsertedDate = '20160531'
AND InsertedDate> time
GROUP BY DATEPART(HOUR, InsertedDate)

SQL and Temporal data

Given a table of appointments, like this:
User Start End
UserA 2016-01-15 12:00:00 2016-01-15 14:00:00
UserA 2016-01-15 15:00:00 2016-01-15 17:00:00
UserB 2016-01-15 13:00:00 2016-01-15 15:00:00
UserB 2016-01-15 13:32:00 2016-01-15 15:00:00
UserB 2016-01-15 15:30:00 2016-01-15 15:30:00
UserB 2016-01-15 15:45:00 2016-01-15 16:00:00
UserB 2016-01-15 17:30:00 2016-01-15 18:00:00
I want to create a list of distinct time intervals in which the same amount of people have an appointment:
Start End Count
2016-01-15 12:00:00 2016-01-15 13:00:00 1
2016-01-15 13:00:00 2016-01-15 14:00:00 2
2016-01-15 14:00:00 2016-01-15 15:45:00 1
2016-01-15 15:45:00 2016-01-15 16:00:00 2
2016-01-15 16:00:00 2016-01-15 17:00:00 1
2016-01-15 17:00:00 2016-01-15 17:30:00 0
2016-01-15 17:30:00 2016-01-15 18:00:00 1
How would I do this in SQL, preferably SQL Server 2008?
EDIT: To clarify: Manually, the result is obtained by making one row for each user, marking the blocked time, and then summing up the count of rows that have a mark:
Time   12  13  14  15  16  17
UserA  xxxxxxxx    xxxxxxxx
UserB      xxxxxxxx   x      xx
Count  1   2   1      21   0 1
That result set would start at the minimum time available, end at the maximum time available, and while the ASCII art has only a 15min resolution, I would require at least resolution to the minute. I guess you can leave the rows with "0" out of the result, if this is easier for you.
There's got to be an easier way than this, but at least you can probably follow each step individually:
declare @t table ([User] varchar(19) not null,Start datetime2 not null,[End] datetime2 not null)
insert into @t([User], Start, [End]) values
('UserA','2016-01-15T12:00:00','2016-01-15T14:00:00'),
('UserA','2016-01-15T15:00:00','2016-01-15T17:00:00'),
('UserB','2016-01-15T13:00:00','2016-01-15T15:00:00'),
('UserB','2016-01-15T13:32:00','2016-01-15T15:00:00'),
('UserB','2016-01-15T15:30:00','2016-01-15T15:30:00'),
('UserB','2016-01-15T15:45:00','2016-01-15T16:00:00'),
('UserB','2016-01-15T17:30:00','2016-01-15T18:00:00')
;With Times as (
    select Start as Point from @t
    union
    select [End] from @t
), Ordered as (
    select Point,ROW_NUMBER() OVER (ORDER BY Point) as rn
    from Times
), Periods as (
    select
        o1.Point as Start,
        o2.Point as [End]
    from
        Ordered o1
            inner join
        Ordered o2
            on
                o1.rn = o2.rn - 1
), UserCounts as (
    select p.Start,p.[End],COUNT(distinct [User]) as Cnt,ROW_NUMBER() OVER (Order BY p.[Start]) as rn
    from
        Periods p
            left join
        @t t
            on
                p.Start < t.[End] and
                t.Start < p.[End]
    group by
        p.Start,p.[End]
), Consolidated as (
    select uc.*
    from
        UserCounts uc
            left join
        UserCounts uc_anti
            on
                uc.rn = uc_anti.rn + 1 and
                uc.Cnt = uc_anti.Cnt
    where
        uc_anti.Cnt is null
    union all
    select c.Start,uc.[End],c.Cnt,uc.rn
    from
        Consolidated c
            inner join
        UserCounts uc
            on
                c.Cnt = uc.Cnt and
                c.[End] = uc.Start
)
select
    Start,MAX([End]) as [End],Cnt
from
    Consolidated
group by
    Start,Cnt
order by Start
The CTEs work as follows. Times: since any given start or end stamp can start or end a period in the final results, we collect them all in one column, so that Ordered can number them and Periods can then reassemble them into the smallest possible periods.
UserCounts then goes back to the original data and finds out how many users overlap each calculated period.
Consolidated is the trickiest CTE to follow, but it basically merges periods that abut each other where the user count is equal.
Results:
Start End Cnt
--------------------------- --------------------------- -----------
2016-01-15 12:00:00.0000000 2016-01-15 13:00:00.0000000 1
2016-01-15 13:00:00.0000000 2016-01-15 14:00:00.0000000 2
2016-01-15 14:00:00.0000000 2016-01-15 15:45:00.0000000 1
2016-01-15 15:45:00.0000000 2016-01-15 16:00:00.0000000 2
2016-01-15 16:00:00.0000000 2016-01-15 17:00:00.0000000 1
2016-01-15 17:00:00.0000000 2016-01-15 17:30:00.0000000 0
2016-01-15 17:30:00.0000000 2016-01-15 18:00:00.0000000 1
(And I even got the zero row I was unsure I'd be able to conjure into existence)
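For readers who find the CTE chain hard to follow, the same steps can be sketched in a few lines of Python: collect every start and end stamp, pair neighbouring stamps into elementary periods, count the distinct users overlapping each period, and merge adjacent periods with equal counts. The function name `count_intervals` and the helper `ts` are assumptions made for this illustration.

```python
from datetime import datetime


def ts(text):
    return datetime.fromisoformat(text)


appointments = [
    ("UserA", ts("2016-01-15 12:00:00"), ts("2016-01-15 14:00:00")),
    ("UserA", ts("2016-01-15 15:00:00"), ts("2016-01-15 17:00:00")),
    ("UserB", ts("2016-01-15 13:00:00"), ts("2016-01-15 15:00:00")),
    ("UserB", ts("2016-01-15 13:32:00"), ts("2016-01-15 15:00:00")),
    ("UserB", ts("2016-01-15 15:30:00"), ts("2016-01-15 15:30:00")),
    ("UserB", ts("2016-01-15 15:45:00"), ts("2016-01-15 16:00:00")),
    ("UserB", ts("2016-01-15 17:30:00"), ts("2016-01-15 18:00:00")),
]


def count_intervals(appts):
    """Return (start, end, distinct user count) with equal neighbours merged."""
    # Times + Ordered: every distinct boundary, sorted.
    points = sorted({p for _, s, e in appts for p in (s, e)})
    merged = []
    # Periods: each pair of neighbouring boundaries is one elementary period.
    for start, end in zip(points, points[1:]):
        # UserCounts: same overlap test as the SQL join condition.
        cnt = len({user for user, s, e in appts if start < e and s < end})
        # Consolidated: extend the previous period when the count is unchanged.
        if merged and merged[-1][1] == start and merged[-1][2] == cnt:
            merged[-1] = (merged[-1][0], end, cnt)
        else:
            merged.append((start, end, cnt))
    return merged


for row in count_intervals(appointments):
    print(*row)
```

Because the periods are visited in order, the merge can be done in a single pass here, whereas the SQL version needs the recursive Consolidated CTE for that step.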
This kind of query is much easier to write if you have a calendar table. But in this example I've built one on the fly using a recursive CTE. The CTE returns the appointment blocks, which we can then join to the appointment data. I couldn't determine the interval pattern in your sample data, so I've shown the results in blocks of one hour. You could modify this section, or define your own within a second table.
Sample Data
/* Table variables make sharing data easier
*/
DECLARE @Sample TABLE
    (
        [User]  VARCHAR(50),
        [Start] DATETIME,
        [End]   DATETIME
    )
;
INSERT INTO @Sample
    (
        [User],
        [Start],
        [End]
    )
VALUES
    ('UserA', '2016-01-15 12:00:00', '2016-01-15 14:00:00'),
    ('UserA', '2016-01-15 15:00:00', '2016-01-15 17:00:00'),
    ('UserB', '2016-01-15 13:00:00', '2016-01-15 15:00:00'),
    ('UserB', '2016-01-15 13:32:00', '2016-01-15 15:00:00'),
    ('UserB', '2016-01-15 15:30:00', '2016-01-15 15:30:00'),
    ('UserB', '2016-01-15 15:45:00', '2016-01-15 16:00:00'),
    ('UserB', '2016-01-15 17:30:00', '2016-01-15 18:00:00')
;
I've used two variables to limit the returned results to just those appointments that fall within the given start and end point.
/* Set a start and end point for the next query
*/
DECLARE @Start DATETIME = '2016-01-15 12:00:00';
DECLARE @End DATETIME = '2016-01-15 18:00:00';
WITH Calendar AS
    (
        /* Anchor returns start of first appointment
        */
        SELECT
            @Start AS [Start],
            DATEADD(SECOND, -1, DATEADD(HOUR, 1, @Start)) AS [End]
        UNION ALL
        /* Recursion, keep adding new records until end of last appointment
        */
        SELECT
            DATEADD(HOUR, 1, [Start]) AS [Start],
            DATEADD(HOUR, 1, [End]) AS [End]
        FROM
            Calendar
        WHERE
            [End] <= @End
    )
SELECT
    c.[Start],
    c.[End],
    COUNT(DISTINCT s.[User]) AS [Count]
FROM
    Calendar AS c
        LEFT OUTER JOIN @Sample AS s ON s.[Start] BETWEEN c.[Start] AND c.[End]
                                     OR s.[End] BETWEEN c.[Start] AND c.[End]
GROUP BY
    c.[Start],
    c.[End]
;
Because an appointment can exceed one hour, it may contribute to more than one row. This explains why 7 sample rows lead to a returned total of 9.

Dense_rank and sum

I have this common table expression
WITH total_hour
AS (
    SELECT
        employee_id,
        SUM(ROUND(CAST(DATEDIFF(MINUTE, start_time, finish_time) AS NUMERIC(18, 0)) / 60, 2)) AS total_h
    FROM Timesheet t
    WHERE t.employee_id = @employee_id
      AND DENSE_RANK() OVER (
            ORDER BY DATEDIFF(DAY, '20130925', date_worked) / 7 DESC ) = @rank
    GROUP BY t.personnel_id
)
This is the sample data:
ID employee_id worked_date start_time finish_time
1 1 2013-09-25 09:00:00 17:30:00
2 1 2013-09-26 07:00:00 17:00:00
8 1 2013-10-01 09:00:00 17:00:00
9 1 2013-10-04 09:00:00 17:00:00
12 1 2013-10-07 09:00:00 17:00:00
13 1 2013-10-30 09:00:00 17:00:00
14 1 2013-10-28 09:00:00 17:00:00
15 1 2013-11-01 09:00:00 17:00:00
Suppose Wednesday is the first day of the week and my base date is 2013-09-25. I want to get the total number of hours worked from 09-25 to 10-01 when @rank is 1, the total from 10-02 to 10-08 when @rank = 2, and so on.
Thanks
To get the number of hours worked for an employee within a particular week, just use a suitable WHERE criteria. No need to use DENSE_RANK or similar windowed functions for this.
Assuming you have a @Week parameter that contains an integer (0 for the current week, 1 for last week, 2 for the week before that, etc.):
SELECT
    employee_id,
    SUM(ROUND(CAST(DATEDIFF(MINUTE, start_time, finish_time) AS NUMERIC(18, 0)) / 60, 2)) AS total_h
FROM
    Timesheet t
WHERE
    t.employee_id = @employee_id AND
    date_worked BETWEEN DATEADD(ww, DATEDIFF(ww,0,GETDATE()) - @Week, 0)
        AND DATEADD(ww, DATEDIFF(ww,0,GETDATE()) - @Week, 0) + 7
GROUP BY
    employee_id
Here, I've used the current date (GETDATE()) as the base date, but you could just replace it with 20130925, if that's what you need.
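The week bucketing the question describes (seven-day blocks counted from the base date, so that 09-25 to 10-01 is week 1) can be checked against the sample data with a short Python sketch. It numbers weeks forward from the base date, as the question asks, rather than backward from today as the query above does. The names `week_number` and `hours_in_week` are hypothetical, and the rows are the question's sample data for employee 1.

```python
from datetime import date, datetime

BASE = date(2013, 9, 25)  # base date from the question (a Wednesday)

# (worked_date, start_time, finish_time) for employee 1, from the sample data.
timesheet = [
    ("2013-09-25", "09:00:00", "17:30:00"),
    ("2013-09-26", "07:00:00", "17:00:00"),
    ("2013-10-01", "09:00:00", "17:00:00"),
    ("2013-10-04", "09:00:00", "17:00:00"),
    ("2013-10-07", "09:00:00", "17:00:00"),
    ("2013-10-30", "09:00:00", "17:00:00"),
    ("2013-10-28", "09:00:00", "17:00:00"),
    ("2013-11-01", "09:00:00", "17:00:00"),
]


def week_number(worked, base=BASE):
    """1 for the first seven days starting at `base`, 2 for the next seven, and so on."""
    return (date.fromisoformat(worked) - base).days // 7 + 1


def hours_in_week(rows, week):
    """Total hours worked in the given week number."""
    total = 0.0
    for worked, start, finish in rows:
        if week_number(worked) == week:
            t0 = datetime.strptime(start, "%H:%M:%S")
            t1 = datetime.strptime(finish, "%H:%M:%S")
            total += (t1 - t0).total_seconds() / 3600
    return total


print(hours_in_week(timesheet, 1))
print(hours_in_week(timesheet, 2))
```

In SQL the equivalent filter would compare DATEDIFF(DAY, base date, date_worked) / 7 with the requested week, which is a plain WHERE condition and needs no window function.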