PostgreSQL Attendance and night shift - sql

i have following table:
dt
type
2022-09-12 21:36:26
WORK_START
2022-09-13 02:00:00
BREAK_START
2022-09-20 06:00:00
WORK_START
2022-09-20 10:00:00
BREAK_START
2022-09-20 10:27:00
BREAK_END
2022-09-20 13:00:00
WORK_END
2022-09-13 06:00:00
WORK_END
2022-09-13 02:30:00
BREAK_END
and query :
SELECT g.tempDatum::date as datum,
MAX(att.dt::time) FILTER (WHERE att.type = 'WORK_START') as work_start
, MAX(att.dt::time) FILTER (WHERE att.type = 'BREAK_START') as break_start
, MAX(att.dt::time) FILTER (WHERE att.type = 'BREAK_END') as break_end
, MAX(att.dt::time) FILTER (WHERE att.type = 'WORK_END') as work_end
FROM generate_series( '2022-09-01','2022-09-30', '1 day'::interval) AS g(tempDatum)
LEFT JOIN att ON att.dt::date = g.tempDatum::date group by g.tempDatum order by
g.tempDatum;
Result is pretty good:
Result photo
except for 2022-09-12 because is a night shift. I want move Break_start + end and work_end to day 2022-09-12 for better result as attendance log.
How achieve this ? Big thanks for any help.

By grouping each work day (start, break start, break end, end) as one we can use crosstab to pivot it using the first work day of each group as the one for the entire day as requested.
select *
from crosstab(
'select min(dte) over(partition by grp), type, tme from
(
select dt::date as dte
,dt::time as tme
,type
,row_number() over(order by dt,type)-case when row_number() over(order by dt,type) <= 4 then row_number() over(order by dt,type) else row_number() over(order by dt,type)-4 end as grp
from t
) t' )
as ct(dt date, WORK_START time, BREAK_START time, BREAK_END time, WORK_END time)
dt
work_start
break_start
break_end
work_end
2022-09-12
21:36:26
02:00:00
02:30:00
06:00:00
2022-09-20
06:00:00
10:00:00
10:27:00
13:00:00
Fiddle

Related

Merge consecutive time record in SQL server 2008

Say I have data like this, timeslots is basically time with 30 mins apart.
Note there is a gap in 2021-12-24 between 15:30 and 16:30.
calender_date
timeslot
timeslot_end
2021-12-24
14:00:00
14:30:00
2021-12-24
14:30:00
15:00:00
2021-12-24
15:00:00
15:30:00
2021-12-24
16:30:00
17:00:00
2021-12-24
17:00:00
17:30:00
2021-12-24
17:30:00
18:00:00
2021-12-30
09:00:00
09:30:00
2021-12-30
09:30:00
10:00:00
I want to merge rows where timeslot_end = next row's timeslot and in the same day, so data would look like this.
calender_date
timeslot
timeslot_end
2021-12-24
14:00:00
15:30:00
2021-12-24
16:30:00
18:00:00
2021-12-30
9:00:00
10:00:00
I have try row numbering with self join,
WITH cte AS
(
SELECT
calender_date
,timeslot
,timeslot_end
,ROW_NUMBER()OVER(ORDER BY [timeslot])rn
FROM #tmp_leave tl
)
SELECT
MIN(a.timeslot) OVER(PARTITION BY a.calender_date, DATEDIFF(minute,a.timeslot_end,
ISNULL(b.timeslot, a.timeslot_end))) AS 'StartTime',
MAX(a.timeslot_end ) OVER(PARTITION BY a.calender_date,
DATEDIFF(minute,a.timeslot_end, ISNULL(b.timeslot, a.timeslot_end))) AS 'EndTime'
FROM cte a
LEFT JOIN cte b
ON a.rn + 1 = b.rn AND a.timeslot_end = b.timeslot
ORDER BY calender_date
But the result isn't quite right, it ignored the gap in 2021-12-24 and return below.
calender_date
timeslot
timeslot_end
2021-12-24
14:00:00
18:00:00
2021-12-30
9:00:00
10:00:00
I have been searching and trying to solve it for a while now, please help, any help is very very appreciated!!
This is a gaps and islands problem. We can approach this by creating pseudo groups for each island of continuous dates/times.
WITH cte AS (
SELECT *, LAG(timeslot_end) OVER
(ORDER BY calendar_date, timeslot) timeslot_end_lag
FROM yourTable
),
cte2 AS (
SELECT *, COUNT(CASE WHEN timeslot_end_lag <> timeslot THEN 1 END)
OVER (ORDER BY calendar_date, timeslot) AS grp
FROM cte
)
SELECT calendar_date,
MIN(timeslot) AS timeslot,
MAX(timeslot_end) AS timeslot_end
FROM cte2
GROUP BY calendar_date, grp
ORDER BY calendar_date;
Demo

How to fill the time gap after grouping date record for months in postgres

I have table records as -
date n_count
2020-02-19 00:00:00 4
2020-07-14 00:00:00 1
2020-07-17 00:00:00 1
2020-07-30 00:00:00 2
2020-08-03 00:00:00 1
2020-08-04 00:00:00 2
2020-08-25 00:00:00 2
2020-09-23 00:00:00 2
2020-09-30 00:00:00 3
2020-10-01 00:00:00 11
2020-10-05 00:00:00 12
2020-10-19 00:00:00 1
2020-10-20 00:00:00 1
2020-10-22 00:00:00 1
2020-11-02 00:00:00 376
2020-11-04 00:00:00 72
2020-11-11 00:00:00 1
I want to be grouped all the records into months for finding month total count which is working, but there is a missing of month. how to fill this gap.
time month_count
"2020-02-01" 4
"2020-07-01" 4
"2020-08-01" 5
"2020-09-01" 5
"2020-10-01" 26
"2020-11-01" 449
This is what I have tried.
SELECT (date_trunc('month', date))::date AS time,
sum(n_count) as month_count
FROM table1
group by time
order by time asc
You can use generate_series() to generate all starts of months between the earliest and latest date available in the table, then bring the table with a left join:
select d.dt, coalesce(sum(t.n_count), 0) as month_count
from (
select generate_series(date_trunc('month', min(date)), date_trunc('month', max(date)), '1 month') as dt
from table1
) as d(dt)
left join table1 t on t.date >= d.dt and t.date < d.dt + interval '1 month'
group by d.dt
order by d.dt
I would simply UNION a date series, generated from MIN and MAX date:
demo:db<>fiddle
WITH cte AS ( -- 1
SELECT
*,
date_trunc('month', date)::date AS time
FROM
t
)
SELECT
time,
SUM(n_count) as month_count --3
FROM (
SELECT
time,
n_count
FROM cte
UNION
SELECT -- 2
generate_series(
(SELECT MIN(time) FROM cte),
(SELECT MAX(time) FROM cte),
interval '1 month'
)::date,
0
) s
GROUP BY time
ORDER BY time
Use CTE to calculate date_trunc only once. Could be left out if you like to call your table twice in the UNION below
Generate monthly date series from MIN to MAX date containing your n_count value = 0. Add it to the table
Do your calculation

Passing data from one table to a block code - Oracle

I have a table dates_2019 with all the weekdays dates for 2019 as below:-
TS_RANGE_BEGIN |TS_RANGE_END
2019-01-01 17:00:00 |2019-01-02 17:00:00
2019-01-02 17:00:00 |2019-01-03 17:00:00
2019-01-03 17:00:00 |2019-01-04 17:00:00
2019-01-04 17:00:00 |2019-01-07 17:00:00
2019-01-07 17:00:00 |2019-01-08 17:00:00
2019-01-08 17:00:00 |2019-01-09 17:00:00
My insert query as below:-
insert into report_2019(ab,app_name,status,sub_count,category,last_modified_timestamp)
with T as (
select id,ab,app_name,status,trunc(last_modified_timestamp),
row_number() over(partition by id, ab order by p_message_id desc, message_id desc) lastest_status_order_id
,p_message_id
,LAST_MODIFIED_TIMESTAMP,reporting_purpose
from (
select
id,
ab
, app_name,
status
,p.message_id p_message_id
,s.message_id
,s.LAST_MODIFIED_TIMESTAMP, reporting_purpose
from table_a s, table_b d, table_c k, table_d t,
table_e p
where s.LAST_MODIFIED_TIMESTAMP > to_timestamp('2019-01-01 17', 'YYYY-MM-DD HH24')
and s.LAST_MODIFIED_TIMESTAMP <= to_timestamp('2019-01-02 17', 'YYYY-MM-DD HH24')
and .....
) a
)
select * from (
select ab, app_name, status, count(*) subtotal,
'Reporting' as v_category,trunc(last_modified_timestamp)
from T where lastest_status_order_id = 1
and LAST_MODIFIED_TIMESTAMP <= to_timestamp('2019-01-02 17', 'YYYY-MM-DD HH24')
group by ab,app_name,status,trunc(last_modified_timestamp))a order by ab, app_name;
The original idea was to join both the tables dates_2019 and T and get the results for the whole year as below:-
where s.LAST_MODIFIED_TIMESTAMP > c.ts_range_begin
and s.LAST_MODIFIED_TIMESTAMP <= c.ts_range_end
....
However the temp tablespace is low and i got the below errors:-
[Error Code: 12801, SQL State: 72000] ORA-12801: error signaled in
parallel query server P004 ORA-01555: snapshot too old: rollback
segment number 30 with name "_SYSSMU30_326584413$" too small
The option now is to insert the data one day at a time in a block.
Could you please help me with the solution?
Thanks,

Calculating concurrency from a set of ranges

I have a set of rows containing a start timestamp and a duration. I want to perform various summaries using the overlap or concurrency.
For example: peak daily concurrency, peak concurrency grouped on another column.
Example data:
timestamp,duration
2016-01-01 12:00:00,300
2016-01-01 12:01:00,300
2016-01-01 12:06:00,300
I would like to know that peak for the period was 12:01:00-12:05:00 at 2 concurrent.
Any ideas on how to achieve this using BigQuery or, less exciting, a Map/Reduce job?
For a per-minute resolution, with session lengths of up to 255 minutes:
SELECT session_minute, COUNT(*) c
FROM (
SELECT start, DATE_ADD(start, i, 'MINUTE') session_minute FROM (
SELECT * FROM (
SELECT TIMESTAMP("2015-04-30 10:14") start, 7 minutes
),(
SELECT TIMESTAMP("2015-04-30 10:15") start, 12 minutes
),(
SELECT TIMESTAMP("2015-04-30 10:15") start, 12 minutes
),(
SELECT TIMESTAMP("2015-04-30 10:18") start, 12 minutes
),(
SELECT TIMESTAMP("2015-04-30 10:23") start, 3 minutes
)
) a
CROSS JOIN [fh-bigquery:public_dump.numbers_255] b
WHERE a.minutes>b.i
)
GROUP BY 1
ORDER BY 1
STEP 1 - First you need find all periods (start and end) with
respective concurrent entries
SELECT ts AS start, LEAD(ts) OVER(ORDER BY ts) AS finish,
SUM(entry) OVER(ORDER BY ts) AS concurrent_entries
FROM (
SELECT ts, SUM(entry)AS entry
FROM
(SELECT ts, 1 AS entry FROM yourTable),
(SELECT DATE_ADD(ts, duration, 'second') AS ts, -1 AS entry FROM yourTable)
GROUP BY ts
HAVING entry != 0
)
ORDER BY ts
Assuming input as below
(SELECT TIMESTAMP('2016-01-01 12:00:00') AS ts, 300 AS duration),
(SELECT TIMESTAMP('2016-01-01 12:01:00') AS ts, 300 AS duration),
(SELECT TIMESTAMP('2016-01-01 12:06:00') AS ts, 300 AS duration),
(SELECT TIMESTAMP('2016-01-01 12:07:00') AS ts, 300 AS duration),
(SELECT TIMESTAMP('2016-01-01 12:10:00') AS ts, 300 AS duration),
(SELECT TIMESTAMP('2016-01-01 12:11:00') AS ts, 300 AS duration)
the output of above query will look somehow like this:
start finish concurrent_entries
2016-01-01 12:00:00 UTC 2016-01-01 12:01:00 UTC 1
2016-01-01 12:01:00 UTC 2016-01-01 12:05:00 UTC 2
2016-01-01 12:05:00 UTC 2016-01-01 12:07:00 UTC 1
2016-01-01 12:07:00 UTC 2016-01-01 12:10:00 UTC 2
2016-01-01 12:10:00 UTC 2016-01-01 12:12:00 UTC 3
2016-01-01 12:12:00 UTC 2016-01-01 12:15:00 UTC 2
2016-01-01 12:15:00 UTC 2016-01-01 12:16:00 UTC 1
2016-01-01 12:16:00 UTC null 0
You might still want to polish above query a little - but mainly it does what you need
STEP 2 - now you can do any stats off of above result
For example peak on whole period:
SELECT
start, finish, concurrent_entries, RANK() OVER(ORDER BY concurrent_entries DESC) AS peak
FROM (
SELECT ts AS start, LEAD(ts) OVER(ORDER BY ts) AS finish,
SUM(entry) OVER(ORDER BY ts) AS concurrent_entries
FROM (
SELECT ts, SUM(entry)AS entry FROM
(SELECT ts, 1 AS entry FROM yourTable),
(SELECT DATE_ADD(ts, duration, 'second') AS ts, -1 AS entry FROM yourTable)
GROUP BY ts
HAVING entry != 0
)
)
ORDER BY peak

Get classroom available hours between date time range

I'm, using Oracle 11g and I have this problem. I couldn't come up with any ideas to solve it yet.
I have a table with occupied classrooms. What I need to find are the hours available between a datetime range. For example, I have rooms A, B and C, the table of occupied classrooms looks like this:
Classroom start end
A 10/10/2013 10:00 10/10/2013 11:30
B 10/10/2013 09:15 10/10/2013 10:45
B 10/10/2013 14:30 10/10/2013 16:00
What I need to get is something like this:
with date time range between '10/10/2013 07:00' and '10/10/2013 21:15'
Classroom avalailable_from available_to
A 10/10/2013 07:00 10/10/2013 10:00
A 10/10/2013 11:30 10/10/2013 21:15
B 10/10/2013 07:00 10/10/2013 09:15
B 10/10/2013 10:45 10/10/2013 14:30
B 10/10/2013 16:00 10/10/2013 21:15
C 10/10/2013 07:00 10/10/2013 21:15
Is there a way I can accomplish that with sql or pl/sql?
I was looking at a solution similar in concept at least to Wernfried's, but I think it's different enough to post as well. The start is the same idea, first generating the possible time slots, and assuming you're looking at 15-minute windows: I'm using CTEs because I think they're clearer than nested selects, particularly with this many levels.
with date_time_range as (
select to_date('10/10/2013 07:00', 'DD/MM/YYYY HH24:MI') as date_start,
to_date('10/10/2013 21:15', 'DD/MM/YYYY HH24:MI') as date_end
from dual
),
time_slots as (
select level as slot_num,
dtr.date_start + (level - 1) * interval '15' minute as slot_start,
dtr.date_start + level * interval '15' minute as slot_end
from date_time_range dtr
connect by level <= (dtr.date_end - dtr.date_start) * (24 * 4) -- 15-minutes
)
select * from time_slots;
This gives you the 57 15-minute slots between the start and end date you specified. The CTE for date_time_range isn't strictly necessary, you could put your dates straight into the time_slots conditions, but you'd have to repeat them and that then introduces a possible failure point (and means binding the same value multiple times, from JDBC or wherever).
Those slots can then be cross-joined to the list of classrooms, which I'm assuming are already in another table, which gives you 171 (3x57) combinations; and those can be compared with existing bookings - once those are eliminated you're left with the 153 15-minute slots that have no booking.
with date_time_range as (...),
time_slots as (...),
free_slots as (
select c.classroom, ts.slot_num, ts.slot_start, ts.slot_end,
lag(ts.slot_end) over (partition by c.classroom order by ts.slot_num)
as lag_end,
lead(ts.slot_start) over (partition by c.classroom order by ts.slot_num)
as lead_start
from time_slots ts
cross join classrooms c
left join occupied_classrooms oc on oc.classroom = c.classroom
and not (oc.occupied_end <= ts.slot_start
or oc.occupied_start >= ts.slot_end)
where oc.classroom is null
)
select * from free_slots;
But then you have to collapse those into contiguous ranges. There are various ways of doing that; here I'm peeking at the previous and next rows to decide if a particular value is the edge of a range:
with date_time_range as (...),
time_slots as (...),
free_slots as (...),
free_slots_extended as (
select fs.classroom, fs.slot_num,
case when fs.lag_end is null or fs.lag_end != fs.slot_start
then fs.slot_start end as slot_start,
case when fs.lead_start is null or fs.lead_start != fs.slot_end
then fs.slot_end end as slot_end
from free_slots fs
)
select * from free_slots_extended
where (fse.slot_start is not null or fse.slot_end is not null);
Now we're down to 12 rows. (The outer where clause eliminates all 141 of the 153 slots from the previous step which are mid-range, since we only care about the edges):
CLASSROOM SLOT_NUM SLOT_START SLOT_END
--------- ---------- ---------------- ----------------
A 1 2013-10-10 07:00
A 12 2013-10-10 10:00
A 19 2013-10-10 11:30
A 57 2013-10-10 21:15
B 1 2013-10-10 07:00
B 9 2013-10-10 09:15
B 16 2013-10-10 10:45
B 30 2013-10-10 14:30
B 37 2013-10-10 16:00
B 57 2013-10-10 21:15
C 1 2013-10-10 07:00
C 57 2013-10-10 21:15
So those represent the edges, but on separate rows, and a final step combines them:
...
select distinct fse.classroom,
nvl(fse.slot_start, lag(fse.slot_start)
over (partition by fse.classroom order by fse.slot_num)) as slot_start,
nvl(fse.slot_end, lead(fse.slot_end)
over (partition by fse.classroom order by fse.slot_num)) as slot_end
from free_slots_extended fse
where (fse.slot_start is not null or fse.slot_end is not null)
Or putting all that together:
with date_time_range as (
select to_date('10/10/2013 07:00', 'DD/MM/YYYY HH24:MI') as date_start,
to_date('10/10/2013 21:15', 'DD/MM/YYYY HH24:MI') as date_end
from dual
),
time_slots as (
select level as slot_num,
dtr.date_start + (level - 1) * interval '15' minute as slot_start,
dtr.date_start + level * interval '15' minute as slot_end
from date_time_range dtr
connect by level <= (dtr.date_end - dtr.date_start) * (24 * 4) -- 15-minutes
),
free_slots as (
select c.classroom, ts.slot_num, ts.slot_start, ts.slot_end,
lag(ts.slot_end) over (partition by c.classroom order by ts.slot_num)
as lag_end,
lead(ts.slot_start) over (partition by c.classroom order by ts.slot_num)
as lead_start
from time_slots ts
cross join classrooms c
left join occupied_classrooms oc on oc.classroom = c.classroom
and not (oc.occupied_end <= ts.slot_start
or oc.occupied_start >= ts.slot_end)
where oc.classroom is null
),
free_slots_extended as (
select fs.classroom, fs.slot_num,
case when fs.lag_end is null or fs.lag_end != fs.slot_start
then fs.slot_start end as slot_start,
case when fs.lead_start is null or fs.lead_start != fs.slot_end
then fs.slot_end end as slot_end
from free_slots fs
)
select distinct fse.classroom,
nvl(fse.slot_start, lag(fse.slot_start)
over (partition by fse.classroom order by fse.slot_num)) as slot_start,
nvl(fse.slot_end, lead(fse.slot_end)
over (partition by fse.classroom order by fse.slot_num)) as slot_end
from free_slots_extended fse
where (fse.slot_start is not null or fse.slot_end is not null)
order by 1, 2;
Which gives:
CLASSROOM SLOT_START SLOT_END
--------- ---------------- ----------------
A 2013-10-10 07:00 2013-10-10 10:00
A 2013-10-10 11:30 2013-10-10 21:15
B 2013-10-10 07:00 2013-10-10 09:15
B 2013-10-10 10:45 2013-10-10 14:30
B 2013-10-10 16:00 2013-10-10 21:15
C 2013-10-10 07:00 2013-10-10 21:15
SQL Fiddle.
It is always a challenge when you like to "select something which does not exist". First you need a list of all available classrooms and times (in interval of 15 Minutes). Then you can select them by skipping the occupied items.
I managed to make a query without any PL/SQL:
CREATE TABLE Table1
(Classroom VARCHAR2(10), start_ts DATE, end_ts DATE);
INSERT INTO Table1 VALUES ('A', TIMESTAMP '2013-01-10 10:00:00', TIMESTAMP '2013-01-10 11:30:00');
INSERT INTO Table1 VALUES ('B', TIMESTAMP '2013-01-10 09:15:00', TIMESTAMP '2013-01-10 10:45:00');
INSERT INTO Table1 VALUES ('B', TIMESTAMP '2013-01-10 14:30:00', TIMESTAMP '2013-01-10 16:00:00');
WITH all_rooms AS
(SELECT CHR(64+LEVEL) AS ROOM FROM dual CONNECT BY LEVEL <= 3),
all_times AS
(SELECT CAST(TIMESTAMP '2013-01-10 07:00:00' + (LEVEL-1) * INTERVAL '15' MINUTE AS DATE) AS TIMES, LEVEL AS SLOT
FROM DUAL
CONNECT BY TIMESTAMP '2013-01-10 07:00:00' + (LEVEL-1) * INTERVAL '15' MINUTE <= TIMESTAMP '2013-01-10 21:15:00'),
all_free_slots AS
(SELECT ROOM, TIMES, SLOT,
CASE SLOT-LAG(SLOT, 1, 0) OVER (PARTITION BY ROOM ORDER BY SLOT)
WHEN 1 THEN 0
ELSE 1
END AS NEW_WINDOW
FROM all_times
CROSS JOIN all_rooms
WHERE NOT EXISTS
(SELECT 1 FROM TABLE1 WHERE ROOM = CLASSROOM AND TIMES BETWEEN START_TS + INTERVAL '1' MINUTE AND END_TS - INTERVAL '1' MINUTE)),
free_time_windows AS
(SELECT ROOM, TIMES, SLOT,
SUM(NEW_WINDOW) OVER (PARTITION BY ROOM ORDER BY SLOT) AS WINDOW_ID
FROM all_free_slots)
SELECT ROOM,
TO_CHAR(MIN(TIMES), 'yyyy-mm-dd hh24:mi') AS free_time_start,
TO_CHAR(MAX(TIMES), 'yyyy-mm-dd hh24:mi') AS free_time_end
FROM free_time_windows
GROUP BY ROOM, WINDOW_ID
HAVING MAX(TIMES) - MIN(TIMES) > 0
ORDER BY ROOM, 2;
ROOM FREE_TIME_START FREE_TIME_END
---- ----------------------------------
A 2013-01-10 07:00 2013-01-10 10:00
A 2013-01-10 11:30 2013-01-10 21:15
B 2013-01-10 07:00 2013-01-10 09:15
B 2013-01-10 10:45 2013-01-10 14:30
B 2013-01-10 16:00 2013-01-10 21:15
C 2013-01-10 07:00 2013-01-10 21:15
In order to understand the query you can split the sub-queries from top, e.g.
WITH all_rooms AS
(SELECT CHR(64+LEVEL) AS ROOM FROM dual CONNECT BY LEVEL <= 3),
all_times AS
(SELECT CAST(TIMESTAMP '2013-01-10 07:00:00' + (LEVEL-1) * INTERVAL '15' MINUTE AS DATE) AS TIMES, LEVEL AS SLOT
FROM DUAL
CONNECT BY TIMESTAMP '2013-01-10 07:00:00' + (LEVEL-1) * INTERVAL '15' MINUTE <= TIMESTAMP '2013-01-10 21:15:00')
SELECT ROOM, TIMES, SLOT,
CASE SLOT-LAG(SLOT, 1, 0) OVER (PARTITION BY ROOM ORDER BY SLOT)
WHEN 1 THEN 0
ELSE 1
END AS NEW_WINDOW
FROM all_times
CROSS JOIN all_rooms
WHERE NOT EXISTS (SELECT 1 FROM TABLE1 WHERE ROOM = CLASSROOM AND TIMES BETWEEN START_TS + INTERVAL '1' MINUTE AND END_TS - INTERVAL '1' MINUTE)
ORDER BY ROOM, SLOT