Split duration per time interval - google-bigquery

My data:
The length of a shift is broken down per time interval of 1 hour (e.g. 19:00:00 represents the time interval 19:00:00-20:00:00)
date        time      duration_in_hours  shift_start_at  shift_end_at
---------------------------------------------------------------------
2022-05-24  19:00:00  3                  19:30:00        22:30:00
2022-05-24  20:00:00  3                  19:30:00        22:30:00
2022-05-24  21:00:00  3                  19:30:00        22:30:00
2022-05-24  22:00:00  3                  19:30:00        22:30:00
Expected outcome:
Split duration_in_hours per time interval
date        time      duration_in_hours  shift_start_at  shift_end_at
---------------------------------------------------------------------
2022-05-24  19:00:00  0.5                19:30:00        22:30:00
2022-05-24  20:00:00  1                  19:30:00        22:30:00
2022-05-24  21:00:00  1                  19:30:00        22:30:00
2022-05-24  22:00:00  0.5                19:30:00        22:30:00
Query used:
SELECT DISTINCT
  date,
  TIME(hour, 0, 0) AS time,
  duration_in_hours,
  shift_start_at,
  shift_end_at,
FROM a, UNNEST(GENERATE_ARRAY(0, 23)) hour
WHERE TIME(hour, 0, 0) >= TIME_TRUNC(shift_start_at, HOUR)
  AND TIME(hour, 0, 0) < shift_end_at
I have used the same query for a different table and it splits the duration_in_hours automatically. It doesn't do the job here and I don't understand why. Any help would be greatly appreciated :)

All the information needed to calculate duration_in_hours is in the same row, so simple arithmetic in a CASE expression is enough.
Consider below:
CASE WHEN shift_start_at > time THEN 1 - EXTRACT(MINUTE FROM shift_start_at) / 60
     WHEN TIME_DIFF(shift_end_at, time, MINUTE) < 60 THEN EXTRACT(MINUTE FROM shift_end_at) / 60
     ELSE 1
END AS duration_in_hours
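Applied to the sample data, this gives 0.5, 1, 1 and 0.5 for the four rows, which matches the expected outcome.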
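As a quick check of the arithmetic outside BigQuery, the three branches of the CASE expression can be mirrored in a few lines of Python. This is only a sketch: the function name `hours_in_slot` and its parameter names are made up for illustration, and times are compared at minute precision.

```python
from datetime import time


def hours_in_slot(slot: time, start_at: time, end_at: time) -> float:
    """Mirror the CASE expression: hours of the one-hour slot covered by the shift."""
    slot_min = slot.hour * 60 + slot.minute
    end_min = end_at.hour * 60 + end_at.minute
    if start_at > slot:              # shift starts part-way through this slot
        return 1 - start_at.minute / 60
    if end_min - slot_min < 60:      # shift ends before this slot is over
        return end_at.minute / 60
    return 1.0                       # slot is fully covered


shift_start, shift_end = time(19, 30), time(22, 30)
slots = [time(h) for h in (19, 20, 21, 22)]
result = [hours_in_slot(s, shift_start, shift_end) for s in slots]
print(result)  # [0.5, 1.0, 1.0, 0.5]
```

Note that these branches assume a shift spans more than one slot. A shift that starts and ends inside the same hour (say 19:10 to 19:40) would hit the first branch and report 1 - 10/60 instead of 0.5, so that case would need its own handling.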

Related

SQL: Split time interval into 1 hour with overlapping minutes split (Bigquery)

This is the data that I have:
date        event_type  interval_start  interval_end  duration_in_min
---------------------------------------------------------------------
2022-06-06  s1          09:05:00        11:45:00      160
2022-06-01  s2          08:00:00        08:17:00      17
2022-05-31  c1          17:55:00        18:08:00      13
2022-04-05  s3          07:58:00        08:46:00      48
...
and this is what I would like to achieve:
interval represents a 1 hour interval (or maybe 59 min and 59 sec to be accurate, in case an event starts/ends at exactly 10:00:00 but it should not occur very often).
date        interval  event_type  interval_start  interval_end  duration_in_min
-------------------------------------------------------------------------------
2022-06-06  09:00:00  s1          09:05:00        11:45:00      55
2022-06-06  10:00:00  s1          09:05:00        11:45:00      60
2022-06-06  11:00:00  s1          09:05:00        11:45:00      45
2022-06-01  08:00:00  s2          08:00:00        08:17:00      17
2022-05-31  17:00:00  c1          17:55:00        18:08:00      5
2022-05-31  18:00:00  c1          17:55:00        18:08:00      8
2022-04-05  07:00:00  s3          07:58:00        08:46:00      2
2022-04-05  08:00:00  s3          07:58:00        08:46:00      46
...
I am struggling to break the data down per hour so that events spanning more than one hour are split, with their minutes assigned to the matching one-hour interval(s).
Any help would be greatly appreciated :)
Consider below approach
select
  date, time(hour, 0, 0) as `interval`,
  event_type, interval_start, interval_end,
  time_diff(
    least(time(hour + 1, 0, 0), interval_end),
    greatest(time(hour, 0, 0), interval_start),
    minute
  ) as duration_in_min
from your_table,
unnest(generate_array(0, 23)) hour
where hour between extract(hour from time(interval_start)) and extract(hour from time(interval_end))
Applied to the sample data in your question, this returns the expected output shown above.
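The core of the query is the overlap between each one-hour slot and the event: the smaller of the two end points minus the larger of the two start points. A minimal Python sketch of that calculation, using minutes since midnight (the helper names `minutes` and `split_per_hour` are hypothetical), reproduces the figures for the s1 and c1 rows:

```python
from datetime import time


def minutes(t: time) -> int:
    """Minutes since midnight."""
    return t.hour * 60 + t.minute


def split_per_hour(interval_start: time, interval_end: time) -> list:
    """Return (hour, minutes_in_that_hour) pairs for every hour the event touches."""
    rows = []
    for hour in range(interval_start.hour, interval_end.hour + 1):
        slot_start, slot_end = hour * 60, (hour + 1) * 60
        # least(slot end, event end) - greatest(slot start, event start)
        overlap = min(slot_end, minutes(interval_end)) - max(slot_start, minutes(interval_start))
        rows.append((hour, overlap))
    return rows


s1 = split_per_hour(time(9, 5), time(11, 45))
c1 = split_per_hour(time(17, 55), time(18, 8))
print(s1)  # [(9, 55), (10, 60), (11, 45)]
print(c1)  # [(17, 5), (18, 8)]
```

In this sketch an event that ends exactly on the hour produces a trailing row with 0 minutes, because the hour range is inclusive at both ends; filtering out zero-minute rows would be one way to handle that.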

How do I retrieve data in Monday to Friday Hourly Format

I have a table that is currently in the following format
ID  Title    CreatedOn
---------------------------------
1   Test 1   2021-04-26 08:00:00
2   Test 2   2021-04-26 10:00:00
3   Test 3   2021-04-27 09:00:00
4   Test 4   2021-04-28 14:00:00
5   Test 5   2021-04-28 16:00:00
6   Test 6   2021-04-28 12:00:00
7   Test 7   2021-04-29 13:00:00
8   Test 8   2021-04-30 06:00:00
9   Test 9   2021-05-17 10:00:00
10  Test 10  2021-05-18 19:00:00
11  Test 11  2021-05-18 23:00:00
12  Test 12  2021-05-19 16:00:00
13  Test 13  2021-05-20 07:00:00
14  Test 14  2021-05-21 14:00:00
15  Test 15  2021-05-21 10:00:00
16  Test 16  2021-04-30 10:00:00
What I would like is a query that tells me how many requests were created on each weekday (Monday to Friday) per hour, so that all the data is aggregated into rows per weekday and hour.
So the query should return
Day        Hour   Count
-----------------------
Monday     08:00  1
Monday     10:00  2
Tuesday    10:00  1
Tuesday    19:00  1
Tuesday    23:00  1
Wednesday  14:00  1
Wednesday  16:00  2
Wednesday  12:00  1
etc. How do I achieve this?
So far I have the following
SELECT
DATENAME(WEEK, CreatedOn) AS Week,
DATEPART(Hour, CreatedOn) AS Hour,
COUNT(*) AS Requests
FROM [Enterprise32].[dbo].[nav_EmailEstimateRequests]
where CreatedOn > '2021-01-01'
GROUP BY DATENAME(WK, CreatedOn),DATEPART(Hour, CreatedOn)
ORDER BY DATENAME(WK, CreatedOn);
But the above query returns each week so Week 1 up until Week 21. Please guide me in the right direction.
Thank you!
You want weekday for the date part:
SELECT DATENAME(WEEKDAY, CreatedOn) AS Weekday,
DATEPART(Hour, CreatedOn) AS Hour,
COUNT(*) AS Requests
FROM [Enterprise32].[dbo].[nav_EmailEstimateRequests]
WHERE CreatedOn > '2021-01-01'
GROUP BY DATENAME(WEEKDAY, CreatedOn), DATEPART(Hour, CreatedOn), DATEPART(WEEKDAY, CreatedOn)
ORDER BY DATEPART(WEEKDAY, CreatedOn), Hour;
Note: I included DATEPART(weekday, ) in the GROUP BY, so you could use it in the ORDER BY.
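To see what grouping by weekday name and hour does, here is a small Python sketch over a few of the sample rows. It is an illustration only, not part of the SQL answer. The Monday-to-Friday filter is an addition of this sketch: the query above groups every day of the week, and none of the sample rows used here fall on a weekend.

```python
from collections import Counter
from datetime import datetime

DAY_NAMES = ["Monday", "Tuesday", "Wednesday", "Thursday", "Friday", "Saturday", "Sunday"]

# A few CreatedOn values taken from the sample table
created_on = [
    "2021-04-26 08:00:00",
    "2021-04-26 10:00:00",
    "2021-04-27 09:00:00",
    "2021-05-17 10:00:00",
    "2021-05-18 19:00:00",
]

counts = Counter()
for raw in created_on:
    ts = datetime.strptime(raw, "%Y-%m-%d %H:%M:%S")
    if ts.weekday() < 5:  # 0 = Monday ... 4 = Friday
        counts[(DAY_NAMES[ts.weekday()], ts.hour)] += 1

print(counts[("Monday", 10)])  # 2
```

The key is the pair (weekday name, hour), which is what GROUP BY DATENAME(WEEKDAY, ...), DATEPART(Hour, ...) collapses the rows onto.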

Google Bigquery - Create time series of number of active records

I'm trying to create a timeseries in google bigquery SQL. My data is a series of time ranges covering the period of activity for that record. Here is an example:
Start                    End
2020-11-01 21:04:00 UTC  2020-11-02 07:15:00 UTC
2020-11-01 21:45:00 UTC  2020-11-02 04:00:00 UTC
2020-11-01 22:00:00 UTC  2020-11-02 09:48:00 UTC
2020-11-01 22:00:00 UTC  2020-11-02 06:00:00 UTC
I wish to create a new table that totals the number of active records within each 15 minute block. "21:00:00" would for example be 21:00:00 to 21:14:59. My desired output for the above would be:
Period               Active_Records
2020-11-01 21:00:00  1
2020-11-01 21:15:00  1
2020-11-01 21:30:00  1
2020-11-01 21:45:00  2
2020-11-01 22:00:00  4
2020-11-01 22:15:00  4
etc until the end of the last active range.
I would also like to be able to generate this on the fly by querying a date range and having it return every 15 minute block in the range and how many active records there were in that period.
Any assistance would be greatly appreciated.
Below is for BigQuery Standard SQL
#standardSQL
select ts as period, count(1) as Active_Records
from unnest((
select generate_timestamp_array(timestamp_trunc(min(start), hour), max(`end`), interval 15 minute)
from `project.dataset.table`
)) ts
join `project.dataset.table`
on not (`end` < ts or start > timestamp_add(ts, interval 15 * 60 - 1 second))
group by ts
Applied to the sample data from your question, the output matches the desired table above.
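The join condition is a standard interval-overlap test: a record is active in a block unless it ended before the block started or started after the block's last second. A rough Python sketch of the same logic on the four sample rows (the function name `active_per_block` is made up for this example):

```python
from datetime import datetime, timedelta

records = [
    (datetime(2020, 11, 1, 21, 4), datetime(2020, 11, 2, 7, 15)),
    (datetime(2020, 11, 1, 21, 45), datetime(2020, 11, 2, 4, 0)),
    (datetime(2020, 11, 1, 22, 0), datetime(2020, 11, 2, 9, 48)),
    (datetime(2020, 11, 1, 22, 0), datetime(2020, 11, 2, 6, 0)),
]
step = timedelta(minutes=15)


def active_per_block(records, step):
    """Count records overlapping each block between the first start (truncated to the hour) and the last end."""
    first = min(start for start, _ in records).replace(minute=0, second=0, microsecond=0)
    last = max(end for _, end in records)
    out = []
    ts = first
    while ts <= last:
        block_end = ts + step - timedelta(seconds=1)  # e.g. 21:00:00 .. 21:14:59
        active = sum(1 for start, end in records if not (end < ts or start > block_end))
        if active:  # an inner join returns no row for a block without matches
            out.append((ts, active))
        ts += step
    return out


blocks = active_per_block(records, step)
for ts, n in blocks[:6]:
    print(ts, n)
```

The first six blocks come out as 1, 1, 1, 2, 4, 4, in line with the desired output in the question.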

SQL Server : get start time and end time with in the multiple night shift

How can I get the start and end time for each employee from this list? I could add the date to each time and take the min and max, but as you can see, row 3 belongs to the next day while still being stored under the same date, because it is a night shift.
I have also included a normal day-shift employee so that the logic can be checked against both cases.
EmployeeId ShiftDate ShiftStartTime ShiftEndTime
-----------------------------------------------------
20040 2017-11-01 21:00:00 23:00:00
20040 2017-11-01 23:00:00 00:30:00
20040 2017-11-01 00:30:00 06:00:00
20124 2017-11-01 09:00:00 16:30:00
20124 2017-11-01 16:30:00 22:00:00
20124 2017-11-01 22:00:00 22:30:00
I need it like below:
EmployeeId ShiftDate ShiftStartTime ShiftEndTime
----------------------------------------------------
20040 2017-11-01 21:00:00 06:00:00
20124 2017-11-01 09:00:00 22:30:00
In a commercial environment we solved this by attaching a flag to each shift. The flag indicates the 'reporting date' of the shift: it has a value of 1 if the reporting (administrative) date is the 'next' day, 0 for the same day, and -1 for the previous day (which we never used; it depends on your scenario).
I modified your table to show a possible SHIFTS table, which should also have a NAME column I guess (like Morning, Afternoon, Day, Night shift etc)
ReportFlag  ShiftStartTime  ShiftEndTime
1           21:00:00        23:00:00
1           23:00:00        00:30:00
0           00:30:00        06:00:00
0           09:00:00        16:30:00
0           16:30:00        22:00:00
1           22:00:00        22:30:00
Notice how I added 1 to say that 'this shift' is actually considered to be on the 'next' day.
You can then also add the flag value (0 or 1) to the date in your queries.

Extract hours as 1 - 48 from half hour interval times

I have the below data.
0:00:00
0:30:00
1:00:00
1:30:00
2:00:00
2:30:00
3:00:00
3:30:00
4:00:00
4:30:00
5:00:00
5:30:00
6:00:00
6:30:00
I can extract the hour using EXTRACT(HOUR FROM TIMESTAMP), but this only gives me 24 hours.
Now I need a different calculation that gives me numbers from 1 to 48 based on the given time.
Something like this:
0:00:00 1
0:30:00 2
1:00:00 3
1:30:00 4
2:00:00 5
2:30:00 6
3:00:00 7
3:30:00 8
4:00:00 9
4:30:00 10
6:00:00 13
6:30:00 14
Note the skipped 11 and 12, for the absent values 5:00 and 5:30.
Is there any way to get that result in PostgreSQL?
Simply use formula 1 + extract(hour from 2 * tm) - it gives your expected result exactly - obligatory SQLFiddle.
This will give you a double precision result, that you can round to whatever you want:
2 * (EXTRACT(HOUR FROM t) + EXTRACT(MINUTE FROM t) / 60) + 1
EDIT:
Or, as @CraigRinger suggested:
EXTRACT(EPOCH FROM t) / 1800 + 1
For the latter, t needs to be TIME, not TIMESTAMP. Use a cast if needed.
UPDATE: This will work with INTERVALs too.
SELECT 2 * (EXTRACT(HOUR FROM t) + EXTRACT(MINUTE FROM t) / 60) + 1,
EXTRACT(EPOCH FROM t) / 1800 + 1
FROM (VALUES (time '4:30:00'), (time '7:24:31'), (time '8:15:00')) as foo(t)
-- results:
?column? | ?column?
---------+---------
10 | 10
15.8 | 15.8172222222222
17.5 | 17.5
But as you wrote, there will be no edge cases (times that are not a multiple of 30 minutes).
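The epoch-based formula just counts 1800-second blocks since midnight and shifts the result to start at 1. A minimal Python sketch of the same arithmetic (the function name `half_hour_slot` is only illustrative):

```python
from datetime import time


def half_hour_slot(t: time) -> float:
    """Seconds since midnight divided into 30-minute blocks, numbered from 1."""
    seconds = t.hour * 3600 + t.minute * 60 + t.second
    return seconds / 1800 + 1


print(half_hour_slot(time(4, 30)))  # 10.0
print(half_hour_slot(time(6, 30)))  # 14.0
print(half_hour_slot(time(8, 15)))  # 17.5
```

Times that are not on a half-hour boundary give a fractional result, as with 8:15 above; rounding down with `int(...)` would be one way to get a whole slot number if that is what is wanted.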
select
case
when date_time_field_of_interest::time >= '00:00:00' and date_time_field_of_interest::time < '00:30:00' then 1
when date_time_field_of_interest::time >= '00:30:00' and date_time_field_of_interest::time < '01:00:00' then 2
....
end
from your_table;