SQL: Split time interval into 1 hour with overlapping minutes split (BigQuery)

This is the data that I have:
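date       | event_type | interval_start | interval_end | duration_in_min
-------------------------------------------------------------------------
2022-06-06 | s1         | 09:05:00       | 11:45:00     | 160
2022-06-01 | s2         | 08:00:00       | 08:17:00     | 17
2022-05-31 | c1         | 17:55:00       | 18:08:00     | 13
2022-04-05 | s3         | 07:58:00       | 08:46:00     | 48
...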
and this is what I would like to achieve:
interval represents a 1 hour interval (or maybe 59 min and 59 sec to be accurate, in case an event starts/ends at exactly 10:00:00 but it should not occur very often).
date       | interval | event_type | interval_start | interval_end | duration_in_min
------------------------------------------------------------------------------------
2022-06-06 | 09:00:00 | s1         | 09:05:00       | 11:45:00     | 55
2022-06-06 | 10:00:00 | s1         | 09:05:00       | 11:45:00     | 60
2022-06-06 | 11:00:00 | s1         | 09:05:00       | 11:45:00     | 45
2022-06-01 | 08:00:00 | s2         | 08:00:00       | 08:17:00     | 17
2022-05-31 | 17:00:00 | c1         | 17:55:00       | 18:08:00     | 5
2022-05-31 | 18:00:00 | c1         | 17:55:00       | 18:08:00     | 8
2022-04-05 | 07:00:00 | s3         | 07:58:00       | 08:46:00     | 2
2022-04-05 | 08:00:00 | s3         | 07:58:00       | 08:46:00     | 46
...
I am struggling to bucket the data per hour, splitting any interval that overlaps several hours into one row per hourly interval.
Any help would be greatly appreciated :)

Consider the approach below:
select
date, time(hour, 0, 0) as `interval`,
event_type, interval_start, interval_end,
time_diff(least(time(hour + 1, 0, 0), interval_end), greatest(time(hour, 0, 0), interval_start), minute) as duration_in_min
from your_table,
unnest(generate_array(0, 23)) hour
where hour between extract(hour from time(interval_start)) and extract(hour from time(interval_end))
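If applied to the sample data in your question, the output matches the expected table: for example, event s1 yields 55, 60 and 45 minutes for the 09:00:00, 10:00:00 and 11:00:00 intervals.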
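The clamping idea in the query (the later of the two starts, the earlier of the two ends) can be sketched outside SQL as well. The following is a minimal Python illustration, not the answer's code; the helper names `split_by_hour` and `parse` are made up for this example, and it assumes an event never crosses midnight.

```python
from datetime import datetime, timedelta

def split_by_hour(start, end):
    """Yield (bucket_start, minutes) for every clock hour that [start, end) overlaps."""
    bucket = start.replace(minute=0, second=0, microsecond=0)
    while bucket < end:
        next_bucket = bucket + timedelta(hours=1)
        # Overlap = earlier of the two ends minus later of the two starts.
        overlap = min(next_bucket, end) - max(bucket, start)
        yield bucket, int(overlap.total_seconds() // 60)
        bucket = next_bucket

def parse(day, clock):
    # Hypothetical helper: combine the date and time columns into one datetime.
    return datetime.strptime(f"{day} {clock}", "%Y-%m-%d %H:%M:%S")

rows = list(split_by_hour(parse("2022-06-06", "09:05:00"),
                          parse("2022-06-06", "11:45:00")))
for bucket, minutes in rows:
    print(bucket.time(), minutes)
# 09:00:00 55
# 10:00:00 60
# 11:00:00 45
```

Because the loop stops as soon as the bucket start reaches the end of the event, an event ending at exactly 10:00:00 does not produce an empty 10:00:00 row here. Working with full datetimes also sidesteps the 23:00:00 bucket, whose upper bound would be 24:00:00 and so cannot be written as a time of day.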

Related

CASE in WHERE Clause in Snowflake

I am trying to do a case statement within the where clause in snowflake but I’m not quite sure how should I go about doing it.
What I’m trying to do is, if my current month is Jan, then the where clause for date is between start of previous year and today. If not, the where clause for date would be between start of current year and today.
WHERE
CASE MONTH(CURRENT_DATE()) = 1 THEN DATE BETWEEN DATE_TRUNC(‘YEAR’, DATEADD(YEAR, -1, CURRENT_DATE())) AND CURRENT_DATE()
CASE MONTH(CURRENT_DATE()) != 1 THEN DATE BETWEEN DATE_TRUNC(‘YEAR’, CURRENT_DATE()) AND CURRENT_DATE()
END
Appreciate any help on this!
Use a CASE expression that returns -1 if the current month is January and 0 for any other month, pass it to DATEADD() to get a date in the previous or the current year, and truncate that date with DATE_TRUNC():
WHERE DATE BETWEEN
DATE_TRUNC('YEAR', DATEADD(YEAR, CASE WHEN MONTH(CURRENT_DATE()) = 1 THEN -1 ELSE 0 END, CURRENT_DATE()))
AND
CURRENT_DATE()
I suspect that you don't even need to use CASE here:
WHERE
(MONTH(CURRENT_DATE()) = 1 AND
DATE BETWEEN DATE_TRUNC('YEAR', DATEADD(YEAR, -1, CURRENT_DATE())) AND
CURRENT_DATE()) OR
(MONTH(CURRENT_DATE()) != 1 AND
DATE BETWEEN DATE_TRUNC('YEAR', CURRENT_DATE()) AND CURRENT_DATE())
The other answers are quite good, but the answer can be even simpler.
Here is a small table to break down what is happening:
select
row_number() over (order by null) - 1 as rn,
dateadd('day', rn * 5, date_trunc('year',current_date())) as pretend_current_date,
DATEADD(YEAR, -1, pretend_current_date) as pcd_sub1,
month(pretend_current_date) as pcd_month,
DATE_TRUNC(year, iff(pcd_month = 1, pcd_sub1, pretend_current_date)) as _from,
pretend_current_date as _to
from table(generator(ROWCOUNT => 30))
order by rn;
this shows:
RN | PRETEND_CURRENT_DATE | PCD_SUB1   | PCD_MONTH | _FROM      | _TO
------------------------------------------------------------------------------
0  | 2022-01-01           | 2021-01-01 | 1         | 2021-01-01 | 2022-01-01
1  | 2022-01-06           | 2021-01-06 | 1         | 2021-01-01 | 2022-01-06
2  | 2022-01-11           | 2021-01-11 | 1         | 2021-01-01 | 2022-01-11
3  | 2022-01-16           | 2021-01-16 | 1         | 2021-01-01 | 2022-01-16
4  | 2022-01-21           | 2021-01-21 | 1         | 2021-01-01 | 2022-01-21
5  | 2022-01-26           | 2021-01-26 | 1         | 2021-01-01 | 2022-01-26
6  | 2022-01-31           | 2021-01-31 | 1         | 2021-01-01 | 2022-01-31
7  | 2022-02-05           | 2021-02-05 | 2         | 2022-01-01 | 2022-02-05
8  | 2022-02-10           | 2021-02-10 | 2         | 2022-01-01 | 2022-02-10
9  | 2022-02-15           | 2021-02-15 | 2         | 2022-01-01 | 2022-02-15
10 | 2022-02-20           | 2021-02-20 | 2         | 2022-01-01 | 2022-02-20
11 | 2022-02-25           | 2021-02-25 | 2         | 2022-01-01 | 2022-02-25
12 | 2022-03-02           | 2021-03-02 | 3         | 2022-01-01 | 2022-03-02
13 | 2022-03-07           | 2021-03-07 | 3         | 2022-01-01 | 2022-03-07
14 | 2022-03-12           | 2021-03-12 | 3         | 2022-01-01 | 2022-03-12
15 | 2022-03-17           | 2021-03-17 | 3         | 2022-01-01 | 2022-03-17
16 | 2022-03-22           | 2021-03-22 | 3         | 2022-01-01 | 2022-03-22
17 | 2022-03-27           | 2021-03-27 | 3         | 2022-01-01 | 2022-03-27
18 | 2022-04-01           | 2021-04-01 | 4         | 2022-01-01 | 2022-04-01
19 | 2022-04-06           | 2021-04-06 | 4         | 2022-01-01 | 2022-04-06
20 | 2022-04-11           | 2021-04-11 | 4         | 2022-01-01 | 2022-04-11
21 | 2022-04-16           | 2021-04-16 | 4         | 2022-01-01 | 2022-04-16
22 | 2022-04-21           | 2021-04-21 | 4         | 2022-01-01 | 2022-04-21
23 | 2022-04-26           | 2021-04-26 | 4         | 2022-01-01 | 2022-04-26
24 | 2022-05-01           | 2021-05-01 | 5         | 2022-01-01 | 2022-05-01
25 | 2022-05-06           | 2021-05-06 | 5         | 2022-01-01 | 2022-05-06
26 | 2022-05-11           | 2021-05-11 | 5         | 2022-01-01 | 2022-05-11
27 | 2022-05-16           | 2021-05-16 | 5         | 2022-01-01 | 2022-05-16
28 | 2022-05-21           | 2021-05-21 | 5         | 2022-01-01 | 2022-05-21
29 | 2022-05-26           | 2021-05-26 | 5         | 2022-01-01 | 2022-05-26
Your logic asks: is the current date in January? If so, take the prior year and truncate to the start of that year; otherwise truncate the current date to the start of its year. The result is the lower bound of the BETWEEN test.
This is the same as subtracting one month from the current date and truncating the result to the year.
Thus there is no need for any IFF or CASE:
WHERE date BETWEEN DATE_TRUNC(year, DATEADD(month,-1, CURRENT_DATE())) AND CURRENT_DATE()
If you want to drop some parentheses, Snowflake also accepts CURRENT_DATE without them, so it can be shorter still:
WHERE date BETWEEN DATE_TRUNC(year, DATEADD(month,-1, CURRENT_DATE)) AND CURRENT_DATE
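To check that "subtract one month, then truncate to the year" really gives the same lower bound as the January CASE, the two rules can be compared day by day. This is a small Python sketch of the logic only, not Snowflake code; the function names are invented for the illustration.

```python
from datetime import date, timedelta

def start_with_case(today):
    # CASE version: in January use the start of the previous year,
    # otherwise the start of the current year.
    year = today.year - 1 if today.month == 1 else today.year
    return date(year, 1, 1)

def start_with_month_shift(today):
    # Shortcut version: step back one calendar month, then keep only the year.
    # Counting months since year 0 avoids any day-of-month overflow.
    months = today.year * 12 + (today.month - 1) - 1
    return date(months // 12, 1, 1)

# Compare both rules for every day of four consecutive years.
mismatches = []
day = date(2020, 1, 1)
while day <= date(2023, 12, 31):
    if start_with_case(day) != start_with_month_shift(day):
        mismatches.append(day)
    day += timedelta(days=1)

print(mismatches)  # []
```

Only January dates land in the previous year after the one-month shift, which is exactly the case the CASE expression singles out.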

Split duration per time interval

My data:
The length of a shift is broken down per time interval of 1 hour (e.g. 19:00:00 represents the time interval 19:00:00-20:00:00)
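date       | time     | duration_in_hours | shift_start_at | shift_end_at
-------------------------------------------------------------------------
2022-05-24 | 19:00:00 | 3                 | 19:30:00       | 22:30:00
2022-05-24 | 20:00:00 | 3                 | 19:30:00       | 22:30:00
2022-05-24 | 21:00:00 | 3                 | 19:30:00       | 22:30:00
2022-05-24 | 22:00:00 | 3                 | 19:30:00       | 22:30:00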
Expected outcome:
Split duration_in_hours per time interval
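date       | time     | duration_in_hours | shift_start_at | shift_end_at
-------------------------------------------------------------------------
2022-05-24 | 19:00:00 | 0.5               | 19:30:00       | 22:30:00
2022-05-24 | 20:00:00 | 1                 | 19:30:00       | 22:30:00
2022-05-24 | 21:00:00 | 1                 | 19:30:00       | 22:30:00
2022-05-24 | 22:00:00 | 0.5               | 19:30:00       | 22:30:00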
Query used:
SELECT DISTINCT
date,
TIME(hour, 0, 0) AS time,
duration_in_hours,
shift_start_at,
shift_end_at,
FROM a, UNNEST(GENERATE_ARRAY(0, 23)) hour
WHERE TIME(hour, 0, 0) >= TIME_TRUNC(shift_start_at, HOUR) AND TIME(hour, 0, 0) < shift_end_at
I have used the same query for a different table and it splits the duration_in_hours automatically. It doesn't do the job here and I don't understand why. Any help would be greatly appreciated :)
All the information needed to calculate duration_in_hours exists in the same row, so I think you can compute it with simple arithmetic in a CASE expression.
Consider below:
CASE WHEN start_at > time THEN 1 - EXTRACT(MINUTE FROM start_at) / 60
WHEN TIME_DIFF(end_at, time, MINUTE) < 60 THEN EXTRACT(MINUTE FROM end_at) / 60
ELSE 1
END AS duration_in_hours
Applied to the four sample rows, this yields 0.5, 1, 1 and 0.5.
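The same per-row calculation can also be written as a single overlap between the one-hour slot and the shift. Below is a minimal Python sketch of that idea rather than a translation of the CASE expression; `hours_in_slot` is a hypothetical helper, and it assumes a shift does not cross midnight.

```python
from datetime import datetime, timedelta

def hours_in_slot(slot, shift_start, shift_end):
    """Return how many hours of the one-hour slot starting at `slot` the shift covers."""
    fmt = "%H:%M:%S"
    slot_start = datetime.strptime(slot, fmt)
    slot_end = slot_start + timedelta(hours=1)
    start = datetime.strptime(shift_start, fmt)
    end = datetime.strptime(shift_end, fmt)
    # Overlap = earlier end minus later start; negative means no overlap at all.
    overlap = min(slot_end, end) - max(slot_start, start)
    return max(overlap.total_seconds(), 0) / 3600

slots = ["19:00:00", "20:00:00", "21:00:00", "22:00:00"]
durations = [hours_in_slot(s, "19:30:00", "22:30:00") for s in slots]
print(durations)  # [0.5, 1.0, 1.0, 0.5]
```

One reason to prefer the overlap form: it also covers a shift that starts and ends inside the same slot, for example 19:10:00 to 19:40:00 gives 0.5 for the 19:00:00 slot.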

How do I retrieve data in Monday to Friday Hourly Format

I have a table that is currently in the following format
ID | Title   | CreatedOn
-------------------------------------
1  | Test 1  | 2021-04-26 08:00:00
2  | Test 2  | 2021-04-26 10:00:00
3  | Test 3  | 2021-04-27 09:00:00
4  | Test 4  | 2021-04-28 14:00:00
5  | Test 5  | 2021-04-28 16:00:00
6  | Test 6  | 2021-04-28 12:00:00
7  | Test 7  | 2021-04-29 13:00:00
8  | Test 8  | 2021-04-30 06:00:00
9  | Test 9  | 2021-05-17 10:00:00
10 | Test 10 | 2021-05-18 19:00:00
11 | Test 11 | 2021-05-18 23:00:00
12 | Test 12 | 2021-05-19 16:00:00
13 | Test 13 | 2021-05-20 07:00:00
14 | Test 14 | 2021-05-21 14:00:00
15 | Test 15 | 2021-05-21 10:00:00
16 | Test 16 | 2021-04-30 10:00:00
What I would like is a query that tells me how many requests were created per hour on each weekday, Monday to Friday. So all the data should be aggregated into rows per weekday and hour.
So the query should return
Day       | Hour  | Count
-------------------------
Monday    | 08:00 | 1
Monday    | 10:00 | 2
Tuesday   | 10:00 | 1
Tuesday   | 19:00 | 1
Tuesday   | 23:00 | 1
Wednesday | 14:00 | 1
Wednesday | 16:00 | 2
Wednesday | 12:00 | 1
etc.. How do I achieve this?
So far I have the following
SELECT
DATENAME(WEEK, CreatedOn) AS Week,
DATEPART(Hour, CreatedOn) AS Hour,
COUNT(*) AS Requests
FROM [Enterprise32].[dbo].[nav_EmailEstimateRequests]
where CreatedOn > '2021-01-01'
GROUP BY DATENAME(WK, CreatedOn),DATEPART(Hour, CreatedOn)
ORDER BY DATENAME(WK, CreatedOn);
But the above query returns each week so Week 1 up until Week 21. Please guide me in the right direction.
Thank you!
You want weekday for the date part:
SELECT DATENAME(WEEKDAY, CreatedOn) AS Weekday,
DATEPART(Hour, CreatedOn) AS Hour,
COUNT(*) AS Requests
FROM [Enterprise32].[dbo].[nav_EmailEstimateRequests]
WHERE CreatedOn > '2021-01-01'
GROUP BY DATENAME(WEEKDAY, CreatedOn), DATEPART(Hour, CreatedOn), DATEPART(WEEKDAY, CreatedOn)
ORDER BY DATEPART(WEEKDAY, CreatedOn), Hour;
Note: I included DATEPART(weekday, ) in the GROUP BY, so you could use it in the ORDER BY.
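The grouping itself (weekday first, then hour) can be illustrated in a few lines of Python using the sample rows from the question. This is only a sketch of the aggregation, not SQL Server code; the explicit Monday to Friday filter is an assumption taken from the question's wording and is not part of the query above.

```python
from collections import Counter
from datetime import datetime

# CreatedOn values copied from the sample table in the question.
created_on = [
    "2021-04-26 08:00:00", "2021-04-26 10:00:00", "2021-04-27 09:00:00",
    "2021-04-28 14:00:00", "2021-04-28 16:00:00", "2021-04-28 12:00:00",
    "2021-04-29 13:00:00", "2021-04-30 06:00:00", "2021-05-17 10:00:00",
    "2021-05-18 19:00:00", "2021-05-18 23:00:00", "2021-05-19 16:00:00",
    "2021-05-20 07:00:00", "2021-05-21 14:00:00", "2021-05-21 10:00:00",
    "2021-04-30 10:00:00",
]

counts = Counter()
for text in created_on:
    ts = datetime.strptime(text, "%Y-%m-%d %H:%M:%S")
    if ts.weekday() < 5:  # Monday=0 ... Friday=4; weekends are skipped
        counts[(ts.weekday(), ts.hour)] += 1

names = ["Monday", "Tuesday", "Wednesday", "Thursday", "Friday"]
# Sorting the (weekday, hour) keys orders rows by day of week, then by hour.
for (weekday, hour), n in sorted(counts.items()):
    print(f"{names[weekday]} {hour:02d}:00 {n}")
```

Keying on the weekday number rather than its name is what keeps Monday ahead of Tuesday in the output, the same reason the query groups by DATEPART(WEEKDAY, ...) in addition to DATENAME.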

Total time calculation in a sql query for a day where time in 24 hour format as hhmm

I have a table with date (date), left time (varchar2(4)) and arrival time (varchar2(4)). Times are stored in 24-hour format as hhmm. If a person travels 3 times a day, what query would calculate the total travel time in a day?
I am using oracle 11g. Kindly help. Thank you.
Convert the value to a number and report in minutes:
select to_number(substr(time, 1, 2))*60 + to_number(substr(time, 3, 2)) as minutes
Your query would look something like:
select person, sum(to_number(substr(time, 1, 2))*60 + to_number(substr(time, 3, 2))) as minutes
from t
group by person;
I see no reason to convert this back to a string -- or to even store the value as a string instead of as a number. But if you need to, you can reverse the process to get a string.
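As a quick illustration of the conversion, here is a minimal Python sketch that turns hhmm strings into minutes and sums arrival minus departure per trip. The trip values are made up for the example, and the sketch assumes no trip crosses midnight.

```python
def hhmm_to_minutes(value):
    """Convert a 4-character 'hhmm' string such as '0930' to minutes since midnight."""
    return int(value[:2]) * 60 + int(value[2:4])

# Hypothetical trips for one person on one day: (left_time, arrival_time).
trips = [("0800", "0845"), ("1210", "1300"), ("1730", "1815")]

# Travel time per trip is arrival minus departure, both in minutes.
total = sum(hhmm_to_minutes(arrival) - hhmm_to_minutes(left) for left, arrival in trips)
print(total)  # 140
```

In SQL the same subtraction would be applied inside the SUM, once for the arrival column and once for the left column.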
There are two cases. If you want to sum the time by date only, it can be done as:-
select curr_date,
sum(24 * (to_date(arrival_time, 'HH24:mi:ss')- to_date(left_time, 'HH24:mi:ss'))) as difference
from sql_prac group by curr_date,arrival_time,left_time;
The sample output is as follows:-
select curr_date,left_time,arrival_time from sql_prac;
CURR_DATE LEFT_TIME ARRIVAL_TIME
--------- -------------------- --------------------
30-JUN-17 00:00:00 15:00:00
30-JUL-17 03:30:00 11:30:00
30-AUG-17 03:00:00 12:30:00
30-SEP-17 04:00:00 17:00:00
30-JUN-17 00:00:00 15:00:00
30-JUL-17 03:30:00 11:30:00
30-AUG-17 03:00:00 12:30:00
30-SEP-17 04:00:00 17:00:00
30-SEP-17 04:00:00 17:00:00
9 rows selected
select curr_date,sum(24 * (to_date(arrival_time, 'HH24:mi:ss')- to_date(left_time, 'HH24:mi:ss'))) as difference
from sql_prac group by curr_date,arrival_time,left_time;
CURR_DATE DIFFERENCE
--------- ----------
30-JUN-17 30
30-JUL-17 16
30-SEP-17 39
30-AUG-17 19
If you want to sum it by person and date then it can be done as:-
select dept,curr_date,sum(24 * (to_date(arrival_time, 'HH24:mi:ss')- to_date(left_time, 'HH24:mi:ss'))) as difference
from sql_prac group by dept,curr_date,arrival_time,left_time order by Dept;
The sample output is as follows:-
Data in table is:-
select dept,curr_date,left_time,arrival_time from sql_prac;
DEPT CURR_DATE LEFT_TIME ARRIVAL_TIME
-------------------- --------- -------------------- --------------------
A 30-SEP-17 04:00:00 17:00:00
B 30-SEP-17 04:00:00 17:00:00
C 30-AUG-17 03:00:00 12:30:00
D 30-DEC-17 04:00:00 17:00:00
A 30-SEP-17 04:00:00 17:00:00
B 30-JUL-17 03:30:00 11:30:00
C 30-AUG-17 03:00:00 12:30:00
D 30-SEP-17 04:00:00 17:00:00
R 30-SEP-17 04:00:00 17:00:00
Data fetched using the query
select dept,curr_date,sum(24 * (to_date(arrival_time, 'HH24:mi:ss')- to_date(left_time, 'HH24:mi:ss'))) as difference
from sql_prac group by dept,curr_date,arrival_time,left_time order by Dept;
DEPT CURR_DATE DIFFERENCE
-------------------- --------- ----------
A 30-SEP-17 26
B 30-JUL-17 8
B 30-SEP-17 13
C 30-AUG-17 19
D 30-SEP-17 13
D 30-DEC-17 13
R 30-SEP-17 13

SQL code for Comparing date fields in different rows and combining the results

I need help for proper Oracle SQL code to combine rows for a crystal reports command object. This is a part of the bigger query I'm working on and got stuck for the past couple of days.
for eg. if the columns are like below
PatId In_time Out_time
151 01/01/2012 07:00:00 am 01/01/2012 10:00:00 am
151 01/01/2012 11:00:00 am 01/02/2012 08:00:00 am
151 01/02/2012 11:00:00 am 01/02/2012 01:00:00 pm
151 01/03/2012 08:00:00 am 01/03/2012 03:00:00 pm
151 01/06/2012 03:30:00 pm 01/09/2012 07:00:00 am
167 01/03/2012 01:30:00 pm 01/09/2012 07:00:00 am
167 01/13/2012 03:30:00 pm 01/14/2012 07:00:00 am
167 01/14/2012 11:30:00 am 01/15/2012 11:30:00 am
167 01/18/2012 12:00:00 pm 01/19/2012 03:00:00 am
Within a PatId, the code should compare the Out_time of one row to the In_time of the next row, and check whether the time gap is greater than 48 hours. If not, then it is considered part of the same visit. I want one result row per PatID & visit, with min(In_time) and max(Out_time). The time span of the visit (result row) itself may be greater than 48 hours.
For this example, for PatId 151 the time difference between the Out_time of the 1st row and the In_time of the 2nd row is less than 48 hours. The difference between the Out_time of the 2nd row and the In_time of the 3rd row, as well as between the 3rd and 4th rows, is also less than 48 hours. After this, the gap between the Out_time of the 4th row and the In_time of the 5th row is greater than 48 hours. The result for PatId 151 should be as below, and the same applies to PatId 167: the chaining should continue until a gap greater than 48 hours is found.
So the result for the above table should be displayed as,
PatId In_time Out_time
151 01/01/2012 07:00:00 am 01/03/2012 03:00:00 pm
151 01/06/2012 03:30:00 pm 01/09/2012 07:00:00 am
167 01/03/2012 01:30:00 pm 01/09/2012 07:00:00 am
167 01/13/2012 03:30:00 pm 01/15/2012 11:30:00 am
167 01/18/2012 12:00:00 pm 01/19/2012 03:00:00 am
I could not get the logic on how to compare and merge rows.
Thanks in Advance, Abhi
General example of subtracting time - copy/paste to see the output. This example will give you differences in hours, minutes, seconds between two dates. The basic formula is (end_date - start_date) * 86400 (number of seconds in 24 hrs)...:
SELECT trunc(mydate / 3600) hr
, trunc(mod(mydate, 3600) / 60) mnt
, trunc(mod(mydate, 60)) sec
FROM
(
SELECT (to_date('01/03/2012 10:00:00', 'mm/dd/yyyy hh24:mi:ss') -
to_date('01/01/2012 07:00:00', 'mm/dd/yyyy hh24:mi:ss')) * 86400 mydate
FROM dual
)
/
HR | MNT | SEC
---------------
51 | 0 | 0
You need to check your example and logic. I could not understand what needs to be compared with what...
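For reference, the chaining rule described in the question (start a new visit only when the gap between one row's Out_time and the next row's In_time exceeds 48 hours) can be sketched as follows. This is a plain Python illustration under stated assumptions, not Oracle SQL; `merge_visits` is a hypothetical helper, and it compares each In_time against the latest Out_time seen so far in the visit, which is a slight generalization of "the previous row".

```python
from datetime import datetime, timedelta

def merge_visits(stays, max_gap=timedelta(hours=48)):
    """Merge (pat_id, in_time, out_time) rows into visits per patient."""
    visits = []
    for pat_id, t_in, t_out in sorted(stays):
        last = visits[-1] if visits else None
        if last and last[0] == pat_id and t_in - last[2] <= max_gap:
            # Gap is at most 48 hours: extend the current visit.
            last[2] = max(last[2], t_out)
        else:
            # New patient or gap above 48 hours: start a new visit.
            visits.append([pat_id, t_in, t_out])
    return [tuple(v) for v in visits]

# Sample rows from the question, written in 24-hour time.
stays = [
    (151, datetime(2012, 1, 1, 7, 0), datetime(2012, 1, 1, 10, 0)),
    (151, datetime(2012, 1, 1, 11, 0), datetime(2012, 1, 2, 8, 0)),
    (151, datetime(2012, 1, 2, 11, 0), datetime(2012, 1, 2, 13, 0)),
    (151, datetime(2012, 1, 3, 8, 0), datetime(2012, 1, 3, 15, 0)),
    (151, datetime(2012, 1, 6, 15, 30), datetime(2012, 1, 9, 7, 0)),
    (167, datetime(2012, 1, 3, 13, 30), datetime(2012, 1, 9, 7, 0)),
    (167, datetime(2012, 1, 13, 15, 30), datetime(2012, 1, 14, 7, 0)),
    (167, datetime(2012, 1, 14, 11, 30), datetime(2012, 1, 15, 11, 30)),
    (167, datetime(2012, 1, 18, 12, 0), datetime(2012, 1, 19, 3, 0)),
]

visits = merge_visits(stays)
for pat_id, t_in, t_out in visits:
    print(pat_id, t_in, t_out)
```

On the sample rows this produces five visits, two for PatId 151 and three for PatId 167, matching the expected result in the question.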