(BigQuery) How to calculate the number of hours an event is happening across multiple dates - sql

So my data looks like this:
DATE TEMPERATURE
2012-01-13 23:15:00 UTC 0
2012-01-14 01:35:00 UTC 5
2012-01-14 02:15:00 UTC 6
2012-01-14 03:15:00 UTC 8
2012-01-14 04:15:00 UTC 0
2012-01-14 04:55:00 UTC 0
2012-01-14 05:15:00 UTC -2
2012-01-14 05:35:00 UTC 0
I am trying to calculate the amount of time a zip code temperature will drop to 0 or below on any given day. On the 13th, it only happens for a very short amount of time so we don't really care. I want to know how to calculate the number of minutes this happens on the 14th, since it looks like a significantly (and consistently) cold day.
I want the query to add two more columns.
The first added column would be the time difference between consecutive rows. So row 3 - row 2 = 40 mins and row 4 - row 3 = 60 mins.
The second column would be a running total, over the whole day, of the minutes during which the temperature was at 0 or below. Here rows 2-4 would be ignored. For rows 5-8, the total time that the temperature was 0 or below would be about 90 mins.
It should end up looking like this:
DATE TEMPERATURE MINUTES_DIFFERENCE TOTAL_MINUTES
2012-01-13 23:15:00 UTC 0 0 0
2012-01-14 01:35:00 UTC 5 140 0
2012-01-14 02:15:00 UTC 6 40 0
2012-01-14 03:15:00 UTC 8 60 0
2012-01-14 04:15:00 UTC 0 60 60
2012-01-14 04:55:00 UTC 0 30 90
2012-01-14 05:15:00 UTC -2 20 110
2012-01-14 05:35:00 UTC 0 20 130

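Use the query below:
select *,
  -- running total of the gaps across all rows
  sum(minutes_difference) over(order by date) total_minutes
from (
  select *,
    -- minutes since the previous row; 0 for the first row, which has no predecessor
    ifnull(timestamp_diff(timestamp(date), lag(timestamp(date)) over(order by date), minute), 0) as minutes_difference
  from your_table
)
Applied to the sample data in your question, this returns the gap in minutes between consecutive rows together with its running total.
Update, to answer the updated question: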
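select * except(new_grp, grp),
  -- within each group, add up the gaps only while the temperature is at 0 or below
  sum(if(temperature > 0, 0, minutes_difference)) over(partition by grp order by date) total_minutes
from (
  -- number the groups: grp increases by 1 at every row flagged as new_grp
  select *, countif(new_grp) over(order by date) as grp
  from (
    select *,
      ifnull(timestamp_diff(timestamp(date), lag(timestamp(date)) over(order by date), minute), 0) as minutes_difference,
      -- true when the temperature crosses 0 relative to the previous row (and for the first row)
      ifnull(((temperature <= 0) and (lag(temperature) over(order by date) > 0)) or
        ((temperature > 0) and (lag(temperature) over(order by date) <= 0)), true) as new_grp
    from your_table
  )
)
With this, total_minutes accumulates only while the temperature stays at 0 or below, and starts again from 0 each time the temperature crosses 0.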
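To sanity-check the grouping logic outside BigQuery, here is a minimal Python sketch of the same idea, run on the sample rows from the question. The names `ROWS` and `cold_minutes` are made up for this illustration, and `itertools.groupby` stands in for the `new_grp` / `grp` columns; it is not a translation of how BigQuery evaluates the query.

```python
from datetime import datetime
from itertools import groupby

# Sample rows from the question: (timestamp, temperature).
ROWS = [
    ("2012-01-13 23:15:00", 0),
    ("2012-01-14 01:35:00", 5),
    ("2012-01-14 02:15:00", 6),
    ("2012-01-14 03:15:00", 8),
    ("2012-01-14 04:15:00", 0),
    ("2012-01-14 04:55:00", 0),
    ("2012-01-14 05:15:00", -2),
    ("2012-01-14 05:35:00", 0),
]


def cold_minutes(rows):
    """Return (timestamp, temperature, minutes_difference, total_minutes) tuples."""
    parsed = sorted(
        (datetime.strptime(ts, "%Y-%m-%d %H:%M:%S"), temp) for ts, temp in rows
    )

    # Equivalent of LAG(): minutes since the previous row, 0 for the first row.
    diffs = [0] + [
        int((cur[0] - prev[0]).total_seconds() // 60)
        for prev, cur in zip(parsed, parsed[1:])
    ]
    enriched = [(ts, temp, diff) for (ts, temp), diff in zip(parsed, diffs)]

    result = []
    # groupby() starts a new group whenever "temperature <= 0" flips,
    # which plays the role of the new_grp / grp columns in the SQL.
    for is_cold, group in groupby(enriched, key=lambda r: r[1] <= 0):
        running = 0
        for ts, temp, diff in group:
            if is_cold:
                running += diff  # warm groups keep a total of 0
            result.append((ts, temp, diff, running))
    return result


if __name__ == "__main__":
    for row in cold_minutes(ROWS):
        print(*row)
```

Computed from the timestamps themselves, the gap between 04:15 and 04:55 is 40 minutes, so the last four totals come out as 60, 100, 120 and 140.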

Related

SUM of production counts for "overnight work shift" in MS SQL (2019)

I need some help regarding sum of production count for overnight shifts.
The table just contains a timestamp (that is automatically generated by SQL Server during INSERT), the number of OK produced pieces and the number of NOT OK produced pieces at that given timestamp.
CREATE TABLE [machine1](
[timestamp] [datetime] NOT NULL,
[OK] [int] NOT NULL,
[NOK] [int] NOT NULL
)
ALTER TABLE [machine1] ADD DEFAULT (getdate()) FOR [timestamp]
The table holds values like these (just an example, there are hundreds of lines each day and the time stamps are not fixed like each hour or each 30mins):
timestamp                 OK    NOK
2022-08-01 05:30:00.000   15    1
2022-08-01 06:30:00.000   18    3
...                       ...   ...
2022-08-01 21:30:00.000   10    12
2022-08-01 22:30:00.000   0     3
...                       ...   ...
2022-08-01 23:59:00.000   1     2
2022-08-02 00:01:00.000   7     0
...                       ...   ...
2022-08-02 05:30:00.000   12    4
2022-08-02 06:30:00.000   9     3
The production works in shifts like so:
morning shift: 6:00 -> 14:00
afternoon shift: 14:00 -> 22:00
night shift: 22:00 -> 6:00 the next day
I have managed to get sums for the morning and afternoon shifts without issues but I can't figure out how to do the sum for the night shift (I have these SELECTs for each shift stored as a VIEW for easy access).
For the morning shift:
SELECT CAST(timestamp AS date) AS Morning,
SUM(OK) AS SUM_OK,
SUM(NOK) AS SUM_NOK
FROM [machine1]
WHERE DATEPART(hh,timestamp) >= 6 AND DATEPART(hh,timestamp) < 14
GROUP BY CAST(timestamp AS date)
ORDER BY Morning ASC
For the afternoon shift:
SELECT CAST(timestamp AS date) AS Afternoon,
SUM(OK) AS SUM_OK,
SUM(NOK) AS SUM_NOK
FROM [machine1]
WHERE DATEPART(hh,timestamp) >= 14 AND DATEPART(hh,timestamp) < 22
GROUP BY CAST(timestamp AS date)
ORDER BY Afternoon ASC
Since we identify the date of each shift by its start, my idea would be that the result for such SUM of night shift would be
Night        SUM_OK   SUM_NOK
2022-08-01   xxx      xxx       for interval 2022-08-01 22:00:00.000 -> 2022-08-02 05:59:59.999
2022-08-02   xxx      xxx       for interval 2022-08-02 22:00:00.000 -> 2022-08-03 05:59:59.999
2022-08-03   xxx      xxx       for interval 2022-08-03 22:00:00.000 -> 2022-08-04 05:59:59.999
2022-08-04   xxx      xxx       for interval 2022-08-04 22:00:00.000 -> 2022-08-05 05:59:59.999
...          ...      ...
After a few days of trial and error I have probably managed to find the needed solution. Using a subquery, I shift all the times in the range 00:00:00 -> 05:59:59 to the previous day and then use that result in the same way as for the morning and afternoon shifts (because now all the production data from the night shift fall on the same date between 22:00:00 and 23:59:59).
In case anyone needs it in the future:
SELECT
    CAST(nightShift.shiftedTime AS date) AS Night,
    SUM(nightShift.OK) AS SUM_OK,
    SUM(nightShift.NOK) AS SUM_NOK
FROM
    (SELECT
        -- move the 00:00-05:59 rows into the 22:00-23:59 window of the previous day
        CASE WHEN (DATEPART(hh, timestamp) < 6 AND DATEPART(hh, timestamp) >= 4) THEN DATEADD(HOUR, -6, timestamp)
             WHEN (DATEPART(hh, timestamp) < 4 AND DATEPART(hh, timestamp) >= 2) THEN DATEADD(HOUR, -4, timestamp)
             WHEN (DATEPART(hh, timestamp) < 2 AND DATEPART(hh, timestamp) >= 0) THEN DATEADD(HOUR, -2, timestamp)
             ELSE timestamp -- 22:00-23:59 rows already carry the date on which the shift started
        END AS shiftedTime,
        [OK],
        [NOK]
    FROM [machine1]
    -- keep the whole night shift: the evening part (22:00-23:59) and the morning part (00:00-05:59)
    WHERE (DATEPART(hh, timestamp) >= 22 OR DATEPART(hh, timestamp) < 6)) nightShift
WHERE DATEPART(hh, nightShift.shiftedTime) >= 22
GROUP BY CAST(nightShift.shiftedTime AS date)
ORDER BY Night ASC
PS: If there is anything wrong with this approach, please feel free to correct me as I'm just a newbie in SQL. So far this seems to do exactly what I needed.
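For checking the shift assignment by hand, the rule "a night shift belongs to the date on which it started" can also be expressed as a single six-hour subtraction: 22:00-23:59 stays on its own date, and 00:00-05:59 falls back to the previous date. The Python sketch below uses that simplification (it is not the CASE expression from the post), and `night_shift_totals` and `SAMPLE` are names invented for the illustration.

```python
from collections import defaultdict
from datetime import datetime, timedelta

# The rows shown in the question: (timestamp, OK, NOK).
SAMPLE = [
    (datetime(2022, 8, 1, 5, 30), 15, 1),
    (datetime(2022, 8, 1, 6, 30), 18, 3),
    (datetime(2022, 8, 1, 21, 30), 10, 12),
    (datetime(2022, 8, 1, 22, 30), 0, 3),
    (datetime(2022, 8, 1, 23, 59), 1, 2),
    (datetime(2022, 8, 2, 0, 1), 7, 0),
    (datetime(2022, 8, 2, 5, 30), 12, 4),
    (datetime(2022, 8, 2, 6, 30), 9, 3),
]


def night_shift_totals(rows):
    """Sum OK/NOK per night shift, keyed by the date on which the shift started."""
    totals = defaultdict(lambda: [0, 0])
    for ts, ok, nok in rows:
        if ts.hour >= 22 or ts.hour < 6:  # night shift: 22:00 -> 06:00
            # Subtracting 6 hours maps 22:00-05:59 onto 16:00-23:59 of the start date.
            shift_date = (ts - timedelta(hours=6)).date()
            totals[shift_date][0] += ok
            totals[shift_date][1] += nok
    return {day: tuple(sums) for day, sums in sorted(totals.items())}


if __name__ == "__main__":
    for day, (ok, nok) in night_shift_totals(SAMPLE).items():
        print(day, ok, nok)
```

With the sample rows, the 05:30 row of 2022-08-01 is counted towards the shift that started on 2022-07-31, and the four rows between 22:30 and 05:30 the next morning are counted towards 2022-08-01.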

Find MAX, AVG between every current and previous row BigQuery

I have a table with 150,000 rows containing a DateTime and a Speed column. The timestamp difference between rows is 10 seconds. I want to calculate the MAX and AVG of the Speed column for each 20-second segment (2 x 10 sec), so basically compare each current row with its previous row and calculate the MAX and AVG of the Speed column.
Expected result:
DateTime Speed MAXspeed AVGspeed
2019-03-21 10:58:34 UTC 52
2019-03-21 10:58:44 UTC 50 52 51
2019-03-21 10:58:54 UTC 55 55 52.5
2019-03-21 10:59:04 UTC 60 60 57.5
2019-03-21 10:59:14 UTC 65 65 62.5
2019-03-21 10:59:24 UTC 63 65 64
2019-03-21 10:59:34 UTC 50 63 56.5
2019-03-21 10:59:44 UTC 50 50 50
2019-03-21 10:59:54 UTC 50 50 50
...
I tried with query below but it is obviously wrong:
select *,
MAX(SpeedGearbox_km_h, LAG(SpeedGearbox_km_h) over (order by DateTime)) as Maxspeeg,
AVG(SpeedGearbox_km_h, LAG(SpeedGearbox_km_h) over (order by DateTime)) as AVGspeed,
from `xx.yy`
group by 1,2
order by DateTime
Just use ROWS BETWEEN 1 PRECEDING AND CURRENT ROW in your queries:
SELECT *,
MAX(SpeedGearbox_km_h) OVER (ORDER BY DateTime ROWS BETWEEN 1 PRECEDING AND CURRENT ROW) as MAXspeed,
AVG(SpeedGearbox_km_h) OVER (ORDER BY DateTime ROWS BETWEEN 1 PRECEDING AND CURRENT ROW) as AVGspeed
FROM `xx.yy`
ORDER BY DateTime
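To see what the `ROWS BETWEEN 1 PRECEDING AND CURRENT ROW` frame does, here is a small Python sketch that applies the same two-row window to the speeds from the question. `rolling_pair_stats` is a name made up for this example.

```python
def rolling_pair_stats(speeds):
    """For each row, return (max, avg) over the previous row and the current row."""
    stats = []
    for i in range(len(speeds)):
        window = speeds[max(0, i - 1): i + 1]  # 1 PRECEDING .. CURRENT ROW
        stats.append((max(window), sum(window) / len(window)))
    return stats


if __name__ == "__main__":
    speeds = [52, 50, 55, 60, 65, 63, 50, 50, 50]
    for speed, (mx, avg) in zip(speeds, rolling_pair_stats(speeds)):
        print(speed, mx, avg)
```

One difference from the expected result in the question: the first row has no predecessor, so its frame contains only itself and it gets 52 / 52 rather than empty cells.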

BigQuery - A way to generate timestamps based on hour/minute/seconds?

Is there a way to generate sequential timestamps in BigQuery that is focused on hours, minutes, and seconds?
In BigQuery you can generate sequential dates by:
select *
FROM UNNEST(GENERATE_DATE_ARRAY('2016-10-18', '2016-10-19', INTERVAL 1 DAY)) as day
This will generate the dates from 2016-10-18 to 2016-10-19 in date intervals
Row day
1 2016-10-18
2 2016-10-19
But let's say I want intervals in 15 minutes or 5 minutes, is there a way to do that?
First, I would recommend "starring" the feature request for GENERATE_TIMESTAMP_ARRAY to express interest in having a function like this. Given GENERATE_ARRAY, though, the best option currently is to use a query of this form:
SELECT TIMESTAMP_ADD('2018-04-01', INTERVAL 15 * x MINUTE)
FROM UNNEST(GENERATE_ARRAY(0, 13)) AS x;
If you want a minute-based GENERATE_TIMESTAMP_ARRAY equivalent, you can use a UDF like this:
CREATE TEMP FUNCTION GenerateMinuteTimestampArray(
    t0 TIMESTAMP, t1 TIMESTAMP, minutes INT64) AS (
  ARRAY(
    SELECT TIMESTAMP_ADD(t0, INTERVAL minutes * x MINUTE)
    -- x counts whole steps, so stop at the number of steps that fit between t0 and t1
    FROM UNNEST(GENERATE_ARRAY(0, DIV(TIMESTAMP_DIFF(t1, t0, MINUTE), minutes))) AS x
  )
);
SELECT ts
FROM UNNEST(GenerateMinuteTimestampArray('2018-04-01', '2018-04-01 12:00:00', 15)) AS ts;
This returns a timestamp for each 15-minute interval between midnight and 12 PM on April 1.
Update: You can now use the GENERATE_TIMESTAMP_ARRAY function in BigQuery. If you want to generate timestamps at intervals of 15 minutes, for example, you can use:
SELECT GENERATE_TIMESTAMP_ARRAY('2016-10-18', '2016-10-19', INTERVAL 15 MINUTE);
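If you want to check how many elements to expect, the following Python sketch mimics an end-inclusive timestamp series with a fixed step. `generate_timestamps` is a hypothetical helper written for this illustration, not a BigQuery client call.

```python
from datetime import datetime, timedelta, timezone


def generate_timestamps(start, end, step):
    """Return start, start + step, ... for every value that is <= end."""
    if step <= timedelta(0):
        raise ValueError("step must be positive")
    out = []
    current = start
    while current <= end:  # the upper bound is inclusive
        out.append(current)
        current += step
    return out


if __name__ == "__main__":
    series = generate_timestamps(
        datetime(2016, 10, 18, tzinfo=timezone.utc),
        datetime(2016, 10, 19, tzinfo=timezone.utc),
        timedelta(minutes=15),
    )
    print(len(series), series[0], series[-1])
```

For one full day at 15-minute steps this gives 97 values, because both midnights are included.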
Epochs seem like the way to go, but this requires converting the dates to epoch seconds first.
select TIMESTAMP_MICROS(CAST(day * 1000000 as INT64))
FROM UNNEST(GENERATE_ARRAY(1522540800, 1525132799, 900)) as day
Row f0_
1 2018-04-01 00:00:00.000 UTC
2 2018-04-01 00:15:00.000 UTC
3 2018-04-01 00:30:00.000 UTC
4 2018-04-01 00:45:00.000 UTC
5 2018-04-01 01:00:00.000 UTC
6 2018-04-01 01:15:00.000 UTC
7 2018-04-01 01:30:00.000 UTC
8 2018-04-01 01:45:00.000 UTC
9 2018-04-01 02:00:00.000 UTC
10 2018-04-01 02:15:00.000 UTC
11 2018-04-01 02:30:00.000 UTC
12 2018-04-01 02:45:00.000 UTC
13 2018-04-01 03:00:00.000 UTC
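In case it helps with the "convert date to epoch" step, here is a short Python sketch that derives the two boundary values used above and walks the range in 900-second (15-minute) steps. The helper names `to_epoch` and `quarter_hours` are invented for the example.

```python
from datetime import datetime, timezone


def to_epoch(moment):
    """Seconds since 1970-01-01 00:00:00 UTC for a timezone-aware datetime."""
    return int(moment.timestamp())


def quarter_hours(start_epoch, end_epoch, step_seconds=900):
    """UTC datetimes from start_epoch to end_epoch (inclusive) in fixed steps."""
    return [
        datetime.fromtimestamp(s, tz=timezone.utc)
        for s in range(start_epoch, end_epoch + 1, step_seconds)
    ]


if __name__ == "__main__":
    start = to_epoch(datetime(2018, 4, 1, tzinfo=timezone.utc))
    end = to_epoch(datetime(2018, 4, 30, 23, 59, 59, tzinfo=timezone.utc))
    print(start, end)
    print(quarter_hours(start, end)[:3])
```

The two printed epoch values are 1522540800 and 1525132799, i.e. the bounds passed to GENERATE_ARRAY in the query, which cover all of April 2018.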

Oracle, SQL, how to get intervals between dates

I need help with a problem. Actually, I do not know if it will be possible to solve it directly in SQL.
I have a list of works. Each work has a start date and ending date, with this format
YYYY/MM/DD HH24:MI:SS
I need to calculate the cost of those jobs. The hourly price depends on the time interval in which the work was done:
Night time: 22:00 to 6:00, for example: 20 €/h
Normal time: the rest, 17 €/h
So, if I have a sample like this:
wo start end
21 2017/11/16 21:25:00 2017/11/16 22:55:00
22 2017/11/17 05:45:00 2017/11/17 07:05:00
23 2017/11/18 23:00:00 2017/11/19 1:10:00
24 2017/11/17 18:00:00 2017/11/17 19:00:00
I would need to calculate the intervals of the dates between the 22h and 6h and the rest to multiply them by their corresponding price
wo rest(minutes) night(minutes)
21 35 55
22 65 15
23 0 130
24 60 0
Thanks for your help in advance.
Heh. If you really wish it :)
A fifth record (starting at 2016-10-30) has been added for testing purposes.
with
  src as (select timestamp '2017-11-16 21:25:00' b, timestamp '2017-11-16 22:55:00' f from dual union all
          select timestamp '2017-11-17 05:45:00' b, timestamp '2017-11-17 07:05:00' f from dual union all
          select timestamp '2017-11-18 23:00:00' b, timestamp '2017-11-19 1:10:00' f from dual union all
          select timestamp '2017-11-17 18:00:00' b, timestamp '2017-11-17 19:00:00' f from dual union all
          select timestamp '2016-10-30 00:00:00' b, timestamp '2016-11-03 23:00:00' f from dual),
  srd as (select b, f, f - b t from src),
  mmm as (select min(trunc(b)) b, max(trunc(f)) f from src),
  rws as (select b + 6/24 + rownum - 1 b, b + 22/24 + rownum - 1 f from mmm connect by level <= f - b + 1),
  mix as (select s.b, s.f, s.t, r.b rb, r.f rf from srd s, rws r where s.f >= r.b (+) and r.f (+) >= s.b),
  clc as (select b, f, t, nvl(numtodsinterval(sum((least(f, rf) + 0) - (greatest(b, rb) + 0)), 'DAY'), interval '0' second) d from mix group by b, f, t)
select
  to_char(b, 'dd.mm.yyyy hh24:mi') as "datetime begin",
  to_char(f, 'dd.mm.yyyy hh24:mi') as "datetime finish",
  cast(t as interval day to second(0)) as "total time",
  cast(d as interval day to second(0)) as "daytime",
  cast(t - d as interval day to second(0)) as "nighttime"
from
  clc
order by
  1, 2;
datetime begin datetime finish total time daytime nighttime
------------------ ------------------ -------------- -------------- --------------
16.11.2017 21:25 16.11.2017 22:55 +00 01:30:00 +00 00:35:00 +00 00:55:00
17.11.2017 05:45 17.11.2017 07:05 +00 01:20:00 +00 01:05:00 +00 00:15:00
17.11.2017 18:00 17.11.2017 19:00 +00 01:00:00 +00 01:00:00 +00 00:00:00
18.11.2017 23:00 19.11.2017 01:10 +00 02:10:00 +00 00:00:00 +00 02:10:00
30.10.2016 00:00 03.11.2016 23:00 +04 23:00:00 +03 08:00:00 +01 15:00:00
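The idea behind the query above is interval overlap: build a 06:00-22:00 "daytime" window for every calendar day, intersect each job with those windows, and treat whatever is left as night time. A minimal Python sketch of that idea follows; `split_minutes` is a hypothetical helper, and it assumes plain datetimes without time zones or daylight-saving shifts.

```python
from datetime import datetime, timedelta

DAY_START = timedelta(hours=6)   # 06:00
DAY_END = timedelta(hours=22)    # 22:00


def split_minutes(begin, finish):
    """Return (day_minutes, night_minutes) for the interval from begin to finish."""
    day = timedelta(0)
    midnight = begin.replace(hour=0, minute=0, second=0, microsecond=0)
    while midnight < finish:
        # Overlap of the job with this day's 06:00-22:00 window.
        overlap = min(finish, midnight + DAY_END) - max(begin, midnight + DAY_START)
        if overlap > timedelta(0):
            day += overlap
        midnight += timedelta(days=1)
    night = (finish - begin) - day
    return int(day.total_seconds() // 60), int(night.total_seconds() // 60)


if __name__ == "__main__":
    jobs = [
        (datetime(2017, 11, 16, 21, 25), datetime(2017, 11, 16, 22, 55)),
        (datetime(2017, 11, 17, 5, 45), datetime(2017, 11, 17, 7, 5)),
        (datetime(2017, 11, 18, 23, 0), datetime(2017, 11, 19, 1, 10)),
        (datetime(2017, 11, 17, 18, 0), datetime(2017, 11, 17, 19, 0)),
        (datetime(2016, 10, 30, 0, 0), datetime(2016, 11, 3, 23, 0)),
    ]
    for b, f in jobs:
        print(b, f, split_minutes(b, f))
```

The minute counts agree with the "daytime" and "nighttime" columns of the output above, for example 4800 and 2340 minutes for the five-day test record.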
A different approach is a more brute-force one, but it lets you separate the interval configuration from the reporting.
It goes in three steps:
1) Define the rate type for each minute of the day (change the granularity if required).
create table day_config as
with helper as (
select
rownum -1 minute_id
from dual connect by level <= 24*60),
helper2 as (
select
minute_id,
trunc(minute_id/60) hour_no,
mod(minute_id,60) minute_no
from helper)
select
minute_id,hour_no, minute_no,
case when hour_no >= 22 or hour_no <= 5 then 0 else 1 end rate_id
from helper2;
select * from day_config order by minute_id;
MINUTE_ID HOUR_NO MINUTE_NO RATE_ID
---------- ---------- ---------- ----------
0 0 0 0
1 0 1 0
2 0 2 0
3 0 3 0
4 0 4 0
5 0 5 0
6 0 6 0
7 0 7 0
8 0 8 0
9 0 9 0
Here rate_id 0 means night, rate_id 1 means day.
The advantage is that you can introduce as many rate types as required.
2) expand the configuration for the required interval e.g. to whole year.
So now we have for each minute of the year the configuration, which rate is to be applied.
create or replace view year_config as
select my_date + MINUTE_ID / (24*60) minute_ts , MINUTE_ID, HOUR_NO, MINUTE_NO, RATE_ID from day_config
cross join
(select DATE '2017-01-01' + rownum -1 as my_date from dual connect by level <= 365)
order by 1,2;
select * from (
select * from year_config
order by 1)
where rownum <= 5;
MINUTE_TS MINUTE_ID HOUR_NO MINUTE_NO RATE_ID
------------------- ---------- ---------- ---------- ----------
01-01-2017 00:00:00 0 0 0 0
01-01-2017 00:01:00 1 0 1 0
01-01-2017 00:02:00 2 0 2 0
01-01-2017 00:03:00 3 0 3 0
01-01-2017 00:04:00 4 0 4 0
3) The reporting is then as easy as joining to our config view, constraining on the (half-open) interval, and grouping by RATE_ID.
select b, f,RATE_ID, count(*) minute_cnt
from tst join year_config c on c.MINUTE_TS >= tst.b and c.MINUTE_TS < tst.f
group by b, f,RATE_ID
order by b, f,RATE_ID;
B F RATE_ID MINUTE_CNT
------------------- ------------------- ---------- ----------
16-11-2017 21:25:00 16-11-2017 22:55:00 0 55
16-11-2017 21:25:00 16-11-2017 22:55:00 1 35
17-11-2017 05:45:00 17-11-2017 07:05:00 0 15
17-11-2017 05:45:00 17-11-2017 07:05:00 1 65
17-11-2017 18:00:00 17-11-2017 19:00:00 1 60
18-11-2017 23:00:00 19-11-2017 01:10:00 0 130
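The same minute-by-minute lookup can be sketched in a few lines of Python, which may help when checking the counts by hand. `rate_id` follows the CASE expression from step 1; `minutes_by_rate` is a name made up for the illustration, and it assumes intervals that start on a whole minute.

```python
from collections import Counter
from datetime import datetime, timedelta


def rate_id(moment):
    """0 = night (22:00-05:59), 1 = day, same rule as the day_config CASE."""
    return 0 if moment.hour >= 22 or moment.hour <= 5 else 1


def minutes_by_rate(begin, finish):
    """Count the minutes of the half-open interval [begin, finish) per rate_id."""
    counts = Counter()
    minute = begin
    while minute < finish:  # half-open: the finish minute itself is not counted
        counts[rate_id(minute)] += 1
        minute += timedelta(minutes=1)
    return dict(counts)


if __name__ == "__main__":
    print(minutes_by_rate(datetime(2017, 11, 16, 21, 25), datetime(2017, 11, 16, 22, 55)))
    print(minutes_by_rate(datetime(2017, 11, 18, 23, 0), datetime(2017, 11, 19, 1, 10)))
```

Because the loop visits every minute, it is slower than computing overlaps directly, but adding another rate type only means changing `rate_id`.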
The easiest way is probably to get all minutes worked in a recursive WITH clause and then see in which time range the minutes fall. As Oracle unfortunately doesn't have a TIME datatype, we'll have to work with time strings ('00:00' till '23:59').
with shifts as
(
select 'night' as shift, '00:00' as starttime, '05:59' as endtime, 20 as cost from dual
union all
select 'normal' as shift, '06:00' as starttime, '21:59' as endtime, 17 as cost from dual
union all
select 'night' as shift, '22:00' as starttime, '23:59' as endtime, 20 as cost from dual
)
, workminutes(wo, workminute, thetime, endtime) as
(
select wo, to_char(starttime, 'hh24:mi') as workminute, starttime as thetime, endtime
from mytable
union all
select
wo,
to_char(thetime + interval '1' minute, 'hh24:mi') as workminute,
thetime + interval '1' minute as thetime,
endtime
from workminutes
where thetime + interval '1' minute < endtime
)
select
wo,
count(case when s.shift = 'normal' then 1 end) as normal_time,
coalesce(sum(case when m.workminute between '06:00' and '21:59' then s.cost end), 0)
as normal_cost,
count(case when s.shift = 'night' then 1 end) as night_time,
coalesce(sum(case when m.workminute not between '06:00' and '21:59' then s.cost end), 0)
as night_cost,
count(*) as total_time,
coalesce(sum(s.cost), 0)
as total_cost
from workminutes m
join shifts s on m.workminute between s.starttime and s.endtime
group by wo
order by wo;
Output:
WO NORMAL_TIME NORMAL_COST NIGHT_TIME NIGHT_COST TOTAL_TIME TOTAL_COST
21 35 595 55 1100 90 1695
22 65 1105 15 300 80 1405
23 0 0 130 2600 130 2600
24 60 1020 0 0 60 1020
25 4800 81600 2340 46800 7140 128400
(This query looks a lot nicer of course, if you have a real shifts table and don't have to make one up on-the-fly. Also, you may not need all those seven columns I have in my result.)
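Two details of the query above may be easier to see in a small Python sketch: zero-padded 'hh24:mi' strings compare in the same order as the times they represent, which is why BETWEEN works on them, and the cost columns in the output multiply each minute by the rate. If the rates are read as euros per hour, as the question states them, the minute counts would be divided by 60. `SHIFTS` mirrors the inline shifts table; `shift_for` and `cost_per_hour_rates` are hypothetical helpers.

```python
# (shift name, start, end, rate), mirroring the inline "shifts" table.
SHIFTS = [
    ("night", "00:00", "05:59", 20),
    ("normal", "06:00", "21:59", 17),
    ("night", "22:00", "23:59", 20),
]


def shift_for(hhmm):
    """Look up (name, rate) for a zero-padded 'HH:MM' string via string comparison."""
    for name, start, end, rate in SHIFTS:
        if start <= hhmm <= end:  # same test as BETWEEN starttime AND endtime
            return name, rate
    raise ValueError(f"not a valid HH:MM time: {hhmm!r}")


def cost_per_hour_rates(minutes_by_shift):
    """Cost in euros, assuming the rates are per hour (an assumption of this sketch)."""
    rates = {name: rate for name, _start, _end, rate in SHIFTS}
    return sum(minutes * rates[name] / 60 for name, minutes in minutes_by_shift.items())


if __name__ == "__main__":
    print(shift_for("05:59"), shift_for("06:00"), shift_for("22:00"))
    # Work order 21: 35 normal minutes and 55 night minutes.
    print(cost_per_hour_rates({"normal": 35, "night": 55}))
```

For work order 21 this gives 28.25 euros, against the 1695 in the TOTAL_COST column, which is the per-minute product.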

How to calculate total hours using sql

I am trying to add up the total hours where flag = 1. Here is what my data looks like in the table. The rows are at 30-minute intervals.
ID FullDateTime Flag
22 2015-02-26 05:30:00.000 1
44 2015-02-26 05:00:00.000 1
25 2015-02-26 04:30:00.000 0
23 2015-02-26 04:00:00.000 1
74 2015-02-26 03:30:00.000 1
36 2015-02-26 03:00:00.000 0
Here is what I tried, but it is not working:
select DATEDIFF(minute, sum(FullDatetime), sum(FullDatetime)) / 60.0 as hours
from myTable
where flag = 1
I am expecting the results to be 2 hours.
If the total number of hours is just a function of the number of half hour periods flagged as 1 then a simple count(*) of the rows matching flag 1 multiplied by 0.5 (for the half hour) should do it:
select count(*) * 0.5 from myTable where flag = 1
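The arithmetic can be checked against the sample rows with a few lines of Python. `flagged_hours` is a made-up helper, and it assumes, as the answer does, that every row stands for exactly one half-hour slot.

```python
# Sample rows from the question: (ID, FullDateTime, Flag).
ROWS = [
    (22, "2015-02-26 05:30:00.000", 1),
    (44, "2015-02-26 05:00:00.000", 1),
    (25, "2015-02-26 04:30:00.000", 0),
    (23, "2015-02-26 04:00:00.000", 1),
    (74, "2015-02-26 03:30:00.000", 1),
    (36, "2015-02-26 03:00:00.000", 0),
]


def flagged_hours(rows, slot_hours=0.5):
    """Number of flagged rows times the length of one slot, in hours."""
    return sum(1 for _id, _ts, flag in rows if flag == 1) * slot_hours


if __name__ == "__main__":
    print(flagged_hours(ROWS))
```

Four rows have flag = 1, so the result is the 2 hours the question expects.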