Find MAX, AVG between every current and previous row in BigQuery

I have a table with 150,000 rows containing DateTime and Speed columns. Consecutive rows are 10 seconds apart. I want to calculate the MAX and AVG of the Speed column for each 20-second segment (2 x 10 sec), that is, compare each row with its previous row and compute the MAX and AVG of Speed over that pair.
Expected result:
DateTime Speed MAXspeed AVGspeed
2019-03-21 10:58:34 UTC 52
2019-03-21 10:58:44 UTC 50 52 51
2019-03-21 10:58:54 UTC 55 55 52.5
2019-03-21 10:59:04 UTC 60 60 57.5
2019-03-21 10:59:14 UTC 65 65 62.5
2019-03-21 10:59:24 UTC 63 65 64
2019-03-21 10:59:34 UTC 50 63 56.5
2019-03-21 10:59:44 UTC 50 50 50
2019-03-21 10:59:54 UTC 50 50 50
...
I tried the query below, but it is obviously wrong:
select *,
MAX(SpeedGearbox_km_h, LAG(SpeedGearbox_km_h) over (order by DateTime)) as Maxspeeg,
AVG(SpeedGearbox_km_h, LAG(SpeedGearbox_km_h) over (order by DateTime)) as AVGspeed,
from `xx.yy`
group by 1,2
order by DateTime

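Just use ROWS BETWEEN 1 PRECEDING AND CURRENT ROW as the window frame in your query. MAX and AVG are then computed over the previous row and the current row; the first row has no previous row, so its frame contains only itself: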
SELECT *,
MAX(SpeedGearbox_km_h) OVER (ORDER BY DateTime ROWS BETWEEN 1 PRECEDING AND CURRENT ROW) as MAXspeed,
AVG(SpeedGearbox_km_h) OVER (ORDER BY DateTime ROWS BETWEEN 1 PRECEDING AND CURRENT ROW) as AVGspeed
FROM `xx.yy`
ORDER BY DateTime
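To sanity-check the window frame outside BigQuery, the same pairwise logic can be sketched in plain Python. This is only an illustration: the function name pairwise_max_avg and the hard-coded list are assumptions, not part of the answer, and the list is taken to be already ordered by DateTime.

```python
def pairwise_max_avg(speeds):
    """Mimic ROWS BETWEEN 1 PRECEDING AND CURRENT ROW over an ordered list."""
    result = []
    for i, current in enumerate(speeds):
        # The frame holds the previous row (if there is one) and the current row.
        frame = speeds[max(0, i - 1): i + 1]
        result.append((current, max(frame), sum(frame) / len(frame)))
    return result


# Speed values from the question, already ordered by DateTime.
speeds = [52, 50, 55, 60, 65, 63, 50, 50, 50]
for row in pairwise_max_avg(speeds):
    print(row)
```

The first tuple is (52, 52, 52.0) because the first frame has a single row, whereas the expected result in the question leaves those two cells blank.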

Related

(bigquery) How to calculate the number of hours an event lasts across multiple dates

So my data looks like this:
DATE TEMPERATURE
2012-01-13 23:15:00 UTC 0
2012-01-14 01:35:00 UTC 5
2012-01-14 02:15:00 UTC 6
2012-01-14 03:15:00 UTC 8
2012-01-14 04:15:00 UTC 0
2012-01-14 04:55:00 UTC 0
2012-01-14 05:15:00 UTC -2
2012-01-14 05:35:00 UTC 0
I am trying to calculate how long the temperature for a zip code stays at 0 or below on any given day. On the 13th, it only happens for a very short time, so we don't really care. I want to know how to calculate the number of minutes this happens on the 14th, since it looks like a significantly (and consistently) cold day.
I want the query to add two more columns.
The first added column would be the time difference between consecutive rows. So row 3 - row 2 = 40 mins and row 4 - row 3 = 60 mins.
The second column would be a running total of the minutes during which the temperature was 0 or below. Here rows 2-4 would be ignored. For rows 5-8, the total time that the temperature was 0 or below would be about 90 mins.
It should end up looking like this:
DATE TEMPERATURE MINUTES_DIFFERENCE TOTAL_MINUTES
2012-01-13 23:15:00 UTC 0 0 0
2012-01-14 01:35:00 UTC 5 140 0
2012-01-14 02:15:00 UTC 6 40 0
2012-01-14 03:15:00 UTC 8 60 0
2012-01-14 04:15:00 UTC 0 60 60
2012-01-14 04:55:00 UTC 0 30 90
2012-01-14 05:15:00 UTC -2 20 110
2012-01-14 05:35:00 UTC 0 20 130
Use the query below:
select *,
sum(minutes_difference) over(order by date) total_minutes
from (
select *,
ifnull(timestamp_diff(timestamp(date), lag(timestamp(date)) over(order by date), minute), 0) as minutes_difference
from your_table
)
Applied to the sample data in your question, this returns the per-row minutes_difference and a running total of it over all rows.
Update, to answer the updated question:
select * except(new_grp, grp),
sum(if(temperature > 0, 0, minutes_difference)) over(partition by grp order by date) total_minutes
from (
select *, countif(new_grp) over(order by date) as grp
from (
select *,
ifnull(timestamp_diff(timestamp(date), lag(timestamp(date)) over(order by date), minute), 0) as minutes_difference,
ifnull(((temperature <= 0) and (lag(temperature) over(order by date) > 0)) or
((temperature > 0) and (lag(temperature) over(order by date) <= 0)), true) as new_grp
from your_table
)
)
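Here total_minutes accumulates only within a stretch of consecutive rows at 0 or below, and restarts whenever the temperature crosses zero.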
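The grouping logic of the updated query can be cross-checked with a short Python sketch. The function name cold_minutes and the in-memory row list are assumptions made for illustration; rows are taken to be sorted by time, and the total restarts on every change between "above 0" and "0 or below", which is what the new_grp / grp columns do.

```python
from datetime import datetime


def cold_minutes(rows):
    """rows: list of (timestamp string, temperature), sorted by time.

    Returns (timestamp, temperature, minutes_difference, total_minutes),
    where total_minutes restarts whenever the temperature crosses zero.
    """
    out = []
    prev_time = None
    prev_cold = None
    total = 0
    for stamp, temp in rows:
        now = datetime.strptime(stamp, "%Y-%m-%d %H:%M:%S")
        # The first row has no predecessor, so its difference is 0.
        diff = 0 if prev_time is None else int((now - prev_time).total_seconds() // 60)
        cold = temp <= 0
        if cold != prev_cold:
            total = 0  # a new warm or cold stretch begins
        if cold:
            total += diff
        out.append((stamp, temp, diff, total))
        prev_time, prev_cold = now, cold
    return out


rows = [
    ("2012-01-13 23:15:00", 0),
    ("2012-01-14 01:35:00", 5),
    ("2012-01-14 02:15:00", 6),
    ("2012-01-14 03:15:00", 8),
    ("2012-01-14 04:15:00", 0),
    ("2012-01-14 04:55:00", 0),
    ("2012-01-14 05:15:00", -2),
    ("2012-01-14 05:35:00", 0),
]
for row in cold_minutes(rows):
    print(row)
```

For the sample rows, the computed gap between 04:15 and 04:55 is 40 minutes (the question's expected table shows 30), so the last row ends with a total of 140.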

Get hourly data based on StartDate

For example, I have data like this:
batch MIN MAX TIME
X 10 20 2018-07-12 10:29:00.000
X 30 50 2018-07-12 10:30:00.000
X 50 30 2018-07-12 10:31:00.000
| | | |
X 40 20 2018-07-12 11:45:00.000
Now I want hourly data based on the start time, for example:
DURATION MIN
2018-07-12 10:29:00.000-2018-07-12 11:29:00.000 10
2018-07-12 11:30:00.000-2018-07-12 12:30:00.000 10
How can I get this? (Get the MIN value for every hour, based on the start time.)
The DATEADD function lets you add or subtract days, hours, or minutes to or from a date.
Consider the query below:
select dateadd(HOUR, -1, getdate()) as time_added,
getdate() as curr_date
The -1 subtracts one hour (it adds negative one hour).
The result of the above query is:
time_added curr_date
2018-07-12 13:25:31.603 2018-07-12 14:25:31.603
Instead of getdate(), use your start date.
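In your case it would be something like this (your_table is a placeholder, and @starttime is assumed to be a DATETIME variable holding the start time):
select min([MIN]) from your_table where [TIME] < @starttime and [TIME] > dateadd(HOUR, -1, @starttime)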
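As a rough illustration of what "hourly windows counted from the start time" means, here is a small Python sketch. It is not the answer's method: the function name hourly_min is hypothetical, windows are taken as half-open (start inclusive, end exclusive), and the start time is assumed to be the earliest timestamp in the data.

```python
from datetime import datetime, timedelta


def hourly_min(rows):
    """Return (window_start, window_end, min_value) per 1-hour window.

    rows: list of (datetime, value). Windows are counted from the earliest
    timestamp; windows that contain no rows are skipped.
    """
    start = min(when for when, _ in rows)
    lowest = {}
    for when, value in rows:
        # Index of the 1-hour window this row falls into.
        index = int((when - start).total_seconds() // 3600)
        lowest[index] = min(value, lowest.get(index, value))
    return [
        (start + timedelta(hours=i), start + timedelta(hours=i + 1), lowest[i])
        for i in sorted(lowest)
    ]


# The four rows visible in the question (TIME, MIN); the elided rows are omitted.
rows = [
    (datetime(2018, 7, 12, 10, 29), 10),
    (datetime(2018, 7, 12, 10, 30), 30),
    (datetime(2018, 7, 12, 10, 31), 50),
    (datetime(2018, 7, 12, 11, 45), 40),
]
for window in hourly_min(rows):
    print(window)
```

With only the visible rows, the second window (11:29 to 12:29) has a minimum of 40; the question's figure of 10 presumably depends on the rows that were elided.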

Compare values for consecutive dates of same month

I have a table
ID Value Date
1 10 2017-10-02 02:50:04.480
2 20 2017-10-01 07:28:53.593
3 30 2017-09-30 23:59:59.000
4 40 2017-09-30 23:59:59.000
5 50 2017-09-30 02:36:07.520
I compare each Value with the Value for the previous date. However, I don't want to compare across a month boundary, that is, between the first day of the current month and the last day of the previous month. For this table, I don't need the comparison between 2017-10-01 07:28:53.593 and 2017-09-30 23:59:59.000. How can this be done?
Result table for this example:
ID Value Date Diff
1 10 2017-10-02 02:50:04.480 10
2 20 2017-10-01 07:28:53.593 NULL
3 30 2017-09-30 23:59:59.000 10
4 40 2017-09-29 23:59:59.000 10
5 50 2017-09-28 02:36:07.520 NULL
You can use this.
SELECT * ,
LEAD(Value) OVER( PARTITION BY DATEPART(YEAR,[Date]), DATEPART(MONTH,[Date]) ORDER BY ID ) - Value AS Diff
FROM MyTable
ORDER BY ID
Alternatively, you can use a query like the one below:
select *,
diff=LEAD(Value) OVER( PARTITION BY Month(Date),Year(Date) ORDER BY Date desc)-Value
from t
order by id asc
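The effect of partitioning LEAD by year and month can be checked with a small Python sketch. The function name diff_within_month is made up for this example, and the rows are assumed to be sorted newest first (by ID), as in the question's input table.

```python
from datetime import datetime


def diff_within_month(rows):
    """rows: list of (id, value, datetime), sorted newest first.

    Diff is the next (older) row's value minus the current value, or None
    when the next row is in a different month or there is no next row.
    """
    out = []
    for i, (row_id, value, when) in enumerate(rows):
        diff = None
        if i + 1 < len(rows):
            _, next_value, next_when = rows[i + 1]
            # Only compare rows that share the same year and month.
            if (next_when.year, next_when.month) == (when.year, when.month):
                diff = next_value - value
        out.append((row_id, value, diff))
    return out


rows = [
    (1, 10, datetime(2017, 10, 2, 2, 50, 4)),
    (2, 20, datetime(2017, 10, 1, 7, 28, 53)),
    (3, 30, datetime(2017, 9, 30, 23, 59, 59)),
    (4, 40, datetime(2017, 9, 30, 23, 59, 59)),
    (5, 50, datetime(2017, 9, 30, 2, 36, 7)),
]
for row in diff_within_month(rows):
    print(row)
```

The Diff values come out as 10, None, 10, 10, None, which matches the Diff column of the question's result table (None standing in for NULL).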

How to calculate total hours using sql

I am trying to add up the total hours where flag = 1. Here is what my data looks like in the table. The rows are at 30-minute intervals.
ID FullDateTime Flag
22 2015-02-26 05:30:00.000 1
44 2015-02-26 05:00:00.000 1
25 2015-02-26 04:30:00.000 0
23 2015-02-26 04:00:00.000 1
74 2015-02-26 03:30:00.000 1
36 2015-02-26 03:00:00.000 0
Here is what I tried, but it is not working:
select DATEDIFF(minute, sum(FullDatetime), sum(FullDatetime)) / 60.0 as hours
from myTable
where flag = 1
I am expecting the results to be 2 hours.
If the total number of hours is just a function of the number of half hour periods flagged as 1 then a simple count(*) of the rows matching flag 1 multiplied by 0.5 (for the half hour) should do it:
select count(*) * 0.5 from myTable where flag = 1
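The arithmetic behind that count can be spelled out in a few lines of Python. The function name flagged_hours and the slot_hours parameter are illustrative assumptions; the sketch relies on every row representing exactly one half-hour slot.

```python
def flagged_hours(rows, slot_hours=0.5):
    """rows: list of (id, flag). Each flagged row counts as one half-hour slot."""
    return sum(1 for _, flag in rows if flag == 1) * slot_hours


# (ID, Flag) pairs from the question.
rows = [(22, 1), (44, 1), (25, 0), (23, 1), (74, 1), (36, 0)]
print(flagged_hours(rows))
```

Four rows are flagged, so the result is 4 x 0.5 = 2.0 hours, as the question expects.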

PostgreSQL - Query a table that stores value changes and present output in a periodic format

Given the following table that stores value changes of a variable:
Timestamp Value
13:14 12
14:25 33
15:13 24
15:41 48
16:31 54
17:00 63
19:30 82
22:30 13
I need to construct a query that outputs the following:
Timestamp Value
14:00 12
15:00 33
16:00 48
17:00 63
18:00 63
19:00 63
20:00 82
21:00 82
22:00 82
23:00 13
And so on...
What would be the correct approach to achieve the desired output?
Thanks in advance.
Use date_trunc() together with the date/time operators.
Round-up example:
user=# select datetime from tbl_test limit 1;
datetime
------------------------
2013-07-26 15:36:00+09
(1 row)
user=# select date_trunc('hour', datetime) + interval '1 hour'
from tbl_test limit 1;
?column?
------------------------
2013-07-26 16:00:00+09
(1 row)
formatting example
user=# select to_char(date_trunc('hour', datetime) + interval '1 hour', 'HH24:MI')
from tbl_test limit 1;
to_char
---------
16:00
(1 row)
UPDATED:
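You can select the latest value in each hour using a window function. The window needs an ORDER BY and an explicit full frame; otherwise last_value() is not guaranteed to return the most recent change:
SELECT DISTINCT x.hour_label AS timestamp,
last_value(x.value) OVER (PARTITION BY x.hour_label ORDER BY x.ts
ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) AS value
FROM (SELECT timestamp AS ts, TO_CHAR(date_trunc('hour', timestamp) + INTERVAL '1 hour', 'HH24:MI') AS hour_label, value
FROM tbl_test) AS x
ORDER BY 1;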
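The query above only returns hours in which a change was recorded, while the desired output also repeats the last known value for hours without changes (18:00, 19:00, and so on). The intended "latest change at or before each full hour" rule can be sketched in Python as follows; the function name hourly_last_value and its first_hour / last_hour parameters are hypothetical, and the changes are assumed to be sorted by time within a single day.

```python
def hourly_last_value(changes, first_hour, last_hour):
    """changes: list of ("HH:MM", value), sorted by time.

    Returns [("HH:00", value)] using the latest change recorded at or
    before each full hour from first_hour to last_hour inclusive.
    """
    def to_minutes(hhmm):
        hours, minutes = hhmm.split(":")
        return int(hours) * 60 + int(minutes)

    out = []
    idx = 0
    current = None
    for hour in range(first_hour, last_hour + 1):
        # Consume every change recorded at or before this full hour.
        while idx < len(changes) and to_minutes(changes[idx][0]) <= hour * 60:
            current = changes[idx][1]
            idx += 1
        out.append((f"{hour:02d}:00", current))
    return out


changes = [
    ("13:14", 12), ("14:25", 33), ("15:13", 24), ("15:41", 48),
    ("16:31", 54), ("17:00", 63), ("19:30", 82), ("22:30", 13),
]
for row in hourly_last_value(changes, 14, 23):
    print(row)
```

For the sample data this reproduces the table requested in the question, including the 17:00 row, where a change recorded exactly on the hour counts for that hour.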
postgresql reference:
9.9. Date/Time Functions and Operators
9.8. Data Type Formatting Functions