I have a table with sensor data, created like this:
CREATE TABLE IF NOT EXISTS "aqi" (
"time" datetime,
"pm25" real,
"pm10" real
);
When the sensor is running, it sends data to a server (which writes it to a database) every second. When the sensor is not running, there are gaps in the data, like this (I've converted the time column to a readable format in timezone GMT+01 and left the raw value in parentheses):
|time                         |pm25|pm10|
|-----------------------------|----|----|
|...                          |... |... |
|2021-12-28 18:44 (1640713462)|9.19|9.27|
|2021-12-28 18:45 (1640713522)|9.65|9.69|
|2021-12-28 18:46 (1640713582)|9.68|9.76|
|2021-12-29 10:17 (1640769421)|7.42|7.42|
|2021-12-29 10:18 (1640769481)|7.94|7.98|
|2021-12-29 10:19 (1640769541)|7.42|7.43|
|...                          |... |... |
I want a query that selects data from the last 24 hours and outputs pm25 and pm10 as NULL for every minute that has no data in the table. For the table above, the result would look like this:
|time                         |pm25|pm10|
|-----------------------------|----|----|
|...                          |... |... |
|2021-12-28 18:44 (1640713462)|9.19|9.27|
|2021-12-28 18:45 (1640713522)|9.65|9.69|
|2021-12-28 18:46 (1640713582)|9.68|9.76|
|2021-12-28 18:47 (1640713642)|NULL|NULL|
|2021-12-28 18:48 (1640713702)|NULL|NULL|
|2021-12-28 18:49 (1640713762)|NULL|NULL|
|...                          |... |... |
|2021-12-29 10:14 (1640769262)|NULL|NULL|
|2021-12-29 10:15 (1640769322)|NULL|NULL|
|2021-12-29 10:16 (1640769382)|NULL|NULL|
|2021-12-29 10:17 (1640769421)|7.42|7.42|
|2021-12-29 10:18 (1640769481)|7.94|7.98|
|2021-12-29 10:19 (1640769541)|7.42|7.43|
|...                          |... |... |
I don't mind if the seconds differ because the times are generated.
I tried generating the times for the last 24 hours using code from https://stackoverflow.com/a/32987070, and that works as I wanted:
WITH RECURSIVE dates(generated_time) AS (
VALUES(datetime('now', '-1 minute', 'localtime'))
UNION ALL
SELECT datetime(generated_time, '-1 minute')
FROM dates
LIMIT 1440
)
SELECT strftime('%Y-%m-%d %H:%M', datetime(generated_time)) AS time
FROM dates;
But I don't know how to add (JOIN) the sensor data (columns pm25, pm10) to the query above. I tried this, but it returns 0 rows:
WITH RECURSIVE dates(generated_time) AS (
VALUES(datetime('now', '-1 minute', 'localtime'))
UNION ALL
SELECT datetime(generated_time, '-1 minute')
FROM dates
LIMIT 1440
)
SELECT
strftime('%Y-%m-%d %H:%M', datetime(generated_time)) AS generated_time,
pm25,
pm10
FROM
dates
INNER JOIN aqi ON generated_time = strftime('%Y-%m-%d %H:%M', datetime(aqi.time));
Probably it's something really obvious, that I'm missing, but I have no idea :/
EDIT:
As @DrummerMann pointed out, it works with LEFT JOIN, but the query takes about a minute to execute (the table holds around 14,000 rows):
WITH RECURSIVE dates(time) AS (
VALUES(datetime('now', '-1 minute', 'localtime'))
UNION ALL
SELECT datetime(time, '-1 minute')
FROM dates
LIMIT 1440
)
SELECT
dates.time,
aqi.pm25,
aqi.pm10
FROM
dates
LEFT JOIN aqi ON strftime('%Y-%m-%d %H:%M', datetime(dates.time)) = strftime('%Y-%m-%d %H:%M', datetime(aqi.time, 'unixepoch', 'localtime'))
ORDER BY dates.time;
Is there any better way to do that?
Try this version of the CTE, which uses integer Unix timestamps with the seconds stripped off and no function calls in the ON clause of the join:
WITH RECURSIVE dates(generated_time) AS (
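-- No 'localtime' modifier here: it would shift the epoch by the UTC offset,
-- and the join below needs true UTC epochs to match aqi.time
SELECT strftime('%s', 'now', '-1 minute') / 60 * 60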
UNION ALL
SELECT generated_time - 60
FROM dates
LIMIT 1440
)
SELECT strftime('%Y-%m-%d %H:%M', d.generated_time, 'unixepoch', 'localtime') AS generated_time,
a.pm25,
a.pm10
FROM dates d LEFT JOIN aqi a
ON d.generated_time = a.time / 60 * 60
ORDER BY d.generated_time;
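A minimal way to try the minute-bucket join outside the real database is Python's built-in sqlite3 module. This is only a sketch: the helper name `last_minutes_with_gaps`, the fixed `now_epoch` argument (used instead of 'now' so the result is reproducible) and the in-memory table are assumptions for illustration, and the three sample rows are taken from the question.

```python
import sqlite3

def last_minutes_with_gaps(conn, now_epoch, minutes):
    """Return (minute_epoch, pm25, pm10) rows, oldest first, with NULLs where no reading exists."""
    newest = (now_epoch - 60) // 60 * 60  # one minute back, seconds stripped
    sql = """
        WITH RECURSIVE dates(generated_time) AS (
            SELECT :newest
            UNION ALL
            SELECT generated_time - 60 FROM dates
            LIMIT :n
        )
        SELECT d.generated_time, a.pm25, a.pm10
        FROM dates d
        LEFT JOIN aqi a ON d.generated_time = a.time / 60 * 60
        ORDER BY d.generated_time
    """
    return conn.execute(sql, {"newest": newest, "n": minutes}).fetchall()

# Hypothetical in-memory copy of the aqi table with three readings from the question
conn = sqlite3.connect(":memory:")
conn.execute('CREATE TABLE aqi ("time" datetime, "pm25" real, "pm10" real)')
conn.executemany(
    "INSERT INTO aqi VALUES (?, ?, ?)",
    [(1640713462, 9.19, 9.27), (1640713522, 9.65, 9.69), (1640713582, 9.68, 9.76)],
)
rows = last_minutes_with_gaps(conn, now_epoch=1640713800, minutes=6)
for row in rows:
    print(row)
```

Both sides of the ON clause are plain integers rounded down to the minute, so the minutes without a reading come back with NULL (None in Python) for pm25 and pm10.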
I need some help regarding sum of production count for overnight shifts.
The table just contains a timestamp (generated automatically by SQL Server during INSERT), the number of OK pieces and the number of NOT OK pieces produced at that timestamp.
CREATE TABLE [machine1](
[timestamp] [datetime] NOT NULL,
[OK] [int] NOT NULL,
[NOK] [int] NOT NULL
)
ALTER TABLE [machine1] ADD DEFAULT (getdate()) FOR [timestamp]
The table holds values like these (just an example, there are hundreds of lines each day and the time stamps are not fixed like each hour or each 30mins):
|timestamp              |OK |NOK|
|-----------------------|---|---|
|2022-08-01 05:30:00.000|15 |1  |
|2022-08-01 06:30:00.000|18 |3  |
|...                    |...|...|
|2022-08-01 21:30:00.000|10 |12 |
|2022-08-01 22:30:00.000|0  |3  |
|...                    |...|...|
|2022-08-01 23:59:00.000|1  |2  |
|2022-08-02 00:01:00.000|7  |0  |
|...                    |...|...|
|2022-08-02 05:30:00.000|12 |4  |
|2022-08-02 06:30:00.000|9  |3  |
The production works in shifts like so:
morning shift: 6:00 -> 14:00
afternoon shift: 14:00 -> 22:00
night shift: 22:00 -> 6:00 the next day
I have managed to get sums for the morning and afternoon shifts without issues but I can't figure out how to do the sum for the night shift (I have these SELECTs for each shift stored as a VIEW for easy access).
For the morning shift:
SELECT CAST(timestamp AS date) AS Morning,
SUM(OK) AS SUM_OK,
SUM(NOK) AS SUM_NOK
FROM [machine1]
WHERE DATEPART(hh,timestamp) >= 6 AND DATEPART(hh,timestamp) < 14
GROUP BY CAST(timestamp AS date)
ORDER BY Morning ASC
For the afternoon shift:
SELECT CAST(timestamp AS date) AS Afternoon,
SUM(OK) AS SUM_OK,
SUM(NOK) AS SUM_NOK
FROM [machine1]
WHERE DATEPART(hh,timestamp) >= 14 AND DATEPART(hh,timestamp) < 22
GROUP BY CAST(timestamp AS date)
ORDER BY Afternoon ASC
Since we identify the date of each shift by its start, my idea would be that the result for such SUM of night shift would be
|Night     |SUM_OK|SUM_NOK|Interval covered                                                |
|----------|------|-------|----------------------------------------------------------------|
|2022-08-01|xxx   |xxx    |for interval 2022-08-01 22:00:00.000 -> 2022-08-02 05:59:59.999|
|2022-08-02|xxx   |xxx    |for interval 2022-08-02 22:00:00.000 -> 2022-08-03 05:59:59.999|
|2022-08-03|xxx   |xxx    |for interval 2022-08-03 22:00:00.000 -> 2022-08-04 05:59:59.999|
|2022-08-04|xxx   |xxx    |for interval 2022-08-04 22:00:00.000 -> 2022-08-05 05:59:59.999|
|...       |...   |...    |                                                                |
After a few days of trial and error I have probably found the solution I needed. Using a subquery, I shift all the times in the range 00:00:00 -> 05:59:59 to the previous day and then use that result with the same approach as for the morning and afternoon shifts (because now all the production data from a night shift falls on the same date, between 22:00:00 and 23:59:59).
In case anyone needs it in future:
SELECT
CAST(nightShift.shiftedTime AS date) AS Night,
SUM(nightShift.OK) AS SUM_OK,
SUM(nightShift.NOK) AS SUM_NOK
FROM
(SELECT
-- Move 00:00 -> 05:59 back into the 22:00 -> 23:59 window of the previous day
CASE WHEN (DATEPART(hh, timestamp) < 6 AND DATEPART(hh, timestamp) >= 4) THEN DATEADD(HOUR, -6, timestamp)
WHEN (DATEPART(hh, timestamp) < 4 AND DATEPART(hh, timestamp) >= 2) THEN DATEADD(HOUR, -4, timestamp)
WHEN (DATEPART(hh, timestamp) < 2 AND DATEPART(hh, timestamp) >= 0) THEN DATEADD(HOUR, -2, timestamp)
ELSE timestamp -- 22:00 -> 23:59 already carries the shift's start date
END AS shiftedTime,
[OK],
[NOK]
FROM [machine1]
-- Night-shift rows only: before 06:00 or from 22:00 onwards
WHERE (DATEPART(hh, timestamp) < 6 OR DATEPART(hh, timestamp) >= 22)) nightShift
WHERE DATEPART(hh,nightShift.shiftedTime) >= 22
GROUP BY CAST(nightShift.shiftedTime AS date)
ORDER BY Night ASC
PS: If there is anything wrong with this approach, please feel free to correct me as I'm just newbie in SQL. So far this seems to do exactly what I needed.
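The same bucketing idea can be checked outside the database. The sketch below is not the query above but an equivalent shortcut: subtracting six hours from every night-shift timestamp lands it on the date the shift started. The function name `night_shift_totals` and the tuple layout are assumptions for illustration.

```python
from collections import defaultdict
from datetime import datetime, timedelta

def night_shift_totals(rows):
    """rows: iterable of (timestamp, ok, nok). Returns {shift_start_date: (sum_ok, sum_nok)}."""
    totals = defaultdict(lambda: [0, 0])
    for ts, ok, nok in rows:
        if ts.hour >= 22 or ts.hour < 6:  # night-shift rows only
            # 22:00-23:59 stays on its own date, 00:00-05:59 falls back to the previous day
            shift_date = (ts - timedelta(hours=6)).date()
            totals[shift_date][0] += ok
            totals[shift_date][1] += nok
    return {day: tuple(sums) for day, sums in totals.items()}

# Sample rows copied from the question's table
sample = [
    (datetime(2022, 8, 1, 5, 30), 15, 1),
    (datetime(2022, 8, 1, 6, 30), 18, 3),
    (datetime(2022, 8, 1, 21, 30), 10, 12),
    (datetime(2022, 8, 1, 22, 30), 0, 3),
    (datetime(2022, 8, 1, 23, 59), 1, 2),
    (datetime(2022, 8, 2, 0, 1), 7, 0),
    (datetime(2022, 8, 2, 5, 30), 12, 4),
    (datetime(2022, 8, 2, 6, 30), 9, 3),
]
totals = night_shift_totals(sample)
print(totals)
```

In T-SQL the same shortcut would be grouping by CAST(DATEADD(HOUR, -6, timestamp) AS date), which avoids the three-branch CASE.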
I have power meter data stored in table MeterData:
create table MeterData (
MeterID VARCHAR2(10), DownloadCycle VARCHAR2(6), DateHour Date,
KWH Number(22,6), KW Number(22,6), KVA Number(22,6), KVAR Number(22,6),
CONSTRAINT UniqueDownload UNIQUE(MeterID, DownloadCycle, DateHour))
The data looks like this:
|MeterID|DownloadCycle|DateHour        |KWH    |KW     |KVA       |KVAR   |
|-------|-------------|----------------|-------|-------|----------|-------|
|2319927|202206       |13/06/2022 00:00|0.138  |0.552  |0.552     |0      |
|2500350|202206       |13/06/2022 00:15|0.612  |2.448  |2.916     |1.584  |
|2500351|202206       |13/06/2022 01:30|0.8    |3.2    |3.2358    |0.48   |
|2500352|202206       |13/06/2022 04:00|0.288  |1.152  |1.44      |0.864  |
|2500353|202206       |13/06/2022 05:30|0.90808|3.63232|4.32456   |0      |
|2500396|202206       |13/06/2022 12:00|68.09  |272.36 |277.101157|51.04  |
|2500446|202206       |13/06/2022 18:15|0      |0      |0         |0      |
|2500453|202206       |13/06/2022 21:00|2.772  |11.088 |11.088    |0      |
|2500472|202206       |13/06/2022 23:30|64.8   |259.2  |305.788256|162.24 |
|2500490|202206       |14/06/2022 00:30|2.4    |9.6    |9.6       |0      |
|2501352|202206       |14/06/2022 01:45|11.64  |46.56  |46.56     |0      |
|5187222|202206       |14/06/2022 06:30|1.452  |5.808  |7.392     |0      |
|5284288|202206       |14/06/2022 11:00|66.792 |267.168|267.447334|149.336|
|5516997|202206       |14/06/2022 18:30|0.384  |1.536  |8.112     |0      |
I need to assign every record in table MeterData to a range of hours stored in table HourlyBlocks, in which I'm using intervals as the starting and ending hours:
create table HourlyBlocks (
HourlyBlock VARCHAR2(6) UNIQUE, BlockStart INTERVAL DAY(1) TO SECOND(0),
BlockEnd INTERVAL DAY(1) TO SECOND(0));
insert into HourlyBlocks values (
'Rest', interval '0 05:00:00' day to second, interval '0 18:00:00' day to second);
insert into HourlyBlocks values (
'Peak', interval '0 18:00:00' day to second, interval '0 23:00:00' day to second);
insert into HourlyBlocks values (
'Valley', interval '0 23:00:00' day to second, interval '1 05:00:00' day to second);
(HourlyBlock 'Valley' begins at 23:00:00 and ends at 05:00:00 of the following day).
To test to which HourlyBlock every record in MeterData belongs, I extract the record's hour, minute and second information as an interval with the following, adding 1 day to the interval if it is less than 05:00:00 and thus belongs to HourlyBlock 'Valley':
select distinct m.MeterID, m.DateHour, m.kwh,
NUMTODSINTERVAL(m.DateHour - trunc(m.DateHour), 'DAY') + case when
NUMTODSINTERVAL(m.DateHour - trunc(m.DateHour), 'DAY') <= interval '00 05:00:00'
day to second then interval '1' day else interval '0' day end as intval,
h.HourlyBlock
from MeterData m, HourlyBlocks h
where (NUMTODSINTERVAL(m.DateHour - trunc(m.DateHour ), 'DAY') > h.BlockStart
and NUMTODSINTERVAL(m.DateHour - trunc(m.DateHour ), 'DAY') + case when
NUMTODSINTERVAL(m.DateHour - trunc(m.DateHour ), 'DAY') <= interval '00 05:00:00'
day to second then interval '1' day else interval '0' day end <= h.BlockEnd)
The HourlyBlocks are correctly assigned, except for records where DateHour is between 00:00:00 and 05:00:00!
What am I doing wrong?
The expected output for the sample data provided would be:
|MeterID|DateHour |KWH |intval |HourlyBlock|
|-------|----------------|------|-------------------|-----------|
|2319927|13/06/2022 00:00|0.138 |+00 00:00:00.000000|Valley |
|2500350|13/06/2022 00:15|0.612 |+01 00:15:00.000000|Valley |
|2500351|13/06/2022 01:30|0.8 |+01 01:30:00.000000|Valley |
|2500352|13/06/2022 04:00|0.288 |+01 04:00:00.000000|Valley |
|2500353|13/06/2022 05:30|0.908 |+00 05:30:00.000000|Rest |
|2500396|13/06/2022 12:00|68.09 |+00 12:00:00.000000|Rest |
|2500446|13/06/2022 18:15|0 |+00 18:15:00.000000|Peak |
|2500453|13/06/2022 21:00|2.772 |+00 21:00:00.000000|Peak |
|2500472|13/06/2022 23:30|64.8 |+00 23:30:00.000000|Valley |
|2500490|14/06/2022 00:30|2.4 |+01 00:30:00.000000|Valley |
|2501352|14/06/2022 01:45|11.64 |+01 01:45:00.000000|Valley |
|5187222|14/06/2022 06:30|1.452 |+00 06:30:00.000000|Rest |
|5284288|14/06/2022 11:00|66.792|+00 11:00:00.000000|Rest |
|5516997|14/06/2022 18:30|0.384 |+00 18:30:00.000000|Peak |
(I'm sorry I had to format the output as code. It was the only way around that pesky "Your post appears to contain code that is not properly formatted as code" error.)
I found the fix, and simplified my WHERE clause:
select distinct m.MeterID, m.DateHour, m.kwh,
NUMTODSINTERVAL(m.DateHour - trunc(m.DateHour), 'DAY') + case when
NUMTODSINTERVAL(m.DateHour - trunc(m.DateHour), 'DAY') <= interval '00 05:00:00'
day to second then interval '1' day else interval '0' day end as intval, h.HourlyBlock
from MeterData m, HourlyBlocks h
where NUMTODSINTERVAL(m.DateHour - trunc(m.DateHour), 'DAY') +
case when NUMTODSINTERVAL(m.DateHour - trunc(m.DateHour), 'DAY')
<= interval '00 05:00:00' day to second then interval '1' day else
interval '0' day end between h.BlockStart + interval '1' second and h.BlockEnd
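The matching rule described in the question (time of day strictly after BlockStart and no later than BlockEnd, with one day added to times at or before 05:00:00) can be sketched in Python to check the expected block for each sample row. The names `BLOCKS` and `hourly_block` are hypothetical, and `timedelta` stands in for Oracle's INTERVAL DAY TO SECOND.

```python
from datetime import datetime, timedelta

# Mirror of the HourlyBlocks table: (name, BlockStart, BlockEnd)
BLOCKS = [
    ("Rest", timedelta(hours=5), timedelta(hours=18)),
    ("Peak", timedelta(hours=18), timedelta(hours=23)),
    ("Valley", timedelta(hours=23), timedelta(days=1, hours=5)),
]

def hourly_block(ts):
    """Return the block name for a datetime, or None if no block matches."""
    # Time of day as an interval, like DateHour - trunc(DateHour)
    tod = ts - ts.replace(hour=0, minute=0, second=0, microsecond=0)
    if tod <= timedelta(hours=5):
        tod += timedelta(days=1)  # early-morning hours belong to the block that started at 23:00
    for name, start, end in BLOCKS:
        if start < tod <= end:
            return name
    return None

for stamp in (datetime(2022, 6, 13, 0, 0), datetime(2022, 6, 13, 5, 30),
              datetime(2022, 6, 13, 18, 15), datetime(2022, 6, 13, 23, 30)):
    print(stamp, hourly_block(stamp))
```

At one-second resolution, `start < tod <= end` is the same test as BETWEEN BlockStart + 1 second AND BlockEnd.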
I am generating a time series using the query below.
SELECT date_trunc('day', dd):: TIMESTAMP WITHOUT TIME zone as time_ent
FROM generate_series (
CASE
WHEN MOD(EXTRACT(DAY FROM '2020-12-13 13:02:42'::timestamp)::INT, 4) = 0 THEN
'2020-12-13 13:02:42'::date
ELSE
'2020-12-13 13:02:42'::date + concat(MOD(EXTRACT(DAY FROM '2020-12-13 13:02:42'::timestamp)::INT, 4), ' day')::interval
END
, '2021-12-13 13:02:42'::date
, '5760 min'::INTERVAL
) dd
and it will give me output like below.
2020-12-14 00:00:00.000
2020-12-18 00:00:00.000
2020-12-22 00:00:00.000
2020-12-26 00:00:00.000
2020-12-30 00:00:00.000
2021-01-03 00:00:00.000
but I need output like.
2020-12-16 00:00:00.000
2020-12-20 00:00:00.000
2020-12-24 00:00:00.000
2020-12-28 00:00:00.000
2021-01-01 00:00:00.000
2021-01-05 00:00:00.000
Currently, the days in the series depend on the timestamp that I pass: above it gives me days like 14, 18, 22, ... but I want days like 16, 20, 24, ... (multiples of 4). The days should not depend on the time I pass in the query. I tried many things, without success.
Try this :
SELECT date_trunc('day', dd):: TIMESTAMP WITHOUT TIME zone as time_ent
FROM generate_series ( date_trunc('month', '2020-12-13 13:02:42' :: timestamp) :: date + (ceiling (EXTRACT(DAY FROM '2020-12-13 13:02:42'::timestamp)/4)*4 - 1) :: integer
, '2021-12-13 13:02:42'::date
, '4 days' ::INTERVAL
) dd
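The arithmetic for the series start (first day of the month plus ceiling(day / 4) * 4 - 1 days) can be checked with a short Python sketch. The function name `four_day_series` and the `step` parameter are assumptions for illustration.

```python
import math
from datetime import date, timedelta

def four_day_series(start, end, step=4):
    """Dates from the first multiple-of-`step` day at or after `start`, every `step` days, up to `end`."""
    first_of_month = start.replace(day=1)
    # ceil(day / step) * step is the next multiple of `step`; subtract 1 because day 1 is offset 0
    first = first_of_month + timedelta(days=math.ceil(start.day / step) * step - 1)
    series = []
    current = first
    while current <= end:
        series.append(current)
        current += timedelta(days=step)
    return series

series = four_day_series(date(2020, 12, 13), date(2021, 12, 13))
print(series[:6])
```

Only the first date is snapped to a multiple of 4. Later dates simply step by four days, which is why the series continues with 1 January and 5 January after 28 December.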
Is there a way to generate sequential timestamps in BigQuery that is focused on hours, minutes, and seconds?
In BigQuery you can generate sequential dates by:
select *
FROM UNNEST(GENERATE_DATE_ARRAY('2016-10-18', '2016-10-19', INTERVAL 1 DAY)) as day
This will generate the dates from 2016-10-18 to 2016-10-19 in date intervals
Row day
1 2016-10-18
2 2016-10-19
But let's say I want intervals in 15 minutes or 5 minutes, is there a way to do that?
First, I would recommend "starring" the feature request for GENERATE_TIMESTAMP_ARRAY to express interest in having a function like this. Given GENERATE_ARRAY, though, the best option currently is to use a query of this form:
SELECT TIMESTAMP_ADD('2018-04-01', INTERVAL 15 * x MINUTE)
FROM UNNEST(GENERATE_ARRAY(0, 13)) AS x;
If you want a minute-based GENERATE_TIMESTAMP_ARRAY equivalent, you can use a UDF like this:
CREATE TEMP FUNCTION GenerateMinuteTimestampArray(
t0 TIMESTAMP, t1 TIMESTAMP, minutes INT64) AS (
ARRAY(
SELECT TIMESTAMP_ADD(t0, INTERVAL minutes * x MINUTE)
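-- DIV turns the span in minutes into a count of whole steps, so the array stops at t1
FROM UNNEST(GENERATE_ARRAY(0, DIV(TIMESTAMP_DIFF(t1, t0, MINUTE), minutes))) AS x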
)
);
SELECT ts
FROM UNNEST(GenerateMinuteTimestampArray('2018-04-01', '2018-04-01 12:00:00', 15)) AS ts;
This returns a timestamp for each 15-minute interval between midnight and 12 PM on April 1.
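The index arithmetic behind a minute-based generator can be checked outside BigQuery. The sketch below is a plain Python analogue, not BigQuery code: the function name `minute_timestamp_array` is hypothetical, and it assumes the end timestamp is included whenever it falls exactly on a step.

```python
from datetime import datetime, timedelta

def minute_timestamp_array(t0, t1, minutes):
    """Timestamps from t0 to t1 (inclusive when aligned), spaced `minutes` apart."""
    # Number of whole steps that fit between t0 and t1
    steps = (t1 - t0) // timedelta(minutes=minutes)
    return [t0 + timedelta(minutes=minutes * x) for x in range(steps + 1)]

stamps = minute_timestamp_array(datetime(2018, 4, 1), datetime(2018, 4, 1, 12, 0), 15)
print(len(stamps), stamps[0], stamps[-1])
```

The step count is the span divided by the step size, not the span in minutes itself. Using the raw minute difference as the upper index would run far past the end timestamp.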
Update: You can now use the GENERATE_TIMESTAMP_ARRAY function in BigQuery. If you want to generate timestamps at intervals of 15 minutes, for example, you can use:
SELECT GENERATE_TIMESTAMP_ARRAY('2016-10-18', '2016-10-19', INTERVAL 15 MINUTE);
Epochs seem like the way to go, but this requires converting the dates to epoch seconds first.
select TIMESTAMP_MICROS(CAST(day * 1000000 as INT64))
FROM UNNEST(GENERATE_ARRAY(1522540800, 1525132799, 900)) as day
Row f0_
1 2018-04-01 00:00:00.000 UTC
2 2018-04-01 00:15:00.000 UTC
3 2018-04-01 00:30:00.000 UTC
4 2018-04-01 00:45:00.000 UTC
5 2018-04-01 01:00:00.000 UTC
6 2018-04-01 01:15:00.000 UTC
7 2018-04-01 01:30:00.000 UTC
8 2018-04-01 01:45:00.000 UTC
9 2018-04-01 02:00:00.000 UTC
10 2018-04-01 02:15:00.000 UTC
11 2018-04-01 02:30:00.000 UTC
12 2018-04-01 02:45:00.000 UTC
13 2018-04-01 03:00:00.000 UTC
Given the following table that stores value changes of a variable:
Timestamp Value
13:14 12
14:25 33
15:13 24
15:41 48
16:31 54
17:00 63
19:30 82
22:30 13
I need to construct a query that outputs the following:
Timestamp Value
14:00 12
15:00 33
16:00 48
17:00 63
18:00 63
19:00 63
20:00 82
21:00 82
22:00 82
23:00 13
And so on...
What would be the correct approach to achieve the desired output?
Thanks in advance.
use date_trunc() and date/time operator
roundup example
user=# select datetime from tbl_test limit 1;
datetime
------------------------
2013-07-26 15:36:00+09
(1 row)
user=# select date_trunc('hour', datetime) + interval '1 hour'
from tbl_test limit 1;
?column?
------------------------
2013-07-26 16:00:00+09
(1 row)
formatting example
user=# select to_char(date_trunc('hour', datetime) + interval '1 hour', 'HH24:MI')
from tbl_test limit 1;
to_char
---------
16:00
(1 row)
UPDATED:
you can select latest one using window function.
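-- ORDER BY plus a full frame makes last_value() return the latest change in each hour
SELECT DISTINCT x.hour_label AS timestamp,
last_value(x.value) OVER (PARTITION BY x.hour_label ORDER BY x.ts
ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) AS value
FROM (SELECT TO_CHAR(date_trunc('hour', timestamp) + INTERVAL '1 hour', 'HH24:MI') AS hour_label, timestamp AS ts, value
FROM tbl_test) as x
ORDER BY 1;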
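The desired output in the question also carries the last known value forward into hours with no change (18:00, 19:00 and so on). As a sketch of that rule, "for each full hour, take the latest change at or before it", here is a small Python version. The function name `hourly_values` is hypothetical, and the date 2013-07-26 is an arbitrary assumption, since the question gives times only.

```python
from bisect import bisect_right
from datetime import datetime, timedelta

def hourly_values(changes, start, end):
    """changes: list of (datetime, value) sorted by time.
    Returns [(hour, value)] for every full hour from start to end inclusive."""
    times = [t for t, _ in changes]
    result = []
    hour = start
    while hour <= end:
        i = bisect_right(times, hour)  # number of changes at or before this hour
        result.append((hour, changes[i - 1][1] if i else None))
        hour += timedelta(hours=1)
    return result

day = datetime(2013, 7, 26)  # arbitrary date, for illustration only
raw = [("13:14", 12), ("14:25", 33), ("15:13", 24), ("15:41", 48),
       ("16:31", 54), ("17:00", 63), ("19:30", 82), ("22:30", 13)]
changes = [(day.replace(hour=int(t[:2]), minute=int(t[3:])), v) for t, v in raw]
hourly = hourly_values(changes, day.replace(hour=14), day.replace(hour=23))
for hour, value in hourly:
    print(hour.strftime("%H:%M"), value)
```

In SQL the same effect needs a generated series of hours joined to the latest earlier row, because a window over the existing rows alone cannot produce hours that have no change.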
postgresql reference:
9.9. Date/Time Functions and Operators
9.8. Data Type Formatting Functions