Oracle SQL: Using intervals to specify ranges of hours

I have power meter data stored in table MeterData:
create table MeterData (
MeterID VARCHAR2(10), DownloadCycle VARCHAR2(6), DateHour Date,
KWH Number(22,6), KW Number(22,6), KVA Number(22,6), KVAR Number(22,6),
CONSTRAINT UniqueDownload UNIQUE(MeterID, DownloadCycle, DateHour))
The data looks like this:
|MeterID|DownloadCycle|DateHour        |KWH    |KW     |KVA       |KVAR   |
|-------|-------------|----------------|-------|-------|----------|-------|
|2319927|202206       |13/06/2022 00:00|0.138  |0.552  |0.552     |0      |
|2500350|202206       |13/06/2022 00:15|0.612  |2.448  |2.916     |1.584  |
|2500351|202206       |13/06/2022 01:30|0.8    |3.2    |3.2358    |0.48   |
|2500352|202206       |13/06/2022 04:00|0.288  |1.152  |1.44      |0.864  |
|2500353|202206       |13/06/2022 05:30|0.90808|3.63232|4.32456   |0      |
|2500396|202206       |13/06/2022 12:00|68.09  |272.36 |277.101157|51.04  |
|2500446|202206       |13/06/2022 18:15|0      |0      |0         |0      |
|2500453|202206       |13/06/2022 21:00|2.772  |11.088 |11.088    |0      |
|2500472|202206       |13/06/2022 23:30|64.8   |259.2  |305.788256|162.24 |
|2500490|202206       |14/06/2022 00:30|2.4    |9.6    |9.6       |0      |
|2501352|202206       |14/06/2022 01:45|11.64  |46.56  |46.56     |0      |
|5187222|202206       |14/06/2022 06:30|1.452  |5.808  |7.392     |0      |
|5284288|202206       |14/06/2022 11:00|66.792 |267.168|267.447334|149.336|
|5516997|202206       |14/06/2022 18:30|0.384  |1.536  |8.112     |0      |
I need to assign every record in table MeterData to a range of hours stored in table HourlyBlocks, in which I'm using intervals as the starting and ending hours:
create table HourlyBlocks (
HourlyBlock VARCHAR2(6) UNIQUE, BlockStart INTERVAL DAY(1) TO SECOND(0),
BlockEnd INTERVAL DAY(1) TO SECOND(0));
insert into HourlyBlocks values (
'Rest', interval '0 05:00:00' day to second, interval '0 18:00:00' day to second);
insert into HourlyBlocks values (
'Peak', interval '0 18:00:00' day to second, interval '0 23:00:00' day to second);
insert into HourlyBlocks values (
'Valley', interval '0 23:00:00' day to second, interval '1 05:00:00' day to second);
(HourlyBlock 'Valley' begins at 23:00:00 and ends at 05:00:00 of the following day).
To test to which HourlyBlock every record in MeterData belongs, I extract the record's hour, minute and second information as an interval with the following, adding 1 day to the interval if it is less than 05:00:00 and thus belongs to HourlyBlock 'Valley':
select distinct m.MeterID, m.DateHour, m.kwh,
NUMTODSINTERVAL(m.DateHour - trunc(m.DateHour), 'DAY') + case when
NUMTODSINTERVAL(m.DateHour - trunc(m.DateHour), 'DAY') <= interval '00 05:00:00'
day to second then interval '1' day else interval '0' day end as intval,
h.HourlyBlock
from MeterData m, HourlyBlocks h
where (NUMTODSINTERVAL(m.DateHour - trunc(m.DateHour ), 'DAY') > h.BlockStart
and NUMTODSINTERVAL(m.DateHour - trunc(m.DateHour ), 'DAY') + case when
NUMTODSINTERVAL(m.DateHour - trunc(m.DateHour ), 'DAY') <= interval '00 05:00:00'
day to second then interval '1' day else interval '0' day end <= h.BlockEnd)
The HourlyBlock values are assigned correctly, except for records where DateHour is between 00:00:00 and 05:00:00!
What am I doing wrong?
The expected output for the sample data provided would be:
|MeterID|DateHour |KWH |intval |HourlyBlock|
|-------|----------------|------|-------------------|-----------|
|2319927|13/06/2022 00:00|0.138 |+01 00:00:00.000000|Valley |
|2500350|13/06/2022 00:15|0.612 |+01 00:15:00.000000|Valley |
|2500351|13/06/2022 01:30|0.8 |+01 01:30:00.000000|Valley |
|2500352|13/06/2022 04:00|0.288 |+01 04:00:00.000000|Valley |
|2500353|13/06/2022 05:30|0.908 |+00 05:30:00.000000|Rest |
|2500396|13/06/2022 12:00|68.09 |+00 12:00:00.000000|Rest |
|2500446|13/06/2022 18:15|0 |+00 18:15:00.000000|Peak |
|2500453|13/06/2022 21:00|2.772 |+00 21:00:00.000000|Peak |
|2500472|13/06/2022 23:30|64.8 |+00 23:30:00.000000|Valley |
|2500490|14/06/2022 00:30|2.4 |+01 00:30:00.000000|Valley |
|2501352|14/06/2022 01:45|11.64 |+01 01:45:00.000000|Valley |
|5187222|14/06/2022 06:30|1.452 |+00 06:30:00.000000|Rest |
|5284288|14/06/2022 11:00|66.792|+00 11:00:00.000000|Rest |
|5516997|14/06/2022 18:30|0.384 |+00 18:30:00.000000|Peak |
(I'm sorry I had to format the output as code. It was the only way around that pesky "Your post appears to contain code that is not properly formatted as code" error.)

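I found the fix, and simplified my WHERE clause. The original query compared the unshifted time of day against h.BlockStart, so a time after midnight such as 00:15 could never be greater than the 'Valley' start of 23:00. The version below compares the same day-shifted interval against both bounds of the block: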
select distinct m.MeterID, m.DateHour, m.kwh,
NUMTODSINTERVAL(m.DateHour - trunc(m.DateHour), 'DAY') + case when
NUMTODSINTERVAL(m.DateHour - trunc(m.DateHour), 'DAY') <= interval '00 05:00:00'
day to second then interval '1' day else interval '0' day end as intval, h.HourlyBlock
from MeterData m, HourlyBlocks h
where NUMTODSINTERVAL(m.DateHour - trunc(m.DateHour), 'DAY') +
case when NUMTODSINTERVAL(m.DateHour - trunc(m.DateHour), 'DAY')
<= interval '00 05:00:00' day to second then interval '1' day else
interval '0' day end between h.BlockStart + interval '1' second and h.BlockEnd
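
For anyone who wants to check the block logic outside the database, here is a minimal Python sketch of the same idea: take the time of day as an offset from midnight, add one day when it is at or before 05:00, and test it against half-open ranges (start excluded, end included). The names BLOCKS, shifted_offset and hourly_block are my own for illustration and are not part of the original schema.

```python
from datetime import datetime, timedelta

# Mirrors the HourlyBlocks rows as (name, start, end) offsets from midnight.
# 'Valley' ends at 1 day 05:00, i.e. 05:00 of the following day.
BLOCKS = [
    ("Rest", timedelta(hours=5), timedelta(hours=18)),
    ("Peak", timedelta(hours=18), timedelta(hours=23)),
    ("Valley", timedelta(hours=23), timedelta(days=1, hours=5)),
]


def shifted_offset(ts):
    """Time of day as an offset from midnight, plus one day if it is <= 05:00."""
    midnight = ts.replace(hour=0, minute=0, second=0, microsecond=0)
    offset = ts - midnight
    if offset <= timedelta(hours=5):
        offset += timedelta(days=1)
    return offset


def hourly_block(ts):
    """Return the block whose range (start, end] contains the shifted offset."""
    offset = shifted_offset(ts)
    for name, start, end in BLOCKS:
        # The same shifted value is compared against BOTH bounds.
        if start < offset <= end:
            return name
    return None


if __name__ == "__main__":
    for sample in (datetime(2022, 6, 13, 0, 15), datetime(2022, 6, 13, 18, 15)):
        print(sample, hourly_block(sample))
```

Comparing the shifted value against both bounds is the point of the fix; comparing the raw time of day against the start bound is what left the early-morning records unmatched.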

Related

How to convert data from one row to multiple rows based on Date

I wish to convert data from one row to multiple rows based on start_time and end_time.
INPUT DATA:
|ID    |Start_Time      |End_Time        |Down_Mins|
|------|----------------|----------------|---------|
|ABC123|11/22/2022 12:01|11/29/2022 14:33|10232.47 |
I need to write SQL for this requirement:
OUTPUT_DATA:
|ID    |Start_Time      |End_Time        |Down_Mins|
|------|----------------|----------------|---------|
|ABC123|11/22/2022 12:01|11/23/2022 7:00 |1138.55  |
|ABC123|11/23/2022 7:00 |11/24/2022 7:00 |1440     |
|ABC123|11/24/2022 7:00 |11/25/2022 7:00 |1440     |
|ABC123|11/25/2022 7:00 |11/26/2022 7:00 |1440     |
|ABC123|11/26/2022 7:00 |11/27/2022 7:00 |1440     |
|ABC123|11/27/2022 7:00 |11/28/2022 7:00 |1440     |
|ABC123|11/28/2022 7:00 |11/29/2022 7:00 |1440     |
|ABC123|11/29/2022 7:00 |11/29/2022 14:33|453.92   |
You can use a recursive query to split the data into rows for each 24-hour period starting at 7am:
WITH days (id, start_time, day_end, end_time, day_mins, down_mins) AS (
SELECT id,
start_time,
LEAST(TRUNC(start_time - INTERVAL '7' HOUR) + INTERVAL '31' HOUR, end_time),
end_time,
LEAST((LEAST(TRUNC(start_time - INTERVAL '7' HOUR) + INTERVAL '31' HOUR, end_time) - start_time) * 24 * 60, down_mins),
down_mins - LEAST((LEAST(TRUNC(start_time - INTERVAL '7' HOUR) + INTERVAL '31' HOUR, end_time) - start_time) * 24 * 60, down_mins)
FROM table_name
UNION ALL
SELECT id,
day_end,
LEAST(day_end + INTERVAL '24' HOUR, end_time),
end_time,
LEAST((LEAST(day_end + INTERVAL '24' HOUR, end_time) - day_end) * 24 * 60, down_mins),
down_mins - LEAST((LEAST(day_end + INTERVAL '24' HOUR, end_time) - day_end) * 24 * 60, down_mins)
FROM days
WHERE day_end < end_time
AND down_mins > 0
)
SEARCH DEPTH FIRST BY id, start_time SET order_id
SELECT id,
start_time,
day_end AS end_time,
day_mins AS down_mins
FROM days;
Which, for the sample data:
CREATE TABLE table_name (ID, Start_Time, End_Time, Down_Mins) AS
SELECT 'ABC123',
DATE '2022-11-23' + INTERVAL '7' HOUR - NUMTODSINTERVAL(1138.55, 'MINUTE'),
DATE '2022-11-23' + INTERVAL '7' HOUR + NUMTODSINTERVAL(10232.47 - 1138.55, 'MINUTE'),
10232.47
FROM DUAL;
Outputs:
|ID    |START_TIME         |END_TIME           |DOWN_MINS                               |
|------|-------------------|-------------------|----------------------------------------|
|ABC123|2022-11-22 12:01:27|2022-11-23 07:00:00|1138.55                                 |
|ABC123|2022-11-23 07:00:00|2022-11-24 07:00:00|1440                                    |
|ABC123|2022-11-24 07:00:00|2022-11-25 07:00:00|1440                                    |
|ABC123|2022-11-25 07:00:00|2022-11-26 07:00:00|1440                                    |
|ABC123|2022-11-26 07:00:00|2022-11-27 07:00:00|1440                                    |
|ABC123|2022-11-27 07:00:00|2022-11-28 07:00:00|1440                                    |
|ABC123|2022-11-28 07:00:00|2022-11-29 07:00:00|1440                                    |
|ABC123|2022-11-29 07:00:00|2022-11-29 14:33:55|453.916666666666666666666666666666666667|
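
The boundary arithmetic in the query (truncate start_time minus 7 hours, then add 31 hours to land on the next 07:00) can be sketched in Python as a quick way to sanity-check the row count and minutes. This is only an illustration under my own assumptions: the function name split_at_day_start is hypothetical, and unlike the SQL it derives the minutes purely from the timestamps rather than carrying down_mins along.

```python
from datetime import datetime, timedelta


def split_at_day_start(start, end, day_start_hour=7):
    """Split [start, end) into pieces that break at each day_start_hour:00."""
    # Equivalent of TRUNC(start - 7h) + 31h: the first 07:00 strictly after start.
    shifted = start - timedelta(hours=day_start_hour)
    boundary = shifted.replace(hour=0, minute=0, second=0, microsecond=0)
    boundary += timedelta(hours=24 + day_start_hour)

    rows = []
    while start < end:
        piece_end = min(boundary, end)  # LEAST(day_end, end_time)
        minutes = (piece_end - start).total_seconds() / 60
        rows.append((start, piece_end, minutes))
        start = piece_end
        boundary += timedelta(days=1)
    return rows


if __name__ == "__main__":
    for row in split_at_day_start(datetime(2022, 11, 22, 12, 1),
                                  datetime(2022, 11, 29, 14, 33)):
        print(*row)
```

With the minute-precision timestamps from the question this yields eight pieces, the first and last being partial days and the six in between being full 1440-minute days.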

SUM of production counts for "overnight work shift" in MS SQL (2019)

I need some help regarding sum of production count for overnight shifts.
The table just contains a timestamp (that is automatically generated by SQL Server during INSERT), the number of OK produced pieces and the number of NOT OK produced pieces at that given timestamp.
CREATE TABLE [machine1](
[timestamp] [datetime] NOT NULL,
[OK] [int] NOT NULL,
[NOK] [int] NOT NULL
)
ALTER TABLE [machine1] ADD DEFAULT (getdate()) FOR [timestamp]
The table holds values like these (just an example, there are hundreds of lines each day and the time stamps are not fixed like each hour or each 30mins):
|timestamp              |OK |NOK|
|-----------------------|---|---|
|2022-08-01 05:30:00.000|15 |1  |
|2022-08-01 06:30:00.000|18 |3  |
|...                    |...|...|
|2022-08-01 21:30:00.000|10 |12 |
|2022-08-01 22:30:00.000|0  |3  |
|...                    |...|...|
|2022-08-01 23:59:00.000|1  |2  |
|2022-08-02 00:01:00.000|7  |0  |
|...                    |...|...|
|2022-08-02 05:30:00.000|12 |4  |
|2022-08-02 06:30:00.000|9  |3  |
The production works in shifts like so:
morning shift: 6:00 -> 14:00
afternoon shift: 14:00 -> 22:00
night shift: 22:00 -> 6:00 the next day
I have managed to get sums for the morning and afternoon shifts without issues but I can't figure out how to do the sum for the night shift (I have these SELECTs for each shift stored as a VIEW for easy access).
For the morning shift:
SELECT CAST(timestamp AS date) AS Morning,
SUM(OK) AS SUM_OK,
SUM(NOK) AS SUM_NOK
FROM [machine1]
WHERE DATEPART(hh,timestamp) >= 6 AND DATEPART(hh,timestamp) < 14
GROUP BY CAST(timestamp AS date)
ORDER BY Morning ASC
For the afternoon shift:
SELECT CAST(timestamp AS date) AS Afternoon,
SUM(OK) AS SUM_OK,
SUM(NOK) AS SUM_NOK
FROM [machine1]
WHERE DATEPART(hh,timestamp) >= 14 AND DATEPART(hh,timestamp) < 22
GROUP BY CAST(timestamp AS date)
ORDER BY Afternoon ASC
Since we identify the date of each shift by its start, my idea would be that the result for such SUM of night shift would be
|Night     |SUM_OK|SUM_NOK|Interval                                          |
|----------|------|-------|--------------------------------------------------|
|2022-08-01|xxx   |xxx    |2022-08-01 22:00:00.000 -> 2022-08-02 05:59:59.999|
|2022-08-02|xxx   |xxx    |2022-08-02 22:00:00.000 -> 2022-08-03 05:59:59.999|
|2022-08-03|xxx   |xxx    |2022-08-03 22:00:00.000 -> 2022-08-04 05:59:59.999|
|2022-08-04|xxx   |xxx    |2022-08-04 22:00:00.000 -> 2022-08-05 05:59:59.999|
|...       |...   |...    |...                                               |
After a few days of trial and error I have probably managed to find the needed solution. Using a subquery, I shift all the times in the range 00:00:00 -> 05:59:59 to the previous day and then use that result with the same approach as for the morning and afternoon shifts (because now all the production data from the night shift falls on the same date, between 22:00:00 and 23:59:59).
In case anyone needs it in future:
SELECT
CAST(nightShift.shiftedTime AS date) AS Night,
SUM(nightShift.OK) AS SUM_OK,
SUM(nightShift.NOK) AS SUM_NOK
FROM
(SELECT
-- Move the after-midnight hours back into 22:00-23:59 of the previous day;
-- rows already in 22:00-23:59 keep their own timestamp.
CASE WHEN (DATEPART(hh, [timestamp]) < 6 AND DATEPART(hh, [timestamp]) >= 4) THEN DATEADD(HOUR, -6, [timestamp])
WHEN (DATEPART(hh, [timestamp]) < 4 AND DATEPART(hh, [timestamp]) >= 2) THEN DATEADD(HOUR, -4, [timestamp])
WHEN (DATEPART(hh, [timestamp]) < 2 AND DATEPART(hh, [timestamp]) >= 0) THEN DATEADD(HOUR, -2, [timestamp])
ELSE [timestamp]
END AS shiftedTime,
[OK],
[NOK]
FROM [machine1]
-- Night shift rows only: 22:00-23:59 plus 00:00-05:59 of the next day
WHERE (DATEPART(hh, [timestamp]) >= 22 OR DATEPART(hh, [timestamp]) < 6)) nightShift
WHERE DATEPART(hh, nightShift.shiftedTime) >= 22
GROUP BY CAST(nightShift.shiftedTime AS date)
ORDER BY Night ASC
PS: If there is anything wrong with this approach, please feel free to correct me, as I'm just a newbie in SQL. So far this seems to do exactly what I needed.
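
A simpler way to reason about the night shift, shown here as a small Python sketch rather than T-SQL, is to subtract six hours from every night-shift timestamp: 22:00 becomes 16:00 of the same day and 05:59 becomes 23:59 of the previous day, so the shifted date is always the date on which the shift started. This is a different technique from the three-way CASE above, and the function names night_shift_date and night_shift_totals are made up for the example.

```python
from collections import defaultdict
from datetime import datetime, timedelta


def night_shift_date(ts):
    """Start date of the night shift (22:00 -> 06:00) containing ts, else None."""
    if ts.hour >= 22 or ts.hour < 6:
        # Subtracting 6 hours maps 00:00-05:59 onto the previous calendar day
        # and leaves 22:00-23:59 on its own day.
        return (ts - timedelta(hours=6)).date()
    return None


def night_shift_totals(rows):
    """Sum OK and NOK per night shift; rows are (timestamp, ok, nok) tuples."""
    totals = defaultdict(lambda: [0, 0])
    for ts, ok, nok in rows:
        day = night_shift_date(ts)
        if day is not None:
            totals[day][0] += ok
            totals[day][1] += nok
    return {day: tuple(values) for day, values in sorted(totals.items())}


if __name__ == "__main__":
    sample = [
        (datetime(2022, 8, 1, 22, 30), 0, 3),
        (datetime(2022, 8, 2, 0, 1), 7, 0),
        (datetime(2022, 8, 2, 6, 30), 9, 3),
    ]
    print(night_shift_totals(sample))
```

In SQL Server the same idea would presumably be a GROUP BY on CAST(DATEADD(HOUR, -6, [timestamp]) AS date) with a filter on hours >= 22 or < 6, but I have only checked the logic in the sketch above.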

How to join generated datetimes with data in SQLite database?

I have a table with data from a sensor created like that:
CREATE TABLE IF NOT EXISTS "aqi" (
"time" datetime,
"pm25" real,
"pm10" real
);
When the sensor is running, it sends data to a server (which writes it to a database) every second. But when the sensor is not running, there are "gaps" in the data in the database like this (I've rewritten the time column in a readable format and timezone GMT+01, leaving the raw data in parentheses):
|time                         |pm25|pm10|
|-----------------------------|----|----|
|...                          |... |... |
|2021-12-28 18:44 (1640713462)|9.19|9.27|
|2021-12-28 18:45 (1640713522)|9.65|9.69|
|2021-12-28 18:46 (1640713582)|9.68|9.76|
|2021-12-29 10:17 (1640769421)|7.42|7.42|
|2021-12-29 10:18 (1640769481)|7.94|7.98|
|2021-12-29 10:19 (1640769541)|7.42|7.43|
|...                          |... |... |
I wanted to create a query that selects data from the last 24 hours, outputting pm25 and pm10 as NULL if there is no data in the table for that time. So the table above would look like this:
|time                         |pm25|pm10|
|-----------------------------|----|----|
|...                          |... |... |
|2021-12-28 18:44 (1640713462)|9.19|9.27|
|2021-12-28 18:45 (1640713522)|9.65|9.69|
|2021-12-28 18:46 (1640713582)|9.68|9.76|
|2021-12-28 18:47 (1640713642)|NULL|NULL|
|2021-12-28 18:48 (1640713702)|NULL|NULL|
|2021-12-28 18:49 (1640713762)|NULL|NULL|
|...                          |... |... |
|2021-12-29 10:14 (1640769262)|NULL|NULL|
|2021-12-29 10:15 (1640769322)|NULL|NULL|
|2021-12-29 10:16 (1640769382)|NULL|NULL|
|2021-12-29 10:17 (1640769421)|7.42|7.42|
|2021-12-29 10:18 (1640769481)|7.94|7.98|
|2021-12-29 10:19 (1640769541)|7.42|7.43|
|...                          |... |... |
I don't mind if the seconds would be different because of the generation of time...
I tried generating time for the last 24 hours using code from https://stackoverflow.com/a/32987070 and that works, as I wanted:
WITH RECURSIVE dates(generated_time) AS (
VALUES(datetime('now', '-1 minute', 'localtime'))
UNION ALL
SELECT datetime(generated_time, '-1 minute')
FROM dates
LIMIT 1440
)
SELECT strftime('%Y-%m-%d %H:%M', datetime(generated_time)) AS time
FROM dates;
But I don't know how to add (JOIN) data from the sensor (columns pm25, pm10) to query above... I tried something, but it outputs 0 rows:
WITH RECURSIVE dates(generated_time) AS (
VALUES(datetime('now', '-1 minute', 'localtime'))
UNION ALL
SELECT datetime(generated_time, '-1 minute')
FROM dates
LIMIT 1440
)
SELECT
strftime('%Y-%m-%d %H:%M', datetime(generated_time)) AS generated_time,
pm25,
pm10
FROM
dates
INNER JOIN aqi ON generated_time = strftime('%Y-%m-%d %H:%M', datetime(aqi.time));
Probably it's something really obvious, that I'm missing, but I have no idea :/
EDIT:
As @DrummerMann pointed out, it works with LEFT JOIN, but the query takes around a whole minute to execute (the database holds around 14 000 values):
WITH RECURSIVE dates(time) AS (
VALUES(datetime('now', '-1 minute', 'localtime'))
UNION ALL
SELECT datetime(time, '-1 minute')
FROM dates
LIMIT 1440
)
SELECT
dates.time,
aqi.pm25,
aqi.pm10
FROM
dates
LEFT JOIN aqi ON strftime('%Y-%m-%d %H:%M', datetime(dates.time)) = strftime('%Y-%m-%d %H:%M', datetime(aqi.time, 'unixepoch', 'localtime'))
ORDER BY dates.time;
Is there any better way to do that?
Try this version of the cte, which uses integer unix timestamps where the seconds are stripped off and there are no functions in the ON clause of the join:
WITH RECURSIVE dates(generated_time) AS (
SELECT strftime('%s', 'now', '-1 minute') / 60 * 60
UNION ALL
SELECT generated_time - 60
FROM dates
LIMIT 1440
)
SELECT strftime('%Y-%m-%d %H:%M', d.generated_time, 'unixepoch', 'localtime') AS generated_time,
a.pm25,
a.pm10
FROM dates d LEFT JOIN aqi a
ON d.generated_time = a.time / 60 * 60;
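
To try the minute-bucket join without waiting for real sensor data, here is a small Python sketch using the standard sqlite3 module and an in-memory database. It is not the exact query above: the newest timestamp is passed in as a parameter instead of 'now' so that the result is reproducible, and the function name minute_series_with_readings is my own.

```python
import sqlite3


def minute_series_with_readings(readings, newest_epoch, minutes):
    """LEFT JOIN a generated per-minute series against aqi rows (epoch seconds)."""
    con = sqlite3.connect(":memory:")
    con.execute('CREATE TABLE aqi ("time" datetime, pm25 real, pm10 real)')
    con.executemany("INSERT INTO aqi VALUES (?, ?, ?)", readings)
    sql = """
        WITH RECURSIVE dates(generated_time) AS (
            SELECT ? / 60 * 60            -- integer division strips the seconds
            UNION ALL
            SELECT generated_time - 60    -- step back one minute at a time
            FROM dates
            LIMIT ?
        )
        SELECT d.generated_time, a.pm25, a.pm10
        FROM dates d
        LEFT JOIN aqi a ON d.generated_time = a.time / 60 * 60
        ORDER BY d.generated_time
    """
    rows = con.execute(sql, (newest_epoch, minutes)).fetchall()
    con.close()
    return rows


if __name__ == "__main__":
    sample = [
        (1640713462, 9.19, 9.27),
        (1640713522, 9.65, 9.69),
        (1640713582, 9.68, 9.76),
    ]
    for row in minute_series_with_readings(sample, 1640713762, 6):
        print(row)
```

Minutes with no reading come back with None (NULL) in the pm25 and pm10 positions. If several readings fall inside the same minute, each of them produces its own output row, so an aggregate per minute may be needed in that case.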

get time series in 4 days of interval

I am generating a time series using the query below.
SELECT date_trunc('day', dd):: TIMESTAMP WITHOUT TIME zone as time_ent
FROM generate_series (
CASE
WHEN MOD(EXTRACT(DAY FROM '2020-12-13 13:02:42'::timestamp)::INT, 4) = 0 THEN
'2020-12-13 13:02:42'::date
ELSE
'2020-12-13 13:02:42'::date + concat(MOD(EXTRACT(DAY FROM '2020-12-13 13:02:42'::timestamp)::INT, 4), ' day')::interval
END
, '2021-12-13 13:02:42'::date
, '5760 min'::INTERVAL
) dd
and it will give me output like below.
2020-12-14 00:00:00.000
2020-12-18 00:00:00.000
2020-12-22 00:00:00.000
2020-12-26 00:00:00.000
2020-12-30 00:00:00.000
2021-01-03 00:00:00.000
but I need output like.
2020-12-16 00:00:00.000
2020-12-20 00:00:00.000
2020-12-24 00:00:00.000
2020-12-28 00:00:00.000
2021-01-01 00:00:00.000
2021-01-05 00:00:00.000
Currently, the days in the series depend on the timestamp that I pass. In the example above it gives me days like 14, 18, 22, ... but I want days like 16, 20, 24, i.e. multiples of 4. The days should not depend on the time I pass in the query. I tried many things without success.
Try this:
SELECT date_trunc('day', dd):: TIMESTAMP WITHOUT TIME zone as time_ent
FROM generate_series ( date_trunc('month', '2020-12-13 13:02:42' :: timestamp) :: date + (ceiling (EXTRACT(DAY FROM '2020-12-13 13:02:42'::timestamp)/4)*4 - 1) :: integer
, '2021-12-13 13:02:42'::date
, '4 days' ::INTERVAL
) dd
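
The start-date arithmetic in that query (first of the month plus ceiling(day / 4) * 4 - 1 days) is easy to check in a few lines of Python. This is just a sketch of the same calculation with a hypothetical function name, four_day_series, not something PostgreSQL provides.

```python
import math
from datetime import date, timedelta


def four_day_series(start, end, step_days=4):
    """Dates from the first multiple-of-step day on or after start, every step_days."""
    # Round the day of month up to the next multiple of step_days (13 -> 16).
    first_day = math.ceil(start.day / step_days) * step_days
    # First of the month plus (first_day - 1) days, as in the SQL expression.
    current = start.replace(day=1) + timedelta(days=first_day - 1)
    series = []
    while current <= end:
        series.append(current)
        current += timedelta(days=step_days)
    return series


if __name__ == "__main__":
    for d in four_day_series(date(2020, 12, 13), date(2021, 12, 13))[:6]:
        print(d)
```

Note that only the first value is guaranteed to fall on a multiple of 4. After that the series simply steps by four days, which is why it continues with 1 January and 5 January, matching the output asked for in the question.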

Find out number of months between 2 dates

select
(age('2012-11-30 00:00:00'::timestamp, '2012-10-31 00:00:00'::timestamp)),
(age('2012-12-31 00:00:00'::timestamp, '2012-10-31 00:00:00'::timestamp)),
(age('2013-01-31 00:00:00'::timestamp, '2012-10-31 00:00:00'::timestamp)),
(age('2013-02-28 00:00:00'::timestamp, '2012-10-31 00:00:00'::timestamp))
which gives the following:
0 years 0 mons 30 days 0 hours 0 mins 0.00 secs
0 years 2 mons 0 days 0 hours 0 mins 0.00 secs
0 years 3 mons 0 days 0 hours 0 mins 0.00 secs
0 years 3 mons 28 days 0 hours 0 mins 0.00 secs
But I want the following definition of a month. How can I do it?
0 years 1 mons 0 days 0 hours 0 mins 0.00 secs
0 years 2 mons 0 days 0 hours 0 mins 0.00 secs
0 years 3 mons 0 days 0 hours 0 mins 0.00 secs
0 years 4 mons 0 days 0 hours 0 mins 0.00 secs
The expression
age('2012-11-30 00:00:00'::timestamp, '2012-10-31 00:00:00'::timestamp)
gives 30 days. We expect 1 month, as both values point to the last day of a month. If we add 1 day to each value, we get the first day of the following month, and
age('2012-12-01 00:00:00'::timestamp, '2012-11-01 00:00:00'::timestamp)
will give us 1 month as expected. So let us check whether both values are the last day of a month, and in that case return the age interval of the following days. Otherwise we return the age interval of the original values:
create or replace function age_m (t1 timestamp, t2 timestamp)
returns interval language plpgsql immutable
as $$
declare
_t1 timestamp = t1+ interval '1 day';
_t2 timestamp = t2+ interval '1 day';
begin
if extract(day from _t1) = 1 and extract(day from _t2) = 1 then
return age(_t1, _t2);
else
return age(t1, t2);
end if;
end $$;
Some examples:
with my_table(date1, date2) as (
values
('2012-11-30 00:00:00'::timestamp, '2012-10-31 00:00:00'::timestamp),
('2012-12-31 00:00:00'::timestamp, '2012-10-31 00:00:00'::timestamp),
('2013-01-31 00:00:00'::timestamp, '2012-10-31 00:00:00'::timestamp),
('2013-02-28 00:00:00'::timestamp, '2012-10-31 00:00:00'::timestamp)
)
select *, age(date1, date2), age_m(date1, date2)
from my_table
date1 | date2 | age | age_m
---------------------+---------------------+----------------+--------
2012-11-30 00:00:00 | 2012-10-31 00:00:00 | 30 days | 1 mon
2012-12-31 00:00:00 | 2012-10-31 00:00:00 | 2 mons | 2 mons
2013-01-31 00:00:00 | 2012-10-31 00:00:00 | 3 mons | 3 mons
2013-02-28 00:00:00 | 2012-10-31 00:00:00 | 3 mons 28 days | 4 mons
(4 rows)
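
The month-end rule behind age_m can also be sketched in Python, which may help when testing edge cases. This is a simplified illustration that only counts whole months (it does not reproduce the days part of age()), and the helper names is_month_end, whole_months and months_m are my own.

```python
import calendar
from datetime import date, timedelta


def is_month_end(d):
    """True if d is the last day of its month."""
    return d.day == calendar.monthrange(d.year, d.month)[1]


def whole_months(later, earlier):
    """Whole calendar months from earlier to later, ignoring leftover days."""
    months = (later.year - earlier.year) * 12 + (later.month - earlier.month)
    if later.day < earlier.day:
        months -= 1  # the last month is not complete yet
    return months


def months_m(later, earlier):
    """Like whole_months, but month end to month end counts as full months."""
    if is_month_end(later) and is_month_end(earlier):
        # Same trick as age_m: move both dates to the first of the next month.
        later += timedelta(days=1)
        earlier += timedelta(days=1)
    return whole_months(later, earlier)


if __name__ == "__main__":
    start = date(2012, 10, 31)
    for end in (date(2012, 11, 30), date(2013, 2, 28)):
        print(end, whole_months(end, start), months_m(end, start))
```

For the four sample pairs this gives 1, 2, 3 and 4 months, while the plain count gives 0, 2, 3 and 3.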
It seems like you always use the last day of the month. What you are trying to do works flawlessly with the first day of the month. So use that instead. You can always subtract a single day to get the last day of the previous month.
@klin's function is based on that. For dates (instead of timestamps), simplify:
_t1 date = t1 + 1;
_t2 date = t2 + 1;
One can just add / subtract integer values from dates (but not timestamps).
If you want to add "a month", don't just increase the month field, since this can fail like you have experienced. And there is also the wrap around at the end of the year. Add an interval '1 month' instead.
SELECT (mydate + interval '1 month')::date AS mydate_next_month;
I cast back to date because the result of date + interval is a timestamp.
This "rounds down" automatically, if the last day of the next month is before the day in the original date. Note that it does not "round up" in the opposite case. If you want that, operate with the first of the month instead as explained above.
This is a modified version of the time rounding function located on PostgreSQL's official wiki:
CREATE OR REPLACE FUNCTION interval_round(base_interval INTERVAL, round_interval INTERVAL) RETURNS INTERVAL AS $BODY$
SELECT justify_interval((EXTRACT(epoch FROM $1)::INTEGER + EXTRACT(epoch FROM $2)::INTEGER / 2)
/ EXTRACT(epoch FROM $2)::INTEGER * EXTRACT(epoch FROM $2)::INTEGER * INTERVAL '1 second');
$BODY$ LANGUAGE SQL STABLE;
You can call it with another interval to round to, e.g.
SELECT interval_round(age('2013-02-28 00:00:00'::timestamp,
'2012-10-31 00:00:00'::timestamp), '1 month')
will return 4 mons.
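
The rounding inside interval_round is plain integer arithmetic on seconds: add half of the rounding unit, integer-divide by the unit, then multiply back. A short Python sketch of that step follows, assuming (as I understand PostgreSQL's EXTRACT(epoch FROM interval)) that a month counts as 30 days; the name round_seconds is hypothetical.

```python
SECONDS_PER_DAY = 86400
# Assumption: EXTRACT(epoch FROM interval) treats one month as 30 days.
SECONDS_PER_MONTH = 30 * SECONDS_PER_DAY


def round_seconds(base_seconds, unit_seconds):
    """Round base_seconds to the nearest multiple of unit_seconds (half rounds up)."""
    return (base_seconds + unit_seconds // 2) // unit_seconds * unit_seconds


if __name__ == "__main__":
    # '3 mons 28 days' expressed in seconds, rounded to whole months
    base = 3 * SECONDS_PER_MONTH + 28 * SECONDS_PER_DAY
    print(round_seconds(base, SECONDS_PER_MONTH) // SECONDS_PER_MONTH)
```

Because the rounding works on 30-day months, it is an approximation: an interval of 14 days rounds down to zero months and one of 15 days rounds up to one, regardless of which calendar months are involved.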
From http://www.postgresql.org/docs/8.4/static/functions-datetime.html, can you swap the order of the timestamps?
"Note there can be ambiguity in the months returned by age because different months have a different number of days. PostgreSQL's approach uses the month from the earlier of the two dates when calculating partial months. For example, age('2004-06-01', '2004-04-30') uses April to yield 1 mon 1 day, while using May would yield 1 mon 2 days because May has 31 days, while April has only 30."