Query - find empty interval in series of timestamps - sql

I have a table that stores historical data. A row is inserted into this table every 30 seconds from different types of sources, and each row has a timestamp associated with it.
Let's say my threshold for a disservice is 1 hour.
Since I charge for my services based on time, I need to know whether, within a specific month, there is any gap between consecutive rows that equals or exceeds that 1 hour interval.
A simplified structure of the table would be like:
tid serial primary key,
tunitd id int,
tts timestamp default now(),
tdescr text
I don't want to write a function that loops through all the records comparing them one by one as I suppose it is time and memory consuming.
Is there any way to do this directly from SQL maybe using the interval type in PostgreSQL?
Thanks.

This small SQL query displays all gaps that last longer than one hour:
select tts, next_tts, next_tts - tts as diff
from (
    select a.tts, min(b.tts) as next_tts
    from test1 a
    inner join test1 b on a.tts < b.tts
    group by a.tts
) as c
where next_tts - tts > interval '1 hour'
order by tts;
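The self-join above compares every row with every later row, which can get slow on a large table. As a hedged alternative sketch (not part of the original answer), the same gaps can be found with the lead() window function, assuming PostgreSQL 8.4 or later and the same test1 table with a tts column. It uses >= because the question asks for gaps that equal or exceed one hour:

```sql
-- lead() returns the tts of the next row in tts order, so no self-join is needed.
-- The last row has no successor: its next_tts is NULL and the WHERE clause drops it.
select tts, next_tts, next_tts - tts as diff
from (
    select tts,
           lead(tts) over (order by tts) as next_tts
    from test1
) as c
where next_tts - tts >= interval '1 hour'
order by tts;
```

To restrict the check to one month, a range condition on tts could be added inside the subquery.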

Related

Splitting multiple overlapping date range interval in postgresql

I am using PostgreSQL version 10.9 and I am trying to split overlapping intervals from 2 different tables that record events.
For each interval of the main table, I need to detect the person movement (arrival or departure) that overlaps the main interval, and split it for each prod_line by the event. If the person movement is an arrival, I need to take the highest value among consecutive arrivals within a 5 minute interval. If the person movement is a departure, I need to take the lowest value among consecutive departures within a 5 minute interval.
I only found samples of merging overlapping intervals for date ranges within the same table.
I tried to write a function that loops through the main data set and for each interval to loop through the attendance intervals and return overlapping intervals.
I failed to make the attendance intervals take into account the highest arrival_time within a 5 minute interval and the lowest value within a 5 minute interval for departure time and also further compare it to the main current interval to properly split it as expected.
My 2 tables have the following structure
main (prod_line text, item_code text, start_time timestamp without time zone, end_time timestamp without time zone);
attendance(person_id text, prod_line text, arrival_time timestamp without time zone, departure_time timestamp without time zone);
having the following sample data for Main:
"RS-5";"110067805";"2019-06-11 06:30:41";"2019-06-11 15:00:05"
and for Attendance
11770;"RS-5";"2019-06-11 06:30:09";"2019-06-11 11:00:12"
675;"RS-5";"2019-06-11 06:30:14";"2019-06-11 10:00:01"
11504;"RS-5";"2019-06-11 06:30:17";"2019-06-11 10:00:07"
101;"RS-5";"2019-06-11 06:30:23";"2019-06-11 11:00:10"
627;"RS-5";"2019-06-11 06:30:25";"2019-06-11 11:00:20"
11765;"RS-5";"2019-06-11 06:34:29";"2019-06-11 11:00:01"
675;"RS-5";"2019-06-11 11:30:09";"2019-06-11 15:00:25"
627;"RS-5";"2019-06-11 11:30:16";"2019-06-11 15:00:24"
11504;"RS-5";"2019-06-11 11:30:19";"2019-06-11 15:00:18"
11770;"RS-5";"2019-06-11 11:30:22";"2019-06-11 15:00:15"
11765;"RS-5";"2019-06-11 11:30:25";"2019-06-11 15:00:12"
101;"RS-5";"2019-06-11 11:30:27";"2019-06-11 15:00:30"
353;"RS-5";"2019-06-11 15:01:39";"2019-06-11 15:10:35"
11712;"RS-5";"2019-06-11 15:01:42";"2019-06-11 15:10:34"
817;"RS-5";"2019-06-11 15:01:44";"2019-06-11 15:10:32"
1337;"RS-5";"2019-06-11 15:01:46";"2019-06-11 15:10:30"
1363;"RS-5";"2019-06-11 15:01:48";"2019-06-11 15:10:28"
1510;"RS-5";"2019-06-11 15:01:50";"2019-06-11 15:10:24"
Basically, I would like to get a result of the intervals looking like this:
result (prod_line text, item_code text, start_time timestamp without time zone, end_time timestamp without time zone);
with the values
"RS-5";"110067805";"2019-06-11 06:34:29";"2019-06-11 10:00:01"
"RS-5";"110067805";"2019-06-11 10:00:07";"2019-06-11 11:00:01"
"RS-5";"110067805";"2019-06-11 11:00:01";"2019-06-11 11:30:27"
"RS-5";"110067805";"2019-06-11 11:30:27";"2019-06-11 15:00:05"
In case anyone else needs a solution for this, I figured something out (maybe not the best but it works)
The solution is available here

count difference in seconds between values from table and some moment in time using PostgreSQL

How can I find the difference in seconds between a chosen moment and values from a table using PostgreSQL?
For example, the chosen moment is '2013-05-21' and the table looks like
TABLE
Name Date_of_Birth
Charles 2007-12-12
Matti 2003-09-20
Kath 2009-11-09
I tried to use this
SELECT EXTRACT(SECONDS FROM TIMESTAMP '2013-05-21')-EXTRACT(SECONDS FROM TIMESTAMP (SELECT Date_of_Birth FROM TABLE);
How can I find how many seconds have passed from the date of birth to the chosen moment for every person in the table? The results should be returned as a table.
Forgive me if my question is not understandable.
Just extract the epoch from the difference:
select extract(epoch from timestamp '2013-05-21' - date_of_birth)
from the_table;
From the manual
epoch
[...] for interval values, the total number of seconds in the interval
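To illustrate the difference between the attempt in the question and the epoch approach, here is a small self-contained sketch. The sample rows mirror the question's table and are inlined with VALUES, which is an assumption made only so the query runs without creating a table:

```sql
-- Hypothetical inline copy of the question's data.
with the_table(name, date_of_birth) as (
    values ('Charles', date '2007-12-12'),
           ('Matti',   date '2003-09-20'),
           ('Kath',    date '2009-11-09')
)
select name,
       -- the raw interval between the chosen moment and the birth date
       timestamp '2013-05-21' - date_of_birth as elapsed,
       -- only the seconds field of that interval, not the total
       extract(second from timestamp '2013-05-21' - date_of_birth) as seconds_field,
       -- the whole interval converted to seconds
       extract(epoch from timestamp '2013-05-21' - date_of_birth) as total_seconds
from the_table;
```

Extracting second picks out a single field of the interval, whereas epoch converts the entire interval, which is what the question asks for.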

Resample on time series data

I have a table with a time series column at millisecond resolution. I want to resample the time series and apply the mean to each group. How can I implement this in Postgres?
"Resample" means aggregate all time stamps within one second or one minute. All rows within one second or one minute form a group.
table structure
date x y z
Use date_trunc() to truncate timestamps to a given unit of time, and GROUP BY that expression:
SELECT date_trunc('minute', date) AS date_truncated_to_minute
, avg(x) AS avg_x
, avg(y) AS avg_y
, avg(z) AS avg_z
FROM tbl
GROUP BY 1;
Assuming your misleadingly named date column is actually of type timestamp or timestamptz.
Related answer with more details and links:
PostgreSQL: running count of rows for a query 'by minute'
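date_trunc() only handles standard units such as second, minute or hour. For arbitrary bucket widths, a possible sketch uses date_bin(), assuming PostgreSQL 14 or later and the same tbl with a timestamp column named date. The 5 minute stride and the origin timestamp are illustrative choices, not taken from the question:

```sql
-- date_bin(stride, source, origin) returns the start of the bucket
-- that contains the source timestamp, with buckets aligned to the origin.
select date_bin('5 minutes', date, timestamp '2000-01-01') as bucket_start
     , avg(x) as avg_x
     , avg(y) as avg_y
     , avg(z) as avg_z
from tbl
group by 1
order by 1;
```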

Oracle timestamp difference greater than X hours/days/months

I am trying to write a query to run on Oracle database. The table ActionTable contains actionStartTime and actionEndTime columns. I need to find out which action took longer than 1 hour to complete.
actionStartTime and actionEndTime are of timestamp type
I have a query which gives me the time taken for each action:
select (actionEndTime - actionStartTime) actionDuration from ActionTable
What would be my where clause that would return only actions that took longer than 1 hour to finish?
Subtracting two timestamps returns an interval. So you'd want something like
SELECT (actionEndTime - actionStartTime) actionDuration
FROM ActionTable
WHERE actionEndTime - actionStartTime > interval '1' hour
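If the threshold has to be supplied as a number rather than written as a literal, one option is NUMTODSINTERVAL, sketched below against the same ActionTable. Months are not a fixed length, so they cannot be expressed as a day-to-second interval. The second query is one possible workaround using ADD_MONTHS, which works on DATE values, so the fractional seconds of the timestamp are lost in the comparison:

```sql
-- Actions that took longer than 1 hour; 'HOUR' can be 'DAY', 'MINUTE' or 'SECOND'.
SELECT (actionEndTime - actionStartTime) actionDuration
FROM ActionTable
WHERE actionEndTime - actionStartTime > NUMTODSINTERVAL(1, 'HOUR');

-- Actions that took longer than 1 calendar month.
-- ADD_MONTHS converts the timestamp to a DATE, dropping fractional seconds.
SELECT (actionEndTime - actionStartTime) actionDuration
FROM ActionTable
WHERE actionEndTime > ADD_MONTHS(actionStartTime, 1);
```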

Postgres SQL select a range of records spaced out by a given interval

I am trying to determine whether it is possible, using only SQL in PostgreSQL, to select a range of time-ordered records at a given interval.
Let's say I have 60 records, one record for each minute in a given hour. I want to select records at 5 minute intervals for that hour. The result should be 12 records, each one 5 minutes apart.
This is currently accomplished by selecting the full range of records and then looping through the results and pulling out the records at the given interval. I am trying to see if I can do this purely in SQL, as our db is large and we may be dealing with tens of thousands of records.
Any thoughts?
Yes, you can. It's really easy once you get the hang of it. I think it's one of the jewels of SQL, and it's especially easy in PostgreSQL because of its excellent temporal support. Often, complex functions can turn into very simple queries in SQL that can scale and be indexed properly.
This uses generate_series to draw up sample time stamps that are spaced 1 minute apart. The outer query then extracts the minute and uses modulo to find the values that are 5 minutes apart.
select
    ts,
    extract(minute from ts)::integer as minute
from
( -- generate some time stamps - one minute apart
    select
        current_time + (n || ' minute')::interval as ts
    from generate_series(1, 30) as n
) as timestamps
-- extract the minute and check if it's on a 5 minute interval
where extract(minute from ts)::integer % 5 = 0
-- only pick this hour
and extract(hour from ts) = extract(hour from current_time)
;
ts | minute
--------------------+--------
19:40:53.508836-07 | 40
19:45:53.508836-07 | 45
19:50:53.508836-07 | 50
19:55:53.508836-07 | 55
Notice that adding a computed index on the expression in the where clause (where the value of the expression makes up the index) could lead to major speed improvements. It may not be very selective in this case, but it is good to be aware of.
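As a rough sketch of what such a computed (expression) index could look like: the readings table and the index name below are hypothetical, and ts is assumed to be a timestamp without time zone, since an index expression has to be immutable and extract() on a timestamptz depends on the session time zone:

```sql
-- Hypothetical table used only for this illustration.
create table readings (
    id serial primary key,
    ts timestamp not null
);

-- Expression index; note the extra pair of parentheses around the expression.
create index readings_minute_mod5_idx
    on readings ((extract(minute from ts)::integer % 5));

-- The planner can consider the index when the query repeats the same expression.
select ts
from readings
where extract(minute from ts)::integer % 5 = 0;
```

Whether the planner actually uses the index depends on how selective the condition is for the data at hand.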
I wrote a reservation system once in PostgreSQL (which had lots of temporal logic where date intervals could not overlap) and never had to resort to iterative methods.
http://www.amazon.com/SQL-Design-Patterns-Programming-Focus/dp/0977671542 is an excellent book that has lots of interval examples. It is hard to find in book stores now, but well worth it.
Extract the minutes, convert to int4, and see if the remainder from dividing by 5 is 0:
select *
from TABLE
where int4 (date_part ('minute', COLUMN)) % 5 = 0;
If the intervals are not time based and you just want every 5th row, or if the times are regular and you always have one record per minute, the query below gives you one record out of every 5:
select *
from
(
    select *, row_number() over (order by timecolumn) as rown
    from tbl
) X
where mod(rown, 5) = 1
If your time records are not regular, then you need to generate a time series (given in another answer) and left join that into your table, group by the time column (from the series) and pick the MAX time from your table that is less than the time column.
Pseudocode:
select thetimeinterval, max(timecolumn)
from ( < the time series subquery > ) X
left join tbl on tbl.timecolumn <= thetimeinterval
group by thetimeinterval
And further join it back to the table for the full record (assuming unique times)
select tbl.* from
tbl inner join
(
    select thetimeinterval, max(timecolumn) timecolumn
    from ( < the time series subquery > ) X
    left join tbl on tbl.timecolumn <= thetimeinterval
    group by thetimeinterval
) y on tbl.timecolumn = y.timecolumn
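The pseudocode above can be sketched as a runnable query by filling in the time series subquery with generate_series. The start and end timestamps and the 5 minute step are illustrative assumptions, and tbl(timecolumn, ...) is the placeholder table from the answer:

```sql
with series as (
    -- one row per sample point, 5 minutes apart (illustrative range)
    select generate_series(timestamp '2011-01-01 07:00:00',
                           timestamp '2011-01-01 07:55:00',
                           interval '5 minutes') as thetimeinterval
),
latest as (
    -- for each sample point, the most recent record at or before it
    select s.thetimeinterval, max(t.timecolumn) as timecolumn
    from series s
    left join tbl t on t.timecolumn <= s.thetimeinterval
    group by s.thetimeinterval
)
-- join back to the table for the full record (assumes unique times)
select l.thetimeinterval, t.*
from latest l
inner join tbl t on t.timecolumn = l.timecolumn
order by l.thetimeinterval;
```

Sample points with no earlier record get a NULL from max() and drop out in the final inner join.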
How about this:
select min(ts), extract(minute from ts)::integer / 5 as bucket
from tbl -- tbl is a placeholder for your table name
group by bucket order by bucket;
This has the advantage of doing the right thing if you have two readings for the same minute, or if your readings skip a minute. Instead of min, it would be even better to use one of the first() aggregate functions, code for which you can find here:
http://wiki.postgresql.org/wiki/First_%28aggregate%29
This assumes that your five minute intervals are "on the fives", so to speak. That is, that you want 07:00, 07:05, 07:10, not 07:02, 07:07, 07:12. It also assumes you don't have two rows within the same minute, which might not be a safe assumption.
select your_timestamp
from your_table
where cast(extract(minute from your_timestamp) as integer) % 5 = 0;
If you might have two rows with timestamps within the same minute, like
2011-01-01 07:00:02
2011-01-01 07:00:59
then this version is safer.
select min(your_timestamp)
from your_table
group by (cast(extract(minute from your_timestamp) as integer) / 5)
Wrap either of those in a view, and you can join it to your base table.
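Grouping by the minute bucket alone merges rows from different hours into the same group, which is fine for a single hour as in the question. If the table spans several hours, a possible variant (a sketch using the same placeholder names your_table and your_timestamp) also groups by the truncated hour:

```sql
-- One row per 5 minute bucket per hour: the earliest timestamp in each bucket.
select min(your_timestamp) as first_in_bucket
from your_table
group by date_trunc('hour', your_timestamp),
         cast(extract(minute from your_timestamp) as integer) / 5
order by first_in_bucket;
```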