Find the latest gap between Unix timestamps - SQL

I currently have two functions that should return the time a device started logging again, i.e. the time whose preceding row is more than 60 seconds away. The functions may work fine, but I have yet to see them finish, as they take forever. Are there any shortcuts to make this faster?
CREATE OR REPLACE FUNCTION findNextTime(startt integer)
  RETURNS integer AS
$nextTime$
DECLARE
   nextTime integer;
BEGIN
   select time into nextTime from m01 where time < startt ORDER BY time DESC LIMIT 1;
   return nextTime;
END;
$nextTime$ LANGUAGE plpgsql;
CREATE OR REPLACE FUNCTION findStart()
  RETURNS integer AS
$lastTime$
DECLARE
   currentTime integer;
   lastTime integer;
BEGIN
   select time into currentTime from m01 ORDER BY time DESC LIMIT 1;
   LOOP
      RAISE NOTICE 'Current Time: %', currentTime;
      select findNextTime(currentTime) into lastTime;
      EXIT WHEN ((currentTime - lastTime) > 60);
      currentTime := lastTime;
   END LOOP;
   return lastTime;
END;
$lastTime$ LANGUAGE plpgsql;
To clarify: I essentially want to find the most recent time there was a break of more than 60 seconds between any two rows.
CREATE TABLE IF NOT EXISTS m01 (
   time integer,
   value decimal,
   id smallint,
   driveId smallint
);
Sample Data:
In this case it would return 1520376063, because the next entry (1520375766) is more than 60 seconds apart from it.
| time | value | id | driveid |
|------------|--------------------|------|---------|
| 1520376178 | 516.2 | 5116 | 2 |
| 1520376173 | 507.8 | 5116 | 2 |
| 1520376168 | 499.5 | 5116 | 2 |
| 1520376163 | 491.1 | 5116 | 2 |
| 1520376158 | 482.90000000000003 | 5116 | 2 |
| 1520376153 | 474.5 | 5116 | 2 |
| 1520376148 | 466.20000000000005 | 5116 | 2 |
| 1520376143 | 457.8 | 5116 | 2 |
| 1520376138 | 449.5 | 5116 | 2 |
| 1520376133 | 441.20000000000005 | 5116 | 2 |
| 1520376128 | 432.90000000000003 | 5116 | 2 |
| 1520376123 | 424.6 | 5116 | 2 |
| 1520376118 | 416.20000000000005 | 5116 | 2 |
| 1520376113 | 407.8 | 5116 | 2 |
| 1520376108 | 399.5 | 5116 | 2 |
| 1520376103 | 391.20000000000005 | 5116 | 2 |
| 1520376098 | 382.90000000000003 | 5116 | 2 |
| 1520376093 | 374.5 | 5116 | 2 |
| 1520376088 | 366.20000000000005 | 5116 | 2 |
| 1520376083 | 357.8 | 5116 | 2 |
| 1520376078 | 349.5 | 5116 | 2 |
| 1520376073 | 341.20000000000005 | 5116 | 2 |
| 1520376068 | 332.90000000000003 | 5116 | 2 |
| 1520376063 | 324.5 | 5116 | 2 |
| 1520375766 | 102.5 | 5116 | 2 |

This simple query should replace your two functions. Note the window function lead() in the subquery:
SELECT *
FROM (
   SELECT time, lead(time) OVER (ORDER BY time DESC) AS last_time
   FROM   m01
   WHERE  time < _startt
   ) sub
WHERE  time > last_time + 60
ORDER  BY time DESC
LIMIT  1;
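If gaps should be detected per device rather than across the whole table, partition the window accordingly - a sketch, assuming driveId identifies the device (the question does not say which column does):
SELECT *
FROM (
   SELECT time, lead(time) OVER (PARTITION BY driveId ORDER BY time DESC) AS last_time
   FROM   m01
   WHERE  time < _startt
   ) sub
WHERE  time > last_time + 60
ORDER  BY time DESC
LIMIT  1;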
Either way, the crucial part for performance is the right index. Ideally on (time DESC).
This assumes time is defined NOT NULL - which it probably should be, but the table definition in the question does not say so. Otherwise you probably want ORDER BY time DESC NULLS LAST - and a matching index. See:
PostgreSQL sort by datetime asc, null first?
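A minimal sketch of such an index (the index name is arbitrary):
CREATE INDEX m01_time_idx ON m01 (time DESC NULLS LAST);
With time defined NOT NULL, a plain index on (time) serves ORDER BY time DESC just as well, since Postgres can scan B-tree indexes backwards.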
I expect this plpgsql function to perform faster, though, if gaps typically show up early:
CREATE OR REPLACE FUNCTION find_gap_before_time(_startt int)
  RETURNS int AS
$func$
DECLARE
   _current_time int;
   _last_time    int;
BEGIN
   FOR _last_time IN  -- single loop is enough!
      SELECT time
      FROM   m01
      WHERE  time < _startt
      ORDER  BY time DESC  -- NULLS LAST?
   LOOP
      IF _current_time > _last_time + 60 THEN  -- never true for 1st row
         RETURN _current_time;
      END IF;
      _current_time := _last_time;
   END LOOP;

   RETURN NULL;  -- no gap found; without this, falling off the end raises an exception
END
$func$ LANGUAGE plpgsql;
Call:
SELECT find_gap_before_time(1520376200);
Result as requested: 1520376063 for the sample data.
Aside: You'd typically save a couple of bytes per row in storage by placing the column value last or first, thereby minimizing alignment padding. Like:
CREATE TABLE m01 (
   time integer,
   id smallint,
   driveId smallint,
   value decimal
);
Detailed explanation:
Calculating and saving space in PostgreSQL


Oracle multiple criteria dynamic SQL

I have a multiple-criteria search function where the user can input/select different criteria to find results. Every criterion is optional, so any field value could be null; the PL/SQL backend processes each criterion value to construct dynamic SQL.
Currently I process it as shown below, but this is hard to debug and maintain.
jo          := json_object_t(p_payload);
v_country   := jo.get_String('IAINST_NATN_CODE');
v_region    := jo.get_String('IAINST_REGN_CODE');
v_rank_code := jo.get_String('RANK_CODE');
v_year      := jo.get_String('RANK_YEAR');
v_sql := 'select * from IAVW_INST_JSON_TABLE i where
              ((:1 is null) or (i.IAINST_NATN_CODE = :1))
          and ((:2 is null) or (i.IAINST_REGN_CODE = :2))
          and ((:3 is null) or (i.RANK_CODE = :3))
          and ((:4 is null) or (i.RANK_YEAR = :4))';
OPEN c FOR v_sql
   USING v_country, v_country,     --1
         v_region, v_region,       --2
         v_rank_code, v_rank_code, --3
         v_year, v_year;           --4
RETURN c;
Any advice on how to improve this?
I would only change the structure of the clauses to be like:
AND i.IAINST_REGN_CODE = NVL(:2, i.IAINST_REGN_CODE)
This way you avoid the OR and still won't interfere with indexing, if there is any. Apart from that, your code looks fine (even without my suggestion).
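Applied to the query in the question, that might look like the following sketch; note that each parameter is now bound only once:
v_sql := 'select * from IAVW_INST_JSON_TABLE i
          where i.IAINST_NATN_CODE = NVL(:1, i.IAINST_NATN_CODE)
            and i.IAINST_REGN_CODE = NVL(:2, i.IAINST_REGN_CODE)
            and i.RANK_CODE        = NVL(:3, i.RANK_CODE)
            and i.RANK_YEAR        = NVL(:4, i.RANK_YEAR)';
OPEN c FOR v_sql USING v_country, v_region, v_rank_code, v_year;
One caveat: when :1 is null, NVL(:1, col) = col reduces to col = col, which never matches rows where col itself is null; the summary below addresses exactly that.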
After reading the related posts, here is a summary:
For my scenario, my table holds around 5K rows, so
WHERE NVL(mycolumn,'NULL') = NVL(searchvalue,'NULL')
can simplify my dynamic SQL.
But if the table holds massive amounts of data, the above approach is not efficient (the NVL conversion has to be evaluated against the column for every row); in that case use the below query instead:
where ((MYCOLUMN = SEARCHVALUE) OR (MYCOLUMN is NULL and SEARCHVALUE is NULL))
For details see this post: Determine Oracle null == null
https://asktom.oracle.com/pls/apex/f?p=100:11:0::::P11_QUESTION_ID:7806711400346248708
For parameters referencing non-nullable columns you can use
and t.somecol = nvl(:b1,t.somecol)
For this the parser/optimiser will typically generate an execution plan with a union-all and a filter such that the most efficient approach will be used depending on whether :b1 is null or not (depending on database version, indexing, stats etc).
select * from bigtable t where t.product_type = nvl(:b1,t.product_type)
---------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost | Time |
---------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 6000000 | 486000000 | 5679 | 00:00:01 |
| 1 | PX COORDINATOR | | | | | |
| 2 | PX SEND QC (RANDOM) | :TQ10000 | 6000000 | 486000000 | 5679 | 00:00:01 |
| 3 | VIEW | VW_ORE_1B35BA0F | 6000000 | 486000000 | 5679 | 00:00:01 |
| 4 | UNION-ALL | | | | | |
| * 5 | FILTER | | | | | |
| 6 | PX BLOCK ITERATOR | | 1200000 | 145200000 | 2840 | 00:00:01 |
| * 7 | TABLE ACCESS FULL | BIGTABLE | 1200000 | 145200000 | 2840 | 00:00:01 |
| * 8 | FILTER | | | | | |
| 9 | PX BLOCK ITERATOR | | 4800000 | 580800000 | 2840 | 00:00:01 |
| 10 | TABLE ACCESS FULL | BIGTABLE | 4800000 | 580800000 | 2840 | 00:00:01 |
---------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
------------------------------------------
* 5 - filter(:B1 IS NOT NULL)
* 7 - filter("T"."PRODUCT_TYPE"=:B1)
* 8 - filter(:B1 IS NULL)
However, it obviously can't keep extending this by generating union-alls for every possible combination of an arbitrarily large number of bind variables.
select * from bigtable t
where t.product_type = nvl(:b1,t.product_type)
and t.in_stock = nvl(:b2,t.in_stock)
and t.discounted = nvl(:b3,t.discounted)
---------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost | Time |
---------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 594 | 48114 | 5699 | 00:00:01 |
| 1 | PX COORDINATOR | | | | | |
| 2 | PX SEND QC (RANDOM) | :TQ10000 | 594 | 48114 | 5699 | 00:00:01 |
| 3 | VIEW | VW_ORE_1B35BA0F | 594 | 48114 | 5699 | 00:00:01 |
| 4 | UNION-ALL | | | | | |
| * 5 | FILTER | | | | | |
| 6 | PX BLOCK ITERATOR | | 119 | 14399 | 2844 | 00:00:01 |
| * 7 | TABLE ACCESS FULL | BIGTABLE | 119 | 14399 | 2844 | 00:00:01 |
| * 8 | FILTER | | | | | |
| 9 | PX BLOCK ITERATOR | | 475 | 57475 | 2854 | 00:00:01 |
| * 10 | TABLE ACCESS FULL | BIGTABLE | 475 | 57475 | 2854 | 00:00:01 |
---------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
------------------------------------------
* 5 - filter(:B1 IS NOT NULL)
* 7 - filter("T"."PRODUCT_TYPE"=:B1 AND "T"."IN_STOCK"=NVL(:B2,"T"."IN_STOCK") AND "T"."DISCOUNTED"=NVL(:B3,"T"."DISCOUNTED") AND (NVL(:B2,"T"."IN_STOCK")='Y' OR NVL(:B2,"T"."IN_STOCK")='N') AND
(NVL(:B3,"T"."DISCOUNTED")='Y' OR NVL(:B3,"T"."DISCOUNTED")='N'))
* 8 - filter(:B1 IS NULL)
* 10 - filter("T"."IN_STOCK"=NVL(:B2,"T"."IN_STOCK") AND "T"."DISCOUNTED"=NVL(:B3,"T"."DISCOUNTED") AND (NVL(:B2,"T"."IN_STOCK")='Y' OR NVL(:B2,"T"."IN_STOCK")='N') AND (NVL(:B3,"T"."DISCOUNTED")='Y'
OR NVL(:B3,"T"."DISCOUNTED")='N'))
The classic Tom Kyte/Bryn Llewellyn approach is to generate different SQL depending on whether each parameter is null or not null, while still binding every parameter exactly once. This produces multiple different cursors - at most 2^n for n parameters - and it is neat and efficient. The idea is that for each parameter value, you generate either
where t.column = :b1
if :b1 has a value, or else
where (1=1 or :b1 is null)
if it's null. You could logically skip the 1=1 part, but it takes advantage of some short-circuiting logic in the Oracle SQL parser that means it won't evaluate the or condition at all because it knows there is no need. For example,
select dummy from dual where 1=1 or sqrt(-1) > 1/0;
which returns 'X' without evaluating the impossible sqrt(-1) or 1/0 expressions.
Using this approach, your SQL would be generated as something like this:
v_sql := '
select * from iavw_inst_json_table i
where (1=1 or i.iainst_natn_code = :1)
and i.iainst_regn_code = :2
and i.rank_code = :3
and (1=1 or i.rank_year = :4)
';
You could use a procedure to generate the parameter handling SQL:
declare
   l_report_sql   clob := 'select * from bigtable t where 1=1';
   l_product_type bigtable.product_type%type;
   l_in_stock     bigtable.in_stock%type := 'Y';
   l_discounted   bigtable.discounted%type := 'N';

   procedure apply_bind
      ( p_bind#         in number
      , p_column_name   in varchar2
      , p_value_is_null in boolean
      , p_sql           in out clob )
   is
   begin
      p_sql := p_sql || chr(10) || 'and ' ||
         case
            when p_value_is_null then '(1=1 or :'||p_bind#||' is null)'
            else p_column_name||' = :'||p_bind#
         end;
   end;
begin
   apply_bind(1, 't.product_type', l_product_type is null, l_report_sql);
   apply_bind(2, 't.in_stock',     l_in_stock is null,     l_report_sql);
   apply_bind(3, 't.discounted',   l_discounted is null,   l_report_sql);

   dbms_output.put_line(l_report_sql);

   open :results for l_report_sql using l_product_type, l_in_stock, l_discounted;
end;
My example gives:
select * from bigtable t where 1=1
and (1=1 or :1 is null)
and t.in_stock = :2
and t.discounted = :3

Join two views and detect missing entries where the matching condition is in the next row of the other view/table (using SQLite)

I am running a science test and logging my data inside two SQLite tables.
I have selected the data needed into two separate and independent views (RX and TX views).
Now I need to analyze the measurements and create a third table/view with the results, with the following points in mind:
1- For each test at the TX side (Table 1) there might be a corresponding entry at the RX side (Table 2).
2- If the timestamp at the RX side is less than the timestamp in the next row of the TX view, we consider them to be associated with one record in the third view/table and calculate the time difference; OTHERWISE it is a miss.
Question: How should I write the SQL query in SQLite to produce the analysis and test result given in Table 3?
Thanks a lot in advance.
TX View - Table (1)
id | time | measurement
------------------------
1 | 09:40:10.221 | 100
2 | 09:40:15.340 | 60
3 | 09:40:21.100 | 80
4 | 09:40:25.123 | 90
5 | 09:40:29.221 | 45
RX View -Table (2)
time | measurement
------------------------
09:40:15.7 | 65
09:40:21.560 | 80
09:40:30.414 | 50
Test Result View - Table (3)
id | TxTime       | RxTime       | delta_time(s) | delta_value
--------------------------------------------------------------
1  | 09:40:10.221 | NULL         | NULL          | NULL (i.e. missed)
2  | 09:40:15.340 | 09:40:15.7   | 0.360         | 5
3  | 09:40:21.100 | 09:40:21.560 | 0.460         | 0
4  | 09:40:25.123 | NULL         | NULL          | NULL (i.e. missed)
5  | 09:40:29.221 | 09:40:30.414 | 1.193         | 5
Use window function LEAD() to get the next time of each row in TX and join the views on your conditions:
SELECT t.id, t.time TxTime, r.time RxTime,
       ROUND((julianday(r.time) - julianday(t.time)) * 24 * 60 * 60, 3) [delta_time(s)],
       r.measurement - t.measurement delta_value
FROM (
   SELECT *, LEAD(time) OVER (ORDER BY time) next
   FROM TX
) t
LEFT JOIN RX r ON r.time >= t.time AND (r.time < t.next OR t.next IS NULL)
See the demo.
Results:
> id | TxTime | RxTime | delta_time(s) | delta_value
> -: | :----------- | :----------- | :------------ | :----------
> 1 | 09:40:10.221 | null | null | null
> 2 | 09:40:15.340 | 09:40:15.7 | 0.36 | 5
> 3 | 09:40:21.100 | 09:40:21.560 | 0.46 | 0
> 4 | 09:40:25.123 | null | null | null
> 5 | 09:40:29.221 | 09:40:30.414 | 1.193 | 5
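Note that LEAD() requires SQLite 3.25 or newer. On an older version, a correlated subquery can stand in for the window function - a rough sketch of the same idea, not tested against the demo:
SELECT t.id, t.time TxTime, r.time RxTime,
       ROUND((julianday(r.time) - julianday(t.time)) * 24 * 60 * 60, 3) [delta_time(s)],
       r.measurement - t.measurement delta_value
FROM TX t
LEFT JOIN RX r
       ON r.time >= t.time
      AND (r.time < (SELECT MIN(t2.time) FROM TX t2 WHERE t2.time > t.time)
           OR (SELECT MIN(t2.time) FROM TX t2 WHERE t2.time > t.time) IS NULL)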

PostgreSQL: create table with unique timestamp for all rows

I have a record of users' trips with begin/end positions and time in a table like this:
CREATE TABLE trips(id integer, start_timestamp timestamp with time zone,
session_id integer, start_lat double precision,
start_lon double precision, end_lat double precision,
end_lon double precision, mode integer);
INSERT INTO trips (id, start_timestamp, session_id, start_lat,start_lon,end_lat,end_lon,mode)
VALUES (563097015,'2017-05-20 17:47:12+01', 128618, 41.1783308,-8.5949878, 41.1784478, -8.5948463, 0),
(563097013, '2017-05-20 17:45:29+01', 128618, 41.1781344, -8.5951169, 41.1782919, -8.5950689, 0),
(563097011, '2017-05-20 17:43:41+01', 128618, 41.1781196, -8.5954075, 41.1782139, -8.5950689, 0),
(563097009, '2017-05-20 17:41:48+01', 128618, 41.1782497, -8.595197, 41.1781101, -8.5954124, 0),
(563097003, '2017-05-20 17:10:29+01', 128618, 41.1832512, -8.6081606, 41.1782561, -8.5950259, 0)
And the second table holds the records of the raw GPS traces for all the trips, similar to:
CREATE TABLE gps_traces (session_id integer, seconds integer, lat double precision,
lon double precision, speed double precision);
INSERT INTO gps_traces (session_id, seconds , lat , lon , speed )
VALUES (128618,1495296443,41.1844471,-8.6065158,1.35148),
(128618,1495296444,41.1844482,-8.6065303,1.28004),
(128618,1495296445,41.1844572,-8.6065503,1.46086),
(128618,1495296446,41.1844541,-8.6065691,1.23),
(128618,1495296446,41.1844589,-8.6065861, 1.22919),
(128618,1495296447,41.1844587, -8.6066043, 1.30188),
(128618, 1495296448, 41.1844604, -8.6066261, 1.43126),
(128618, 1495296449, 41.184471, -8.6066412, 1.55003),
(128618,1495296450, 41.1844715, -8.6066572, 1.29062),
(128618,1495296450, 41.1844707, -8.6066736, 1.3618)
From these I want to create a new table mytable containing the GPS data, joining the tables on session_id, like so:
CREATE TABLE mytable AS
SELECT id, seconds, lat, lon, speed, mode
FROM trips t
JOIN gps_traces g ON t.session_id = g.session_id;
However, in the new table I want to ensure that for rows recorded twice at the same Unix timestamp within a trip, only one is selected into my new table. For example, in this case:
SELECT * FROM mytable WHERE id = 563097003;
+-----------+------------+------------+------------+---------+------+
| id | seconds | lat | lon | speed | mode |
+-----------+------------+------------+------------+---------+------+
| 563097003 | 1495296443 | 41.1844471 | -8.6065158 | 1.35148 | 0 |
| 563097003 | 1495296444 | 41.1844482 | -8.6065303 | 1.28004 | 0 |
| 563097003 | 1495296445 | 41.1844572 | -8.6065503 | 1.46086 | 0 |
| 563097003 | 1495296446 | 41.1844541 | -8.6065691 | 1.23 | 0 |
| 563097003 | 1495296446 | 41.1844589 | -8.6065861 | 1.22919 | 0 |
| 563097003 | 1495296447 | 41.1844587 | -8.6066043 | 1.30188 | 0 |
| 563097003 | 1495296448 | 41.1844604 | -8.6066261 | 1.43126 | 0 |
| 563097003 | 1495296449 | 41.184471 | -8.6066412 | 1.55003 | 0 |
| 563097003 | 1495296450 | 41.1844715 | -8.6066572 | 1.29062 | 0 |
| 563097003 | 1495296450 | 41.1844707 | -8.6066736 | 1.3618 | 0 |
| 10 rows | | | | | |
+-----------+------------+------------+------------+---------+------+
Column seconds is the Unix timestamp. As shown, there is more than one row for the timestamps 1495296446 and 1495296450. I would like to ensure that for each trip, records are selected into the new table with unique timestamps (so in each case above, only one record should be selected into the new table). I illustrate that in this db<>fiddle.
EDIT
Expected output:
+-----------+------------+------------+------------+---------+------+
| id | seconds | lat | lon | speed | mode |
+-----------+------------+------------+------------+---------+------+
| 563097003 | 1495296443 | 41.1844471 | -8.6065158 | 1.35148 | 0 |
| 563097003 | 1495296444 | 41.1844482 | -8.6065303 | 1.28004 | 0 |
| 563097003 | 1495296445 | 41.1844572 | -8.6065503 | 1.46086 | 0 |
| 563097003 | 1495296446 | 41.1844541 | -8.6065691 | 1.23 | 0 |
| 563097003 | 1495296447 | 41.1844587 | -8.6066043 | 1.30188 | 0 |
| 563097003 | 1495296448 | 41.1844604 | -8.6066261 | 1.43126 | 0 |
| 563097003 | 1495296449 | 41.184471 | -8.6066412 | 1.55003 | 0 |
| 563097003 | 1495296450 | 41.1844715 | -8.6066572 | 1.29062 | 0 |
| 8 rows | | | | | |
+-----------+------------+------------+------------+---------+------+
Use DISTINCT ON:
CREATE TABLE mytable AS
SELECT DISTINCT ON (t.session_id, seconds) id, seconds, lat, lon, speed, mode
FROM trips t
JOIN gps_traces g ON t.session_id = g.session_id
ORDER BY t.session_id, seconds;
Note: I would expect you to include session_id in the new table as well.
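Also note that without a tiebreaker, which of the duplicate rows survives is arbitrary: DISTINCT ON keeps the first row of each group in ORDER BY order. If it matters, extend the ORDER BY - for example (keeping the fastest reading per timestamp is just an illustrative choice):
CREATE TABLE mytable AS
SELECT DISTINCT ON (t.session_id, seconds) id, seconds, lat, lon, speed, mode
FROM trips t
JOIN gps_traces g ON t.session_id = g.session_id
ORDER BY t.session_id, seconds, speed DESC;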
Thanks to @Abelisto, it turns out that the following modification of this answer works as intended.
CREATE TABLE mytable AS
SELECT DISTINCT ON (id, seconds) id, seconds, lat, lon, speed, mode
FROM trips t
JOIN gps_traces g ON t.session_id = g.session_id
ORDER BY id, seconds;
Here is a db<>fiddle.

Pick a record based on a given value in Postgres

I have a table in Postgres like the one below:
alg_campaignid | alg_score | cp | sum
----------------+-----------+---------+----------
9829 | 30.44056 | 12.4000 | 12.4000
9880 | 29.59280 | 12.0600 | 24.4600
9882 | 29.59280 | 12.0600 | 36.5200
9827 | 29.27504 | 11.9300 | 48.4500
9821 | 29.14840 | 11.8800 | 60.3300
9881 | 29.14840 | 11.8800 | 72.2100
9883 | 29.14840 | 11.8800 | 84.0900
10026 | 28.79280 | 11.7300 | 95.8200
10680 | 10.31504 | 4.1800 | 100.0000
From this I have to select a record based on a randomly generated number from 0 to 100, i.e. the first record should be returned if the random number picked is between 0 and 12.4000, the second if it is between 12.4000 and 24.4600, and likewise the last if it is between 95.8200 and 100.0000.
For Example
if the random number picked is 8 then the first record should be returned
or
if the random number picked is 48 then the fourth record should be returned
Is it possible to do this in Postgres? If so, kindly recommend a solution.
Yes, you can do this in Postgres. If you want to generate the number in the database:
with r as (
      select random() * 100 as r
     )
select t.*
from mytable t cross join r  -- "mytable" stands in for your table's name ("table" itself is a reserved word)
where t.sum >= r.r           -- first row whose running total covers the number
order by t.sum
limit 1;
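Reusing the placeholder table name from above, a quick sanity check with a constant instead of random(); with 48 this should return the fourth record (alg_campaignid 9827):
select t.*
from mytable t
where t.sum >= 48
order by t.sum
limit 1;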

How to check a date in PostgreSQL

My table name is tbl1. The fields are id, name, txdate.
| ID | NAME   | TXDATE    |
| 1  | RAJ    | 1-1-2013  |
| 2  | RAVI   |           |
| 3  | PRABHU | 25-3-2013 |
| 4  | SAT    |           |
Now I want a select query that checks txdate < 2-2-2013 for the rows where txdate is not empty, and that also returns the rows where txdate is empty.
The result should look like this:
| ID | NAME | TXDATE   |
| 1  | RAJ  | 1-1-2013 |
| 2  | RAVI |          |
| 4  | SAT  |          |
Is there a feasible solution? Is it possible without using a union?
Assuming that TXDATE is of data type DATE, you can use WHERE "TXDATE" < '2013-2-2' OR "TXDATE" IS NULL. Something like:
SELECT *
FROM table1
WHERE "TXDATE" < '2013-2-2'
OR "TXDATE" IS NULL;
See it in action:
SQL Fiddle Demo
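Equivalently, if you prefer a single predicate over the OR, COALESCE can map NULL to a sentinel date that always passes the filter - a sketch (the sentinel is arbitrary, as long as it is earlier than any cutoff you will use):
SELECT *
FROM table1
WHERE COALESCE("TXDATE", DATE '0001-01-01') < DATE '2013-02-02';
Note that this form cannot use a plain index on "TXDATE", since the filter is on an expression.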
I don't know what database you are using or what data type TXDATE is.
I just tried on my PostgreSQL 9.2, with a field of type "timestamp without time zone".
I have three rows in the table, like:
ac_device_name | ac_last_heartbeat_time
----------------+-------------------------
Nest-Test1 |
Nest-Test3 |
Nest-Test2 | 2013-04-10 15:06:18.287
Then I use the statement below:
select ac_device_name, ac_last_heartbeat_time
from at_device
where ac_last_heartbeat_time < '2013-04-11';
It correctly returns only one record:
ac_device_name | ac_last_heartbeat_time
----------------+-------------------------
Nest-Test2 | 2013-04-10 15:06:18.287
I think you can try a statement like:
select * from tbl1 where TXDATE < '2-2-2013' and TXDATE is not NULL
This statement also works in my environment.