SQLite query by timestamp and value

I have a SQLite table with the following columns:
id: int PK autoincrement
timestamp: int NOT NULL. Timestamp of the DB insertion.
value: int NOT NULL. Possible values [0-4].
I want to query the database to find out whether all the records inserted within the 60 seconds before a given timestamp have the same value. For instance:
id | timestamp | value
1 | 1594575090 | 1
2 | 1594575097 | 1
3 | 1594575100 | 1
4 | 1594575141 | 2
5 | 1594575145 | 2
6 | 1594575055 | 3
7 | 1594575060 | 4
In this case, if I ran the query for the records contained in the 60 seconds before record 3 (including record 3 itself), it should check whether the values of records [1, 2, 3] are the same, which should return 1.
On the other hand, if the query were done for record 7, it would compare the values of records [4, 5, 6, 7] and it should return 0, as the value is not the same for all of them.
Any ideas on how I can perform this query?

I think that you want this:
select count(distinct value) = 1 result
from tablename
where id <= ?
and (select timestamp from tablename where id = ?) - timestamp <= 60;
Replace the ? placeholders with the id that you want the results for.
Maybe you want the absolute value of the difference of the timestamps to be at most 60; if that is the case, then change the condition to:
and abs(timestamp - (select timestamp from tablename where id = ?)) <= 60;
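For example, running the check for record 3 against the sample data (keeping the answer's placeholder table name tablename) would look like this:
select count(distinct value) = 1 result
from tablename
where id <= 3
and (select timestamp from tablename where id = 3) - timestamp <= 60;
This returns 1, since records 1, 2 and 3 all have value 1.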

Hmmm . . . I think the logic you are describing is:
select ( min(value) = max(value) ) as all_same
from t cross join
     (select t.*
      from t
      where t.id = ?
     ) tt
where t.timestamp <= tt.timestamp and
      t.timestamp >= tt.timestamp - 60;

Related

SQL: Calculate number of days since last success

The following table represents the results of a given test.
Every result for the same test is either a pass (error_id = 0) or a fail (error_id <> 0).
I need help writing a query that returns the number of runs since the last good run (error_id = 0) and the date.
| Date | test_id | error_id |
-----------------------------------
| 2019-12-20 | 123 | 23
| 2019-12-19 | 123 | 23
| 2019-12-17 | 123 | 22
| 2019-12-18 | 123 | 0
| 2019-12-16 | 123 | 11
| 2019-12-15 | 123 | 11
| 2019-12-13 | 123 | 11
| 2019-12-12 | 123 | 0
So the result for this example should be:
| 2019-12-18 | 123 | 4
as the test 123 was PASS on 2019-12-18 and this happened 4 runs ago.
I have a query to determine whether a given run is an error or not, but I have trouble applying an appropriate window function to it to get the wanted result:
select test_id, Date, error_id, (CASE WHEN error_id <> 0 THEN 1 ELSE 0 END) as is_error
from testresults
You can generate a row number, in reverse order from the sorting of the query itself:
SELECT test_date, test_id, error_code,
(row_number() OVER (ORDER BY test_date asc) - 1) as runs_since_last_pass
FROM tests
WHERE test_date >= (SELECT MAX(test_date) FROM tests WHERE error_code=0)
ORDER BY test_date DESC
LIMIT 1;
Note that this will run into issues if test_date is not unique. Better to use a timestamp (precise to the millisecond) instead of a date.
Here's a DBFiddle: https://www.db-fiddle.com/f/8gSHVcXMztuRiFcL8zLeEx/0
If there's more than one test_id, you'll want to add a PARTITION BY clause to the row number function, and the subquery would become a bit more complex. It may be more efficient to come up with a way to do this by a JOIN instead of a subquery, but it would be more cognitively complex.
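A sketch of that variant, assuming the same tests table as above (test_date, test_id, error_code): the row number is partitioned per test_id and the subquery becomes correlated:
SELECT test_date, test_id, error_code,
       (row_number() OVER (PARTITION BY test_id
                           ORDER BY test_date ASC) - 1) AS runs_since_last_pass
FROM tests t
WHERE test_date >= (SELECT MAX(t2.test_date)
                    FROM tests t2
                    WHERE t2.error_code = 0
                      AND t2.test_id = t.test_id)
ORDER BY test_id, test_date DESC;
Keeping only the latest row per test_id then needs a greatest-n-per-group step (for example DISTINCT ON (test_id) in PostgreSQL) instead of the plain LIMIT 1.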
I think you just want aggregation and some filtering:
select test_id, count(*),
       max(date) filter (where error_id = 0) as last_success_date
from t
where date >= (select max(t2.date) from t t2 where t2.error_id = 0)
group by test_id;
You have to use the maximum date of the good runs for every test_id in your query. You can try this query:
select tr2.Date, tr.test_id, count(tr.error_id)
from testresults tr
inner join (select test_id, max(Date) as Date
            from testresults
            where error_id = 0
            group by test_id) tr2
        on tr.test_id = tr2.test_id and tr.Date >= tr2.Date
group by tr2.Date, tr.test_id
This should do the trick:
select count(*) from tests t,
     (select max(test_date) as test_date from tests where error_code = 0) good
where t.test_date >= good.test_date
Basically you are counting the rows that have a date >= the date of the last success.
Please note: if you need the number of days, it is a completely different query:
select now()::date - max(test_date) last_valid from tests
where error_code = 0;

Greatest N Per Group with JOIN and multiple order columns

I have two tables:
Table0:
| ID | TYPE | TIME | SITE |
|----|------|-------|------|
| aa | 1 | 12-18 | 100 |
| aa | 1 | 12-10 | 101 |
| bb | 2 | 12-10 | 102 |
| cc | 1 | 12-09 | 100 |
| cc | 2 | 12-12 | 103 |
| cc | 2 | 12-01 | 109 |
| cc | 1 | 12-07 | 101 |
| dd | 1 | 12-08 | 100 |
and
Table1:
| ID |
|----|
| aa |
| cc |
| cc |
| dd |
| dd |
I'm trying to output results where:
ID must exist in both tables.
TYPE must be the maximum for each ID.
TIME must be the minimum value for the maximum TYPE for each ID.
SITE should be the value from the same row as the minimum TIME value.
Given my sample data, my results should look like this:
| ID | TYPE | TIME | SITE |
|----|------|-------|------|
| aa | 1 | 12-10 | 101 |
| cc | 2 | 12-01 | 109 |
| dd | 1 | 12-08 | 100 |
I've tried these statements:
INSERT INTO "NuTable"
SELECT DISTINCT(QTS."ID"), "SITE",
CASE WHEN MAS.MAB=1 THEN 'B'
WHEN MAS.MAB=2 THEN 'F'
ELSE NULL END,
"TIME"
FROM (SELECT DISTINCT("ID") FROM TABLE1) AS QTS,
TABLE0 AS MA,
(SELECT "ID", MAX("TYPE") AS MASTY, MIN("TIME") AS MASTM
FROM TABLE0
GROUP BY "ID") AS MAS,
WHERE QTS."ID" = MA."ID"
AND QTS."ID" = MAS."ID"
AND MSD.MASTY =MA."TYPE"
...which generates a syntax error
INSERT INTO "NuTable"
SELECT DISTINCT(QTS."ID"), "SITE",
CASE WHEN MAS.MAB=1 THEN 'B'
WHEN MAS.MAB=2 THEN 'F'
ELSE NULL END,
"TIME"
FROM (SELECT DISTINCT("ID") FROM TABLE1) AS QTS,
TABLE0 AS MA,
(SELECT "ID", MAX("TYPE") AS MAB
FROM TABLE0
GROUP BY "ID") AS MAS,
((SELECT "ID", MIN("TIME") AS MACTM, MIN("TYPE") AS MACTY
FROM TABLE0
WHERE "TYPE" = 1
GROUP BY "ID")
UNION
(SELECT "ID", MIN("TIME"), MAX("TYPE")
FROM TABLE0
WHERE "TYPE" = 2
GROUP BY "ID")) AS MACU
WHERE QTS."ID" = MA."ID"
AND QTS."ID" = MAS."ID"
AND MACU."ID" = QTS."ID"
AND MA."TIME" = MACU.MACTM
AND MA."TYPE" = MACU.MACTB
... which is getting the wrong results.
Answering your direct question "how to avoid...":
You get this error when you specify a column in the SELECT part of a statement that isn't present in the GROUP BY section and isn't wrapped in an aggregating function like MAX, MIN or AVG.
With your data, I cannot say:
SELECT
ID, site, min(time)
FROM
table
GROUP BY
id
because I didn't say what to do with SITE: it's either a key of the group (in which case I get every unique combination of ID, SITE and the min time in each) or it should be aggregated (e.g. the max SITE per ID).
These are ok:
SELECT
ID, max(site), min(time)
FROM
table
GROUP BY
id
SELECT
ID, site, min(time)
FROM
table
GROUP BY
id,site
I cannot simply not specify what to do with it: what should the database return in such a case? (If you're still struggling, tell me in the comments what you think the db should do, and I'll better understand your thinking so I can tell you why it can't do that.) The programmer of the database cannot make this decision for you; you must make it.
Usually people ask this when they want to identify:
The min time per ID, plus all the other row data as well, e.g. "What is the full earliest record data for each id?"
In this case you have to write a query that identifies the min time per id, and then join that subquery back to the main data table on id = id and time = mintime. The db runs the subquery, builds a list of min time per id, and that effectively becomes a filter of the main data table:
SELECT * FROM
(
SELECT
ID, min(time) as mintime
FROM
table
GROUP BY
id
) findmin
INNER JOIN table t ON t.id = findmin.id and t.time = findmin.mintime
What you cannot do is start putting the other data you want into the query that does the grouping, because you either have to group by the columns you add (which makes the groups more fine-grained, not what you want) or you have to aggregate them (and then the values don't necessarily come from the same row as the other aggregated columns: min time is from row 1, min site is from row 3, not what you want).
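For instance, with your sample TABLE0 data, this is legal but mixes values from different rows:
SELECT "ID", MIN("TIME"), MIN("SITE")
FROM TABLE0
GROUP BY "ID"
For cc it returns the minimum TIME 12-01 (which sits on the TYPE-2 row with SITE 109) together with the minimum SITE 100 (which comes from the 12-09 row), so the output no longer describes any single record.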
Looking at your actual problem:
The ID value must exist in both tables.
The TYPE value must be the largest for each ID.
The TIME value must be the smallest within that largest-TYPE group.
Leaving out a solution that involves HAVING or analytic functions for now, so you can get to grips with the theory here:
You need to find the max type per id, join that back to the table to get the other relevant data (time is needed) for that id/maxtype, and then on this new filtered data set take the id and min time:
SELECT t.id,min(t.time) FROM
(
SELECT
ID, max(type) as maxtype
FROM
table
GROUP BY
id
) findmax
INNER JOIN table t ON t.id = findmax.id and t.type = findmax.maxtype
GROUP BY t.id
If you can't see why, let me know
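If you also need SITE from that same row, the same pattern applies once more: treat the id/maxtype/min-time result as another derived table and join it back to pick up the remaining columns. A sketch along those lines, keeping the same placeholder table name (and still leaving out the join against TABLE1 that restricts the ids):
SELECT t.id, t.type, t.time, t.site
FROM (
    SELECT t.id, findmax.maxtype, MIN(t.time) AS mintime
    FROM (
        SELECT id, MAX(type) AS maxtype
        FROM table
        GROUP BY id
    ) findmax
    INNER JOIN table t ON t.id = findmax.id AND t.type = findmax.maxtype
    GROUP BY t.id, findmax.maxtype
) findmin
INNER JOIN table t ON t.id = findmin.id
                  AND t.type = findmin.maxtype
                  AND t.time = findmin.mintime;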
demo:db<>fiddle
SELECT DISTINCT ON (t0.id)
t0.id,
type,
time,
first_value(site) OVER (PARTITION BY t0.id ORDER BY time) as site
FROM table0 t0
JOIN table1 t1 ON t0.id = t1.id
ORDER BY t0.id, type DESC, time
ID must exist in both tables
This can be achieved by joining both tables on their ids. An inner join returns only the rows that exist in both tables.
SITE should be the value from the same row as the minimum TIME value.
This is the same as "Give me the first value of each group of ids ordered by time". This can be done by using the first_value() window function. Window functions can group your data set (PARTITION BY). So you are getting groups of ids which can be ordered separately. first_value() gives the first value of these ordered groups.
TYPE must be the maximum for each ID.
To get the maximum type per id you'll first have to ORDER BY id, type DESC. You are getting the maximum type as first row per id...
TIME must be the minimum value for the maximum TYPE for each ID.
... Then you can additionally order this result by time to ensure this condition.
Now you have an ordered data set: For each id, the row with the maximum type and its minimum time is the first one.
DISTINCT ON gives you exactly the first row of each group. In this case the group you defined is (id). The result is your expected one.
I would write this using distinct on and in/exists:
select distinct on (t0.id) t0.*
from table0 t0
where exists (select 1 from table1 t1 where t1.id = t0.id)
order by t0.id, type desc, time asc;

SQL updating a record in a table concerning another record in the same table

I have a table that contains more than 16,000,000 records.
Each record has a primary key (formed by five fields "tsid, plisid, plifc, plisc, dt"), and two counter fields ("icount, aicount").
There is a relation between some of the records in the table.
To simplify the problem let's say we have only these two records
tsid, plisid, plifc, plisc, dt, icount, aicount
10 1 0 0 0 2 2
11 1 0 0 0 7 0
The requirement:
I want to update the "aicount" field in the second record to be 9 (i.e. "icount" in the second record + "aicount" in the first record).
The relation between the first and second record is that they have the same values in (plisid, plifc, plisc, dt), and the tsid value of the second record == the tsid of the first record + 1
The desired result after the update is:
tsid, plisid, plifc, plisc, dt, icount, aicount
10 1 0 0 0 2 2
11 1 0 0 0 7 9
I tried this SQL statement in PostgreSQL but I got a syntax error "ERROR: syntax error at or near "SELECT" Position: 59"
UPDATE table1 SET
table1.aicount = table1.icount + SELECT COALESCE( (SELECT CASE
WHEN table1temp.aicount IS NULL
THEN 0
ELSE table1temp.aicount
END
FROM table1 table1temp
WHERE table1temp.tsid = table1.tsid - 1
AND table1temp.plisid = table1.plisid
AND table1temp.plifc = table1.plifc
AND table1temp.plisc = table1.plisc
AND table1temp.dt = table1.dt), 0)
WHERE table1.tsid = 10;
What is wrong in the statement above? Any ideas or suggestions?
The error is caused by the bare SELECT in the SET expression: a scalar subquery used there has to be wrapped in parentheses (and the column on the left of the = must not be qualified with the table name).
You seem to want the number that is this row's icount added to the previously recorded row's aicount.
I would use the LAG window function in a subquery to get the previously recorded aicount, and then update with that total.
There are three parameters in the LAG function:
the column whose previous value you want;
the offset back from the current row, which defaults to 1;
the default value to return when there is no such row, which defaults to NULL.
lag(value any [, offset integer [, default any ]])
returns value evaluated at the row that is offset rows before the current row within the partition; if there is no such row, instead returns default. Both offset and default are evaluated with respect to the current row. If omitted, offset defaults to 1 and default to null.
CREATE TABLE T(
tsid int,
plisid int,
plifc int,
plisc int,
dt int,
icount int,
aicount int
);
INSERT INTO T VALUES (10,1,0,0,0,2,2);
INSERT INTO T VALUES (11,1,0,0,0,7,0);
UPDATE T
SET aicount = t1.totle
FROM
(
SELECT *,(LAG(aicount,1,0) over(order by tsid) + icount) totle
FROM T
) t1
WHERE
T.tsid = t1.tsid
AND T.plisid = t1.plisid
AND T.plifc = t1.plifc
AND T.plisc = t1.plisc
AND T.dt = t1.dt
Query 1:
SELECT * FROM T
Results:
| tsid | plisid | plifc | plisc | dt | icount | aicount |
|------|--------|-------|-------|----|--------|---------|
| 10 | 1 | 0 | 0 | 0 | 2 | 2 |
| 11 | 1 | 0 | 0 | 0 | 7 | 9 |
Try the following query:
update T
set aicount = mm.m
from (
    select tsid, plisid, plifc, plisc, dt,
           sum(icount) over (partition by plisid, plifc, plisc, dt order by tsid) as m
    from T
) mm
where T.tsid = mm.tsid and T.plisid = mm.plisid and T.plifc = mm.plifc and T.plisc = mm.plisc and T.dt = mm.dt;

How to create an end date that is one day less than the next start date created by another query with SQL?

I queried a table that pulls in anyone who has a working time percentage of less than 100, together with all of their working time records if they met the less-than-100 criterion.
This table contains the columns: id, eff_date (of working time percentage), and percentage. This table does not contain end_date.
Problem: how do I build on top of the query below and add a new column called end_date that is one day less than the next eff_date?
Current query
select
j1.id, j1.eff_date, j1.percentage
from
working_time_table j1
where
exists (select 1
from working_time_table j2
where j2.id = j1.id and j2.percentage < 100)
Data returned from the query above:
ID | EFF_DATE| PERCENTAGE
------------------------
12 | 01-JUN-2012 | 70
12 | 03-MAR-2013 | 100
12 | 13-DEC-2014 | 85
The desired result set is:
ID | EFF_DATE | PERCENTAGE | END_DATE
-------------------------------------------
12 | 01-JUN-2012 | 70 | 02-MAR-2013
12 | 03-MAR-2013 | 100 | 12-DEC-2014
12 | 13-DEC-2014 | 85 | null
You didn't state your DBMS so this is ANSI SQL using window functions:
select j1.id,
j1.eff_date,
j1.percentage,
lead(j1.eff_date) over (partition by j1.id order by j1.eff_date) - interval '1' day as end_date
from working_time_table j1
where exists (select 1
from working_time_table j2
where j2.id = j1.id and j2.percentage < 100);
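If, for example, the target happens to be SQL Server, only the date arithmetic changes; a sketch of the same query using DATEADD:
select j1.id,
       j1.eff_date,
       j1.percentage,
       dateadd(day, -1,
               lead(j1.eff_date) over (partition by j1.id
                                       order by j1.eff_date)) as end_date
from working_time_table j1
where exists (select 1
              from working_time_table j2
              where j2.id = j1.id and j2.percentage < 100);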
First off, I'm curious whether the "id" column is unique or has duplicate values like the 12's in your sample, or whether there is some other unique column or primary key. It would be WAAAAY easier to do this if there was a unique id column that held the order. If you don't have a unique ID column, are you able to add one to the table? Again, it would simplify this tremendously.
This took forever to get right. I hope this helps; I burned many hours on it.
Props to Akhil for helping me finally get the query right. He is a true SQL genius.
Here is the query:
SELECT
    id,
    firstTbl.eff_date,
    UPPER(DATE_FORMAT(DATE_SUB(
        STR_TO_DATE(secondTbl.eff_date, '%d-%M-%Y'),
        INTERVAL 1 DAY), '%d-%b-%Y')) todate,
    percentage
FROM
    (SELECT
         (@cnt := @cnt + 1) rownum,
         id, eff_date, percentage
     FROM working_time_table,
          (SELECT @cnt := 0) s) firstTbl
LEFT JOIN
    (SELECT
         (@cnt1 := @cnt1 + 1) rownum,
         eff_date
     FROM working_time_table,
          (SELECT @cnt1 := 0) s) secondTbl
ON (firstTbl.rownum + 1) = secondTbl.rownum

SQL select two nearest rows

How do I select the two rows whose timestamp is nearest to a specific timestamp?
SELECT *
FROM 'wp_weather'
WHERE ( timestamp most nearly to 1385435000) AND city = 'Махачкала'
The table:
id | timestamp
---------------
0 | 1385410000
1 | 1385420000
2 | 1385430000
3 | 1385440000
4 | 1385450000
SELECT *
FROM wp_weather
WHERE city = 'Махачкала'
order by abs(timestamp - 1385435000)
limit 2
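For the sample table and the target 1385435000, the two nearest timestamps are 1385430000 and 1385440000 (each 5000 away), so the query returns the rows with id 2 and id 3.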
You may try something like this:
SELECT * FROM wp_weather
WHERE city = 'Махачкала'
order by abs(timestamp - 1385435000)
limit 2
Also check the ABS function.