Select row with timestamp nearest to, but not earlier than, now - sql

Using Postgres 9.4, I am trying to select a single row from a table that contains data nearest to, but not before, the current system time. The datetime column is a timestamp without time zone data type, and the data is in the same timezone as the server. The table structure is:
uid | datetime | date | day | time | predictionft | predictioncm | highlow
-----+---------------------+------------+-----+----------+--------------+--------------+---------
1 | 2015-12-31 03:21:00 | 2015/12/31 | Thu | 03:21 AM | 5.3 | 162 | H
2 | 2015-12-31 09:24:00 | 2015/12/31 | Thu | 09:24 AM | 2.4 | 73 | L
3 | 2015-12-31 14:33:00 | 2015/12/31 | Thu | 02:33 PM | 4.4 | 134 | H
4 | 2015-12-31 21:04:00 | 2015/12/31 | Thu | 09:04 PM | 1.1 | 34 | L
Query speed is not a worry since the table contains ~1500 rows.
For clarity, if the current server time was 2015-12-31 14:00:00, the row returned should be 3 rather than 2.
EDIT:
The solution, based on the accepted answer below, was:
select *
from myTable
where datetime =
(select min(datetime)
from myTable
where datetime > now());
EDIT 2: Clarified question.

You can also use this. It should be faster, although it won't make much difference with so few rows.
select * from table1
where datetime >= current_timestamp
order by datetime
limit 1
SQLFiddle Demo
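To make the behaviour of the order by / limit 1 form concrete, here is a minimal sketch that runs the same query shape against the sample rows from the question. It assumes SQLite (through Python's sqlite3 module) as a stand-in for Postgres, a hypothetical table name tides, and a fixed "now" passed in as a parameter instead of current_timestamp so the result is repeatable.

```python
import sqlite3

# Sample rows copied from the question (uid, datetime, highlow).
ROWS = [
    (1, "2015-12-31 03:21:00", "H"),
    (2, "2015-12-31 09:24:00", "L"),
    (3, "2015-12-31 14:33:00", "H"),
    (4, "2015-12-31 21:04:00", "L"),
]


def next_row(conn, now):
    """Return the earliest row whose datetime is at or after `now`, or None."""
    return conn.execute(
        "select uid, datetime, highlow from tides "
        "where datetime >= ? order by datetime limit 1",
        (now,),
    ).fetchone()


# Assumption: an in-memory SQLite table named `tides` stands in for the real one.
conn = sqlite3.connect(":memory:")
conn.execute("create table tides (uid integer, datetime text, highlow text)")
conn.executemany("insert into tides values (?, ?, ?)", ROWS)

# ISO-formatted text sorts in chronological order, so plain comparison works here.
row = next_row(conn, "2015-12-31 14:00:00")
print(row)
```

With the server time from the question (2015-12-31 14:00:00) this picks uid 3, and it returns None once "now" is later than every row in the table.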

The general idea follows. You can adjust it for PostgreSQL.
select fields
from yourTable
where datetimeField =
(select min(datetimeField)
from yourTable
where datetimeField > current_timestamp)

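Another approach, besides the answers already given, is to use the window function first_value. The outer order by makes sure limit 1 keeps the row that matches the first value:
select id, first_value(dt) over (order by dt)
from test
where dt >= current_timestamp
order by dt
limit 1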
See it working here: http://sqlfiddle.com/#!15/0031c/12

Related

Work out variance of groups of rows in SQL

I'm looking to work out a variance value per month for a table of data, with each month containing three rows to be accounted for. I'm struggling to think of a way of doing this without 'looping' which, as far as I'm aware, isn't supported in SQL.
Here is an example table of what I mean:
+======================+=======+
| timestamp | value |
+======================+=======+
| 2020-01-04T10:58:24Z | 10 | # January (Sum of vals = 110)
+----------------------+-------+
| 2020-01-14T10:58:21Z | 68 |
+----------------------+-------+
| 2020-01-29T10:58:12Z | 32 |
+----------------------+-------+
| 2020-02-04T10:58:13Z | 19 | # February (Sum of vals = 112)
+----------------------+-------+
| 2020-02-14T10:58:19Z | 5 |
+----------------------+-------+
| 2020-02-24T10:58:11Z | 88 |
+----------------------+-------+
| 2020-03-04T10:58:11Z | 72 | # March (Sum of vals = 184)
+----------------------+-------+
| 2020-03-15T10:58:10Z | 90 |
+----------------------+-------+
| 2020-03-29T10:58:16Z | 22 |
+----------------------+-------+
| .... | .... |
+======================+=======+
I need to build a query which can combine all 3 values in each month, then work out the variance of the combined values across months. Hopefully this makes sense? So in this case, I would need to work out the variance between January (110), February (112) and March (184).
Does anyone have any suggestions as to how I could accomplish this? I'm using PostgreSQL, but need a vanilla SQL solution :/
Thanks!
Are you looking for aggregation by month and then a variance calculation? If so:
select variance(sum_vals)
from (select date_trunc('month', timestamp) as mon, sum(value) as sum_vals
from t
group by mon
) t;
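As a cross-check on what that query computes, here is a small Python sketch over the nine sample rows from the question. It assumes the sample variance (n - 1 in the denominator), which is what PostgreSQL's variance() returns as an alias for var_samp; the names readings and monthly_sums are made up for this illustration.

```python
from collections import defaultdict
from statistics import variance

# Sample rows copied from the question (timestamp, value).
readings = [
    ("2020-01-04T10:58:24Z", 10),
    ("2020-01-14T10:58:21Z", 68),
    ("2020-01-29T10:58:12Z", 32),
    ("2020-02-04T10:58:13Z", 19),
    ("2020-02-14T10:58:19Z", 5),
    ("2020-02-24T10:58:11Z", 88),
    ("2020-03-04T10:58:11Z", 72),
    ("2020-03-15T10:58:10Z", 90),
    ("2020-03-29T10:58:16Z", 22),
]


def monthly_sums(rows):
    """Sum values per month; the 'YYYY-MM' prefix plays the role of date_trunc."""
    sums = defaultdict(int)
    for ts, value in rows:
        sums[ts[:7]] += value
    return dict(sums)


sums = monthly_sums(readings)
# statistics.variance is the sample variance, like var_samp in PostgreSQL.
result = variance(list(sums.values()))
print(sums, result)
```

The monthly sums come out as 110, 112 and 184, matching the comments in the question's table. If the population variance is wanted instead, statistics.pvariance (or var_pop in SQL) is the counterpart.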

Aggregate results split by day

I'm trying to write a query that returns summarised data, per day, over many days of data.
For example
| id | user_id | start
|----|---------|------------------------------
| 1 | 1 | 2020-02-01T17:35:37.242+00:00
| 2 | 1 | 2020-02-01T13:25:21.344+00:00
| 3 | 1 | 2020-01-31T16:42:51.344+00:00
| 4 | 1 | 2020-01-30T06:44:55.344+00:00
The outcome I'm hoping for is a function that I can pass the user ID and a timezone, or UTC offset, into and get out:
| day | count |
|---------|-------|
| 1/2/20 | 2 |
| 31/1/20 | 1 |
| 30/1/20 | 7 |
Where the count is all the rows that have a start time falling between 00:00:00.0000 and 23:59:59.9999 on each day - taking into consideration the supplied UTC offset.
I don't really know where to start writing a query like this, and the fact that I can't even picture where to start feels like a big gap in my SQL thinking. How should I approach something like this?
You can use:
select date_trunc('day', start) as dte, count(*)
from t
where user_id = ?
group by date_trunc('day', start)
order by dte;
If you want to handle an additional offset, build that into the query:
select dte, count(*)
from t cross join lateral
(values (date_trunc('day', start + ? * interval '1 hour'))) v(dte)
where user_id = ?
group by v.dte
order by v.dte;
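The effect of the offset is easiest to see on the four sample rows. The sketch below mirrors the "shift by the offset, then truncate to the day" idea in Python; the function name counts_per_day and the choice of a whole-hour offset are assumptions made for the illustration.

```python
from collections import Counter
from datetime import datetime, timedelta, timezone

# Start times copied from the question's sample rows (all for user_id 1).
STARTS = [
    "2020-02-01T17:35:37.242+00:00",
    "2020-02-01T13:25:21.344+00:00",
    "2020-01-31T16:42:51.344+00:00",
    "2020-01-30T06:44:55.344+00:00",
]


def counts_per_day(starts, utc_offset_hours):
    """Count rows per calendar day as seen from the given UTC offset."""
    tz = timezone(timedelta(hours=utc_offset_hours))
    # Convert each instant to the caller's offset, then keep only the date part.
    days = (datetime.fromisoformat(s).astimezone(tz).date().isoformat() for s in starts)
    return Counter(days)


print(counts_per_day(STARTS, 0))
print(counts_per_day(STARTS, 10))
```

With an offset of 0 the rows fall on 1 Feb (two rows), 31 Jan and 30 Jan. With an offset of +10 hours the 17:35 row moves to 2 Feb and the 31 Jan row moves to 1 Feb, which is why the offset has to be applied before the truncation and not after it.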

Comparing two tables that are the same and listing out the max date

I was wondering if it's possible to compare dates within the same table with same ID, but the catch is that there is an additional column that display the status. For instance, here's a table A:
The result I would like to see is this:
I know I could use a group by and max aggregate with ID to find the max date; however, I would like the status (Running/Stopped) column associated to be there. It would help me a lot.
In most databases, the fastest method (assuming the right indexes) is a correlated subquery:
select t.*
from t
where t.date = (select max(t2.date) from t t2 where t2.id = t.id);
Even if not the fastest, this should work in any database.
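One thing to keep in mind with the correlated subquery is that it returns every row that shares the maximum date for an id. The sketch below runs it through Python's sqlite3 module, which is only an assumption chosen to make the query easy to try; the rows are borrowed from the sample data in the Oracle answer on this page.

```python
import sqlite3

# Sample rows (id, status, date) borrowed from the Oracle answer's sample data.
ROWS = [
    (1, "Running", "2018-02-03"),
    (1, "Stopped", "2018-04-04"),
    (2, "Running", "2018-03-24"),
    (2, "Stopped", "2018-01-02"),
    (3, "Running", "2018-06-12"),
    (3, "Stopped", "2018-06-12"),
]

conn = sqlite3.connect(":memory:")
conn.execute("create table t (id integer, status text, date text)")
conn.executemany("insert into t values (?, ?, ?)", ROWS)

# Same correlated subquery as above, with an order by added for stable output.
latest = conn.execute(
    "select t.id, t.status, t.date from t "
    "where t.date = (select max(t2.date) from t t2 where t2.id = t.id) "
    "order by t.id, t.status"
).fetchall()

for row in latest:
    print(row)
```

Ids 1 and 2 each produce one row, while id 3 produces two because both of its statuses share the date 2018-06-12. If exactly one row per id is required, a tie-breaking rule is needed on top of this.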
In case of Oracle, you can use the KEEP clause like this:
SELECT t.id,
MAX(t.status) KEEP (DENSE_RANK LAST ORDER BY t."DATE") AS corresponding_status,
MAX(t."DATE") AS last_date
FROM tab t
GROUP BY t.id
ORDER BY 1
For this sample data:
+----+---------+------------+
| ID | STATUS | DATE |
+----+---------+------------+
| 1 | Running | 2018-02-03 |
| 1 | Stopped | 2018-04-04 |
| 2 | Running | 2018-03-24 |
| 2 | Stopped | 2018-01-02 |
| 3 | Running | 2018-06-12 |
| 3 | Stopped | 2018-06-12 |
+----+---------+------------+
This would return this result:
+----+----------------------+------------+
| ID | CORRESPONDING_STATUS | LAST_DATE |
+----+----------------------+------------+
| 1 | Stopped | 2018-04-04 |
| 2 | Running | 2018-03-24 |
| 3 | Stopped | 2018-06-12 |
+----+----------------------+------------+
As can be seen in this SQL Fiddle.
When there are multiple entries for the same ID and DATE combination, it will choose one STATUS value, in this case the last one (based on alphanumerical sorting), since I've used MAX on STATUS.
The part LAST ORDER BY t."DATE" corresponds to how we choose DATE value in the group, i.e. by choosing the last DATE in the group.
See this Oracle Docs entry for more details.
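For readers without an Oracle instance at hand, the selection rule described above can be imitated in a few lines of Python. This is only a sketch of the semantics (latest date per id, ties broken by the largest status), not of how Oracle evaluates the clause; the function name last_status_per_id is made up for the illustration.

```python
# Sample rows (id, status, date) copied from the sample data above.
ROWS = [
    (1, "Running", "2018-02-03"),
    (1, "Stopped", "2018-04-04"),
    (2, "Running", "2018-03-24"),
    (2, "Stopped", "2018-01-02"),
    (3, "Running", "2018-06-12"),
    (3, "Stopped", "2018-06-12"),
]


def last_status_per_id(rows):
    """Map each id to (status, date) for its latest date, taking the max status on ties."""
    result = {}
    for id_, status, date in rows:
        best = result.get(id_)
        # Comparing (date, status) tuples picks the latest date first and
        # falls back to the alphabetically larger status when dates are equal.
        if best is None or (date, status) > (best[1], best[0]):
            result[id_] = (status, date)
    return result


picked = last_status_per_id(ROWS)
for id_ in sorted(picked):
    print(id_, *picked[id_])
```

The output lines up with the result table above: Stopped for id 1, Running for id 2, and Stopped for id 3, where the tie on 2018-06-12 is settled by the larger status string.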

Finding the max value between the last 22 months or between any 10 hour window within the last 22 months in Microsoft SQL Server

I'd like to find the max value within the last 22 months OR the max value within any 10 hour window of those last 22 months.
I'm doing this in Microsoft SQL Server.
Essentially, I'm looking to retrieve a value that has sustained a high for at least 10 hours before I consider it my max and if it is larger than the max of the last 22 months, it would be the new max, otherwise I would use the max of the last 22 months.
Here's what I think it should look like in pseudocode:
if (time > 10 hours) AND (value = max) OR (18 > time > 0) AND (value = max)
then output = value
The SQL code that I've tried:
SELECT TOP 90 PERCENT
DATEADD(s,time,'19700101') as time_22month
,GETDATE() as date_22month
,b.tagname as tag_22month
,value as value_22month
,maximum as max_22month
FROM
db..hour a
INNER JOIN
db..tag b
ON
a.tagid = b.tagid
WHERE
b.tagname like '%T500.1234%'
AND
(GETDATE() - DATEADD(s, time, '19700101') < 670)
ORDER BY
max_22month DESC
SELECT
DATEADD(s,time,'19700101') as time_10hour
,GETDATE() as date_10hour
,b.tagname as tag_10hour
,value as value_10hour
,maximum as max_10hour
FROM
db..hour a
INNER JOIN
db..tag b
ON
a.tagid = b.tagid
WHERE
b.tagname like '%T500.1234%'
AND
(GETDATE() - DATEADD(s, time, '19700101') < 0.42)
ORDER BY
max_10hour DESC
Output right now is the following:
+-------------------------+----------------------------+-------------+----------------+---------------+
| time_22month | date_22month | tag_22month | value_22month | max_22month |
+-------------------------+----------------------------+-------------+----------------+---------------+
| 2016-03-08 06:00:00.000 | 2017-04-10 10:07:57:32.783 | T500.1234 | 1567.88546416 | 2445.56419848 |
| 2016-03-08 07:00:00.000 | 2017-04-10 10:07:57:32.783 | T500.1234 | 1499.88546416 | 2434.47673719 |
+-------------------------+----------------------------+-------------+----------------+---------------+
+-------------------------+----------------------------+------------+---------------+---------------+
| time_10hour | date_10hour | tag_10hour | value_10hour | max_10hour |
+-------------------------+----------------------------+------------+---------------+---------------+
| 2017-04-10 00:00:00.000 | 2017-04-10 10:07:57:32.783 | T500.1234 | 8763.42572454 | 8759.64548912 |
| 2017-04-10 01:00:00.000 | 2017-04-10 10:07:57:32.783 | T500.1234 | 8001.64578943 | 8001.64578943 |
+-------------------------+----------------------------+------------+---------------+---------------+
So I'm a little confused on how I should be comparing these max values, especially when the 10 hour window needs to be rolling (incrementing every hour). Any help is appreciated.
The output should be the greater value of the two parameters, so perhaps a new table column would be the output along with two columns that precede it that show the highest 22month value and the highest 10 hour window value.
+--------+-------------+------------+------+
| Month | 22Month_Max | 10Hour_Max | Max |
+--------+-------------+------------+------+
| July | 5478 | 5999 | 5999 |
| August | 4991 | 3523 | 4991 |
+--------+-------------+------------+------+
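The comparison described above can be prototyped outside the database before committing to a query. The sketch below is one possible reading of the requirement, not a confirmed one: it treats "sustained for 10 hours" as the lowest reading inside each run of consecutive hourly readings, takes the best such run, and then reports the greater of that and the long-term maximum. The function names and the assumption of one reading per hour are hypothetical.

```python
def sustained_peak(values, window):
    """Highest level held across `window` consecutive readings, or None if too few.

    Assumption: one reading per hour, so window=10 means a 10-hour window.
    """
    if len(values) < window:
        return None
    # For each window the sustained level is its minimum; keep the best window.
    return max(min(values[i:i + window]) for i in range(len(values) - window + 1))


def reported_max(long_term_max, window_max):
    """The final column in the desired output: the greater of the two maxima."""
    return max(long_term_max, window_max)


# Tiny illustration with a 3-reading window instead of 10.
print(sustained_peak([5, 9, 9, 9, 2], 3))
# Figures from the desired output table in the question.
print(reported_max(5478, 5999), reported_max(4991, 3523))
```

If the intended meaning is simply the highest single reading inside any 10-hour window, replacing min with max in sustained_peak gives that variant.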

Why won't this simple SQL statement work?

dateposted is a MySQL TIMESTAMP column:
SELECT *
FROM posts
WHERE dateposted > NOW() - 604800
...SHOULD, if I am not mistaken, return rows where dateposted was in the last week. But it returns only posts less than roughly one day old. I was under the impression that TIMESTAMP used seconds?
I.e.: 7 * 3600 * 24 = 604800
Use:
WHERE dateposted BETWEEN DATE_ADD(NOW(), INTERVAL -7 DAY) AND NOW()
That is because NOW() is implicitly converted from a timestamp into a number, and MySQL's conversion rules create a number of the form YYYYMMDDHHMMSS.uuuuuu.
From the MySQL docs:
mysql> SELECT NOW();
-> '2007-12-15 23:50:26'
mysql> SELECT NOW() + 0;
-> 20071215235026.000000
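The arithmetic can be replayed in Python to see why the original filter keeps only about a day of posts. This sketch imitates the numeric form with strftime; the helper name mysql_numeric and the clock reading used for "now" are assumptions chosen for the illustration.

```python
from datetime import datetime, timedelta


def mysql_numeric(dt):
    """Imitate MySQL's numeric form of a datetime: the digits YYYYMMDDHHMMSS."""
    return int(dt.strftime("%Y%m%d%H%M%S"))


now = datetime(2009, 12, 24, 14, 41, 59)  # assumed clock reading

# What NOW() - 604800 actually computes: plain subtraction on the digits.
broken_cutoff = mysql_numeric(now) - 604800
# What was intended: a point in time 604800 seconds (7 days) earlier.
intended_cutoff = now - timedelta(seconds=604800)

print(broken_cutoff)
print(intended_cutoff)

# Posts from late on the 23rd fall below the broken cutoff; posts from the 24th pass.
print(mysql_numeric(datetime(2009, 12, 23, 23, 59, 59)) > broken_cutoff)
print(mysql_numeric(datetime(2009, 12, 24, 0, 0, 0)) > broken_cutoff)
```

Subtracting 604800 from the digit string only disturbs the hour, minute and second positions and borrows a single day, so the cutoff lands on a value that is not a valid time on the previous day. Every timestamp from that previous day compares lower than it, which matches the "less than roughly one day old" behaviour in the question.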
Internally perhaps. The way to do this is the date math functions. So it would be:
SELECT * FROM posts WHERE dateposted > DATE_ADD(NOW(), INTERVAL -7 DAY)
I think there is a DATE_SUB, I'm just used to using ADD everywhere.
No, you can't implicitly use integer arithmetic with TIMESTAMP, DATETIME, and other date-related data types. You're thinking of the UNIX timestamp format, which is an integer number of seconds since 1/1/1970.
You can convert SQL data types to a UNIX timestamp in MySQL and then use arithmetic:
SELECT * FROM posts WHERE UNIX_TIMESTAMP(dateposted)+604800 > NOW()+0;
NB: adding zero to NOW() makes it return a numeric value instead of a string value.
update: Okay, I'm totally wrong with the above query. Converting NOW() to a numeric output doesn't produce a number that can be compared to UNIX timestamps. It produces a number, but the number doesn't count seconds or anything else. The digits are just YYYYMMDDHHMMSS strung together.
Example:
CREATE TABLE foo (
id SERIAL PRIMARY KEY,
dateposted TIMESTAMP
);
INSERT INTO foo (dateposted) VALUES ('2009-12-4'), ('2009-12-11'), ('2009-12-18');
SELECT * FROM foo;
+----+---------------------+
| id | dateposted |
+----+---------------------+
| 1 | 2009-12-04 00:00:00 |
| 2 | 2009-12-11 00:00:00 |
| 3 | 2009-12-18 00:00:00 |
+----+---------------------+
SELECT *, UNIX_TIMESTAMP(dateposted) AS ut, NOW()-604800 AS wk FROM foo
+----+---------------------+------------+-----------------------+
| id | dateposted | ut | wk |
+----+---------------------+------------+-----------------------+
| 1 | 2009-12-04 00:00:00 | 1259913600 | 20091223539359.000000 |
| 2 | 2009-12-11 00:00:00 | 1260518400 | 20091223539359.000000 |
| 3 | 2009-12-18 00:00:00 | 1261123200 | 20091223539359.000000 |
+----+---------------------+------------+-----------------------+
It's clear that the numeric values are not comparable. However, UNIX_TIMESTAMP() can also convert numeric values in that format, just as it can convert a string representation of a timestamp:
SELECT *, UNIX_TIMESTAMP(dateposted) AS ut, UNIX_TIMESTAMP(NOW())-604800 AS wk FROM foo
+----+---------------------+------------+------------+
| id | dateposted | ut | wk |
+----+---------------------+------------+------------+
| 1 | 2009-12-04 00:00:00 | 1259913600 | 1261089774 |
| 2 | 2009-12-11 00:00:00 | 1260518400 | 1261089774 |
| 3 | 2009-12-18 00:00:00 | 1261123200 | 1261089774 |
+----+---------------------+------------+------------+
Now one can run a query with an expression comparing them:
SELECT * FROM foo WHERE UNIX_TIMESTAMP(dateposted) > UNIX_TIMESTAMP(NOW())-604800
+----+---------------------+
| id | dateposted |
+----+---------------------+
| 3 | 2009-12-18 00:00:00 |
+----+---------------------+
But the answer given by @OMGPonies is still better, because the expression in my query probably can't make use of an index. I'm just offering this as an explanation of how the TIMESTAMP and NOW() features work.
Try this query:
SELECT * FROM posts WHERE DATE_SUB(CURDATE(),INTERVAL 7 DAY) < dateposted;
I am assuming that you are using MySQL.