Some business data has a create_on column indicating the creation date, and for a specified date I want to find the single closest row, preferring the most recent row on or before that date (see the expected results below). How do I write the SQL? I'm using a PostgreSQL database.
drop table if exists t;
create table t (
id int primary key,
create_on date not null
-- ignore other columns
);
insert into t(id, create_on) values
(1, '2018-01-10'::date),
(2, '2018-01-20'::date);
-- maybe have many other data
| sn | specified date | expected result         |
|----|----------------|-------------------------|
| 1  | 2018-01-09     | (1, '2018-01-10'::date) |
| 2  | 2018-01-10     | (1, '2018-01-10'::date) |
| 3  | 2018-01-11     | (1, '2018-01-10'::date) |
| 4  | 2018-01-19     | (1, '2018-01-10'::date) |
| 5  | 2018-01-20     | (2, '2018-01-20'::date) |
| 6  | 2018-01-21     | (2, '2018-01-20'::date) |
This is tricky, because you seem to want the most recent row on or before the date. But if no such row exists, you want the earliest date in the table:
with tt as (
    select t.*
    from t
    where t.create_on <= :specified_date
    order by t.create_on desc
    fetch first 1 row only
)
select tt.*  -- the most recent row on or before the date
from tt
union all
select t.*   -- otherwise, the oldest row in the table
from t
where not exists (select 1 from tt)
order by create_on
fetch first 1 row only;
EDIT:
You can also handle this with a single query:
select t.*
from t
order by (t.create_on <= :specified_date) desc,  -- rows on or before the date first
         (case when t.create_on <= :specified_date then create_on end) desc,
         create_on asc
fetch first 1 row only;
Although this looks simpler, it might actually be more expensive, because the query cannot make use of an index on (create_on), and there is no where clause reducing the number of rows before sorting.
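SQLite also treats boolean expressions as 0/1 in order by, so the single-query trick can be sanity-checked against the expected-result table with Python's built-in sqlite3 module (a sketch, not Postgres itself; the `:specified_date` parameter becomes a `?` placeholder):

```python
import sqlite3

# Build the question's table in an in-memory SQLite database.
conn = sqlite3.connect(":memory:")
conn.execute("create table t (id int primary key, create_on date not null)")
conn.executemany("insert into t values (?, ?)",
                 [(1, "2018-01-10"), (2, "2018-01-20")])

def closest_row(specified_date):
    # Rows on/before the date sort first (true -> 1), then the latest
    # such date wins; otherwise the earliest date in the table wins.
    return conn.execute(
        """select id, create_on from t
           order by (create_on <= ?) desc,
                    (case when create_on <= ? then create_on end) desc,
                    create_on asc
           limit 1""",
        (specified_date, specified_date)).fetchone()

print(closest_row("2018-01-09"))  # -> (1, '2018-01-10')
print(closest_row("2018-01-19"))  # -> (1, '2018-01-10')
print(closest_row("2018-01-21"))  # -> (2, '2018-01-20')
```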
Related
I have a purchases table:
-----------------
user_id | amount
-----------------
1 | 12
1 | 4
1 | 8
2 | 23
2 | 45
2 | 7
I want a query that will return one row per user_id, but the row that I want for each user_id is where the amount is the smallest per user_id. So I should get as my result set:
-----------------
user_id | amount
-----------------
1 | 4
2 | 7
Using DISTINCT on the user_id column ensures I don't get duplicate users, but I don't know how to make it return, for each user, the row with the smallest amount.
You can use distinct on:
select distinct on (user_id) t.*
from t
order by user_id, amount;
Note: If you just want the smallest amount, then group by would be the typical solution:
select user_id, min(amount)
from t
group by user_id;
Distinct on is a convenient Postgres extension that makes it easy to get one row per group -- and it often performs better than other methods.
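As a quick sanity check of the group-by form, here is the question's data run through Python's sqlite3 (SQLite has no DISTINCT ON, so only the min() variant is verified; the purchases table name is taken from the question):

```python
import sqlite3

# Recreate the purchases table from the question.
conn = sqlite3.connect(":memory:")
conn.execute("create table purchases (user_id int, amount int)")
conn.executemany("insert into purchases values (?, ?)",
                 [(1, 12), (1, 4), (1, 8), (2, 23), (2, 45), (2, 7)])

# Smallest amount per user.
rows = conn.execute(
    "select user_id, min(amount) from purchases group by user_id order by user_id"
).fetchall()
print(rows)  # -> [(1, 4), (2, 7)]
```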
If your requirement is to output the whole row that contains the smallest amount (e.g. the table includes a transaction date and you need it in the output), then a convenient method is to use row_number() to select the wanted rows, e.g.
CREATE TABLE mytable(
user_id INTEGER NOT NULL
,amount INTEGER NOT NULL
,trandate DATE NOT NULL
);
INSERT INTO mytable(user_id,amount,trandate) VALUES (1,12,'2020-09-12');
INSERT INTO mytable(user_id,amount,trandate) VALUES (1,4,'2020-10-02');
INSERT INTO mytable(user_id,amount,trandate) VALUES (1,8,'2020-11-12');
INSERT INTO mytable(user_id,amount,trandate) VALUES (2,23,'2020-12-02');
INSERT INTO mytable(user_id,amount,trandate) VALUES (2,45,'2021-01-12');
INSERT INTO mytable(user_id,amount,trandate) VALUES (2,7,'2021-02-02');
select
user_id, amount, trandate
from (
select user_id, amount, trandate
, row_number() over(partition by user_id order by amount) as rn
from mytable
) t
where rn = 1
result:
+---------+--------+------------+
| user_id | amount | trandate |
+---------+--------+------------+
| 1 | 4 | 2020-10-02 |
| 2 | 7 | 2021-02-02 |
+---------+--------+------------+
A demonstration of this is at db<>fiddle here
I am quite new to SQL.
I have an MS SQL database where I would like to fetch the top 3 rows whose datetime is on or after a specific input, plus all rows whose datetime equals the datetime of the last row of that fetch.
| rowId | Timestamp | data |
|-------|--------------------------|------|
| rsg | 2019-01-01T00:00:00.000Z | 120 |
| zqd | 2020-01-01T00:00:00.000Z | 36 |
| ylp | 2020-01-01T00:00:00.000Z | 48 |
| abt | 2022-01-01T00:00:00.000Z | 53 |
| zio | 2022-01-01T00:00:00.000Z | 12 |
Here is my current request to fetch the 3 rows.
SELECT
TOP 3 *
FROM
Table
WHERE
Timestamp >= '2020-01-01T00:00:00.000Z'
ORDER BY
Timestamp ASC
For this input I would like to get the last 4 rows in one request.
Thanks for your help
One possibility, using ROW_NUMBER:
WITH cte AS (
    SELECT *, ROW_NUMBER() OVER (ORDER BY Timestamp) rn
    FROM yourTable
    WHERE Timestamp >= '2020-01-01T00:00:00.000Z'
)
SELECT *
FROM cte
WHERE Timestamp <= (SELECT Timestamp FROM cte WHERE rn = 3);
Matching records should be in the first 3 rows or should have timestamps equal to the timestamp in the third row. We can combine these conditions by restricting to timestamps equal or before the timestamp in the third row.
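The CTE version can be checked against the sample data with Python's sqlite3 (window functions need SQLite 3.25+, which ships with recent Python builds; the input-date filter from the question is applied inside the CTE, and TOP 3 WITH TIES is SQL Server-specific, so it is not tested here):

```python
import sqlite3

# Recreate the question's table.
conn = sqlite3.connect(":memory:")
conn.execute("create table yourTable (rowId text, Timestamp text, data int)")
conn.executemany("insert into yourTable values (?, ?, ?)", [
    ("rsg", "2019-01-01T00:00:00.000Z", 120),
    ("zqd", "2020-01-01T00:00:00.000Z", 36),
    ("ylp", "2020-01-01T00:00:00.000Z", 48),
    ("abt", "2022-01-01T00:00:00.000Z", 53),
    ("zio", "2022-01-01T00:00:00.000Z", 12),
])

# First 3 qualifying rows, plus ties on the third row's timestamp.
rows = conn.execute("""
    with cte as (
        select *, row_number() over (order by Timestamp) rn
        from yourTable
        where Timestamp >= '2020-01-01T00:00:00.000Z'
    )
    select * from cte
    where Timestamp <= (select Timestamp from cte where rn = 3)
""").fetchall()
print(sorted(r[0] for r in rows))  # -> ['abt', 'ylp', 'zio', 'zqd']
```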
Or maybe use TOP 3 WITH TIES:
SELECT TOP 3 WITH TIES *
FROM yourTable
WHERE Timestamp >= '2020-01-01T00:00:00.000Z'
ORDER BY Timestamp;
I'm trying to create a SQL query that merges rows with equal dates. The idea is to keep, for each date, the row with the highest number of hours, so that I end up with the corresponding id for each date. I've been trying a simple GROUP BY, but it does not seem to work, since I can't just put an aggregate function on the id column; it has to be chosen based on the hours condition.
+----+------------+-------+
| id | date       | hours |
+----+------------+-------+
| 1  | 2012-01-01 | 37    |
| 2  | 2012-01-01 | 10    |
| 3  | 2012-01-01 | 5     |
| 4  | 2012-01-02 | 37    |
+----+------------+-------+
Desired result:
+----+------------+-------+
| id | date       | hours |
+----+------------+-------+
| 1  | 2012-01-01 | 37    |
| 4  | 2012-01-02 | 37    |
+----+------------+-------+
If you want exactly one row -- even if there are ties -- then use row_number():
select t.*
from (select t.*, row_number() over (partition by date order by hours desc) as seqnum
from t
) t
where seqnum = 1;
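A quick check of the row_number() query with Python's sqlite3 against the sample data (assuming the table is named t, as in the query above; SQLite accepts date as an unquoted column name):

```python
import sqlite3

# Recreate the sample table.
conn = sqlite3.connect(":memory:")
conn.execute("create table t (id int, date text, hours int)")
conn.executemany("insert into t values (?, ?, ?)",
                 [(1, "2012-01-01", 37), (2, "2012-01-01", 10),
                  (3, "2012-01-01", 5), (4, "2012-01-02", 37)])

# One row per date: the one with the most hours.
rows = conn.execute("""
    select id, date, hours
    from (select t.*,
                 row_number() over (partition by date order by hours desc) as seqnum
          from t) t
    where seqnum = 1
    order by date
""").fetchall()
print(rows)  # -> [(1, '2012-01-01', 37), (4, '2012-01-02', 37)]
```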
Ironically, both Postgres and Oracle (the original tags) have what I would consider to be better ways of doing this, but they are quite different.
Postgres:
select distinct on (date) t.*
from t
order by date, hours desc;
Oracle:
select date, max(hours) as hours,
       max(id) keep (dense_rank first order by hours desc) as id
from t
group by date;
Here's one approach using row_number:
select id, dt, hours
from (
select id, dt, hours, row_number() over (partition by dt order by hours desc) rn
from yourtable
) t
where rn = 1
You can use a correlated subquery:
select t.*
from yourtable t
where t.id = (select t1.id
              from yourtable t1
              where t1.date = t.date
              order by t1.hours desc
              limit 1);
In Oracle you can use fetch first 1 row only in the subquery instead of the LIMIT clause.
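The correlated-subquery version can be verified the same way with Python's sqlite3 (the table is called yourtable here, since "table" itself is a reserved word in most databases):

```python
import sqlite3

# Recreate the sample table.
conn = sqlite3.connect(":memory:")
conn.execute("create table yourtable (id int, date text, hours int)")
conn.executemany("insert into yourtable values (?, ?, ?)",
                 [(1, "2012-01-01", 37), (2, "2012-01-01", 10),
                  (3, "2012-01-01", 5), (4, "2012-01-02", 37)])

# For each row, keep it only if its id is the id of the row with the
# most hours on the same date.
rows = conn.execute("""
    select t.*
    from yourtable t
    where t.id = (select t1.id
                  from yourtable t1
                  where t1.date = t.date
                  order by t1.hours desc
                  limit 1)
    order by t.date
""").fetchall()
print(rows)  # -> [(1, '2012-01-01', 37), (4, '2012-01-02', 37)]
```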
I want to update the sequence column below to be an IDENTITY column in future, and the current rows must be updated to be ordered by update_time ascending.
How do I do this in Sybase? Simplified example of what I have below.
Current table:
SEQUENCE | UPDATE_TIME | DATA
null | 2016-01-01 | x
null | 2013-01-01 | y
null | 2015-01-01 | z
Desired table:
SEQUENCE | UPDATE_TIME | DATA
3 | 2016-01-01 | x
1 | 2013-01-01 | y
2 | 2015-01-01 | z
I did this by joining the table onto itself, but with one additional ID column. This additional column is created with the ROW_NUMBER function, ordering by update_time ascending. Something like...
UPDATE myTable
SET update_seq = tmp.ID
FROM myTable a
INNER JOIN (
    SELECT update_time, data,
           ROW_NUMBER() OVER (ORDER BY update_time ASC) AS ID
    FROM myTable
) tmp
    ON a.update_time = tmp.update_time
   AND a.data = tmp.data
On Sybase ASA you can number the column with this update:
update [table_name]
set [SEQUENCE]=number(*)
order by [UPDATE_TIME]
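number(*) is Sybase ASA-specific, but the same backfill can be sketched portably with a correlated count, checked here with Python's sqlite3 (this assumes update_time values are unique, as in the example):

```python
import sqlite3

# Recreate the table with a null sequence column.
conn = sqlite3.connect(":memory:")
conn.execute("create table myTable (update_seq int, update_time text, data text)")
conn.executemany("insert into myTable values (null, ?, ?)",
                 [("2016-01-01", "x"), ("2013-01-01", "y"), ("2015-01-01", "z")])

# Each row's sequence = number of rows with update_time <= its own,
# i.e. its 1-based rank by update_time ascending.
conn.execute("""
    update myTable
    set update_seq = (select count(*)
                      from myTable m2
                      where m2.update_time <= myTable.update_time)
""")
rows = conn.execute(
    "select update_seq, update_time, data from myTable order by update_time"
).fetchall()
print(rows)  # -> [(1, '2013-01-01', 'y'), (2, '2015-01-01', 'z'), (3, '2016-01-01', 'x')]
```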
I have a table with a huge amount of data with this structure (simplified):
+--------+-------------------------+-------+
| id | datetime | type |
+--------+-------------------------+-------+
| 1 | 2015-08-13 17:50:41 | 1 |
| 2 | 2015-08-13 17:50:45 | 0 |
| 3 | 2015-08-14 17:50:56 | 0 |
| 4 | 2015-08-14 17:50:59 | 0 |
+--------+-------------------------+-------+
Rows with type=1 are followed by lots of rows with type=0.
I need to do an intelligent clean-up:
I want to keep the rows with type=0 that follow a row with type=1, but only for one hour (after the type=1 row's timestamp),
and keep at least one row with type=0 per hour.
I don't know if it's possible to do that with a query, or if I will have to loop through all rows with a script.
I use PostgreSQL
I don't have Postgres here to test, but this should return all of the data you want to keep:
SELECT id
FROM (
    SELECT id,
           datetime,
           type,
           LAG(type) OVER (ORDER BY id ASC) AS prev_type,
           LAG(datetime) OVER (ORDER BY id ASC) AS prev_date
    FROM employees
) x
WHERE type = 0
  AND prev_type = 1
  AND EXTRACT(EPOCH FROM (datetime - prev_date)) < 3601
UNION
SELECT MAX(id)
FROM employees
GROUP BY TO_CHAR(datetime, 'DDMMYYYYHH24');
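A rough check of the LAG-based idea with Python's sqlite3, keeping the answer's employees table name (EXTRACT(EPOCH FROM ...) becomes a strftime('%s', ...) subtraction and TO_CHAR becomes strftime; the four sample rows come from the question):

```python
import sqlite3

# Recreate the sample rows from the question.
conn = sqlite3.connect(":memory:")
conn.execute("create table employees (id int, datetime text, type int)")
conn.executemany("insert into employees values (?, ?, ?)", [
    (1, "2015-08-13 17:50:41", 1),
    (2, "2015-08-13 17:50:45", 0),
    (3, "2015-08-14 17:50:56", 0),
    (4, "2015-08-14 17:50:59", 0),
])

# Keep type=0 rows whose immediately preceding row is type=1 and within
# one hour, plus the last row of each calendar hour.
rows = conn.execute("""
    select id from (
        select id, datetime, type,
               lag(type) over (order by id) as prev_type,
               lag(datetime) over (order by id) as prev_date
        from employees
    ) x
    where type = 0
      and prev_type = 1
      and strftime('%s', datetime) - strftime('%s', prev_date) < 3601
    union
    select max(id) from employees group by strftime('%d%m%Y%H', datetime)
""").fetchall()
print(sorted(r[0] for r in rows))  # -> [2, 4]
```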