SQL: Select a row from a table with an additional column containing the next value of the column

Let's assume I have a table "TABLE_A" in an Oracle database:
=======================
| id | key | date |
=======================
| 0 | 1 | 1.1.2020 |
| 1 | 1 | 1.1.2021 |
=======================
I want to get a result like this:
===================================
| id | key | date | next_date |
===================================
| 0 | 1 | 1.1.2020 | 1.1.2021 |
===================================
Note that I want the row with a certain key <ID> on a certain date <DATE>, plus another column that contains the next date in the table with the same key. If there is no later date, it should still return the same row, but with next_date empty.
Is there a simpler / better / more readable version than this?
SELECT a.*, next_date
FROM TABLE_A a,
(SELECT key, date as next_date
FROM TABLE_A
WHERE key = <ID>
AND date > <DATE>
AND ROWNUM <= 1
ORDER BY next_date asc) a2
WHERE key = <ID>
AND date = <DATE>
AND a2.key(+) = a.key

Although lead() is what you are describing, I think that a correlated subquery might be fastest:
select t.*,
(select min(t2.date)
from t t2
where t2.key = t.key and t2.date > t.date
) as next_date
from t;
(You can add a filter for a particular key.)
In particular, this makes very efficient use of an index on (key, date).
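To check that the correlated subquery and LEAD() return the same thing, including the empty next_date on the last row of each key, here is a small sketch that runs both against an in-memory SQLite database from Python. SQLite is only standing in for Oracle here, the ISO date strings and the extra row for key 2 are assumptions added for illustration, and the sketch says nothing about which form is faster on Oracle.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table_a (id INTEGER, "key" INTEGER, "date" TEXT);
    INSERT INTO table_a VALUES (0, 1, '2020-01-01');
    INSERT INTO table_a VALUES (1, 1, '2021-01-01');
    -- assumed extra row: a key that has only one date
    INSERT INTO table_a VALUES (2, 2, '2020-06-01');
""")

# Correlated subquery: smallest later date for the same key, NULL if none.
correlated_sql = """
    SELECT t.id, t."key", t."date",
           (SELECT MIN(t2."date")
              FROM table_a t2
             WHERE t2."key" = t."key"
               AND t2."date" > t."date") AS next_date
      FROM table_a t
     ORDER BY t.id
"""

# Window function: the date of the following row within the same key.
lead_sql = """
    SELECT t.id, t."key", t."date",
           LEAD(t."date") OVER (PARTITION BY t."key" ORDER BY t."date") AS next_date
      FROM table_a t
     ORDER BY t.id
"""

rows_correlated = conn.execute(correlated_sql).fetchall()
rows_lead = conn.execute(lead_sql).fetchall()

for row in rows_correlated:
    print(row)
```

The two forms agree on this data. They can differ when a key has two rows with the same date: LEAD() then returns that same date for the first of the two rows, while the MIN(...) subquery skips to the next strictly later date.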

I would use ROW_NUMBER here along with pivoting logic:
WITH cte AS (
SELECT a.*, ROW_NUMBER() OVER (PARTITION BY "key" ORDER BY "date") rn
FROM TABLE_A a
)
SELECT
MIN(id) AS id,
"key",
MAX(CASE WHEN rn = 1 THEN "date" END) AS "date",
MAX(CASE WHEN rn = 2 THEN "date" END) AS next_date
FROM cte
GROUP BY
"key";

Use LEAD() window function:
SELECT t.*
FROM (
SELECT t.*, LEAD(t."date") OVER (PARTITION BY t."key" ORDER BY t."date") NEXT_DATE
FROM tablename t
WHERE t."key" = 1 -- remove this line if you want results for all the keys
) t
WHERE t."date" = date '2020-01-01'
Results:
> id | key | date | NEXT_DATE
> -: | --: | :-------- | :--------
> 0 | 1 | 01-JAN-20 | 01-JAN-21

Yet another option is to use the self join as follows:
SELECT T1.ID, T1.KEY, T1.DATE,
MIN(T2.DATE) AS NEXT_DATE
FROM TABLE_A T1 LEFT JOIN TABLE_A T2
ON T1.KEY = T2.KEY AND T2.DATE > T1.DATE
WHERE T1.KEY = <KEY>
AND T1.DATE = <DATE>
GROUP BY T1.ID, T1.KEY, T1.DATE;

If each key value is repeated only once (that is, it appears in at most two rows), then a simple aggregation is enough:
SELECT MIN(ID) AS ID, key, MIN("date") AS "date", MAX("date") AS next_date
FROM TABLE_A
GROUP BY key
Otherwise (too many repetitions exist for the key value concerned and the next date strictly matters), use the LAG() function along with ROW_NUMBER() ordered descending by the "date" column in order to pick the first returned row, such as:
WITH A AS
(
SELECT MIN(ID) OVER (PARTITION BY key) AS ID,
key,
MIN("date") OVER (PARTITION BY key) AS "date",
LAG("date") OVER (PARTITION BY key ORDER BY "date") AS next_date,
ROW_NUMBER() OVER (PARTITION BY key ORDER BY "date" DESC) AS rn
FROM TABLE_A
)
SELECT id, key, "date", next_date
FROM A
WHERE rn = 1

Related

SELECT SQL Matching Number

I have millions of rows of data that have similar values like this:
Id Reff Amount
1 a1 1000
2 a2 -1000
3 a3 -2500
4 a4 -1500
5 a5 1500
Every record should have both a positive and a negative counterpart. The question is: how do I show only the records that don't have a matching opposite value, like the row with Id 3? Thanks for the help.
You can use not exists:
select t.*
from mytable t
where not exists (select 1 from mytable t1 where t1.amount = -1 * t.amount)
An anti-join written as a left join would also get the job done:
select t.*
from mytable t
left join mytable t1 on t1.amount = -1 * t.amount
where t1.id is null
Demo on DB Fiddle:
Id | Reff | Amount
-: | :--- | -----:
3 | a3 | -2500
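For anyone who wants to try the NOT EXISTS and left-join forms side by side, here is a minimal sketch using an in-memory SQLite database from Python. SQLite is an assumption (the question does not name a database), and row 6 is an invented extra row, a positive amount with no negative partner, added to show that the comparison has to be against the negated amount.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE mytable (id INTEGER, reff TEXT, amount INTEGER);
    INSERT INTO mytable VALUES
        (1, 'a1', 1000), (2, 'a2', -1000), (3, 'a3', -2500),
        (4, 'a4', -1500), (5, 'a5', 1500),
        (6, 'a6', 700);  -- assumed extra row, not in the question
""")

# Keep a row only if no row carries the opposite amount.
not_exists_rows = conn.execute("""
    SELECT t.id, t.reff, t.amount
      FROM mytable t
     WHERE NOT EXISTS (SELECT 1 FROM mytable t1 WHERE t1.amount = -t.amount)
     ORDER BY t.id
""").fetchall()

# Same result as an anti-join: left join, then keep the rows with no match.
anti_join_rows = conn.execute("""
    SELECT t.id, t.reff, t.amount
      FROM mytable t
      LEFT JOIN mytable t1 ON t1.amount = -t.amount
     WHERE t1.id IS NULL
     ORDER BY t.id
""").fetchall()

print(not_exists_rows)
```

Both queries return rows 3 and 6 here. Note that this only checks whether an opposite amount exists at all; it does not pair rows one to one, so two rows of 1000 and a single row of -1000 would all count as matched.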
SQL Fiddle
MS SQL Server 2017 Schema Setup:
CREATE TABLE Test(
Id int
,Reff varchar(2)
,Amount int
);
INSERT INTO Test(Id,Reff,Amount) VALUES (1,'a1',1000);
INSERT INTO Test(Id,Reff,Amount) VALUES (2,'a2',-1000);
INSERT INTO Test(Id,Reff,Amount) VALUES (3,'a3',-2500);
INSERT INTO Test(Id,Reff,Amount) VALUES (4,'a4',-1500);
INSERT INTO Test(Id,Reff,Amount) VALUES (5,'a5',1500);
Query 1:
select t.*
from Test t
left join Test t1 on t1.amount = -t.amount
where t1.id is null
Results:
| Id | Reff | Amount |
|----|------|--------|
| 3 | a3 | -2500 |
Using NOT EXISTS or a LEFT JOIN works fine for finding the amounts that have no opposite amount anywhere in the data.
But what if you really need the amounts that don't balance out when the rows are taken in Id order?
That kind of SQL puzzle can be handled as a gaps-and-islands problem.
The solution may look a bit more complicated, but it is actually quite simple.
It first calculates a ranking per absolute value.
Then, based on that ranking, it keeps the last row of each group whose SUM per ranking isn't balanced out (is not 0):
SELECT Id, Reff, Amount
FROM
(
SELECT *,
SUM(Amount) OVER (PARTITION BY Rnk) AS SumAmountByRank,
ROW_NUMBER() OVER (PARTITION BY Rnk ORDER BY Id DESC) AS Rn
FROM
(
SELECT Id, Reff, Amount,
ROW_NUMBER() OVER (ORDER BY Id) - ROW_NUMBER() OVER (PARTITION BY ABS(Amount) ORDER BY Id) AS Rnk
FROM YourTable
) AS q1
) AS q2
WHERE SumAmountByRank != 0
AND Rn = 1
ORDER BY Id;
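To see what the Rnk column actually contains, the sketch below runs the inner query and then the full query on the question's five rows. It uses an in-memory SQLite database from Python, which is an assumption; the answer was written for SQL Server, and SQLite needs version 3.25 or later for window functions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE YourTable (Id INTEGER, Reff TEXT, Amount INTEGER);
    INSERT INTO YourTable VALUES
        (1, 'a1', 1000), (2, 'a2', -1000), (3, 'a3', -2500),
        (4, 'a4', -1500), (5, 'a5', 1500);
""")

# Step 1: the difference between the two row numbers stays constant
# within a run of consecutive rows that share the same absolute amount.
islands = conn.execute("""
    SELECT Id, Amount,
           ROW_NUMBER() OVER (ORDER BY Id)
         - ROW_NUMBER() OVER (PARTITION BY ABS(Amount) ORDER BY Id) AS Rnk
      FROM YourTable
     ORDER BY Id
""").fetchall()

# Step 2: sum each island and keep the last row of every island
# whose amounts do not cancel out.
unbalanced = conn.execute("""
    SELECT Id, Reff, Amount
      FROM (SELECT q1.*,
                   SUM(Amount) OVER (PARTITION BY Rnk) AS SumAmountByRank,
                   ROW_NUMBER() OVER (PARTITION BY Rnk ORDER BY Id DESC) AS Rn
              FROM (SELECT Id, Reff, Amount,
                           ROW_NUMBER() OVER (ORDER BY Id)
                         - ROW_NUMBER() OVER (PARTITION BY ABS(Amount) ORDER BY Id) AS Rnk
                      FROM YourTable) AS q1) AS q2
     WHERE SumAmountByRank != 0
       AND Rn = 1
     ORDER BY Id
""").fetchall()

for row in islands:
    print(row)
print(unbalanced)
```

On this data the islands are Rnk 0 (ids 1 and 2), Rnk 2 (id 3) and Rnk 3 (ids 4 and 5), and only the island containing id 3 has a non-zero sum.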
If the sequence doesn't matter and only the overall balance per absolute amount does, the query can be simplified:
SELECT Id, Reff, Amount
FROM
(
SELECT Id, Reff, Amount,
SUM(Amount) OVER (PARTITION BY ABS(Amount)) AS SumByAbsAmount,
ROW_NUMBER() OVER (PARTITION BY ABS(Amount) ORDER BY Id DESC) AS Rn
FROM YourTable
) AS q
WHERE SumByAbsAmount != 0
AND Rn = 1
ORDER BY Id;

First value in DATE minus 30 days SQL

I have a bunch of data from which I'm showing the ID, the max date and its corresponding values (user id, type, ...). Then I need to take the MAX date for each ID, subtract 30 days, and show the first date within that period along with its corresponding values.
Example:
ID Date Name
1 01.05.2018 AAA
1 21.04.2018 CCC
1 05.04.2018 BBB
1 28.03.2018 AAA
expected:
ID max_date max_name previous_date previous_name
1 01.05.2018 AAA 05.04.2018 BBB
I have a working solution using subselects, but because the WHERE part is quite large, the refresh takes ages.
The subselect looks like this:
(SELECT MIN(N.name)
FROM t1 N
WHERE N.ID = T.ID
AND (N.date < MAX(T.date) AND N.date >= (MAX(T.date)-30))
AND (...)) AS PreviousName
How'd you write the select?
I'm using TSQL
Thanks
I can do this with 2 CTEs to build up the dates and names.
SQL Fiddle
MS SQL Server 2017 Schema Setup:
CREATE TABLE t1 (ID int, theDate date, theName varchar(10)) ;
INSERT INTO t1 (ID, theDate, theName)
VALUES
( 1,'2018-05-01','AAA' )
, ( 1,'2018-04-21','CCC' )
, ( 1,'2018-04-05','BBB' )
, ( 1,'2018-03-27','AAA' )
, ( 2,'2018-05-02','AAA' )
, ( 2,'2018-05-21','CCC' )
, ( 2,'2018-03-03','BBB' )
, ( 2,'2018-01-20','AAA' )
;
Main Query:
;WITH cte1 AS (
SELECT t1.ID, t1.theDate, t1.theName
, DATEADD(day,-30,t1.theDate) AS dMinus30
, ROW_NUMBER() OVER (PARTITION BY t1.ID ORDER BY t1.theDate DESC) AS rn
FROM t1
)
, cte2 AS (
SELECT c2.ID, c2.theDate, c2.theName
, ROW_NUMBER() OVER (PARTITION BY c2.ID ORDER BY c2.theDate) AS rn
, COUNT(*) OVER (PARTITION BY c2.ID) AS theCount
FROM cte1
INNER JOIN cte1 c2 ON cte1.ID = c2.ID
AND c2.theDate >= cte1.dMinus30
WHERE cte1.rn = 1
GROUP BY c2.ID, c2.theDate, c2.theName
)
SELECT cte1.ID, cte1.theDate AS max_date, cte1.theName AS max_name
, cte2.theDate AS previous_date, cte2.theName AS previous_name
, cte2.theCount
FROM cte1
INNER JOIN cte2 ON cte1.ID = cte2.ID
AND cte2.rn=1
WHERE cte1.rn = 1
Results:
| ID | max_date | max_name | previous_date | previous_name |
|----|------------|----------|---------------|---------------|
| 1 | 2018-05-01 | AAA | 2018-04-05 | BBB |
| 2 | 2018-05-21 | CCC | 2018-05-02 | AAA |
cte1 adds two things to every row: the date 30 days earlier (dMinus30) and a ROW_NUMBER() that ranks each ID's rows from the most recent date down, so rn = 1 marks the max_date row. cte2 joins that rn = 1 row back to cte1 to collect all of the ID's rows dated on or after dMinus30, then numbers them in ascending date order so that rn = 1 is the earliest date in the window. The outer query joins those two results together, taking only the most recent row from cte1 and the earliest row from cte2.
I'm not sure how well it will scale with your data, but the CTEs should optimize reasonably well; check the execution plan to be sure.
EDIT: For the additional requirement, I just added in another COUNT() window function to cte2.
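For readers who want to experiment with the "max date, then earliest date within the 30 days before it" logic, here is a rough sketch of the same idea on the sample data above, run against an in-memory SQLite database from Python. It is not the T-SQL from the answer: SQLite's date(x, '-30 days') stands in for DATEADD, the CTE names latest and in_window are made up for this sketch, and dates are stored as ISO text so that string comparison orders them correctly.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE t1 (ID INTEGER, theDate TEXT, theName TEXT);
    INSERT INTO t1 (ID, theDate, theName) VALUES
        (1, '2018-05-01', 'AAA'), (1, '2018-04-21', 'CCC'),
        (1, '2018-04-05', 'BBB'), (1, '2018-03-27', 'AAA'),
        (2, '2018-05-02', 'AAA'), (2, '2018-05-21', 'CCC'),
        (2, '2018-03-03', 'BBB'), (2, '2018-01-20', 'AAA');
""")

rows = conn.execute("""
    WITH latest AS (               -- most recent date per ID
        SELECT ID, MAX(theDate) AS maxDate
          FROM t1
         GROUP BY ID
    ),
    in_window AS (                 -- rows no older than 30 days before that date
        SELECT t1.ID, t1.theDate, t1.theName,
               ROW_NUMBER() OVER (PARTITION BY t1.ID ORDER BY t1.theDate)      AS rnFirst,
               ROW_NUMBER() OVER (PARTITION BY t1.ID ORDER BY t1.theDate DESC) AS rnLast
          FROM t1
          JOIN latest l ON l.ID = t1.ID
         WHERE t1.theDate >= date(l.maxDate, '-30 days')
    )
    SELECT ID,
           MAX(CASE WHEN rnLast  = 1 THEN theDate END) AS max_date,
           MAX(CASE WHEN rnLast  = 1 THEN theName END) AS max_name,
           MAX(CASE WHEN rnFirst = 1 THEN theDate END) AS previous_date,
           MAX(CASE WHEN rnFirst = 1 THEN theName END) AS previous_name
      FROM in_window
     GROUP BY ID
     ORDER BY ID
""").fetchall()

for row in rows:
    print(row)
```

As in the answer's query, the window includes the max-date row itself, so an ID with no other row in the last 30 days would report its max date as the previous date too.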
I would do:
select id,
max(case when seqnum = 1 then date end) as max_date,
max(case when seqnum = 1 then name end) as max_name,
max(case when seqnum = 2 then date end) as prev_date,
max(case when seqnum = 2 then name end) as prev_name
from (select e.*, row_number() over (partition by id order by date desc) as seqnum
from example e
) e
group by id;

Limit MAX() result to one row based on highest value in a particular field

Of course my data set is more complex, but this is essentially what I have:
+--------+--------+-------+
| SEQ_NO | FILTER | VALUE |
+--------+--------+-------+
| 1 | 'A' | 5 |
| 2 | 'A' | 10 |
| 3 | 'A' | 15 |
+--------+--------+-------+
Here is my query:
SELECT MAX(SEQ_NO)
, FILTER
, VALUE
FROM TABLE
GROUP BY FILTER
, VALUE
This returns my entire data set. How can I alter my query so that it only returns the record with the highest SEQ_NO ?
SELECT t1.*
FROM Table AS t1
INNER JOIN
(
SELECT MAX(SEQ_NO) MAXSeq
, FILTER
, VALUE
FROM TABLE
GROUP BY FILTER
, VALUE
) t2 ON t1.SEQ_NO = t2.MAXSeq
AND t1.FILTER = t2.FILTER
AND t1.VALUE = t2.VALUE
Or using row_number:
SELECT *
FROM
(
SELECT *,
row_number() over(partition by FILTER, VALUE
order by SEQ_NO desc) as rn
FROM table
) t
WHERE rn = 1
From Oracle 12C:
SELECT SEQ_NO
, FILTER
, VALUE
FROM TABLE
ORDER BY SEQ_NO DESC
FETCH FIRST 1 ROWS ONLY;
You can use ROWNUM in Oracle (note that Oracle does not accept AS before a table alias):
select *
from
( select *
from yourTable
order by SEQ_NO desc ) t
where ROWNUM = 1;
This should work (TOP is SQL Server syntax):
SELECT TOP 1 *
FROM TABLE
ORDER BY SEQ_NO DESC
If I understand correctly, you want the top SEQ_NO per filter?
I created this in SQL Server and converted it to Oracle:
SELECT a.SEQ_NO,
a.FILTER,
a.VALUE
FROM (
SELECT SEQ_NO,
FILTER,
VALUE,
MAX(SEQ_NO) OVER (PARTITION BY FILTER) m
FROM TABLE
) a
WHERE SEQ_NO = m
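The difference between "the single highest SEQ_NO" and "the highest SEQ_NO per FILTER" only shows up with more than one FILTER value, so here is a small sketch with two assumed extra rows for a second filter 'B'. It runs on an in-memory SQLite database from Python; the table name my_table and the column names filter_key and val are renamed stand-ins chosen for this sketch, not names from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE my_table (seq_no INTEGER, filter_key TEXT, val INTEGER);
    INSERT INTO my_table VALUES
        (1, 'A', 5), (2, 'A', 10), (3, 'A', 15),
        (4, 'B', 20), (5, 'B', 25);  -- the 'B' rows are assumed extras
""")

# Highest seq_no within each filter_key: rank per group, keep rank 1.
top_per_filter = conn.execute("""
    SELECT seq_no, filter_key, val
      FROM (SELECT seq_no, filter_key, val,
                   ROW_NUMBER() OVER (PARTITION BY filter_key
                                      ORDER BY seq_no DESC) AS rn
              FROM my_table) ranked
     WHERE rn = 1
     ORDER BY filter_key
""").fetchall()

# Highest seq_no in the whole table: sort and take one row.
top_overall = conn.execute("""
    SELECT seq_no, filter_key, val
      FROM my_table
     ORDER BY seq_no DESC
     LIMIT 1
""").fetchall()

print(top_per_filter)
print(top_overall)
```

With only the question's three 'A' rows both queries return the same single row, which is probably why the answers here disagree about which one was meant.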
Using MySQL:
SELECT SEQ_NO
, VALUE
, FILTER
FROM TABLE
Order by SEQ_NO DESC LIMIT 1

Getting the First and Last Row Using ROW_NUMBER and PARTITION BY

Sample Input
Name | Value | Timestamp
-----|-------|-----------------
One | 1 | 2016-01-01 02:00
Two | 3 | 2016-01-01 03:00
One | 2 | 2016-01-02 02:00
Two | 4 | 2016-01-03 04:00
Desired Output
Name | Value | EarliestTimestamp | LatestTimestamp
-----|-------|-------------------|-----------------
One | 2 | 2016-01-01 02:00 | 2016-01-02 02:00
Two | 4 | 2016-01-01 03:00 | 2016-01-03 04:00
Attempted Query
I am trying to use ROW_NUMBER() and PARTITION BY to get the latest Name and Value but I would also like the earliest and latest Timestamp value:
SELECT
t.Name,
t.Value,
t.????????? AS EarliestTimestamp,
t.Timestamp AS LatestTimestamp
FROM
(SELECT
ROW_NUMBER() OVER (PARTITION BY Name ORDER BY TIMESTAMP DESC) AS RowNumber,
Name,
Value
Timestamp) t
WHERE t.RowNumber = 1
This can be done using window functions min and max.
select distinct name,
min(timestamp) over(partition by name), max(timestamp) over(partition by name)
from tablename
Edit: Based on the comments
select t.name,t.value,t1.earliest,t1.latest
from t
join (select distinct name,
min(tm) over(partition by name) earliest, max(tm) over(partition by name) latest
from t) t1 on t1.name = t.name and t1.latest = t.tm
Edit: Another approach is using the first_value window function, which would eliminate the need for a sub-query and join.
select distinct
name,
first_value(value) over(partition by name order by tm desc) as latest_value,
min(tm) over(partition by name) earliest,
-- or first_value can be used
-- first_value(tm) over(partition by name order by tm)
max(tm) over(partition by name) latest
-- or first_value can be used
-- first_value(tm) over(partition by name order by tm desc)
from t
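A quick way to try the first_value variant on the sample input is the sketch below, which uses an in-memory SQLite database from Python. SQLite is an assumption (the question does not say which database is in use), the column val is a renamed stand-in for Value, and timestamps are stored as text in a sortable format.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE t (name TEXT, val INTEGER, tm TEXT);
    INSERT INTO t VALUES
        ('One', 1, '2016-01-01 02:00'),
        ('Two', 3, '2016-01-01 03:00'),
        ('One', 2, '2016-01-02 02:00'),
        ('Two', 4, '2016-01-03 04:00');
""")

# Every row of a name gets the same three window values,
# so DISTINCT collapses them to one row per name.
rows = conn.execute("""
    SELECT DISTINCT
           name,
           first_value(val) OVER (PARTITION BY name ORDER BY tm DESC) AS latest_value,
           min(tm) OVER (PARTITION BY name) AS earliest,
           max(tm) OVER (PARTITION BY name) AS latest
      FROM t
     ORDER BY name
""").fetchall()

for row in rows:
    print(row)
```

first_value is ordered by tm DESC here on purpose: with the default window frame, the first row of the partition is always visible, whereas last_value with an ascending order would need an explicit frame to see the final row.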
If I'm understanding your question correctly, here's one option using the row_number function twice. Then to get them on the same row, you can use conditional aggregation.
This should be close:
SELECT
t.Name,
t.Value,
max(case when t.minrn = 1 then t.timestamp end) AS EarliestTimestamp,
max(case when t.maxrn = 1 then t.timestamp end) AS LatestTimestamp
FROM
(SELECT
ROW_NUMBER() OVER (PARTITION BY Name ORDER BY TIMESTAMP) as minrn,
ROW_NUMBER() OVER (PARTITION BY Name ORDER BY TIMESTAMP DESC) as maxrn,
Name,
Value,
Timestamp
FROM YourTable) t
WHERE t.minrn = 1 or t.maxrn = 1
GROUP BY t.Name, t.Value
Use MIN(Timestamp) OVER (PARTITION BY Name) in addition to the ROW_NUMBER() column, like so:
SELECT
t.Name,
t.Value,
t.EarliestTimestamp AS EarliestTimestamp,
t.Timestamp AS LatestTimestamp
FROM
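(SELECT
ROW_NUMBER() OVER (PARTITION BY Name ORDER BY Timestamp DESC) AS RowNumber,
MIN(Timestamp) OVER (PARTITION BY Name) AS EarliestTimestamp, -- the added column
Name,
Value,
Timestamp
FROM YourTable) t -- YourTable is a placeholder; the question's query omits the FROM clause
WHERE t.RowNumber = 1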
You can use MIN and MAX functions + OUTER APPLY:
SELECT t.Name,
p.[Value],
MIN(t.[Timestamp]) as EarliestTimestamp ,
MAX(t.[Timestamp]) as LatestTimestamp
FROM Table1 t
OUTER APPLY (SELECT TOP 1 * FROM Table1 WHERE t.Name = Name ORDER BY [Timestamp] DESC) p
GROUP BY t.Name, p.[Value]
Output:
Name Value EarliestTimestamp LatestTimestamp
One 2 2016-01-01 02:00 2016-01-02 02:00
Two 4 2016-01-01 03:00 2016-01-03 04:00
If I understood your question, use the row_number() function as follows:
SELECT
t.Name,
t.Value,
t.EarliestTimestamp,
t.Timestamp AS LatestTimestamp
FROM
(SELECT ROW_NUMBER() OVER (PARTITION BY Name ORDER BY Timestamp DESC) AS RowNumber,
MIN(Timestamp) OVER (PARTITION BY Name) AS EarliestTimestamp, -- must be computed before the RowNumber filter
Name,
Value,
Timestamp
FROM YourTable) t -- YourTable is a placeholder table name
WHERE t.RowNumber = 1
Think simple.
select
t.Name,
MAX(t.Value),
MIN(t.Timestamp),
MAX(t.Timestamp)
FROM
t
group by
t.Name

Comparing row values in oracle

I have Table1 with three columns:
Key | Date | Price
----------------------
1 | 26-May | 2
1 | 25-May | 2
1 | 24-May | 2
1 | 23 May | 3
1 | 22 May | 4
2 | 26-May | 2
2 | 25-May | 2
2 | 24-May | 2
2 | 23 May | 3
2 | 22 May | 4
I want to select the row where value 2 was last updated (24-May). The Date was sorted using RANK function.
I am not able to get the desired results. Any help will be appreciated.
SELECT *
FROM (SELECT key, DATE, price,
RANK() over (partition BY key order by DATE DESC) AS r2
FROM Table1 ORDER BY DATE DESC) temp;
Another way of looking at the problem is that you want to find the most recent record with a price different from the last price. Then you want the next record.
with lastprice as (
select t.*
from (select t.*
from table1 t
order by date desc
) t
where rownum = 1
)
select t.*
from (select t.*
from table1 t
where date > (select max(date)
from table1 t2
where t2.price <> (select price from lastprice)
)
order by date asc
) t
where rownum = 1;
This query looks complicated. But, it is structured so it can take advantage of indexes on table1(date). The subqueries are necessary in Oracle pre-12. In the most recent version, you can use fetch first 1 row only.
EDIT:
Another solution is to use lag() and find the most recent time when the value changed:
select t1.*
from (select t1.*
from (select t1.*,
lag(price) over (order by date) as prev_price
from table1 t1
) t1
where prev_price is null or prev_price <> price
order by date desc
) t1
where rownum = 1;
Under many circumstances, I would expect the first version to have better performance, because the only heavy work is done in the innermost subquery to get the max(date). This version has to calculate the lag() as well as doing the order by. However, if performance is an issue, you should test on your data in your environment.
EDIT II:
My best guess is that you want this per key. Your original question says nothing about key, but:
select t1.*
from (select t1.*,
row_number() over (partition by key order by date desc) as seqnum
from (select t1.*,
lag(price) over (partition by key order by date) as prev_price
from table1 t1
) t1
where prev_price is null or prev_price <> price
order by date desc
) t1
where seqnum = 1;
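To make the lag() idea concrete, here is a sketch of the per-key version on the question's data, run against an in-memory SQLite database from Python rather than Oracle. The year 2024 is an assumption (the question gives only day and month), and the derived-table aliases changes and ranked are names made up for this sketch.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute('CREATE TABLE table1 ("key" INTEGER, "date" TEXT, price INTEGER)')

# Same five prices for both keys, as in the question (year is assumed).
prices = [("2024-05-22", 4), ("2024-05-23", 3), ("2024-05-24", 2),
          ("2024-05-25", 2), ("2024-05-26", 2)]
conn.executemany(
    "INSERT INTO table1 VALUES (?, ?, ?)",
    [(k, d, p) for k in (1, 2) for d, p in prices],
)

rows = conn.execute("""
    SELECT "key", "date", price
      FROM (SELECT changes.*,
                   ROW_NUMBER() OVER (PARTITION BY "key"
                                      ORDER BY "date" DESC) AS seqnum
              FROM (SELECT t1.*,
                           LAG(price) OVER (PARTITION BY "key"
                                            ORDER BY "date") AS prev_price
                      FROM table1 t1) AS changes
             -- keep only the rows where the price differs from the previous row
             WHERE prev_price IS NULL OR prev_price <> price) AS ranked
     WHERE seqnum = 1          -- the most recent change per key
     ORDER BY "key"
""").fetchall()

for row in rows:
    print(row)
```

The inner WHERE keeps 22, 23 and 24 May for each key (the rows where the price changed), and the row number then picks 24 May as the latest of those.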
You can try this:
SELECT Date FROM Table1
WHERE Price = 2
AND PrimaryKey = (SELECT MAX(PrimaryKey) FROM Table1
WHERE Price = 2)
This is very similar to the second option by Gordon Linoff but introduces a second windowed function row_number() to locate the most recent row that changed the price. This will work for all or a range of keys.
select
*
from (
select
t2.*
, row_number() over(partition by Key order by date DESC) rn
from (
select
t1.*
, NVL(lag(Price) over(partition by Key order by date),0) prevPrice
from table1 t1
where Key IN (1,2,3,4,5) -- as an example
) t2
where Price <> prevPrice
)
where rn = 1
apologies but I haven't been able to test this at all.