Select distinct rows with max date with repeated and null values (Oracle) - sql

I have 3 tables: Root, Detail and Revision.
I need to select the distinct codes from Root with the highest revision date, taking into account that the revision rows may not exist and/or may have repeated values in the date column.
Root: idRoot, Code
Detail: idDetail, price, idRoot
Revision: idRevision, date, idDetail
So I started with this join query:
select code, price, date from Root r
inner join Detail d on d.idRoot = r.idRoot
left join Revision rv on d.idDetail = rv.idDetail;
This returns results like the following:
CODE PRICE DATE idRevision
---- ----- ----- -----------
C1 100 2/1/2016 1
C1 120 2/1/2016 3
C1 150 null 2
C1 200 1/1/2016 4
C2 300 null null
C3 400 3/1/2016 6
But what I really need is the following result:
CODE PRICE DATE idRevision
---- ----- ----- -----------
C1 120 2/1/2016 3
C2 300 null null
C3 400 3/1/2016 6
I've seen several answers for similar cases, but never with null and repeated values:
Oracle: Taking the record with the max date
Fetch the row which has the Max value for a column
Oracle Select Max Date on Multiple records
Any kind of help would be really appreciated

You can use row_number():
select code, price, date
from (select code, price, date,
             row_number() over (partition by code order by date desc nulls last, idRevision desc) as seqnum
      from Root r inner join
           Detail d
           on d.idRoot = r.idRoot left join
           Revision rv
           on d.idDetail = rv.idDetail
     ) rdr
where seqnum = 1;
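To try the row_number() approach without an Oracle instance, here is a minimal sketch using Python's built-in sqlite3 module, loaded with the sample rows from the question. The snake_case table and column names, the ISO date strings (reading 2/1/2016 as 2016-02-01) and the function name latest_per_code are assumptions made for illustration. Older SQLite versions do not accept NULLS LAST, so the sketch sorts on "rev_date IS NULL" first, which should have the same effect.

```python
import sqlite3

def latest_per_code():
    """Return one row per code: latest revision date, ties broken by id_revision."""
    con = sqlite3.connect(":memory:")  # window functions need SQLite 3.25+
    con.executescript("""
        CREATE TABLE root (id_root INTEGER PRIMARY KEY, code TEXT);
        CREATE TABLE detail (id_detail INTEGER PRIMARY KEY, price INTEGER, id_root INTEGER);
        CREATE TABLE revision (id_revision INTEGER PRIMARY KEY, rev_date TEXT, id_detail INTEGER);
        INSERT INTO root VALUES (1, 'C1'), (2, 'C2'), (3, 'C3');
        INSERT INTO detail VALUES (1, 100, 1), (2, 120, 1), (3, 150, 1),
                                  (4, 200, 1), (5, 300, 2), (6, 400, 3);
        -- Detail 5 (code C2) deliberately has no revision row.
        INSERT INTO revision VALUES (1, '2016-02-01', 1), (3, '2016-02-01', 2),
                                    (2, NULL, 3), (4, '2016-01-01', 4),
                                    (6, '2016-03-01', 6);
    """)
    rows = con.execute("""
        SELECT code, price, rev_date, id_revision
        FROM (SELECT r.code, d.price, rv.rev_date, rv.id_revision,
                     ROW_NUMBER() OVER (
                         PARTITION BY r.code
                         -- non-null dates first, then newest date, then highest id
                         ORDER BY (rv.rev_date IS NULL), rv.rev_date DESC, rv.id_revision DESC
                     ) AS seqnum
              FROM root r
              JOIN detail d ON d.id_root = r.id_root
              LEFT JOIN revision rv ON rv.id_detail = d.id_detail) rdr
        WHERE seqnum = 1
        ORDER BY code
    """).fetchall()
    con.close()
    return rows

if __name__ == "__main__":
    for row in latest_per_code():
        print(row)
```

The tie between revisions 1 and 3 (same date) is resolved by the second sort key, and C2 survives because of the left join.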

Related

How to retrieve historical data based on condition on one row?

I have a table historical_data
ID  Date        column_a  column_b
--  ----------  --------  --------
1   2011-10-01  a         a1
1   2011-11-01  w         w1
1   2011-09-01  a         a1
2   2011-01-12  q         q1
2   2011-02-01  d         d1
3   2011-11-01  s         s1
I need to retrieve the whole history of an id based on the date condition on any 1 row related to that ID.
date>='2011-11-01' should get me
ID  Date        column_a  column_b
--  ----------  --------  --------
1   2011-10-01  a         a1
1   2011-11-01  w         w1
1   2011-09-01  a         a1
3   2011-11-01  s         s1
I am aware you can get this by using a CTE or a subquery like
with selected_id as (
select id from historical_data where date>='2011-11-01'
)
select hd.* from historical_data hd
inner join selected_id si on hd.id = si.id
or
select * from historical_data
where id in (select id from historical_data where date>='2011-11-01')
In both of these methods I have to query/scan the table historical_data twice.
I have indexes on both id and date so it's not a problem right now, but as the table grows this may cause issues.
The table above is a sample table, the table I have is about to touch 1TB in size with upwards of 600M rows.
Is there any way to achieve this by only querying the table once? (I am using Snowflake)
Using QUALIFY:
SELECT *
FROM historical_data
QUALIFY MAX(date) OVER(PARTITION BY id) >= '2011-11-01'::DATE;
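For readers without a Snowflake account, the following is a rough sketch of the same filter using Python's sqlite3 module. SQLite has no QUALIFY clause, so the window function is computed in a derived table and filtered in the outer WHERE; that rewrite is my substitution, not the answer's syntax. The function name ids_with_recent_activity is hypothetical. The sketch only checks that the result matches the expected output; whether the table is scanned once depends on the engine's optimizer.

```python
import sqlite3

def ids_with_recent_activity(cutoff="2011-11-01"):
    """Return the full history of every id whose latest date is >= cutoff."""
    con = sqlite3.connect(":memory:")  # window functions need SQLite 3.25+
    con.execute(
        "CREATE TABLE historical_data (id INTEGER, date TEXT, column_a TEXT, column_b TEXT)"
    )
    con.executemany(
        "INSERT INTO historical_data VALUES (?, ?, ?, ?)",
        [
            (1, "2011-10-01", "a", "a1"),
            (1, "2011-11-01", "w", "w1"),
            (1, "2011-09-01", "a", "a1"),
            (2, "2011-01-12", "q", "q1"),
            (2, "2011-02-01", "d", "d1"),
            (3, "2011-11-01", "s", "s1"),
        ],
    )
    rows = con.execute(
        """
        SELECT id, date, column_a, column_b
        FROM (SELECT hd.*,
                     MAX(date) OVER (PARTITION BY id) AS max_date  -- per-id latest date
              FROM historical_data hd) t
        WHERE max_date >= ?   -- plays the role of QUALIFY
        ORDER BY id, date
        """,
        (cutoff,),
    ).fetchall()
    con.close()
    return rows

if __name__ == "__main__":
    for row in ids_with_recent_activity():
        print(row)
```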

Left outer join returns duplicate records

I have 3 tables that I have to join to get the latest data. The 3 tables are "STUDENT", "MATH" and "ENGLISH".
The STUDENT table contains:
ID NAME CLASS CODE MODIFIED_DATE
-------------------------------------
1 ABC First 1234 01-10-2020
2 EFG Second 3421 01-01-2020
3 XYZ Third 1434 01-01-2020
1 ABC First 9999 01-01-2021
MATH table contain:
ID MSCORE MDATE
----------------
1 80 20-09-2020
2 71 10-12-2020
1 74 04-03-2021
2 90 13-03-2020
ENGLISH table contains:
ID ESCORE EDATE
---------------
1 72 21-04-2021
2 43 19-01-2021
3 60 01-01-2021
3 38 01-05-2021
Result should be:
ID  NAME  CODE  MSCORE  MDATE       ESCORE  EDATE
----------------------------------------------
1   ABC   9999  74      04-03-2021  72      21-04-2021
2   EFG         71      10-12-2020  43      19-01-2021
3   XYZ                             38      01-05-2021
But I get duplicate records for each ID when I use the query below:
select a.ID, a.NAME, a.CODE, b.MSCORE, b.MDATE, c.ESCORE, c.EDATE
from STUDENT a
LEFT OUTER JOIN MATH b ON a.ID = b.ID
LEFT OUTER JOIN ENGLISH c ON a.ID = c.ID;
Could someone tell me the correct query to fetch one record per ID from these tables, based on the latest date in the MATH and ENGLISH tables?
EDIT:
I have added a CODE column to the STUDENT table; when I run the query I should get the latest code for each ID.
If you want the most recent row from each table, use window functions:
select s.*, m.MSCORE, m.MDATE, e.ESCORE, e.EDATE
from (select s.*,
             row_number() over (partition by s.id order by modified_date desc) as seqnum
      from STUDENT s
     ) s LEFT OUTER JOIN
     (select m.*,
             row_number() over (partition by m.id order by m.mdate desc) as seqnum
      from MATH m
     ) m
     on m.ID = s.ID and m.seqnum = 1 LEFT OUTER JOIN
     (select e.*,
             row_number() over (partition by e.id order by e.edate desc) as seqnum
      from ENGLISH e
     ) e
     on e.id = s.id and e.seqnum = 1
where s.seqnum = 1;
Note that I have replaced your meaningless table aliases with abbreviations for the table names. This makes the query much simpler to read and maintain.
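As a quick check of the window-function approach against the question's sample data, here is a small sketch using Python's sqlite3 module. The dates are rewritten as ISO strings (assuming the originals are day-month-year) so that text ordering matches date ordering; the lower-case table names, the inner aliases st/ma/en and the function name latest_scores are assumptions for illustration.

```python
import sqlite3

def latest_scores():
    """Return one row per student with the latest code, math and English scores."""
    con = sqlite3.connect(":memory:")  # window functions need SQLite 3.25+
    con.executescript("""
        CREATE TABLE student (id INTEGER, name TEXT, class TEXT, code TEXT, modified_date TEXT);
        CREATE TABLE math (id INTEGER, mscore INTEGER, mdate TEXT);
        CREATE TABLE english (id INTEGER, escore INTEGER, edate TEXT);
        INSERT INTO student VALUES
            (1, 'ABC', 'First',  '1234', '2020-10-01'),
            (2, 'EFG', 'Second', '3421', '2020-01-01'),
            (3, 'XYZ', 'Third',  '1434', '2020-01-01'),
            (1, 'ABC', 'First',  '9999', '2021-01-01');
        INSERT INTO math VALUES
            (1, 80, '2020-09-20'), (2, 71, '2020-12-10'),
            (1, 74, '2021-03-04'), (2, 90, '2020-03-13');
        INSERT INTO english VALUES
            (1, 72, '2021-04-21'), (2, 43, '2021-01-19'),
            (3, 60, '2021-01-01'), (3, 38, '2021-05-01');
    """)
    rows = con.execute("""
        SELECT s.id, s.name, s.code, m.mscore, m.mdate, e.escore, e.edate
        FROM (SELECT st.*, ROW_NUMBER() OVER (PARTITION BY st.id
                                              ORDER BY st.modified_date DESC) AS seqnum
              FROM student st) s
        LEFT JOIN (SELECT ma.*, ROW_NUMBER() OVER (PARTITION BY ma.id
                                                   ORDER BY ma.mdate DESC) AS seqnum
                   FROM math ma) m
               ON m.id = s.id AND m.seqnum = 1   -- keep only the newest math row
        LEFT JOIN (SELECT en.*, ROW_NUMBER() OVER (PARTITION BY en.id
                                                   ORDER BY en.edate DESC) AS seqnum
                   FROM english en) e
               ON e.id = s.id AND e.seqnum = 1   -- keep only the newest English row
        WHERE s.seqnum = 1                       -- keep only the newest student row
        ORDER BY s.id
    """).fetchall()
    con.close()
    return rows

if __name__ == "__main__":
    for row in latest_scores():
        print(row)
```

Putting the seqnum = 1 conditions for MATH and ENGLISH in the ON clauses rather than in WHERE keeps students without a score (ID 3 has no math row) in the result.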
A second way to do this is to use a correlated sub-query on each table before joining them to pick latest record for each ID:
Select s.id, s.name, s.code, m.mscore, m.MDATE, e.ESCORE, e.EDATE
From
    (Select * from Student s1
     Where modified_date = (Select max(modified_date)
                            From Student s2
                            Where s2.id = s1.id)
    ) s LEFT OUTER JOIN
    (Select * from Math m1
     Where mdate = (Select max(mdate)
                    From Math m2
                    Where m2.id = m1.id)
    ) m ON s.id = m.id LEFT OUTER JOIN
    (Select * from English e1
     Where edate = (Select max(edate)
                    From English e2
                    Where e2.id = e1.id)
    ) e ON s.id = e.id
Also, you should really store your 3 modified dates as date-time values, so that different modifications made on the same day can be told apart. If two records with the same latest date appear in one of your tables, this query fails by bringing back both of them, while Gordon Linoff's answer could return a row that is not the most recent.

SQL Interleave multiple ordered tables

Let's say I have 2 tables with date ordered rows like:
products table:
date        name
----------  ----
09/01/2021  P1
12/01/2021  P2
22/01/2021  P3
and artworks table:
date
name
19/01/2018
A1
27/02/2019
A2
28/02/2021
A3
Is there any way in SQL to design a query that joins the 2 tables by "interleaving" them, taking the first 2 products, then 1 artwork, then the next 2 products, then the next artwork, and so on?
The result would be like:
date        name
----------  ----
09/01/2021  P1
12/01/2021  P2
19/01/2018  A1
22/01/2021  P3
27/02/2019  A2
You can use ROW_NUMBER() to produce interleaving numbering.
For example:
select
  date, name
from (
  select date, name,
    row_number() over(order by date) * 10 as rn
  from products
  union all
  select date, name,
    row_number() over(order by date) * 20 + 1 as rn
  from artworks
) x
order by rn
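The numbering trick works because products receive sort keys 10, 20, 30, ... while artworks receive 21, 41, 61, ..., so each artwork lands just after every second product. Here is a small sketch that runs the same query with Python's sqlite3 module on the question's data; the ISO date strings (assuming day/month/year originals) and the function name interleave are assumptions for illustration.

```python
import sqlite3

def interleave():
    """Return names ordered as two products, one artwork, two products, ..."""
    con = sqlite3.connect(":memory:")  # window functions need SQLite 3.25+
    con.executescript("""
        CREATE TABLE products (date TEXT, name TEXT);
        CREATE TABLE artworks (date TEXT, name TEXT);
        INSERT INTO products VALUES
            ('2021-01-09', 'P1'), ('2021-01-12', 'P2'), ('2021-01-22', 'P3');
        INSERT INTO artworks VALUES
            ('2018-01-19', 'A1'), ('2019-02-27', 'A2'), ('2021-02-28', 'A3');
    """)
    rows = con.execute("""
        SELECT name
        FROM (SELECT date, name,
                     ROW_NUMBER() OVER (ORDER BY date) * 10 AS rn      -- 10, 20, 30, ...
              FROM products
              UNION ALL
              SELECT date, name,
                     ROW_NUMBER() OVER (ORDER BY date) * 20 + 1 AS rn  -- 21, 41, 61, ...
              FROM artworks) x
        ORDER BY rn
    """).fetchall()
    con.close()
    return [name for (name,) in rows]

if __name__ == "__main__":
    print(interleave())
```

Once one table runs out, the remaining rows of the other simply follow in order, which is why A3 appears at the end with this sample data.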

find all rows after the recent update using oracle

I tried the query below to fetch all rows after the last Action = 'UNLOCKED', but it seems ORDER BY is not allowed in a subquery.
SELECT *
FROM TABLE
WHERE id >= (SELECT MAX(id)
FROM TABLE
WHERE ACTION='UNLOCKED' AND action_id=123
ORDER BY CREATE_DATE DESC);
Sample data
Id action_id Action ... CREATE_DATE
1 123 ADD 03/18/2018
2 123 Unlocked 03/19/2018
3 123 Updated1 03/19/2018
4 123 Updated2 03/19/2018
5 123 Unlocked 03/20/2018
6 123 Updated3 03/20/2018
7 123 Updated4 03/20/2018
The output should be the rows with id 5, 6 and 7. What should I use to get this output?
You could use an inner join on a subselect for the max CREATE_DATE:
select *
from TABLE
inner join (
    select max(CREATE_DATE) max_date
    from TABLE
    where Action = 'Unlocked'
) T on T.max_date = TABLE.CREATE_DATE
You do not need to order the inner query, because MAX returns only one value. You can do it as follows:
SELECT * FROM TABLE WHERE id >= (select max(id) from TABLE where ACTION='UNLOCKED' and action_id=123);
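The MAX(id) variant can be checked against the sample data with a short sqlite3 sketch in Python. A few things here are my assumptions rather than part of the answers: the table name action_log (TABLE is a reserved word), the ISO date strings, the UPPER() call (the sample rows spell the action "Unlocked" while the queries compare against 'UNLOCKED', and string comparison is case-sensitive), and the extra action_id filter on the outer query. The approach also relies on ids increasing with creation time.

```python
import sqlite3

def rows_after_last_unlock(action_id=123):
    """Return ids from the most recent 'Unlocked' row onwards for one action_id."""
    con = sqlite3.connect(":memory:")
    con.execute(
        "CREATE TABLE action_log ("
        "id INTEGER PRIMARY KEY, action_id INTEGER, action TEXT, create_date TEXT)"
    )
    con.executemany(
        "INSERT INTO action_log VALUES (?, ?, ?, ?)",
        [
            (1, 123, "ADD", "2018-03-18"),
            (2, 123, "Unlocked", "2018-03-19"),
            (3, 123, "Updated1", "2018-03-19"),
            (4, 123, "Updated2", "2018-03-19"),
            (5, 123, "Unlocked", "2018-03-20"),
            (6, 123, "Updated3", "2018-03-20"),
            (7, 123, "Updated4", "2018-03-20"),
        ],
    )
    rows = con.execute(
        """
        SELECT id
        FROM action_log
        WHERE action_id = ?
          AND id >= (SELECT MAX(id)            -- single value, so no ORDER BY needed
                     FROM action_log
                     WHERE UPPER(action) = 'UNLOCKED'
                       AND action_id = ?)
        ORDER BY id
        """,
        (action_id, action_id),
    ).fetchall()
    con.close()
    return [row_id for (row_id,) in rows]

if __name__ == "__main__":
    print(rows_after_last_unlock())
```

If no matching unlock row exists, the subquery yields NULL and the comparison filters out every row.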

select rows from main table based on highest date in child table between a date range

Sorry for the confusing title.
I've this table:
ApplicantID Applicant Name
-------------------------------
1 Sandeep
2 Thomas
3 Philip
4 Jerin
Along with this child table, which is linked to the table above:
DetailsID ApplicantID CourseName Dt
---------------------------------------------------------------------
1 1 C1 10/5/2014
2 1 C2 10/18/2014
3 1 c3 7/3/2014
4 2 C1 3/2/2014
5 2 C2 10/18/2014
6 2 c3 1/1/2014
7 3 C1 1/5/2014
8 3 C2 4/18/2014
9 3 c3 2/23/2014
10 4 C1 3/15/2014
11 4 C2 2/20/2014
12 4 C2 2/20/2014
I want to get the ApplicantIDs. For example, when I specify a date range from 3/5/2014 to 4/20/2014, I should get:
ApplicantID Applicant Name
-------------------------------
3 Philip
4 Jerin
That is, the applicant must have rows in the second table, and the highest date among those rows must fall within the specified date range. I hope the scenario is clear.
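You can use the analytic window function row_number to pick each applicant's latest row within the given date range. Note that this query filters by date before ranking, so it returns every applicant with at least one row in the range, even when that applicant's overall highest date falls outside it; with the sample data the result happens to be the same.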
select T1.[ApplicantID], [Applicant Name]
from Table1 T1
join ( select [ApplicantID],
ROW_NUMBER() over ( partition by [ApplicantID] order by Dt desc) as rn
from Table2
where Dt BETWEEN '3/5/2014' AND '4/20/2014'
) T
on T1.[ApplicantID] = T.[ApplicantID]
and T.rn =1
You will need to pull the MAX per ApplicantId with a GROUP BY in a sub-query, then JOIN to that result. This should work for you:
Select A.ApplicantId, A.[Applicant Name]
From ApplicantTableName A
Join
(
Select D.ApplicantId, Max(D.Dt) DT
From DetailsTableName D
Group By D.ApplicantId
) B On A.ApplicantId = B.ApplicantId
Where B.DT Between '03/05/2014' And '04/20/2014'
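The two answers differ in where the date filter is applied: one filters rows to the range and then ranks them, the other takes each applicant's overall maximum and then tests it against the range. The sketch below, using Python's sqlite3 module, runs both on the question's data plus one hypothetical applicant (ID 5, "Extra", not in the original data) who has one date inside the range and a later one outside it. Table names, column names, ISO date strings and the function name applicants_in_range are all assumptions for illustration.

```python
import sqlite3

def applicants_in_range(start="2014-03-05", end="2014-04-20"):
    """Return (max_first, filter_first) result lists for the two query shapes."""
    con = sqlite3.connect(":memory:")  # window functions need SQLite 3.25+
    con.executescript("""
        CREATE TABLE applicant (applicant_id INTEGER PRIMARY KEY, applicant_name TEXT);
        CREATE TABLE details (details_id INTEGER PRIMARY KEY, applicant_id INTEGER,
                              course_name TEXT, dt TEXT);
        INSERT INTO applicant VALUES
            (1, 'Sandeep'), (2, 'Thomas'), (3, 'Philip'), (4, 'Jerin'),
            (5, 'Extra');  -- hypothetical applicant, added for illustration only
        INSERT INTO details VALUES
            (1, 1, 'C1', '2014-10-05'), (2, 1, 'C2', '2014-10-18'), (3, 1, 'c3', '2014-07-03'),
            (4, 2, 'C1', '2014-03-02'), (5, 2, 'C2', '2014-10-18'), (6, 2, 'c3', '2014-01-01'),
            (7, 3, 'C1', '2014-01-05'), (8, 3, 'C2', '2014-04-18'), (9, 3, 'c3', '2014-02-23'),
            (10, 4, 'C1', '2014-03-15'), (11, 4, 'C2', '2014-02-20'), (12, 4, 'C2', '2014-02-20'),
            (13, 5, 'C1', '2014-03-10'), (14, 5, 'C2', '2014-10-18');  -- hypothetical rows
    """)
    # Aggregate first, then test the overall maximum against the range.
    max_first = con.execute("""
        SELECT a.applicant_id, a.applicant_name
        FROM applicant a
        JOIN (SELECT applicant_id, MAX(dt) AS max_dt
              FROM details
              GROUP BY applicant_id) b ON b.applicant_id = a.applicant_id
        WHERE b.max_dt BETWEEN ? AND ?
        ORDER BY a.applicant_id
    """, (start, end)).fetchall()
    # Filter rows to the range first, then keep the newest row per applicant.
    filter_first = con.execute("""
        SELECT a.applicant_id, a.applicant_name
        FROM applicant a
        JOIN (SELECT applicant_id,
                     ROW_NUMBER() OVER (PARTITION BY applicant_id ORDER BY dt DESC) AS rn
              FROM details
              WHERE dt BETWEEN ? AND ?) t
          ON t.applicant_id = a.applicant_id AND t.rn = 1
        ORDER BY a.applicant_id
    """, (start, end)).fetchall()
    con.close()
    return max_first, filter_first

if __name__ == "__main__":
    max_first, filter_first = applicants_in_range()
    print("max first:   ", max_first)
    print("filter first:", filter_first)
```

On the original four applicants both shapes agree; only the aggregate-first query excludes the hypothetical applicant, which matches the requirement that the highest date must fall in the range.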