Here is an image describing the table I'm working with (TBL_CHILDREN) as well as the desired output I'm trying to achieve.
The desired table should have a separate row for each distinct combination of active CHILD_IDs under the same PARENT_ID. For example, from 2017-01-01 to 2017-02-28 only CHILD_ID 1 was active, so the desired table has a row spanning 2017-01-01 to 2017-02-28. On 2017-03-01 CHILD_ID 2 also became effective, so I need a new row for the period in which CHILD_IDs 1 and 2 were both active. And so on, until there is a row describing each period of CHILD_ID combinations.
Here is some code for TBL_CHILDREN:
WITH TBL_CHILDREN AS (SELECT 57 PARENT_ID, 1 CHILD_ID, TO_DATE('2017-01-01','YYYY-MM-DD') START_DATE, TO_DATE('9999-12-31','YYYY-MM-DD') END_DATE FROM dual UNION ALL
SELECT 57 PARENT_ID, 2 CHILD_ID, TO_DATE('2017-03-01','YYYY-MM-DD') START_DATE, TO_DATE('2017-05-31','YYYY-MM-DD') END_DATE FROM dual UNION ALL
SELECT 57 PARENT_ID, 3 CHILD_ID, TO_DATE('2017-04-01','YYYY-MM-DD') START_DATE, TO_DATE('2017-10-31','YYYY-MM-DD') END_DATE FROM dual)
SELECT *
FROM TBL_CHILDREN
Using UNPIVOT with LAG or LEAD analytic function will do it in a single table scan:
Oracle 11g R2 Schema Setup:
create table TBL_CHILDREN ( parent_id, child_id, start_date, end_date )AS
SELECT 57, 1, DATE '2017-01-01', DATE '9999-12-31' FROM dual UNION ALL
SELECT 57, 2, DATE '2017-03-01', DATE '2017-05-31' FROM dual UNION ALL
SELECT 57, 3, DATE '2017-04-01', DATE '2017-10-31' FROM dual;
Query 1:
SELECT *
FROM (
SELECT PARENT_ID,
DT AS start_date,
LEAD( DT ) OVER ( PARTITION BY parent_id ORDER BY DT ) AS end_date
FROM TBL_CHILDREN
UNPIVOT( dt FOR start_end IN ( start_date, end_date ) )
)
WHERE end_date IS NOT NULL
Results:
| PARENT_ID | START_DATE | END_DATE |
|-----------|----------------------|----------------------|
| 57 | 2017-01-01T00:00:00Z | 2017-03-01T00:00:00Z |
| 57 | 2017-03-01T00:00:00Z | 2017-04-01T00:00:00Z |
| 57 | 2017-04-01T00:00:00Z | 2017-05-31T00:00:00Z |
| 57 | 2017-05-31T00:00:00Z | 2017-10-31T00:00:00Z |
| 57 | 2017-10-31T00:00:00Z | 9999-12-31T00:00:00Z |
Query 2 (this also lists the child ids active in each time period):
SELECT *
FROM (
SELECT parent_id,
( SELECT LISTAGG( child_id, ',' ) WITHIN GROUP ( ORDER BY child_id )
FROM TBL_CHILDREN c
WHERE c.parent_id = u.parent_id -- only children of the same parent
AND u.dt >= c.START_DATE
AND u.dt < c.END_DATE ) AS child_ids,
DT AS start_date,
LEAD( DT ) OVER ( PARTITION BY parent_id ORDER BY DT ) AS end_date
FROM TBL_CHILDREN
UNPIVOT( dt FOR start_end IN ( start_date, end_date ) ) u
)
WHERE end_date IS NOT NULL
Results:
| PARENT_ID | CHILD_IDS | START_DATE | END_DATE |
|-----------|-----------|----------------------|----------------------|
| 57 | 1 | 2017-01-01T00:00:00Z | 2017-03-01T00:00:00Z |
| 57 | 1,2 | 2017-03-01T00:00:00Z | 2017-04-01T00:00:00Z |
| 57 | 1,2,3 | 2017-04-01T00:00:00Z | 2017-05-31T00:00:00Z |
| 57 | 1,3 | 2017-05-31T00:00:00Z | 2017-10-31T00:00:00Z |
| 57 | 1 | 2017-10-31T00:00:00Z | 9999-12-31T00:00:00Z |
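To make the logic of Query 2 easier to follow, here is a minimal Python sketch that models the same steps outside the database. It is only an illustration, not Oracle code: the function name `split_periods` and the tuple layout are assumptions, and unlike UNPIVOT it de-duplicates boundary dates.

```python
from datetime import date

# Sample rows from TBL_CHILDREN: (parent_id, child_id, start_date, end_date)
rows = [
    (57, 1, date(2017, 1, 1), date(9999, 12, 31)),
    (57, 2, date(2017, 3, 1), date(2017, 5, 31)),
    (57, 3, date(2017, 4, 1), date(2017, 10, 31)),
]

def split_periods(rows):
    """Return (parent_id, child_ids, start, end) for each period between boundaries."""
    by_parent = {}
    for parent_id, child_id, start, end in rows:
        by_parent.setdefault(parent_id, []).append((child_id, start, end))

    result = []
    for parent_id, children in sorted(by_parent.items()):
        # UNPIVOT step: every start and end date becomes a boundary
        # (a set is used here, so repeated dates collapse into one).
        boundaries = sorted({d for _, s, e in children for d in (s, e)})
        # LEAD step: pair each boundary with the next one; the last boundary
        # has no successor and is dropped, like WHERE end_date IS NOT NULL.
        for start, end in zip(boundaries, boundaries[1:]):
            # Correlated subquery step: children active at the period start.
            active = sorted(c for c, s, e in children if s <= start < e)
            result.append((parent_id, ",".join(map(str, active)), start, end))
    return result

for period in split_periods(rows):
    print(period)
```

For the sample data this yields the same five periods and child lists as the Query 2 results.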
Please take a look at this demo
WITH qqq AS (
SELECT * FROM TBL_CHILDREN
START WITH child_id = 1
CONNECT BY PRIOR parent_id = parent_id AND PRIOR child_id + 1 = child_id
)
SELECT * FROM (
SELECT PARENT_ID,
d as start_date,
lead(d) over (partition by PARENT_ID order by d ) - 1 as end_date
FROM (
SELECT PARENT_ID, start_date as d FROM qqq
UNION
SELECT PARENT_ID, end_date FROM qqq
)
)
WHERE end_date is not null
ORDER by PARENT_ID, start_date
;
| PARENT_ID | START_DATE | END_DATE |
|-----------|----------------------|----------------------|
| 57 | 2017-01-01T00:00:00Z | 2017-02-28T00:00:00Z |
| 57 | 2017-03-01T00:00:00Z | 2017-03-31T00:00:00Z |
| 57 | 2017-04-01T00:00:00Z | 2017-05-30T00:00:00Z |
| 57 | 2017-05-31T00:00:00Z | 2017-10-30T00:00:00Z |
| 57 | 2017-10-31T00:00:00Z | 9999-12-30T00:00:00Z |
I am trying to select one row per employee based on status. There are two statuses: if an employee has a row with status A, take that row; otherwise take the status P row with the maximum oper_day. The table looks like this:
Table
---------------------------------------------------
id | emp_code | name | status | oper_day |
--------------------------------------------------
1 | 164094 | John | P | 2020-10-02 |
2 | 164094 | John | P | 2020-10-09 |
3 | 164094 | John | A | 2020-10-10 |
4 | 145890 | Mike | P | 2020-10-05 |
My result should look like below
--------------------------------------------------
id | emp_code | name | status | oper_day |
--------------------------------------------------
1 | 164094 | John | A | 2020-10-10 |
2 | 145890 | Mike | P | 2020-10-05 |
Any help is appreciated
Using ROW_NUMBER:
WITH cte AS (
SELECT t.*, ROW_NUMBER() OVER (PARTITION BY emp_code ORDER BY status, oper_day DESC) rn
FROM yourTable t
)
SELECT id, emp_code, name, status, oper_day
FROM cte
WHERE rn = 1;
The logic here is that should an employee have a status A record, it would be assigned the first row number, since A sorts before P. Otherwise, a P status record would be chosen. We choose the more recent record per employee in case of multiple records.
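As a rough illustration of that ordering logic (not a replacement for the SQL), the same selection can be modelled in Python. The names `employees` and `pick_rows` are assumptions made for this sketch.

```python
from datetime import date

# (id, emp_code, name, status, oper_day), taken from the question's sample table
employees = [
    (1, 164094, "John", "P", date(2020, 10, 2)),
    (2, 164094, "John", "P", date(2020, 10, 9)),
    (3, 164094, "John", "A", date(2020, 10, 10)),
    (4, 145890, "Mike", "P", date(2020, 10, 5)),
]

def pick_rows(rows):
    """Keep one row per emp_code: status ascending, then oper_day descending."""
    best = {}
    ordered = sorted(rows, key=lambda r: (r[3], -r[4].toordinal()))
    for row in ordered:
        # The first row seen for an emp_code corresponds to rn = 1.
        best.setdefault(row[1], row)
    return sorted(best.values(), key=lambda r: r[0])

for row in pick_rows(employees):
    print(row)
```

Because 'A' sorts before 'P', John's status A row wins, while Mike keeps his only (status P) row.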
You can use aggregation functions with KEEP( DENSE_RANK FIRST ORDER BY ... ):
SELECT MAX( id ) KEEP ( DENSE_RANK FIRST ORDER BY status ASC, oper_day DESC ) AS id,
emp_code,
MAX( name ),
MIN( status ) AS status,
MAX( oper_day ) KEEP ( DENSE_RANK FIRST ORDER BY status ) AS oper_day
FROM table_name
GROUP BY
emp_code
Which, for your sample data:
CREATE TABLE table_name ( id, emp_code, name, status, oper_day ) AS
SELECT 1, 164094, 'John', 'P', DATE '2020-10-02' FROM DUAL UNION ALL
SELECT 2, 164094, 'John', 'P', DATE '2020-10-09' FROM DUAL UNION ALL
SELECT 3, 164094, 'John', 'A', DATE '2020-10-10' FROM DUAL UNION ALL
SELECT 4, 145890, 'Mike', 'P', DATE '2020-10-05' FROM DUAL;
Outputs:
ID | EMP_CODE | MAX(NAME) | STATUS | OPER_DAY
-: | -------: | :-------- | :----- | :------------------
4 | 145890 | Mike | P | 2020-10-05 00:00:00
3 | 164094 | John | A | 2020-10-10 00:00:00
I have the following Entity–attribute–value (EAV) table in Oracle:
| ID | Key | Value |
|----|-------------|--------------|
| 1 | phone_num_1 | 111-111-1111 |
| 1 | phone_num_2 | 222-222-2222 |
| 1 | contact_1 | friend |
| 1 | contact_2 | family |
| 1 | first_name | mike |
| 1 | last_name | smith |
| 2 | phone_num_1 | 333-333-3333 |
| 2 | phone_num_2 | 444-444-4444 |
| 2 | contact_1 | family |
| 2 | contact_2 | friend |
| 2 | first_name | john |
| 2 | last_name | adams |
| 3 | phone_num_1 | 555-555-5555 |
| 3 | phone_num_2 | 666-666-6666 |
| 3 | phone_num_3 | 777-777-7777 |
| 3 | contact_1 | work |
| 3 | contact_2 | family |
| 3 | contact_3 | friend |
| 3 | first_name | mona |
| 3 | last_name | lisa |
Notice that some keys are indexed and therefore have an association with other indexed keys. For example, phone_num_1 is to be associated with contact_1.
Note: There is no hard limit to the number of indexes. There can be 10, 20, or even 50 phone_num_*, but it's guaranteed that for each phone_num_N, there is a corresponding contact_N
This is my desired result:
| ID | Phone_Num | Contact | First_Name | Last_Name |
|----|--------------|---------|------------|-----------|
| 1 | 111-111-1111 | friend | mike | smith |
| 1 | 222-222-2222 | family | mike | smith |
| 2 | 333-333-3333 | family | john | adams |
| 2 | 444-444-4444 | friend | john | adams |
| 3 | 555-555-5555 | work | mona | lisa |
| 3 | 666-666-6666 | family | mona | lisa |
| 3 | 777-777-7777 | friend | mona | lisa |
What have I tried/looked at:
I have looked into the pivot function of Oracle; however, I don't believe that can solve my problem since I don't have a fixed number of attributes that I want to pivot on.
I've looked at these posts:
SQL Query to return multiple key value pairs from a single table in one row
Pivot rows to columns without aggregate
Question:
Is what I'm trying to accomplish at all possible purely with SQL? If so, how can it be done? If not, please explain why.
Any help is much appreciated, and here's a WITH clause containing the sample table to help you get started:
with
table_1 ( id, key, value ) as (
select 1,'phone_num_1','111-111-1111' from dual union all
select 1,'phone_num_2','222-222-2222' from dual union all
select 1,'contact_1','friend' from dual union all
select 1,'contact_2','family' from dual union all
select 1,'first_name','mike' from dual union all
select 1,'last_name','smith' from dual union all
select 2,'phone_num_1','333-333-3333' from dual union all
select 2,'phone_num_2','444-444-4444' from dual union all
select 2,'contact_1','family' from dual union all
select 2,'contact_2','friend' from dual union all
select 2,'first_name','john' from dual union all
select 2,'last_name','adams' from dual union all
select 3,'phone_num_1','555-555-5555' from dual union all
select 3,'phone_num_2','666-666-6666' from dual union all
select 3,'phone_num_3','777-777-7777' from dual union all
select 3,'contact_1','work' from dual union all
select 3,'contact_2','family' from dual union all
select 3,'contact_3','friend' from dual union all
select 3,'first_name','mona' from dual union all
select 3,'last_name','lisa' from dual
)
select * from table_1;
This is not a dynamic pivot as you have a fixed set of keys - you just need to separate the enumeration of the keys from the keys themselves first.
You need to:
Separate the phone_num and contact key prefixes from the enumerated item; then
Pivot the common keys that have no enumeration so that they are associated with each enumerated key; and finally,
Pivot a second time to get the enumerated keys in a row together.
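The three steps above can be sketched in Python to show the data flow. This is only a model of the idea, assuming the key layout from the question; the names `pivot_eav`, `common` and `items` are hypothetical.

```python
import re
from collections import defaultdict

# (id, key, value) rows for ids 1 and 3 from the question
table_1 = [
    (1, "phone_num_1", "111-111-1111"), (1, "phone_num_2", "222-222-2222"),
    (1, "contact_1", "friend"), (1, "contact_2", "family"),
    (1, "first_name", "mike"), (1, "last_name", "smith"),
    (3, "phone_num_1", "555-555-5555"), (3, "phone_num_2", "666-666-6666"),
    (3, "phone_num_3", "777-777-7777"), (3, "contact_1", "work"),
    (3, "contact_2", "family"), (3, "contact_3", "friend"),
    (3, "first_name", "mona"), (3, "last_name", "lisa"),
]

def pivot_eav(rows):
    common = defaultdict(dict)  # id -> values shared by every enumerated item
    items = defaultdict(dict)   # (id, item number) -> phone_num / contact
    for id_, key, value in rows:
        # Step 1: split the enumerated keys into prefix and item number.
        m = re.fullmatch(r"(phone_num|contact)_(\d+)", key)
        if m:
            items[(id_, int(m.group(2)))][m.group(1)] = value
        else:
            # Step 2: keys without enumeration apply to every item of the id.
            common[id_][key] = value
    # Step 3: one output row per (id, item), enumerated keys side by side.
    return [
        (id_, vals.get("phone_num"), vals.get("contact"),
         common[id_].get("first_name"), common[id_].get("last_name"))
        for (id_, _item), vals in sorted(items.items())
    ]

for row in pivot_eav(table_1):
    print(row)
```

Each phone_num_N ends up on the same row as contact_N, with the names repeated on every row for that id.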
Oracle Setup:
CREATE TABLE table_1 ( id, key, value ) as
select 1,'phone_num_1','111-111-1111' from dual union all
select 1,'phone_num_2','222-222-2222' from dual union all
select 1,'contact_1','friend' from dual union all
select 1,'contact_2','family' from dual union all
select 1,'first_name','mike' from dual union all
select 1,'last_name','smith' from dual union all
select 2,'phone_num_1','333-333-3333' from dual union all
select 2,'phone_num_2','444-444-4444' from dual union all
select 2,'contact_1','family' from dual union all
select 2,'contact_2','friend' from dual union all
select 2,'first_name','john' from dual union all
select 2,'last_name','adams' from dual union all
select 3,'phone_num_1','555-555-5555' from dual union all
select 3,'phone_num_2','666-666-6666' from dual union all
select 3,'phone_num_3','777-777-7777' from dual union all
select 3,'contact_1','work' from dual union all
select 3,'contact_2','family' from dual union all
select 3,'contact_3','friend' from dual union all
select 3,'first_name','mona' from dual union all
select 3,'last_name','lisa' from dual
Query:
SELECT *
FROM (
SELECT id,
CASE
WHEN key LIKE 'phone_num_%' THEN 'phone_num'
WHEN key LIKE 'contact_%' THEN 'contact'
ELSE key
END AS key,
CASE
WHEN key LIKE 'phone_num_%'
OR key LIKE 'contact_%'
THEN TO_NUMBER( SUBSTR( key, INSTR( key, '_', -1 ) + 1 ) )
ELSE NULL
END AS item,
value,
MAX( CASE key WHEN 'first_name' THEN value END )
OVER ( PARTITION BY id ) AS first_name,
MAX( CASE key WHEN 'last_name' THEN value END )
OVER ( PARTITION BY id ) AS last_name
FROM table_1
)
PIVOT( MAX( value ) FOR key IN ( 'contact' AS contact, 'phone_num' AS phone_num ) )
WHERE item IS NOT NULL
ORDER BY id, item
Output:
ID | ITEM | FIRST_NAME | LAST_NAME | CONTACT | PHONE_NUM
-: | ---: | :--------- | :-------- | :------ | :-----------
1 | 1 | mike | smith | friend | 111-111-1111
1 | 2 | mike | smith | family | 222-222-2222
2 | 1 | john | adams | family | 333-333-3333
2 | 2 | john | adams | friend | 444-444-4444
3 | 1 | mona | lisa | work | 555-555-5555
3 | 2 | mona | lisa | family | 666-666-6666
3 | 3 | mona | lisa | friend | 777-777-7777
If you can refactor the table then a simple improvement would be to add an extra column to hold the enumeration of the keys and use NULL when it is a value common to every enumeration:
CREATE TABLE table_1 ( id, key, line, value ) as
select 1, 'phone_num', 1, '111-111-1111' from dual union all
select 1, 'phone_num', 2, '222-222-2222' from dual union all
select 1, 'contact', 1, 'friend' from dual union all
select 1, 'contact', 2, 'family' from dual union all
select 1, 'first_name', NULL, 'mike' from dual union all
select 1, 'last_name', NULL, 'smith' from dual
Then your set of keys is always fixed and you do not need to extract the enumeration value from the key.
This is ugly, but I think it does what you need:
select t1.* , t2.value, t3.n, t3.f
from table_1 t1
inner join table_1 t2 on t1.id = t2.id and REPLACE(t1.key, 'phone_num_', '') = REPLACE(t2.key, 'contact_', '')
inner join (
select ID, min(case when Key = 'first_name' then Value end) as n, min(case when Key = 'last_name' then Value end) as f
from table_1
group by ID
) t3 on t1.id = t3.id
where
t1.Key not in('first_name','last_name')
SELECT id,
phone,
contact,
first_value(last) IGNORE NULLS over (partition BY id order by id DESC range BETWEEN CURRENT row AND unbounded following ) last_name,
first_value(FIRST) IGNORE NULLS over (partition BY id order by id DESC range BETWEEN CURRENT row AND unbounded following ) first_name
FROM
(SELECT id,
value,
row_number() over ( partition BY id,SUBSTR(KEY,1 ,instr(KEY,'_',1)-1) order by KEY) rn,
SUBSTR(KEY,1 ,instr(KEY,'_',1) -1) KEY
FROM table_1
) pivot ( MAX(value) FOR KEY IN ( 'phone' AS phone,'last' AS last,'first' AS FIRST,'contact' AS contact))
ORDER BY id;
This is a simplified version of my table
+----+----------+------------+------------+
| ID | Category | Start Date | End Date |
+----+----------+------------+------------+
| 1 | 'Alpha' | 2018/04/12 | 2018/04/15 |
| 2 | null | 2018/04/17 | 2018/04/21 |
| 3 | 'Gamma' | 2018/05/02 | 2018/05/07 |
| 4 | 'Gamma' | 2018/05/09 | 2018/05/11 |
| 5 | 'Gamma' | 2018/05/11 | 2018/05/17 |
| 6 | 'Alpha' | 2018/05/17 | 2018/05/23 |
| 7 | 'Alpha' | 2018/05/23 | 2018/05/24 |
| 8 | null | 2018/05/24 | 2018/06/02 |
| 9 | 'Beta' | 2018/06/12 | 2018/06/16 |
| 10 | 'Beta' | 2018/06/16 | 2018/06/20 |
+----+----------+------------+------------+
All Start Date values are unique and not nullable, and they follow the same order as the IDs (if a and b are IDs and a < b, then StartDate[a] < StartDate[b]). The Start Date is not always equal to the End Date of the previous row for the same Category (look at IDs 3 and 4).
I'm looking for a query that will give me the following result
+----------+------------+------------+
| Category | Start Date | End Date |
+----------+------------+------------+
| 'Alpha' | 2018/04/12 | 2018/04/15 |
| null | 2018/04/17 | 2018/04/21 |
| 'Gamma' | 2018/05/02 | 2018/05/17 |
| 'Alpha' | 2018/05/17 | 2018/05/24 |
| null | 2018/05/24 | 2018/06/02 |
| 'Beta' | 2018/06/12 | 2018/06/20 |
+----------+------------+------------+
Note: The End Date will be equal to End Date of the last row in the subgroup (same continuous Category).
This is a gaps-and-islands problem. I think you can use the difference of row numbers:
select category, min(startdate), max(enddate)
from (select t.*,
row_number() over (order by id) as seqnum,
row_number() over (partition by category order by id) as seqnum_c
from t
) t
group by category, (seqnum - seqnum_c)
order by min(startdate);
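To see why the difference of row numbers identifies each island, here is a small Python model of the query above. It is a sketch under the question's sample data; the function name `islands` is an assumption.

```python
from collections import defaultdict
from datetime import date

# (id, category, start_date, end_date) from the question; None stands for NULL
t = [
    (1, "Alpha", date(2018, 4, 12), date(2018, 4, 15)),
    (2, None, date(2018, 4, 17), date(2018, 4, 21)),
    (3, "Gamma", date(2018, 5, 2), date(2018, 5, 7)),
    (4, "Gamma", date(2018, 5, 9), date(2018, 5, 11)),
    (5, "Gamma", date(2018, 5, 11), date(2018, 5, 17)),
    (6, "Alpha", date(2018, 5, 17), date(2018, 5, 23)),
    (7, "Alpha", date(2018, 5, 23), date(2018, 5, 24)),
    (8, None, date(2018, 5, 24), date(2018, 6, 2)),
    (9, "Beta", date(2018, 6, 12), date(2018, 6, 16)),
    (10, "Beta", date(2018, 6, 16), date(2018, 6, 20)),
]

def islands(rows):
    per_category = defaultdict(int)  # running row number within each category
    groups = {}
    for seqnum, (_id, cat, start, end) in enumerate(sorted(rows), start=1):
        per_category[cat] += 1
        seqnum_c = per_category[cat]
        # Consecutive rows of one category share the same difference.
        key = (cat, seqnum - seqnum_c)
        lo, hi = groups.get(key, (start, end))
        groups[key] = (min(lo, start), max(hi, end))
    merged = [(cat, s, e) for (cat, _diff), (s, e) in groups.items()]
    return sorted(merged, key=lambda g: g[1])

for island in islands(t):
    print(island)
```

Within a run of the same category both counters advance together, so their difference stays constant; as soon as another category intervenes, only the global counter advances and the difference changes.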
This is a gaps-and-islands problem; you can use logic like the following:
select category, min(start_date) as start_date, max(end_date) as end_date
from
(
select tt.*, sum(grp) over (order by id, start_date) sm
from
(
with t( ID, Category, Start_Date, End_Date) as
(
select 1 , 'Alpha' , date'2018-04-12',date'2018-04-15' from dual union all
select 2 , null , date'2018-04-17',date'2018-04-21' from dual union all
select 3 , 'Gamma' , date'2018-05-02',date'2018-05-07' from dual union all
select 4 , 'Gamma' , date'2018-05-09',date'2018-05-11' from dual union all
select 5 , 'Gamma' , date'2018-05-11',date'2018-05-17' from dual union all
select 6 , 'Alpha' , date'2018-05-17',date'2018-05-23' from dual union all
select 7 , 'Alpha' , date'2018-05-23',date'2018-05-24' from dual union all
select 8 , null , date'2018-05-24',date'2018-06-02' from dual union all
select 9 , 'Beta' , date'2018-06-12',date'2018-06-16' from dual union all
select 10 , 'Beta' , date'2018-06-16',date'2018-06-20' from dual
)
select id, Category,
decode(nvl(lag(end_date) over
(order by end_date),start_date),start_date,0,1)
as grp, -- 0 if start_date equals the previous row's end_date, otherwise 1
row_number() over (order by id, end_date) as rn, start_date, end_date
from t
) tt
order by rn
)
group by Category, sm
order by end_date;
CATEGORY START_DATE END_DATE
Alpha 12.04.2018 15.04.2018
NULL 17.04.2018 21.04.2018
Gamma 02.05.2018 07.05.2018
Gamma 09.05.2018 17.05.2018
Alpha 17.05.2018 24.05.2018
NULL 24.05.2018 02.06.2018
Beta 12.06.2018 20.06.2018
Good day,
I have data in following form
ID Start Date End Date
1 01-Nov-2018 01-Nov-2018
2 04-Nov-2018 07-Nov-2018
3 09-Nov-2018 09-Nov-2018
4 11-Nov-2018 12-Nov-2018
I want to generate the following output
ID Date
1 01-Nov-2018
2 04-Nov-2018
2 05-Nov-2018
2 06-Nov-2018
2 07-Nov-2018
3 09-Nov-2018
4 11-Nov-2018
4 12-Nov-2018
I know how to do it for a single ID:
SELECT
dv.ID
, dv.date_start start_date
, dv.date_end End_Date
, dv.date_start + Level - 1 the_date
From (SELECT *
FROM table_name d
WHERE d.id = <some_id>) dv
Where (dv.date_start + Level - 1) <= dv.date_end
Connect By Level <= dv.date_end - dv.date_start + 1;
But as soon as I give it multiple IDs it goes haywire and returns many duplicate dates. I would appreciate any help with generating the desired data.
Try this.
SELECT id,
start_date + LEVEL - 1
FROM t
CONNECT BY LEVEL <= ( end_date - start_date + 1 )
AND PRIOR id = id
AND PRIOR sys_guid() IS NOT NULL;
Read: Sys_Guid() in connect by level
Oracle 11g R2 Schema Setup:
CREATE TABLE table_name ( ID, Start_Date, End_Date ) AS
SELECT 1, DATE '2018-11-01', DATE '2018-11-01' FROM DUAL UNION ALL
SELECT 2, DATE '2018-11-04', DATE '2018-11-07' FROM DUAL UNION ALL
SELECT 3, DATE '2018-11-09', DATE '2018-11-09' FROM DUAL UNION ALL
SELECT 4, DATE '2018-11-11', DATE '2018-11-12' FROM DUAL;
Query 1:
SELECT t.*,
c.COLUMN_VALUE AS the_date
FROM table_name t
CROSS JOIN
TABLE(
CAST(
MULTISET(
SELECT t.start_date + LEVEL - 1
FROM DUAL
CONNECT BY t.start_date + LEVEL - 1 <= t.end_date
)
AS SYS.ODCIDATELIST
)
) c
Results:
| ID | START_DATE | END_DATE | THE_DATE |
|----|----------------------|----------------------|----------------------|
| 1 | 2018-11-01T00:00:00Z | 2018-11-01T00:00:00Z | 2018-11-01T00:00:00Z |
| 2 | 2018-11-04T00:00:00Z | 2018-11-07T00:00:00Z | 2018-11-04T00:00:00Z |
| 2 | 2018-11-04T00:00:00Z | 2018-11-07T00:00:00Z | 2018-11-05T00:00:00Z |
| 2 | 2018-11-04T00:00:00Z | 2018-11-07T00:00:00Z | 2018-11-06T00:00:00Z |
| 2 | 2018-11-04T00:00:00Z | 2018-11-07T00:00:00Z | 2018-11-07T00:00:00Z |
| 3 | 2018-11-09T00:00:00Z | 2018-11-09T00:00:00Z | 2018-11-09T00:00:00Z |
| 4 | 2018-11-11T00:00:00Z | 2018-11-12T00:00:00Z | 2018-11-11T00:00:00Z |
| 4 | 2018-11-11T00:00:00Z | 2018-11-12T00:00:00Z | 2018-11-12T00:00:00Z |
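For comparison, the per-row expansion that the correlated row generator performs can be modelled in a few lines of Python. This is an illustrative sketch only; the function name `expand` is an assumption.

```python
from datetime import date, timedelta

# (id, start_date, end_date) from the question
table_name = [
    (1, date(2018, 11, 1), date(2018, 11, 1)),
    (2, date(2018, 11, 4), date(2018, 11, 7)),
    (3, date(2018, 11, 9), date(2018, 11, 9)),
    (4, date(2018, 11, 11), date(2018, 11, 12)),
]

def expand(rows):
    """Emit one (id, date) pair for every day in each row's inclusive range."""
    out = []
    for id_, start, end in rows:
        # The generator restarts for each row, which is what keeps the ids
        # from multiplying each other's dates.
        for offset in range((end - start).days + 1):
            out.append((id_, start + timedelta(days=offset)))
    return out

for pair in expand(table_name):
    print(pair)
```

The key point is that the day counter is scoped to a single source row, so ranges of different ids never combine.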
I have two tables:
Person
+---------+-----------+
| Name | Added |
+---------+-----------+
| Roger | 2/1/2001 |
| Natalie | 5/5/2001 |
| George | 6/6/2001 |
| Paul | 12/5/1999 |
+---------+-----------+
Stage
+-------------+----------+
| Description | Start |
+-------------+----------+
| 1 | 1/1/1980 |
| 2 | 4/1/2001 |
| 3 | 6/1/2001 |
+-------------+----------+
I want to join Person with stage such that I get the following result.
Result
+---------+-----------+--------+
| Name | Added | Stage |
+---------+-----------+--------+
| Roger | 2/1/2001 | 1 |
| Natalie | 5/5/2001 | 2 |
| George | 6/6/2001 | 3 |
| Paul | 12/5/1999 | 1 |
+---------+-----------+--------+
So, the stage 1 matches (added >= 1/1/1980 AND added < 4/1/2001), stage 2 matches (added >= 4/1/2001 AND added < 6/1/2001), stage 3 (added >= 6/1/2001) etc... This works, but I think it's kind of ugly (and only happens to work because the description is sequential as well).
SELECT person.name,
person.added,
(SELECT MAX(description) FROM stage d2 WHERE person.added >= d2.start) description
FROM person
Is there a way to do this in a regular join, and if description were a string rather than a sequential number? Thanks.
Instead of a subquery, you could use row_number():
select name, added, description
from (
select p.name, p.added, s.description
, row_number() over (
partition by p.name
order by s.start desc
) as rn
from person p
inner join stage s
on s.start <= p.added
) t
where rn = 1
test setup: http://rextester.com/SIAUAZ29747
with Person (Name,Added_date) as (
select 'Roger' , to_date('2001-02-01','yyyy-mm-dd') from dual union all
select 'Natalie' , to_date('2001-05-05','yyyy-mm-dd') from dual union all
select 'George' , to_date('2001-06-06','yyyy-mm-dd') from dual union all
select 'Paul' , to_date('1999-12-05','yyyy-mm-dd') from dual
),
Stage ( Description , Start_date ) as (
select 1, to_date('1980-01-01','yyyy-mm-dd') from dual union all
select 2, to_date('2001-04-01','yyyy-mm-dd') from dual union all
select 3, to_date('2001-06-01','yyyy-mm-dd') from dual
)
select name, to_char(added_date,'yyyy-mm-dd') added, description
from (
select p.name, p.added_date, s.description
, row_number() over (
partition by p.name
order by s.start_date desc
) as rn
from person p
inner join stage s
on s.start_date <= p.added_date
) t
where rn = 1
order by added_date
returns:
+---------+------------+-------------+
| NAME | ADDED | DESCRIPTION |
+---------+------------+-------------+
| Paul | 1999-12-05 | 1 |
| Roger | 2001-02-01 | 1 |
| Natalie | 2001-05-05 | 2 |
| George | 2001-06-06 | 3 |
+---------+------------+-------------+
Problems of this type can often be solved with no joins at all. Instead, combine the two tables (as illustrated below) with UNION ALL and use the LAST_VALUE() function:
select name, added, description
from (
select name, added,
last_value(description ignore nulls)
over (order by added, description) as description
from ( select name, null as description, added
from person
union all
select null, description, start_date
from stage
)
)
where name is not null
order by added, name -- if needed
;
NAME ADDED DESCRIPTION
------- ---------- -----------
Paul 12/05/1999 1
Roger 02/01/2001 1
Natalie 05/05/2001 2
George 06/06/2001 3
Big THANK YOU to @MT0 for providing the setup (CREATE TABLE statements).
Here is a version that joins the rows in Person to Stage with a 1:1 correspondence (unlike the accepted solution which will join Person to multiple rows in Stage and then have to filter out the unwanted rows):
Oracle Setup:
CREATE TABLE Person (Name,Added) AS
SELECT 'Roger' , DATE '2001-02-01' FROM DUAL UNION ALL
SELECT 'Natalie' , DATE '2001-05-05' FROM DUAL UNION ALL
SELECT 'George' , DATE '2001-06-06' FROM DUAL UNION ALL
SELECT 'Paul' , DATE '1999-12-05' FROM DUAL;
CREATE TABLE Stage ( Description , Start_date ) AS
SELECT 1, DATE '1980-01-01' FROM DUAL UNION ALL
SELECT 2, DATE '2001-04-01' FROM DUAL UNION ALL
SELECT 3, DATE '2001-06-01' FROM DUAL;
Query:
SELECT name, added, description
FROM person p
INNER JOIN
(
SELECT description,
start_date,
LEAD( start_date ) OVER ( ORDER BY start_date ) AS end_date
FROM stage
) s
ON ( s.start_date <= p.added AND ( s.end_date IS NULL OR p.added < s.end_date ) );
Output:
NAME ADDED DESCRIPTION
------- ------------------- -----------
Paul 1999-12-05 00:00:00 1
Roger 2001-02-01 00:00:00 1
Natalie 2001-05-05 00:00:00 2
George 2001-06-06 00:00:00 3
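The 1:1 range lookup behind the LEAD-based join can also be sketched in Python with a binary search over the sorted stage start dates. This is a model of the idea rather than Oracle code; `assign_stage` is a hypothetical name, and the description could just as well be a string.

```python
from bisect import bisect_right
from datetime import date

person = [
    ("Roger", date(2001, 2, 1)),
    ("Natalie", date(2001, 5, 5)),
    ("George", date(2001, 6, 6)),
    ("Paul", date(1999, 12, 5)),
]
stage = [
    (1, date(1980, 1, 1)),
    (2, date(2001, 4, 1)),
    (3, date(2001, 6, 1)),
]

def assign_stage(person, stage):
    """Match each person to the stage whose [start, next start) contains added."""
    stage = sorted(stage, key=lambda s: s[1])
    starts = [s[1] for s in stage]
    result = []
    for name, added in person:
        # Index of the last stage whose start date is <= added.
        i = bisect_right(starts, added) - 1
        if i >= 0:  # inner-join behaviour: skip dates before the first stage
            result.append((name, added, stage[i][0]))
    return result

for row in assign_stage(person, stage):
    print(row)
```

Each person matches exactly one stage, so no filtering step is needed afterwards, and nothing depends on the description values being sequential numbers.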