Eliminate overlapping date ranges - sql

My data in an Oracle table is like this. I need a solution in Oracle SQL
| StDt | EdDt | User Stat |
| 20-12-2021 | 12-06-2022 | A |
| 16-06-2022 | 31-12-4712 | A |
| 09-06-2022 | 30-06-2022 | B |
OUTPUT :-
| StDt | EdDt |
| 20-12-2021 | 31-12-4712 |
This output is because the person was active throughout the time till 31-12-4712.
Another Scenario :-
| StDt | EdDt | User Stat |
| 20-12-2021 | 31-12-4712 | A |
| 09-06-2022 | 30-06-2022 | B |
Output :-
| StDt | EdDt |
| 20-12-2021 | 31-12-4712 |
Another Scenario :-
| StDt | EdDt | User Stat |
| 20-12-2021 | 12-06-2022 | A |
| 16-06-2022 | 25-06-2022 | A |
| 20-06-2022 | 30-06-2022 | B |
| 10-10-2022 | 31-03-2023 | B |
Output :-
| StDt | EdDt |
| 20-12-2021 | 12-06-2022 |
| 16-06-2022 | 30-06-2022 |
| 10-10-2022 | 31-03-2023 |
So in short we have to remove the overlapping date range here.

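This is a classic job for MATCH_RECOGNIZE. A general pattern:
SELECT *
FROM   table_name  -- placeholder table name
MATCH_RECOGNIZE (
  PARTITION BY userstat
  ORDER BY stdt, eddt
  MEASURES FIRST(stdt) AS stdt, MAX(eddt) AS eddt
  -- the closing row is named strt because START is an Oracle reserved word
  PATTERN ( merged* strt )
  DEFINE
    merged AS MAX(eddt) >= NEXT(stdt)
)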
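To see what the pattern is doing, here is a minimal sketch of the same sort-and-sweep idea in Python. It is not the answerer's code: the function name `merge_ranges` is hypothetical, and it deliberately leaves out the PARTITION BY so that ranges are merged across all user stats, which is what the question's expected output shows. The data is the question's third scenario.

```python
from datetime import date

def merge_ranges(ranges):
    """Merge overlapping (start, end) date ranges; input order does not matter."""
    merged = []
    for start, end in sorted(ranges):
        if merged and start <= merged[-1][1]:
            # Overlaps the running range (same test as MAX(eddt) >= NEXT(stdt)):
            # keep the earliest start and extend the end if needed.
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            # No overlap: a new output range begins here.
            merged.append((start, end))
    return merged

scenario = [
    (date(2021, 12, 20), date(2022, 6, 12)),
    (date(2022, 6, 16), date(2022, 6, 25)),
    (date(2022, 6, 20), date(2022, 6, 30)),
    (date(2022, 10, 10), date(2023, 3, 31)),
]
result = merge_ranges(scenario)
for start, end in result:
    print(start, end)
```

The running maximum of the end date matters: comparing only against the previous row's end would miss a short range nested inside a long one.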

You can use a MERGE statement with MATCH_RECOGNIZE:
MERGE INTO table_name dst
USING (
  SELECT ROWID AS rid,
         rn,
         MAX(eddt) OVER (PARTITION BY user_stat, mno) AS eddt
  FROM   table_name
  MATCH_RECOGNIZE(
    PARTITION BY user_stat
    ORDER BY StDt
    MEASURES
      COUNT(*) AS rn,
      MATCH_NUMBER() AS mno
    ALL ROWS PER MATCH
    PATTERN (overlapping* final_row)
    DEFINE
      overlapping AS MAX(eddt) >= NEXT(stdt)
  )
) src
ON (dst.ROWID = src.rid)
WHEN MATCHED THEN
  UPDATE
  SET    eddt = src.eddt
  DELETE WHERE rn > 1;
Which, for the sample data:
CREATE TABLE table_name (StDt, EdDt, User_Stat) AS
SELECT DATE '2021-12-20', DATE '2022-06-12', 'A' FROM DUAL UNION ALL
SELECT DATE '2022-06-16', DATE '4712-12-31', 'A' FROM DUAL UNION ALL
SELECT DATE '2022-06-09', DATE '2022-06-30', 'B' FROM DUAL UNION ALL
SELECT DATE '2022-06-09', DATE '2022-06-30', 'C' FROM DUAL UNION ALL
SELECT DATE '2022-06-15', DATE '2022-06-20', 'C' FROM DUAL UNION ALL
SELECT DATE '2022-06-15', DATE '2022-06-20', 'D' FROM DUAL UNION ALL
SELECT DATE '2022-06-18', DATE '2022-06-23', 'D' FROM DUAL UNION ALL
SELECT DATE '2022-06-25', DATE '2022-06-30', 'D' FROM DUAL;
Then, after the MERGE statement the table contains:
| STDT | EDDT | USER_STAT |
| 2021-12-20 00:00:00 | 2022-06-12 00:00:00 | A |
| 2022-06-16 00:00:00 | 4712-12-31 00:00:00 | A |
| 2022-06-09 00:00:00 | 2022-06-30 00:00:00 | B |
| 2022-06-09 00:00:00 | 2022-06-30 00:00:00 | C |
| 2022-06-15 00:00:00 | 2022-06-23 00:00:00 | D |
| 2022-06-25 00:00:00 | 2022-06-30 00:00:00 | D |

Related

Order Data based on previous row data

I have a query in Oracle SQL that returns three columns: old_data, new_data and transaction_date. I want to sort this data primarily by increasing transaction_date and secondarily so that the new_data of the previous row equals the old_data of the next row. Both new_data and old_data are number fields that can decrease or increase.
If I sort just by transaction_date, some data has the same exact date and time and hence the order will not be accurate as I need new_data of previous row to match old_data of current row. I also cannot use a hierarchical query alone to meet the second sorting condition since transaction_date sorting is the primary sorting condition.
Can anyone suggest a solution?
A sample output will need to look like the one below:
[image: output_sample]
Thanks in advance
You could use a hierarchical query and connect by equal dates as well as the relationship between old- and new-data:
SELECT transaction_date,
       new_data,
       old_data
FROM   table_name
START WITH old_data IS NULL -- You need to define how to pick the first row
CONNECT BY
       PRIOR transaction_date = transaction_date
AND    PRIOR new_data = old_data
ORDER SIBLINGS BY
       transaction_date
Which, for the sample data:
ALTER SESSION SET NLS_DATE_FORMAT = 'YYYY-MM-DD HH24:MI:SS';
CREATE TABLE table_name ( transaction_date, new_data, old_data ) AS
SELECT DATE '2022-01-01', 1, NULL FROM DUAL UNION ALL
SELECT DATE '2022-01-01', 2, 1 FROM DUAL UNION ALL
SELECT DATE '2022-01-01', 3, 2 FROM DUAL UNION ALL
SELECT DATE '2022-01-02', 3, NULL FROM DUAL UNION ALL
SELECT DATE '2022-01-02', 1, 3 FROM DUAL UNION ALL
SELECT DATE '2022-01-02', 2, 1 FROM DUAL UNION ALL
SELECT DATE '2022-01-03', 4, NULL FROM DUAL;
Outputs:
| TRANSACTION_DATE | NEW_DATA | OLD_DATA |
| 2022-01-01 00:00:00 | 1 | null |
| 2022-01-01 00:00:00 | 2 | 1 |
| 2022-01-01 00:00:00 | 3 | 2 |
| 2022-01-02 00:00:00 | 3 | null |
| 2022-01-02 00:00:00 | 1 | 3 |
| 2022-01-02 00:00:00 | 2 | 1 |
| 2022-01-03 00:00:00 | 4 | null |
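The hierarchical walk can also be pictured outside the database. The sketch below is only an illustration in Python, not part of the original answer: the function name `order_chain` is hypothetical, and it assumes that the first row of each date has a null old_data and that old_data values are unique within a date. It groups rows by date, then follows each new_data value to the row whose old_data matches it.

```python
from collections import defaultdict

def order_chain(rows):
    """Order (transaction_date, new_data, old_data) rows by date, then by chain.

    Assumes old_data is None for the first row of each date and that
    old_data values are unique within a date (both are assumptions).
    """
    by_date = defaultdict(dict)
    for row in rows:
        txn_date, _new_data, old_data = row
        by_date[txn_date][old_data] = row

    ordered = []
    for txn_date in sorted(by_date):
        links = by_date[txn_date]
        key = None  # start marker, mirroring START WITH old_data IS NULL
        while key in links:
            row = links.pop(key)  # pop so a cycle cannot loop forever
            ordered.append(row)
            key = row[1]  # the next row's old_data must equal this new_data
    return ordered

# The answer's sample rows, deliberately shuffled.
rows = [
    ("2022-01-02", 2, 1),
    ("2022-01-01", 3, 2),
    ("2022-01-03", 4, None),
    ("2022-01-02", 3, None),
    ("2022-01-01", 1, None),
    ("2022-01-02", 1, 3),
    ("2022-01-01", 2, 1),
]
ordered = order_chain(rows)
for row in ordered:
    print(row)
```

As with the CONNECT BY query, rows that cannot be reached from a start row are left out of the result.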
Update
Given the updated sample data:
CREATE TABLE table_name ( transaction_date, old_data, new_data ) AS
SELECT DATE '2021-12-20'+INTERVAL '11:25' HOUR TO MINUTE, 0, 115.09903 FROM DUAL UNION ALL
SELECT DATE '2021-12-20'+INTERVAL '11:25' HOUR TO MINUTE, 115.09903, 115.13233 FROM DUAL UNION ALL
SELECT DATE '2021-12-20'+INTERVAL '11:25' HOUR TO MINUTE, 115.13233, 115.16490 FROM DUAL UNION ALL
SELECT DATE '2021-12-20'+INTERVAL '11:25' HOUR TO MINUTE, 115.16490, 115.19678 FROM DUAL UNION ALL
SELECT DATE '2021-12-20'+INTERVAL '11:35' HOUR TO MINUTE, 115.19678, 115.22799 FROM DUAL UNION ALL
SELECT DATE '2021-12-20'+INTERVAL '11:35' HOUR TO MINUTE, 115.22799, 115.25854 FROM DUAL UNION ALL
SELECT DATE '2021-12-20'+INTERVAL '11:35' HOUR TO MINUTE, 115.25854, 115.28846 FROM DUAL UNION ALL
SELECT DATE '2021-12-20'+INTERVAL '11:35' HOUR TO MINUTE, 115.28846, 115.31776 FROM DUAL;
Then, since both old_data and new_data increase with time, you could use:
SELECT *
FROM table_name
ORDER BY old_data;
or:
SELECT *
FROM table_name
ORDER BY new_data;
or, if you want to use the hierarchy then:
SELECT transaction_date,
       old_data,
       new_data
FROM   table_name
START WITH old_data = 0
CONNECT BY
       PRIOR transaction_date <= transaction_date
AND    PRIOR new_data = old_data
ORDER SIBLINGS BY
       transaction_date
Which all output:
| TRANSACTION_DATE | OLD_DATA | NEW_DATA |
| 2021-12-20 11:25:00 | 0 | 115.09903 |
| 2021-12-20 11:25:00 | 115.09903 | 115.13233 |
| 2021-12-20 11:25:00 | 115.13233 | 115.1649 |
| 2021-12-20 11:25:00 | 115.1649 | 115.19678 |
| 2021-12-20 11:35:00 | 115.19678 | 115.22799 |
| 2021-12-20 11:35:00 | 115.22799 | 115.25854 |
| 2021-12-20 11:35:00 | 115.25854 | 115.28846 |
| 2021-12-20 11:35:00 | 115.28846 | 115.31776 |

Query to get record list as per particular column data

I want the active ids. For each id, rows with status 'A' should be returned only up to the first row with status 'I'. Rows that come after an 'I' row should not be returned, even if their status is 'A'.
INPUT:
| id | start_date | end_date | status |
| 1000000278 | 8/25/2021 | 8/25/2022 | I |
| 1000000278 | 8/25/2022 | 8/25/2023 | A |
| 1000000284 | 8/20/2021 | 8/25/2022 | A |
| 1000000284 | 8/25/2022 | 8/25/2023 | A |
| 1000000285 | 8/20/2024 | 8/20/2028 | A |
| 1000000285 | 8/21/2028 | 8/20/2030 | I |
| 1000000285 | 8/21/2030 | 8/20/2031 | A |
| 1000000286 | 8/25/2021 | 8/25/2022 | A |
OUTPUT:
| id | start_date | end_date | status |
| 1000000284 | 8/20/2021 | 8/25/2022 | A |
| 1000000284 | 8/25/2022 | 8/25/2023 | A |
| 1000000285 | 8/20/2024 | 8/20/2028 | A |
| 1000000286 | 8/25/2021 | 8/25/2022 | A |
From Oracle 12, you can use MATCH_RECOGNIZE to perform row-by-row processing. To get the rows for each id with the earliest status of A until the first status I row, you can use:
SELECT *
FROM   table_name
MATCH_RECOGNIZE(
  PARTITION BY id
  ORDER BY start_date
  ALL ROWS PER MATCH
  PATTERN ( ^ a_status+ )
  DEFINE a_status AS status = 'A'
)
Which, for the sample data:
CREATE TABLE table_name (id, start_date, end_date, status) AS
SELECT 1000000278, DATE '2021-08-25', DATE '2022-08-25', 'I' FROM DUAL UNION ALL
SELECT 1000000278, DATE '2022-08-25', DATE '2023-08-25', 'A' FROM DUAL UNION ALL
SELECT 1000000284, DATE '2021-08-20', DATE '2022-08-25', 'A' FROM DUAL UNION ALL
SELECT 1000000284, DATE '2022-08-25', DATE '2023-08-25', 'A' FROM DUAL UNION ALL
SELECT 1000000285, DATE '2024-08-20', DATE '2028-08-20', 'A' FROM DUAL UNION ALL
SELECT 1000000285, DATE '2028-08-21', DATE '2030-08-20', 'I' FROM DUAL UNION ALL
SELECT 1000000285, DATE '2030-08-21', DATE '2031-08-20', 'A' FROM DUAL UNION ALL
SELECT 1000000286, DATE '2021-08-25', DATE '2022-08-25', 'A' FROM DUAL;
Outputs:
| ID | START_DATE | END_DATE | STATUS |
| 1000000284 | 2021-08-20 00:00:00 | 2022-08-25 00:00:00 | A |
| 1000000284 | 2022-08-25 00:00:00 | 2023-08-25 00:00:00 | A |
| 1000000285 | 2024-08-20 00:00:00 | 2028-08-20 00:00:00 | A |
| 1000000286 | 2021-08-25 00:00:00 | 2022-08-25 00:00:00 | A |
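The pattern `^ a_status+` anchors the match at the first row of each partition, so only the leading run of 'A' rows survives. A rough Python equivalent, offered as a sketch rather than as the answer's method (the name `leading_active` is hypothetical, and dates are written as ISO strings so that they sort correctly), uses `itertools.takewhile` per id:

```python
from itertools import groupby, takewhile
from operator import itemgetter

def leading_active(rows):
    """Keep, per id, the rows with status 'A' that come before the first non-'A' row."""
    result = []
    ordered = sorted(rows, key=itemgetter(0, 1))  # PARTITION BY id ORDER BY start_date
    for _id, group in groupby(ordered, key=itemgetter(0)):
        # takewhile stops at the first row that is not 'A', like the ^ anchor plus a_status+
        result.extend(takewhile(lambda r: r[3] == "A", group))
    return result

rows = [
    (1000000278, "2021-08-25", "2022-08-25", "I"),
    (1000000278, "2022-08-25", "2023-08-25", "A"),
    (1000000284, "2021-08-20", "2022-08-25", "A"),
    (1000000284, "2022-08-25", "2023-08-25", "A"),
    (1000000285, "2024-08-20", "2028-08-20", "A"),
    (1000000285, "2028-08-21", "2030-08-20", "I"),
    (1000000285, "2030-08-21", "2031-08-20", "A"),
    (1000000286, "2021-08-25", "2022-08-25", "A"),
]
active = leading_active(rows)
for row in active:
    print(row)
```

An id whose first row is 'I' contributes nothing, which is why 1000000278 is absent from the output.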

Analytic function/logic to get min and max record date in Oracle

I need to fetch values based on eff_date and end_date. Sample data is given below.
Database : Oracle 11g
Example data:
| id | val | eff_date | end_date |
| 10 | 100 | 01-Jan-21 | 04-Jan-21 |
| 10 | 105 | 05-Jan-21 | 07-Jan-21 |
| 10 | 100 | 08-Jan-21 | 10-Jan-21 |
| 10 | 100 | 11-Jan-21 | 17-Jan-21 |
| 10 | 100 | 18-Jan-21 | 21-Jan-21 |
| 10 | 110 | 22-Jan-21 | null |
output:
| id | val | eff_date | end_date |
| 10 | 100 | 01-Jan-21 | 04-Jan-21 |
| 10 | 105 | 05-Jan-21 | 07-Jan-21 |
| 10 | 100 | 08-Jan-21 | 21-Jan-21 |
| 10 | 110 | 22-Jan-21 | null |
You can use the ROW_NUMBER analytic function and then aggregate:
SELECT id,
       val,
       MIN(eff_date) AS eff_date,
       MAX(end_date) AS end_date
FROM   (
  SELECT t.*,
         ROW_NUMBER() OVER (PARTITION BY id ORDER BY eff_date)
         - ROW_NUMBER() OVER (PARTITION BY id, val ORDER BY eff_date) AS grp
  FROM   table_name t
)
GROUP BY id, val, grp
ORDER BY id, eff_date;
Which, for the sample data:
CREATE TABLE table_name (id, val, eff_date, end_date) AS
SELECT 10, 100, DATE '2021-01-01', DATE '2021-01-04' FROM DUAL UNION ALL
SELECT 10, 105, DATE '2021-01-05', DATE '2021-01-07' FROM DUAL UNION ALL
SELECT 10, 100, DATE '2021-01-08', DATE '2021-01-10' FROM DUAL UNION ALL
SELECT 10, 100, DATE '2021-01-11', DATE '2021-01-17' FROM DUAL UNION ALL
SELECT 10, 100, DATE '2021-01-18', DATE '2021-01-21' FROM DUAL UNION ALL
SELECT 10, 110, DATE '2021-01-22', null FROM DUAL;
Outputs:
| ID | VAL | EFF_DATE | END_DATE |
| 10 | 100 | 2021-01-01 00:00:00 | 2021-01-04 00:00:00 |
| 10 | 105 | 2021-01-05 00:00:00 | 2021-01-07 00:00:00 |
| 10 | 100 | 2021-01-08 00:00:00 | 2021-01-21 00:00:00 |
| 10 | 110 | 2021-01-22 00:00:00 | null |
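The difference of the two ROW_NUMBER values stays constant while val repeats on consecutive rows and changes when another val interrupts the run, so (id, val, grp) identifies each island. A small Python sketch of that bookkeeping follows. It is illustrative only: the function name `collapse` is hypothetical, and dates are ISO strings so that `min` and `max` compare them correctly.

```python
from collections import defaultdict

def collapse(rows):
    """Collapse consecutive rows with the same (id, val) into one date range."""
    overall = defaultdict(int)  # ROW_NUMBER() OVER (PARTITION BY id ORDER BY eff_date)
    per_val = defaultdict(int)  # ROW_NUMBER() OVER (PARTITION BY id, val ORDER BY eff_date)
    islands = {}
    for id_, val, eff, end in sorted(rows, key=lambda r: (r[0], r[2])):
        overall[id_] += 1
        per_val[(id_, val)] += 1
        grp = overall[id_] - per_val[(id_, val)]
        key = (id_, val, grp)
        first_eff, last_end = islands.get(key, (eff, None))
        if end is not None:  # like SQL MAX, ignore nulls
            last_end = end if last_end is None else max(last_end, end)
        islands[key] = (min(first_eff, eff), last_end)
    out = [(k[0], k[1], v[0], v[1]) for k, v in islands.items()]
    return sorted(out, key=lambda r: (r[0], r[2]))  # ORDER BY id, eff_date

rows = [
    (10, 100, "2021-01-01", "2021-01-04"),
    (10, 105, "2021-01-05", "2021-01-07"),
    (10, 100, "2021-01-08", "2021-01-10"),
    (10, 100, "2021-01-11", "2021-01-17"),
    (10, 100, "2021-01-18", "2021-01-21"),
    (10, 110, "2021-01-22", None),
]
collapsed = collapse(rows)
for row in collapsed:
    print(row)
```

For this data the grp values are 0, 1, 1, 1, 1 and 5. The 105 row and the second run of 100 rows share grp 1, which is why val has to be part of the grouping key.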
From Oracle 12, you can use MATCH_RECOGNIZE to perform row-by-row processing:
SELECT *
FROM   table_name t
MATCH_RECOGNIZE(
  PARTITION BY id
  ORDER BY eff_date
  MEASURES
    FIRST(val) AS val,
    FIRST(eff_date) AS eff_date,
    LAST(end_date) AS end_date
  PATTERN (same_val+)
  DEFINE same_val AS FIRST(val) = val
)
Which has the same output and is likely to be more efficient.

Fetch record with max number in one column except if date in that column is > than today

I have a problem with fetching few exceptions from DB.
Example, table b:
| sn | v_num | start_date | end_date |
| 1 | 001 | 01-01-2019 | 31-12-2099 |
| 1 | 002 | 01-01-2021 | 31-01-2022 |
| 1 | 003 | 01-02-2022 | 31-12-2099 |
| 2 | 001 | 01-01-2022 | 31-12-2099 |
| 2 | 002 | 01-07-2022 | 31-07-2022 |
| 2 | 003 | 01-08-2022 | 31-12-2099 |
Expected output:
| sn | v_num | start_date | end_date |
| 1 | 003 | 01-02-2022 | 31-12-2099 |
| 2 | 001 | 01-01-2022 | 31-12-2099 |
Currently I'm here:
SELECT * FROM table a, table b
WHERE a.sn = b.sn
AND b.v_num = (SELECT max (v_num) FROM b WHERE a.sn = b.sn)
but obviously that is not good because of a few cases like this with sn = 2.
In conclusion, I need a unique record per sn where v_num is the maximum (this is the case for 95% of them in the DB), except when the start_date of the max v_num record is later than today.
Filter using start_date <= TRUNC(SYSDATE) then use the ROW_NUMBER analytic function:
SELECT *
FROM   (
  SELECT a.*,
         ROW_NUMBER() OVER (PARTITION BY sn ORDER BY v_num DESC) AS rn
  FROM   "TABLE" a
  WHERE  start_date <= TRUNC(SYSDATE)
)
WHERE  rn = 1;
If the start_date has a time component then you can use start_date < TRUNC(SYSDATE) + INTERVAL '1' DAY to get all the values for today from 00:00:00 to 23:59:59.
If you can have ties for the maximum and want to return all the ties then you can use the RANK analytic function instead of ROW_NUMBER.
Which, for the sample data:
CREATE TABLE "TABLE" (sn, v_num, start_date, end_date) AS
SELECT 1, '001', DATE '2019-01-01', DATE '2099-12-31' FROM DUAL UNION ALL
SELECT 1, '002', DATE '2021-01-01', DATE '2022-01-31' FROM DUAL UNION ALL
SELECT 1, '003', DATE '2022-02-01', DATE '2099-12-31' FROM DUAL UNION ALL
SELECT 2, '001', DATE '2022-01-01', DATE '2099-12-31' FROM DUAL UNION ALL
SELECT 2, '002', DATE '2022-07-01', DATE '2022-07-31' FROM DUAL UNION ALL
SELECT 2, '003', DATE '2022-08-01', DATE '2099-12-31' FROM DUAL;
Outputs (when SYSDATE is between 2022-02-01 and 2022-06-30, since the result depends on the current date):
| SN | V_NUM | START_DATE | END_DATE | RN |
| 1 | 003 | 2022-02-01 00:00:00 | 2099-12-31 00:00:00 | 1 |
| 2 | 001 | 2022-01-01 00:00:00 | 2099-12-31 00:00:00 | 1 |
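Because the filter uses SYSDATE, the same query returns different rows on different days. The following Python sketch makes that visible by passing the date in explicitly. It is only an illustration: the function name `latest_started` and the two chosen dates are assumptions, and the rows are the ones from the question.

```python
from datetime import date

def latest_started(rows, today):
    """Per sn, pick the row with the highest v_num among rows already started."""
    best = {}
    for row in rows:
        sn, v_num, start_date, _end_date = row
        if start_date > today:
            continue  # WHERE start_date <= TRUNC(SYSDATE)
        if sn not in best or v_num > best[sn][1]:
            best[sn] = row  # same pick as ROW_NUMBER() ... ORDER BY v_num DESC, rn = 1
    return [best[sn] for sn in sorted(best)]

rows = [
    (1, "001", date(2019, 1, 1), date(2099, 12, 31)),
    (1, "002", date(2021, 1, 1), date(2022, 1, 31)),
    (1, "003", date(2022, 2, 1), date(2099, 12, 31)),
    (2, "001", date(2022, 1, 1), date(2099, 12, 31)),
    (2, "002", date(2022, 7, 1), date(2022, 7, 31)),
    (2, "003", date(2022, 8, 1), date(2099, 12, 31)),
]
# Hypothetical run dates, fixed so the example is reproducible.
in_june = latest_started(rows, today=date(2022, 6, 15))
in_july = latest_started(rows, today=date(2022, 7, 15))
print([(sn, v_num) for sn, v_num, _, _ in in_june])
print([(sn, v_num) for sn, v_num, _, _ in in_july])
```

With the June date, sn 2 resolves to v_num 001 as in the expected output. With the July date it resolves to 002, because that row has started by then.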

How do I find the last occurrence of a given sequence of numbers repetitively?

I have a sequence in my Oracle database for example:
|Event code | Event time |
|41164 | jan-20-2016 |
|41165 | jan-21-2016 |
|41164 | jan-27-2016 |
|41164 | jan-30-2016 |
|41164 | jan-31-2016 |
|41165 | Feb-01-2016 |
|41164 | Feb-03-2016 |
|41164 | Feb-05-2016 |
|41165 | Feb-01-2016 |
I need to return every occurrence of 41164 directly before the next 41165.
How would I do this with a query?
Oracle Setup:
CREATE TABLE Events (Event_code, Event_time ) AS
SELECT 41164, DATE '2016-01-20' FROM DUAL UNION ALL
SELECT 41165, DATE '2016-01-21' FROM DUAL UNION ALL
SELECT 41164, DATE '2016-01-27' FROM DUAL UNION ALL
SELECT 41164, DATE '2016-01-30' FROM DUAL UNION ALL
SELECT 41164, DATE '2016-01-31' FROM DUAL UNION ALL
SELECT 41165, DATE '2016-02-01' FROM DUAL UNION ALL
SELECT 41164, DATE '2016-02-03' FROM DUAL UNION ALL
SELECT 41164, DATE '2016-02-05' FROM DUAL UNION ALL
SELECT 41165, DATE '2016-02-01' FROM DUAL;
Query - Ordered by fetch order:
SELECT Event_Code,
       Event_Time
FROM   (
  SELECT e.*,
         -- ROWID stands in for fetch order; Oracle does not guarantee that it matches insertion order
         LEAD( Event_Code ) OVER ( ORDER BY ROWID ) AS next_code
  FROM   Events e
)
WHERE  Event_Code = 41164
AND    Next_Code = 41165;
Output:
EVENT_CODE EVENT_TIME
---------- -------------------
41164 2016-01-20 00:00:00
41164 2016-01-31 00:00:00
41164 2016-02-05 00:00:00
Query - Ordered by date order:
SELECT Event_Code,
       Event_Time
FROM   (
  SELECT e.*,
         LEAD( Event_Code ) OVER ( ORDER BY Event_Time ) AS next_code
  FROM   Events e
)
WHERE  Event_Code = 41164
AND    Next_Code = 41165;
Output:
EVENT_CODE EVENT_TIME
---------- -------------------
41164 2016-01-20 00:00:00
41164 2016-01-31 00:00:00
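LEAD simply pairs each row with the row that follows it in the chosen order. As a rough sketch of the date-ordered query in Python (the function name `before_next` is hypothetical, and the rows are the ones from the setup above), the same pairing can be written with `zip`:

```python
from datetime import date

def before_next(events, code, next_code):
    """Return rows with `code` whose immediate successor by time has `next_code`."""
    ordered = sorted(events, key=lambda e: e[1])  # ORDER BY Event_Time
    return [
        current
        for current, following in zip(ordered, ordered[1:])  # row and its LEAD row
        if current[0] == code and following[0] == next_code
    ]

events = [
    (41164, date(2016, 1, 20)),
    (41165, date(2016, 1, 21)),
    (41164, date(2016, 1, 27)),
    (41164, date(2016, 1, 30)),
    (41164, date(2016, 1, 31)),
    (41165, date(2016, 2, 1)),
    (41164, date(2016, 2, 3)),
    (41164, date(2016, 2, 5)),
    (41165, date(2016, 2, 1)),
]
matches = before_next(events, 41164, 41165)
for code, when in matches:
    print(code, when)
```

The last 41165 row is dated 2016-02-01, so in date order it sorts before the February 41164 rows. That is why 2016-02-05 appears in the fetch-order result but not in this one.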
As indicated, the requirement is not clear. For example, what should happen if more than these two numbers are present? If both numbers occur on the same date, how should they be treated?
You can start with the SQL below and adapt it:
select event_code
from (
  select event_code,
         lead(event_code) over ( order by event_time ) next_event_code
  from events
)
where event_code < next_event_code;
NOTE: Written from memory, not tested
This has been tested on an Oracle DB. You can run it without creating a table, because the sample rows are built inline from DUAL, and check whether it is what you are looking for. It uses the LEAD analytic function to get the result.
with seq as (
  select 41164 a, 'jan-20-2016' b from dual
  union
  select 41165 a, 'jan-21-2016' b from dual
  union
  select 41164 a, 'jan-27-2016' b from dual
  union
  select 41164 a, 'jan-30-2016' b from dual
  union
  select 41164 a, 'jan-31-2016' b from dual
  union
  select 41165 a, 'Feb-01-2016' b from dual
),
rown as (
  select a, to_date(b, 'mon-dd-yyyy') d, b
  from seq
),
lead as (
  select a, lead(a) over (order by d) c, b
  from rown
)
select a, c, b
from lead
where a = 41164
and c = 41165;
returns
41164 41165 jan-20-2016
41164 41165 jan-31-2016