Oracle SQL - Find origin ID of autoincrement column

There's a table in my ERP database that holds data about certain events. It has the start date, the end date and a column that shows whether the event is a continuation of a previous one (sequential_id references unique_id). Here's an example:
unique_id | start_date | end_date | sequential_id
:-------- | :--------- | :--------- | :------------
001 | 2021-01-01 | 2021-01-15 |
002 | 2021-02-01 | 2021-02-16 | 001
003 | 2021-03-01 | 2021-03-17 | 002
004 | 2021-03-10 | 2021-03-11 |
005 | 2021-03-19 | |
In the example above, rows 001, 002 and 003 are all part of the same event, and 004/005 are unique events, with no sequences. How can I group the data in a way that the output is like this:
origin_id | start_date | end_date
:-------- | :--------- | :---------
001 | 2021-01-01 | 2021-03-17
004 | 2021-03-10 | 2021-03-11
005 | 2021-03-19 |
I've tried using group by, but due to sequential_id being auto incremental, it didn't work.
Thanks in advance.

You can use modern match_recognize which is an optimal solution for such tasks:
Pattern Recognition With MATCH_RECOGNIZE
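select *
from t
match_recognize(
  -- Row order must be explicit for PREV() to be meaningful; this assumes each
  -- continuation row directly follows its predecessor when sorted by unique_id.
  order by unique_id
  measures
    first(unique_id) as start_unique_id,
    first(start_date) as start_date,
    last(end_date) as end_date
  pattern (strt nxt*)
  define nxt as sequential_id = prev(unique_id)
);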

You can use hierarchical query for this:
with a (unique_id, start_date, end_date, sequential_id) as (
select '001', date '2021-01-01', date '2021-01-15', null from dual union all
select '002', date '2021-02-01', date '2021-02-16', '001' from dual union all
select '003', date '2021-03-01', date '2021-03-17', '002' from dual union all
select '004', date '2021-03-10', date '2021-03-11', null from dual union all
select '005', date '2021-03-19', null, null from dual
)
, b as (
select
connect_by_root(unique_id) as unique_id
, connect_by_root(start_date) as start_date
, end_date
, connect_by_isleaf as l
from a
start with sequential_id is null
connect by prior unique_id = sequential_id
)
select
unique_id
, start_date
, end_date
from b
where l = 1
order by 1 asc
UNIQUE_ID | START_DATE | END_DATE
:-------- | :--------- | :--------
001 | 01-JAN-21 | 17-MAR-21
004 | 10-MAR-21 | 11-MAR-21
005 | 19-MAR-21 | null

This is a graph-walking problem, so you can use a recursive CTE:
with cte (unique_id, start_date, end_date, start_unique_id) as (
select unique_id, start_date, end_date, unique_id
from t
where not exists (select 1 from t t2 where t.sequential_id = t2.unique_id)
union all
select t.unique_id, t.start_date, t.end_date, cte.start_unique_id
from cte join
t
on cte.unique_id = t.sequential_id
)
select start_unique_id, min(start_date), max(end_date)
from cte
group by start_Unique_id;
Here is a db<>fiddle.
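To make the graph-walking idea concrete outside the database, here is a minimal Python sketch of the same logic: follow each row's sequential_id back to a row that has no predecessor, then take the earliest start and the latest non-null end per origin. The function name `group_by_origin` and the tuple layout are assumptions made for illustration, and the sketch assumes the links contain no cycles.

```python
from datetime import date

# Sample rows from the question: (unique_id, start_date, end_date, sequential_id)
rows = [
    ("001", date(2021, 1, 1), date(2021, 1, 15), None),
    ("002", date(2021, 2, 1), date(2021, 2, 16), "001"),
    ("003", date(2021, 3, 1), date(2021, 3, 17), "002"),
    ("004", date(2021, 3, 10), date(2021, 3, 11), None),
    ("005", date(2021, 3, 19), None, None),
]


def group_by_origin(rows):
    """Map each origin id to (earliest start_date, latest non-null end_date)."""
    parent = {uid: seq for uid, _start, _end, seq in rows}

    def origin(uid):
        # Walk back while the predecessor exists in the data (assumes no cycles).
        while parent.get(uid) in parent:
            uid = parent[uid]
        return uid

    result = {}
    for uid, start, end, _seq in rows:
        root = origin(uid)
        lo, hi = result.get(root, (start, end))
        lo = min(lo, start)
        if end is not None:
            hi = end if hi is None else max(hi, end)
        result[root] = (lo, hi)
    return result


for origin_id, (start, end) in sorted(group_by_origin(rows).items()):
    print(origin_id, start, end)
```

A row whose sequential_id points at an id that is not in the data is treated as an origin here, which mirrors the `not exists` anchor in the recursive CTE.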

Related

Fetch record with max number in one column except if date in that column is > than today

I have a problem with fetching few exceptions from DB.
Example, table b:
sn | v_num | start_date | end_date
:- | :---- | :--------- | :---------
1 | 001 | 01-01-2019 | 31-12-2099
1 | 002 | 01-01-2021 | 31-01-2022
1 | 003 | 01-02-2022 | 31-12-2099
2 | 001 | 01-01-2022 | 31-12-2099
2 | 002 | 01-07-2022 | 31-07-2022
2 | 003 | 01-08-2022 | 31-12-2099
Expected output:
sn | v_num | start_date | end_date
:- | :---- | :--------- | :---------
1 | 003 | 01-02-2022 | 31-12-2099
2 | 001 | 01-01-2022 | 31-12-2099
Currently I'm here:
SELECT * FROM table a, table b
WHERE a.sn = b.sn
AND b.v_num = (SELECT max (v_num) FROM b WHERE a.sn = b.sn)
but obviously that is not good because of a few cases like this with sn = 2.
In short, I need one record per sn, the one where v_num is max (that covers 95% of them in the DB), except when the start_date of the max v_num record is later than today.
Filter using start_date <= TRUNC(SYSDATE) then use the ROW_NUMBER analytic function:
SELECT *
FROM (
SELECT a.*,
ROW_NUMBER() OVER (PARTITION BY sn ORDER BY v_num DESC) AS rn
FROM "TABLE" a
WHERE start_date <= TRUNC(SYSDATE)
)
WHERE rn = 1;
If the start_date has a time component then you can use start_date < TRUNC(SYSDATE) + INTERVAL '1' DAY to get all the values for today from 00:00:00 to 23:59:59.
If you can have ties for the maximum and want to return all the ties then you can use the RANK analytic function instead of ROW_NUMBER.
Which, for the sample data:
CREATE TABLE "TABLE" (sn, v_num, start_date, end_date) AS
SELECT 1, '001', DATE '2022-01-01', DATE '2099-12-31' FROM DUAL UNION ALL
SELECT 1, '002', DATE '2022-01-01', DATE '2022-01-31' FROM DUAL UNION ALL
SELECT 1, '003', DATE '2022-02-01', DATE '2099-12-31' FROM DUAL UNION ALL
SELECT 2, '001', DATE '2022-01-01', DATE '2099-12-31' FROM DUAL UNION ALL
SELECT 2, '002', DATE '2022-07-01', DATE '2022-07-31' FROM DUAL UNION ALL
SELECT 2, '003', DATE '2022-08-01', DATE '2099-12-31' FROM DUAL;
Outputs:
SN | V_NUM | START_DATE | END_DATE | RN
-: | :---- | :------------------ | :------------------ | -:
1 | 003 | 2022-02-01 00:00:00 | 2099-12-31 00:00:00 | 1
2 | 001 | 2022-01-01 00:00:00 | 2099-12-31 00:00:00 | 1
db<>fiddle here
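The filter-then-rank logic can also be sketched in a few lines of Python, which may help when checking edge cases by hand. This is only an illustration: `current_versions` is a hypothetical helper, the reference date is passed in explicitly instead of using SYSDATE, and v_num is compared as a zero-padded string, as in the sample data.

```python
from datetime import date

# Sample data from the answer: (sn, v_num, start_date, end_date)
versions = [
    (1, "001", date(2022, 1, 1), date(2099, 12, 31)),
    (1, "002", date(2022, 1, 1), date(2022, 1, 31)),
    (1, "003", date(2022, 2, 1), date(2099, 12, 31)),
    (2, "001", date(2022, 1, 1), date(2099, 12, 31)),
    (2, "002", date(2022, 7, 1), date(2022, 7, 31)),
    (2, "003", date(2022, 8, 1), date(2099, 12, 31)),
]


def current_versions(rows, today):
    """Per sn, keep the highest v_num among rows that have already started."""
    best = {}
    for sn, v_num, start, end in rows:
        if start > today:
            continue  # same role as WHERE start_date <= TRUNC(SYSDATE)
        if sn not in best or v_num > best[sn][1]:
            best[sn] = (sn, v_num, start, end)  # same role as rn = 1
    return [best[sn] for sn in sorted(best)]


# Hypothetical reference date, chosen before the later versions of sn = 2 start.
for row in current_versions(versions, date(2022, 6, 15)):
    print(row)
```

Because the filter runs before the ranking, a future-dated version never competes for the top spot, which is the behaviour the question asks for with sn = 2.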

How to do a query on Oracle SQL to get time intervals, grouping by specific fields

I love a good challenge, but this one has been breaking my head for too long. :)
I'm trying to build a query to get dates intervals, grouping the information by one field.
Let me try to explain it in a simple way.
We have this table:
I need to get the intervals a soldier spent on each ranking, so the end result I need to get should be something like this:
As you can see the soldier can be promoted/demoted along the time.
Any suggestion on how to build a query to do this?
THANK YOU!
From Oracle 12, you can use MATCH_RECOGNIZE:
SELECT *
FROM table_name
MATCH_RECOGNIZE (
PARTITION BY id
ORDER BY start_date, end_date
MEASURES
FIRST( name ) AS name,
FIRST( ranking ) AS ranking,
FIRST( start_date ) AS start_date,
LAST( end_Date ) AS end_Date
PATTERN ( same_rank+ )
DEFINE same_rank AS FIRST( ranking ) = ranking
)
Which, for the sample data:
CREATE TABLE table_name ( id, name, ranking, start_date, end_date ) AS
SELECT 1001, 'Jones', 'Lieutenant', DATE '2000-03-20', DATE '2002-08-15' FROM DUAL UNION ALL
SELECT 1001, 'Jones', 'Lieutenant', DATE '2002-08-16', DATE '2003-03-18' FROM DUAL UNION ALL
SELECT 1001, 'Jones', 'Lieutenant', DATE '2003-03-19', DATE '2004-06-01' FROM DUAL UNION ALL
SELECT 1001, 'Jones', 'Lieutenant', DATE '2004-06-02', DATE '2004-10-01' FROM DUAL UNION ALL
SELECT 1001, 'Jones', 'Captain', DATE '2004-10-02', DATE '2005-04-20' FROM DUAL UNION ALL
SELECT 1001, 'Jones', 'Captain', DATE '2005-04-21', DATE '2007-02-20' FROM DUAL UNION ALL
SELECT 1001, 'Jones', 'Major', DATE '2007-02-21', DATE '2008-10-22' FROM DUAL UNION ALL
SELECT 1001, 'Jones', 'Major', DATE '2008-10-23', DATE '2010-01-26' FROM DUAL UNION ALL
SELECT 1001, 'Jones', 'Captain', DATE '2010-01-27', DATE '2013-11-25' FROM DUAL UNION ALL
SELECT 1001, 'Jones', 'Captain', DATE '2013-11-26', DATE '2014-05-11' FROM DUAL UNION ALL
SELECT 1001, 'Jones', 'Major', DATE '2014-05-12', DATE '2016-04-22' FROM DUAL UNION ALL
SELECT 1001, 'Jones', 'General', DATE '2016-04-23', DATE '2020-10-10' FROM DUAL UNION ALL
SELECT 1001, 'Jones', 'General', DATE '2020-10-11', DATE '2020-11-30' FROM DUAL;
Outputs:
ID | NAME | RANKING | START_DATE | END_DATE
---: | :---- | :--------- | :------------------ | :------------------
1001 | Jones | Lieutenant | 2000-03-20 00:00:00 | 2004-10-01 00:00:00
1001 | Jones | Captain | 2004-10-02 00:00:00 | 2007-02-20 00:00:00
1001 | Jones | Major | 2007-02-21 00:00:00 | 2010-01-26 00:00:00
1001 | Jones | Captain | 2010-01-27 00:00:00 | 2014-05-11 00:00:00
1001 | Jones | Major | 2014-05-12 00:00:00 | 2016-04-22 00:00:00
1001 | Jones | General | 2016-04-23 00:00:00 | 2020-11-30 00:00:00
This is a type of gaps-and-islands problem. You want to find groups of adjacent rows with the same ranking, which you can do using lag() to fetch the previous end date for the same ranking and then a cumulative sum to count how many times a new group starts:
select soldier_id, soldier_name, ranking,
       min(start_date), max(end_date)
from (select t.*,
             -- a row continues the island only if it starts the day after
             -- the previous row of the same ranking ended
             sum(case when prev_end_date = start_date - interval '1' day then 0 else 1 end)
                 over (partition by soldier_id order by start_date) as island
      from (select t.*,
                   lag(end_date) over (partition by soldier_id, ranking order by start_date) as prev_end_date
            from t
           ) t
     ) t
group by soldier_id, soldier_name, ranking, island;
Note: This assumes that the soldier_name does not change over time for a given soldier. If that is something you need to deal with, then ask a new question with appropriate sample data and desired results.
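The flag-plus-running-sum technique is easy to trace in a small Python sketch. It is a simplified illustration, not a translation of the query: `rank_intervals` is a hypothetical helper, and it compares each row with the immediately preceding row of the same soldier, whereas the SQL uses lag() partitioned by ranking. The rows are a subset of the sample data shown earlier.

```python
from datetime import date, timedelta
from itertools import accumulate, groupby

# (soldier_id, ranking, start_date, end_date), a subset of the sample data
rows = [
    (1001, "Lieutenant", date(2004, 6, 2), date(2004, 10, 1)),
    (1001, "Captain", date(2004, 10, 2), date(2005, 4, 20)),
    (1001, "Captain", date(2005, 4, 21), date(2007, 2, 20)),
    (1001, "Major", date(2007, 2, 21), date(2008, 10, 22)),
    (1001, "Major", date(2008, 10, 23), date(2010, 1, 26)),
    (1001, "Captain", date(2010, 1, 27), date(2013, 11, 25)),
]


def rank_intervals(rows):
    rows = sorted(rows, key=lambda r: (r[0], r[2]))
    flags = []
    for i, (sid, ranking, start, _end) in enumerate(rows):
        prev = rows[i - 1] if i > 0 else None
        continues = (
            prev is not None
            and prev[0] == sid
            and prev[1] == ranking
            and prev[3] + timedelta(days=1) == start
        )
        flags.append(0 if continues else 1)  # 1 marks the start of a new island
    islands = list(accumulate(flags))  # running sum gives each island an id
    result = []
    for _island, grp in groupby(zip(islands, rows), key=lambda p: p[0]):
        members = [r for _, r in grp]
        result.append((
            members[0][0],
            members[0][1],
            min(r[2] for r in members),
            max(r[3] for r in members),
        ))
    return result


for interval in rank_intervals(rows):
    print(interval)
```

The second Captain stint gets its own island id because the row before it is a Major row, so the flag is 1 and the running sum increases.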

Create time intervals based on values in one column / SQL Oracle

I need to create query that will return time intervals from table, that has attributes for (almost) every day.
The original table looks like the following:
Person | Date | Date_Type
-------|------------|----------
Sam | 01.06.2020 | Vacation
Sam | 02.06.2020 | Vacation
Sam | 03.06.2020 | Work
Sam | 04.06.2020 | Work
Sam | 05.06.2020 | Work
Frodo | 01.06.2020 | Work
Frodo | 02.06.2020 | Work
.....
And the desired should look like:
Person | Date_Interval | Date_Type
-------|-----------------------|----------
Sam | 01.06.2020-02.06.2020 | Vacation
Sam | 03.06.2020-05.06.2020 | Work
Frodo | 01.06.2020-02.06.2020 | Work
.....
Will be grateful for any idea :)
This reads like a gaps-and-islands problem. Here is one approach:
-- DATE is a reserved word in Oracle, so the column name has to be quoted
select person, min("DATE") startdate, max("DATE") enddate, date_type
from (
    select t.*,
        row_number() over(partition by person order by "DATE") rn1,
        row_number() over(partition by person, date_type order by "DATE") rn2
    from mytable t
) t
group by person, date_type, rn1 - rn2
This also works if not all dates are contiguous (since you stated that you have almost all dates, I understood you don't have them all).
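To see why grouping by rn1 - rn2 isolates each island, the two row numbers can be computed by hand. The Python sketch below does that for the question's sample rows; `islands` is a hypothetical helper name, and the tuples stand in for table rows.

```python
from collections import defaultdict
from datetime import date

# (person, date, date_type) rows from the question
rows = [
    ("Sam", date(2020, 6, 1), "Vacation"),
    ("Sam", date(2020, 6, 2), "Vacation"),
    ("Sam", date(2020, 6, 3), "Work"),
    ("Sam", date(2020, 6, 4), "Work"),
    ("Sam", date(2020, 6, 5), "Work"),
    ("Frodo", date(2020, 6, 1), "Work"),
    ("Frodo", date(2020, 6, 2), "Work"),
]


def islands(rows):
    rn1 = defaultdict(int)  # row number per person
    rn2 = defaultdict(int)  # row number per (person, date_type)
    groups = {}
    for person, day, date_type in sorted(rows, key=lambda r: (r[0], r[1])):
        rn1[person] += 1
        rn2[(person, date_type)] += 1
        # The difference stays constant while date_type does not change.
        key = (person, date_type, rn1[person] - rn2[(person, date_type)])
        lo, hi = groups.get(key, (day, day))
        groups[key] = (min(lo, day), max(hi, day))
    return sorted((p, lo, hi, t) for (p, t, _diff), (lo, hi) in groups.items())


for row in islands(rows):
    print(row)
```

For Sam the differences are 0, 0, 2, 2, 2, so the two Vacation days and the three Work days fall into separate groups.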
This is a type of gaps-and-islands problem.
To get adjacent days with the same date_type, you can subtract a sequence. It will be constant for adjacent days. Then you can aggregate:
-- DATE is a reserved word in Oracle, so the column name has to be quoted
select person, date_type, min("DATE"), max("DATE")
from (select t.*,
             row_number() over (partition by person, date_type
                                order by "DATE") as seqnum
      from t
     ) t
group by person, date_type, ("DATE" - seqnum);
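A short Python sketch shows what subtracting the sequence number does when a day is missing. The helper name `contiguous_islands` and the three-row input are made up for illustration; the input deliberately skips 2020-06-03.

```python
from collections import defaultdict
from datetime import date, timedelta

# Hypothetical rows with a gap: (person, date, date_type)
rows = [
    ("Sam", date(2020, 6, 1), "Work"),
    ("Sam", date(2020, 6, 2), "Work"),
    ("Sam", date(2020, 6, 4), "Work"),
]


def contiguous_islands(rows):
    seqnum = defaultdict(int)
    groups = {}
    for person, day, date_type in sorted(rows, key=lambda r: (r[0], r[2], r[1])):
        seqnum[(person, date_type)] += 1
        # date - seqnum is the same for consecutive days and shifts after a gap
        anchor = day - timedelta(days=seqnum[(person, date_type)])
        key = (person, date_type, anchor)
        lo, hi = groups.get(key, (day, day))
        groups[key] = (min(lo, day), max(hi, day))
    return sorted((p, t, lo, hi) for (p, t, _anchor), (lo, hi) in groups.items())


for row in contiguous_islands(rows):
    print(row)
```

The first two days share the anchor 2020-05-31 and the third gets 2020-06-01, so a missing day splits the interval in two.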
One of the simplest methods is to use MATCH_RECOGNIZE to perform a row-by-row comparison and aggregation:
SELECT *
FROM table_name
MATCH_RECOGNIZE (
  PARTITION BY Person
  ORDER BY "DATE"
  MEASURES
    FIRST( "DATE" ) AS start_date,
    LAST( "DATE" ) AS end_date,
    FIRST( Date_Type ) AS date_type
  ONE ROW PER MATCH
  -- last_date has no definition, so it matches any row; it captures the final
  -- row of each interval, which has no matching successor
  PATTERN ( successive_dates* last_date )
  DEFINE
    successive_dates AS (
      Date_Type = NEXT( Date_Type )
      AND "DATE" + INTERVAL '1' DAY = NEXT( "DATE" )
    )
);
Which, for the sample data:
CREATE TABLE table_name ( Person, "DATE", Date_Type ) AS
SELECT 'Sam', DATE '2020-06-01', 'Vacation' FROM DUAL UNION ALL
SELECT 'Sam', DATE '2020-06-02', 'Vacation' FROM DUAL UNION ALL
SELECT 'Sam', DATE '2020-06-03', 'Work' FROM DUAL UNION ALL
SELECT 'Sam', DATE '2020-06-04', 'Work' FROM DUAL UNION ALL
SELECT 'Sam', DATE '2020-06-05', 'Work' FROM DUAL UNION ALL
SELECT 'Frodo', DATE '2020-06-01', 'Work' FROM DUAL UNION ALL
SELECT 'Frodo', DATE '2020-06-02', 'Work' FROM DUAL;
Outputs:
PERSON | START_DATE | END_DATE | DATE_TYPE
:----- | :------------------ | :------------------ | :--------
Frodo | 2020-06-01 00:00:00 | 2020-06-02 00:00:00 | Work
Sam | 2020-06-01 00:00:00 | 2020-06-02 00:00:00 | Vacation
Sam | 2020-06-03 00:00:00 | 2020-06-05 00:00:00 | Work

How to join two tables to determine date ranges when one table contains (id, start_date) and another contains (id, end_date)

I'm new to SQL, so I hope you don't find this silly. I'm working with two tables: one contains start dates and the other contains end dates. Entries are not in sequence and may contain duplicates.
**TABLE 1**
id start_date
1 2019-04-23
1 2019-06-05
1 2019-06-05
1 2019-10-29
1 2019-12-16
2 2019-01-05
3 2020-02-01
**TABLE 2**
id end_date
1 2019-04-23
1 2019-06-05
1 2019-06-06
1 2019-06-06
1 2019-07-24
1 2019-10-16
2 2020-01-04
**EXPECTED OUTPUT**
id start_date end_date
1 2019-04-23 2019-06-05
1 2019-10-29 null
2 2019-01-05 2020-01-04
3 2020-02-01 null
You can use union all and aggregation with some window functions:
with table1 as (
select 1 as id, date('2019-04-23') as start_date union all
select 1, '2019-06-05' union all
select 1, '2019-06-05' union all
select 1, '2019-10-29' union all
select 1, '2019-12-16' union all
select 2, '2019-01-05' union all
select 3, '2020-02-01'
),
table2 as (
SELECT 1 as id, DATE('2019-04-23') as end_date union all
SELECT 1, '2019-06-05' union all
select 1, '2019-06-06' union all
select 1, '2019-06-06' union all
select 1, '2019-07-24' union all
select 1, '2019-10-16' union all
select 2, '2020-01-04'
)
select id, min(start_date), end_date
from (select id, start_date,
first_value(end_date ignore nulls) over (partition by id order by DATE_DIFF(coalesce(start_date, end_date), CURRENT_DATE, day) RANGE between 1 following and unbounded following) as end_date
from ((select id, start_date, null as end_date
from table1
) union all
(select id, null as start_date, end_date
from table2
)
) se
)
group by id, end_date
having min(start_date) is not null;
Why do you have multiple records with the same id? (I am assuming id is a primary key.) My suggestion would be to make the ids unique, create a foreign key constraint in the end dates table (since there can't be an end date without a start date), and use the foreign key relationship to retrieve the desired results, e.g. SELECT S.start_date, E.end_date FROM table1 S JOIN table2 E ON S.id = E.table1_fk
Below is for BigQuery Standard SQL
#standardSQL
SELECT id, start_date, IF(end_date = '9999-01-01', NULL, end_date) end_date
FROM (
SELECT id, start_date, ARRAY_AGG(end_date ORDER BY end_date LIMIT 1)[OFFSET(0)] end_date
FROM (
SELECT id, start_date, IF(start_date < end_date, end_date, '9999-01-01') end_date
FROM `project.dataset.table1`
LEFT JOIN `project.dataset.table2`
USING (id)
)
GROUP BY id, start_date
)
Applied to the sample data from your question, the result is:
Row id start_date end_date
1 1 2019-04-23 2019-06-05
2 1 2019-06-05 2019-06-06
3 1 2019-10-29 null
4 1 2019-12-16 null
5 2 2019-01-05 2020-01-04
6 3 2020-02-01 null
Note: this is quick and not optimized, but it appears to produce the desired result.

SQL find effective price of the products based on the date

I have a table with four columns: id, validFrom, validTo and price.
This table contains the price of an article and the duration when that price is effective.
| id| validFrom | validTo | price
|---|-----------|-----------|---------
| 1 | 01-01-17 | 10-01-17 | 30000
| 1 | 04-01-17 | 09-01-17 | 20000
Now, for these inputs in my table, my query output should be:
| id| validFrom | validTo | price
|---|-----------|----------|-------
| 1 | 01-01-17 | 03-01-17 | 30000
| 1 | 04-01-17 | 09-01-17 | 20000
| 1 | 10-01-17 | 10-01-17 | 30000
I can compare the dates and check whether products with the same id have overlapping dates, but I have no idea how to split those ranges into non-overlapping ones. Also, I am not allowed to use PL/SQL.
Is this possible using only SQL?
Oracle Setup:
CREATE TABLE prices ( id, validFrom, validTo, price ) AS
SELECT 1, DATE '2017-01-01', DATE '2017-01-10', 30000 FROM DUAL UNION ALL
SELECT 1, DATE '2017-01-04', DATE '2017-01-09', 20000 FROM DUAL UNION ALL
SELECT 1, DATE '2017-01-11', DATE '2017-01-15', 10000 FROM DUAL UNION ALL
SELECT 1, DATE '2017-01-16', DATE '2017-01-18', 15000 FROM DUAL UNION ALL
SELECT 1, DATE '2017-01-17', DATE '2017-01-20', 40000 FROM DUAL UNION ALL
SELECT 1, DATE '2017-01-21', DATE '2017-01-24', 28000 FROM DUAL UNION ALL
SELECT 1, DATE '2017-01-23', DATE '2017-01-26', 23000 FROM DUAL UNION ALL
SELECT 1, DATE '2017-01-26', DATE '2017-01-26', 17000 FROM DUAL;
Query:
WITH daily_prices ( id, dt, price, duration ) AS (
-- Unroll the price ranges to individual days
SELECT id,
d.COLUMN_VALUE,
price,
validTo - validFrom
FROM prices p,
TABLE(
CAST(
MULTISET(
SELECT p.validFrom + LEVEL - 1
FROM DUAL
CONNECT BY p.validFrom + LEVEL - 1 <= p.validTo
)
AS SYS.ODCIDATELIST
)
) d
),
min_daily_prices ( id, dt, price ) AS (
-- Where a day falls between multiple ranges group them so the price
-- is for the shortest duration offer and if there are two equally short
-- durations then take the minimum price
SELECT id,
dt,
MIN( price ) KEEP ( DENSE_RANK FIRST ORDER BY duration )
FROM daily_prices
GROUP BY id, dt
),
group_changes ( id, dt, price, has_changed_group ) AS (
-- Find when the price changes or a day is skipped which means a new price
-- group is beginning
SELECT id,
dt,
price,
CASE WHEN dt = LAG( dt ) OVER ( PARTITION BY id ORDER BY dt ) + 1
AND price = LAG( price ) OVER ( PARTITION BY id ORDER BY dt )
THEN 0
ELSE 1
END
FROM min_daily_prices
),
groups ( id, dt, price, grp ) AS (
-- Calculate unique indexes (per id) for each group of price ranges
SELECT id,
dt,
price,
SUM( has_changed_group ) OVER ( PARTITION BY id ORDER BY dt )
FROM group_changes
)
SELECT id,
MIN( dt ) AS validFrom,
MAX( dt ) AS validTo,
MIN( price ) AS price
FROM groups
GROUP BY id, grp
ORDER BY id, validFrom;
Output:
ID VALIDFROM VALIDTO PRICE
---------- -------------------- -------------------- ----------
1 01-JAN-2017 00:00:00 03-JAN-2017 00:00:00 30000
1 04-JAN-2017 00:00:00 09-JAN-2017 00:00:00 20000
1 10-JAN-2017 00:00:00 10-JAN-2017 00:00:00 30000
1 11-JAN-2017 00:00:00 15-JAN-2017 00:00:00 10000
1 16-JAN-2017 00:00:00 18-JAN-2017 00:00:00 15000
1 19-JAN-2017 00:00:00 20-JAN-2017 00:00:00 40000
1 21-JAN-2017 00:00:00 22-JAN-2017 00:00:00 28000
1 23-JAN-2017 00:00:00 25-JAN-2017 00:00:00 23000
1 26-JAN-2017 00:00:00 26-JAN-2017 00:00:00 17000
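The three stages of the query (unroll ranges into days, pick one price per day, merge adjacent days with the same price) can be traced with a small Python sketch. It is an illustration for a single id, not a replacement for the SQL; `effective_prices` is a hypothetical helper, and the input is the two-row example from the question.

```python
from datetime import date, timedelta

# (validFrom, validTo, price) for one id, taken from the question
ranges = [
    (date(2017, 1, 1), date(2017, 1, 10), 30000),
    (date(2017, 1, 4), date(2017, 1, 9), 20000),
]


def effective_prices(ranges):
    # 1. Unroll each range into days; 2. per day keep the price of the
    #    shortest range, breaking ties by the lowest price.
    daily = {}
    for start, end, price in ranges:
        candidate = ((end - start).days, price)
        day = start
        while day <= end:
            if day not in daily or candidate < daily[day]:
                daily[day] = candidate
            day += timedelta(days=1)

    # 3. Merge consecutive days that carry the same price.
    merged = []
    for day in sorted(daily):
        price = daily[day][1]
        if merged and merged[-1][2] == price and merged[-1][1] + timedelta(days=1) == day:
            merged[-1][1] = day
        else:
            merged.append([day, day, price])
    return [tuple(m) for m in merged]


for valid_from, valid_to, price in effective_prices(ranges):
    print(valid_from, valid_to, price)
```

Comparing `(duration, price)` tuples plays the same role as `MIN( price ) KEEP ( DENSE_RANK FIRST ORDER BY duration )` in the query: the shortest range wins, and the lower price wins among equally short ranges.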