I got an SQL problem I'm not capable to solve.
First of all, an SQL fiddle with it: http://sqlfiddle.com/#!4/fe7b07/2
As you see, I fill the table with some dates, which are bound to some ID. Those dates are day by day. So for this example, we'd have something like this, if we only look at January:
The timelines spanning from 2020-01-01 to 2020-01-31, the blocks are the dates in the database. So this would be the simple SELECT * FROM days output.
What I now want is to fill in some days to this output. These would span from timeline_begin to MIN(date_from); and from MAX(date_from) to timeline_end.
I'll mark these red in the following picture:
The orange span is not necessary to be added, too, but if your solution would do that too, that would be also ok.
Ok, so far so good.
For this I created the SELECT * FROM minmax, which will select the MIN(date_from) and MAX(date_from) for every id_othertable. Still no magic involved.
What I struggle is now creating those days for every id_othertable, while also joining the data they have on them (in this fiddle, it's just the some_info field).
I tried to write this in the SELECT * FROM days_before query, but I just can't get it to work. I read about the magical function CONNECT BY, which will on its own create dates line by line, but I can't get to join my data from the former table. Every time I join the info, I only get one line per id_othertable, not all those dates I need.
So the ideal solution I'm looking for would be to have three select queries:
SELECT * FROM days which select dates out of the database
SELECT * FROM days_before which will show the dates before MIN(date_from) of query 1
SELECT * FROM days_after for dates after MAX(date_from) of query 1
And in the end I'd UNION those three queries to have them all combined.
I hope I could explain my problem good enough. If you need any information or further explaining, please don't hesitate to ask.
EDIT 1: I created a pastebin with some example data: https://pastebin.com/jskrStpZ
Bear in mind that only the first query has actual information from the database, the other two have created data. Also, this example output only has data for id_othertable = 1, so the actual query should also have the information for id_othertable = 2, 3.
EDIT 2: just for clarification, the field date_to is just a simple date_from + 1 day.
If you have denormalised date it's quite simple:
with bas as (
select 1 id_other_table, to_date('2020-01-05', 'YYYY-MM-DD') date_from, to_date('2020-01-06', 'YYYY-MM-DD') date_to, 'hello' some_info from dual
union all select 1 id_other_table, to_date('2020-01-06', 'YYYY-MM-DD') date_from, to_date('2020-01-07', 'YYYY-MM-DD') date_to, 'hello' some_info from dual
union all select 1 id_other_table, to_date('2020-01-07', 'YYYY-MM-DD') date_from, to_date('2020-01-08', 'YYYY-MM-DD') date_to, 'hello' some_info from dual
union all select 1 id_other_table, to_date('2020-01-10', 'YYYY-MM-DD') date_from, to_date('2020-01-11', 'YYYY-MM-DD') date_to, 'hello' some_info from dual
union all select 1 id_other_table, to_date('2020-01-11', 'YYYY-MM-DD') date_from, to_date('2020-01-12', 'YYYY-MM-DD') date_to, 'hello' some_info from dual
union all select 1 id_other_table, to_date('2020-01-12', 'YYYY-MM-DD') date_from, to_date('2020-01-13', 'YYYY-MM-DD') date_to, 'hello' some_info from dual
union all select 2 id_other_table, to_date('2020-01-10', 'YYYY-MM-DD') date_from, to_date('2020-01-11', 'YYYY-MM-DD') date_to, 'my' some_info from dual
union all select 2 id_other_table, to_date('2020-01-11', 'YYYY-MM-DD') date_from, to_date('2020-01-12', 'YYYY-MM-DD') date_to, 'my' some_info from dual
union all select 2 id_other_table, to_date('2020-01-12', 'YYYY-MM-DD') date_from, to_date('2020-01-13', 'YYYY-MM-DD') date_to, 'my' some_info from dual
union all select 3 id_other_table, to_date('2020-01-20', 'YYYY-MM-DD') date_from, to_date('2020-01-21', 'YYYY-MM-DD') date_to, 'friend' some_info from dual
union all select 3 id_other_table, to_date('2020-01-21', 'YYYY-MM-DD') date_from, to_date('2020-01-22', 'YYYY-MM-DD') date_to, 'friend' some_info from dual
union all select 3 id_other_table, to_date('2020-01-22', 'YYYY-MM-DD') date_from, to_date('2020-01-23', 'YYYY-MM-DD') date_to, 'friend' some_info from dual)
, ad as (select trunc(sysdate,'YYYY') -1 + level all_dates from dual connect by level <= 31)
select distinct some_info,all_dates from bas,ad where (some_info,all_dates) not in (select some_info,date_from from bas)
If you have longer date ranges or mind of the time the query needs another solution is helpful. But that is harder to debug. Because it's quite hard to get the orange time slot
If you want the dates per id that are not in the database then you can use the LEAD analytic function:
WITH dates ( id, date_from, date_to ) AS (
SELECT id_othertable,
DATE '2020-01-01',
MIN( date_from )
FROM some_dates
WHERE date_to > DATE '2020-01-01'
AND date_from < ADD_MONTHS( DATE '2020-01-01', 1 )
GROUP BY id_othertable
UNION ALL
SELECT id_othertable,
date_to,
LEAD( date_from, 1, ADD_MONTHS( DATE '2020-01-01', 1 ) )
OVER ( PARTITION BY id_othertable ORDER BY date_from )
FROM some_dates
WHERE date_to > DATE '2020-01-01'
AND date_from < ADD_MONTHS( DATE '2020-01-01', 1 )
)
SELECT id,
date_from,
date_to
FROM dates
WHERE date_from < date_to
ORDER BY id, date_from;
so for the test data:
CREATE TABLE some_dates ( id_othertable, date_from, date_to, some_info ) AS
SELECT 1, DATE '2020-01-05', DATE '2020-01-06', 'hello1' FROM DUAL UNION ALL
SELECT 1, DATE '2020-01-06', DATE '2020-01-07', 'hello2' FROM DUAL UNION ALL
SELECT 1, DATE '2020-01-07', DATE '2020-01-08', 'hello3' FROM DUAL UNION ALL
SELECT 1, DATE '2020-01-10', DATE '2020-01-13', 'hello4' FROM DUAL UNION ALL
SELECT 2, DATE '2020-01-10', DATE '2020-01-13', 'my' FROM DUAL UNION ALL
SELECT 3, DATE '2020-01-20', DATE '2020-01-23', 'friend' FROM DUAL UNION ALL
SELECT 4, DATE '2019-12-31', DATE '2020-01-05', 'before' FROM DUAL UNION ALL
SELECT 4, DATE '2020-01-30', DATE '2020-02-02', 'after' FROM DUAL UNION ALL
SELECT 5, DATE '2019-12-31', DATE '2020-01-10', 'only_before' FROM DUAL UNION ALL
SELECT 6, DATE '2020-01-15', DATE '2020-02-01', 'only_after' FROM DUAL UNION ALL
SELECT 7, DATE '2019-12-31', DATE '2020-02-01', 'exlude_all' FROM DUAL;
this outputs:
ID | DATE_FROM | DATE_TO
-: | :--------- | :---------
1 | 2020-01-01 | 2020-01-05
1 | 2020-01-08 | 2020-01-10
1 | 2020-01-13 | 2020-02-01
2 | 2020-01-01 | 2020-01-10
2 | 2020-01-13 | 2020-02-01
3 | 2020-01-01 | 2020-01-20
3 | 2020-01-23 | 2020-02-01
4 | 2020-01-05 | 2020-01-30
5 | 2020-01-10 | 2020-02-01
6 | 2020-01-01 | 2020-01-15
db<>fiddle here
If you want the days before then filter on:
WHERE day_from = DATE '2020-01-01'
and, similarly, if you want the days after then filter on:
WHERE day_to = ADD_MONTHS( DATE '2020-01-01', 1 )
If you want to specify the start date and number of months duration then use named bind parameters:
WITH dates ( id, date_from, date_to ) AS (
SELECT id_othertable,
:start_date,
MIN( date_from )
FROM some_dates
WHERE date_to > :start_date
AND date_from < ADD_MONTHS( :start_date, :number_months )
GROUP BY id_othertable
UNION ALL
SELECT id_othertable,
date_to,
LEAD( date_from, 1, ADD_MONTHS( :start_date, :number_months ) )
OVER ( PARTITION BY id_othertable ORDER BY date_from )
FROM some_dates
WHERE date_to > :start_date
AND date_from < ADD_MONTHS( :start_date, :number_months )
)
SELECT id,
date_from,
date_to
FROM dates
WHERE date_from < date_to
ORDER BY id, date_from;
Select whole range using connect by generator. Join your table partitioned by id.
select date_from, nvl(date_to, date_from +1) date_to, id_othertable, some_info
from (
select date '2020-01-01' + level - 1 as date_from
from dual
connect by level <= date '2020-01-31' - date '2020-01-01' ) gen
natural left join some_dates partition by (id_othertable)
sqlfiddle
Related
My original query:
SELECT desc, start_date
FROM foo.bar
WHERE desc LIKE 'Fall%' AND desc NOT LIKE '%Med%'
UNION
SELECT desc, end_date
FROM foo.bar
WHERE desc LIKE 'Spring%' AND desc NOT LIKE '%Med%'
ORDER BY start_date;
With this query, I get (roughly) the data set I am looking for. I now need to take that data and combine the results taking two at a time in order and then produce a result like:
DESC
START_DATE
END_DATE
Fall 1971 - Spring 1972
15-AUG-71
15-MAY-72
Fall 1971 - Spring 1972
15-AUG-72
15-MAY-73
Where DESC is a concatenation of the DESC form row 1 and 2, START_DATE is the date from row 1 and END_DATE is the date from row 2. Following this same pattern for the entire data set.
Any help with a query that will produce the result I need is greatly appreciated. Not sure if I'm heading down the right path or if that originally query is just wrong.
As stated above, I tried the supplied query, which gives me the data I need. However, I've been unsuccessful in finding a way to format it into my desired output. It should also be noted that I am running this on an Oracle database.
Instead of union, use each of those queries as CTEs (with a slight modification - include row number you'll later use in JOIN):
Sample data:
SQL> with test (description, datum) as
2 (select 'Fall 1971' , date '1971-08-15' from dual union all
3 select 'Spring 1972', date '1972-05-15' from dual union all
4 select 'Fall 1972' , date '1972-08-15' from dual union all
5 select 'Spring 1973', date '1973-05-15' from dual union all
6 select 'Fall 1973' , date '1973-08-15' from dual union all
7 select 'Spring 1974', date '1974-05-15' from dual union all
8 select 'Fall 1974' , date '1974-08-15' from dual union all
9 select 'Spring 1975', date '1975-05-15' from dual
10 ),
Query begins here: t_start and t_end represent your current queries
11 t_start as
12 (select description, datum,
13 row_number() Over (order by datum) rn
14 from test
15 where description like 'Fall%' and description not like '%Med%'
16 ),
17 t_end as
18 (select description, datum,
19 row_number() Over (order by datum) rn
20 from test
21 where description like 'Spring%' and description not like '%Med%'
22 )
Finally:
23 select s.description ||' - '|| e.description as description,
24 s.datum start_date,
25 e.datum end_date
26 from t_start s join t_end e on s.rn = e.rn
27 order by s.rn;
DESCRIPTION START_DAT END_DATE
------------------------- --------- ---------
Fall 1971 - Spring 1972 15-AUG-71 15-MAY-72
Fall 1972 - Spring 1973 15-AUG-72 15-MAY-73
Fall 1973 - Spring 1974 15-AUG-73 15-MAY-74
Fall 1974 - Spring 1975 15-AUG-74 15-MAY-75
SQL>
You can also use the MODEL clause to avoid to scan the table twice:
with data(description,datum) as (
select 'Fall 1971' , date '1971-08-15' from dual union all
select 'Spring 1972', date '1972-05-15' from dual union all
select 'Fall 1972' , date '1972-08-15' from dual union all
select 'Spring 1973', date '1973-05-15' from dual union all
select 'Fall 1973' , date '1973-08-15' from dual union all
select 'Spring 1974', date '1974-05-15' from dual union all
select 'Fall 1974' , date '1974-08-15' from dual union all
select 'Spring 1975', date '1975-05-15' from dual
)
select description, start_date, end_date
from (
select rn, desc1 as description, start_date, end_date
from (
select row_number() over(order by datum) as rn, description, datum
from data
where description not like '%Med%'
)
model
dimension by (rn)
measures (
cast(' ' as varchar2(256)) as desc1, description, cast(NULL as DATE) start_date, cast(NULL as DATE) end_date , datum
)
rules (
desc1[mod(rn,2)=1] = description[cv()] || ' - ' || description[cv()+1],
start_date[mod(rn,2)=1] = datum[cv()],
end_date[mod(rn,2)=1] = datum[cv()+1]
)
)
where mod(rn,2)=1
;
I have a table like this.
Date
Enddate
20012022
21012022
21012022
23012022
23012022
24012022
20012022
26012022
26012022
27012022
27012022
27012022
The next date entry is equal to the last one enddate. How do I find lines that don't follow this rule? In the example, line 4 (previus enddate 24012022 - next date 20012022).
I tried use
lag()
I can't understand how it works... Thanks for helping..
Here's one option.
Sample data:
SQL> with test (datum, enddatum) as
2 (select date '2022-01-20', date '2022-01-21' from dual union all
3 select date '2022-01-21', date '2022-01-23' from dual union all
4 select date '2022-01-23', date '2022-01-24' from dual union all
5 select date '2022-01-20', date '2022-12-26' from dual union all
6 select date '2022-12-26', date '2022-12-27' from dual union all
7 select date '2022-12-27', date '2022-12-27' from dual
8 ),
Query begins here: find previous enddatum so that you could compare it to datum (line #17):
9 temp as
10 (select datum,
11 enddatum,
12 lag(enddatum) over (order by enddatum) previous_enddatum
13 from test
14 )
15 select datum, enddatum
16 from temp
17 where datum <> previous_enddatum;
DATUM ENDDATUM
---------- ----------
20.01.2022 26.12.2022
SQL>
The LAG() function's result depends on query partition clause and order by clause. Here are two codes giving different results if ordered by Start or End date:
Your sample data:
WITH
tbl (START_DATE, END_DATE) as
( Select DATE '2022-01-20', DATE '2022-01-21' From dual Union All
Select DATE '2022-01-21', DATE '2022-01-23' From dual Union All
Select DATE '2022-01-23', DATE '2022-01-24' From dual Union All
Select DATE '2022-01-20', DATE '2022-12-26' From dual Union All
Select DATE '2022-12-26', DATE '2022-12-27' From dual Union All
Select DATE '2022-12-27', DATE '2022-12-27' From dual
)
Using Order By END_DATE:
Select START_DATE, END_DATE,
CASE
WHEN START_DATE != LAG(END_DATE) OVER(ORDER BY END_DATE)
THEN 'Should be ' || LAG(END_DATE) OVER(ORDER BY END_DATE)
END "END_DATE_CHECK"
From tbl
START_DATE END_DATE END_DATE_CHECK
---------- --------- -------------------
20-JAN-22 21-JAN-22
21-JAN-22 23-JAN-22
23-JAN-22 24-JAN-22
20-JAN-22 26-DEC-22 Should be 24-JAN-22
26-DEC-22 27-DEC-22
27-DEC-22 27-DEC-22
Using Order By START_DATE
Select START_DATE, END_DATE,
CASE
WHEN START_DATE != LAG(END_DATE) OVER(ORDER BY START_DATE)
THEN 'Should be ' || LAG(END_DATE) OVER(ORDER BY START_DATE)
END "END_DATE_CHECK"
From tbl
START_DATE END_DATE END_DATE_CHECK
---------- --------- -------------------
20-JAN-22 21-JAN-22
20-JAN-22 26-DEC-22 Should be 21-JAN-22
21-JAN-22 23-JAN-22 Should be 26-DEC-22
23-JAN-22 24-JAN-22
26-DEC-22 27-DEC-22 Should be 24-JAN-22
27-DEC-22 27-DEC-22
It looks like there is something missing in your sample data (some ID column maybe). Let's say that there is some column the dates belong to and that we could partition the dates by that column like below. There is no checking problems at all:
3. Using Partition By
WITH
tbl (ID, START_DATE, END_DATE) as
( Select 1, DATE '2022-01-20', DATE '2022-01-21' From dual Union All
Select 1, DATE '2022-01-21', DATE '2022-01-23' From dual Union All
Select 1, DATE '2022-01-23', DATE '2022-01-24' From dual Union All
Select 2, DATE '2022-01-20', DATE '2022-12-26' From dual Union All
Select 2, DATE '2022-12-26', DATE '2022-12-27' From dual Union All
Select 2, DATE '2022-12-27', DATE '2022-12-27' From dual
)
Select ID, START_DATE, END_DATE,
CASE
WHEN START_DATE != LAG(END_DATE) OVER(Partition By ID ORDER BY START_DATE)
THEN 'Should be ' || LAG(END_DATE) OVER(Partition By ID ORDER BY START_DATE)
END "END_DATE_CHECK"
From tbl
ID START_DATE END_DATE END_DATE_CHECK
---------- ---------- --------- -------------------
1 20-JAN-22 21-JAN-22
1 21-JAN-22 23-JAN-22
1 23-JAN-22 24-JAN-22
2 20-JAN-22 26-DEC-22
2 26-DEC-22 27-DEC-22
2 27-DEC-22 27-DEC-22
In this case there is no difference using Start or End date ordering... More about LAG() OVER() here.
I need SELECT for finding data with overlapping date in Oracle SQL just from today to exactly one year ago. ID_FORMULAR is not UNIQUE value and I need to include just data with overlapping date where ID_FORMULAR is UNIQUE.
My code:
SELECT T1.*
FROM VISITORS T1, VISITORS T2
WHERE ( T1.ID_FORMULAR != T2.ID_FORMULAR
AND t1.FROM_DATE >= t2.FROM_DATE
AND t1.FROM_DATE <= t2.TO_DATE
AND T1.CREATED_DATE >= ADD_MONTHS (TRUNC (CURRENT_DATE), -12)
AND T1.CREATED_DATE < TRUNC (CURRENT_DATE) + 1)
OR ( T1.ID_FORMULAR != T2.ID_FORMULAR
AND t1.TO_DATE >= t2.FROM_DATE
AND t1.TO_DATE <= t2.TO_DATE
AND T1.CREATED_DATE >= ADD_MONTHS (TRUNC (CURRENT_DATE), -12)
AND T1.CREATED_DATE < TRUNC (CURRENT_DATE) + 1)
OR ( T1.ID_FORMULAR != T2.ID_FORMULAR
AND t1.TO_DATE >= t2.TO_DATE
AND t1.FROM_DATE <= t2.FROM_DATE
AND T1.CREATED_DATE >= ADD_MONTHS (TRUNC (CURRENT_DATE), -12)
AND T1.CREATED_DATE < TRUNC (CURRENT_DATE) + 1)
It is not working correctly. Any help?
From Oracle 12, you can use MATCH_RECOGNIZE to perform row-by-row processing:
SELECT *
FROM (
SELECT *
FROM visitors
WHERE created_date >= ADD_MONTHS(TRUNC(CURRENT_DATE), -12)
AND created_date < TRUNC(CURRENT_DATE) + 1
)
MATCH_RECOGNIZE(
ORDER BY from_date
ALL ROWS PER MATCH
PATTERN (any_row overlap+)
DEFINE
overlap AS PREV(id_formular) != id_formular
AND PREV(to_date) >= from_date
)
Which, for the sample data:
CREATE TABLE visitors (id_formular, created_date, from_date, to_date) AS
SELECT 1, DATE '2022-08-01', DATE '2022-08-01', DATE '2022-08-03' FROM DUAL UNION ALL
SELECT 2, DATE '2022-08-01', DATE '2022-08-02', DATE '2022-08-04' FROM DUAL UNION ALL
SELECT 3, DATE '2022-08-01', DATE '2022-08-03', DATE '2022-08-05' FROM DUAL UNION ALL
SELECT 1, DATE '2022-08-01', DATE '2022-08-06', DATE '2022-08-06' FROM DUAL UNION ALL
SELECT 2, DATE '2022-08-01', DATE '2022-08-07', DATE '2022-08-09' FROM DUAL UNION ALL
SELECT 2, DATE '2022-08-01', DATE '2022-08-08', DATE '2022-08-10' FROM DUAL UNION ALL
SELECT 1, DATE '2022-08-01', DATE '2022-08-09', DATE '2022-08-11' FROM DUAL;
Outputs:
FROM_DATE
ID_FORMULAR
CREATED_DATE
TO_DATE
01-AUG-22
1
01-AUG-22
03-AUG-22
02-AUG-22
2
01-AUG-22
04-AUG-22
03-AUG-22
3
01-AUG-22
05-AUG-22
08-AUG-22
2
01-AUG-22
10-AUG-22
09-AUG-22
1
01-AUG-22
11-AUG-22
db<>fiddle here
I don't quite understand the question. The thing that is confusing me is that you need just rows where ID is unique. If ID is unique than there is no other row to overlap with. Anyway, lets suppose that the sample data is like below:
WITH
tbl AS
(
SELECT 0 "ID", DATE '2021-07-01' "CREATED", DATE '2021-07-01' "DATE_FROM", DATE '2021-07-13' "DATE_TO" FROM DUAL UNION ALL
SELECT 1, DATE '2021-12-01', DATE '2021-12-01', DATE '2021-12-03' FROM DUAL UNION ALL
SELECT 1, DATE '2021-12-04', DATE '2021-12-04', DATE '2021-12-14' FROM DUAL UNION ALL
SELECT 1, DATE '2021-12-12', DATE '2021-12-12', DATE '2021-12-29' FROM DUAL UNION ALL
SELECT 2, DATE '2022-08-04', DATE '2022-08-04', DATE '2022-08-10' FROM DUAL UNION ALL
SELECT 2, DATE '2022-08-11', DATE '2022-08-11', DATE '2022-08-21' FROM DUAL UNION ALL
SELECT 2, DATE '2022-08-21', DATE '2022-08-21', DATE '2022-08-29' FROM DUAL UNION ALL
SELECT 3, DATE '2022-08-11', DATE '2022-08-11', DATE '2022-08-29' FROM DUAL UNION ALL
SELECT 4, DATE '2022-08-14', DATE '2022-08-14', DATE '2022-08-14' FROM DUAL UNION ALL
SELECT 4, DATE '2022-08-29', DATE '2022-08-14', DATE '2022-08-29' FROM DUAL
)
We can add some columns that will tell us if the ID is unique or not, what is the order of appearance of the same ID, what is the end date of the previous row for the same ID and if the rows of a particular ID overlaps or not. Here is the code: (used analytic functions with windowing clause)
SELECT
ID "ID",
CASE WHEN Count(*) OVER (PARTITION BY ID ORDER BY ID) = 1 THEN 'Y' ELSE 'N' END "IS_UNIQUE",
Count(ID) OVER (PARTITION BY ID ORDER BY ID, DATE_FROM, DATE_TO ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) "ID_ORDER_NO",
CREATED "CREATED",
DATE_FROM "DATE_FROM",
DATE_TO "DATE_TO",
CASE
WHEN Count(ID) OVER (PARTITION BY ID ORDER BY ID, DATE_FROM, DATE_TO ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) = 1
THEN Null
ELSE
First_Value(DATE_TO) OVER (PARTITION BY ID ORDER BY ID, DATE_FROM, DATE_TO ROWS BETWEEN 1 PRECEDING AND CURRENT ROW )
END "PREVIOUS_END_DATE",
CASE
WHEN Count(ID) OVER (PARTITION BY ID ORDER BY ID, DATE_FROM, DATE_TO ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) = 1
THEN 'N'
ELSE
CASE
WHEN DATE_FROM <= First_Value(DATE_TO) OVER (PARTITION BY ID ORDER BY ID, DATE_FROM, DATE_TO ROWS BETWEEN 1 PRECEDING AND CURRENT ROW )
THEN 'Y'
ELSE 'N'
END
END "OVERLAPS"
FROM
TBL
WHERE
CREATED BETWEEN ADD_MONTHS(TRUNC(SYSDATE, 'dd'), -12) And TRUNC(SYSDATE, 'dd')
Here is the resulting dataset...
/* R e s u l t
ID IS_UNIQUE ID_ORDER_NO CREATED DATE_FROM DATE_TO PREVIOUS_END_DATE OVERLAPS
---------- --------- ----------- --------- --------- --------- ----------------- --------
1 N 1 01-DEC-21 01-DEC-21 03-DEC-21 N
1 N 2 04-DEC-21 04-DEC-21 14-DEC-21 03-DEC-21 N
1 N 3 12-DEC-21 12-DEC-21 29-DEC-21 14-DEC-21 Y
2 N 1 04-AUG-22 04-AUG-22 10-AUG-22 N
2 N 2 11-AUG-22 11-AUG-22 21-AUG-22 10-AUG-22 N
2 N 3 21-AUG-22 21-AUG-22 29-AUG-22 21-AUG-22 Y
3 Y 1 11-AUG-22 11-AUG-22 29-AUG-22 N
4 N 1 14-AUG-22 14-AUG-22 14-AUG-22 N
4 N 2 29-AUG-22 14-AUG-22 29-AUG-22 14-AUG-22 Y
*/
This dataset could be further used to get you the rows and columns that you are trying to get. You can filter it, do some other calculations (like number of overlaping days), get number of rows per ID and so on....
Regards...
I have the below table. I need to count how many ids were active in a given month. So thinking I'll need to create a row for each id that was active during that month so that id can be counted each month. A row should be generated for a term_dt during that month.
active_dt term_dt id
1/1/2018 101
1/1/2018 5/15/2018 102
3/1/2018 6/1/2018 103
1/1/2018 4/25/18 104
Apparently this is a "count number of overlapping intervals" problem. The algorithm goes like this:
Create a sorted list of all start and end points
Calculate a running sum over this list, add one when you encounter a start and subtract one when you encounter an end
If two points are same then perform subtractions first
You will end up with list of all points where the sum changed
Here is a rough outline of the query. It is for SQL Server but could be ported to any RDBMS that supports window functions:
WITH cte1(date, val) AS (
SELECT active_dt, 1 FROM #t AS t
UNION ALL
SELECT COALESCE(term_dt, '2099-01-01'), -1 FROM #t AS t
-- if end date is null then assume the row is valid indefinitely
), cte2 AS (
SELECT date, SUM(val) OVER(ORDER BY date, val) AS rs
FROM cte1
)
SELECT YEAR(date) AS YY, MONTH(date) AS MM, MAX(rs) AS MaxActiveThisYearMonth
FROM cte2
GROUP BY YEAR(date), MONTH(date)
DB Fiddle
I was toying with a simpler query, that seemed to do the trick, for Oracle:
with candidates (month_start) as (
select to_date ('2018-' || column_value || '-01','YYYY-MM-DD')
from
table
(sys.odcivarchar2list('01','02','03','04','05',
'06','07','08','09','10','11','12'))
), sample_data (active_dt, term_dt, id) as (
select to_date('01/01/2018', 'MM/DD/YYYY'), null, 101 from dual
union select to_date('01/01/2018', 'MM/DD/YYYY'),
to_date('05/15/2018', 'MM/DD/YYYY'), 102 from dual
union select to_date('03/01/2018', 'MM/DD/YYYY'),
to_date('06/01/2018', 'MM/DD/YYYY'), 103 from dual
union select to_date('01/01/2018', 'MM/DD/YYYY'),
to_date('04/25/2018', 'MM/DD/YYYY'), 104 from dual
)
select c.month_start, count(1)
from candidates c
join sample_data d
on c.month_start between d.active_dt and nvl(d.term_dt,current_date)
group by c.month_start
order by c.month_start
An alternative solution would be to use a hierarchical query, e.g.:
WITH your_table AS (SELECT to_date('01/01/2018', 'dd/mm/yyyy') active_dt, NULL term_dt, 101 ID FROM dual UNION ALL
SELECT to_date('01/01/2018', 'dd/mm/yyyy') active_dt, to_date('15/05/2018', 'dd/mm/yyyy') term_dt, 102 ID FROM dual UNION ALL
SELECT to_date('01/03/2018', 'dd/mm/yyyy') active_dt, to_date('01/06/2018', 'dd/mm/yyyy') term_dt, 103 ID FROM dual UNION ALL
SELECT to_date('01/01/2018', 'dd/mm/yyyy') active_dt, to_date('25/04/2018', 'dd/mm/yyyy') term_dt, 104 ID FROM dual)
SELECT active_month,
COUNT(*) num_active_ids
FROM (SELECT add_months(TRUNC(active_dt, 'mm'), -1 + LEVEL) active_month,
ID
FROM your_table
CONNECT BY PRIOR ID = ID
AND PRIOR sys_guid() IS NOT NULL
AND LEVEL <= FLOOR(months_between(coalesce(term_dt, SYSDATE), active_dt)) + 1)
GROUP BY active_month
ORDER BY active_month;
ACTIVE_MONTH NUM_ACTIVE_IDS
------------ --------------
01/01/2018 3
01/02/2018 3
01/03/2018 4
01/04/2018 4
01/05/2018 3
01/06/2018 2
01/07/2018 1
01/08/2018 1
01/09/2018 1
01/10/2018 1
Whether this is more or less performant than the other answers is up to you to test.
I have a problem with choosing from the list of absences, those that follow one another and grouping them into periods.
date_from (data_od) date_to(data_do)
--------------------------
18/08/01 - 18/08/15
18/08/16 - 18/08/20
18/08/21 - 18/08/31
18/09/01 - 18/09/08
18/05/01 - 18/05/31
18/06/01 - 18/06/30
18/03/01 - 18/03/18
18/02/14 - 18/02/28
above is a list of absences, and the result of which should be a table:
date_from (data_od) date_to(data_do)
--------------------------
18/08/01 18/09/08
18/05/01 18/06/30
18/02/14 18/03/18
For now, I did something like this, but I only research in twos :(
SELECT u1.data_od,u2.data_do
FROM l_absencje u1 CROSS APPLY
(SELECT * FROM l_absencje labs
WHERE labs.prac_id=u1.prac_id AND
TRUNC(labs.data_od) = TRUNC(u1.data_do)+1
ORDER BY id DESC FETCH FIRST 1 ROWS ONLY
) u2 where u1.prac_id=1067 ;
And give me that:
18/08/01 18/08/20 bad
18/08/16 18/08/31 bad
18/08/21 18/09/08 bad
18/05/01 18/06/30 good
18/02/14 18/03/18 good
You can use a combination of the LAG(), LEAD() and LAST_VALUE() analytic functions:
SQL Fiddle
Oracle 11g R2 Schema Setup:
CREATE TABLE absences ( date_from, date_to ) AS
SELECT DATE '2018-08-01', DATE '2018-08-15' FROM DUAL UNION ALL
SELECT DATE '2018-08-16', DATE '2018-08-20' FROM DUAL UNION ALL
SELECT DATE '2018-08-21', DATE '2018-08-31' FROM DUAL UNION ALL
SELECT DATE '2018-09-01', DATE '2018-09-08' FROM DUAL UNION ALL
SELECT DATE '2018-05-01', DATE '2018-05-31' FROM DUAL UNION ALL
SELECT DATE '2018-06-01', DATE '2018-06-30' FROM DUAL UNION ALL
SELECT DATE '2018-03-01', DATE '2018-03-18' FROM DUAL UNION ALL
SELECT DATE '2018-02-14', DATE '2018-02-28' FROM DUAL;
Query 1:
SELECT *
FROM (
SELECT CASE
WHEN date_to IS NOT NULL
THEN LAST_VALUE( date_from ) IGNORE NULLS
OVER( ORDER BY ROWNUM )
END AS date_from,
date_to
FROM (
SELECT CASE date_from
WHEN LAG( date_to ) OVER ( ORDER BY date_to )
+ INTERVAL '1' DAY
THEN NULL
ELSE date_from
END AS date_from,
CASE date_to
WHEN LEAD( date_from ) OVER ( ORDER BY date_from )
- INTERVAL '1' DAY
THEN NULL
ELSE date_to
END AS date_to
FROM absences
)
)
WHERE date_from IS NOT NULL
AND date_to IS NOT NULL
Results:
| DATE_FROM | DATE_TO |
|----------------------|----------------------|
| 2018-02-14T00:00:00Z | 2018-03-18T00:00:00Z |
| 2018-05-01T00:00:00Z | 2018-06-30T00:00:00Z |
| 2018-08-01T00:00:00Z | 2018-09-08T00:00:00Z |