Row lumping, cycle dates

Row lumping, cycle dates - sql

I want to look at the lead type and if that type is the same for that row then merge in those dates to fit within one row.
I have the below table:
id start_dt end_dt type
1 1/1/19 2/21/19 cross
1 2/22/19 6/5/19 cross
1 6/6/19 8/31/19 cross
1 9/1/19 10/3/19 AAAA
1 10/4/19 10/4/19 cross
1 10/5/19 10/6/19 AAAA
1 10/7/19 10/10/19 AAAA
1 10/11/19 12/31/99 cross
Expected Results:
id start_dt end_dt type
1 1/1/19 8/31/19 cross
1 9/1/19 10/3/19 AAAA
1 10/4/19 10/4/19 cross
1 10/5/19 10/10/19 AAAA
1 10/11/19 12/31/99 cross
How can I get my output to look like the expected results?
I have tested withlead lag rank and case expression but nothing worthy of adding here. Am I on the right path?

This is a gaps-and-islands problem. One option for solving it through contribution of row_number() analytical function :
select min(start_dt) as startdate, max(end_dt) as enddate, type
from
(
with t(id, start_dt, end_dt,type) as
(
select 1, date'2019-01-01', date'2019-02-21', 'cross' from dual union all
select 1, date'2019-02-22', date'2019-06-05', 'cross' from dual union all
select 1, date'2019-06-06', date'2019-08-31', 'cross' from dual union all
select 1, date'2019-09-01', date'2019-10-03', 'AAAA' from dual union all
select 1, date'2019-09-04', date'2019-10-04', 'cross' from dual union all
select 1, date'2019-10-05', date'2019-10-06', 'AAAA' from dual union all
select 1, date'2019-10-07', date'2019-10-10', 'AAAA' from dual union all
select 1, date'2019-10-11', date'2019-12-31', 'cross' from dual
)
select type,
row_number() over (partition by id, type order by end_dt) as rn1,
row_number() over (partition by id order by end_dt) as rn2,
start_dt, end_dt
from t
) tt
group by type, rn1 - rn2
order by enddate;
STARTDATE ENDDATE TYPE
--------- --------- -----
01-JAN-19 31-AUG-19 cross
01-SEP-19 03-OCT-19 AAAA
04-SEP-19 04-OCT-19 cross
05-OCT-19 10-OCT-19 AAAA
11-OCT-19 31-DEC-19 cross
Demo

I actually think this is a pretty good case for Oracle's Pattern Matching Functionality.
with t(id, start_dt, end_dt,type) as
(
select 1, date'2019-01-01', date'2019-02-21', 'cross' from dual union all
select 1, date'2019-02-22', date'2019-06-05', 'cross' from dual union all
select 1, date'2019-06-06', date'2019-08-31', 'cross' from dual union all
select 1, date'2019-09-01', date'2019-10-03', 'AAAA' from dual union all
select 1, date'2019-09-04', date'2019-10-04', 'cross' from dual union all
select 1, date'2019-10-05', date'2019-10-06', 'AAAA' from dual union all
select 1, date'2019-10-07', date'2019-10-10', 'AAAA' from dual union all
select 1, date'2019-10-11', date'2019-12-31', 'cross' from dual
)
SELECT *
FROM t
MATCH_RECOGNIZE(ORDER BY start_dt
MEASURES a.id AS ID,
A.start_dt AS START_DT,
NVL(LAST(B.end_dt), A.end_dt) AS END_DT,
a.type AS TYPE
PATTERN (A B*)
DEFINE B AS start_dt > PREV(start_dt) AND type = PREV(type));
A detailed primer on the topic can be found here

If you want to look at adjacent rows to find groups that can combine, then I recommend lag() to find where groups start and a cumulative sum on that:
select id, type, min(start_dt), max(end_dt)
from (select t.*,
sum(case when prev_end_dt >= start_dt - 1 then 0 else 1 end) over (partition by id, type order by start_dt) as grp
from (select t.*,
lag(end_dt) over (partition by id, type order by start_dt) as prev_end_dt
from t
) t
) t
group by id, type, grp
order by id, min(start_dt);
In particular, this will find cases where the type does not change but there is a gap in the time frames, as shown by this db<>fiddle for id = 2.

Related

Fetch latest start date, if there are two minimum start dates in a table

I'm trying to build a query for the following scenario,
Group records by license ID and get min and max dates
For a given license ID, if there are two earliest start dates, then start date of the particular ID has to be updated as latest start date in that grouping.
Since I'm new to sql, I need help to satisfy condition 2. Any help is greatly appreciated. Thanks
Actual data
LicenseID
StartDate
EndDate
100
4/3/2000
3/1/2013
100
4/3/2000
2/2/2017
100
3/1/2013
1/23/2015
100
1/23/2015
2/2/2017
100
2/2/2017
2/9/2018
100
2/2/2017
12/18/2018
100
12/18/2018
2/16/2021
Expected output
LicenseID
StartDate
EndDate
100
12/18/2018
2/16/2021

Here's one option; read comments within code.
Sample data:
SQL> with test (id, start_date, end_date) as
2 (select 100, date '2000-04-03', date '2013-03-01' from dual union all
3 select 100, date '2000-04-03', date '2017-02-02' from dual union all
4 select 100, date '2018-12-18', date '2021-02-16' from dual
5 ),
Query begins here:
6 -- rank start dates per each ID
7 temp as
8 (select id,
9 min(start_date) over (partition by id) min_sd,
10 max(start_date) over (partition by id) max_sd,
11 rank() over (partition by id order by start_date) rnk_sd,
12 --
13 max(end_date) over (partition by id) max_ed
14 from test
15 ),
16 -- count number of the 1st start dates
17 temp2 as
18 (select id,
19 sum(case when rnk_sd = 1 then 1 else 0 end) cnt_sd
20 from temp
21 group by id
22 )
23 -- if number of the 1st start dates is 1, take MIN_SD. Otherwise, take MAX_SD
24 select distinct
25 b.id,
26 case when b.cnt_sd = 1 then a.min_sd else a.max_sd end start_date,
27 a.max_ed end_date
28 from temp2 b join temp a on a.id = b.id;
Result:
ID START_DATE END_DATE
---------- ---------- ----------
100 12/18/2018 02/16/2021
SQL>

This can filter them:
WITH sample_data AS
(
SELECT 100 AS LicenseID, TO_DATE('04/03/2000','MM/DD/YYYY') AS StartDate, TO_DATE('03/01/2013','MM/DD/YYYY') AS EndDate FROM DUAL UNION ALL
SELECT 100, TO_DATE('04/03/2000','MM/DD/YYYY'), TO_DATE('02/02/2017','MM/DD/YYYY') FROM DUAL UNION ALL
SELECT 100, TO_DATE('03/01/2013','MM/DD/YYYY'), TO_DATE('01/23/2015','MM/DD/YYYY') FROM DUAL UNION ALL
SELECT 100, TO_DATE('01/23/2015','MM/DD/YYYY'), TO_DATE('02/02/2017','MM/DD/YYYY') FROM DUAL UNION ALL
SELECT 100, TO_DATE('02/02/2017','MM/DD/YYYY'), TO_DATE('02/09/2018','MM/DD/YYYY') FROM DUAL UNION ALL
SELECT 100, TO_DATE('02/02/2017','MM/DD/YYYY'), TO_DATE('12/18/2018','MM/DD/YYYY') FROM DUAL UNION ALL
SELECT 100, TO_DATE('12/18/2018','MM/DD/YYYY'), TO_DATE('02/16/2021','MM/DD/YYYY') FROM DUAL
)
SELECT dat.licenseID, CASE WHEN dups.licenseID IS NOT NULL THEN MAX(StartDate)
ELSE MIN(StartDate)
END,
CASE WHEN dups.licenseID IS NOT NULL THEN MAX(EndDate)
ELSE MIN(EndDate)
END
FROM sample_data dat
LEFT OUTER JOIN (SELECT COUNT(1), sd.LicenseID
FROM sample_data sd
INNER JOIN (SELECT MIN(StartDate) AS StartDate, LicenseID
FROM sample_data
GROUP BY LicenseID) mins
ON sd.LicenseID = mins.LicenseID AND sd.startDate = mins.StartDate
GROUP BY sd.LicenseID
HAVING COUNT(1) > 1) dups
ON dups.LicenseID = dat.licenseID
GROUP BY dat.licenseID, dups.licenseID;

You can use:
SELECT licenseid,
MAX(startdate) AS startdate,
MAX(enddate) KEEP (DENSE_RANK LAST ORDER BY startdate) AS enddate
FROM table_name
GROUP BY licenseid
HAVING COUNT(*) KEEP (DENSE_RANK FIRST ORDER BY startdate) > 1;
or:
SELECT licenseid,
max_startdate AS startdate,
max_enddate As enddate
FROM (
SELECT licenseid,
RANK()
OVER (PARTITION BY licenseid ORDER BY startdate) AS rnk,
ROW_NUMBER()
OVER (PARTITION BY licenseid, startdate ORDER BY enddate) AS rn,
MAX(startdate)
OVER (PARTITION BY licenseid) AS max_startdate,
MAX(enddate)
KEEP (DENSE_RANK LAST ORDER BY startdate)
OVER (PARTITION BY licenseid) AS max_enddate
FROM table_name t
)
WHERE rnk = 1
AND rn = 2;
Which, for the sample data:
CREATE TABLE table_name (licenseid, startdate, enddate) AS
SELECT 100, DATE'2000-04-03', DATE'2013-03-01' FROM DUAL UNION ALL
SELECT 100, DATE'2000-04-03', DATE'2017-02-02' FROM DUAL UNION ALL
SELECT 100, DATE'2013-03-01', DATE'2015-01-23' FROM DUAL UNION ALL
SELECT 100, DATE'2015-01-23', DATE'2017-02-02' FROM DUAL UNION ALL
SELECT 100, DATE'2017-02-02', DATE'2018-02-09' FROM DUAL UNION ALL
SELECT 100, DATE'2018-02-02', DATE'2018-12-18' FROM DUAL UNION ALL
SELECT 100, DATE'2018-12-18', DATE'2021-02-16' FROM DUAL;
Both output:
LICENSEID
STARTDATE
ENDDATE
100
2018-12-18 00:00:00
2021-02-16 00:00:00
If you do want to perform an UPDATE of that second row then:
MERGE INTO table_name dst
USING (
SELECT ROWID AS rid,
max_startdate,
max_enddate
FROM (
SELECT RANK()
OVER (PARTITION BY licenseid ORDER BY startdate) AS rnk,
ROW_NUMBER()
OVER (PARTITION BY licenseid, startdate ORDER BY enddate) AS rn,
MAX(startdate)
OVER (PARTITION BY licenseid) AS max_startdate,
MAX(enddate)
KEEP (DENSE_RANK LAST ORDER BY startdate)
OVER (PARTITION BY licenseid) AS max_enddate
FROM table_name t
)
WHERE rnk = 1
AND rn = 2
)src
ON (src.rid = dst.ROWID)
WHEN MATCHED THEN
UPDATE
SET startdate = src.max_startdate,
enddate = src.max_enddate;
db<>fiddle here

Complex query analyzing historical records

I am using Oracle and trying to retrieve the total number of days a person was out of the office during the year. I have 2 tables involved:
Statuses
1 - Active
2 - Out of the Office
3 - Other
ScheduleHistory
RecordID - primary key
PersonID
PreviousStatusID
NextStatusID
DateChanged
I can easily find when the person went on vacation and when they came back, using
SELECT DateChanged FROM ScheduleHistory WHERE PersonID=111 AND NextStatusID = 2
and
SELECT DateChanged FROM ScheduleHistory WHERE PersonID=111 AND PreviousStatusID = 2
But in case a person went on vacation more than once, how can I can I calculate total number of days a person was out of the office. Is it possible to do programmatically, given only PersonID?
Here is some sample data:
RecordID PersonID PreviousStatusID NextStatusID DateChanged
-----------------------------------------------------------------------------
1 111 1 2 03/11/2020
2 111 2 1 03/13/2020
3 111 1 3 04/01/2020
4 111 3 1 04/07/2020
5 111 1 2 06/03/2020
6 111 2 1 06/05/2020
7 111 1 2 09/14/2020
8 111 2 1 09/17/2020
So from the data above, for the year 2020 for PersonID 111 the query should return 7

Try this:
with aux1 AS (
SELECT
a.*,
to_date(datechanged, 'MM/DD/YYYY') - LAG(to_date(datechanged, 'MM/DD/YYYY')) OVER(
PARTITION BY personid
ORDER BY
recordid
) lag_date
FROM
ScheduleHistory a
)
SELECT
personid,
SUM(lag_date) tot_days_ooo
FROM
aux1
WHERE
previousstatusid = 2
GROUP BY
personid;

If you want total days (or weekdays) for each year (and to account for periods when it goes over the year boundary) then:
WITH date_ranges ( personid, status, start_date, end_date ) AS (
SELECT personid,
nextstatusid,
datechanged,
LEAD(datechanged, 1, datechanged) OVER(
PARTITION BY personid
ORDER BY datechanged
)
FROM table_name
),
split_year_ranges ( personid, year, start_date, end_date, max_date ) AS (
SELECT personid,
TRUNC( start_date, 'YY' ),
start_date,
LEAST(
end_date,
ADD_MONTHS( TRUNC( start_date, 'YY' ), 12 )
),
end_date
FROM date_ranges
WHERE status = 2
UNION ALL
SELECT personid,
end_date,
end_date,
LEAST( max_date, ADD_MONTHS( end_date, 12 ) ),
max_date
FROM split_year_ranges
WHERE end_date < max_date
)
SELECT personid,
EXTRACT( YEAR FROM year) AS year,
SUM( end_date - start_date ) AS total_days,
SUM(
( TRUNC( end_date, 'IW' ) - TRUNC( start_date, 'IW' ) ) * 5 / 7
+ LEAST( end_date - TRUNC( end_date, 'IW' ), 5 )
- LEAST( start_date - TRUNC( start_date, 'IW' ), 5 )
) AS total_weekdays
FROM split_year_ranges
GROUP BY personid, year
ORDER BY personid, year
Which, for the sample data:
CREATE TABLE table_name ( RecordID, PersonID, PreviousStatusID, NextStatusID, DateChanged ) AS
SELECT 1, 111, 1, 2, DATE '2020-03-11' FROM DUAL UNION ALL
SELECT 2, 111, 2, 1, DATE '2020-03-13' FROM DUAL UNION ALL
SELECT 3, 111, 1, 3, DATE '2020-04-01' FROM DUAL UNION ALL
SELECT 4, 111, 3, 1, DATE '2020-04-07' FROM DUAL UNION ALL
SELECT 5, 111, 1, 2, DATE '2020-06-03' FROM DUAL UNION ALL
SELECT 6, 111, 2, 1, DATE '2020-06-05' FROM DUAL UNION ALL
SELECT 7, 111, 1, 2, DATE '2020-09-14' FROM DUAL UNION ALL
SELECT 8, 111, 2, 1, DATE '2020-09-17' FROM DUAL UNION ALL
SELECT 9, 222, 1, 2, DATE '2019-12-31' FROM DUAL UNION ALL
SELECT 10, 222, 2, 2, DATE '2020-12-01' FROM DUAL UNION ALL
SELECT 11, 222, 2, 2, DATE '2021-01-02' FROM DUAL;
Outputs:
PERSONID
YEAR
TOTAL_DAYS
TOTAL_WEEKDAYS
111
2020
7
7
222
2019
1
1
222
2020
366
262
222
2021
1
1
db<>fiddle here

Provided no vacation crosses a year boundary
with grps as (
SELECT sh.*,
row_number() over (partition by PersonID, NextStatusID order by DateChanged) grp
FROM ScheduleHistory sh
WHERE NextStatusID in (1,2) and 3 not in (NextStatusID, PreviousStatusID)
), durations as (
SELECT PersonID, min(DateChanged) DateChanged, max(DateChanged) - min(DateChanged) duration
FROM grps
GROUP BY PersonID, grp
)
SELECT PersonID, sum(duration) days_out
FROM durations
GROUP BY PersonID;
db<>fiddle

year_span is used to split an interval spanning across two years in two different records
H1 adds a row number dependent from PersonID to get the right sequence for each person
H2 gets the periods for each status change and extract 1st day of the year of the interval end
H3 split records that span across two years and calculate the right date_start and date_end for each interval
H calculates days elapsed in each interval for each year
final query sum up the records to get output
EDIT
If you need workdays instead of total days, you should not use total_days/7*5 because it is a bad approximation and in some cases gives weird results.
I have posted a solution to jump on fridays to mondays here
with
statuses (sid, sdescr) as (
select 1, 'Active' from dual union all
select 2, 'Out of the Office' from dual union all
select 3, 'Other' from dual
),
ScheduleHistory(RecordID, PersonID, PreviousStatusID, NextStatusID , DateChanged) as (
select 1, 111, 1, 2, date '2020-03-11' from dual union all
select 2, 111, 2, 1, date '2020-03-13' from dual union all
select 3, 111, 1, 3, date '2020-04-01' from dual union all
select 4, 111, 3, 1, date '2020-04-07' from dual union all
select 5, 111, 1, 2, date '2020-06-03' from dual union all
select 6, 111, 2, 1, date '2020-06-05' from dual union all
select 7, 111, 1, 2, date '2020-09-14' from dual union all
select 8, 111, 2, 1, date '2020-09-17' from dual union all
SELECT 9, 222, 1, 2, date '2019-12-31' from dual UNION ALL
SELECT 10, 222, 2, 2, date '2020-12-01' from dual UNION ALL
SELECT 11, 222, 2, 2, date '2021-01-02' from dual
),
year_span (n) as (
select 1 from dual union all
select 2 from dual
),
H1 AS (
SELECT ROW_NUMBER() OVER (PARTITION BY PersonID ORDER BY RecordID) PID, H.*
FROM ScheduleHistory H
),
H2 as (
SELECT
H1.*, H2.DateChanged DateChanged2,
EXTRACT(YEAR FROM H2.DateChanged) - EXTRACT(YEAR FROM H1.DateChanged) + 1 Y,
trunc(H2.DateChanged,'YEAR') Y2
FROM H1 H1
LEFT JOIN H1 H2 ON H1.PID = H2.PID-1 AND H1.PersonID = H2.PersonID
),
H3 AS (
SELECT Y, N, H2.PID, H2.RecordID, H2.PersonID, H2.NextStatusID,
CASE WHEN Y=1 THEN H2.DateChanged ELSE CASE WHEN N=1 THEN H2.DateChanged ELSE Y2 END END D1,
CASE WHEN Y=1 THEN H2.DateChanged2 ELSE CASE WHEN N=1 THEN Y2 ELSE H2.DateChanged2 END END D2
FROM H2
JOIN year_span N ON N.N <=Y
),
H AS (
SELECT PersonID, NextStatusID, EXTRACT(year FROM d1) Y, d2-d1 D
FROM H3
)
select PersonID, sdescr Status, Y, sum(d) d
from H
join statuses s on NextStatusID = s.sid
group by PersonID, sdescr, Y
order by PersonID, sdescr, Y
output
PersonID Status Y d
111 Active 2020 177
111 Other 2020 6
111 Out of the Office 2020 7
222 Out of the Office 2019 1
222 Out of the Office 2020 366
222 Out of the Office 2021 1
check the fiddle here

sql oracle goup by on dates with possibilities of null values

I have a table with emplid and end_date columns. I want from all emplids the max end_dates. If at least one end_date is null, I want to have the null value as max. So in this example:
emplid end_date
1 05/04/2019
1 05/10/2019
1 null
2 05/04/2019
2 05/10/2019
I want as result:
emplid end_date
1 null
2 05/10/2019
I tried something like
select emplid,
CASE
WHEN MAX(NVL(end_Date,'01/01/3000'))='01/01/3000' THEN null
ELSE end_date
END as end_dt
from people
group by emplid
then I get a group-by error.
Maybe it is very easy, but I don't figure out how to get properly what I want.

with s(id, dt) as (
select 1, to_date('05/04/2019', 'dd/mm/yyyy') from dual union all
select 1, to_date('05/10/2019', 'dd/mm/yyyy') from dual union all
select 1, null from dual union all
select 2, to_date('05/04/2019', 'dd/mm/yyyy') from dual union all
select 2, to_date('05/10/2019', 'dd/mm/yyyy') from dual)
select id, decode(count(dt), count(*), max(dt)) max_dt
from s
group by id;
ID MAX_DT
---------- -----------------------------
1
2 2019-10-05 00:00:00

I would simply do:
select emplid,
(case when count(*) = count(end_date)
then max(end_date)
end) as max_end_date
from t
group by emplid;
There is no reason to introduce a "magic" maximum value (even if it is correct).
The first expression in the case is simply asking "do the number of non-NULL end-date values match the number of rows".

Try this
SELECT
EMPLID,
CASE WHEN END_DATE='01/01/3000' THEN NULL ELSE END_DATE END AS END_DT
FROM
(
SELECT EMPLID, MAX(END_DATE) AS END_DATE FROM
(
SELECT EMPLID, NVL(END_DATE,'01/01/3000') AS END_DATE FROM PEOPLE
)
GROUP BY EMPLID
);

Case does not go with group by , you have to get the max value using group by first then evaluate the null values. Try below.
select empid, CASE WHEN NVL(eDate,'01-DEC-3000')='01-DEC-3000' THEN null ELSE edate end end_dt from (
select empid, MAX(NVL(eDate,'01-DEC-3000')) eDate
from
(select 1 empid, sysdate-100 edate from dual union all
select 1 empid, sysdate-10 edate from dual union all
select 1 empid, null edate from dual union all
select 2 empid, sysdate-105 edate from dual union all
select 2 empid, sysdate-1 edate from dual ) datad
group by empid);

Get rows from current month if older is not available

I have a table that looks like this:
+--------------------+---------+
| Month (date) | amount |
+--------------------+---------+
| 2016-10-01 | 20 |
| 2016-08-01 | 10 |
| 2016-07-01 | 17 |
+--------------------+---------+
I'm looking for a query (sql statement) which satisfies the following conditions:
Give me the value of the previous month.
If there is no value for the previous month lock back in time until one can be found.
If there is just a value for the current month give me this value.
In the example table the row I'm looking for would be this:
+--------------------+---------+
| 2016-08-01 | 10 |
+--------------------+---------+
Has anyone a idea for a non complex select query?
Thanks in advance,
Peter

You may need the following:
SELECT *
FROM ( SELECT *
FROM test
WHERE TRUNC(SYSDATE, 'month') >= month
ORDER BY CASE
WHEN TRUNC(SYSDATE, 'month') = month
THEN 0 /* if current month, ordered last */
ELSE 1 /* previous months are ordered first */
END DESC,
month DESC /* among previous months, the greatest first */
)
WHERE ROWNUM = 1

Another way using MAX
WITH tbl AS (
SELECT TO_DATE('2016-10-01', 'YYYY-MM-DD') AS "month", 20 AS amount FROM dual
UNION
SELECT TO_DATE('2016-08-01', 'YYYY-MM-DD') AS "month", 10 AS amount FROM dual
UNION
SELECT TO_DATE('2016-07-01', 'YYYY-MM-DD') AS "month", 5 AS amount FROM dual
)
SELECT *
FROM tbl
WHERE TRUNC("month", 'MONTH') = NVL((SELECT MAX(t."month")
FROM tbl t
WHERE t."month" < TRUNC(SYSDATE, 'MONTH')),
TRUNC(SYSDATE, 'MONTH'));

I would use row_number():
select t.*
from (select t.*,
row_number() over (order by (case when to_char(dte, 'YYYY-MM') = to_char(sysdate, 'YYYY-MM') then 1 else 2 end) desc,
dte desc
) as seqnum
from t
) t
where seqnum = 1;
Actually, you don't need row_number() for this:
select t.*
from (select t.*
from t
order by (case when to_char(dte, 'YYYY-MM') = to_char(sysdate, 'YYYY-MM') then 1 else 2 end) desc,
dte desc
) t
where rownum = 1;

It's not the nicest query but it should work.
select amount, date from (
select amount, date, row_number over(partition by HERE_PUT_ID order by
case trunc(date, 'month') when trunc(sysdate, 'month') then to_date('00010101', 'yyyymmdd') else trunc(date, 'month') end
desc) r)
where r = 1;
I guess you have some id in table so put id column instead of HERE_PUT_ID if you want query for whole table just delete: partition by HERE_PUT_ID

I added more data for testing, and an "id" column (a more realistic scenario) to show how this would work. If there is no "id" in your data, simply delete any reference to it from the solution.
Notes - month is a reserved Oracle word, don't use it as a column name. The solution assumes the date column contains dates that are already truncated to the beginning of the month. The trick in "order by" in the dense_rank last is to assign a value (ANY value!) when the month is the current month; by default, the value assigned to all other months is NULL, which by default come after any non-null value in an ascending order.
You may want to test the various solutions for efficiency if execution time is important.
with
inputs ( id, mth, amount ) as (
select 1, date '2016-10-01', 20 from dual union all
select 1, date '2016-08-01', 10 from dual union all
select 1, date '2016-07-01', 17 from dual union all
select 2, date '2016-10-01', 30 from dual union all
select 2, date '2016-09-01', 25 from dual union all
select 3, date '2016-10-01', 20 from dual union all
select 4, date '2016-08-01', 45 from dual union all
select 4, date '2016-06-01', 30 from dual
)
-- end of TEST DATA - the solution (SQL query) is below this line
select id,
max(mth) keep(dense_rank last order by
case when mth = trunc(sysdate, 'mm') then 0 end, mth) as mth,
max(amount) keep(dense_rank last order by
case when mth = trunc(sysdate, 'mm') then 0 end, mth) as amount
from inputs
group by id
order by id -- ORDER BY is optional
;
ID MTH AMOUNT
--- ---------- -------
1 2016-08-01 10
2 2016-09-01 25
3 2016-10-01 20
4 2016-08-01 45

You could sort the data in the direction you want to:
with MyData as
(
SELECT to_date('2016-10-01','YYYY-MM-DD') MY_DATE, 20 AMOUNT FROM DUAL UNION
SELECT to_date('2016-08-01','YYYY-MM-DD') MY_DATE, 10 AMOUNT FROM DUAL UNION
SELECT to_date('2016-07-01','YYYY-MM-DD') MY_DATE, 17 AMOUNT FROM DUAL
),
MyResult AS (
SELECT
D.*
FROM MyData D
ORDER BY
DECODE(
12*TO_CHAR(MY_DATE,'YYYY') + TO_CHAR(MY_DATE,'MM'),
12*TO_CHAR(SYSDATE,'YYYY') + TO_CHAR(SYSDATE,'MM'),
-1,
12*TO_CHAR(MY_DATE,'YYYY') + TO_CHAR(MY_DATE,'MM'))
DESC
)
SELECT * FROM MyResult WHERE RowNum = 1

Identify contiguous and discontinuous date ranges

I have a table named x . The data is as follows.
Acccount_num start_dt end_dt
A111326 02/01/2016 02/11/2016
A111326 02/12/2016 03/05/2016
A111326 03/02/2016 03/16/2016
A111331 02/28/2016 02/29/2016
A111331 02/29/2016 03/29/2016
A999999 08/25/2015 08/25/2015
A999999 12/19/2015 12/22/2015
A222222 11/06/2015 11/10/2015
A222222 05/16/2016 05/17/2016
Both A111326 and A111331 should be identified as contiguous data and A999999 and
A222222 should be identified as discontinuous data.In my code I currently use the following query to identify discontinuous data. The A111326 is also erroneously identified as discontinuous data. Please help to modify the below code so that A111326 is not identified as discontinuous data.Thanks in advance for your help.
(SELECT account_num
FROM (SELECT account_num,
(MAX (
END_DT)
OVER (PARTITION BY account_num
ORDER BY START_DT))
START_DT,
(LEAD (
START_DT)
OVER (PARTITION BY account_num
ORDER BY START_DT))
END_DT
FROM x
WHERE (START_DT + 1) <=
(END_DT - 1))
WHERE START_DT < END_DT);

Oracle Setup:
CREATE TABLE accounts ( Account_num, start_dt, end_dt ) AS
SELECT 'A', DATE '2016-02-01', DATE '2016-02-11' FROM DUAL UNION ALL
SELECT 'A', DATE '2016-02-12', DATE '2016-03-05' FROM DUAL UNION ALL
SELECT 'A', DATE '2016-03-02', DATE '2016-03-16' FROM DUAL UNION ALL
SELECT 'B', DATE '2016-02-28', DATE '2016-02-29' FROM DUAL UNION ALL
SELECT 'B', DATE '2016-02-29', DATE '2016-03-29' FROM DUAL UNION ALL
SELECT 'C', DATE '2015-08-25', DATE '2015-08-25' FROM DUAL UNION ALL
SELECT 'C', DATE '2015-12-19', DATE '2015-12-22' FROM DUAL UNION ALL
SELECT 'D', DATE '2015-11-06', DATE '2015-11-10' FROM DUAL UNION ALL
SELECT 'D', DATE '2016-05-16', DATE '2016-05-17' FROM DUAL UNION ALL
SELECT 'E', DATE '2016-01-01', DATE '2016-01-02' FROM DUAL UNION ALL
SELECT 'E', DATE '2016-01-05', DATE '2016-01-06' FROM DUAL UNION ALL
SELECT 'E', DATE '2016-01-03', DATE '2016-01-07' FROM DUAL;
Query:
WITH times ( account_num, dt, lvl ) AS (
SELECT Account_num, start_dt - 1, 1 FROM accounts
UNION ALL
SELECT Account_num, end_dt, -1 FROM accounts
)
, totals ( account_num, dt, total ) AS (
SELECT account_num,
dt,
SUM( lvl ) OVER ( PARTITION BY Account_num ORDER BY dt, lvl DESC )
FROM times
)
SELECT Account_num,
CASE WHEN COUNT( CASE total WHEN 0 THEN 1 END ) > 1
THEN 'N'
ELSE 'Y'
END AS is_contiguous
FROM totals
GROUP BY Account_Num
ORDER BY Account_Num;
Output:
ACCOUNT_NUM IS_CONTIGUOUS
----------- -------------
A Y
B Y
C N
D N
E Y
Alternative Query:
(It's exactly the same method just using UNPIVOT rather than UNION ALL.)
SELECT Account_num,
CASE WHEN COUNT( CASE total WHEN 0 THEN 1 END ) > 1
THEN 'N'
ELSE 'Y'
END AS is_contiguous
FROM (
SELECT Account_num,
SUM( lvl ) OVER ( PARTITION BY Account_Num
ORDER BY CASE lvl WHEN 1 THEN dt - 1 ELSE dt END,
lvl DESC
) AS total
FROM accounts
UNPIVOT ( dt FOR lvl IN ( start_dt AS 1, end_dt AS -1 ) )
)
GROUP BY Account_Num
ORDER BY Account_Num;

WITH cte AS (
SELECT
AccountNumber
,CASE
WHEN
LAG(End_Dt) OVER (PARTITION BY AccountNumber ORDER BY End_Dt) IS NULL THEN 0
WHEN
LAG(End_Dt) OVER (PARTITION BY AccountNumber ORDER BY End_Dt) >= Start_Dt - 1 THEN 0
ELSE 1
END as discontiguous
FROM
#Table
)
SELECT
AccountNumber
,CASE WHEN SUM(discontiguous) > 0 THEN 'discontiguous' ELSE 'contiguous' END
FROM
cte
GROUP BY
AccountNumber;
One of your problems is that your contiguous desired result also includes overlapping date ranges in your example data set. Example A111326 Starts on 3/2/2016 but ends the row before on 3/5/2015 meaning it overlaps by 3 days.

We Keep Coding

sql objective-c vba vb.net react-native apache vue.js tensorflow api pandas

Row lumping, cycle dates - sql

Related

Fetch latest start date, if there are two minimum start dates in a table

Complex query analyzing historical records

sql oracle goup by on dates with possibilities of null values

Get rows from current month if older is not available

Identify contiguous and discontinuous date ranges

Categories

Resources