I have a table with the following structure, from which I am trying to find the TAT (Turn Around Time) between two days. But, due to the overlapping days, I am unable to find the actual TAT.
Appln No Start Date End Date
1001009 01-10-15 06-10-15
1001009 02-10-15 04-10-15
1001009 03-10-15 04-10-15
1001009 03-10-15 05-10-15
1001009 04-10-15 07-10-15
1001009 09-10-15 10-10-15
1001009 12-10-15 16-10-15
1001009 14-10-15 17-10-15
After removing the overlapping dates from the above sample data, the output will be in the following format -
Appln No Start Date End Date
1001009 01-10-15 07-10-15
1001009 09-10-15 10-10-15
1001009 12-10-15 17-10-15
Since I am a beginner in sql and using oracle sql developer, I am finding it difficult to write the above logic into code. Any suggestion on the issue is welcome :)
Try this:
select t1.* from myTable t1
inner join myTable t2
on t2.StartDate > t1.StartDate and t2.StartDate < t1.EndDate
This is a rather tricky task, as you can't rely on any ordering of the intervals.
I approach it by first removing the subintervals (intervals completely covered by another interval).
After this I can follow the order defined by START_DATE to check whether the preceding interval overlaps the next one, and apply the standard grouping mechanism.
with subs as (
/* first remove all intervals that are subsets of other intervals */
select * from tst t1
where NOT exists (select null from tst t2 where t2.start_date < t1.start_date and t1.end_date < t2.end_date)
),overlap as (
select APPLN_NO, START_DATE, END_DATE,
case when (nvl(lag(END_DATE) over (partition by APPLN_NO order by START_DATE),START_DATE-1) < START_DATE) then
row_number() over (partition by APPLN_NO order by START_DATE) end grp
from subs),
overlap2 as (
select
APPLN_NO, START_DATE, END_DATE, GRP,
last_value(grp ignore nulls) over (partition by APPLN_NO order by START_DATE) as grp2
from overlap)
select
APPLN_NO, min(START_DATE) START_DATE, max(END_DATE) END_DATE
from overlap2
group by APPLN_NO, grp2
order by 1,2
;
To check the query, here is my setup:
drop table tst ;
create table tst
(appln_no number,
start_date date,
end_date date);
insert into tst values (1001009, to_date('01-10-15','dd-mm-rr'),to_date('06-10-15','dd-mm-rr'));
insert into tst values (1001009, to_date('02-10-15','dd-mm-rr'),to_date('04-10-15','dd-mm-rr'));
insert into tst values (1001009, to_date('03-10-15','dd-mm-rr'),to_date('04-10-15','dd-mm-rr'));
insert into tst values (1001009, to_date('03-10-15','dd-mm-rr'),to_date('05-10-15','dd-mm-rr'));
insert into tst values (1001009, to_date('04-10-15','dd-mm-rr'),to_date('07-10-15','dd-mm-rr'));
insert into tst values (1001009, to_date('09-10-15','dd-mm-rr'),to_date('10-10-15','dd-mm-rr'));
insert into tst values (1001009, to_date('12-10-15','dd-mm-rr'),to_date('16-10-15','dd-mm-rr'));
insert into tst values (1001009, to_date('13-10-15','dd-mm-rr'),to_date('14-10-15','dd-mm-rr')); /* this is added to make it more interesting */
insert into tst values (1001009, to_date('15-10-15','dd-mm-rr'),to_date('17-10-15','dd-mm-rr'));
gives
APPLN_NO START_DATE END_DATE
---------- ------------------- -------------------
1001009 01.10.2015 00:00:00 07.10.2015 00:00:00
1001009 09.10.2015 00:00:00 10.10.2015 00:00:00
1001009 12.10.2015 00:00:00 17.10.2015 00:00:00
as expected.
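For readers who want to sanity-check the merging logic outside the database, here is a minimal Python sketch of the same idea: order by start date and extend the running interval while the next row starts on or before the current end. It is not the Oracle query itself; the function name `merge_intervals` is made up for illustration, and the rows are the sample data from the question.

```python
from datetime import date
from itertools import groupby

def merge_intervals(rows):
    """Merge overlapping (appln_no, start, end) rows per application number."""
    merged = []
    # Sorting by (appln_no, start, end) mirrors PARTITION BY appln_no ORDER BY start_date.
    for appln_no, group in groupby(sorted(rows), key=lambda r: r[0]):
        cur_start = cur_end = None
        for _, start, end in group:
            if cur_start is None:
                cur_start, cur_end = start, end
            elif start <= cur_end:
                # Starts on or before the running end: same group, extend it.
                cur_end = max(cur_end, end)
            else:
                # Strictly after the running end: close the group, start a new one.
                merged.append((appln_no, cur_start, cur_end))
                cur_start, cur_end = start, end
        merged.append((appln_no, cur_start, cur_end))
    return merged

rows = [
    (1001009, date(2015, 10, 1), date(2015, 10, 6)),
    (1001009, date(2015, 10, 2), date(2015, 10, 4)),
    (1001009, date(2015, 10, 3), date(2015, 10, 4)),
    (1001009, date(2015, 10, 3), date(2015, 10, 5)),
    (1001009, date(2015, 10, 4), date(2015, 10, 7)),
    (1001009, date(2015, 10, 9), date(2015, 10, 10)),
    (1001009, date(2015, 10, 12), date(2015, 10, 16)),
    (1001009, date(2015, 10, 14), date(2015, 10, 17)),
]
result = merge_intervals(rows)
for row in result:
    print(row)
```

Because the loop keeps the maximum end date seen so far, it does not need the separate step that removes subintervals.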
This is a tricky query. You need to identify groups that overlap, by assigning a grouping id. One way to do this is to find where the overlapping groups start, then accumulate the number of starts between each record.
The following assumes that your table has a primary key (called id for lack of a better name).
This gives the opportunity to aggregate to get what you want:
select ApplnNo, min(start), max(end)
from (select t.*,
sum(IsGroupStart) over (partition by ApplnNo order by start) as grp
from (select t.*,
(case when exists (select 1
from t t2
where t2.start < t.start and t2.end >= t.start and
t2.id <> t.id
)
then 0 else 1
end) as IsGroupStart
from t
) t
) t
group by ApplnNo, grp;
There are some nuances. The exact innermost subquery for exists depends on how you define overlaps. This includes even one day of overlap at the beginning or end.
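The "flag the group starts, then accumulate" idea can be tried out in a few lines of Python. This is only a sketch under assumptions: the rows are (id, start, end) tuples with day-of-month numbers taken from the question's sample, and a row is flagged as a group start when no row that sorts earlier (by start, then id) reaches its start day. Touching at a single day counts as overlap here.

```python
from itertools import accumulate

# (id, start, end): hypothetical ids, day-of-month values from the question's sample
rows = [
    (1, 1, 6), (2, 2, 4), (3, 3, 4), (4, 3, 5),
    (5, 4, 7), (6, 9, 10), (7, 12, 16), (8, 14, 17),
]

def is_group_start(row, all_rows):
    """Return 1 if no earlier-sorting row reaches this row's start, else 0."""
    rid, start, _ = row
    covered = any(
        (s, oid) < (start, rid) and e >= start
        for oid, s, e in all_rows
    )
    return 0 if covered else 1

ordered = sorted(rows, key=lambda r: (r[1], r[0]))
flags = [is_group_start(r, ordered) for r in ordered]
grp = list(accumulate(flags))  # running sum of starts = group id

groups = {}
for (_, s, e), g in zip(ordered, grp):
    lo, hi = groups.get(g, (s, e))
    groups[g] = (min(lo, s), max(hi, e))

print(groups)
```

Changing `e >= start` to `e > start` is where you would adjust the definition of an overlap, for example if intervals that merely touch should stay separate.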
I'm trying to write a script that will look at the issue date and termination date for each policy in a table. I want to be able to take those two dates, create a row for each year in between those two dates, and then fill in the values in the remaining columns.
I've been working with a recursive CTE approach in Redshift and I've got to the point where I can create the annual records. The part I'm stuck on is how to include the other columns in the table and fill each of the created rows with the same information as the row above.
For example, if I start with a record that looks something like
policy_number  issue_date  termination_date  issue_state  product  plan_code
001            1985-05-26  2005-03-02        CT           ROP      123456
I want to build a table that would look like this
policy_number  issue_date  termination_date  issue_state  product  plan_code  start_date
001            1985-05-26  2005-03-02        CT           ROP      123456     1985-05-26
001            1985-05-26  2005-03-02        CT           ROP      123456     1986-05-26
001            1985-05-26  2005-03-02        CT           ROP      123456     1987-05-26
...            ...         ...               ...          ...      ...        ...
001            1985-05-26  2005-03-02        CT           ROP      123456     2004-05-26
001            1985-05-26  2005-03-02        CT           ROP      123456     2005-03-02
Here's the code I've got so far:
WITH RECURSIVE start_dt AS
(
SELECT MIN(issue_date) AS s_dt -- step 1: grab start date
FROM myTable
WHERE policy_number = '001'
GROUP BY policy_number
),
end_dt AS
(
SELECT MAX(effective_date) AS e_dt -- step 2: grab the termination date
FROM myTable
WHERE policy_number = '001'
GROUP BY policy_number
),
dates (dt) AS
(
-- start at the start date
SELECT s_dt dt -- selectin start date from step 1
FROM start_dt
UNION ALL
-- recursive lines
SELECT dateadd(YEAR,1,dt)::DATE dt -- converted to date to avoid type mismatch -- adding annual records until the termination date
FROM dates
WHERE dt <= (SELECT e_dt FROM end_dt)
-- stop at the end date
)
SELECT *
FROM dates
which yields
dt
1985-05-26
1986-05-26
1987-05-26
...
How can I include the rest of the columns in my table? I'm also open to using a cross join if that would be a better approach. I'm expecting this to generate around 10,000,000 rows, so any optimization would be much appreciated.
If I understand correctly you have a table with begin/end dates and you have a process for generating all the needed dates to span the min / max of these. You want to apply this list of dates to the starting table to get all rows replicated between begin and end.
You have a good start - the list of dates. The usual process is to join the dates with the table using inequality conditions. (ON dt >= begin and dt <= end)
You will need to deal with some edge condition around the unique dates for each input row. If you need to maintain these unique dates you will need to fudge the join condition. All doable.
==============================================================
Back from biz trip and can give more concrete guidance.
There are two ways to do this. The first is the CTE approach you are already pursuing, but it passes all the data through each iteration of the CTE, which could be slow. It would look like this (including data setup):
create table mytable (
policy_number varchar(8),
issue_date timestamp,
termination_date timestamp,
issue_state varchar(4),
product varchar(16),
plan_code int);
insert into mytable values
('001', '1985-05-26', '2005-03-02', 'CT', 'ROP', 123456),
('002', '1988-07-25', '2005-08-07', 'CT', 'ROP', 654321)
;
with recursive pdata(policy_number, issue_date, termination_date,
issue_state, product, plan_code, start_date,
yr) as (
select policy_number, issue_date, termination_date, issue_state,
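product, plan_code, issue_date as start_date, 1 as yr -- start at 1 so the first recursive step yields issue_date + 1 year, not a duplicate of issue_date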
from mytable
union all
select policy_number, issue_date, termination_date, issue_state,
product, plan_code,
issue_date + yr * (interval '1 years') as start_date,
yr + 1 as yr
from pdata
where start_date < termination_date
)
select policy_number, issue_date, termination_date,
issue_state, product, plan_code,
case when start_date > termination_date
then termination_date
else start_date
end as start_date
from pdata
order by start_date, policy_number;
The other way to do this is to generate the length of years in the recursive CTE but apply the data expansion in a loop join. This has the benefit of not carrying all the data through the recursive calls but has the expense of the loop join. It should be faster with large amounts of data but you can decide which is right for you.
Since each input row has its own date I left things in year intervals as this is cleaner. This looks like:
create table mytable (
policy_number varchar(8),
issue_date timestamp,
termination_date timestamp,
issue_state varchar(4),
product varchar(16),
plan_code int);
insert into mytable values
('001', '1985-05-26', '2005-03-02', 'CT', 'ROP', 123456),
('002', '1988-07-25', '2005-08-07', 'CT', 'ROP', 654321)
;
with recursive nums(yr, maxnum) as (
select 0::int as yr,
date_part('year', max(termination_date)) -
date_part('year', min(issue_date)) as maxnum
from mytable
union all
select yr + 1 as yr, maxnum
from nums
where yr <= maxnum
)
select policy_number, issue_date, termination_date,
issue_state, product, plan_code,
case when issue_date + yr * interval '1 year' > termination_date
then termination_date
else issue_date + yr * interval '1 year'
end as start_date
from mytable p
left join nums n
on termination_date + interval '1 year'
> issue_date + yr * interval '1 year'
order by start_date, policy_number;
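To make the expected output easy to verify, here is a small Python sketch of the expansion rule both queries aim for: one row per policy anniversary, with a final row clamped to the termination date. It is not a Redshift implementation. The helper names `add_years` and `expand_policy` are made up, and the Feb 29 fallback to Feb 28 is an assumption.

```python
from datetime import date

def add_years(d, years):
    """Shift a date by whole years (assumption: Feb 29 falls back to Feb 28)."""
    try:
        return d.replace(year=d.year + years)
    except ValueError:
        return d.replace(year=d.year + years, day=28)

def expand_policy(policy):
    """Return one dict per anniversary, ending with a row at termination_date."""
    issue, term = policy["issue_date"], policy["termination_date"]
    out = []
    yr = 0
    while True:
        start = add_years(issue, yr)
        if start >= term:
            # Last row: clamp to the termination date and stop.
            out.append({**policy, "start_date": term})
            break
        out.append({**policy, "start_date": start})
        yr += 1
    return out

policy = {
    "policy_number": "001",
    "issue_date": date(1985, 5, 26),
    "termination_date": date(2005, 3, 2),
    "issue_state": "CT",
    "product": "ROP",
    "plan_code": 123456,
}
rows = expand_policy(policy)
for r in rows:
    print(r["policy_number"], r["start_date"])
```

Every output row carries the original columns unchanged, which is what either the join or carrying the columns through the recursive CTE achieves in SQL.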
I have these date ranges that represent start and end dates of subscription. There are no overlaps in date ranges.
Start Date End Date
1/5/2015 - 1/14/2015
1/15/2015 - 1/20/2015
1/24/2015 - 1/28/2015
1/29/2015 - 2/3/2015
I want to identify delays of more than 1 day between any subscription ending and a new one starting. E.g., for the data above, I want the output: 1/24/2015 - 1/28/2015.
How can I do this using a sql query?
Edit : Also there can be multiple gaps in the subscription date ranges but I want the date range after the latest one.
You do this using a left join or not exists:
select t.*
from t
where not exists (select 1
from t t2
where t2.enddate = dateadd(day, -1, t.startdate)
);
Note that this will also give you the first record in the sequence . . . which, strictly speaking, matches the conditions. Here is one solution to that problem:
select t.*
from t cross join
(select min(startdate) as minsd from t) as x
where not exists (select 1
from t t2
where t2.enddate = dateadd(day, -1, t.startdate)
) and
t.startdate <> minsd;
You can also approach this with window functions:
select t.*
from (select t.*,
lag(enddate) over (order by startdate) as prev_enddate,
min(startdate) over () as min_startdate
from t
) t
where min_startdate <> startdate and
prev_enddate <> dateadd(day, -1, startdate);
Also note that this logic assumes that the time periods do not overlap. If they do, a clearer problem statement is needed to understand what you are really looking for.
You can achieve this using the window function LAG(), which fetches a value from the previous row in an ordered set for later comparison in the WHERE clause. Then, in WHERE, you just apply your gap definition and discard the first row.
Sample data:
create table dates(start_date date, end_date date);
insert into dates values
('2015-01-05','2015-01-14'),
('2015-01-15','2015-01-20'),
('2015-01-24','2015-01-28'), -- gap
('2015-01-29','2015-02-03'),
('2015-02-04','2015-02-07'),
('2015-02-09','2015-02-11'); -- gap
Query
SELECT
start_date,
end_date
FROM (
SELECT
start_date,
end_date,
LAG(end_date, 1) OVER (ORDER BY start_date) AS prev_end_date
FROM dates
) foo
WHERE
start_date IS DISTINCT FROM ( prev_end_date + 1 ) -- compare current row start_date with previous row end_date + 1 day
AND prev_end_date IS NOT NULL -- discard first row, which has null value in LAG() calculation
I assume that there are no overlaps in your data and that there are unique values for each pair. If that's not the case, you need to clarify this.
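The LAG() comparison can be mimicked in Python by pairing each range with its predecessor. This is a rough sketch using the sample rows above; `ranges_after_gap` is a made-up name, and like the query it assumes the ranges do not overlap.

```python
from datetime import date, timedelta

ranges = [
    (date(2015, 1, 5), date(2015, 1, 14)),
    (date(2015, 1, 15), date(2015, 1, 20)),
    (date(2015, 1, 24), date(2015, 1, 28)),  # gap before this one
    (date(2015, 1, 29), date(2015, 2, 3)),
    (date(2015, 2, 4), date(2015, 2, 7)),
    (date(2015, 2, 9), date(2015, 2, 11)),   # gap before this one
]

def ranges_after_gap(rows):
    """Return the ranges whose start is not the day after the previous end."""
    ordered = sorted(rows)
    found = []
    # zip(ordered, ordered[1:]) plays the role of LAG(): (previous row, current row).
    for (_, prev_end), (start, end) in zip(ordered, ordered[1:]):
        if start != prev_end + timedelta(days=1):
            found.append((start, end))
    return found

gaps = ranges_after_gap(ranges)
latest = gaps[-1] if gaps else None  # the range after the most recent gap
print(gaps)
print(latest)
```

The first row never appears in the result because it has no predecessor, which corresponds to the `prev_end_date IS NOT NULL` filter.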
I have a table of orders which have a create_date_time (ie - 02/12/2015 14:00:44)
What I would like to do is group two months worth of orders by this create_date_time but instead of using trunc and using a proper day I'd like to go from 6am to 6am. I've tried this below but it doesn't seem to work in that way, rather it truncates and then alters the create_date_time.
select "Date", sum(CFS), sum(MCR) from
(select trunc(phi.create_date_Time)+6/24 as "Date",
case when pkt_sfx = 'CFS' then sum(total_nbr_of_units)
End as CFS,
case when pkt_sfx <> 'CFS' then sum(total_nbr_of_units)
end as MCR
from pkt_hdr ph
inner join pkt_hdr_intrnl phi
on phi.pkt_ctrl_nbr = ph.pkt_ctrl_nbr
where sale_grp = 'I'
group by trunc(phi.create_date_time)+6/24, pkt_sfx
union
select trunc(phi.create_date_Time)+6/24 as "Date",
case when pkt_sfx = 'CFS' then sum(total_nbr_of_units)
End as CFS,
case when pkt_sfx <> 'CFS' then sum(total_nbr_of_units)
end as MCR
from wm_archive.pkt_hdr ph
inner join wm_archive.pkt_hdr_intrnl PHI
on phi.pkt_Ctrl_nbr = ph.pkt_ctrl_nbr
where sale_grp = 'I'
and trunc(phi.create_date_time) >= trunc(sysdate)-60
group by trunc(phi.create_date_time)+6/24, pkt_sfx
)
group by "Date"
Please note the union isn't necessarily important but it is required in the code as half the results will be archived but the current archive day will cause date overlap that must be removed with the outer query.
Thanks
If I understood correctly, you need to subtract six hours from date and then trunc this date:
select trunc(dt-6/24) dt, sum(units) u
from ( select dt, units from t1 union all
select dt, units from t2 )
group by trunc(dt-6/24)
Test:
create table t1 (dt date, units number(5));
insert into t1 values (timestamp '2015-12-01 12:47:00', 7);
insert into t1 values (timestamp '2015-12-01 23:47:00', 7);
create table t2 (dt date, units number(5));
insert into t2 values (timestamp '2015-12-02 05:47:00', 7);
insert into t2 values (timestamp '2015-12-02 14:47:00', 7);
Output:
Dt U
---------- ---
2015-12-01 21
2015-12-02 7
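The same shift-then-truncate rule can be checked quickly in Python. This is just a sketch on the four test rows above, with the variable names chosen for illustration: subtracting six hours before taking the date makes each "day" run from 06:00 to 06:00.

```python
from collections import defaultdict
from datetime import datetime, timedelta

orders = [
    (datetime(2015, 12, 1, 12, 47), 7),
    (datetime(2015, 12, 1, 23, 47), 7),
    (datetime(2015, 12, 2, 5, 47), 7),   # before 6am, so it belongs to Dec 1
    (datetime(2015, 12, 2, 14, 47), 7),
]

totals = defaultdict(int)
for created, units in orders:
    # Equivalent of trunc(dt - 6/24): shift back six hours, then drop the time part.
    business_day = (created - timedelta(hours=6)).date()
    totals[business_day] += units

for day in sorted(totals):
    print(day, totals[day])
```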
Apply the 6-hour shift before TRUNC rather than after it, and subtract instead of add so that times before 6am fall into the previous day - instead of
select trunc(phi.create_date_Time)+6/24 as "Date"
use
select trunc(phi.create_date_Time - 6/24) as "Date"
(you also need to change the other occurrences of trunc())
BTW: I'd use another name for the "Date" column - DATE is a SQL data type, so having a column named "Date" is somewhat confusing.
I have the following table tbl in my database, and a dynamic joining date, e.g. 1-1-2012. I want to find out whether this date falls between (Fall and Spring), (Spring and Summer), or (Summer and Fall). I want a query in Oracle to which I pass only the joining date and which returns the Semestertime and joiningDate.
Semestertime joiningDate
Fall 10-13-2011
Spring 2-1-2012
Summer 6-11-2012
Fall 10-1-2015
If I understand your question correctly:
SELECT *
FROM your_table
WHERE joiningDate between to_date (your_lower_limit_date_here, 'mm-dd-yyyy')
AND to_date (your_upper_limit_date_here, 'mm-dd-yyyy');
What about something like that:
select 'BEFORE' term,
t."Semestertime", to_char(t."joiningDate", 'MM-DD-YYYY')
from (
select tbl.* from tbl where tbl."joiningDate" < to_date('1-1-2012','MM-DD-YYYY')
-- ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
-- your reference date
order by tbl."joiningDate" desc) t
where rownum = 1 -- filter on rownum outside the ordered subquery, so it is applied after the sort
union all
select 'AFTER' term,
t."Semestertime", to_char(t."joiningDate", 'MM-DD-YYYY')
from (
select tbl.* from tbl where tbl."joiningDate" > to_date('1-1-2012','MM-DD-YYYY')
-- ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
-- your reference date
order by tbl."joiningDate" asc) t
where rownum = 1
This will return the "term" before and after a given date. You will probably have to adapt such query to your specific needs. But that might be a good starting point.
For example, given your business rules, you might consider using <= instead of <. You might also need the result displayed as columns instead of rows. But none of this should be too hard to change.
As an alternate solution using CTE and sub-queries:
with testdata as (select to_date('1-1-2012','MM-DD-YYYY') refdate from dual)
select v.what, tbl.* from tbl join
(
select 'BEFORE' what, max(t1."joiningDate") d
from tbl t1
where t1."joiningDate" < to_date('1-1-2012','MM-DD-YYYY')
union all
select 'AFTER' what, min(t1."joiningDate") d
from tbl t1
where t1."joiningDate" > to_date('1-1-2012','MM-DD-YYYY')
) v
on tbl."joiningDate" = v.d
See http://sqlfiddle.com/#!4/c7fa5/15 for a live demo comparing those solutions.
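To illustrate what "the term before and after a given date" means on the sample data, here is a short Python sketch using a binary search over the sorted joining dates. It is only an illustration of the lookup, not Oracle code; `surrounding_terms` is a made-up name, and strict comparisons (< and >) are assumed, as in the queries above.

```python
from bisect import bisect_left, bisect_right
from datetime import date

terms = [
    (date(2011, 10, 13), "Fall"),
    (date(2012, 2, 1), "Spring"),
    (date(2012, 6, 11), "Summer"),
    (date(2015, 10, 1), "Fall"),
]

def surrounding_terms(rows, ref):
    """Return (last row strictly before ref, first row strictly after ref)."""
    ordered = sorted(rows)
    dates = [d for d, _ in ordered]
    i = bisect_left(dates, ref)    # number of dates strictly before ref
    j = bisect_right(dates, ref)   # index of the first date strictly after ref
    before = ordered[i - 1] if i > 0 else None
    after = ordered[j] if j < len(ordered) else None
    return before, after

before, after = surrounding_terms(terms, date(2012, 1, 1))
print("BEFORE", before)
print("AFTER", after)
```

For the reference date 1-1-2012 this places the joining date between Fall and Spring.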
I have data like below:
StartDate EndDate Duration
----------
41890 41892 3
41898 41900 3
41906 41907 2
41910 41910 1
StartDate and EndDate are respective ID values for any dates from calendar. I want to calculate the sum of duration for consecutive days. Here I want to include the days which are weekends. E.g. in the above data, let's say 41908 and 41909 are weekends, then my required result set should look like below.
I already have another proc that can return me the next working day, i.e. if I pass 41907 or 41908 or 41909 as DateID in that proc, it will return 41910 as the next working day. Basically I want to check if the DateID returned by my proc when I pass the above EndDateID is same as the next StartDateID from above data, then both the rows should be clubbed. Below is the data I want to get.
ID StartDate EndDate Duration
----------
278457 41890 41892 3
278457 41898 41900 3
278457 41906 41910 3
Please let me know in case the requirement is not clear, I can explain further.
My Date Table is like below:
DateId Date Day
----------
41906 09-04-2014 Thursday
41907 09-05-2014 Friday
41908 09-06-2014 Saturday
41909 09-07-2014 Sunday
41910 09-08-2014 Monday
Here is the SQL Code for setup:
CREATE TABLE Table1
(
StartDate INT,
EndDate INT,
LeaveDuration INT
)
INSERT INTO Table1
VALUES(41890, 41892, 3),
(41898, 41900, 3),
(41906, 41907, 3),
(41910, 41910, 1)
CREATE TABLE DateTable
(
DateID INT,
Date DATETIME,
Day VARCHAR(20)
)
INSERT INTO DateTable
VALUES(41907, '09-05-2014', 'Friday'),
(41908, '09-06-2014', 'Saturday'),
(41909, '09-07-2014', 'Sunday'),
(41910, '09-08-2014', 'Monday'),
(41911, '09-09-2014', 'Tuesday')
This is rather complicated. Here is an approach using window functions.
First, use the date table to enumerate the dates without weekends (you can also take out holidays if you want). Then, expand the periods into one day per row, by using a non-equijoin.
You can then use a trick to identify sequential days. This trick is to generate a sequential number for each id and subtract it from the sequential number for the dates. This is a constant for sequential days. The final step is simply an aggregation.
The resulting query is something like this:
with d as (
select d.*, row_number() over (order by date) as seqnum
from dates d
where day not in ('Saturday', 'Sunday')
)
select t.id, min(t.date) as startdate, max(t.date) as enddate, sum(duration)
from (select t.*, ds.seqnum, ds.date,
(ds.seqnum - row_number() over (partition by id order by ds.date) ) as grp
from table t join
d ds
on ds.date between t.startdate and t.enddate
) t
group by t.id, grp;
EDIT:
The following is the version on this SQL Fiddle:
with d as (
select d.*, row_number() over (order by date) as seqnum
from datetable d
where day not in ('Saturday', 'Sunday')
)
select t.id, min(t.date) as startdate, max(t.date) as enddate, sum(duration)
from (select t.*, ds.seqnum, ds.date,
(ds.seqnum - row_number() over (partition by id order by ds.date) ) as grp
from (select t.*, 'abc' as id from table1 t) t join
d ds
on ds.dateid between t.startdate and t.enddate
) t
group by grp;
I believe this is working, but the date table doesn't have all the dates in it.
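The "sequence number minus row number" trick is easier to see on a tiny example. The Python sketch below is an illustration only: it uses just the five calendar rows from the question (41906 to 41910) and the two leave periods that fall inside them. Working days get a running number, and consecutive leave days then share the same difference between that number and their position.

```python
from itertools import groupby

# Calendar rows from the question: DateId -> day name
calendar = {
    41906: "Thursday",
    41907: "Friday",
    41908: "Saturday",
    41909: "Sunday",
    41910: "Monday",
}
leaves = [(41906, 41907), (41910, 41910)]  # (StartDate, EndDate) as DateIds

# Step 1: number the working days only (weekends are skipped).
working = [d for d in sorted(calendar) if calendar[d] not in ("Saturday", "Sunday")]
seqnum = {d: n for n, d in enumerate(working, start=1)}

# Step 2: expand each leave period into one entry per working day.
days = sorted({d for start, end in leaves for d in working if start <= d <= end})

# Step 3: seqnum minus position is constant within a run of consecutive working days.
islands = []
for _, run in groupby(enumerate(days), key=lambda p: seqnum[p[1]] - p[0]):
    run_days = [d for _, d in run]
    islands.append((run_days[0], run_days[-1], len(run_days)))

print(islands)
```

Friday 41907 and Monday 41910 end up in the same island because the weekend days never receive a sequence number. With a complete date table, the other leave periods would be handled the same way.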