How do I write a split statement in SQL that divides a record into 2 records if the months of the start and end dates differ?

I have the following SQL code, and I need to break a row into two rows if 'COV_PRD_STRT_DT' and 'COV_PRD_END_DT' fall in different months.
WITH CTE AS
( SELECT COV_PRD_STRT_DT,TO_DATE,COV_PRD_END_DT as MO_END_DT,
case when dateadd (DAY,-DAY(DATEADD(MONTH,1, BF.COV_PRD_STRT_DT)),
DATEADD(MONTH,1, d.from_date))
< To_date THEN DATEADD(DAY,-DAY(DATEADD(MONTH,1, BF.COV_PRD_END_DT)),
DATEADD(MONTH,1, BF.COV_PRD_END_DT))
ELSE To_Date END as MO_END_DT
FROM BILLING_FACT
UNION ALL
SELECT COV_PRD_STRT_DT,To_date,DATEADD(DAY,1,BF.COV_PRD_END_DT) as MO_END_DT, < TO_DATE
THEN DATEADD(DAY,-1,DATEADD(MONTH,1,DATEADD(DAY,1,BF.COV_PRD_END_DT)))
ELSE To_Date END as MO_END_DT
FROM CTE WHERE COV_PRD_END_DT < To_Date
)
select * from CTE order by COV_PRD_STRT_DT,COV_PRD_END_DT
thanks

If the start date and end date fall in different months, split the row in two: the first row runs from cov_prd_strt_dt to the last day of that month, and the second row runs from the first day of cov_prd_end_dt's month to cov_prd_end_dt.
SELECT cov_prd_strt_dt strt_dt,
CASE WHEN DATE_PART('Month', cov_prd_strt_dt) <> DATE_PART('Month', cov_prd_end_dt)
THEN LAST_DAY(cov_prd_strt_dt)
ELSE cov_prd_end_dt
END end_dt
FROM billing_fact
UNION
SELECT CASE WHEN DATE_PART('Month', cov_prd_strt_dt) <> DATE_PART('Month', cov_prd_end_dt)
THEN DATE_TRUNC('month', cov_prd_end_dt)
ELSE cov_prd_strt_dt
END strt_dt,
cov_prd_end_dt end_dt
FROM billing_fact
Using UNION rather than UNION ALL eliminates the duplicate rows produced for records that did not need splitting (cov_prd_strt_dt and cov_prd_end_dt in the same month).
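As a rough, runnable sketch of the same idea, here is the query adapted to SQLite and run through Python's sqlite3 module. The sample rows are made up, and SQLite's strftime() and date(..., 'start of month') stand in for DATE_PART, LAST_DAY and DATE_TRUNC, so treat it as an illustration rather than the exact query for your database. It also compares year and month together, which is a small change from the query above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE billing_fact (cov_prd_strt_dt TEXT, cov_prd_end_dt TEXT);
INSERT INTO billing_fact VALUES ('2020-01-20', '2020-02-10');  -- spans two months
INSERT INTO billing_fact VALUES ('2020-03-05', '2020-03-25');  -- single month
""")

split_sql = """
SELECT cov_prd_strt_dt AS strt_dt,
       CASE WHEN strftime('%Y-%m', cov_prd_strt_dt) <> strftime('%Y-%m', cov_prd_end_dt)
            THEN date(cov_prd_strt_dt, 'start of month', '+1 month', '-1 day')
            ELSE cov_prd_end_dt END AS end_dt
FROM billing_fact
UNION  -- not UNION ALL: unsplit rows come out of both branches identically
SELECT CASE WHEN strftime('%Y-%m', cov_prd_strt_dt) <> strftime('%Y-%m', cov_prd_end_dt)
            THEN date(cov_prd_end_dt, 'start of month')
            ELSE cov_prd_strt_dt END AS strt_dt,
       cov_prd_end_dt AS end_dt
FROM billing_fact
ORDER BY strt_dt
"""

rows = conn.execute(split_sql).fetchall()
for row in rows:
    print(row)
```

The single-month record appears once because UNION removes the identical row produced by the second branch.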

These are nice answers when the result is one or two rows. In the general case a join does a better job, in terms of performance and relational thinking as well as code readability (the last point can be a matter of opinion, but it is nevertheless important).
First I create a temp table with all the relevant months (but I guess you ought to have a similar table yourself already):
Create temp table months as
Select distinct substr(cov_prd_strt_dt,1,7) yyyy_mm
From billing_fact
Union
Select distinct substr(cov_prd_end_dt,1,7) yyyy_mm
From billing_fact
Then to the actual join:
Select
*,
substr(f.cov_prd_strt_dt,1,7) start_yyyy_mm,
substr(f.cov_prd_end_dt,1,7) end_yyyy_mm
From billing_fact f
Join months m
On m.yyyy_mm between substr(f.cov_prd_strt_dt,1,7) and substr(f.cov_prd_end_dt,1,7)
I hope this helps
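To illustrate what the general case should produce (one row per calendar month touched by the period), here is a small Python sketch. The function name split_by_month is made up for the example, and it works on plain dates rather than on the billing_fact table, so it only shows the expected shape of the output.

```python
import calendar
from datetime import date, timedelta

def split_by_month(start, end):
    """Split the inclusive range [start, end] into one piece per calendar month."""
    pieces = []
    cur = start
    while cur <= end:
        # Last day of the month that `cur` falls in.
        month_end = date(cur.year, cur.month,
                         calendar.monthrange(cur.year, cur.month)[1])
        piece_end = min(month_end, end)
        pieces.append((cur, piece_end))
        cur = piece_end + timedelta(days=1)  # first day of the next month
    return pieces

pieces = split_by_month(date(2020, 1, 20), date(2020, 3, 10))
for piece_start, piece_end in pieces:
    print(piece_start, piece_end)
```

A period inside a single month comes back unchanged as one piece.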

Related

Oracle - Split a record into multiple records

I have a schedule table for each month schedule. And this table also has days off within that month. I need a result set that will tell working days and off days for that month.
Eg.
CREATE TABLE SCHEDULE(sch_yyyymm varchar2(6), sch varchar2(20), sch_start_date date, sch_end_date date);
INSERT INTO SCHEDULE VALUES('201703','Working Days', to_date('03/01/2017','mm/dd/yyyy'), to_date('03/31/2017','mm/dd/yyyy'));
INSERT INTO SCHEDULE VALUES('201703','Off Day', to_date('03/05/2017','mm/dd/yyyy'), to_date('03/07/2017','mm/dd/yyyy'));
INSERT INTO SCHEDULE VALUES('201703','off Days', to_date('03/08/2017','mm/dd/yyyy'), to_date('03/10/2017','mm/dd/yyyy'));
INSERT INTO SCHEDULE VALUES('201703','off Days', to_date('03/15/2017','mm/dd/yyyy'), to_date('03/15/2017','mm/dd/yyyy'));
Using SQL or PL/SQL I need to split the record with Working Days and Off Days.
From above records I need result set as:
201703 Working Days 03/01/2017 - 03/04/2017
201703 Off Days 03/05/2017 - 03/10/2017
201703 Working Days 03/11/2017 - 03/14/2017
201703 Off Days 03/15/2017 - 03/15/2017
201703 Working Days 03/16/2017 - 03/31/2017
Thank You for your help.
Edit: I've had a bit more of a think, and this approach works fine for your insert records above - however, it misses records where there are not continuous "off day" periods. I need to have a bit more of a think and will then make some changes
I've put together a test using the lead and lag functions and a self join.
The upshot is you self-join the "Off Days" onto the existing tables to find the overlaps. Then calculate the start/end dates on either side of each record. A bit of logic then lets us work out which date to use as the final start/end dates.
I tested this in a SQL fiddle using Postgres, as the Oracle function wasn't working, but it should translate OK.
select sch,
/* Work out which date to use as this record's Start date */
case when prev_end_date is null then sch_start_date
else off_end_date + 1
end as final_start_date,
/* Work out which date to use as this record's end date */
case when next_start_date is null then sch_end_date
when next_start_date is not null and prev_end_date is not null then next_start_date - 1
else off_start_date - 1
end as final_end_date
from (
select a.*,
b.*,
/* Get the start/end dates for the records on either side of each working day record */
lead( b.off_start_date ) over( partition by a.sch_start_date order by b.off_start_date ) as next_start_date,
lag( b.off_end_date ) over( partition by a.sch_start_date order by b.off_start_date ) as prev_end_date
from (
/* Get all schedule records */
select sch,
sch_start_date,
sch_end_date
from schedule
) as a
left join
(
/* Get all non-working day schedule records */
select sch as off_sch,
sch_start_date as off_start_date,
sch_end_date as off_end_date
from schedule
where sch <> 'Working Days'
) as b
/* Join on "Off Days" that overlap "Working Days" */
on a.sch_start_date <= b.off_end_date
and a.sch_end_date >= b.off_start_date
and a.sch <> b.off_sch
) as c
order by final_start_date
If you had a dates table this would have been easier.
You can construct a dates table using a recursive CTE and join onto it. Then use the difference-of-row-numbers approach to put rows with the same schedule on consecutive dates into one group; the min and max date of each group are the start and end dates for that sch. I assume there are only two sch values, Working Days and Off Day.
with dates(dt) as (select date '2017-03-01' from dual
union all
select dt+1 from dates where dt < date '2017-03-31')
,groups as (select sch_yyyymm,dt,sch,
row_number() over(partition by sch_yyyymm order by dt)
- row_number() over(partition by sch_yyyymm,sch order by dt) as grp
from (select s.sch_yyyymm,d.dt,
/*This condition is to avoid a given date with 2 sch values, as 03-01-2017 - 03-31-2017 are working days
on one row and there is an Off Day status for some of these days.
In such cases Off Day would be picked up as sch*/
case when count(*) over(partition by d.dt) > 1 then min(s.sch) over(partition by d.dt) else s.sch end as sch
from dates d
join schedule s on d.dt >= s.sch_start_date and d.dt <= s.sch_end_date
) t
)
select sch_yyyymm,sch,min(dt) as start_date,max(dt) as end_date
from groups
group by sch_yyyymm,sch,grp
I couldn't get the recursive cte running in Oracle. Here is a demo using SQL Server.
Sample Demo in SQL Server
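For anyone who wants to check the expected output outside the database, here is a minimal Python sketch of the same dates-table idea. It groups consecutive days with itertools.groupby instead of the row-number difference, which has the same effect. It assumes the off-day labels are normalized to a single value ("Off Days"), which the sample inserts in the question do not do.

```python
from datetime import date, timedelta
from itertools import groupby

# Sample schedule from the question, with the off-day label normalized (assumption).
schedule = [
    ("Working Days", date(2017, 3, 1), date(2017, 3, 31)),
    ("Off Days", date(2017, 3, 5), date(2017, 3, 7)),
    ("Off Days", date(2017, 3, 8), date(2017, 3, 10)),
    ("Off Days", date(2017, 3, 15), date(2017, 3, 15)),
]

def label_for(day):
    """An off period wins over the month-long working period covering the same day."""
    labels = {sch for sch, start, end in schedule if start <= day <= end}
    return "Off Days" if "Off Days" in labels else "Working Days"

# The "dates table": every day of March 2017.
days = [date(2017, 3, 1) + timedelta(days=i) for i in range(31)]

# Collapse runs of consecutive days that share a label into (label, start, end).
periods = []
for label, run in groupby(days, key=label_for):
    run = list(run)
    periods.append((label, run[0], run[-1]))

for label, start, end in periods:
    print(label, start, end)
```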

Find the date after a gap in date range in sql

I have these date ranges that represent start and end dates of subscription. There are no overlaps in date ranges.
Start Date End Date
1/5/2015 - 1/14/2015
1/15/2015 - 1/20/2015
1/24/2015 - 1/28/2015
1/29/2015 - 2/3/2015
I want to identify delays of more than 1 day between any subscription ending and a new one starting, e.g. for the data above I want the output: 1/24/2015 - 1/28/2015.
How can I do this using a sql query?
Edit: There can also be multiple gaps in the subscription date ranges, but I want the date range after the latest one.
You can do this using a left join or not exists:
select t.*
from t
where not exists (select 1
from t t2
where t2.enddate = dateadd(day, -1, t.startdate)
);
Note that this will also give you the first record in the sequence . . . which, strictly speaking, matches the conditions. Here is one solution to that problem:
select t.*
from t cross join
(select min(startdate) as minsd from t) as x
where not exists (select 1
from t t2
where t2.enddate = dateadd(day, -1, t.startdate)
) and
t.startdate <> minsd;
You can also approach this with window functions:
select t.*
from (select t.*,
lag(enddate) over (order by startdate) as prev_enddate,
min(startdate) over () as min_startdate
from t
) t
where min_startdate <> startdate and
prev_enddate <> dateadd(day, -1, startdate);
Also note that this logic assumes that the time periods do not overlap. If they do, a clearer problem statement is needed to understand what you are really looking for.
You can achieve this using the window function LAG(), which gets the value from the previous row in an ordered set for later comparison in the WHERE clause. Then, in the WHERE clause, you just apply your gap definition and discard the first row.
Sample data:
create table dates(start_date date, end_date date);
insert into dates values
('2015-01-05','2015-01-14'),
('2015-01-15','2015-01-20'),
('2015-01-24','2015-01-28'), -- gap
('2015-01-29','2015-02-03'),
('2015-02-04','2015-02-07'),
('2015-02-09','2015-02-11'); -- gap
Query
SELECT
start_date,
end_date
FROM (
SELECT
start_date,
end_date,
LAG(end_date, 1) OVER (ORDER BY start_date) AS prev_end_date
FROM dates
) foo
WHERE
start_date IS DISTINCT FROM ( prev_end_date + 1 ) -- compare current row start_date with previous row end_date + 1 day
AND prev_end_date IS NOT NULL -- discard first row, which has null value in LAG() calculation
I assume that there are no overlaps in your data and that there are unique values for each pair. If that's not the case, you need to clarify this.
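The same comparison with the previous row can be sketched in plain Python, which may help when checking the expected result by hand. This is only an illustration using the sample rows above; it assumes the ranges do not overlap and picks the last range found as the one after the latest gap.

```python
from datetime import date, timedelta

ranges = [
    (date(2015, 1, 5), date(2015, 1, 14)),
    (date(2015, 1, 15), date(2015, 1, 20)),
    (date(2015, 1, 24), date(2015, 1, 28)),  # gap before this one
    (date(2015, 1, 29), date(2015, 2, 3)),
    (date(2015, 2, 4), date(2015, 2, 7)),
    (date(2015, 2, 9), date(2015, 2, 11)),   # gap before this one
]
ranges.sort()  # order by start date, like ORDER BY start_date

# Pair each range with its predecessor (what LAG does) and keep those
# that start more than one day after the previous range ended.
after_gap = [
    cur for prev, cur in zip(ranges, ranges[1:])
    if cur[0] - prev[1] > timedelta(days=1)
]

# The range that follows the latest gap, or None if there is no gap.
latest_after_gap = after_gap[-1] if after_gap else None
print(after_gap)
print(latest_after_gap)
```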

SQL query for all the days of a month

I have the following table: RENTAL(book_date, copy_id, member_id, title_id, act_ret_date, exp_ret_date), where book_date is the day the book was booked. I need to write a query that, for every day of the month (1-30, 1-29, or 1-31 depending on the month), shows the number of books booked.
I currently know how to show the number of books rented on the days that are in the table:
select count(book_date), to_char(book_date,'DD')
from rental
group by to_char(book_date,'DD');
My questions are:
How do I show the rest of the days (say my database has no books rented on the 19th or 20th, or on multiple days) with the number 0?
How do I show only the days of the current month (28, 29, 30, or 31 are all possible depending on the month and year)? I am lost. This must be done using only a SQL query, no PL/SQL or other stuff.
The following query gives you all days in the current month. In your case you can replace SYSDATE with your date column and join with this query to get the counts for a given month:
SELECT DT
FROM(
SELECT TRUNC (last_day(SYSDATE) - ROWNUM) dt
FROM DUAL CONNECT BY ROWNUM < 32
)
where DT >= trunc(sysdate,'mm')
The answer is to create a table like this:
create table yearsmonthsdays (year varchar(4), month varchar(2), day varchar(2));
Use any language you wish, e.g. iterate in Java with Calendar.getInstance().getActualMaximum(Calendar.DAY_OF_MONTH) to get the last day of the month, for as many years and months as you like, and fill that table with the year, month, and each day from 1 to the last day of the month.
You'd get something like:
insert into yearsmonthsdays values ('1995','02','01');
insert into yearsmonthsdays values ('1995','02','02');
...
insert into yearsmonthsdays values ('1995','02','28'); /* non-leap year */
...
insert into yearsmonthsdays values ('1996','02','01');
insert into yearsmonthsdays values ('1996','02','02');
...
insert into yearsmonthsdays values ('1996','02','28');
insert into yearsmonthsdays values ('1996','02','29'); /* leap year */
...
and so on.
Once you have this table, your work is almost finished. Make a left outer join from this table to your table, joining on year, month and day; where no rental rows match, the count will be zero as you wish. Without using programming, this is your best bet.
In Oracle, you can query from dual and use the CONNECT BY LEVEL syntax to generate a series of rows - in your case, dates. From there on, it's just a matter of deciding what dates you want to display (in my example I used all the dates from 2014) and joining on your table:
SELECT all_date, COALESCE (cnt, 0) AS cnt
FROM (SELECT to_date('01/01/2014', 'dd/mm/yyyy') + rownum - 1 AS all_date
FROM dual
CONNECT BY LEVEL <= 365) d
LEFT JOIN (SELECT TRUNC(book_date) AS book_day, COUNT(book_date) AS cnt
FROM rental
GROUP BY TRUNC(book_date)) r ON d.all_date = r.book_day
There's no need to get ROWNUM involved ... you can just use LEVEL in the CONNECT BY:
WITH d1 AS (
SELECT TRUNC(SYSDATE, 'MONTH') - 1 + LEVEL AS book_date
FROM dual
CONNECT BY TRUNC(SYSDATE, 'MONTH') - 1 + LEVEL <= LAST_DAY(SYSDATE)
)
SELECT TRUNC(d1.book_date), COUNT(r.book_date)
FROM d1 LEFT JOIN rental r
ON TRUNC(d1.book_date) = TRUNC(r.book_date)
GROUP BY TRUNC(d1.book_date);
Simply replace SYSDATE with a date in the month you're targeting for results.
All days of the month based on current date
select trunc(sysdate) - (to_number(to_char(sysdate,'DD')) - 1)+level-1 x from dual connect by level <= TO_CHAR(LAST_DAY(sysdate),'DD')
This worked for me:
SELECT DT
FROM (SELECT TRUNC(LAST_DAY(SYSDATE) - (CASE WHEN ROWNUM=1 THEN 0 ELSE ROWNUM-1 END)) DT
FROM DUAL
CONNECT BY ROWNUM <= 32)
WHERE DT >= TRUNC(SYSDATE, 'MM')
In Oracle SQL the query must look like this so as not to miss the last day of the month:
SELECT DT
FROM(
SELECT trunc(add_months(sysdate, 1),'MM')- ROWNUM dt
FROM DUAL CONNECT BY ROWNUM < 32
)
where DT >= trunc(sysdate,'mm')
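To sanity-check what any of these queries should return, here is a small Python sketch of the target result: every day of one month, with a count of 0 where nothing was booked. The function name bookings_per_day and the sample dates are invented for the example; it is not a replacement for the SQL-only solution the question asks for.

```python
import calendar
from collections import Counter
from datetime import date

def bookings_per_day(book_dates, year, month):
    """Return (day, count) for every day of the month, including days with no bookings."""
    counts = Counter(d for d in book_dates if (d.year, d.month) == (year, month))
    last_day = calendar.monthrange(year, month)[1]  # 28, 29, 30 or 31
    return [
        (date(year, month, day), counts.get(date(year, month, day), 0))
        for day in range(1, last_day + 1)
    ]

# Made-up booking dates; February 2016 is a leap-year month with 29 days.
book_dates = [date(2016, 2, 1), date(2016, 2, 1), date(2016, 2, 29), date(2016, 3, 1)]
result = bookings_per_day(book_dates, 2016, 2)
for day, count in result:
    print(day, count)
```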

Add one for every row that fulfills where criteria between period

I have a Postgres table that I'm trying to analyze based on some date columns.
I'm basically trying to count the number of rows in my table that fulfill this requirement, and then group them by month and year. Instead of my query looking like this:
SELECT * FROM $TABLE WHERE date1::date <= '2012-05-31'
and date2::date > '2012-05-31';
it should be able to display this for the months available in my data so that I don't have to change the months manually every time I add new data, and so I can get everything with one query.
In the case above I'd like it to group the sum of rows which fit the criteria into the year 2012 and month 05. Similarly, if my WHERE clause looked like this:
date1::date <= '2012-06-30' and date2::date > '2012-06-30'
I'd like it to group this sum into the year 2012 and month 06.
This isn't entirely clear to me: "I'd like it to group the sum of rows".
I'll interpret it this way: you want to list all rows "per month" matching the criteria:
WITH x AS (
SELECT date_trunc('month', min(date1)) AS start
,date_trunc('month', max(date2)) + interval '1 month' AS stop
FROM tbl
)
SELECT to_char(y.mon, 'YYYY-MM') AS mon, t.*
FROM (
SELECT generate_series(x.start, x.stop, '1 month') AS mon
FROM x
) y
LEFT JOIN tbl t ON t.date1::date <= y.mon
AND t.date2::date > y.mon -- why the explicit cast to date?
ORDER BY y.mon, t.date1, t.date2;
Assuming date2 >= date1.
Compute the lower and upper border of the time period and truncate to month (adding 1 month to the upper border to include the last row, too).
Use generate_series() to create the set of months in question
LEFT JOIN rows from your table with the declared criteria and sort by month.
You could also GROUP BY at this stage to calculate aggregates ..
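As a cross-check of what the aggregated result could look like, here is a minimal Python sketch that counts, for each month in the data, the rows matching the question's criterion at the month's last day (date1 <= month end and date2 > month end). The rows are invented and the helper month_ends is hypothetical; note that it tests against the month end, as in the question, whereas the query above joins on the month start.

```python
import calendar
from datetime import date

# Invented (date1, date2) rows, assuming date2 >= date1.
rows = [
    (date(2012, 4, 10), date(2012, 6, 15)),
    (date(2012, 5, 1), date(2012, 7, 2)),
    (date(2012, 6, 5), date(2012, 6, 20)),
]

def month_ends(first, last):
    """Yield the last day of every month from first's month through last's month."""
    year, month = first.year, first.month
    while (year, month) <= (last.year, last.month):
        yield date(year, month, calendar.monthrange(year, month)[1])
        year, month = (year + 1, 1) if month == 12 else (year, month + 1)

first = min(d1 for d1, _ in rows)
last = max(d2 for _, d2 in rows)

# One entry per month: how many rows satisfy date1 <= month end < date2.
counts = {
    end.strftime("%Y-%m"): sum(1 for d1, d2 in rows if d1 <= end < d2)
    for end in month_ends(first, last)
}
print(counts)
```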
Here is the reasoning. First, create a list of all possible dates. Then get the cumulative number of rows whose date1 falls on or before a given date. Then get the cumulative number of rows whose date2 falls on or before that date and subtract it; what remains is the number of rows with date1 <= the date and date2 > the date. The following query does this using correlated subqueries (not my favorite construct, but handy in this case):
select thedate,
(select count(*) from t where date1::date <= d.thedate) -
(select count(*) from t where date2::date <= d.thedate) as num_rows
from (select distinct thedate
from ((select date1::date as thedate from t) union all
(select date2::date as thedate from t)
) d
) d
This is assuming that date2 occurs after date1. My model is start and stop dates of customers. If this isn't the case, the query might not work.
It sounds like you could benefit from the DATEPART T-SQL function (in Postgres, the equivalent is date_part() or EXTRACT). If I understand you correctly, you could do something like this:
SELECT DATEPART(year, date1) Year, DATEPART(month, date1) Month, SUM(value_col)
FROM $Table
-- WHERE CLAUSE ?
GROUP BY DATEPART(year, date1),
DATEPART(month, date1)

Effective date statement where date spans multiple columns

I'm working on a DB2 database and trying to get records by effective date. The only catch is the effective date fields are spanned across 4 columns (month, day, century, year). I think I have the date piece figured out in the select but when I add the where clause I'm having problems. (note that I'm using the digits command to pad because the year 2005 yields just 5 in the year field)
select date(concat(digits(vsmo),concat('/',concat(digits(vsdy),
concat('/',concat(digits(vsct),digits(vsyr))))))) from
ddpincgr d
where (SELECT MAX(<NOT SURE WHAT TO PUT IN HERE>) FROM ddpincgr a WHERE a.vgrno = d.vgrno) <= date('1/1/2000')
Ideas?
Turn it into a sub-query
select *
from (select date(concat(digits(vsmo),concat('/',concat(digits(vsdy),
concat('/',concat(digits(vsct),digits(vsyr))))))) as myDate from
ddpincgr d) as myTable
where max(myTable.myDate) <= date('1/1/2000')
Can't you just put the entire concatenation in the select?
select date(concat(digits(vsmo),concat('/',concat(digits(vsdy), concat('/',concat(digits(vsct),digits(vsyr)))))))
from ddpincgr d
where ( SELECT MAX(date(concat(digits(vsmo),concat('/',concat(digits(vsdy), concat('/',concat(digits(vsct),digits(vsyr))))))))
FROM ddpincgr a
WHERE a.vgrno = d.vgrno) <= date('1/1/2000')
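To make the intended logic concrete, here is a small Python sketch of assembling the date from the four columns and then keeping the groups whose latest effective date is on or before the cutoff. The rows are invented, and it assumes vsct holds the century digits (19, 20) and vsyr the two-digit year, as the question describes; it is an illustration of the logic, not DB2 code.

```python
from datetime import date

# Invented rows: (vgrno, vsmo, vsdy, vsct, vsyr).
rows = [
    (1, 6, 15, 19, 98),
    (1, 3, 1, 19, 99),
    (2, 1, 1, 20, 5),   # year field holds 5 for 2005
    (2, 7, 4, 19, 97),
]

def effective_date(vsmo, vsdy, vsct, vsyr):
    # Assumption: century and two-digit year combine as century * 100 + year.
    return date(vsct * 100 + vsyr, vsmo, vsdy)

cutoff = date(2000, 1, 1)

# Latest effective date per vgrno, like MAX(...) in the correlated subquery.
latest = {}
for vgrno, mo, dy, ct, yr in rows:
    d = effective_date(mo, dy, ct, yr)
    if vgrno not in latest or d > latest[vgrno]:
        latest[vgrno] = d

groups_on_or_before_cutoff = sorted(g for g, d in latest.items() if d <= cutoff)
print(groups_on_or_before_cutoff)
```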