Compare dates in different columns and rows dynamically in SQL - sql

I have a set of data like this.
Data
ID Start_dt End_dt
A 1/1/2010 12/31/2010
A 1/1/2011 12/31/2011
A 6/1/2012 12/31/2012
A 1/1/2014 12/31/2014
A 1/1/2016 10/31/2016
A 1/1/2018 12/31/2018
B 1/1/2016 2/29/2016
B 3/1/2016 10/31/2016
B 1/1/2017 7/31/2017
B 1/1/2019 12/31/9999
C 1/1/2017 12/31/2017
C 1/1/2017 12/31/2018
C 1/1/2019 12/31/9999
I need to create a query that looks at each member's row, compares the current Start_dt against the previous End_dt. If the difference is less than one year, treat those 2 records as one continuous enrollment and return the combined MIN Start_dt and MAX End_dt, and repeat that for all rows for each member. If the difference is >=1 year, treat that as separate enrollment.
Desired result
ID Start_dt End_dt
A 1/1/2010 12/31/2012
A 1/1/2014 12/31/2014
A 1/1/2016 10/31/2016
A 1/1/2018 12/31/2018
B 1/1/2016 7/31/2017
B 1/1/2019 12/31/9999
C 1/1/2017 12/31/9999
Here's a Create Table query:
if OBJECT_ID ('tempdb..#test1') is not null
drop table #test1
CREATE TABLE #test1 (
ID varchar(10),
Start_dt datetime,
End_dt datetime
);
INSERT INTO #test1 VALUES ('A', '1/1/2010', '12/31/2010')
,('A', '1/1/2011', '12/31/2011')
,('A', '6/1/2012', '12/31/2012')
,('A', '1/1/2014', '12/31/2014')
,('A', '1/1/2016', '10/31/2016')
,('A', '1/1/2018', '12/31/2018')
,('B', '1/1/2016', '2/29/2016')
,('B', '3/1/2016', '10/31/2016')
,('B', '1/1/2017', '7/31/2017')
,('B', '1/1/2019', '12/31/9999')
,('C', '1/1/2017', '12/31/2017')
,('C', '1/1/2017', '12/31/2018')
,('C', '1/1/2019', '12/31/9999');
I've been trying to solve this for days. I have tried self-joins and loops but have not found a good solution. Can someone help?
Thank you!

You can use lag() or a cumulative max() to get the previous end date, then compare it to the current start date.
When the gap is a year or more, a new group starts. A cumulative sum of these group starts gives a grouping id.
The rest is aggregation:
select id, min(start_dt) as start_dt, max(end_dt) as end_dt
from (select t1.*,
             -- 0 keeps the row in the current group, 1 starts a new group
             sum(case when prev_end_dt > dateadd(year, -1, start_dt) then 0 else 1 end) over
                 (partition by id order by start_dt) as grp
      from (select t1.*,
                   -- latest end date among all earlier rows for this id
                   max(end_dt) over (partition by id
                                     order by start_dt
                                     rows between unbounded preceding and 1 preceding
                                    ) as prev_end_dt
            from #test1 t1
           ) t1
     ) t1
group by id, grp
order by id, min(start_dt);
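For readers who want to check the grouping rule outside the database, here is a minimal Python sketch of the same idea: sort each member's rows, keep a running latest end date, and start a new enrollment when that end date is a year or more before the current start. The function names `minus_one_year` and `merge_enrollments` are made up for this illustration, and the Feb 29 handling is an assumption meant to mirror `dateadd(year, -1, ...)`.

```python
from datetime import date

def minus_one_year(d):
    """Same calendar day one year earlier; Feb 29 falls back to Feb 28 (assumption)."""
    try:
        return d.replace(year=d.year - 1)
    except ValueError:
        return d.replace(year=d.year - 1, day=28)

def merge_enrollments(rows):
    """rows: iterable of (member_id, start, end) tuples holding date values."""
    merged = []
    for member, start, end in sorted(rows):
        last = merged[-1] if merged else None
        # Same enrollment when the latest end so far is less than a year before this start.
        if last and last[0] == member and last[2] > minus_one_year(start):
            last[2] = max(last[2], end)
        else:
            merged.append([member, start, end])
    return [tuple(m) for m in merged]

sample = [
    ("A", date(2010, 1, 1), date(2010, 12, 31)),
    ("A", date(2011, 1, 1), date(2011, 12, 31)),
    ("A", date(2012, 6, 1), date(2012, 12, 31)),
    ("A", date(2014, 1, 1), date(2014, 12, 31)),
]
for row in merge_enrollments(sample):
    print(row)
```

Sorting the tuples orders the rows by member, then start date, then end date, which plays the role of `partition by id order by start_dt` in the query.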

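You could try this query. Note that it keeps only the first row of each enrollment, so EndDate is that row's own End_dt rather than the latest End_dt in the group, and it treats a gap of more than 365 days (not one calendar year) as a break: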
SELECT ID, StartDate, End_dt AS EndDate
FROM (
SELECT *
, LAG(End_dt) OVER(PARTITION BY ID ORDER BY ID, Start_dt, End_dt) AS PrevEnd
, DATEDIFF(DAY, LAG(End_dt) OVER(PARTITION BY ID ORDER BY ID, Start_dt, End_dt), Start_dt) AS DaysBreak
, (
CASE
WHEN DATEDIFF(DAY, LAG(End_dt) OVER(PARTITION BY ID ORDER BY ID, Start_dt, End_dt), Start_dt) > 365 THEN Start_dt
WHEN LAG(End_dt) OVER(PARTITION BY ID ORDER BY ID, Start_dt, End_dt) IS NULL THEN Start_dt
ELSE NULL
END
) AS StartDate
FROM #test1
) a
WHERE StartDate IS NOT NULL

Related

create time range with 2 columns date_time

I need to find the distinct time periods among multiple overlapping time periods in Teradata ANSI SQL.
For example, the table below contains several overlapping time periods. How can I combine them into 3 unique time periods in Teradata SQL?
I think I could do it in Python with a loop, but I am not sure how to do it in SQL.
ID   Start Date  End Date
---  ----------  ----------
001  2005-01-01  2006-01-01
001  2005-01-01  2007-01-01
001  2008-01-01  2008-06-01
001  2008-04-01  2008-12-01
001  2010-01-01  2010-05-01
001  2010-04-01  2010-12-01
001  2010-11-01  2012-01-01
My expected result is:
ID   start_Date  end_date
---  ----------  ----------
001  2005-01-01  2007-01-01
001  2008-01-01  2008-12-01
001  2010-01-01  2012-01-01
From Oracle 12, you can use MATCH_RECOGNIZE to perform a row-by-row comparison:
SELECT *
FROM table_name
MATCH_RECOGNIZE(
PARTITION BY id
ORDER BY start_date
MEASURES
FIRST(start_date) AS start_date,
MAX(end_date) AS end_date
ONE ROW PER MATCH
PATTERN (overlapping_ranges* last_range)
DEFINE overlapping_ranges AS NEXT(start_date) <= MAX(end_date)
)
Which, for the sample data:
CREATE TABLE table_name (ID, Start_Date, End_Date) AS
SELECT '001', DATE '2005-01-01', DATE '2006-01-01' FROM DUAL UNION ALL
SELECT '001', DATE '2005-01-01', DATE '2007-01-01' FROM DUAL UNION ALL
SELECT '001', DATE '2008-01-01', DATE '2008-06-01' FROM DUAL UNION ALL
SELECT '001', DATE '2008-04-01', DATE '2008-12-01' FROM DUAL UNION ALL
SELECT '001', DATE '2010-01-01', DATE '2010-05-01' FROM DUAL UNION ALL
SELECT '001', DATE '2010-04-01', DATE '2010-12-01' FROM DUAL UNION ALL
SELECT '001', DATE '2010-11-01', DATE '2012-01-01' FROM DUAL;
Outputs:
ID   START_DATE           END_DATE
---  -------------------  -------------------
001  2005-01-01 00:00:00  2007-01-01 00:00:00
001  2008-01-01 00:00:00  2008-12-01 00:00:00
001  2010-01-01 00:00:00  2012-01-01 00:00:00
Update: Alternative query
SELECT id,
start_date,
end_date
FROM (
SELECT id,
dt,
SUM(cnt) OVER (PARTITION BY id ORDER BY dt) AS grp,
cnt
FROM (
SELECT ID,
dt,
SUM(type) OVER (PARTITION BY id ORDER BY dt, ROWNUM) * type AS cnt
FROM table_name
UNPIVOT (dt FOR type IN (start_date AS 1, end_date AS -1))
)
WHERE cnt IN (1,0)
)
PIVOT (MAX(dt) FOR cnt IN (1 AS start_date, 0 AS end_date))
Or, an equivalent that does not use UNPIVOT, PIVOT or ROWNUM and works in both Oracle and PostgreSQL:
SELECT id,
MAX(CASE cnt WHEN 1 THEN dt END) AS start_date,
MAX(CASE cnt WHEN 0 THEN dt END) AS end_date
FROM (
SELECT id,
dt,
SUM(cnt) OVER (PARTITION BY id ORDER BY dt) AS grp,
cnt
FROM (
SELECT ID,
dt,
SUM(type) OVER (PARTITION BY id ORDER BY dt, rn) * type AS cnt
FROM (
SELECT r.*,
ROW_NUMBER() OVER (PARTITION BY id ORDER BY dt ASC, type DESC) AS rn
FROM (
SELECT id, 1 AS type, start_date AS dt FROM table_name
UNION ALL
SELECT id, -1 AS type, end_date AS dt FROM table_name
) r
) p
) s
WHERE cnt IN (1,0)
) t
GROUP BY id, grp
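The two queries above rely on counting how many ranges are open at each date: every start adds 1, every end subtracts 1, a merged range begins when the count rises to 1 and ends when it drops back to 0. As a rough illustration of that counting logic only (not a translation of the SQL), here is a Python sketch for a single ID. The function name `merge_by_counting` is hypothetical, and ISO date strings are used because they sort chronologically.

```python
def merge_by_counting(intervals):
    """intervals: list of (start, end) ISO date strings for one ID."""
    events = []
    for start, end in intervals:
        events.append((start, 1))    # a range opens
        events.append((end, -1))     # a range closes
    # On the same date, starts sort before ends so touching ranges merge.
    events.sort(key=lambda e: (e[0], -e[1]))

    merged = []
    open_count = 0
    current_start = None
    for dt, kind in events:
        open_count += kind
        if kind == 1 and open_count == 1:
            current_start = dt                  # first range of a new island opens
        elif kind == -1 and open_count == 0:
            merged.append((current_start, dt))  # last open range closes
    return merged

sample = [
    ("2005-01-01", "2006-01-01"),
    ("2005-01-01", "2007-01-01"),
    ("2008-01-01", "2008-06-01"),
    ("2008-04-01", "2008-12-01"),
]
print(merge_by_counting(sample))
```

The sort key corresponds to `ORDER BY dt ASC, type DESC` in the second query; without it, a range that starts on the day another ends could be reported as a separate period.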
Update 2: Another Alternative
SELECT id,
MIN(start_date) AS start_date,
MAX(end_Date) AS end_date
FROM (
SELECT t.*,
SUM(CASE WHEN start_date <= prev_max THEN 0 ELSE 1 END)
OVER (PARTITION BY id ORDER BY start_date) AS grp
FROM (
SELECT t.*,
MAX(end_date) OVER (
PARTITION BY id ORDER BY start_date
ROWS BETWEEN UNBOUNDED PRECEDING AND 1 PRECEDING
) AS prev_max
FROM table_name t
) t
) t
GROUP BY id, grp
This is a gaps and islands problem. Try this:
with u as
(select ID, start_date, end_date,
case
when start_date <= lag(end_date) over(partition by ID order by start_date, end_date) then 0
else 1 end as grp
from table_name),
v as
(select ID, start_date, end_date,
sum(grp) over(partition by ID order by start_date, end_date) as island
from u)
select ID, min(start_date) as start_Date, max(end_date) as end_date
from v
group by ID, island;
Basically, you can identify "islands" by comparing the start_date of the current row to the end_date of the previous row (ordered by start_date, end_date): if the start_date is on or before the previous end_date, the row belongs to the same island. A rolling sum() then gives the island numbers. Finally, select min(start_date) and max(end_date) from each island to get the desired output.
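One caveat worth checking on your own data: comparing against only the previous row's end_date gives the expected result for the sample, but it can split an island when a long range contains a shorter one that ends early. Comparing against the running maximum of all earlier end dates (as the MAX(...) OVER variant elsewhere in this thread does) avoids that. The Python sketch below contrasts the two rules for a single ID; `merge_islands` and its `use_running_max` flag are names invented for this illustration.

```python
def merge_islands(intervals, use_running_max=True):
    """intervals: list of (start, end) ISO date strings for one ID."""
    islands = []
    compare_end = None  # the end date the next start is compared against
    for start, end in sorted(intervals):
        if compare_end is None or start > compare_end:
            islands.append([start, end])   # gap found: a new island begins
            compare_end = end
        else:
            islands[-1][1] = max(islands[-1][1], end)
            # previous-row rule: remember only this row's end
            # running-max rule: remember the latest end seen so far
            compare_end = max(compare_end, end) if use_running_max else end
    return [tuple(i) for i in islands]

nested = [
    ("2020-01-01", "2020-12-31"),   # long range
    ("2020-02-01", "2020-03-01"),   # contained in the long range
    ("2020-06-01", "2020-07-01"),   # also contained in the long range
]
print(merge_islands(nested, use_running_max=False))  # two islands
print(merge_islands(nested, use_running_max=True))   # one island
```

With the previous-row rule the third range is compared against 2020-03-01 and starts a second island even though it lies inside the first range.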
This may work with a small change to the function. I tried it in DBeaver:
select ID,Start_Date,End_Date
from
(
select t.*,
dense_rank () over(partition by extract (year from Start_Date) order BY End_Date desc) drnk
from testing_123 t
) temp
where temp.drnk = 1
ORDER BY Start_Date;
Try this
WITH a as (
SELECT
ID,
LEFT(Start_Date, 4) as Year,
MIN(Start_Date) as New_Start_Date
FROM
TAB1
GROUP BY
ID,
LEFT(Start_Date, 4)
), b as (
SELECT
a.ID,
Year,
New_Start_Date,
End_Date
FROM
a
LEFT JOIN
TAB1
ON LEFT(a.New_Start_Date, 4) = LEFT(TAB1.Start_Date, 4)
)
select
ID,
New_Start_Date as Start_Date,
MAX(End_Date)
from
b
GROUP BY
ID,
New_Start_Date;
Example: https://dbfiddle.uk/?rdbms=mysql_8.0&fiddle=97f91b68c635aebfb752538cdd752ace

Exclude duplicates and capture only changes

I have a scenario where I have to exclude duplicates and capture only the changes, and also calculate valid_from and valid_to on the fly. The query I tried works, but it is very slow and fails with a memory error.
Input: Only capture entries where there is a change in either Amount or Check-In/Check-Out.
Calculate Valid_from and Valid_to based on DateChanged.
Output:
SQL I tried.
select * from (select
lead(start_date, "window_offset" - rn + 1, '9999-12-31') over (order by "grp" ) as valid_to,
case when rn = max(rn) over (partition by "grp") then 1 else 0 end as "isLastUpdate",
start_date as valid_from,*
from (
select
min("DateChanged") over (partition by "grp") as start_date,
count(*) over (partition by "grp") as "window_offset",
row_number() over (partition by "grp" order by "DateChanged") as rn,
*
from (
select sum("isChanged") over (partition by OrderId order by "DateChanged") as "grp",*
from (
select
case when "Amount" = lag( "Amount" ) over (partition by OrderId order by "DateChanged") and
"Check-In" = lag( "Check-In" ) over (partition by OrderId order by "DateChanged") and
"Check-Out" = lag( "Check-Out" ) over (partition by OrderId order by "DateChanged")
then 0
else 1
end "isChanged",
*
FROM :in_table
)
))
where "isLastUpdate" = 1;
The logic of your expected answer is unclear: you get valid_from as 8-mar-21 for the first order_id and 9-apr-21 for the second. Both order_ids have overlapping ranges, but you take the least of the previous check_out and the next check_in for the first order_id and the greatest of those two for the second, which is inconsistent.
If you want valid_from to be the greatest of the current check_in and the previous check_outs, and valid_to to be the greatest of the current check_out and the next check_in (or 9999-12-31 if there are no more rows), then:
SELECT orderid,
amount,
check_in,
check_out,
GREATEST(
check_in,
COALESCE(
MAX(check_out) OVER (
PARTITION BY orderid
ORDER BY check_in, check_out
ROWS BETWEEN UNBOUNDED PRECEDING AND 1 PRECEDING
),
check_in
)
) AS valid_from,
GREATEST(
check_out,
LEAD(check_in, 1, DATE '9999-12-31') OVER (
PARTITION BY orderid ORDER BY check_in, check_out
)
) AS valid_to
FROM (
SELECT DISTINCT *
FROM table_name
)
Which, for the sample data:
CREATE TABLE table_name (orderid, datechanged, amount, check_in, check_out) AS
SELECT 1, DATE '2021-03-3', 12.12, DATE '2021-03-03', DATE '2021-03-10' FROM DUAL UNION ALL
SELECT 1, DATE '2021-03-3', 12.12, DATE '2021-03-03', DATE '2021-03-10' FROM DUAL UNION ALL
SELECT 1, DATE '2021-03-3', 12.12, DATE '2021-03-03', DATE '2021-03-10' FROM DUAL UNION ALL
SELECT 1, DATE '2021-03-8', 21.12, DATE '2021-03-08', DATE '2021-03-18' FROM DUAL UNION ALL
SELECT 1, DATE '2021-03-8', 21.12, DATE '2021-03-08', DATE '2021-03-18' FROM DUAL UNION ALL
SELECT 2, DATE '2021-04-4', 9.10, DATE '2021-04-04', DATE '2021-04-09' FROM DUAL UNION ALL
SELECT 2, DATE '2021-04-4', 9.10, DATE '2021-04-04', DATE '2021-04-09' FROM DUAL UNION ALL
SELECT 2, DATE '2021-04-4', 10.20, DATE '2021-04-04', DATE '2021-04-12' FROM DUAL;
Outputs:
ORDERID  AMOUNT  CHECK_IN             CHECK_OUT            VALID_FROM           VALID_TO
-------  ------  -------------------  -------------------  -------------------  -------------------
1        12.12   2021-03-03 00:00:00  2021-03-10 00:00:00  2021-03-03 00:00:00  2021-03-10 00:00:00
1        21.12   2021-03-08 00:00:00  2021-03-18 00:00:00  2021-03-10 00:00:00  9999-12-31 00:00:00
2        9.1     2021-04-04 00:00:00  2021-04-09 00:00:00  2021-04-04 00:00:00  2021-04-09 00:00:00
2        10.2    2021-04-04 00:00:00  2021-04-12 00:00:00  2021-04-09 00:00:00  9999-12-31 00:00:00

Split Overlapping\Merged Dates in SQL

I have a requirement where I will have to split overlapping records on a given table with 2 date fields.
Consider this to be my input table TableT.
ID   EFFECTIVE_DATE  END_DATE
---  --------------  ----------
JKL  2016-01-01      2016-12-31
JKL  2016-04-01      2016-12-31
JKL  2016-01-01      2016-03-04
JKL  2016-04-01      2016-12-31
JKL  2016-01-01      2016-12-31
I would want my output to look like below. I need to achieve this in both SQL Server and Oracle\DB2 so I am looking for a generic solution.
ID   EFFECTIVE_DATE  END_DATE
---  --------------  ----------
JKL  2016-01-01      2016-03-04
JKL  2016-03-05      2016-03-31
JKL  2016-04-01      2016-12-31
This is what I have tried
With EndDates as (
select END_DATE as END_DATE,TRIM(ID) as ID FROM TableT
union all
select ADD_DAYS(EFFECTIVE_DATE, -1) as END_DATE,TRIM(ID) as ID FROM TableT
), Periods as (
select ID as ID,MIN(EFFECTIVE_DATE) as EFFECTIVE_DATE,
(select MIN(END_DATE) from EndDates e
where e.ID = t.ID and
e.END_DATE >= MIN(EFFECTIVE_DATE)) as END_DATE
from
TableT t
group by ID),
EXTN_PERIOD as (select p.ID as ID, ADD_DAYS(p.END_DATE, 1) as EFFECTIVE_DATE,e.END_DATE as END_DATE
from
Periods p
inner join
EndDates e
on
p.ID = e.ID and
p.END_DATE < e.END_DATE
where
not exists (select * from EndDates e2 where
e2.ID = p.ID and
e2.END_DATE > p.END_DATE and
e2.END_DATE < e.END_DATE)
)
select * from EXTN_PERIOD
union
select * from PERIODS
It works partially but does not give me the desired output.
This is the output I get when I run the above query:
ID   EFFECTIVE_DATE  END_DATE
---  --------------  ----------
JKL  2016-01-01      2016-03-04
JKL  2016-03-05      2016-03-31
Thanks in advance!
WITH
/*
MY_TAB (ID, EFFECTIVE_DATE, END_DATE) AS
(
VALUES
('JKL', DATE('2016-01-01'), DATE('2016-12-31'))
, ('JKL', DATE('2016-04-01'), DATE('2016-12-31'))
, ('JKL', DATE('2016-01-01'), DATE('2016-03-04'))
, ('JKL', DATE('2016-04-01'), DATE('2016-12-31'))
, ('JKL', DATE('2016-01-01'), DATE('2016-12-31'))
)
,
*/
A AS
(
SELECT DISTINCT T.ID, DECODE(V.I, 1, T.EFFECTIVE_DATE, 2, T.END_DATE + 1) DT
FROM MY_TAB T, (VALUES 1, 2) V(I)
)
, INTL AS
(
SELECT
ID
, LAG(DT) OVER (PARTITION BY ID ORDER BY DT) AS EFF_DT
, DT AS END_DT
FROM A
)
SELECT ID, EFF_DT, END_DT - 1 AS END_DT
FROM INTL
WHERE EFF_DT IS NOT NULL
ORDER BY 1, 2;
Almost universal. The only customization is the way the "virtual" table with the correlation name V of 2 rows (with INTEGERS 1 and 2) is generated.
The idea is to convert your data first to [inclusive, exclusive) form to simplify further calculations. Then we merge all effective and end dates and construct intervals using the OLAP LAG function. Finally we revert to your [inclusive, inclusive] form.
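The boundary trick described above can be sketched in a few lines of Python for a single ID: collect every start date and every end date plus one day, sort the distinct values, and pair neighbouring cut points. This is only an illustration of the idea, not the query itself; `split_ranges` is a hypothetical name.

```python
from datetime import date, timedelta

ONE_DAY = timedelta(days=1)

def split_ranges(ranges):
    """ranges: list of (effective_date, end_date) pairs, both inclusive, for one ID."""
    # Half-open form [start, end + 1 day): each start and each end + 1 is a cut point.
    cuts = sorted({start for start, _ in ranges} | {end + ONE_DAY for _, end in ranges})
    # Neighbouring cut points form [lo, hi); subtract a day to report inclusive ends again.
    return [(lo, hi - ONE_DAY) for lo, hi in zip(cuts, cuts[1:])]

sample = [
    (date(2016, 1, 1), date(2016, 12, 31)),
    (date(2016, 4, 1), date(2016, 12, 31)),
    (date(2016, 1, 1), date(2016, 3, 4)),
]
for segment in split_ranges(sample):
    print(segment)
```

Like the query, this sketch pairs all neighbouring cut points, so if the input ranges left a gap uncovered, the gap would also come out as a segment and would need to be filtered separately.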
In Oracle you could do something like this:
with
tablet (id, effective_date, end_date) as (
select 'JKL', date '2016-01-01', date '2016-12-31' from dual union all
select 'JKL', date '2016-04-01', date '2016-12-31' from dual union all
select 'JKL', date '2016-01-01', date '2016-03-04' from dual union all
select 'JKL', date '2016-04-01', date '2016-12-31' from dual union all
select 'JKL', date '2016-01-01', date '2016-12-31' from dual
)
, prep (id, dt) as (
select distinct id, case col when 'EFF' then val else val + 1 end
from tablet
unpivot (val for col in (effective_date as 'EFF', end_date as 'END'))
)
, almost_done (id, effective_date, end_date) as (
select id, dt, lead(dt) over (partition by id order by dt) - 1
from prep
)
select id, effective_date, end_date
from almost_done
where end_date is not null
;
ID EFFECTIVE_DATE END_DATE
--- -------------- ----------
JKL 2016-01-01 2016-03-04
JKL 2016-03-05 2016-03-31
JKL 2016-04-01 2016-12-31
Notice the first CTE (tablet, used to generate testing data - you don't need it in your real-life case). Then, the first step is to unpivot the data; I don't know how SQL Server supports unpivoting, worst case you can do it manually with a cross join. (NOT with UNION ALL - that is inefficient.) Then you remove duplicates, and the rest is easy with the LEAD analytic function, which SQL Server should support too.

Hive SQL - How to collapse records that have continuous date ranges?

How do I create logic to combine multiple records that have continuous date ranges into a single row?
The following sample data
Member_key start_date end_date
1 1/1/2017 1/31/2017
1 2/1/2017 2/28/2017
1 3/1/2017 3/31/2017
2 1/1/2017 1/31/2017
2 3/1/2017 3/31/2017
would end up returning the following result set
1 1/1/2017 3/31/2017
2 1/1/2017 1/31/2017
2 3/1/2017 3/31/2017
I found the following link very helpful, and I am sure I am on the right track, but I am running into errors when trying to convert the code to Hive SQL:
http://betteratoracle.com/posts/35-collapsing-continuous-ranges-into-single-rows
Here's where I am getting stuck (2nd to last line below, with the order by in my max(grp) over ...):
with data as(
select
member_key,
case
when datediff(start_date, lag(end_date) over (partition by member_key order by start_date asc)) <= 1 then
null
else
row_number() over ()
end grp,
start_date,
end_date
from default.eligibility_span_test
order by member_key, start_date
)
select member_key, start_date, end_date
, max(grp) over (order by member_key, start_date) sequence
from data
here are the insert statements I am using to add data to a test table:
insert into default.eligibility_span_test values (1, '2017-01-01','2017-01-31');
insert into default.eligibility_span_test values (1, '2017-02-01', '2017-02-28');
insert into default.eligibility_span_test values (1, '2017-03-01', '2017-03-31');
insert into default.eligibility_span_test values (2, '2017-01-01', '2017-01-31');
insert into default.eligibility_span_test values (2, '2017-03-01', '2017-03-31');
Can you try the below query -
with eligibility_span_test as
(
select 1 as Member_key, from_unixtime(unix_timestamp('2017-01-01', 'yyyy-MM-dd'), 'yyyy-MM-dd') as start_date, from_unixtime(unix_timestamp('2017-01-31', 'yyyy-MM-dd'), 'yyyy-MM-dd') end_date
union
select 1 as Member_key, from_unixtime(unix_timestamp('2017-02-01', 'yyyy-MM-dd'), 'yyyy-MM-dd') as start_date, from_unixtime(unix_timestamp('2017-02-28', 'yyyy-MM-dd'), 'yyyy-MM-dd') end_date
union
select 1 as Member_key, from_unixtime(unix_timestamp('2017-03-01', 'yyyy-MM-dd'), 'yyyy-MM-dd') as start_date, from_unixtime(unix_timestamp('2017-03-31', 'yyyy-MM-dd'), 'yyyy-MM-dd') end_date
union
select 2 as Member_key, from_unixtime(unix_timestamp('2017-01-01', 'yyyy-MM-dd'), 'yyyy-MM-dd') as start_date, from_unixtime(unix_timestamp('2017-01-31', 'yyyy-MM-dd'), 'yyyy-MM-dd') end_date
union
select 2 as Member_key, from_unixtime(unix_timestamp('2017-03-01', 'yyyy-MM-dd'), 'yyyy-MM-dd') as start_date, from_unixtime(unix_timestamp('2017-03-31', 'yyyy-MM-dd'), 'yyyy-MM-dd') end_date
),
res as (select member_key, month(start_date) - row_number() over (partition by member_key order by start_date) as groupBy, start_date, end_date from eligibility_span_test)
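select member_key, min(start_date) as start_date, max(end_date) as end_date from res group by groupBy, member_key;
The query above returns one row per run of consecutive months for each member_key: consecutive spans collapse into a single row, and non-consecutive spans stay separate. Because the grouping key is month(start_date) minus the row number, it assumes one span per calendar month within a single year.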
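As a cross-check that does not depend on month arithmetic, the rule from the question (a span continues the previous one when it starts no later than the day after the previous end) can be sketched in Python. This illustrates the expected result only and is not Hive code; `collapse_contiguous` is a name made up for this example.

```python
from datetime import date, timedelta

def collapse_contiguous(rows):
    """rows: iterable of (member_key, start_date, end_date) tuples holding date values."""
    collapsed = []
    for member, start, end in sorted(rows):
        last = collapsed[-1] if collapsed else None
        # Continue the current span when this row starts by the day after the span's end.
        if last and last[0] == member and start <= last[2] + timedelta(days=1):
            last[2] = max(last[2], end)
        else:
            collapsed.append([member, start, end])
    return [tuple(r) for r in collapsed]

sample = [
    (1, date(2017, 1, 1), date(2017, 1, 31)),
    (1, date(2017, 2, 1), date(2017, 2, 28)),
    (1, date(2017, 3, 1), date(2017, 3, 31)),
    (2, date(2017, 1, 1), date(2017, 1, 31)),
    (2, date(2017, 3, 1), date(2017, 3, 31)),
]
for row in collapse_contiguous(sample):
    print(row)
```

Because it compares actual dates, the same rule keeps working when a run of spans crosses a year boundary.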

Recursively loop through a SQL table and find intervals based on Start and End Dates

I have a SQL table that contains employeeid, StartDateTime and EndDatetime as follows:
CREATE TABLE Sample
(
SNO INT,
EmployeeID NVARCHAR(10),
StartDateTime DATE,
EndDateTime DATE
)
INSERT INTO Sample
VALUES
( 1, 'xyz', '2018-01-01', '2018-01-02' ),
( 2, 'xyz', '2018-01-03', '2018-01-05' ),
( 3, 'xyz', '2018-01-06', '2018-02-01' ),
( 4, 'xyz', '2018-02-15', '2018-03-15' ),
( 5, 'xyz', '2018-03-16', '2018-03-19' ),
( 6, 'abc', '2018-01-16', '2018-02-25' ),
( 7, 'abc', '2018-03-08', '2018-03-19' ),
( 8, 'abc', '2018-02-26', '2018-03-01' )
I want the result to be displayed as
EmployeeID | StartDateTime | EndDateTime
------------+-----------------+---------------
xyz | 2018-01-01 | 2018-02-01
xyz | 2018-02-15 | 2018-03-19
abc | 2018-01-16 | 2018-03-01
abc | 2018-03-08 | 2018-03-19
Basically, I want to look recursively at the records of each employee, determine the continuity of the start and end dates, and build a set of continuous date records.
I wrote my query as follows:
SELECT *
FROM dbo.TestTable T1
LEFT JOIN dbo.TestTable t2 ON t2.EmpId = T1.EmpId
WHERE t1.EndDate = DATEADD(DAY, -1, T2.startdate)
to see if I could decipher a pattern from the output. I later realized that with this approach I would need to join the same table multiple times to get the output I desire.
Also, there can be multiple employee records, so I need direction on an efficient way of getting the desired output.
Any help is greatly appreciated.
This will do it for you. Use a recursive CTE to get all the adjacent rows, then get the highest end date for each start date, then the first start date for each end date.
;with cte as (
select EmployeeID, StartDateTime, EndDateTime
from sample s
union all
select CTE.EmployeeID, CTE.StartDateTime, s.EndDateTime
from sample s
join cte on cte.EmployeeID=s.EmployeeID and s.StartDateTime=dateadd(d,1,CTE.EndDateTime)
)
select EmployeeID, Min(StartDateTime) as StartDateTime, EndDateTime from (
select EmployeeID, StartDateTime, Max(EndDateTime) as EndDateTime from cte
group by EmployeeID, StartDateTime
) q group by EmployeeID, EndDateTime
You can use this.
WITH T AS (
SELECT S1.SNO,
S1.EmployeeID,
S1.StartDateTime,
ISNULL(S2.EndDateTime, S1.EndDateTime) EndDateTime,
ROW_NUMBER() OVER(PARTITION BY S1.EmployeeId ORDER BY S1.StartDateTime)
- ROW_NUMBER() OVER(PARTITION BY S1.EmployeeId, CASE WHEN S2.StartDateTime IS NULL THEN 0 ELSE 1 END ORDER BY S1.StartDateTime ) RN,
ROW_NUMBER() OVER(PARTITION BY S1.EmployeeId, ISNULL(S2.EndDateTime, S1.EndDateTime) ORDER BY S1.EmployeeId, S1.StartDateTime) RN_END
FROM Sample S1
LEFT JOIN Sample S2 ON DATEADD(DAY,1,S1.EndDateTime) = S2.StartDateTime
)
SELECT EmployeeID, MIN(StartDateTime) StartDateTime,MAX(EndDateTime) EndDateTime FROM T
WHERE RN_END = 1
GROUP BY EmployeeID, RN
ORDER BY EmployeeID DESC, StartDateTime
Result:
EmployeeID StartDateTime EndDateTime
---------- ------------- -----------
xyz 2018-01-01 2018-02-01
xyz 2018-02-15 2018-03-19
abc 2018-01-16 2018-03-01
abc 2018-03-08 2018-03-19