Adjust overlapping dates without CTE - sql

I am trying to adjust overlapping dates in a database table.
Sample data is below:
StartDate EndDate UNIT ID
2017-06-09 2017-06-22 1A 21
2017-06-09 2017-06-30 1B 21
2017-07-01 2017-07-31 1B 21
Expected output:
StartDate EndDate UNIT ID
2017-06-09 2017-06-22 1A 21
2017-06-22 2017-06-30 1B 21
2017-07-01 2017-07-31 1B 21
I would appreciate your help with this.

You can use LEAD/LAG on SQL Server 2012+. Since you are using 2008, you can query as below:
;With cte as (
Select *, RowN = Row_Number() over(partition by Id order by EndDate ) from #sampledata
)
Select StartDate = Coalesce (Case when Dateadd(DD, 1, c2.Enddate) = c1.Startdate then c1.Startdate Else c2.Enddate End, c1.StartDate)
,c1.Enddate, c1.Unit, C1.Id
from cte c1 left join cte c2
on c1.Id = c2.Id and c1.RowN = c2.RowN+1
If you still do not want to use a CTE as above, you can use sub-queries as below:
Select StartDate = Coalesce (Case when Dateadd(DD, 1, c2.Enddate) = c1.Startdate then c1.Startdate Else c2.Enddate End, c1.StartDate)
,c1.Enddate, c1.Unit, C1.Id
from (Select *, RowN = Row_Number() over(partition by Id order by EndDate ) from #sampledata ) c1
left join (Select *, RowN = Row_Number() over(partition by Id order by EndDate ) from #sampledata ) c2
on c1.Id = c2.Id and c1.RowN = c2.RowN+1

A small modification to @Kannan's answer:
Select StartDate = Coalesce (Case when c1.Startdate <= c2.Enddate
then c2.Enddate
Else c1.Startdate
End,
c1.StartDate)
,c1.Enddate, c1.Unit, C1.Id
from
(Select *, RowN = Row_Number() over(partition by Id order by EndDate )
from #sample ) c1
left join
(Select *, RowN = Row_Number() over(partition by Id order by EndDate )
from #sample ) c2
on c1.Id = c2.Id and c1.RowN = c2.RowN+1
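For readers who want to check the logic outside SQL Server, here is a minimal Python sketch of the same rule as the modified query: within each ID, rows are ordered by end date, and a start date that falls on or before the previous row's end date is moved up to that end date. The function name `adjust_overlaps` and the tuple layout are assumptions made for illustration; this is not T-SQL and not part of either answer.

```python
from datetime import date
from itertools import groupby

def adjust_overlaps(rows):
    """rows: list of (start, end, unit, id) tuples.

    Within each id, rows are ordered by end date (like the ROW_NUMBER()
    ordering in the queries above). A start date on or before the previous
    row's end date is replaced by that end date.
    """
    result = []
    ordered = sorted(rows, key=lambda r: (r[3], r[1]))
    for _, group in groupby(ordered, key=lambda r: r[3]):
        prev_end = None
        for start, end, unit, id_ in group:
            if prev_end is not None and start <= prev_end:
                start = prev_end  # overlap: push the start to the previous end
            result.append((start, end, unit, id_))
            prev_end = end
    return result

# Sample data from the question.
sample = [
    (date(2017, 6, 9), date(2017, 6, 22), "1A", 21),
    (date(2017, 6, 9), date(2017, 6, 30), "1B", 21),
    (date(2017, 7, 1), date(2017, 7, 31), "1B", 21),
]
adjusted = adjust_overlaps(sample)
for row in adjusted:
    print(row)
```

Only the second row changes here: its start moves from 2017-06-09 to 2017-06-22, which matches the expected output in the question.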


Search sequence pattern in SQL Server

I have records like the ones below, and I want to search in SQL Server for a sequence of values in the Value field. For example, searching for (50,54,50) should return Ids 02,03,04. Does anyone have an idea how to do this?
======================================
Id Date Value
01 2020-01-01 50
02 2020-01-02 50
03 2020-01-03 54
04 2020-01-04 50
05 2020-01-05 35
06 2020-01-06 98
07 2020-01-07 13
======================================
There is a request on the user voice site, "Add support for Row Pattern Recognition in T-SQL (SQL:2016 features R010 and R020)", which I believe would allow for this.
In the meantime, this should do what you need:
WITH T AS
(
SELECT *,
LAG(Id) OVER (ORDER BY Id) AS PrevId,
LAG(value) OVER (ORDER BY Id) AS PrevValue,
LEAD(Id) OVER (ORDER BY Id) AS NextId,
LEAD(value) OVER (ORDER BY Id) AS NextValue
FROM YourTable
)
SELECT PrevId, Id, NextId
FROM T
WHERE PrevValue = 50 AND Value =54 AND NextValue = 50
If you wanted a more flexible approach, you can use cross apply:
select t2.*
from t cross apply
(select string_agg(id, ',') within group (order by date) as ids,
string_agg(value, ',') within group (order by date) as vals
from (select top (3) t2.*
from t t2
where t2.date >= t.date
order by t2.date
) t2
) t2
where vals = '50,54,50';
Here is a db<>fiddle.
If string_agg() were supported as a window function, you could use:
select t.*
from (select t.*,
string_agg(id, ',') within group (order by date) over (order by id rows between current row and 2 following) as ids,
string_agg(value, ',') within group (order by date) over (order by id rows between current row and 2 following) as vals
from t
) t
where vals = '50,54,50';
But alas, it is not.
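Both approaches above amount to sliding a fixed-size window over the rows in order and comparing the window's values with the pattern. A minimal Python sketch of that idea follows; `find_sequence` is a hypothetical helper written for illustration, and the rows are the sample data from the question.

```python
def find_sequence(rows, pattern):
    """rows: list of (id, value) pairs already in the desired order.

    Returns one list of ids for every run of consecutive rows whose
    values equal the pattern.
    """
    n = len(pattern)
    matches = []
    for i in range(len(rows) - n + 1):
        window = rows[i:i + n]
        if [value for _, value in window] == list(pattern):
            matches.append([id_ for id_, _ in window])
    return matches

# Sample data from the question (Id, Value), ordered by Id.
rows = [("01", 50), ("02", 50), ("03", 54), ("04", 50),
        ("05", 35), ("06", 98), ("07", 13)]
matches = find_sequence(rows, (50, 54, 50))
print(matches)
```

Unlike the LAG/LEAD query, the pattern length is not hard-coded here, which is the same flexibility the cross apply version aims for.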
If I understand your requirement correctly, you can try the logic below, built with the help of LAG and LEAD:
DEMO HERE
WITH CTE
AS
(
SELECT Id,Date,
LAG(value,2) OVER(ORDER BY id) lag_2,
LAG(value,1) OVER(ORDER BY id) lag_1,
Value c_val,
LEAD(value,1) OVER(ORDER BY id) lead_1,
LEAD(value,2) OVER(ORDER BY id) lead_2
FROM your_table
)
SELECT Id,Date,
CASE
WHEN (lag_2 = 50 AND lag_1 = 54 AND c_val = 50) OR
(lag_1 = 50 AND c_val = 54 AND lead_1 = 50) OR
(c_val = 50 AND lead_1 = 54 AND lead_2 = 50)
THEN (
CASE
WHEN lead_1 = 54 THEN 02
WHEN c_val = 54 THEN 03
WHEN lag_1 = 54 THEN 04
END
)
ELSE c_val
END
FROM CTE

Create episode for each value with new Begin and End Dates

This refers to my earlier question, "Loop through each value to the seq num".
The client now wants to see the data differently, so I have started a new thread for this question.
The requirement is below.
This is the data:
ID seqNum DOS Service End Date
1 1 1/1/2017 1/15/2017
1 2 1/16/2017 1/16/2017
1 3 1/17/2017 1/21/2017
1 4 1/22/2017 2/13/2017
1 5 2/14/2017 3/21/2017
1 6 2/16/2017 3/21/2017
Expected output:
ID SeqNum DOSBeg DOSEnd
1 1 1/1/2017 1/30/2017
1 2 1/31/2017 3/1/2017
1 3 3/2/2017 3/31/2017
For each DOSBeg, add 29 days to get DOSEnd. Then add 1 day to DOSEnd to get the new DOSBeg (1/31/2017).
Now add 29 days to 1/31/2017 to get 3/1/2017, which is the next DOSEnd. Repeat this until DOSEnd >= the max End Date, i.e. 3/21/2017.
Basically, we need episodes running from DOSBeg to DOSBeg + 29 days for each ID.
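To make the rule concrete, here is a small Python sketch of the episode arithmetic for a single ID, assuming the first DOS and the max End Date are already known. The name `build_episodes` and its parameters are made up for illustration; it shows the intended result only and is not a replacement for the SQL.

```python
from datetime import date, timedelta

def build_episodes(first_dos, max_end, length_days=30):
    """Return (seq, DOSBeg, DOSEnd) tuples for one ID.

    Each episode ends 29 days after it begins (30 days inclusive), the next
    one begins the day after, and generation stops once DOSEnd >= max_end.
    """
    episodes = []
    beg = first_dos
    seq = 1
    while True:
        end = beg + timedelta(days=length_days - 1)
        episodes.append((seq, beg, end))
        if end >= max_end:
            break
        beg = end + timedelta(days=1)
        seq += 1
    return episodes

# ID 1 from the sample data: first DOS 1/1/2017, max End Date 3/21/2017.
episodes = build_episodes(date(2017, 1, 1), date(2017, 3, 21))
for seq, beg, end in episodes:
    print(seq, beg, end)
```

The loop has an explicit stop condition, which is the piece a recursive CTE also needs in its recursive member.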
I tried with this code and it is giving me duplicates.
with cte as (
select ID, minDate as DOSBeg,dateadd(day,29,mindate) as DOSEnd
from #temp
union all
select ID,dateadd(day,1,DOSEnd) as DOSBeg,dateadd(day,29,dateadd(day,1,DOSEnd)) as DOSEnd
from cte
)
select ID,DOSBeg,DOSEnd
from cte
OPTION (MAXRECURSION 0)
Here mindate is Minimum DOS for this ID i.e. 1/1/2017
I came up with the logic below, and it works fine for me. Is there a better way?
declare @table table (id int, seqNum int identity(1,1), DOS date, ServiceEndDate date)
insert into @table
values
(1,'20170101','20170115'),
(1,'20170116','20170116'),
(1,'20170117','20170121'),
(1,'20170122','20170213'),
(1,'20170214','20170321'),
(1,'20170216','20170321'),
(2,'20170101','20170103'),
(2,'20170104','20170118')
select * into #temp from @table
--drop table #data
select distinct ID, cast(min(DOS) over (partition by ID) as date) as minDate
,row_Number() over (partition by ID order by ID, DOS) as SeqNum,
DOS,
max(ServiceEndDate) over (partition by ID)as maxDate
into #data
from #temp
--drop table #StartDateLogic
;with cte as
(select ID,mindate as startdate,maxdate
from #data
union all
select ID,dateadd(day,30,startdate) as startdate,maxdate
from cte
where maxdate >= dateadd(day,30,startdate))
select distinct ID,startdate
into #StartDateLogic
from cte
OPTION (MAXRECURSION 0)
--final Result set
select ID
,ROW_NUMBER() over (Partition by ID order by ID,StartDate) as SeqNum
,StartDate
,dateadd(day,29,startdate) as EndDate
from #StartDateLogic
You were on the right track with the recursive CTE, but you forgot the anchor.
declare @table table (id int, seqNum int identity(1,1), DOS date, ServiceEndDate date)
insert into @table
values
(1,'20170101','20170115'),
(1,'20170116','20170116'),
(1,'20170117','20170121'),
(1,'20170122','20170213'),
(1,'20170214','20170321'),
(1,'20170216','20170321'),
(2,'20170101','20170103'),
(2,'20170104','20170118')
;with dates as(
select top 1 with ties id, seqnum, DOSBeg = DOS, DOSEnd = dateadd(day,29,DOS)
from @table
order by row_number() over (partition by id order by seqnum)
union all
select t.id, t.seqNum, DOSBeg = dateadd(day,1,d.DOSEnd), DOSEnd = dateadd(day,29,dateadd(day,1,d.DOSEnd))
from dates d
inner join @table t on
d.id = t.id and t.seqNum = d.seqNum + 1
)
select *
from dates d
where d.DOSEnd <= (select max(dateadd(month,1,ServiceEndDate)) from @table where id = d.id)
order by id, seqNum

First value in DATE minus 30 days SQL

I have a bunch of data from which I'm showing the ID, the max date and its corresponding values (user id, type, ...). Then I need to take the MAX date for each ID, subtract 30 days, and show the first date within this period along with its corresponding values.
Example:
ID Date Name
1 01.05.2018 AAA
1 21.04.2018 CCC
1 05.04.2018 BBB
1 28.03.2018 AAA
expected:
ID max_date max_name previous_date previous_name
1 01.05.2018 AAA 05.04.2018 BBB
I have a working solution using subselects, but as I have quite a large WHERE part, the refresh takes ages.
The subselect looks like this:
(SELECT MIN(N.name)
FROM t1 N
WHERE N.ID = T.ID
AND (N.date < MAX(T.date) AND N.date >= (MAX(T.date)-30))
AND (...)) AS PreviousName
How'd you write the select?
I'm using TSQL
Thanks
I can do this with 2 CTEs to build up the dates and names.
SQL Fiddle
MS SQL Server 2017 Schema Setup:
CREATE TABLE t1 (ID int, theDate date, theName varchar(10)) ;
INSERT INTO t1 (ID, theDate, theName)
VALUES
( 1,'2018-05-01','AAA' )
, ( 1,'2018-04-21','CCC' )
, ( 1,'2018-04-05','BBB' )
, ( 1,'2018-03-27','AAA' )
, ( 2,'2018-05-02','AAA' )
, ( 2,'2018-05-21','CCC' )
, ( 2,'2018-03-03','BBB' )
, ( 2,'2018-01-20','AAA' )
;
Main Query:
;WITH cte1 AS (
SELECT t1.ID, t1.theDate, t1.theName
, DATEADD(day,-30,t1.theDate) AS dMinus30
, ROW_NUMBER() OVER (PARTITION BY t1.ID ORDER BY t1.theDate DESC) AS rn
FROM t1
)
, cte2 AS (
SELECT c2.ID, c2.theDate, c2.theName
, ROW_NUMBER() OVER (PARTITION BY c2.ID ORDER BY c2.theDate) AS rn
, COUNT(*) OVER (PARTITION BY c2.ID) AS theCount
FROM cte1
INNER JOIN cte1 c2 ON cte1.ID = c2.ID
AND c2.theDate >= cte1.dMinus30
WHERE cte1.rn = 1
GROUP BY c2.ID, c2.theDate, c2.theName
)
SELECT cte1.ID, cte1.theDate AS max_date, cte1.theName AS max_name
, cte2.theDate AS previous_date, cte2.theName AS previous_name
, cte2.theCount
FROM cte1
INNER JOIN cte2 ON cte1.ID = cte2.ID
AND cte2.rn=1
WHERE cte1.rn = 1
Results:
| ID | max_date | max_name | previous_date | previous_name |
|----|------------|----------|---------------|---------------|
| 1 | 2018-05-01 | AAA | 2018-04-05 | BBB |
| 2 | 2018-05-21 | CCC | 2018-05-02 | AAA |
cte1 lists every row with its date minus 30 days and uses a ROW_NUMBER() window function, partitioned by ID and ordered by date descending, so that rn = 1 marks the most recent date for each ID. cte2 joins back to this list to get all dates within 30 days of cte1's max date, then numbers them in ascending order so that rn = 1 marks the earliest date in that window. The outer query joins those two results together to get the columns needed, selecting only the most recent row from cte1 and the earliest row from cte2.
I'm not sure how well it will scale with your data, but using the CTEs should optimize pretty well.
EDIT: For the additional requirement, I just added in another COUNT() window function to cte2.
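As a cross-check of what the two CTEs compute, here is a rough Python sketch of the same steps: find the max date per ID, take the cutoff 30 days earlier, then pick the earliest row on or after that cutoff. The function name `max_and_first_in_window` is hypothetical, and the rows are the ones from the schema setup above.

```python
from datetime import date, timedelta

def max_and_first_in_window(rows, window_days=30):
    """rows: list of (id, date, name) tuples.

    Returns {id: (max_date, max_name, first_date, first_name)}, where the
    "first" row is the earliest one dated on or after max_date - window_days.
    """
    by_id = {}
    for id_, d, name in rows:
        by_id.setdefault(id_, []).append((d, name))
    result = {}
    for id_, items in by_id.items():
        max_date, max_name = max(items)                     # like cte1 with rn = 1
        cutoff = max_date - timedelta(days=window_days)     # like dMinus30
        first_date, first_name = min(
            item for item in items if item[0] >= cutoff)    # like cte2 with rn = 1
        result[id_] = (max_date, max_name, first_date, first_name)
    return result

rows = [
    (1, date(2018, 5, 1), "AAA"), (1, date(2018, 4, 21), "CCC"),
    (1, date(2018, 4, 5), "BBB"), (1, date(2018, 3, 27), "AAA"),
    (2, date(2018, 5, 2), "AAA"), (2, date(2018, 5, 21), "CCC"),
    (2, date(2018, 3, 3), "BBB"), (2, date(2018, 1, 20), "AAA"),
]
summary = max_and_first_in_window(rows)
for id_, values in sorted(summary.items()):
    print(id_, *values)
```

As in the query, the max-date row itself is inside the window, so an ID with no other rows in the last 30 days would report the max date as its own "previous" date.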
I would do:
select id,
max(case when seqnum = 1 then date end) as max_date,
max(case when seqnum = 1 then name end) as max_name,
max(case when seqnum = 2 then date end) as prev_date,
max(case when seqnum = 2 then name end) as prev_name
from (select e.*, row_number() over (partition by id order by date desc) as seqnum
from example e
) e
group by id;

Collapse consecutive similar records into a single record

I have records of people from an old system that I'm trying to convert over to the new system. In the old system, a person might end up with several records for the same location. They could also go from one location to another, and then return to the previous location. Here's some example data:
PersonID | LocationID | StartDate | EndDate
1 | 1 | 1980-07-30 | 2007-07-16
1 | 1 | 2007-07-16 | 2008-01-30
1 | 2 | 2008-01-30 | 2009-03-02
1 | 2 | 2009-03-02 | 2009-11-06
1 | 3 | 2014-07-16 | 2015-01-16
1 | 1 | 2016-01-26 | 2999-12-31
I would like to collapse this data so that I get a date range for any consecutive LocationIDs. For the data above, this is what I would expect:
PersonID | LocationID | StartDate | EndDate
1 | 1 | 1980-07-30 | 2008-01-30
1 | 2 | 2008-01-30 | 2009-11-06
1 | 3 | 2014-07-16 | 2015-01-16
1 | 1 | 2016-01-26 | 2999-12-31
I'm unsure as to how to do this. I previously tried joining to the previous record, but that only works when there's two consecutive locations, not with 3 or more (there could be an undefined number of consecutive records).
select
a.PersonID,
a.LocationID,
a.StartDate,
a.EndDate,
case when a.LocationID = b.LocationID then a.PK_ID else b.PK_ID end as NewID
from employees a
left outer join employees b
on a.PersonID = b.PersonID
and a.PK_ID = b.PK_ID - 1
So, how can I write a query to get the results I need?
Note: we're treating '2999-12-31' as our 'NULL' date field
This is a classic Gaps-and-Islands (Edit- corrected for larger span 2999)
Select [PersonID]
,[LocationID]
,[StartDate] = min(D)
,[EndDate] = max(D)
From (
Select *
,Grp = Row_Number() over (Order By D) - Row_Number() over (Partition By [PersonID],[LocationID] Order By D)
from YourTable A
Cross Apply (
Select Top (DateDiff(DAY,A.[StartDate],A.[EndDate])+1) D=DateAdd(DAY,-1+Row_Number() Over (Order By (Select Null)),A.[StartDate])
From master..spt_values n1,master..spt_values n2
) B
) G
Group By [PersonID],[LocationID],Grp
Order By [PersonID],min(D)
Returns
PersonID LocationID StartDate EndDate
1 1 1980-07-30 2008-01-30
1 2 2008-01-30 2009-11-06
1 3 2014-07-16 2015-01-16
1 1 2016-01-26 2999-12-31
Using your original query
Select [PersonID]
,[LocationID]
,[StartDate] = min(D)
,[EndDate] = max(D)
From (
Select *
,Grp = Row_Number() over (Order By D) - Row_Number() over (Partition By [PersonID],[LocationID] Order By D)
From (
-- Your Original Query
select
a.PersonID,
a.LocationID,
a.StartDate,
a.EndDate,
case when a.LocationID = b.LocationID then a.PK_ID else b.PK_ID end as NewID
from employees a
left outer join employees b
on a.PersonID = b.PersonID
and a.PK_ID = b.PK_ID - 1
) A
Cross Apply (
Select Top (DateDiff(DAY,A.[StartDate],A.[EndDate])+1) D=DateAdd(DAY,-1+Row_Number() Over (Order By (Select Null)),A.[StartDate])
From master..spt_values n1,master..spt_values n2
) B
) G
Group By [PersonID],[LocationID],Grp
Order By [PersonID],min(D)
Requested Comments
Let's break it down to its components.
1) The CROSS APPLY Portion: This will expand a single record into N records. For example:
Declare @YourTable Table ([PersonID] int,[LocationID] int,[StartDate] date,[EndDate] date)
Insert Into @YourTable Values
(1,1,'1980-07-01','1980-07-03' )
,(1,1,'1980-07-02','1980-07-04' ) -- Notice the Overlap
,(1,2,'2008-01-30','2008-02-05')
Select *
from @YourTable A
Cross Apply (
Select Top (DateDiff(DAY,A.[StartDate],A.[EndDate])+1) D=DateAdd(DAY,-1+Row_Number() Over (Order By (Select Null)),A.[StartDate])
From master..spt_values n1,master..spt_values n2
) B
The above query will generate
2) The Grp Portion: Perhaps easier if I provide a simple example:
Declare @YourTable Table ([PersonID] int,[LocationID] int,[StartDate] date,[EndDate] date)
Insert Into @YourTable Values
(1,1,'1980-07-01','1980-07-03' )
,(1,1,'1980-07-02','1980-07-04' ) -- Notice the Overlap
,(1,2,'2008-01-30','2008-02-05')
Select *
,Grp = Row_Number() over (Order By D) - Row_Number() over (Partition By [PersonID],[LocationID] Order By D)
,RN1 = Row_Number() over (Order By D)
,RN2 = Row_Number() over (Partition By [PersonID],[LocationID] Order By D)
from @YourTable A
Cross Apply (
Select Top (DateDiff(DAY,A.[StartDate],A.[EndDate])+1) D=DateAdd(DAY,-1+Row_Number() Over (Order By (Select Null)),A.[StartDate])
From master..spt_values n1,master..spt_values n2
) B
The above query Generates:
RN1 and RN2 are breakouts of the GRP, just to illustrate the mechanic. Notice RN1 minus RN2 equals the GRP. Once we have the GRP, it becomes a simple matter of aggregation via a group by
3) Pulling it all Together:
Declare @YourTable Table ([PersonID] int,[LocationID] int,[StartDate] date,[EndDate] date)
Insert Into @YourTable Values
(1,1,'1980-07-01','1980-07-03' )
,(1,1,'1980-07-02','1980-07-04' ) -- Notice the Overlap
,(1,2,'2008-01-30','2008-02-05')
Select [PersonID]
,[LocationID]
,[StartDate] = min(D)
,[EndDate] = max(D)
From (
Select *
,Grp = Row_Number() over (Order By D) - Row_Number() over (Partition By [PersonID],[LocationID] Order By D)
from @YourTable A
Cross Apply (
Select Top (DateDiff(DAY,A.[StartDate],A.[EndDate])+1) D=DateAdd(DAY,-1+Row_Number() Over (Order By (Select Null)),A.[StartDate])
From master..spt_values n1,master..spt_values n2
) B
) G
Group By [PersonID],[LocationID],Grp
Order By [PersonID],min(D)
Returns
For your sample data, you can use the difference of row numbers approach:
select personid, locationid, min(startdate), max(enddate)
from (select e.*,
row_number() over (partition by personid order by startdate) as seqnum_p,
row_number() over (partition by personid, locationid order by startdate) as seqnum_pl
from employees e
) e
group by (seqnum_p - seqnum_pl), personid, locationid;
This assumes that the start and end dates are contiguous. That is, there is no gap for a given employee at the same location.
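The difference-of-row-numbers trick can be traced by hand in a few lines of Python. The sketch below mirrors seqnum_p and seqnum_pl from the query above on the question's sample data; the function name `collapse` is an assumption made for illustration, and the same contiguity caveat applies.

```python
from collections import defaultdict
from datetime import date

def collapse(rows):
    """rows: list of (person, location, start, end) tuples.

    seq_p numbers rows per person, seq_pl numbers them per person and
    location. Their difference stays constant across a run of consecutive
    rows at the same location, so it can serve as the grouping key.
    """
    ordered = sorted(rows, key=lambda r: (r[0], r[2]))
    seq_p = defaultdict(int)
    seq_pl = defaultdict(int)
    islands = {}
    for person, loc, start, end in ordered:
        seq_p[person] += 1
        seq_pl[(person, loc)] += 1
        key = (person, loc, seq_p[person] - seq_pl[(person, loc)])
        if key in islands:
            s, e = islands[key]
            islands[key] = (min(s, start), max(e, end))
        else:
            islands[key] = (start, end)
    flat = [(p, l, s, e) for (p, l, _), (s, e) in islands.items()]
    return sorted(flat, key=lambda r: (r[0], r[2]))

# Sample data from the question.
rows = [
    (1, 1, date(1980, 7, 30), date(2007, 7, 16)),
    (1, 1, date(2007, 7, 16), date(2008, 1, 30)),
    (1, 2, date(2008, 1, 30), date(2009, 3, 2)),
    (1, 2, date(2009, 3, 2), date(2009, 11, 6)),
    (1, 3, date(2014, 7, 16), date(2015, 1, 16)),
    (1, 1, date(2016, 1, 26), date(2999, 12, 31)),
]
collapsed = collapse(rows)
for row in collapsed:
    print(row)
```

For this data the differences come out as 0, 0, 2, 2, 4, 3, so the final return to location 1 lands in its own group instead of merging with the first stay.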

problem with sql server stored procedure

;with cte as
(
select FingerId, [Date],
LogTime,
row_number() over(partition by FingerId order by LogTime) as rn
from InOut
)
select C3.EmployeeName,C1.FingerId,
C1.LogTime as InTime,
C2.LogTime as OutTime,
C2.[Date]
from cte as C1
left outer join cte as C2
on C1.FingerId = C2.FingerId and
C1.rn + 1 = C2.rn INNER JOIN
EmployeeMaster as C3 ON C3.Fingerid = C2.Fingerid
where C1.rn % 2 = 1 and C3.EmployeeName = 'xyz' and C1.[Date] between '2011-07-21' and '2011-07-29' order by C2.[Date]
select * From Inout order by LogTime asc
I have an InOut table with 5 records, 3 of which are for 2011-07-21.
InOuT Table:
AutoId FingerId LogTime Date
1 22 11:18:48 AM 2011-07-29
2 22 11:19:54 AM 2011-07-29
3 22 11:20:50 AM 2011-07-21
4 22 11:21:54 AM 2011-07-21
5 22 11:21:59 AM 2011-07-21
I am getting this output from the above query:
EmployeeName FingerId InTime OutTime Date
xyz 22 11:20:50 AM 11:21:54 AM 2011-07-21
xyz 22 11:18:48 AM 11:19:54 AM 2011-07-29
I want this type of output:
EmployeeName FingerId InTime OutTime Date
xyz 22 11:20:50 AM 11:21:54 AM 2011-07-21
xyz 22 11:21:59 AM ---- 2011-07-21
xyz 22 11:18:48 AM 11:19:54 AM 2011-07-29
Here the 2nd row has an InTime, and I want its OutTime to display "---" (dashes). I'm not getting that with this query. The query is otherwise correct but needs to be modified for this.
I've made some modifications to your query (the date column in the SELECT list, the join to EmployeeMaster, and the ORDER BY):
;WITH cte AS
(
SELECT FingerId, [Date],
LogTime,
ROW_NUMBER() OVER(PARTITION BY FingerId ORDER BY LogTime) AS rn
FROM InOut
)
SELECT C3.EmployeeName,C1.FingerId,
C1.LogTime AS InTime,
C2.LogTime AS OutTime,
COALESCE(C2.[Date], C1.[Date]) AS Date
FROM cte AS C1
LEFT OUTER JOIN cte AS C2 ON C1.FingerId = C2.FingerId AND C1.rn + 1 = C2.rn
INNER JOIN EmployeeMaster AS C3 ON C3.Fingerid = C1.Fingerid
WHERE C1.rn % 2 = 1
AND C3.EmployeeName = 'xyz'
AND C1.[Date] BETWEEN '2011-07-21' AND '2011-07-29'
ORDER BY C1.[Date]
Originally I was thinking about changing C2.[Date] in the SELECT clause to C1.[Date], but then I wasn't sure it would be an adequate replacement in case the two dates were different. You can see for yourself which option is better, if any.
You'll have to use C1.[Date] instead of C2.[Date] in your WHERE clause - since you are outer joining to C2 and the row you are missing has a NULL value for C2.[Date].
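The pairing that the modified query performs can be sketched in Python as follows: order the punches by log time, treat every odd-numbered one as an "in", pair it with the next row if there is one, and fall back to the in-row's date when there is no "out". The helper name `pair_in_out`, the 24-hour time strings, and the "---" placeholder are assumptions for illustration; the query itself returns NULL for the missing OutTime.

```python
def pair_in_out(punches, missing="---"):
    """punches: list of (log_time, log_date) strings for one FingerId.

    Rows are ordered by log time, like ROW_NUMBER() ... ORDER BY LogTime.
    Odd-numbered rows are "in" punches; each is paired with the following
    row when one exists. The date falls back to the in-row's date, like
    COALESCE(C2.[Date], C1.[Date]).
    """
    ordered = sorted(punches)  # tuples sort by log_time first
    rows = []
    for i in range(0, len(ordered), 2):
        in_time, in_date = ordered[i]
        if i + 1 < len(ordered):
            out_time, out_date = ordered[i + 1]
        else:
            out_time, out_date = missing, in_date
        rows.append((in_date, in_time, out_time, out_date))
    rows.sort(key=lambda r: r[0])  # like ORDER BY C1.[Date]; the sort is stable
    return [(in_time, out_time, out_date)
            for _, in_time, out_time, out_date in rows]

# Sample InOut rows for FingerId 22 (all times are AM).
punches = [
    ("11:18:48", "2011-07-29"), ("11:19:54", "2011-07-29"),
    ("11:20:50", "2011-07-21"), ("11:21:54", "2011-07-21"),
    ("11:21:59", "2011-07-21"),
]
pairs = pair_in_out(punches)
for pair in pairs:
    print(pair)
```

Note that, as in the query, the numbering is by log time alone and ignores the date, so the pairing depends on each day contributing its punches in in/out order.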