Adding a Date column based on the next row date value - sql

Im using SQL Server 2005. From the tbl_temp table below, I would like to add an EndDate column based on the next row's StartDate minus 1 day until there's a change in AID and UID combination. This calculated EndDate will go to the row above it as the EndDate. The last row of the group of AID and UID will get the system date as its EndDate. The table has to be ordered by AID, UID, StartDate sequence. Thanks for the help.
-- tbl_temp
AID UID StartDate
1 1 2013-02-20
2 1 2013-02-06
1 1 2013-02-21
1 1 2013-02-27
1 2 2013-02-02
1 2 2013-02-04
-- Result needed
AID UID StartDate EndDate
1 1 2013-02-20 2013-02-20
1 1 2013-02-21 2013-02-26
1 1 2013-02-27 sysdate
1 2 2013-02-02 2013-02-03
1 2 2013-02-04 sysdate
2 1 2013-02-06 sysdate

The easiest way to do this is with a correlated subquery:
select t.*,
(select top 1 dateadd(day, -1, startDate )
from tbl_temp t2
where t2.aid = t.aid and
t2.uid = t.uid and
t2.startdate > t.startdate
) as endDate
from tbl_temp t
To get the current date, use isnull():
select t.*,
isnull((select top 1 dateadd(day, -1, startDate )
from tbl_temp t2
where t2.aid = t.aid and
t2.uid = t.uid and
t2.startdate > t.startdate
), getdate()
) as endDate
from tbl_temp t
Normally, I would recommend coalesce() over isnull(). However, there is a bug in some versions of SQL Server where it evaluates the first argument twice. Normally, this doesn't make a difference, but with a subquery it does.
And finally, the use of sysdate makes me think of Oracle. The same approach will work there too.

;WITH x AS
(
SELECT AID, UID, StartDate,
ROW_NUMBER() OVER(PARTITION BY AID, UID ORDER BY StartDate) AS rn
FROM tbl_temp
)
SELECT x1.AID, x1.UID, x1.StartDate,
COALESCE(DATEADD(day,-1,x2.StartDate), CAST(getdate() AS date)) AS EndDate
FROM x x1
LEFT OUTER JOIN x x2 ON x2.AID = x1.AID AND x2.UID = x1.UID
AND x2.rn = x1.rn + 1
ORDER BY x1.AID, x1.UID, x1.StartDate
SQL Fiddle example

Related

Select which has matching date or latest date record

Here are two tables.
ItemInfo
Id Description
1 First Item
2 Second Item
ItemInfoHistory
Id ItemInfoId Price StartDate EndDate
1 1 45 2020-09-01 2020-09-15
2 2 55 2020-09-26 null
3 1 50 2020-09-16 null
Here is SQL query.
SELECT i.Id, Price, StartDate, EndDate
FROM Itemsinfo i
LEFT JOIN ItemInfoHistory ih ON i.id= ih.ItemsMasterId AND CONVERT(DATE, GETDATE()) >= StartDate AND ( CONVERT(DATE, GETDATE()) <= EndDate OR EndDate IS NULL)
Which gives following results, when runs the query on 9/20
Id Price StartDate EndDate
1 50 2020-09-16 NULL
2 NULL NULL NULL
For the second item, I want to get latest record from history table, as shown below.
Id Price StartDate EndDate
1 50 2020-09-16 NULL
2 55 2020-09-26 NULL
Thanks in advance.
Probably the most efficient method is two joins. Assuming the "latest" record has a NULL values for EndDate, then:
SELECT i.Id,
COALESCE(ih.Price, ih_last.Price) as Price,
COALESCE(ih.StartDate, ih_last.StartDate) as StartDate,
COALESCE(ih.EndDate, ih_last.EndDate) as EndDate
FROM Itemsinfo i LEFT JOIN
ItemInfoHistory ih
ON i.id = ih.ItemsMasterId AND
CONVERT(DATE, GETDATE()) >= StartDate AND
(CONVERT(DATE, GETDATE()) <= EndDate OR EndDate IS NULL) LEFT JOIN
ItemInfoHistory ih_last
ON i.id = ih_last.ItemsMasterId AND
ih_last.EndDate IS NULL;
Actually, the middle join doesn't need to check for NULL, so that could be removed.

SQL - Find if column dates include at least partially a date range

I need to create a report and I am struggling with the SQL script.
The table I want to query is a company_status_history table which has entries like the following (the ones that I can't figure out)
Table company_status_history
Columns:
| id | company_id | status_id | effective_date |
Data:
| 1 | 10 | 1 | 2016-12-30 00:00:00.000 |
| 2 | 10 | 5 | 2017-02-04 00:00:00.000 |
| 3 | 11 | 5 | 2017-06-05 00:00:00.000 |
| 4 | 11 | 1 | 2018-04-30 00:00:00.000 |
I want to answer to the question "Get all companies that have been at least for some point in status 1 inside the time period 01/01/2017 - 31/12/2017"
Above are the cases that I don't know how to handle since I need to add some logic of type :
"If this row is status 1 and it's date is before the date range check the next row if it has a date inside the date range."
"If this row is status 1 and it's date is after the date range check the row before if it has a date inside the date range."
I think this can be handled as a gaps and islands problem. Consider the following input data: (same as sample data of OP plus two additional rows)
id company_id status_id effective_date
-------------------------------------------
1 10 1 2016-12-15
2 10 1 2016-12-30
3 10 5 2017-02-04
4 10 4 2017-02-08
5 11 5 2017-06-05
6 11 1 2018-04-30
You can use the following query:
SELECT t.id, t.company_id, t.status_id, t.effective_date, x.cnt
FROM company_status_history AS t
OUTER APPLY
(
SELECT COUNT(*) AS cnt
FROM company_status_history AS c
WHERE c.status_id = 1
AND c.company_id = t.company_id
AND c.effective_date < t.effective_date
) AS x
ORDER BY company_id, effective_date
to get:
id company_id status_id effective_date grp
-----------------------------------------------
1 10 1 2016-12-15 0
2 10 1 2016-12-30 1
3 10 5 2017-02-04 2
4 10 4 2017-02-08 2
5 11 5 2017-06-05 0
6 11 1 2018-04-30 0
Now you can identify status = 1 islands using:
;WITH CTE AS
(
SELECT t.id, t.company_id, t.status_id, t.effective_date, x.cnt
FROM company_status_history AS t
OUTER APPLY
(
SELECT COUNT(*) AS cnt
FROM company_status_history AS c
WHERE c.status_id = 1
AND c.company_id = t.company_id
AND c.effective_date < t.effective_date
) AS x
)
SELECT id, company_id, status_id, effective_date,
ROW_NUMBER() OVER (PARTITION BY company_id ORDER BY effective_date) -
cnt AS grp
FROM CTE
Output:
id company_id status_id effective_date grp
-----------------------------------------------
1 10 1 2016-12-15 1
2 10 1 2016-12-30 1
3 10 5 2017-02-04 1
4 10 4 2017-02-08 2
5 11 5 2017-06-05 1
6 11 1 2018-04-30 2
Calculated field grp will help us identify those islands:
;WITH CTE AS
(
SELECT t.id, t.company_id, t.status_id, t.effective_date, x.cnt
FROM company_status_history AS t
OUTER APPLY
(
SELECT COUNT(*) AS cnt
FROM company_status_history AS c
WHERE c.status_id = 1
AND c.company_id = t.company_id
AND c.effective_date < t.effective_date
) AS x
), CTE2 AS
(
SELECT id, company_id, status_id, effective_date,
ROW_NUMBER() OVER (PARTITION BY company_id ORDER BY effective_date) -
cnt AS grp
FROM CTE
)
SELECT company_id,
MIN(effective_date) AS start_date,
CASE
WHEN COUNT(*) > 1 THEN DATEADD(DAY, -1, MAX(effective_date))
ELSE MIN(effective_date)
END AS end_date
FROM CTE2
GROUP BY company_id, grp
HAVING COUNT(CASE WHEN status_id = 1 THEN 1 END) > 0
Output:
company_id start_date end_date
-----------------------------------
10 2016-12-15 2017-02-03
11 2018-04-30 2018-04-30
All you want know is those records from above that overlap with the specified interval.
Demo here with somewhat more complicated use case.
Maybe this is what you are looking for? For these kind of questions, you need to join two instance of your table, in this case I am just joining with next record by Id, which probably is not totally correct. To do it better, you can create a new Id using a windowed function like row_number, ordering the table by your requirement criteria
If this row is status 1 and it's date is before the date range check
the next row if it has a date inside the date range
declare #range_st date = '2017-01-01'
declare #range_en date = '2017-12-31'
select
case
when csh1.status_id=1 and csh1.effective_date<#range_st
then
case
when csh2.effective_date between #range_st and #range_en then true
else false
end
else NULL
end
from company_status_history csh1
left join company_status_history csh2
on csh1.id=csh2.id+1
Implementing second criteria:
"If this row is status 1 and it's date is after the date range check
the row before if it has a date inside the date range."
declare #range_st date = '2017-01-01'
declare #range_en date = '2017-12-31'
select
case
when csh1.status_id=1 and csh1.effective_date<#range_st
then
case
when csh2.effective_date between #range_st and #range_en then true
else false
end
when csh1.status_id=1 and csh1.effective_date>#range_en
then
case
when csh3.effective_date between #range_st and #range_en then true
else false
end
else null -- ¿?
end
from company_status_history csh1
left join company_status_history csh2
on csh1.id=csh2.id+1
left join company_status_history csh3
on csh1.id=csh3.id-1
I would suggest the use of a cte and the window functions ROW_NUMBER. With this you can find the desired records. An example:
DECLARE #t TABLE(
id INT
,company_id INT
,status_id INT
,effective_date DATETIME
)
INSERT INTO #t VALUES
(1, 10, 1, '2016-12-30 00:00:00.000')
,(2, 10, 5, '2017-02-04 00:00:00.000')
,(3, 11, 5, '2017-06-05 00:00:00.000')
,(4, 11, 1, '2018-04-30 00:00:00.000')
DECLARE #StartDate DATETIME = '2017-01-01';
DECLARE #EndDate DATETIME = '2017-12-31';
WITH cte AS(
SELECT *
,ROW_NUMBER() OVER (PARTITION BY company_id ORDER BY effective_date) AS rn
FROM #t
),
cteLeadLag AS(
SELECT c.*, ISNULL(c2.effective_date, c.effective_date) LagEffective, ISNULL(c3.effective_date, c.effective_date)LeadEffective
FROM cte c
LEFT JOIN cte c2 ON c2.company_id = c.company_id AND c2.rn = c.rn-1
LEFT JOIN cte c3 ON c3.company_id = c.company_id AND c3.rn = c.rn+1
)
SELECT 'Included' AS RangeStatus, *
FROM cteLeadLag
WHERE status_id = 1
AND effective_date BETWEEN #StartDate AND #EndDate
UNION ALL
SELECT 'Following' AS RangeStatus, *
FROM cteLeadLag
WHERE status_id = 1
AND effective_date > #EndDate
AND LagEffective BETWEEN #StartDate AND #EndDate
UNION ALL
SELECT 'Trailing' AS RangeStatus, *
FROM cteLeadLag
WHERE status_id = 1
AND effective_date < #EndDate
AND LeadEffective BETWEEN #StartDate AND #EndDate
I first select all records with their leading and lagging Dates and then I perform your checks on the inclusion in the desired timespan.
Try with this, self-explanatory. Responds to this part of your question:
I want to answer to the question "Get all companies that have been at
least for some point in status 1 inside the time period 01/01/2017 -
31/12/2017"
Case that you want to find those id's that have been in any moment in status 1 and have records in the period requested:
SELECT *
FROM company_status_history
WHERE id IN
( SELECT Id
FROM company_status_history
WHERE status_id=1 )
AND effective_date BETWEEN '2017-01-01' AND '2017-12-31'
Case that you want to find id's in status 1 and inside the period:
SELECT *
FROM company_status_history
WHERE status_id=1
AND effective_date BETWEEN '2017-01-01' AND '2017-12-31'

SQL Server: Count days difference between previous date and current date

I've been trying to find a way to count days difference between two dates from previous and current rows which counting only business days.
Example data and criteria here.
ID StartDate EndDate NewDate DaysDifference
========================================================================
0 04/05/2017 null
1 12/06/2017 16/06/2017 12/06/2017 29
2 03/07/2017 04/07/2017 16/06/2017 13
3 07/07/2017 10/07/2017 04/07/2017 5
4 12/07/2017 26/07/2017 10/07/2017 13
My end goal is
I want two new columns; NewDate and DayDifference.
NewDate column is from EndDate from previous row. As you can see that for example, NewDate of ID 2 is 16/06/2017 which come from EndDate of ID 1. But if value in EndDate of previous row is null, use its StartDate instead(ID 1 case).
DaysDifference column is from counting only business days between EndDate and NewDate columns.
Here is script that I am using atm.
select distinct
c.ID
,c.EndDate
,isnull(p.EndDate,c.StartDate) as NewDate
,count(distinct cast(l.CalendarDate as date)) as DaysDifference
from
(select *
from table) c
full join
(select *
from table) p
on c.level = p.level
and c.id-1 = p.id
left join Calendar l
on (cast(l.CalendarDate as date) between cast(p.EndDate as date) and cast(c.EndDate as date)
or
cast(l.CalendarDate as date) between cast(p.EndDate as date) and cast(c.StartDate as date))
and l.Day not in ('Sat','Sun') and l.Holiday <> 'Y'
where c.ID <> 0
group by
c.ID
,c.EndDate
,isnull(p.EndDate,c.StartDate)
And this's the current result :
ID EndDate NewDate DaysDifference
=========================================================
1 16/06/2017 12/06/2017 0
2 04/07/2017 16/06/2017 13
3 10/07/2017 04/07/2017 5
4 26/07/2017 10/07/2017 13
Seems like in the real data, I've got correct DaysDifference for ID 2,3,4 except ID 1 because of the null value from its previous row(ID 0) that printing StartDate instead of null EndDate, so it counts incorrectly.
Hope I've provided enough info. :)
Could you please guide me a way to count DaysDifference correctly.
Thanks in advance!
I think you can use this logic to get the previous date:
select t.*,
lag(coalesce(enddate, startdate), 1) over (order by 1) as newdate
from t;
Then for the difference:
select id, enddate, newdate,
sum(case when c.day not in ('Sat', 'Sun') and c.holiday <> 'Y' then 1 else 0 end) as diff
from (select t.*,
lag(coalesce(enddate, startdate), 1) over (order by 1) as newdate
from t
) t join
calendar c
on c.calendardate >= newdate and c.calendardate <= startdate
group by select id, enddate, newdate;

SQL Server- find all records within a certain date (not that straightforward!)

Ok. My SQL is pretty pants so I'm struggling to get my head around this.
I have a table that stores records complete with a time stamp.
What I want, is a list of uids where there are 2 or more records for that user within a time frame of 1 second of each other. Maybe I've made it more complicated in my head, just cannot figure it out.
Shortened version of table (pk ignored)
uid date
1 2015-01-01 10:00:30.020*
1 2015-01-01 10:00:30.300*
1 2015-01-01 10:00:30.500*
1 2015-01-01 10:00:39.000
1 2015-01-01 10:00:35.000
1 2015-01-01 10:00:37.800
2 2015-02-02 12:00:30.000
2 2015-02-02 14:00:30.000
2 2015-02-02 15:00:30.000
2 2015-02-02 18:00:30.000
3 2015-03-02 15:00:24.000
3 2015-03-02 15:00:20.000 *
3 2015-03-02 15:00:20.300 *
I've marked * next to the records I'd expect to match.
The results list I'd like is just a list of uid, so the result I'd want would just be
1
3
You can do this with exists:
select distinct uid
from t
where exists (select 1
from t t2
where t2.uid = t.uid and
t2.date > t.date and
t2.date <= t.date + interval 1 second
);
Note: The syntax for adding 1 second varies by database. But the above gives the idea for the logic.
In SQL Server, the syntax is:
select distinct uid
from t
where exists (select 1
from t t2
where t2.uid = t.uid and
t2.date > t.date and
t2.date <= dateadd(second, 1, t.date)
);
EDIT:
Or, in SQL Server 2012+, a faster alternative is to use lead() or lag():
select distinct uid
from (select t.*, lead(date) over (partition by uid order by date) as next_date
from t
) t
where next_date < dateadd(second, 1, date);
If you want the records, not just the uids, then you need to get both:
select t.*
from (select t.*,
lag(date) over (partition by uid order by date) as prev_date,
lead(date) over (partition by uid order by date) as next_date
from t
) t
where next_date <= dateadd(second, 1, date) or
prev_date >= dateadd(second, -1, date);

Concatenation of adjacent dates in SQL

I would like to know how to make intersections or concatenations of adjacent date ranges in sql.
I have a list of customer start and end dates, for example (in dd/mm/yyyy format, where 31/12/9999 means the customer is still a current customer).
CustID | StartDate | Enddate |
1 | 01/08/2011|19/06/2012|
1 | 20/06/2012|07/03/2012|
1 | 03/05/2012|31/12/9999|
2 | 09/03/2009|16/08/2009|
2 | 16/01/2010|10/10/2010|
2 | 11/10/2010|31/12/9999|
3 | 01/08/2010|19/08/2010|
3 | 20/08/2010|26/12/2011|
Although the dates in different rows don't overlap, I would consider some of the ranges as a contigous period of time, e.g when the start date comes one day after an end date (for a given customer). Hence I would like to return a query that returns just the intersection of the dates,
CustID | StartDate | Enddate |
1 | 01/08/2011|07/03/2012|
1 | 03/05/2012|31/12/9999|
2 | 09/03/2009|16/08/2009|
2 | 16/01/2010|31/12/9999|
3 | 01/08/2010|26/12/2011|
I've looked at CTE tables, but I can't figure out how to return just one row for one contigous block of dates.
This should work in 2005 forward:
;WITH cte2 AS (SELECT 0 AS Number
UNION ALL
SELECT Number + 1
FROM cte2
WHERE Number < 10000)
SELECT CustID, Min(GroupStart) StartDate, MAX(EndDate) EndDate
FROM (SELECT *
, DATEADD(DAY,b.number,a.StartDate) GroupStart
, DATEADD(DAY,1- DENSE_RANK() OVER (PARTITION BY CustID ORDER BY DATEADD(DAY,b.number,a.StartDate)),DATEADD(DAY,b.number,a.StartDate)) GroupDate
FROM Table1 a
JOIN cte2 b
ON b.number <= DATEDIFF(d, startdate, EndDate)
) X
GROUP BY CustID, GroupDate
ORDER BY CustID, StartDate
OPTION (MAXRECURSION 0)
Demo: SQL Fiddle
You can build a quick table of numbers 0-something large enough to cover the spread of dates in your ranges to replace the cte so it doesn't run each time, indexed properly it will run quickly.
you can do this with recursive common table expression:
with cte as (
select t.CustID, t.StartDate, t.EndDate, t2.StartDate as NextStartDate
from Table1 as t
left outer join Table1 as t2 on t2.CustID = t.CustID and t2.StartDate = case when t.EndDate < '99991231' then dateadd(dd, 1, t.EndDate) end
), cte2 as (
select c.CustID, c.StartDate, c.EndDate, c.NextStartDate
from cte as c
where c.NextStartDate is null
union all
select c.CustID, c.StartDate, c2.EndDate, c2.NextStartDate
from cte2 as c2
inner join cte as c on c.CustID = c2.CustID and c.NextStartDate = c2.StartDate
)
select CustID, min(StartDate) as StartDate, EndDate
from cte2
group by CustID, EndDate
order by CustID, StartDate
option (maxrecursion 0);
sql fiddle demo
Quick performance tests:
Results on 750 rows, small periods of 2 days length:
sql fiddle demo
My query: 300 ms
Goat CO query with CTE: 10804 ms
Goat CO query with table of fixed numbers: 7 ms
Results on 5 rows, large periods:
sql fiddle demo
My query: 1 ms
Goat CO query with CTE: 700 ms
Goat CO query with table of fixed numbers: 36 ms