Select overlapping holidays per department - sql

I'm trying to check if any of the employees at my company are requesting overlapping holidays. It's the policy that only 1 person per department is allowed to be off at once.
What query should I use? (I am a noob, so tell me if you need more information).
An Example of the table I want to use:
Request ID   Employee ID   Department ID   Start Date   End Date
----------------------------------------------------------------
1            10            1               2015-12-20   2016-12-27
2            10            1               2016-06-01   2015-06-14
3            11            1               2015-12-26   2015-12-27
4            11            1               2016-06-09   2016-06-23
5            12            2               2015-12-26   2015-12-26
6            12            2               2016-07-01   2016-07-14
Results:
Request ID   Status
---------------------------------------------------------
1            Not Approved, overlapping 26-27/12
2            Not Approved, overlapping 09-14/06
3            Not Approved, overlapping completely
4            Not Approved, overlapping 09-14/06
5            Approved, not overlapping in this department
6            Approved, not overlapping in this department
In the second phase, I want to check whether the requested holidays fall within a week that contains a bank holiday (I will have a different table with the bank holidays).
Thanks in advance!

One way is with exists:
select e.*,
(case when exists (select 1
from example e2
where e2.departmentid = e.departmentid and
e2.employeeid <> e.employeeid and
e2.startdate <= e.enddate and
e2.enddate >= e.startdate
)
then 'Overlapping'
else 'NotOverlapping'
end) as Status
from example e;
Getting your full message is trickier and depends on the database. The problem is multiple overlaps.
Actually, we can get more information without too much problem:
select e.RequestId,
(case when count(e2.RequestId) = 0
then 'Approved, not overlapping in this department'
when count(e2.RequestId) = 1
then (case when min(e2.startdate) <= e.startdate and
max(e2.enddate) >= e.enddate
then 'Not Approved, overlapping completely'
else 'Not Approved, overlapping partially'
end)
else 'Not Approved, multiple overlaps'
end) as Status
from example e left join
example e2
on e2.departmentid = e.departmentid and
e2.startdate <= e.enddate and
e2.enddate >= e.startdate and
e2.employeeid <> e.employeeid
group by e.RequestId, e.startdate, e.enddate;
Getting the actual dates is trickier, without knowing the database.
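As a rough sketch of how the actual overlap dates could be obtained: for two overlapping ranges, the overlap starts at the later of the two start dates and ends at the earlier of the two end dates. The example below runs that idea in SQLite through Python's sqlite3 module. It is an illustration under assumptions, not the answer's method: the table name example comes from the queries above, the dates are stored as ISO text, and requests 1 and 2 use the end dates 2015-12-27 and 2016-06-14, which is what the expected results in the question imply. SQLite's two-argument max() and min() stand in for GREATEST/LEAST or a CASE expression in other databases.

```python
import sqlite3

# Sample requests. Assumption: the end dates of requests 1 and 2 are corrected
# to match the expected results given in the question.
rows = [
    (1, 10, 1, "2015-12-20", "2015-12-27"),
    (2, 10, 1, "2016-06-01", "2016-06-14"),
    (3, 11, 1, "2015-12-26", "2015-12-27"),
    (4, 11, 1, "2016-06-09", "2016-06-23"),
    (5, 12, 2, "2015-12-26", "2015-12-26"),
    (6, 12, 2, "2016-07-01", "2016-07-14"),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "create table example (requestid int, employeeid int, departmentid int, "
    "startdate text, enddate text)"
)
conn.executemany("insert into example values (?, ?, ?, ?, ?)", rows)

# One output row per overlapping pair; requests without any overlap keep
# NULL overlap dates because of the left join.
sql = """
select e.requestid,
       max(e.startdate, e2.startdate) as overlap_start,  -- later start
       min(e.enddate, e2.enddate)     as overlap_end     -- earlier end
from example e
left join example e2
  on e2.departmentid = e.departmentid
 and e2.employeeid <> e.employeeid
 and e2.startdate <= e.enddate
 and e2.enddate >= e.startdate
order by e.requestid, overlap_start
"""
overlaps = conn.execute(sql).fetchall()
for row in overlaps:
    print(row)
```

A request that overlaps several others produces several rows here, so a final message per request would still need an aggregation step.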

create table #holidays(RequestID int, EmployeeID int, DepartmentID int, StartDate date, EndDate date)
insert into #holidays
select 1, 10, 1, '2015-12-20', '2016-12-27'
union select 2, 10, 1, '2016-06-01', '2015-06-14'
union select 3, 11, 1, '2015-12-26', '2015-12-27'
union select 4, 11, 1, '2016-06-09', '2016-06-23'
union select 5, 12, 2, '2015-12-26', '2015-12-26'
union select 6, 12, 2, '2016-07-01', '2016-07-14'
;with CTE
as (
select T1.RequestID,max(T2.RequestID) as T2RequestID, count(*) as cnt
from #holidays T1
join #holidays T2 on
T1.DepartmentID = T2.DepartmentID
and T1.StartDate < T2.EndDate and T1.EndDate > T2.StartDate
and T1.RequestID <> T2.RequestID
group by T1.RequestID
)
select H.*,
case
when isnull(CTE.cnt,0) = 0 then 'Approved, not overlapping in this department'
else 'Not Approved, overlapping ' + convert(nvarchar(50),H2.StartDate) + ' - ' + convert(nvarchar(50),H2.EndDate)
end as Res
from #holidays H
left join CTE on H.RequestID = CTE.RequestID
left join #holidays H2 on H2.RequestID = CTE.T2RequestID
By the way, the first request is a very long holiday, more than a year :)
Another request has StartDate > EndDate (what is that supposed to mean?)


subquery sum and newest record by different type

table employee {
id,
name
}
table payment_record {
id,
type, // 1 is salary, 2-4 is bonus
employee_id,
date_paid,
amount
}
I want to query each employee's newest salary and sum(bonus) over some date range.
For example, my payments look like this:
id   type   employee_id   date_paid    amount
---------------------------------------------
1    1      1             2022-10-01   5000
2    2      1             2022-10-01   1000
3    3      1             2022-10-01   1000
4    1      1             2022-11-01   3000
5    1      2             2022-10-01   1000
6    1      2             2022-11-01   2000
7    2      2             2022-11-01   3000
With the query dates in ['2022-10-01', '2022-11-01'], the result should show:
employee_id   employee_name   newest_salary   sum(bonus)
--------------------------------------------------------
1             Jeff            3000            2000
2             Alex            2500            3000
Jeff's newest_salary is 3000 because there are two type = 1 (salary) records, 5000 and 3000, and the newest one is 3000.
Jeff's bonus sum is 1000 (type 2) + 1000 (type 3) = 2000.
The SQL I have tried so far is:
select
e.employee_id,
employee.name,
e.newest_salary,
e.bonus
from
(
select
payment_record.employee_id,
SUM(case when type in ('2', '3', '4') then amount end) as bonus,
Max(case when type = '1' then amount end) as newest_salary
from
payment_record
where
date_paid in ('2022-10-01', '2022-11-01')
group by
employee_id
) as e
join
employee
on
employee.id = e.employee_id
order by
employee_id
It's almost done, but the rule for newest_salary is not correct: I just get the max value, although usually the max value is also the newest record.
The query is below:
SELECT
t1.id employee_id,
t1.name employee_name,
t3.amount newest_salary,
t2.bonus bonus
FROM employee t1
LEFT JOIN
(
SELECT
employee_id,
MAX(CASE WHEN type=1 THEN date_paid END) date_paid,
SUM(CASE WHEN type IN (2,3,4) THEN amount END) bonus
FROM payment_record
WHERE date_paid BETWEEN '2022-10-01' AND '2022-11-01'
GROUP BY employee_id
) t2
ON t1.id=t2.employee_id
LEFT JOIN payment_record t3
ON t3.type=1 AND
t2.employee_id=t3.employee_id AND
t2.date_paid=t3.date_paid
ORDER BY t1.id
I tested this solution in SQL Server. I think Postgres is close enough for it to work, or at least close enough that it is easy to translate.
My approach is to split the payments in the desired range into salary vs bonus, and sum the bonus but use a partitioned row number to identify the newest salary payment for each employee in the desired date range and only join that one to the bonus totals. Note that I used a LEFT JOIN because an employee might not get a bonus.
DECLARE @StartDate DATE = '2022-10-01';
DECLARE @EndDate DATE = '2022-11-01';
with cteSample as ( --BEGIN sample data
SELECT * FROM ( VALUES
(1, 1, 1, CONVERT(DATE,'2022-10-01'), 5000)
, (2, 2, 1, '2022-10-01', 1000)
, (3, 3, 1, '2022-10-01', 1000)
, (4, 1, 1, '2022-11-01', 3000)
, (5, 1, 2, '2022-10-01', 1000)
, (6, 1, 2, '2022-11-01', 2000)
, (7, 2, 2, '2022-11-01', 3000)
) as TabA(Pay_ID, Pay_Type, employee_id, date_paid, amount)
) --END Sample data
, ctePayments as ( --Filter down to just the payments in the date range you are interested in
SELECT Pay_ID, Pay_Type, employee_id, date_paid, amount
FROM cteSample --Replace this with your real table of payments
WHERE date_paid >= @StartDate AND date_paid <= @EndDate
), cteSalary as ( --Identify salary payments in range and order them newest first
SELECT employee_id, amount
, ROW_NUMBER() over (PARTITION BY employee_id ORDER BY date_paid DESC) as Newness
FROM ctePayments as S
WHERE S.Pay_Type = 1
), cteBonus as ( --Identify bonus payments in range and sum them
SELECT employee_id, SUM(amount) as BonusPaid
FROM ctePayments as S
WHERE S.Pay_Type in (2,3,4)
GROUP BY employee_id
)
SELECT S.employee_id, S.amount as SalaryNewest
, COALESCE(B.BonusPaid, 0) as BonusTotal
FROM cteSalary as S --Join the salary list to the bonus list
LEFT OUTER JOIN cteBonus as B ON S.employee_id = B.employee_id
WHERE S.Newness = 1 --Keep only the newest
Result:
employee_id   SalaryNewest   BonusTotal
---------------------------------------
1             3000           2000
2             2000           3000
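For engines without DECLARE-style variables, the same idea (rank the salary payments per employee newest first, sum the bonuses separately, then join) can be sketched with bound parameters. The example below is only an illustration run against SQLite from Python: it assumes the payment_record columns from the question, ISO text dates, and an SQLite version with window function support (3.25 or later). The CTE names in_range, salary and bonus are made up for this sketch.

```python
import sqlite3

# Sample payments from the question: (id, type, employee_id, date_paid, amount)
payments = [
    (1, 1, 1, "2022-10-01", 5000),
    (2, 2, 1, "2022-10-01", 1000),
    (3, 3, 1, "2022-10-01", 1000),
    (4, 1, 1, "2022-11-01", 3000),
    (5, 1, 2, "2022-10-01", 1000),
    (6, 1, 2, "2022-11-01", 2000),
    (7, 2, 2, "2022-11-01", 3000),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "create table payment_record (id int, type int, employee_id int, "
    "date_paid text, amount int)"
)
conn.executemany("insert into payment_record values (?, ?, ?, ?, ?)", payments)

sql = """
with in_range as (            -- payments inside the requested date range
    select * from payment_record
    where date_paid between :start and :end
), salary as (                -- salary payments, newest first per employee
    select employee_id, amount,
           row_number() over (partition by employee_id
                              order by date_paid desc) as newness
    from in_range
    where type = 1
), bonus as (                 -- bonus payments summed per employee
    select employee_id, sum(amount) as bonus_paid
    from in_range
    where type in (2, 3, 4)
    group by employee_id
)
select s.employee_id, s.amount as newest_salary,
       coalesce(b.bonus_paid, 0) as bonus_total
from salary s
left join bonus b on b.employee_id = s.employee_id
where s.newness = 1           -- keep only the newest salary row
order by s.employee_id
"""
result = conn.execute(sql, {"start": "2022-10-01", "end": "2022-11-01"}).fetchall()
print(result)
```

If two salary payments for the same employee share the same date_paid, row_number() picks one of them arbitrarily, so a tie-breaker such as the payment id may be worth adding to the ORDER BY.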

TSQL - dates overlapping - number of days

I have the following table on SQL Server:
ID   FROM         TO           OFFER NUMBER
-------------------------------------------
1    2022.01.02   9999.12.31   1
1    2022.01.02   2022.02.10   2
2    2022.01.05   2022.02.15   1
3    2022.01.02   9999.12.31   1
3    2022.01.15   2022.02.20   2
3    2022.02.03   2022.02.25   3
4    2022.01.16   2022.02.05   1
5    2022.01.17   2022.02.13   1
5    2022.02.05   2022.02.13   2
The range includes the start date but excludes the end date.
The date 9999.12.31 is given (comes from another system), but we could use the last day of the current quarter instead.
I need to find a way to determine the number of days on which the customer sees exactly one, two, or three offers. The following picture shows the method for ID 3:
The expected results should look like this (without using the last day of the quarter):
| ID | # of days when the customer sees only 1 offer | # of days when the customer sees 2 offers | # of days when the customer sees 3 offers |
| 1 | 2913863 | 39 | 0 |
| 2 | 41 | 0 | 0 |
| 3 | 2913861 | 24 | 17 |
| 4 | 20 | 0 | 0 |
| 5 | 19 | 8 | 0 |
I've found this article but it did not enlighten me.
Also, I have limited privileges; for example, I am not able to declare a variable, so I need to use "basic" TSQL.
Please provide a detailed explanation besides the code.
Thanks in advance!
The following will (for each ID) extract all distinct dates, construct non-overlapping date ranges to test, and will count up the number of offers per range. The final step is to sum and format.
The fact that the start dates are inclusive and the end dates are exclusive while sometimes non-intuitive for the human, actually works well in algorithms like this.
DECLARE @Data TABLE (Id INT, FromDate DATETIME, ToDate DATETIME, OfferNumber INT)
INSERT @Data
VALUES
(1, '2022-01-02', '9999-12-31', 1),
(1, '2022-01-02', '2022-02-10', 2),
(2, '2022-01-05', '2022-02-15', 1),
(3, '2022-01-02', '9999-12-31', 1),
(3, '2022-01-15', '2022-02-20', 2),
(3, '2022-02-03', '2022-02-25', 3),
(4, '2022-01-16', '2022-02-05', 1),
(5, '2022-01-17', '2022-02-13', 1),
(5, '2022-02-05', '2022-02-13', 2)
;
WITH Dates AS ( -- Gather distinct dates
SELECT Id, Date = FromDate FROM @Data
UNION --(distinct)
SELECT Id, Date = ToDate FROM @Data
),
Ranges AS ( --Construct non-overlapping ranges (The ToDate = NULL case will be ignored later)
SELECT ID, FromDate = Date, ToDate = LEAD(Date) OVER(PARTITION BY Id ORDER BY Date)
FROM Dates
),
Counts AS ( -- Calculate days and count offers per date range
SELECT R.Id, R.FromDate, R.ToDate,
Days = DATEDIFF(DAY, R.FromDate, R.ToDate),
Offers = COUNT(*)
FROM Ranges R
JOIN @Data D ON D.Id = R.Id
AND D.FromDate <= R.FromDate
AND D.ToDate >= R.ToDate
GROUP BY R.Id, R.FromDate, R.ToDate
)
SELECT Id
,[Days with 1 Offer] = SUM(CASE WHEN Offers = 1 THEN Days ELSE 0 END)
,[Days with 2 Offers] = SUM(CASE WHEN Offers = 2 THEN Days ELSE 0 END)
,[Days with 3 Offers] = SUM(CASE WHEN Offers = 3 THEN Days ELSE 0 END)
FROM Counts
GROUP BY Id
The WITH clause introduces Common Table Expressions (CTEs) which progressively build up intermediate results until a final select can be made.
Results:
Id   Days with 1 Offer   Days with 2 Offers   Days with 3 Offers
----------------------------------------------------------------
1    2913863             39                   0
2    41                  0                    0
3    2913861             24                   17
4    20                  0                    0
5    19                  8                    0
Alternately, the final select could use a pivot. Something like:
SELECT Id,
[Days with 1 Offer] = ISNULL([1], 0),
[Days with 2 Offers] = ISNULL([2], 0),
[Days with 3 Offers] = ISNULL([3], 0)
FROM (SELECT Id, Offers, Days FROM Counts) C
PIVOT (SUM(Days) FOR Offers IN ([1], [2], [3])) PVT
ORDER BY Id
See this db<>fiddle for a working example.
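The range-splitting idea can also be checked outside the database. The following is a minimal Python sketch of the same steps (collect the distinct boundary dates per ID, pair consecutive dates into non-overlapping ranges, count the offers that cover each range, and sum the days per count). The function name days_by_offer_count is made up for this illustration, and the offers are the sample rows from the question with half-open [from, to) ranges.

```python
from collections import defaultdict
from datetime import date

# Sample offers from the question: (id, from_date inclusive, to_date exclusive)
offers = [
    (1, date(2022, 1, 2), date(9999, 12, 31)),
    (1, date(2022, 1, 2), date(2022, 2, 10)),
    (2, date(2022, 1, 5), date(2022, 2, 15)),
    (3, date(2022, 1, 2), date(9999, 12, 31)),
    (3, date(2022, 1, 15), date(2022, 2, 20)),
    (3, date(2022, 2, 3), date(2022, 2, 25)),
    (4, date(2022, 1, 16), date(2022, 2, 5)),
    (5, date(2022, 1, 17), date(2022, 2, 13)),
    (5, date(2022, 2, 5), date(2022, 2, 13)),
]


def days_by_offer_count(offers):
    """Return {id: {number_of_offers: total_days}} for half-open ranges."""
    by_id = defaultdict(list)
    for id_, start, end in offers:
        by_id[id_].append((start, end))

    result = {}
    for id_, ranges in by_id.items():
        # Distinct boundary dates, sorted (the "Dates" step)
        points = sorted({d for r in ranges for d in r})
        totals = defaultdict(int)
        # Consecutive boundary dates form non-overlapping ranges (the "Ranges" step)
        for lo, hi in zip(points, points[1:]):
            # An offer covers the range if it starts at or before lo and ends at or after hi
            covering = sum(1 for s, e in ranges if s <= lo and e >= hi)
            if covering:
                totals[covering] += (hi - lo).days
        result[id_] = dict(totals)
    return result


counts = days_by_offer_count(offers)
for id_, per_count in sorted(counts.items()):
    print(id_, per_count)
```

Because the end dates are exclusive, the day difference between two consecutive boundary dates is exactly the number of days in that range, with no plus-one adjustment.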
Find all date points for each ID. For each date point, find the number of overlapping offers.
Refer to the comments within the query.
with
dates as
(
-- get all date points
select ID, theDate = FromDate from offers
union -- union to exclude any duplicate
select ID, theDate = ToDate from offers
),
cte as
(
select ID = d.ID,
Date_Start = d.theDate,
Date_End = LEAD(d.theDate) OVER (PARTITION BY ID ORDER BY theDate),
TheCount = c.cnt
from dates d
cross apply
(
-- Count no of overlapping
select cnt = count(*)
from offers x
where x.ID = d.ID
and x.FromDate <= d.theDate
and x.ToDate > d.theDate
) c
)
select ID, TheCount, days = sum(datediff(day, Date_Start, Date_End))
from cte
where Date_End is not null
group by ID, TheCount
order by ID, TheCount
Result :
ID   TheCount   days
-----------------------
1    1          2913863
1    2          39
2    1          41
3    1          2913861
3    2          29
3    3          12
4    1          20
5    1          19
5    2          8
To get the required format, use PIVOT.

SQL - Find if column dates include at least partially a date range

I need to create a report and I am struggling with the SQL script.
The table I want to query is a company_status_history table which has entries like the following (the ones that I can't figure out)
Table company_status_history
Columns:
| id | company_id | status_id | effective_date |
Data:
| 1 | 10 | 1 | 2016-12-30 00:00:00.000 |
| 2 | 10 | 5 | 2017-02-04 00:00:00.000 |
| 3 | 11 | 5 | 2017-06-05 00:00:00.000 |
| 4 | 11 | 1 | 2018-04-30 00:00:00.000 |
I want to answer the question "Get all companies that have been, at least at some point, in status 1 inside the time period 01/01/2017 - 31/12/2017".
The rows above are the cases that I don't know how to handle, since I need to add some logic of this type:
"If this row is status 1 and its date is before the date range, check whether the next row has a date inside the date range."
"If this row is status 1 and its date is after the date range, check whether the row before has a date inside the date range."
I think this can be handled as a gaps and islands problem. Consider the following input data: (same as sample data of OP plus two additional rows)
id company_id status_id effective_date
-------------------------------------------
1 10 1 2016-12-15
2 10 1 2016-12-30
3 10 5 2017-02-04
4 10 4 2017-02-08
5 11 5 2017-06-05
6 11 1 2018-04-30
You can use the following query:
SELECT t.id, t.company_id, t.status_id, t.effective_date, x.cnt
FROM company_status_history AS t
OUTER APPLY
(
SELECT COUNT(*) AS cnt
FROM company_status_history AS c
WHERE c.status_id = 1
AND c.company_id = t.company_id
AND c.effective_date < t.effective_date
) AS x
ORDER BY company_id, effective_date
to get:
id company_id status_id effective_date cnt
-----------------------------------------------
1 10 1 2016-12-15 0
2 10 1 2016-12-30 1
3 10 5 2017-02-04 2
4 10 4 2017-02-08 2
5 11 5 2017-06-05 0
6 11 1 2018-04-30 0
Now you can identify status = 1 islands using:
;WITH CTE AS
(
SELECT t.id, t.company_id, t.status_id, t.effective_date, x.cnt
FROM company_status_history AS t
OUTER APPLY
(
SELECT COUNT(*) AS cnt
FROM company_status_history AS c
WHERE c.status_id = 1
AND c.company_id = t.company_id
AND c.effective_date < t.effective_date
) AS x
)
SELECT id, company_id, status_id, effective_date,
ROW_NUMBER() OVER (PARTITION BY company_id ORDER BY effective_date) -
cnt AS grp
FROM CTE
Output:
id company_id status_id effective_date grp
-----------------------------------------------
1 10 1 2016-12-15 1
2 10 1 2016-12-30 1
3 10 5 2017-02-04 1
4 10 4 2017-02-08 2
5 11 5 2017-06-05 1
6 11 1 2018-04-30 2
Calculated field grp will help us identify those islands:
;WITH CTE AS
(
SELECT t.id, t.company_id, t.status_id, t.effective_date, x.cnt
FROM company_status_history AS t
OUTER APPLY
(
SELECT COUNT(*) AS cnt
FROM company_status_history AS c
WHERE c.status_id = 1
AND c.company_id = t.company_id
AND c.effective_date < t.effective_date
) AS x
), CTE2 AS
(
SELECT id, company_id, status_id, effective_date,
ROW_NUMBER() OVER (PARTITION BY company_id ORDER BY effective_date) -
cnt AS grp
FROM CTE
)
SELECT company_id,
MIN(effective_date) AS start_date,
CASE
WHEN COUNT(*) > 1 THEN DATEADD(DAY, -1, MAX(effective_date))
ELSE MIN(effective_date)
END AS end_date
FROM CTE2
GROUP BY company_id, grp
HAVING COUNT(CASE WHEN status_id = 1 THEN 1 END) > 0
Output:
company_id start_date end_date
-----------------------------------
10 2016-12-15 2017-02-03
11 2018-04-30 2018-04-30
All you need now are the records from above that overlap the specified interval.
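That last filtering step is a plain interval-overlap test: an island overlaps the reporting period when it starts on or before the period's end and ends on or after the period's start. A minimal Python sketch, using the two islands from the output above and the period from the question (the variable names are made up for illustration):

```python
from datetime import date

# Status-1 islands from the output above: (company_id, start_date, end_date)
islands = [
    (10, date(2016, 12, 15), date(2017, 2, 3)),
    (11, date(2018, 4, 30), date(2018, 4, 30)),
]

# Reporting period from the question
range_start = date(2017, 1, 1)
range_end = date(2017, 12, 31)

# Two closed intervals overlap when each one starts before the other ends
matching = sorted(
    {
        company_id
        for company_id, start, end in islands
        if start <= range_end and end >= range_start
    }
)
print(matching)
```

In SQL the same condition would go into a WHERE clause over the final result, for example by wrapping the last query in another CTE.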
Maybe this is what you are looking for? For this kind of question, you need to join two instances of your table. In this case I am just joining with the next record by id, which is probably not entirely correct. To do it better, you can create a new id using a window function such as row_number, ordering the table by your criteria.
"If this row is status 1 and its date is before the date range, check
whether the next row has a date inside the date range."
declare @range_st date = '2017-01-01'
declare @range_en date = '2017-12-31'
select
case
when csh1.status_id=1 and csh1.effective_date<@range_st
then
case
when csh2.effective_date between @range_st and @range_en then 1 -- T-SQL has no true/false literals
else 0
end
else NULL
end as next_in_range
from company_status_history csh1
left join company_status_history csh2
on csh2.id=csh1.id+1 -- csh2 is the next record by id
Implementing the second criterion:
"If this row is status 1 and its date is after the date range, check
whether the row before has a date inside the date range."
declare @range_st date = '2017-01-01'
declare @range_en date = '2017-12-31'
select
case
when csh1.status_id=1 and csh1.effective_date<@range_st
then
case
when csh2.effective_date between @range_st and @range_en then 1 -- T-SQL has no true/false literals
else 0
end
when csh1.status_id=1 and csh1.effective_date>@range_en
then
case
when csh3.effective_date between @range_st and @range_en then 1
else 0
end
else null -- ¿?
end as neighbour_in_range
from company_status_history csh1
left join company_status_history csh2
on csh2.id=csh1.id+1 -- csh2 is the next record by id
left join company_status_history csh3
on csh3.id=csh1.id-1 -- csh3 is the previous record by id
I would suggest using a CTE and the window function ROW_NUMBER. With these you can find the desired records. An example:
DECLARE @t TABLE(
id INT
,company_id INT
,status_id INT
,effective_date DATETIME
)
INSERT INTO @t VALUES
(1, 10, 1, '2016-12-30 00:00:00.000')
,(2, 10, 5, '2017-02-04 00:00:00.000')
,(3, 11, 5, '2017-06-05 00:00:00.000')
,(4, 11, 1, '2018-04-30 00:00:00.000')
DECLARE @StartDate DATETIME = '2017-01-01';
DECLARE @EndDate DATETIME = '2017-12-31';
WITH cte AS(
SELECT *
,ROW_NUMBER() OVER (PARTITION BY company_id ORDER BY effective_date) AS rn
FROM @t
),
cteLeadLag AS(
SELECT c.*, ISNULL(c2.effective_date, c.effective_date) LagEffective, ISNULL(c3.effective_date, c.effective_date)LeadEffective
FROM cte c
LEFT JOIN cte c2 ON c2.company_id = c.company_id AND c2.rn = c.rn-1
LEFT JOIN cte c3 ON c3.company_id = c.company_id AND c3.rn = c.rn+1
)
SELECT 'Included' AS RangeStatus, *
FROM cteLeadLag
WHERE status_id = 1
AND effective_date BETWEEN @StartDate AND @EndDate
UNION ALL
SELECT 'Following' AS RangeStatus, *
FROM cteLeadLag
WHERE status_id = 1
AND effective_date > @EndDate
AND LagEffective BETWEEN @StartDate AND @EndDate
UNION ALL
SELECT 'Trailing' AS RangeStatus, *
FROM cteLeadLag
WHERE status_id = 1
AND effective_date < @StartDate -- before the range, so 'Included' rows are not repeated
AND LeadEffective BETWEEN @StartDate AND @EndDate
I first select all records with their leading and lagging Dates and then I perform your checks on the inclusion in the desired timespan.
Try this; it should be self-explanatory. It responds to this part of your question:
I want to answer to the question "Get all companies that have been at
least for some point in status 1 inside the time period 01/01/2017 -
31/12/2017"
Case where you want the companies that have been in status 1 at any moment and have records in the requested period:
SELECT *
FROM company_status_history
WHERE company_id IN
( SELECT company_id
FROM company_status_history
WHERE status_id=1 )
AND effective_date BETWEEN '2017-01-01' AND '2017-12-31'
Case where you want the rows that are in status 1 and inside the period:
SELECT *
FROM company_status_history
WHERE status_id=1
AND effective_date BETWEEN '2017-01-01' AND '2017-12-31'

SQL Join two tables by unrelated date

I’m looking to join two tables that do not have a common data point, but common value (date). I want a table that lists the date and total number of hired/terminated employees on that day. Example is below:
Table 1
Hire Date   Employee Number   Employee Name
-------------------------------------------
5/5/2018    10078             Joe
5/5/2018    10077             Adam
5/5/2018    10078             Steve
5/8/2018    10079             Jane
5/8/2018    10080             Mary
Table 2
Termination Date   Employee Number   Employee Name
--------------------------------------------------
5/5/2018           10010             Tony
5/6/2018           10025             Jonathan
5/6/2018           10035             Mark
5/8/2018           10052             Chris
5/9/2018           10037             Sam
Desired result:
Date       Total Hired   Total Terminated
-----------------------------------------
5/5/2018   3             1
5/6/2018   0             2
5/7/2018   0             0
5/8/2018   2             1
5/9/2018   0             1
Getting the total count is easy; I'm just unsure of the best approach for "adding" a date column.
If you need all dates within some window then you need to join the data to a calendar. You can then left join and sum flags for data points.
DECLARE @StartDate DATETIME = (SELECT MIN(ActionDate) FROM(SELECT ActionDate = MIN(HireDate) FROM Table1 UNION SELECT ActionDate = MIN(TerminationDate) FROM Table2)AS X)
DECLARE @EndDate DATETIME = (SELECT MAX(ActionDate) FROM(SELECT ActionDate = MAX(HireDate) FROM Table1 UNION SELECT ActionDate = MAX(TerminationDate) FROM Table2)AS X)
;WITH AllDates AS
(
SELECT CalendarDate=@StartDate
UNION ALL
SELECT DATEADD(DAY, 1, CalendarDate)
FROM AllDates
WHERE DATEADD(DAY, 1, CalendarDate) <= @EndDate
)
SELECT
D.CalendarDate,
TotalHired = ISNULL(H.TotalHired, 0),
TotalTerminated = ISNULL(T.TotalTerminated, 0)
FROM
AllDates D
-- Aggregate each table per day before joining; joining the raw rows would pair every
-- hire with every termination on the same day and inflate both counts
LEFT OUTER JOIN (SELECT HireDate, TotalHired = COUNT(*) FROM Table1 GROUP BY HireDate) H ON H.HireDate = D.CalendarDate
LEFT OUTER JOIN (SELECT TerminationDate, TotalTerminated = COUNT(*) FROM Table2 GROUP BY TerminationDate) T ON T.TerminationDate = D.CalendarDate
/* If you only want dates with data points then uncomment the where clause
WHERE
NOT (H.HireDate IS NULL AND T.TerminationDate IS NULL)
*/
ORDER BY
D.CalendarDate
-- For date ranges longer than 100 days, append OPTION (MAXRECURSION 0) to lift the default recursion limit
I would do this with a union all and aggregations:
select dte, sum(is_hired) as num_hired, sum(is_termed) as num_termed
from (select hiredate as dte, 1 as is_hired, 0 as is_termed from table1
union all
select terminationdate, 0 as is_hired, 1 as is_termed from table2
) ht
group by dte
order by dte;
This does not include the "missing" dates. If you want those, a calendar or recursive CTE works. For instance:
with ht as (
select dte, sum(is_hired) as num_hired, sum(is_termed) as num_termed
from (select hiredate as dte, 1 as is_hired, 0 as is_termed from table1
union all
select terminationdate, 0 as is_hired, 1 as is_termed from table2
) ht
group by dte
),
d as (
select min(dte) as dte, max(dte) as max_dte
from ht
union all
select dateadd(day, 1, dte), max_dte
from d
where dte < max_dte
)
select d.dte, coalesce(ht.num_hired, 0) as num_hired, coalesce(ht.num_termed, 0) as num_termed
from d left join
ht
on d.dte = ht.dte
order by dte;
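To try the union all plus recursive calendar approach without a SQL Server instance, here is a small sketch that runs the same shape of query in SQLite through Python's sqlite3 module. Several things are assumptions made for the illustration: the column names hiredate and terminationdate, the dates rewritten as ISO text (2018-05-05 and so on), and SQLite's date(x, '+1 day') in place of dateadd.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    create table table1 (hiredate text, employeenumber int, employeename text);
    create table table2 (terminationdate text, employeenumber int, employeename text);
    insert into table1 values
        ('2018-05-05', 10078, 'Joe'), ('2018-05-05', 10077, 'Adam'),
        ('2018-05-05', 10078, 'Steve'), ('2018-05-08', 10079, 'Jane'),
        ('2018-05-08', 10080, 'Mary');
    insert into table2 values
        ('2018-05-05', 10010, 'Tony'), ('2018-05-06', 10025, 'Jonathan'),
        ('2018-05-06', 10035, 'Mark'), ('2018-05-08', 10052, 'Chris'),
        ('2018-05-09', 10037, 'Sam');
    """
)

sql = """
with recursive
ht as (   -- one row per date that has at least one hire or termination
    select dte, sum(is_hired) as num_hired, sum(is_termed) as num_termed
    from (select hiredate as dte, 1 as is_hired, 0 as is_termed from table1
          union all
          select terminationdate, 0, 1 from table2)
    group by dte
),
d(dte, max_dte) as (   -- calendar: every day from the first to the last date
    select min(dte), max(dte) from ht
    union all
    select date(dte, '+1 day'), max_dte from d where dte < max_dte
)
select d.dte,
       coalesce(ht.num_hired, 0)  as num_hired,
       coalesce(ht.num_termed, 0) as num_termed
from d
left join ht on ht.dte = d.dte
order by d.dte
"""
totals = conn.execute(sql).fetchall()
for row in totals:
    print(row)
```

Because the hires and terminations are stacked with union all and aggregated before the calendar join, a day with several events of both kinds is counted once per event rather than once per pair.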
Try this one
SELECT ISNULL(a.THE_DATE, b.THE_DATE) as Date,
ISNULL(a.Total_Hire,0) as Total_Hire,
ISNULL (b.Total_Terminate,0) as Total_terminate
FROM (SELECT Hire_date as the_date, COUNT(1) as Total_Hire
FROM TABLE_HIRE GROUP BY HIRE_DATE) a
FULL OUTER JOIN (SELECT Termination_Date as the_date, COUNT(1) as Total_Terminate
FROM TABLE_TERMINATE GROUP BY Termination_Date) b
ON a.the_date = b.the_date

How to find contiguous dates in numerous rows in SQL Server

We have a table with service provisions for people. For example:
id   people_id   dateStart   dateEnd
------------------------------------
1    1           28.07.14    19.07.16
2    2           14.04.15    16.02.16
3    2           16.02.16    18.04.16
4    2           18.04.16    27.06.16
5    2           27.06.16    19.07.16
6    2           19.07.16    NULL
7    3           24.02.12    17.06.12
8    3           23.07.12    19.09.12
9    3           18.08.14    NULL
10   4           28.06.15    NULL
11   5           19.01.16    NULL
I need to extract distinct people_id's (clients) with the real start date of an unfinished, uninterrupted service that has lasted more than a year, and then count the days that have passed. The 'start date' and 'end date' of two different rows must be the same for them to count as contiguous. One client can only have one unfinished service.
So the perfect result for the table above would be:
people_id   dateStart   lasts(days)
-----------------------------------
2           14.04.15    472
3           18.08.14    711
4           28.06.15    397
I didn't have a problem with a single service:
SELECT
--some other columns from PEOPLE,
p.PEOPLE_ID,
s.DATESTART,
DATEDIFF(DAY, s.DATESTART, GETDATE()) as lasts
FROM
PEOPLE p
INNER JOIN service s on s.ID =
(
SELECT TOP 1 s2.ID
FROM service s2
WHERE s2.PEOPLE_ID = p.PEOPLE_ID
AND s2.DATESTART IS NOT NULL
AND s2.DATEEND IS NULL
ORDER BY s2.DATESTART DESC
)
WHERE
DATEDIFF(DAY, s.DATESTART , GETDATE()) >= 365
But I can't figure out how to determine contiguous services.
You can find where periods of "continuous" service begin by using lag(). Then a cumulative sum of this flag provides a group, which can be used for aggregation:
select people_id, min(datestart) as datestart,
(case when count(dateend) = count(*) then max(dateend) end) as dateend
from (select t.*,
sum(case when prev_dateend = datestart then 0 else 1 end) over
(partition by people_id order by datestart) as grp
from (select t.*,
lag(dateend) over (partition by people_id order by datestart) as prev_dateend
from t
) t
) t
group by people_id, grp
having count(*) > count(dateend);
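The lag() plus cumulative sum technique can be tried on the sample data with a small sketch in SQLite via Python's sqlite3 module (window functions need SQLite 3.25 or later). The assumptions here are labeled rather than taken from the question: the table is called service, the dates are stored as ISO text, and the "today" used for the day count is fixed at 2016-07-29, which is the date the expected results in the question imply. The day count and the one-year filter are added on top of the query above for illustration.

```python
import sqlite3

# Sample rows from the question: (id, people_id, datestart, dateend)
services = [
    (1, 1, "2014-07-28", "2016-07-19"),
    (2, 2, "2015-04-14", "2016-02-16"),
    (3, 2, "2016-02-16", "2016-04-18"),
    (4, 2, "2016-04-18", "2016-06-27"),
    (5, 2, "2016-06-27", "2016-07-19"),
    (6, 2, "2016-07-19", None),
    (7, 3, "2012-02-24", "2012-06-17"),
    (8, 3, "2012-07-23", "2012-09-19"),
    (9, 3, "2014-08-18", None),
    (10, 4, "2015-06-28", None),
    (11, 5, "2016-01-19", None),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "create table service (id int, people_id int, datestart text, dateend text)"
)
conn.executemany("insert into service values (?, ?, ?, ?)", services)

sql = """
with flagged as (   -- previous row's end date for the same person
    select s.*,
           lag(dateend) over (partition by people_id order by datestart) as prev_dateend
    from service s
), grouped as (     -- running count of "new period starts" gives a group id
    select f.*,
           sum(case when prev_dateend = datestart then 0 else 1 end)
               over (partition by people_id order by datestart) as grp
    from flagged f
)
select people_id,
       min(datestart) as datestart,
       cast(julianday(:as_of) - julianday(min(datestart)) as integer) as lasts
from grouped
group by people_id, grp
having count(*) > count(dateend)   -- the group contains an unfinished service
   and julianday(:as_of) - julianday(min(datestart)) >= 365
order by people_id
"""
ongoing = conn.execute(sql, {"as_of": "2016-07-29"}).fetchall()
for row in ongoing:
    print(row)
```

The comparison prev_dateend = datestart is NULL for the first row of each person, so that row falls into the else branch and starts a new group.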
Try this query:
select PeopleId, min(dateStart) as dateStart, sum(diff) as [lasts(days)] from
(
select P.*, datediff(day,datestart, DateEnd) as diff from
(select peopleId, dateStart,
isnull(dateend, cast(getdate() as date)) as DateEnd
from People
) P
where Dateend in
(select DateStart from People
where PeopleId = P.PeopleId)
or DateEnd = cast(getdate() as date ) -- check for continuous dates
) P1 group by PeopleId having sum(diff)> 365 --check for > one year
The comments in the query should explain things