SQL Gaps/Islands Question - Determine if someone has worked for X years without a Y days break

I'm working on a problem for a company in Japan. The government has some rules for people on a work visa, such as:
You cannot work for more than 3 years at a company without taking 30 days off.
You cannot work for the same staffing company for more than 5 years without taking 6 months off.
So we want to figure out whether anyone will be violating either rule in the next 30/60/90 days.
Sample data (list of contracts):
if object_id('tempdb..#sampleDates') is not null drop table #sampleDates
create table #sampleDates (UserId int, CompanyID int, WorkPeriodStart datetime, WorkPeriodEnd datetime)
insert #sampleDates (UserId, CompanyID, WorkPeriodStart, WorkPeriodEnd) values (27809, 972, '2019-10-10', '2020-10-10')
insert #sampleDates (UserId, CompanyID, WorkPeriodStart, WorkPeriodEnd) values (27853, 484, '2019-10-10', '2020-10-10')
insert #sampleDates (UserId, CompanyID, WorkPeriodStart, WorkPeriodEnd) values (27856, 172, '2019-10-10', '2020-10-10')
insert #sampleDates (UserId, CompanyID, WorkPeriodStart, WorkPeriodEnd) values (27857, 1234, '2015-01-01', '2015-12-31')
insert #sampleDates (UserId, CompanyID, WorkPeriodStart, WorkPeriodEnd) values (27857, 1234, '2016-01-01', '2017-02-28')
insert #sampleDates (UserId, CompanyID, WorkPeriodStart, WorkPeriodEnd) values (27857, 1234, '2017-01-01', '2017-12-31')
insert #sampleDates (UserId, CompanyID, WorkPeriodStart, WorkPeriodEnd) values (27857, 1234, '2018-01-01', '2018-12-31')
insert #sampleDates (UserId, CompanyID, WorkPeriodStart, WorkPeriodEnd) values (27857, 1234, '2019-01-01', '2020-01-31')
insert #sampleDates (UserId, CompanyID, WorkPeriodStart, WorkPeriodEnd) values (27857, 1234, '2020-01-01', '2020-12-31')
insert #sampleDates (UserId, CompanyID, WorkPeriodStart, WorkPeriodEnd) values (27897, 179, '2019-10-10', '2020-10-10')
My first issue is possibly overlapping dates. I am close to a solution on that already, but until I know how to solve the Working X years/ Y Days off issue, I'm not sure what the output of my cte or temp table should look like.
I don't expect anyone to do the work for me, but I want to find an article that can tell me:
How can I determine if someone has taken any breaks in the time period, and for how long (gaps between date ranges)?
How can I figure out whether they will have worked for 3/5 years without a 30/180 day break in the next 30/60/90 days?
This seemed so simple until I started coding the procedure.
Thanks for any help in advance.
EDIT:
For what it's worth, here's my second working attempt at eliminating overlapping dates (first version used a dense_rank approach and it worked until I screwed something up, went with something simple):
;with CJ as (
select UserId, CompanyID, WorkPeriodStart, WorkPeriodEnd from #sampleDates c
)
select
c.CompanyID,
c.WorkPeriodStart,
min(t1.WorkPeriodEnd) as EndDate
from CJ c
inner join CJ t1 on c.WorkPeriodStart <= t1.WorkPeriodEnd and c.UserId = t1.UserId and c.CompanyID = t1.CompanyID
and not exists(select * from CJ t2 where t1.UserId = t2.UserId and t1.CompanyID = t2.CompanyID and t1.WorkPeriodEnd >= t2.WorkPeriodStart AND t1.WorkPeriodEnd < t2.WorkPeriodEnd)
where not exists(select * from CJ c2 where c.UserId = c2.UserId and c.CompanyID = c2.CompanyID and c.WorkPeriodStart > c2.WorkPeriodStart AND c.WorkPeriodStart <= c2.WorkPeriodEnd)
group by c.UserId, c.CompanyID, c.WorkPeriodStart
order by c.UserId, c.WorkPeriodStart

Disclaimer: This is an incomplete answer.
I can continue later, but this shows how to compute the islands. Identifying the offending ones from there shouldn't be that complicated.
See the augmented example. I added rows for user 27897, who now has three islands: 0, 1, and 2. See below:
create table t (UserId int, CompanyID int, WorkPeriodStart date, WorkPeriodEnd date);
insert t (UserId, CompanyID, WorkPeriodStart, WorkPeriodEnd) values
(27809, 972, '2019-10-10', '2020-10-10'),
(27853, 484, '2019-10-10', '2020-10-10'),
(27856, 172, '2019-10-10', '2020-10-10'),
(27857, 1234, '2015-01-01', '2015-12-31'),
(27857, 1234, '2016-01-01', '2017-02-28'),
(27857, 1234, '2017-01-01', '2017-12-31'),
(27857, 1234, '2018-01-01', '2018-12-31'),
(27857, 1234, '2019-01-01', '2020-01-31'),
(27857, 1234, '2020-01-01', '2020-12-31'),
(27897, 179, '2015-05-28', '2015-09-30'),
(27897, 179, '2017-03-11', '2017-04-30'),
(27897, 188, '2017-02-20', '2017-07-07'),
(27897, 179, '2019-10-10', '2020-10-10');
With this data, the query that computes the island for each row can look like:
select *,
sum(hop) over(partition by UserId order by WorkPeriodStart) as island
from (
select *,
case when WorkPeriodStart > dateadd(day, 1, max(WorkPeriodEnd)
over(partition by UserId
order by WorkPeriodStart
rows between unbounded preceding and 1 preceding))
then 1 else 0 end as hop
from t
) x
order by UserId, WorkPeriodStart
Result:
UserId CompanyID WorkPeriodStart WorkPeriodEnd hop island
------ --------- --------------- ------------- --- ------
27809 972 2019-10-10 2020-10-10 0 0
27853 484 2019-10-10 2020-10-10 0 0
27856 172 2019-10-10 2020-10-10 0 0
27857 1234 2015-01-01 2015-12-31 0 0
27857 1234 2016-01-01 2017-02-28 0 0
27857 1234 2017-01-01 2017-12-31 0 0
27857 1234 2018-01-01 2018-12-31 0 0
27857 1234 2019-01-01 2020-01-31 0 0
27857 1234 2020-01-01 2020-12-31 0 0
27897 179 2015-05-28 2015-09-30 0 0
27897 188 2017-02-20 2017-07-07 1 1
27897 179 2017-03-11 2017-04-30 0 1
27897 179 2019-10-10 2020-10-10 1 2
Now, we can augment this query to get the "worked days" for each island, and the "days off" before each island, by doing:
select *,
datediff(day, s, e) + 1 as worked,
datediff(day, lag(e) over(partition by UserId order by island), s) as prev_days_off
from (
select UserId, island, min(WorkPeriodStart) as s, max(WorkPeriodEnd) as e
from (
select *,
sum(hop) over(partition by UserId order by WorkPeriodStart) as island
from (
select *,
case when WorkPeriodStart > dateadd(day, 1, max(WorkPeriodEnd)
over(partition by UserId
order by WorkPeriodStart
rows between unbounded preceding and 1 preceding))
then 1 else 0 end as hop
from t
) x
) y
group by UserId, island
) x
order by UserId, island
Result:
UserId island s e worked prev_days_off
------ ------ ---------- ---------- ------ -------------
27809 0 2019-10-10 2020-10-10 367 <null>
27853 0 2019-10-10 2020-10-10 367 <null>
27856 0 2019-10-10 2020-10-10 367 <null>
27857 0 2015-01-01 2020-12-31 2192 <null>
27897 0 2015-05-28 2015-09-30 126 <null>
27897 1 2017-02-20 2017-07-07 138 509
27897 2 2019-10-10 2020-10-10 367 825
This result is much closer to what you need. The worked and prev_days_off columns are what you would filter on to apply your criteria.
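One way to take this further, as a rough sketch rather than a finished answer: treat a gap as a real break only when it is at least @BreakDays long, so shorter gaps stay inside the same stretch, then flag stretches whose 3-year mark falls inside the look-ahead window. The variable names, the partition by CompanyID, and the reading of "30 days off" as 30 full days between contracts are all assumptions on my part. It needs SQL Server 2012 or later for the window frames, same as the queries above.

```sql
declare @AsOf date = cast(getdate() as date); -- evaluation date (assumption; set a fixed date to try the sample data)
declare @BreakDays int = 30;   -- minimum number of full days off that resets the clock
declare @MaxYears int = 3;     -- use 5 with @BreakDays = 180 for the other rule
declare @LookAhead int = 90;   -- 30, 60 or 90

with hops as (
    -- a row starts a new stretch only if it begins more than @BreakDays after every earlier end date
    select t.*,
           case when WorkPeriodStart > dateadd(day, @BreakDays,
                         max(WorkPeriodEnd) over(partition by UserId, CompanyID
                                                 order by WorkPeriodStart
                                                 rows between unbounded preceding and 1 preceding))
                then 1 else 0 end as hop
    from t
),
islands as (
    select hops.*,
           sum(hop) over(partition by UserId, CompanyID
                         order by WorkPeriodStart
                         rows unbounded preceding) as island
    from hops
),
stretches as (
    select UserId, CompanyID, island,
           min(WorkPeriodStart) as s, max(WorkPeriodEnd) as e
    from islands
    group by UserId, CompanyID, island
)
select UserId, CompanyID, s, e,
       dateadd(year, @MaxYears, s) as limit_date
from stretches
where e >= @AsOf                                                      -- stretch is still running
  and e >= dateadd(year, @MaxYears, s)                                -- contracts run up to or past the limit
  and dateadd(year, @MaxYears, s) <= dateadd(day, @LookAhead, @AsOf)  -- limit falls inside the window
order by UserId, CompanyID, s;
```

Whether the 5-year rule should be partitioned by company, staffing company, or user alone depends on how the rule is actually worded, so adjust the partition by accordingly.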

This script merges any overlapping work periods and then calculates the total days worked within the previous 3 and 5 year periods. It then checks whether that total exceeds the maximum working days allowed within the period, by UserId and CompanyId for the 3 year limit and by UserId alone for the 5 year limit. (Is this a correct interpretation of the rules in your question?)
From this it then simply adds on 30, 60 and 90 days to that total, to see if that larger value would be over the respective limits. Given the different grouping rules, this would be cleaner as 2 queries (no duplication of UserId for 5 year rule) but the result is still a flag against any offending UserId.
In the example below you can see UserId = 27857 only violating the 5 year rule at present, but then also violating the 3 year rule should they stay on for another 60 days. In addition, UserId = 27858 is currently okay, but will violate the 5 year rule in 60 days.
I have made some assumptions about how you define a year and whether or not your WorkPeriodEnd values are inclusive or not, so do check that your required logic is properly applied.
Script
if object_id('tempdb..#sampleDates') is not null drop table #sampleDates
create table #sampleDates (UserId int, CompanyId int, WorkPeriodStart datetime, WorkPeriodEnd datetime)
insert #sampleDates values
(27809, 972, '2019-10-10', '2020-10-10')
,(27853, 484, '2019-10-10', '2020-10-10')
,(27856, 172, '2019-10-10', '2020-10-10')
,(27857, 1234, '2015-01-01', '2015-12-31')
,(27857, 1234, '2016-01-01', '2017-02-28')
,(27857, 1234, '2017-01-01', '2017-12-31')
,(27857, 1234, '2018-01-01', '2018-12-31')
,(27857, 1234, '2019-01-01', '2020-01-31')
,(27857, 1234, '2020-01-01', '2020-05-31')
,(27858, 1234, '2015-01-01', '2015-12-31')
,(27858, 1234, '2016-01-01', '2017-02-28')
,(27858, 1234, '2017-01-01', '2017-12-31')
,(27858, 1234, '2018-01-01', '2018-12-31')
,(27858, 1234, '2019-09-01', '2020-01-31')
,(27858, 1234, '2020-01-01', '2020-08-31')
,(27859, 12345, '2015-01-01', '2015-12-31')
,(27859, 12346, '2016-01-01', '2017-02-28')
,(27859, 12347, '2017-01-01', '2017-12-31')
,(27859, 12348, '2018-01-01', '2018-12-31')
,(27859, 12349, '2019-01-01', '2020-01-31')
,(27859, 12340, '2020-01-01', '2020-12-31')
,(27897, 179, '2019-10-10', '2020-10-10')
;
declare @3YearsAgo date = dateadd(year,-3,getdate());
declare @3YearWorkingDays int = (365*3)-30;
declare @5YearsAgo date = dateadd(year,-5,getdate());
declare @5YearWorkingDays int = (365*5)-(365/2);
with p as
(
select UserId
,CompanyId
,min(WorkPeriodStart) as WorkPeriodStart
,max(WorkPeriodEnd) as WorkPeriodEnd
from(select l.*,
sum(case when dateadd(day,1,l.PrevEnd) < l.WorkPeriodStart then 1 else 0 end) over (partition by l.UserId, l.CompanyId order by l.WorkPeriodStart rows unbounded preceding) as grp
from(select d.*,
lag(d.WorkPeriodEnd) over (partition by d.UserId, d.CompanyId order by d.WorkPeriodEnd) as PrevEnd
from #sampleDates as d
) as l
) as g
group by grp
,UserId
,CompanyId
)
,d as
(
select UserId
,CompanyId
,sum(case when @3YearsAgo < WorkPeriodEnd
then datediff(day
,case when @3YearsAgo between WorkPeriodStart and WorkPeriodEnd then @3YearsAgo else WorkPeriodStart end
,WorkPeriodEnd
)
else 0
end
) as WorkingDays3YearsToToday
,sum(case when @5YearsAgo < WorkPeriodEnd
then datediff(day
,case when @5YearsAgo between WorkPeriodStart and WorkPeriodEnd then @5YearsAgo else WorkPeriodStart end
,WorkPeriodEnd
)
else 0
end
) as WorkingDays5YearsToToday
from p
group by UserId
,CompanyId
)
select UserId
,CompanyId
,@3YearWorkingDays as Limit3Year
,@5YearWorkingDays as Limit5Year
,WorkingDays3YearsToToday
,WorkingDays5YearsToToday
,case when WorkingDays3YearsToToday > @3YearWorkingDays then 1 else 0 end as Violation3YearNow
,case when sum(WorkingDays5YearsToToday) over (partition by UserId) > @5YearWorkingDays then 1 else 0 end as Violation5YearNow
,case when WorkingDays3YearsToToday + 30 > @3YearWorkingDays then 1 else 0 end as Violation3Year30Day
,case when sum(WorkingDays5YearsToToday) over (partition by UserId) + 30 > @5YearWorkingDays then 1 else 0 end as Violation5Year30Day
,case when WorkingDays3YearsToToday + 60 > @3YearWorkingDays then 1 else 0 end as Violation3Year60Day
,case when sum(WorkingDays5YearsToToday) over (partition by UserId) + 60 > @5YearWorkingDays then 1 else 0 end as Violation5Year60Day
,case when WorkingDays3YearsToToday + 90 > @3YearWorkingDays then 1 else 0 end as Violation3Year90Day
,case when sum(WorkingDays5YearsToToday) over (partition by UserId) + 90 > @5YearWorkingDays then 1 else 0 end as Violation5Year90Day
from d
order by UserId
,CompanyId;
Output
+--------+-----------+------------+------------+--------------------------+--------------------------+-------------------+-------------------+---------------------+---------------------+---------------------+---------------------+---------------------+---------------------+
| UserId | CompanyId | Limit3Year | Limit5Year | WorkingDays3YearsToToday | WorkingDays5YearsToToday | Violation3YearNow | Violation5YearNow | Violation3Year30Day | Violation5Year30Day | Violation3Year60Day | Violation5Year60Day | Violation3Year90Day | Violation5Year90Day |
+--------+-----------+------------+------------+--------------------------+--------------------------+-------------------+-------------------+---------------------+---------------------+---------------------+---------------------+---------------------+---------------------+
| 27809 | 972 | 1065 | 1643 | 366 | 366 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| 27853 | 484 | 1065 | 1643 | 366 | 366 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| 27856 | 172 | 1065 | 1643 | 366 | 366 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| 27857 | 1234 | 1065 | 1643 | 1029 | 1760 | 0 | 1 | 0 | 1 | 1 | 1 | 1 | 1 |
| 27858 | 1234 | 1065 | 1643 | 877 | 1608 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 1 |
| 27859 | 12340 | 1065 | 1643 | 365 | 365 | 0 | 1 | 0 | 1 | 0 | 1 | 0 | 1 |
| 27859 | 12345 | 1065 | 1643 | 0 | 147 | 0 | 1 | 0 | 1 | 0 | 1 | 0 | 1 |
| 27859 | 12346 | 1065 | 1643 | 0 | 424 | 0 | 1 | 0 | 1 | 0 | 1 | 0 | 1 |
| 27859 | 12347 | 1065 | 1643 | 147 | 364 | 0 | 1 | 0 | 1 | 0 | 1 | 0 | 1 |
| 27859 | 12348 | 1065 | 1643 | 364 | 364 | 0 | 1 | 0 | 1 | 0 | 1 | 0 | 1 |
| 27859 | 12349 | 1065 | 1643 | 395 | 395 | 0 | 1 | 0 | 1 | 0 | 1 | 0 | 1 |
| 27897 | 179 | 1065 | 1643 | 366 | 366 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
+--------+-----------+------------+------------+--------------------------+--------------------------+-------------------+-------------------+---------------------+---------------------+---------------------+---------------------+---------------------+---------------------+

Here is what I ended up with.
<UselessExplanation>
The issues I kept facing were:
How can I handle any and all date range overlaps and count just the days within the contract date ranges?
The client is STILL using SQL 2008, so I need some old(er) school T-SQL.
Ensuring that the break times (times between contracts) are accurately calculated.
So I chose to come up with my own solution, which is probably dumb, since it needs to generate a record in memory for every workday/candidate combination. I do not see the contracts table going beyond the 5-10k record range, which is the only reason I'm going in this direction.
I created a calendar table with every date in it from 1/1/1980 to 12/31/2050.
I then left joined the contract ranges against the calendar table by CandidateId. These will be the dates worked.
Any date in the calendar table that does not match a date within a contract range is a break day.
</UselessExplanation>
Calendar table
if object_id('CalendarTable') is not null drop table CalendarTable
go
create table CalendarTable (pk int identity, CalendarDate date )
declare @StartDate date = cast('1980-01-01' as date)
declare @EndDate date = cast('2050-12-31' as date)
while @StartDate <= @EndDate
begin
insert into CalendarTable ( CalendarDate ) values ( @StartDate )
set @StartDate = dateadd(dd, 1, @StartDate)
end
go
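If the row-by-row loop is slow to fill, the same table can probably be populated in one set-based statement. This is only a sketch: it assumes the CalendarTable definition above, and that sys.all_columns cross joined with itself yields at least the roughly 26,000 rows needed (it normally does). The @FirstDate and @LastDate names are mine, and the cast to int is there because row_number() returns bigint.

```sql
-- Sketch: fill CalendarTable in one statement instead of a while loop.
-- Assumes CalendarTable (pk int identity, CalendarDate date) already exists and is empty.
declare @FirstDate date = '1980-01-01';
declare @LastDate date = '2050-12-31';

insert into CalendarTable (CalendarDate)
select dateadd(day, n - 1, @FirstDate)
from (
    -- number the rows of a large system view; only the count of rows matters here
    select cast(row_number() over (order by (select null)) as int) as n
    from sys.all_columns a
    cross join sys.all_columns b
) as nums
where n <= datediff(day, @FirstDate, @LastDate) + 1;
```

This form should also run on SQL 2008, since it only relies on row_number() and the date type.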
Query for 5 year violations (working 5 years without a 6 month cool off period)
declare @enddate date = dateadd(dd, 30, getdate())
declare @beginDate date = dateadd(dd, -180, dateadd(year, -5, getdate()))
select poss.CandidateId,
min(work.CalendarDate) as FirstWorkDate,
count(work.CandidateId) as workedDays,
sum(case when work.CandidateId is null then 1 else 0 end) as breakDays,
case when count(work.CandidateId) > (365*5) and sum(case when work.CandidateId is null then 1 else 0 end) < (365/2) then 1 else 0 end as Year5Violation,
case when count(work.CandidateId) > (365*5) and sum(case when work.CandidateId is null then 1 else 0 end) < (365/2) then DATEADD(year, 5, min(work.CalendarDate)) else null end as ViolationDate
from
(
select cand.CandidateId, cal.CalendarDate
from CalendarTable cal
join (select distinct c.CandidateId from contracts c where c.WorkPeriodStart is not null and c.WorkPeriodEnd is not null and c.Deleted = 0) cand on 1 = 1
where cal.CalendarDate between @beginDate and @enddate
) as poss
left join
(
select distinct c.CandidateId, cal.CalendarDate
from contracts c
join CalendarTable cal on cal.CalendarDate between c.WorkPeriodStart and c.WorkPeriodEnd
where c.WorkPeriodStart is not null and c.WorkPeriodEnd is not null and c.Deleted = 0
) as work on work.CandidateId = poss.CandidateId and work.CalendarDate = poss.CalendarDate
group by poss.CandidateId

Related

SQL Count Rows GROUP BY Month Name

I have a table and it has the following structure:
DEVICE_ID | DATE | STATUS
------------------------------------------
1 | 2021/01/05 | accepted
2 | 2021/01/23 | success
3 | 2021/02/07 | success
4 | 2021/03/11 | accepted
5 | 2021/03/20 | unsuccess
6 | 2021/03/26 | success
I want to calculate no of records in 2021 by status and GROUP BY month name like this :
MONTH | ACCEPTED | SUCCESS | UNSUCCESS
------------------------------------------------
January | 1 | 1 | 0
February | 0 | 1 | 0
March | 1 | 1 | 1
April | 0 | 0 | 0
May | 0 | 0 | 0
June | 0 | 0 | 0
July | 0 | 0 | 0
August | 0 | 0 | 0
September | 0 | 0 | 0
October | 0 | 0 | 0
November | 0 | 0 | 0
December | 0 | 0 | 0
Please help me to solve this issue

Explanation - because you want the full month list you need to be able to have all 12 months somewhere in the data. Then you want the custom status columns pivoted to display as you asked.
Next time, you should at least show us or tell us what you tried. It helps us figure out how you're thinking about it and how we can help you get past whatever roadblocks you've encountered.
IF OBJECT_ID('TempDb..#tmp') IS NOT NULL DROP TABLE #tmp;
IF OBJECT_ID('TempDb..#tmp2') IS NOT NULL DROP TABLE #tmp2;
CREATE TABLE #tmp
(
Device_ID INT
, Date VARCHAR(12)
, Status VARCHAR(15)
)
;
CREATE TABLE #tmp2
(
MOnthName VARCHAR(25)
)
;
INSERT INTO #tmp2
(MonthName)
VALUES
('January'),
('February'),
('March'),
('April'),
('May'),
('June'),
('July'),
('August'),
('September'),
('October'),
('November'),
('December')
;
INSERT INTO #tmp
(
Device_ID
, Date
, Status
)
VALUES
(1,'2021/01/05','accepted'),
(2,'2021/01/23','success'),
(3,'2021/02/07','success'),
(4,'2021/03/11','accepted'),
(5,'2021/03/20','unsuccess'),
(6,'2021/03/26','success')
;
SELECT
MOnthName
, success
, accepted
, unsuccess
FROM
(
SELECT
tt.MonthName
, Status
FROM
#tmp2 tt
LEFT JOIN #tmp t ON tt.MOnthName = DATENAME(month, CAST(Date AS DATE))
-- No GROUP BY here: grouping by month and status would collapse duplicates, so PIVOT could never count above 1
) AS SourceTable
PIVOT
(
COUNT(Status) FOR Status IN ([accepted], [success], [unsuccess])
) AS PivotTable
ORDER BY
CASE
WHEN MonthName ='January' THEN 1
WHEN MonthName ='February' THEN 2
WHEN MonthName ='March' THEN 3
WHEN MonthName ='April' THEN 4
WHEN MonthName ='May' THEN 5
WHEN MonthName ='June' THEN 6
WHEN MonthName ='July' THEN 7
WHEN MonthName ='August' THEN 8
WHEN MonthName ='September' THEN 9
WHEN MonthName ='October' THEN 10
WHEN MonthName ='November' THEN 11
WHEN MonthName ='December' THEN 12
END

create table yourtable(DEVICE_ID int, DATE date, STATUS varchar(50));
insert into yourtable values(1, '2021/01/05' , 'accepted');
insert into yourtable values(2, '2021/01/23' , 'success');
insert into yourtable values(3, '2021/02/07' , 'success');
insert into yourtable values(4, '2021/03/11' , 'accepted');
insert into yourtable values(5, '2021/03/20' , 'unsuccess');
insert into yourtable values(6, '2021/03/26' , 'success');
Query:
;WITH months(MonthNumber) AS
(
SELECT 0
UNION ALL
SELECT MonthNumber+1
FROM months
WHERE MonthNumber < 11
)
SELECT DATENAME(MONTH,DATEADD(MONTH,MonthNumber,'01-01-2021')) AS [month],
coalesce(sum(case when status='ACCEPTED' then 1 end),0) ACCEPTED,
coalesce(sum(case when status='SUCCESS' then 1 end),0) SUCCESS,
coalesce(sum(case when status='UNSUCCESS' then 1 end),0) UNSUCCESS
FROM months m left join yourtable y
on m.monthnumber=month(y.[date])-1
and year(y.[date])=2021 -- count only 2021 rows
group by monthnumber
order by monthnumber
Output:
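month | ACCEPTED | SUCCESS | UNSUCCESS
------------------------------------------------
January | 1 | 1 | 0
February | 0 | 1 | 0
March | 1 | 1 | 1
April | 0 | 0 | 0
May | 0 | 0 | 0
June | 0 | 0 | 0
July | 0 | 0 | 0
August | 0 | 0 | 0
September | 0 | 0 | 0
October | 0 | 0 | 0
November | 0 | 0 | 0
December | 0 | 0 | 0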

Getting duplicate dates while repeating the rows

I'm trying to rotate or repeat the ShiftId (1, 2) across the date range for each employee.
Everything is working fine, but I don't understand why I am getting duplicate dates or how to get rid of them.
Can anyone help me with this?
My only requirement is that if an EmployeeId has one or more shifts, the ShiftId values should repeat across the given date range for each employee.
DECLARE @TempTable TABLE (EmployeeId int, ShiftId int)
INSERT INTO @TempTable
SELECT 1 , 1
UNION ALL
SELECT 1, 3
UNION ALL
SELECT 2, 3
DECLARE @StartDate datetime = '2020-03-05',
@EndDate datetime = '2020-03-09';
WITH theDates AS
(
SELECT @StartDate AS theDate
UNION ALL
SELECT DATEADD(DAY, 1, theDate)
FROM theDates
WHERE DATEADD(DAY, 1, theDate) <= @EndDate
)
SELECT theDate, EmployeeID, SHiftId
FROM theDates
CROSS APPLY @TempTable
ORDER BY EmployeeId, theDate
OPTION (MAXRECURSION 0);
and I want result like this...
theDate EmployeeID SHiftId
2020-03-05 1 1
2020-03-06 1 3
2020-03-07 1 1
2020-03-08 1 3
2020-03-09 1 1
2020-03-05 2 3
2020-03-06 2 3
2020-03-07 2 3
2020-03-08 2 3
2020-03-09 2 3

Use window functions to join the 2 tables:
DECLARE @TempTable TABLE (EmployeeId int, ShiftId int)
INSERT INTO @TempTable
SELECT 1 , 1
UNION ALL
SELECT 1, 3
UNION ALL
SELECT 2, 3
DECLARE @StartDate datetime = '2020-03-05',
@EndDate datetime = '2020-03-09';
WITH
theDates AS (
SELECT 1 rn, @StartDate AS theDate
UNION ALL
SELECT rn + 1, DATEADD(DAY, 1, theDate)
FROM theDates
WHERE DATEADD(DAY, 1, theDate) <= @EndDate
),
theShifts AS (
SELECT *,
ROW_NUMBER() OVER (PARTITION BY EmployeeId ORDER BY ShiftId) rn,
COUNT(*) OVER (PARTITION BY EmployeeId) counter
FROM @TempTable
)
SELECT d.theDate, s.EmployeeID, s.ShiftId
FROM theDates d INNER JOIN theShifts s
ON s.rn % s.counter = d.rn % s.counter
ORDER BY s.EmployeeId, d.theDate
OPTION (MAXRECURSION 0);
Results:
> theDate | EmployeeID | ShiftId
> :---------------------- | ---------: | ------:
> 2020-03-05 00:00:00.000 | 1 | 1
> 2020-03-06 00:00:00.000 | 1 | 3
> 2020-03-07 00:00:00.000 | 1 | 1
> 2020-03-08 00:00:00.000 | 1 | 3
> 2020-03-09 00:00:00.000 | 1 | 1
> 2020-03-05 00:00:00.000 | 2 | 3
> 2020-03-06 00:00:00.000 | 2 | 3
> 2020-03-07 00:00:00.000 | 2 | 3
> 2020-03-08 00:00:00.000 | 2 | 3
> 2020-03-09 00:00:00.000 | 2 | 3

How to return same row multiple times with multiple conditions

My knowledge is pretty basic so your help would be highly appreciated.
I'm trying to return the same row multiple times when it meets the condition (I only have access to select query).
I have a table of more than 500000 records with Customer ID, Start Date and End Date, where end date could be null.
I am trying to add a new column called Week_No and list all rows accordingly. For example if the date range is more than one week, then the row must be returned multiple times with corresponding week number. Also I would like to count overlapping days, which will never be more than 7 (week) per row and then count unavailable days using second table.
Sample data below
t1
ID | Start_Date | End_Date
000001 | 12/12/2017 | 03/01/2018
000002 | 13/01/2018 |
000003 | 02/01/2018 | 11/01/2018
...
t2
ID | Unavailable
000002 | 14/01/2018
000003 | 03/01/2018
000003 | 04/01/2018
000003 | 08/01/2018
...
I cannot get past the stage of adding the week number. I have tried using CASE and UNION ALL but keep getting errors.
declare @week01start datetime = '2018-01-01 00:00:00'
declare @week01end datetime = '2018-01-07 00:00:00'
declare @week02start datetime = '2018-01-08 00:00:00'
declare @week02end datetime = '2018-01-14 00:00:00'
...
SELECT
ID,
'01' as Week_No,
'2018' as YEAR,
Start_Date,
End_Date
FROM t1
WHERE (Start_Date <= @week01end and End_Date >= @week01start)
or (Start_Date <= @week01end and End_Date is null)
UNION ALL
SELECT
ID,
'02' as Week_No,
'2018' as YEAR,
Start_Date,
End_Date
FROM t1
WHERE (Start_Date <= @week02end and End_Date >= @week02start)
or (Start_Date <= @week02end and End_Date is null)
...
The new table should look like this
ID | Week_No | Year | Start_Date | End_Date | Overlap | Unavail_Days
000001 | 01 | 2018 | 12/12/2017 | 03/01/2018 | 3 |
000002 | 02 | 2018 | 13/01/2018 | | 2 | 1
000003 | 01 | 2018 | 02/01/2018 | 11/01/2018 | 6 | 2
000003 | 02 | 2018 | 02/01/2018 | 11/01/2018 | 4 | 1
...

Business-wise, I cannot understand what you are trying to achieve. You can use the following code, though, to work out the week numbers for your date ranges. I did it the way you asked, but I would recommend a separate table, like a time dimension, to produce a "cleaner" solution.
/*sample data set in temp table*/
select '000001' as id, '2017-12-12'as start_dt, ' 2018-01-03' as end_dt into #tmp union
select '000002' as id, '2018-01-13 'as start_dt, null as end_dt union
select '000003' as id, '2018-01-02' as start_dt, '2018-01-11' as end_dt
/*calculate week numbers and week diff according to dates*/
select *,
DATEPART(WK,start_dt) as start_weekNumber,
DATEPART(WK,end_dt) as end_weekNumber,
case
when DATEPART(WK,end_dt) - DATEPART(WK,start_dt) > 0 then (DATEPART(WK,end_dt) - DATEPART(WK,start_dt)) +1
else (52 - DATEPART(WK,start_dt)) + DATEPART(WK,end_dt)
end as WeekDiff
into #tmp1
from
(
SELECT *,DATEADD(DAY, 2 - DATEPART(WEEKDAY, start_dt), CAST(start_dt AS DATE)) [start_dt_Week_Start_Date],
DATEADD(DAY, 8 - DATEPART(WEEKDAY, start_dt), CAST(start_dt AS DATE)) [startdt_Week_End_Date],
DATEADD(DAY, 2 - DATEPART(WEEKDAY, end_dt), CAST(end_dt AS DATE)) [end_dt_Week_Start_Date],
DATEADD(DAY, 8 - DATEPART(WEEKDAY, end_dt), CAST(end_dt AS DATE)) [end_dt_Week_End_Date]
from #tmp
) s
/*cte used to create duplicates when week diff is over 1*/
;with x as
(
SELECT TOP (10) rn = ROW_NUMBER() --modify the max you want
OVER (ORDER BY [object_id])
FROM sys.all_columns
ORDER BY [object_id]
)
/*final query*/
select --*
ID,
start_weekNumber+ (r-1) as Week,
DATEPART(YY,start_dt) as [YEAR],
start_dt,
end_dt,
null as Overlap,
null as unavailable_days
from
(
select *,
ROW_NUMBER() over (partition by id order by id) r
from
(
select d.* from x
CROSS JOIN #tmp1 AS d
WHERE x.rn <= d.WeekDiff
union all
select * from #tmp1
where WeekDiff is null
) a
)a_ext
order by id,start_weekNumber
--drop table #tmp1,#tmp
The above will produce the results you want except for the overlap and unavailable columns. Instead of just counting weeks, I used the week number within the year based on start_dt, but you can change that if you don't like it:
ID Week YEAR start_dt end_dt Overlap unavailable_days
000001 50 2017 2017-12-12 2018-01-03 NULL NULL
000001 51 2017 2017-12-12 2018-01-03 NULL NULL
000001 52 2017 2017-12-12 2018-01-03 NULL NULL
000002 2 2018 2018-01-13 NULL NULL NULL
000003 1 2018 2018-01-02 2018-01-11 NULL NULL
000003 2 2018 2018-01-02 2018-01-11 NULL NULL
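For the two columns left as NULL above, here is a rough sketch of how they could be filled using the kind of week table recommended earlier. The @weeks table, its Monday-to-Sunday boundaries, and the table variables standing in for t1 and t2 are my assumptions; only the first two weeks of 2018 are loaded, to match the sample in the question. An open-ended End_Date is treated as running to the end of each week.

```sql
-- Hypothetical stand-ins for t1 and t2 from the question
declare @t1 table (ID char(6), Start_Date date, End_Date date null);
insert @t1 values ('000001', '2017-12-12', '2018-01-03'),
                  ('000002', '2018-01-13', null),
                  ('000003', '2018-01-02', '2018-01-11');

declare @t2 table (ID char(6), Unavailable date);
insert @t2 values ('000002', '2018-01-14'), ('000003', '2018-01-03'),
                  ('000003', '2018-01-04'), ('000003', '2018-01-08');

-- Assumed week table; a real time dimension would replace this
declare @weeks table (Week_No char(2), [Year] int, WeekStart date, WeekEnd date);
insert @weeks values ('01', 2018, '2018-01-01', '2018-01-07'),
                     ('02', 2018, '2018-01-08', '2018-01-14');

select a.ID, w.Week_No, w.[Year], a.Start_Date, a.End_Date,
       -- days shared by the range and the week, both ends inclusive
       datediff(day,
                case when a.Start_Date > w.WeekStart then a.Start_Date else w.WeekStart end,
                case when a.End_Date < w.WeekEnd then a.End_Date else w.WeekEnd end) + 1 as Overlap,
       -- unavailable dates that fall inside both the week and the range
       (select count(*)
          from @t2 u
         where u.ID = a.ID
           and u.Unavailable between w.WeekStart and w.WeekEnd
           and u.Unavailable >= a.Start_Date
           and (a.End_Date is null or u.Unavailable <= a.End_Date)) as Unavail_Days
from @t1 a
join @weeks w
  on a.Start_Date <= w.WeekEnd
 and (a.End_Date is null or a.End_Date >= w.WeekStart)
order by a.ID, w.Week_No;
```

The join returns one row per week that a range touches, so a range spanning several weeks is repeated without any UNION ALL per week. Unavail_Days comes back as 0 rather than blank when there are no matching dates.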

Display data for all date ranges including missing dates

I'm having an issue with dates. I have a table with given from and to dates for an employee. For an evaluation, I'd like to display each date of the month with the corresponding values from the second SQL table.
SQL Table:
EmpNr | datefrom | dateto | hours
0815 | 01.01.2019 | 03.01.2019 | 15
0815 | 05.01.2019 | 15.01.2019 | 15
0815 | 20.01.2019 | 31.12.9999 | 40
The given employee (0815) worked during 01.01.-15.01. 15 hours, and during 20.01.-31.01. 40 hours
I'd like to have the following result:
0815 | 01.01.2019 | 15
0815 | 02.01.2019 | 15
0815 | 03.01.2019 | 15
0815 | 04.01.2019 | NULL
0815 | 05.01.2019 | 15
...
0815 | 15.01.2019 | 15
0815 | 16.01.2019 | NULL
0815 | 17.01.2019 | NULL
0815 | 18.01.2019 | NULL
0815 | 19.01.2019 | NULL
0815 | 20.01.2019 | 40
0815 | 21.01.2019 | 40
...
0815 | 31.01.2019 | 40
As for the dates, we have:
declare @year int = 2019, @month int = 1;
WITH numbers
as
(
Select 1 as value
UNion ALL
Select value + 1 from numbers
where value + 1 <= Day(EOMONTH(datefromparts(@year,@month,1)))
)
SELECT b.empnr, b.hours, datefromparts(@year,@month,numbers.value) Datum FROM numbers left outer join
emptbl b on b.empnr = '0815' and (datefromparts(@year,@month,numbers.value) >= b.datefrom and datefromparts(@year,@month,numbers.value) <= case b.dateto )
This is working quite well, yet I have the odd issue that this code only shows the dates between 01.01.2019 and 03.01.2019.
Thank you very much in advance!

Did you check whether datefrom and dateto are within the valid range?
The minimum value of a DateTime field is 1753-01-01 and the maximum value is 9999-12-31.
Look at your source table to check the initial values.

The recursive CTE needs to begin with MIN(datefrom) and MAX(dateto):
DECLARE @t TABLE (empnr INT, datefrom DATE, dateto DATE, hours INT);
INSERT INTO @t VALUES
(815, '2019-01-01', '2019-01-03', 15),
(815, '2019-01-05', '2019-01-15', 15),
(815, '2019-01-20', '9999-01-01', 40),
-- another employee
(999, '2018-01-01', '2018-01-31', 15),
(999, '2018-03-01', '2018-03-31', 15),
(999, '2018-12-01', '9999-01-01', 40);
WITH rcte AS (
SELECT empnr
, MIN(datefrom) AS refdate
, ISNULL(NULLIF(MAX(dateto), '9999-01-01'), CURRENT_TIMESTAMP) AS maxdate -- clamp year 9999 to today
FROM @t
GROUP BY empnr
UNION ALL
SELECT empnr
, DATEADD(DAY, 1, refdate)
, maxdate
FROM rcte
WHERE refdate < maxdate
)
SELECT rcte.empnr
, rcte.refdate
, t.hours
FROM rcte
LEFT JOIN @t AS t ON rcte.empnr = t.empnr AND rcte.refdate BETWEEN t.datefrom AND t.dateto
ORDER BY rcte.empnr, rcte.refdate
OPTION (MAXRECURSION 1000) -- approx 3 years

It could be in your select, try:
SELECT b.empnr, b.hours, datefromparts(@year,@month,numbers.value) Datum
FROM numbers
LEFT OUTER JOIN emptbl b ON b.empnr = '0815' AND
datefromparts(@year,@month,numbers.value) BETWEEN b.datefrom AND b.dateto

Your CTE produces only 31 numbers, and therefore it shows only January dates.
declare @year int = 2019, @month int = 1;
WITH numbers
as
(
Select 1 as value
UNion ALL
Select value + 1 from numbers
where value + 1 <= Day(EOMONTH(datefromparts(@year,@month,1)))
)
SELECT *
FROM numbers
https://dbfiddle.uk/?rdbms=sqlserver_2017&fiddle=a24e58ef4ce522d3ec914f90907a0a9e
You can try below code,
with t0 (i) as (select 0 union all select 0 union all select 0),
t1 (i) as (select a.i from t0 a ,t0 b ),
t2 (i) as (select a.i from t1 a ,t1 b ),
t3 (srno) as (select row_number()over(order by a.i) from t2 a ,t2 b ),
tbldt(dt) as (select dateadd(day,t3.srno-1,'01/01/2019') from t3)
select tbldt.dt
from tbldt
where tbldt.dt <= b.dateto -- put your condition here
https://dbfiddle.uk/?rdbms=sqlserver_2017&fiddle=b16469908b323b8d1b98d77dd09bab3d

SQL: Generate Record Per Month In Date Range

I have a table which describes a value which is valid for a certain period of days / months.
The table looks like this:
+----+------------+------------+-------+
| Id | From | To | Value |
+----+------------+------------+-------+
| 1 | 2018-01-01 | 2018-03-31 | ValA |
| 2 | 2018-01-16 | NULL | ValB |
| 3 | 2018-04-01 | 2018-05-12 | ValC |
+----+------------+------------+-------+
As you can see, the only value still valid on this day is ValB (To is nullable, From isn't).
I am trying to achieve a view on this table like this (assuming I render this view someday in July 2018):
+----------+------------+------------+-------+
| RecordId | From | To | Value |
+----------+------------+------------+-------+
| 1 | 2018-01-01 | 2018-01-31 | ValA |
| 1 | 2018-02-01 | 2018-02-28 | ValA |
| 1 | 2018-03-01 | 2018-03-31 | ValA |
| 2 | 2018-01-16 | 2018-01-31 | ValB |
| 2 | 2018-02-01 | 2018-02-28 | ValB |
| 2 | 2018-03-01 | 2018-03-31 | ValB |
| 2 | 2018-04-01 | 2018-04-30 | ValB |
| 2 | 2018-05-01 | 2018-05-31 | ValB |
| 2 | 2018-06-01 | 2018-06-30 | ValB |
| 3 | 2018-04-01 | 2018-04-30 | ValC |
| 3 | 2018-05-01 | 2018-05-12 | ValC |
+----------+------------+------------+-------+
This view basically creates a record for each record in the table, but split by month, using the correct dates (especially minding the start and end dates that are not on the first or the last day of the month).
The one record without a To date (so it's still valid to this day) is rendered until the last day of the month in which I render the view, so at the time of writing, this is July 2018.
This is a simple example, but a solution will seriously help me along. I'll need this for multiple calculations, including proration of amounts.
Here's a table script and some insert statements that you can use:
CREATE TABLE [dbo].[Test]
(
[Id] INT IDENTITY(1,1) NOT NULL PRIMARY KEY,
[From] SMALLDATETIME NOT NULL,
[To] SMALLDATETIME NULL,
[Value] NVARCHAR(100) NOT NULL
)
INSERT INTO dbo.Test ([From],[To],[Value])
VALUES
('2018-01-01','2018-03-31','ValA'),
('2018-01-16',null,'ValB'),
('2018-04-01','2018-05-12','ValC');
Thanks in advance!

Generate all months that might appear on your values (with start and end), then join where each month overlaps the period of your values. Change the result so if a month doesn't overlap fully, you just display the limits of your period.
DECLARE @StartDate DATE = '2018-01-01'
DECLARE @EndDate DATE = '2020-01-01'
;WITH GeneratedMonths AS
(
SELECT
StartDate = @StartDate,
EndDate = EOMONTH(@StartDate)
UNION ALL
SELECT
StartDate = DATEADD(MONTH, 1, G.StartDate),
EndDate = EOMONTH(DATEADD(MONTH, 1, G.StartDate))
FROM
GeneratedMonths AS G
WHERE
DATEADD(MONTH, 1, G.StartDate) < @EndDate
)
SELECT
T.Id,
[From] = CASE WHEN T.[From] >= G.StartDate THEN T.[From] ELSE G.StartDate END,
[To] = CASE WHEN G.EndDate >= T.[To] THEN T.[To] ELSE G.EndDate END,
T.Value
FROM
dbo.Test AS T
INNER JOIN GeneratedMonths AS G ON
G.EndDate >= T.[From] AND
G.StartDate <= ISNULL(T.[To], GETDATE())
ORDER BY
T.Id,
G.StartDate
OPTION
(MAXRECURSION 3000)

A recursive CTE is a very simple way to do this if you don't have a large dataset:
with t as (
select id, [from], [to], Value
from Test
union all
select id, dateadd(mm, 1, [from]), [to], value
from t
where dateadd(mm, 1, [from]) < coalesce([to], getdate())
)
select id, [from], (case when eomonth([from]) <= coalesce([to], cast(getdate() as date))
then eomonth([from]) else coalesce([to], eomonth([from]))
end) as [To],
Value
from t
order by id;

By using date functions and recursive CTE.
with cte as
(
Select Id, Cast([From] as date) as [From], EOMONTH([from]) as [To1],
COALESCE([To],EOMONTH(GETDATE())) AS [TO],Value from test
UNION ALL
Select Id, DATEADD(DAY,1,[To1]),
CASE when EOMONTH(DATEADD(DAY,1,[To1])) > [To] THEN CAST([To] AS DATE)
ELSE EOMONTH(DATEADD(DAY,1,[To1])) END as [To1],
[To],Value from cte where TO1 <> [To]
)
Select Id, [From],[To1] as [To], Value from cte order by Id

@EzLo, your solution is good but requires setting two variables with fixed values.
To avoid this, you can run a recursive CTE on the real data:
WITH A AS(
SELECT
T.Id, CAST(T.[From] AS DATE) AS [From], CASE WHEN T.[To]<EOMONTH(T.[From], 0) THEN T.[To] ELSE EOMONTH(T.[From], 0) END AS [To], T.Value, CAST(0 AS INTEGER) AS ADD_M
FROM
TEST T
UNION ALL
SELECT
T.Id, DATEADD(DAY, 1, EOMONTH(T.[From], -1+(A.ADD_M+1))), CASE WHEN T.[To]<EOMONTH(T.[From], A.ADD_M+1) THEN T.[To] ELSE EOMONTH(T.[From], A.ADD_M+1) END AS [To], T.Value, A.ADD_M+1
FROM
TEST T
INNER JOIN A ON T.Id=A.Id AND DATEADD(MONTH, A.ADD_M+1, T.[From]) < CASE WHEN T.[To] IS NULL THEN CAST(GETDATE() AS DATE) ELSE T.[To] END
)
SELECT
A.[Id], A.[From], A.[To], A.[Value]
FROM
A
ORDER BY A.[Id], A.[From]
