Calculate Experience without overlapping - sql

I'm trying to come up with the correct query to calculate employment experience time, but I can't get it right. Here's the data I have:
Case 1:
EmployeeID PositionID StartDate EndDate
1 15 5/22/2017 5/22/2018
1 17 7/14/2018 8/10/2019
Case 2:
EmployeeID PositionID StartDate EndDate
1 15 5/22/2017 8/10/2019
1 17 3/8/2019 8/10/2019
Case 3:
EmployeeID PositionID StartDate EndDate
1 15 5/22/2017 NULL
1 17 3/8/2019 NULL
In the first case, my expected result is 27 months for both positions.
In the second case, my expected result is 27 months for PositionID 15 and 0 months for PositionID 17, because PositionID 17 falls within the date range of the first position, so the employee is not awarded any experience for it.
In the third case, my expected result is 30 months for PositionID 15, using today's date as the end date, and 0 months for PositionID 17, because PositionID 17 again falls within the date range of the first position and earns no experience.

You don't have any gaps, so I think this does what you want:
select employeeid,
-- max() ignores NULLs, so replace a NULL enddate with today before aggregating
datediff(month, min(startdate), max(coalesce(enddate, getdate()))) as months
from t
group by employeeid;
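
To try this against the three cases from the question, here is a minimal sketch. The table variable @t and the case_no column are assumptions added only to keep the three data sets apart; they are not part of the original schema.

```sql
-- Hypothetical test harness: case_no separates the three sample data sets.
DECLARE @t TABLE (case_no int, employeeid int, positionid int,
                  startdate date, enddate date);

INSERT INTO @t VALUES
    (1, 1, 15, '20170522', '20180522'),
    (1, 1, 17, '20180714', '20190810'),
    (2, 1, 15, '20170522', '20190810'),
    (2, 1, 17, '20190308', '20190810'),
    (3, 1, 15, '20170522', NULL),
    (3, 1, 17, '20190308', NULL);

-- One row per case: months from the earliest start to the latest end,
-- treating a NULL enddate as "still employed today".
SELECT case_no, employeeid,
       DATEDIFF(month, MIN(startdate),
                MAX(COALESCE(enddate, CAST(GETDATE() AS date)))) AS months
FROM @t
GROUP BY case_no, employeeid
ORDER BY case_no;
```

Cases 1 and 2 both span 2017-05-22 to 2019-08-10, so they return 27. Case 3 depends on the date the query is run. Note that this gives a total per employee, not a per-position breakdown.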

This is what I have:
Your table 1:
select 1 as EmployeeID , 15 as PositionID , cast('5/22/2017' as date) as StartDate, cast('5/22/2018' as date) as EndDate into t2
union select 1, 17, '7/14/2018', '8/10/2019'
And the query to get the result
with a as
(
select EmployeeID, isnull(StartDate, cast(getdate() as date)) as sedate from t2
union
select EmployeeID, isnull(EndDate, cast(getdate() as date)) from t2
)
select a1.*, a2.sedate,
case when datediff(month, a1.sedate, a2.sedate) < 0 then 0
else isnull(datediff(month, a1.sedate, a2.sedate), 0) end as months
from a a1
left join a a2
on a1.EmployeeID = a2.EmployeeID and a1.sedate < a2.sedate
and not exists (select 1 from a a3 where a3.EmployeeID = a2.EmployeeID and a3.sedate > a1.sedate and a3.sedate < a2.sedate)
I changed the table to the values of Case 2 and Case 3 and it seemed to work.
Let us know if that helps.

Related

Split dates into quarters based on start and end date

I want to split quarters based on a given start and end date.
I have the following table:
table1
ID start_date end_date No. of Quarters
1 01-01-2017 01-01-2018 4
2 01-04-2017 01-10-2018 7
So the result table should have the dates split based on the number of quarters and the end date.
The result table should look like:
table2
ID Quarterly Start Date
1 01-01-2017
1 01-04-2017
1 01-07-2017
1 01-10-2017
2 01-04-2017
2 01-07-2017
2 01-10-2017
2 01-01-2018
2 01-04-2018
2 01-07-2018
2 01-10-2018
I found a solution on Stack Overflow, which reads:
declare @startDate datetime
declare @endDate datetime
select
@startDate = ET.start_date,
@endDate = ET.end_date
from
table1 ET
;With cte
As
( Select @startDate date1
Union All
Select DateAdd(Month,3,date1) From cte where date1 < @endDate
) select cast(cast( Year(date1)*10000 + MONTH(date1)*100 + 1 as varchar(255)) as date) quarterlyDates From cte
Since I am new to SQL, I am having trouble customizing it to my problem.
Could anyone please recommend a way? Thanks!
If I understand correctly, the recursive CTE would look like:
with cte as (
select id, start_date, num_quarters
from t
union all
select id, dateadd(month, 3, start_date), num_quarters - 1
from cte
where num_quarters > 1
)
select *
from cte;
Here is a db<>fiddle.
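As a self-contained way to try the recursive CTE above, here is a minimal sketch using the two sample rows from the question. It assumes the dates are in dd-mm-yyyy form (so 01-04-2017 is 1 April) and that the "No. of Quarters" column is stored as num_quarters; the table variable @t is a stand-in for the real table.

```sql
-- Hypothetical stand-in for table1, with the quarter count as num_quarters.
DECLARE @t TABLE (id int, start_date date, num_quarters int);

INSERT INTO @t VALUES
    (1, '20170101', 4),
    (2, '20170401', 7);

WITH cte AS (
    -- Anchor: the first quarter starts at start_date.
    SELECT id, start_date, num_quarters
    FROM @t
    UNION ALL
    -- Step: move forward three months until the quarter count is used up.
    SELECT id, DATEADD(month, 3, start_date), num_quarters - 1
    FROM cte
    WHERE num_quarters > 1
)
SELECT id, start_date AS quarterly_start_date
FROM cte
ORDER BY id, start_date;
```

This yields four rows for ID 1 (January through October 2017) and seven rows for ID 2 (April 2017 through October 2018), matching table2 in the question.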

SQL Server: Count days difference between previous date and current date

I've been trying to find a way to count the difference in days between the dates of the previous row and the current row, counting only business days.
Example data and criteria here.
ID StartDate EndDate NewDate DaysDifference
========================================================================
0 04/05/2017 null
1 12/06/2017 16/06/2017 12/06/2017 29
2 03/07/2017 04/07/2017 16/06/2017 13
3 07/07/2017 10/07/2017 04/07/2017 5
4 12/07/2017 26/07/2017 10/07/2017 13
My end goal:
I want two new columns: NewDate and DaysDifference.
The NewDate column comes from the EndDate of the previous row. For example, the NewDate of ID 2 is 16/06/2017, which comes from the EndDate of ID 1. But if the EndDate of the previous row is null, use its StartDate instead (the ID 1 case).
The DaysDifference column counts only the business days between the EndDate and NewDate columns.
Here is the script I am currently using:
select distinct
c.ID
,c.EndDate
,isnull(p.EndDate,c.StartDate) as NewDate
,count(distinct cast(l.CalendarDate as date)) as DaysDifference
from
(select *
from table) c
full join
(select *
from table) p
on c.level = p.level
and c.id-1 = p.id
left join Calendar l
on (cast(l.CalendarDate as date) between cast(p.EndDate as date) and cast(c.EndDate as date)
or
cast(l.CalendarDate as date) between cast(p.EndDate as date) and cast(c.StartDate as date))
and l.Day not in ('Sat','Sun') and l.Holiday <> 'Y'
where c.ID <> 0
group by
c.ID
,c.EndDate
,isnull(p.EndDate,c.StartDate)
And this is the current result:
ID EndDate NewDate DaysDifference
=========================================================
1 16/06/2017 12/06/2017 0
2 04/07/2017 16/06/2017 13
3 10/07/2017 04/07/2017 5
4 26/07/2017 10/07/2017 13
In the real data, DaysDifference is correct for IDs 2, 3 and 4 but not for ID 1: its previous row (ID 0) has a null EndDate, so the StartDate is used instead and the count comes out wrong.
I hope I've provided enough info. :)
Could you please show me a way to count DaysDifference correctly?
Thanks in advance!
I think you can use this logic to get the previous date:
select t.*,
lag(coalesce(enddate, startdate), 1) over (order by id) as newdate
from t;
Then for the difference:
select id, enddate, newdate,
sum(case when c.day not in ('Sat', 'Sun') and c.holiday <> 'Y' then 1 else 0 end) as diff
from (select t.*,
lag(coalesce(enddate, startdate), 1) over (order by id) as newdate
from t
) t join
calendar c
on c.calendardate >= newdate and c.calendardate <= startdate
group by id, enddate, newdate;
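To see what the LAG step produces on its own, here is a minimal sketch over the sample rows from the question. The table variable @t is a stand-in for the real table, and the dates are read as dd/mm/yyyy. The business-day count is left out because it depends on the Calendar table, which is not shown in full.

```sql
-- Hypothetical stand-in for the real table, using the question's sample rows.
DECLARE @t TABLE (id int, startdate date, enddate date);

INSERT INTO @t VALUES
    (0, '20170504', NULL),
    (1, '20170612', '20170616'),
    (2, '20170703', '20170704'),
    (3, '20170707', '20170710'),
    (4, '20170712', '20170726');

-- newdate is the previous row's enddate, or its startdate when enddate is NULL.
SELECT id, startdate, enddate,
       LAG(COALESCE(enddate, startdate)) OVER (ORDER BY id) AS newdate
FROM @t
ORDER BY id;
```

With this data, ID 0 has no previous row and gets a NULL newdate, ID 1 picks up ID 0's startdate (2017-05-04), and IDs 2 to 4 get 2017-06-16, 2017-07-04 and 2017-07-10.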

Split 2 dates into 2 rows in SQL

I want to split a holiday that spans 2 months into 2 rows in SQL. For example:
EmpId StartDate EndDate TotalDays
1 2017/5/25 2017/6/10 16
I need to split it into 2 rows like the following:
EmpId StartDate EndDate TotalDays
1 2017/5/25 2017/5/31 6
1 2017/6/1 2017/6/10 10
Thank you
Assuming holidays only have one month split (as in your example):
select empid, startdate,
(case when eomonth(startdate) < enddate then eomonth(startdate) else enddate end) as enddate
from t
union all
select empid, dateadd(day, 1, eomonth(startdate)), enddate
from t
where eomonth(startdate) < enddate;
Well, that doesn't give TotalDays, but you can do that using a subquery and datediff().
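A minimal sketch of that subquery-plus-datediff() idea, using the sample row from the question. The table variable @t and the helper column count_from are assumptions. To reproduce the 6 and 10 in the expected output (which sum to the original 16), the second segment is counted from the month end, not from the first of the next month; that reading of the question is also an assumption.

```sql
-- Hypothetical stand-in for the holiday table.
DECLARE @t TABLE (empid int, startdate date, enddate date);

INSERT INTO @t VALUES (1, '20170525', '20170610');

SELECT empid,
       seg_start AS startdate,
       seg_end   AS enddate,
       DATEDIFF(day, count_from, seg_end) AS totaldays
FROM (
    -- First segment: from startdate to the month end (or enddate if earlier).
    SELECT empid,
           startdate AS seg_start,
           CASE WHEN EOMONTH(startdate) < enddate
                THEN EOMONTH(startdate) ELSE enddate END AS seg_end,
           startdate AS count_from
    FROM @t
    UNION ALL
    -- Second segment: from the first of the next month to enddate,
    -- with days counted from the month end so both parts add up.
    SELECT empid,
           DATEADD(day, 1, EOMONTH(startdate)),
           enddate,
           EOMONTH(startdate)
    FROM @t
    WHERE EOMONTH(startdate) < enddate
) AS s
ORDER BY empid, seg_start;
```

For the sample row this returns 2017-05-25 to 2017-05-31 with 6 days and 2017-06-01 to 2017-06-10 with 10 days.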

T/SQL - Group/Multiply records

Source data:
CREATE TABLE #Temp (ID INT Identity(1,1) Primary Key, BeginDate datetime, EndDate datetime, GroupBy INT)
INSERT INTO #Temp
SELECT '2015-06-05 00:00:00.000','2015-06-12 00:00:00.000',7
UNION
SELECT '2015-06-05 00:00:00.000', '2015-06-08 00:00:00.000',7
UNION
SELECT '2015-10-22 00:00:00.000', '2015-10-31 00:00:00.000',7
SELECT *, DATEDIFF(DAY,BeginDate, EndDate) TotalDays FROM #Temp
DROP TABLE #Temp
ID BeginDate EndDate GroupBy TotalDays
1 6/5/15 0:00 6/8/15 0:00 7 3
2 6/5/15 0:00 6/12/15 0:00 7 7
3 10/22/15 0:00 10/31/15 0:00 7 9
Desired Output:
ID BeginDate EndDate GroupBy TotalDays GroupCnt GroupNum
1 6/5/15 0:00 6/8/15 0:00 7 3 1 1
2 6/5/15 0:00 6/12/15 0:00 7 7 1 1
3 10/22/15 0:00 10/29/15 0:00 7 9 2 1
3 10/29/15 0:00 10/31/15 0:00 7 9 2 2
Goal:
Group the records based on ID/BeginDate/EndDate.
Based on the GroupBy number (# of days) and TotalDays (days diff),
if GroupBy >= TotalDays, keep a single group record
else multiply the group records (1 record per GroupBy count) while staying within TotalDays limit.
Apologies if it's confusing but basically, in the above example, there should be one record for each group (ID/BeginDate/EndDate) for the record where days diff b/w Begin/End date = 7 or less (GroupBy).
If the days diff goes above 7 days, create another record (for every additional 7 days diff).
So since 1st two records have days diff of 7 days or less, there's only one record.
The 3rd record has days diff of 9 (7 + 2). Therefore, there should be 2 records (1st for the first 7 days and 2nd for the additional 2 days).
GroupCnt = how many records there are in the group after applying the above rules.
GroupNum is basically row number of the group.
GroupBy # can be different for each record. Dataset is huge so performance does matter.
One pattern I was able to figure out was related to the modulus b/w GroupBy and days diff.
When the GroupBy value is < days diff, the modulus always equals GroupBy. When the GroupBy value = days diff, the modulus is always 0. And when the GroupBy value > days diff, the modulus is always less than the days diff. I'm not sure if/how to use that to group/multiply records to meet the requirement.
SELECT DISTINCT
ID
, BeginDate
, EndDate
, GroupBy
, DATEDIFF(DAY,BeginDate, EndDate) TotalDays
, CAST(GroupBy as decimal(18,6))%CAST(DATEDIFF(DAY,BeginDate, EndDate) AS decimal(18,6)) Modulus
, CASE WHEN DATEDIFF(DAY,BeginDate, EndDate) <= GroupBy THEN BeginDate END NewBeginDate
, CASE WHEN DATEDIFF(DAY,BeginDate, EndDate) <= GroupBy THEN EndDate END NewEndDate
FROM #Temp
Update:
Forgot to mention/include that the begin/enddate, when the records gets multiplied, will change accordingly. In other words, begin/end date will reflect the GroupBy - desired output shows what I mean more clearly in the 3rd and 4th record.
Also, GroupCnt/GroupNum are not as important to calculate as grouping/multiplying the records.
You could do something like this using a recursive CTE:
;WITH cte AS (
SELECT ID,
BeginDate,
EndDate,
GroupBy,
DATEDIFF(DAY, BeginDate, EndDate) AS TotalDays,
1 AS GroupNum
FROM #Temp
UNION ALL
SELECT ID,
BeginDate,
EndDate,
GroupBy,
TotalDays,
GroupNum + 1
FROM cte
WHERE GroupNum * GroupBy < TotalDays
)
SELECT ID,
BeginDate = CASE WHEN GroupNum = 1 THEN BeginDate
ELSE DATEADD(DAY, GroupBy * (GroupNum - 1), BeginDate)
END ,
EndDate = CASE WHEN TotalDays <= GroupBy THEN EndDate
WHEN DATEADD(DAY, GroupBy * GroupNum, BeginDate) > EndDate THEN EndDate
ELSE DATEADD(DAY, GroupBy * GroupNum, BeginDate)
END ,
GroupBy,
TotalDays,
COUNT(*) OVER (PARTITION BY ID) GroupCnt,
GroupNum
FROM cte
OPTION (MAXRECURSION 0)
The CTE builds out a recordset like this:
ID BeginDate EndDate GroupBy TotalDays GroupNum
----------- ----------------------- ----------------------- ----------- ----------- -----------
1 2015-06-05 00:00:00.000 2015-06-08 00:00:00.000 7 3 1
2 2015-06-05 00:00:00.000 2015-06-12 00:00:00.000 7 7 1
3 2015-10-22 00:00:00.000 2015-10-31 00:00:00.000 7 9 1
3 2015-10-22 00:00:00.000 2015-10-31 00:00:00.000 7 9 2
Then you just have to take this and use some CASE expressions to determine what the begin and end dates should be.
You should end up with:
ID BeginDate EndDate GroupBy TotalDays GroupCnt GroupNum
----------- ----------------------- ----------------------- ----------- ----------- ----------- -----------
1 2015-06-05 00:00:00.000 2015-06-08 00:00:00.000 7 3 1 1
2 2015-06-05 00:00:00.000 2015-06-12 00:00:00.000 7 7 1 1
3 2015-10-22 00:00:00.000 2015-10-29 00:00:00.000 7 9 2 1
3 2015-10-29 00:00:00.000 2015-10-31 00:00:00.000 7 9 2 2
Since you're using SQL 2012, you can also use the LAG and LEAD functions in your final query.
;WITH cte AS (
SELECT ID,
BeginDate,
EndDate,
GroupBy,
DATEDIFF(DAY, BeginDate, EndDate) AS TotalDays,
1 AS GroupNum
FROM #Temp
UNION ALL
SELECT ID,
BeginDate,
EndDate,
GroupBy,
TotalDays,
GroupNum + 1
FROM cte
WHERE GroupNum * GroupBy < TotalDays
)
SELECT ID,
BeginDate = COALESCE(LAG(BeginDate) OVER (PARTITION BY ID ORDER BY GroupNum) + GroupBy * (GroupNum - 1), BeginDate),
EndDate = COALESCE(LEAD(BeginDate) OVER (PARTITION BY ID ORDER BY GroupNum) + GroupBy * GroupNum, EndDate),
GroupBy,
TotalDays,
COUNT(*) OVER (PARTITION BY ID) GroupCnt,
GroupNum
FROM cte
OPTION (MAXRECURSION 0)
CREATE TABLE dim_number (id INT);
INSERT INTO dim_number VALUES (0), (1), (2), (3); -- Populate this to a large number
SELECT
#Temp.Id,
CASE WHEN dim_number.id = 0
THEN #Temp.BeginDate
ELSE DATEADD(DAY, dim_number.id * #Temp.GroupBy, #Temp.BeginDate)
END AS BeginDate,
CASE WHEN dim_number.id = parts.count
THEN #Temp.EndDate
ELSE DATEADD(DAY, (dim_number.id + 1) * #Temp.GroupBy, #Temp.BeginDate)
END AS EndDate,
#Temp.GroupBy AS GroupBy,
DATEDIFF(DAY, #Temp.BeginDate, #Temp.EndDate) AS TotalDays,
parts.count + 1 AS GroupCnt,
dim_number.id + 1 AS GroupNum
FROM
#Temp
CROSS APPLY
-- (days - 1) / GroupBy, so an exact multiple of GroupBy does not add an empty trailing row
(SELECT (DATEDIFF(DAY, #Temp.BeginDate, #Temp.EndDate) - 1) / #Temp.GroupBy AS count) AS parts
INNER JOIN
dim_number
ON dim_number.id >= 0
AND dim_number.id <= parts.count
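
The comment above asks for the numbers table to be populated to a large number. Here is a minimal sketch of one way to do that in SQL Server in place of a hand-typed VALUES list: it numbers the rows of a cross join over the system view sys.all_objects. The temp table name #dim_number and the 10000-row size are assumptions; size it to cover the largest TotalDays / GroupBy you expect.

```sql
-- Hypothetical numbers table holding 0, 1, 2, ... up to 9999.
CREATE TABLE #dim_number (id INT PRIMARY KEY);

INSERT INTO #dim_number (id)
SELECT TOP (10000)
       ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) - 1   -- zero-based sequence
FROM sys.all_objects AS a
CROSS JOIN sys.all_objects AS b;   -- cross join only to get enough rows

-- Quick check of the range that was generated.
SELECT MIN(id) AS min_id, MAX(id) AS max_id, COUNT(*) AS row_count
FROM #dim_number;
```

A numbers table like this is built once and reused, which avoids recursion at query time. That may matter here, since the question says the dataset is huge.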

Efficient join with a "correlated" subquery

Given three tables Dates(date aDate, doUse boolean), Days(rangeId int, day int, qty int) and Range(rangeId int, startDate date) in Oracle
I want to join these so that Range is joined with Dates from aDate = startDate where doUse = 1 with each day in Days.
Given a single range it might be done something like this
SELECT rangeId, aDate, CASE WHEN doUse = 1 THEN qty ELSE 0 END AS qty
FROM (
SELECT aDate, doUse, SUM(doUse) OVER (ORDER BY aDate) day
FROM Dates
WHERE aDate >= :startDate
) INNER JOIN (
SELECT rangeId, day,qty
FROM Days
WHERE rangeId = :rangeId
) USING (day)
ORDER BY day ASC
What I want to do is make the query work for all ranges in Range, not just one.
The problem is that the join value "day" depends on the range's startDate, which gives me some trouble in formulating a query.
Keep in mind that the Dates table is pretty huge, so I would like to avoid calculating the day value from the first date in the table, while each range's Days shouldn't span more than 100 days or so.
Edit: Sample data
Dates
aDate doUse
2008-01-01 1
2008-01-02 1
2008-01-03 0
2008-01-04 1
2008-01-05 1
Days
rangeId day qty
1 1 1
1 2 10
1 3 8
2 1 2
2 2 5
Ranges
rangeId startDate
1 2008-01-02
2 2008-01-03
Result
rangeId aDate qty
1 2008-01-02 1
1 2008-01-03 0
1 2008-01-04 10
1 2008-01-05 8
2 2008-01-03 0
2 2008-01-04 2
2 2008-01-05 5
Try this:
SELECT rt.rangeId, aDate, CASE WHEN doUse = 1 THEN qty ELSE 0 END AS qty
FROM (
SELECT *
FROM (
SELECT r.*, t.*, SUM(doUse) OVER (PARTITION BY rangeId ORDER BY aDate) AS span
FROM (
SELECT r.rangeId, startDate, MAX(day) AS dm
FROM Range r, Days d
WHERE d.rangeid = r.rangeid
GROUP BY
r.rangeId, startDate
) r, Dates t
WHERE t.adate >= startDate
ORDER BY
rangeId, t.adate
)
WHERE
span <= dm
) rt, Days d
WHERE d.rangeId = rt.rangeID
AND d.day = GREATEST(rt.span, 1)
P. S. It seems to me that the only point to keep all these Dates in the database is to get a continuous calendar with holidays marked.
You may generate a calendar of arbitrary length in Oracle using the following construction:
SELECT :startDate + LEVEL
FROM dual
CONNECT BY
LEVEL < :length
and keep only holidays in Dates. A simple join will show you which Dates are holidays and which are not.
Ok, so maybe I've found a way. Something like this:
SELECT irangeId, aDate + sum(case when doUse = 1 then 0 else 1 end) over (partition by rangeId order by aDate) as aDate, qty
FROM Days INNER JOIN (
select irangeId, startDate + day - 1 as aDate, qty
from Range inner join Days using (irangeid)
) USING (aDate)
Now I just need a way to fill in the missing dates...
Edit: Nah, this way means that I'll miss the doUse value of the last dates...