I have a table with attendance dates in SQL Workbench/J. I need to define the attendance periods of the people, where any attendance period with gaps less or equal than 90 days are merged into a single attendance period and any gaps larger than that are considered a different attendance period. For example for a single person this is the table I have
id
year
start_date
end_date
prev_att_month
diff
1
2012
2012-08-01
2012-08-31
2012-07-01
31
1
2012
2012-07-01
2012-07-31
2012-04-01
91
1
2012
2012-04-01
2012-04-30
2012-03-01
31
1
2012
2012-03-01
2012-03-31
2012-02-01
29
1
2012
2012-02-01
2012-02-29
2012-01-01
31
1
2012
2012-01-01
2012-01-31
2011-12-01
31
1
2011
2011-12-01
2011-12-31
2011-11-01
30
1
2011
2011-11-01
2011-11-30
2011-10-01
31
1
2011
2011-10-01
2011-10-31
2011-09-01
30
1
2011
2011-09-01
2011-09-30
2011-08-01
31
1
2011
2011-08-01
2011-08-31
2011-07-01
31
1
2011
2011-07-01
2011-07-31
2011-05-01
61
1
2011
2011-05-01
2011-05-31
2011-04-01
30
1
2011
2011-04-01
2011-04-30
2011-03-01
31
1
2011
2011-03-01
2011-03-31
2011-02-01
28
1
2011
2011-02-01
2011-02-28
2010-08-01
184
1
2010
2010-08-01
2010-08-31
2010-07-01
31
1
2010
2010-07-01
2010-07-31
2010-06-01
30
1
2010
2010-06-01
2010-06-30
2010-05-01
31
1
2010
2010-05-01
2010-05-31
2010-04-01
30
1
2010
2010-04-01
2010-04-30
where I defined the previous attendance month column with a lag function and then found the difference between that column and the the start_date in the diff column. This way I can check the gaps between the attendance months
I want this output of the attendance periods with the 90 day rule explained above:
id
start_date
end_date
1
1/04/2010
31/08/2010
1
1/02/2011
30/04/2012
1
1/07/2012
31/08/2012
Does any one have an idea of how to do this?
So far I was just able to define the difference between the attendance months but since this is a large data set I have not been able to find a solution to define the attendance periods without making a row to row analysis.
with [table] as (
select id, year, start_date, end_date,
lag(start_date) over (partition by id order by id, year, start_date) as prev_att_month,
start_date-prev_att_month as diff
from source
)
select *
from [table]
where id = 1
One method would be to use a windowed COUNT to count how many times a value greater than 90 has appeared in the diff column, which provides a unique group number. Then you can just group your data into those groups and get the MIN and MAX values:
WITH Grps AS(
SELECT V.id,
V.year,
V.start_date,
V.end_date,
V.prev_att_month,
V.diff,
COUNT(CASE WHEN diff > 90 THEN 1 END) OVER (PARTITION BY ID ORDER BY V.start_date ASC) AS Grp
FROM (VALUES(1,2012,CONVERT(date,'20120801'),CONVERT(date,'20120831'),CONVERT(date,'2012-07-01'),31),
(1,2012,CONVERT(date,'20120701'),CONVERT(date,'20120731'),CONVERT(date,'2012-04-01'),91),
(1,2012,CONVERT(date,'20120401'),CONVERT(date,'20120430'),CONVERT(date,'2012-03-01'),31),
(1,2012,CONVERT(date,'20120301'),CONVERT(date,'20120331'),CONVERT(date,'2012-02-01'),29),
(1,2012,CONVERT(date,'20120201'),CONVERT(date,'20120229'),CONVERT(date,'2012-01-01'),31),
(1,2012,CONVERT(date,'20120101'),CONVERT(date,'20120131'),CONVERT(date,'2011-12-01'),31),
(1,2011,CONVERT(date,'20111201'),CONVERT(date,'20111231'),CONVERT(date,'2011-11-01'),30),
(1,2011,CONVERT(date,'20111101'),CONVERT(date,'20111130'),CONVERT(date,'2011-10-01'),31),
(1,2011,CONVERT(date,'20111001'),CONVERT(date,'20111031'),CONVERT(date,'2011-09-01'),30),
(1,2011,CONVERT(date,'20110901'),CONVERT(date,'20110930'),CONVERT(date,'2011-08-01'),31),
(1,2011,CONVERT(date,'20110801'),CONVERT(date,'20110831'),CONVERT(date,'2011-07-01'),31),
(1,2011,CONVERT(date,'20110701'),CONVERT(date,'20110731'),CONVERT(date,'2011-05-01'),61),
(1,2011,CONVERT(date,'20110501'),CONVERT(date,'20110531'),CONVERT(date,'2011-04-01'),30),
(1,2011,CONVERT(date,'20110401'),CONVERT(date,'20110430'),CONVERT(date,'2011-03-01'),31),
(1,2011,CONVERT(date,'20110301'),CONVERT(date,'20110331'),CONVERT(date,'2011-02-01'),28),
(1,2011,CONVERT(date,'20110201'),CONVERT(date,'20110228'),CONVERT(date,'2010-08-01'),184),
(1,2010,CONVERT(date,'20100801'),CONVERT(date,'20100831'),CONVERT(date,'2010-07-01'),31),
(1,2010,CONVERT(date,'20100701'),CONVERT(date,'20100731'),CONVERT(date,'2010-06-01'),30),
(1,2010,CONVERT(date,'20100601'),CONVERT(date,'20100630'),CONVERT(date,'2010-05-01'),31),
(1,2010,CONVERT(date,'20100501'),CONVERT(date,'20100531'),CONVERT(date,'2010-04-01'),30),
(1,2010,CONVERT(date,'20100401'),CONVERT(date,'20100430'),NULL,NULL))V(id,year,start_date,end_date,prev_att_month,diff))
SELECT id,
MIN(Start_date) AS Start_date,
MAX(End_Date) AS End_Date
FROM Grps
GROUP BY Id,
Grp
ORDER BY id,
Start_date;
I am using the following script to determine what the business days are for each particular month.
DECLARE #startdate DATETIME
SET #startdate ='20170401'
;
WITH bd AS(
SELECT
DATEADD(DAY,
CASE
(DATEPART(WEEKDAY, DATEADD(MONTH, DATEDIFF(MONTH, 0, #startdate), 0)) + ##DATEFIRST - 1) % 7
WHEN 6 THEN 2
WHEN 7 THEN 1
ELSE 0
END,
DATEADD(MONTH, DATEDIFF(MONTH, 0, #startdate), 0)
) AS bd, 1 AS n
UNION ALL
SELECT DATEADD(DAY,
CASE
(DATEPART(WEEKDAY, bd.bd) + ##DATEFIRST - 1) % 7
WHEN 5 THEN 3
WHEN 6 THEN 2
ELSE 1
END,
bd.bd
) AS db,
bd.n+1
FROM bd WHERE MONTH(bd.bd) = MONTH(#startdate)
)
SELECT * INTO #BD
FROM (
SELECT 'BD'+ CAST(n AS VARCHAR(5)) AS Expected_Date_Rule, bd AS Expected_Calendar_Date
from bd
) AS x
The result of this table works fine. Bd is the the calendar days for the particular month and n is the business day number. The script does its job of not counting the weekend as a business day.
bd n
----------------------- -----------
2017-04-03 00:00:00.000 1
2017-04-04 00:00:00.000 2
2017-04-05 00:00:00.000 3
2017-04-06 00:00:00.000 4
2017-04-07 00:00:00.000 5
2017-04-10 00:00:00.000 6
2017-04-11 00:00:00.000 7
2017-04-12 00:00:00.000 8
2017-04-13 00:00:00.000 9
2017-04-14 00:00:00.000 10
2017-04-17 00:00:00.000 11
2017-04-18 00:00:00.000 12
2017-04-19 00:00:00.000 13
2017-04-20 00:00:00.000 14
2017-04-21 00:00:00.000 15
2017-04-24 00:00:00.000 16
2017-04-25 00:00:00.000 17
2017-04-26 00:00:00.000 18
2017-04-27 00:00:00.000 19
2017-04-28 00:00:00.000 20
2017-05-01 00:00:00.000 21
But then I notice that a potential issue will occur in July where the output will count the 4th of July as BD2 when it should be counted as BD3. Some had created a holiday table that is updated with all the holidays (excuse the bad spelling).
Holiday table
1 2017-01-01 New Year Day
4 2017-01-02 New Year Day-Follow
1 2017-01-16 MArtin Luther King Day
4 2017-01-17 MArtin Luther King Day-Follow
1 2017-02-20 Preseiednt Day
4 2017-02-21 Preseiednt Day-Follow
1 2017-05-29 Memorial Day
4 2017-05-30 Memorial Day-Follow
1 2017-07-04 Independence Day
4 2017-07-05 Independence Day-Follow
1 2017-09-04 Labour Day
4 2017-09-05 Labour Day-Follow
1 2017-10-09 Columbus Day
4 2017-10-10 Columbus Day-Follow
1 2017-11-10 Vetrans Day
4 2017-11-11 Vetrans Day-Follow
1 2017-11-23 ThanksGiving
1 2017-11-24 Day After Thanks Giving
4 2017-11-24 ThanksGiving-Follow
4 2017-11-25 Day After Thanks Giving-Follow
1 2017-12-25 Christmas
4 2017-12-26 Christmas-Follow
I was thinking there may be some way I can update my script to check the holiday table and skip the holiday and dont count it as a business day. Any tips?
Please help me how can I break a date range into quarters of a year.Ex date range 1st Jan 2012 to 31st October 2013 should give me a result set of all 8 quarters.The results should be in following format, I am using SQL server 2008 :
Quarter Month start Month end
1 Jan-12 Mar-12
2 Apr-12 Jun-12
3 Jul-12 Sep-12
4 Oct-12 Dec-12
1 Jan-13 Mar-13
2 Apr-13 Jun-13
3 Jul-13 Sep-13
4 Oct-13 Oct-13
You'd need to look at the DATEPART(QUARTER,date) and break them up that way. Something akin to this:
select datepart(year, dateTarget) as theYear, num as theQuarter, min(dateTarget) as startDate, max(dateTarget) as endDate
from numbers
join dates on datepart(quarter, dateper) = num
where num between 1 and 4
group by datepart(year, dateTarget),num
Where the dates table is the table you're looking at, and numbers is, well, a numbers table (something I find pretty useful to just have around).
This gives you quarter start dates for 12 quarrters:
with calendar as (
select
--DATEFROMPARTS(year(getdate()),1,1) as [start],
convert(datetime, convert(char(4), year(getdate()))+'0101') as [start],
qtrsBack = 1
union all
select
dateadd(mm,-3,[start]),
qtrsBack+1
from calendar
where qtrsback < 12
)
select * from calendar
producing:
start qtrsBack
---------- -----------
2013-01-01 1
2012-10-01 2
2012-07-01 3
2012-04-01 4
2012-01-01 5
2011-10-01 6
2011-07-01 7
2011-04-01 8
2011-01-01 9
2010-10-01 10
2010-07-01 11
2010-04-01 12