Group dates by 7 days excluding specific dates - sql

I need query, where I could group dates by every 7 days from beginning of the month. The problem is I have to exclude some days, specifically days before/after holidays and holidays. In my DateDay dimension there's a column, thats indicates which type of day it is. Example of calendar for November:
DTD_GID DTD_Date DTD_DayType
20161101 2016-11-01 2 --holiday was on 2016-10-31
20161102 2016-11-02 0
20161103 2016-11-03 0
20161104 2016-11-04 0
20161105 2016-11-05 0
20161106 2016-11-06 0
20161107 2016-11-07 0
20161108 2016-11-08 0
20161109 2016-11-09 0
20161110 2016-11-10 2
20161111 2016-11-11 1--public holiday
20161112 2016-11-12 2
20161113 2016-11-13 0
20161114 2016-11-14 0
20161115 2016-11-15 0
20161116 2016-11-16 0
20161117 2016-11-17 0
20161118 2016-11-18 0
20161119 2016-11-19 0
20161120 2016-11-20 0
20161121 2016-11-21 0
20161122 2016-11-22 0
20161123 2016-11-23 0
20161124 2016-11-24 0
20161125 2016-11-25 0
20161126 2016-11-26 0
20161127 2016-11-27 0
20161128 2016-11-28 0
20161129 2016-11-29 0
20161130 2016-11-30 0
I need to group it like that:
1: 2016-11-02 - 2016-11-08 (inclusive)
2: 2016-11-13 - 2016-11-19
3: 2016-11-20 - 2016-11-26
If such group would have less than 7 days, it shouldn't be returned by query.
Let me know if you need more details.
EDIT: I'm not sure if it will help, but I wrote query that's counting proper days in weeks
SELECT
DTD_DTMGID
,CONVERT(VARCHAR(5), DATEADD(WK, Week, 0), 103) + ' - ' + CONVERT(VARCHAR(5), DATEADD(DD, 6, DATEADD(WK, Week, 0)), 103) AS Week
,Cnt
FROM (
SELECT
DTD_DTMGID
, DATEDIFF(WK, 0, DTD_DATE) AS Week
, COUNT(*) AS Cnt
FROM DIM_DateDay
WHERE DTD_DayType = 0
GROUP BY DTD_DTMGID ,DATEDIFF(WK, 0, DTD_DATE)
) AS X
ORDER BY DTD_DTMGID
and result:
DTD_DTMGID Week Cnt
201301 31/12 - 06/01 2
201301 07/01 - 13/01 5
201301 14/01 - 20/01 7
201301 21/01 - 27/01 7
201301 28/01 - 03/02 5
201302 28/01 - 03/02 2
EDIT2: As output I expect ID's of days that are in those groups. As ID's I mean DTD_GID column which is primary key in my DateDay dimension.
So for group 1) I'd get following list:
20161102
20161103
20161104
20161105
20161106
20161107
20161108

Here is one solution that gives you start and end date of each 7-day range:
WITH CTE1 AS (
SELECT DTD_Date, DATEDIFF(DAY, ROW_NUMBER() OVER (ORDER BY DTD_Date), DTD_Date) AS Group1
FROM #Table1
WHERE DTD_DayType = 0
), CTE2 AS (
SELECT DTD_Date, Group1, (ROW_NUMBER() OVER (PARTITION BY Group1 ORDER BY Group1) - 1) / 7 AS Group2
FROM CTE1
)
SELECT MIN(DTD_Date) AS DTD_From, MAX(DTD_Date) AS DTD_Upto, COUNT(DTD_Date) AS C
FROM CTE2
GROUP BY Group1, Group2
ORDER BY DTD_From
-- HAVING COUNT(*) >= 7
Output:
DTD_From | DTD_Upto | C
-----------+------------+--
2016-11-02 | 2016-11-08 | 7
2016-11-09 | 2016-11-09 | 1
2016-11-13 | 2016-11-19 | 7
2016-11-20 | 2016-11-26 | 7
2016-11-27 | 2016-11-30 | 4
Here is how it works:
The first CTE removes holidays and assigns a group number to remaining rows. Consecutive dates get same group number (see this question).
The second CTE assigns another group number to each row in each group. Row number 1-7 get 0, 8-14 get 1, and so on.
Finally you group the results by the group numbers.

Related

Get quarter start/end dates for more than a year (start year to current year)

I've been trying to get start and end dates range for each quarter given a specific date/year, like this:
SELECT DATEADD(mm, (quarter - 1) * 3, year_date) StartDate,
DATEADD(dd, 0, DATEADD(mm, quarter * 3, year_date)) EndDate
--quarter QuarterNo
FROM
(
SELECT '2012-01-01' year_date
) s CROSS JOIN
(
SELECT 1 quarter UNION ALL
SELECT 2 UNION ALL
SELECT 3 UNION ALL
SELECT 4
) q
which produces the following output:
2012-01-01 00:00:00 2012-04-01 00:00:00
2012-04-01 00:00:00 2012-07-01 00:00:00
2012-07-01 00:00:00 2012-10-01 00:00:00
2012-10-01 00:00:00 2013-01-01 00:00:00
Problem: I need to do this for a given start_date and end_date, the problem being the end_date=current_day, so how can I achieve this:
2012-01-01 00:00:00 2012-04-01 00:00:00
2012-04-01 00:00:00 2012-07-01 00:00:00
2012-07-01 00:00:00 2012-10-01 00:00:00
2012-10-01 00:00:00 2013-01-01 00:00:00
... ...
2021-01-01 00:00:00 2021-01-06 00:00:00
I think here is what you want to do :
SET startdatevar AS DATEtime = '2020-01-10'
;WITH RECURSIVE cte AS (
SELECT startdatevar AS startdate , DATEADD(QUARTER, 1 , startdatevar) enddate , 1 quarter
UNION ALL
SELECT enddate , CASE WHEN DATEADD(QUARTER, 1 , enddate) > CURRENT_DATE() THEN GETDATE() ELSE DATEADD(QUARTER, 1 , enddate) END enddate, quarter + 1
FROM cte
WHERE
cte.enddate <= CURRENT_DATE()
and quarter < 4
)
SELECT * FROM cte
to use your code , if you want to have more than 4 quarters :
SET quarter_limit = DATEDIFF(quarter , <startdate>,<enddate>)
;WITH RECURSIVE cte(q, qDate,enddate) as
(
select 1,
DATEFROMPARTS(year('2012-01-01'::date), 1, 1) -- First quarter date
,time_slice('2012-01-01'::date, 3, 'MONTH', 'END')
UNION ALL
select q+1,
DATEADD(q, 1, qdate) -- next quarter start date
,time_slice(qdate::date, (q+1)*3, 'MONTH', 'END')
from cte
where q < quarter_limit -- limiting the number of next quarters
AND cte.endDate <= <enddate>
)
SELECT * FROM cte
After #eshirvana's answer, I came up with this slightly change after your answer:
WITH RECURSIVE cte(q, qDate,enddate) as
(
select 1,
DATEFROMPARTS(year('2012-01-01'::date), 1, 1) -- First quarter date
,time_slice('2012-01-01'::date, 3, 'MONTH', 'END')
UNION ALL
select q+1,
DATEADD(q, 1, qdate) -- next quarter start date
,time_slice(qdate::date, (q+1)*3, 'MONTH', 'END')
from cte
where q <4 -- limiting the number of next quarters
AND cte.endDate <= CURRENT_DATE()
)
SELECT * FROM cte
Which works fine for whatever year I pass there (2012 will produce 4 records, 2021 just one, since we're still on the first quarter right now).
[EDIT]: it still doesn't work as expected after your 2nd code sugestion:
WITH RECURSIVE cte(q, qDate,enddate) as
(
select 1,
DATEFROMPARTS(year('2012-01-01'::date), 1, 1) -- First quarter date
,CASE WHEN time_slice('2012-01-01'::date, 3, 'MONTH', 'END') > CURRENT_DATE
THEN current_date
ELSE time_slice('2012-01-01'::date, 3, 'MONTH', 'END')
END
UNION ALL
select q+1,
DATEADD(q, 1, qdate) -- next quarter start date
,time_slice(qdate::date, (q+1)*3, 'MONTH', 'END')
from cte
where q < DATEDIFF(quarter , '2012-01-01'::date,'2021-01-06'::date)
AND cte.endDate <= '2021-01-06'::date
)
SELECT * FROM cte
is outputing this:
Sorry #eshirvana, it doesn't work as expected though. It all goes well to some point, but it's not returning all the records. Instead, it produces less records and wrong one, like this:
1 2012-01-01 2012-04-01
2 2012-04-01 2012-07-01
3 2012-07-01 2012-10-01
4 2012-10-01 2013-01-01
5 2013-01-01 2013-10-01
6 2013-04-01 2013-07-01
7 2013-07-01 2013-10-01
8 2013-10-01 2014-01-01
9 2014-01-01 2015-01-01
10 2014-04-01 2015-01-01
11 2014-07-01 2016-10-01
12 2014-10-01 2015-01-01
13 2015-01-01 2015-07-01
14 2015-04-01 2015-07-01
15 2015-07-01 2018-10-01
16 2015-10-01 2018-01-01
17 2016-01-01 2016-10-01
18 2016-04-01 2019-07-01
19 2016-07-01 2017-07-01
20 2016-10-01 2020-01-01
21 2017-01-01 2017-04-01
22 2017-04-01 2019-07-01
23 2017-07-01 2021-10-01
Although my logic it's still not ok for not printing just Q1 dates for 2021, could this output issues be related to date format or something?
Now, it seems to be working, at least for 2012-01-01 till today (2021-01-06).
The code :
WITH RECURSIVE cte(q, qDate,enddate) as
(
select
-- it might not be the first quarter, so better to protect that:
quarter('2012-01-01'::date)::numeric
, DATEFROMPARTS(year('2012-01-01'::date), 1, 1) -- First quarter date
, CASE WHEN time_slice('2012-01-01'::date, 3, 'MONTH', 'END') > '2021-01-06'::date
THEN '2021-01-06'::date
ELSE time_slice('2012-01-01'::date, 3, 'MONTH', 'END')
END
UNION ALL
select q+1
, DATEADD(q, 1, qdate) -- next quarter start date
,CASE WHEN time_slice(DATEADD(q, 1, qdate), 3, 'MONTH', 'END')> '2021-01-06'::date
THEN '2021-01-06'::date
ELSE time_slice(DATEADD(q, 1, qdate), 3, 'MONTH', 'END')
END
from cte
where q <= DATEDIFF(quarter , '2012-01-01'::date,'2021-01-06'::date)
AND cte.endDate <= '2021-01-06'::date
)
SELECT * FROM cte
The output:
1 2012-01-01 2012-04-01
2 2012-04-01 2012-07-01
3 2012-07-01 2012-10-01
4 2012-10-01 2013-01-01
5 2013-01-01 2013-04-01
6 2013-04-01 2013-07-01
7 2013-07-01 2013-10-01
8 2013-10-01 2014-01-01
9 2014-01-01 2014-04-01
10 2014-04-01 2014-07-01
11 2014-07-01 2014-10-01
12 2014-10-01 2015-01-01
13 2015-01-01 2015-04-01
14 2015-04-01 2015-07-01
15 2015-07-01 2015-10-01
16 2015-10-01 2016-01-01
17 2016-01-01 2016-04-01
18 2016-04-01 2016-07-01
19 2016-07-01 2016-10-01
20 2016-10-01 2017-01-01
21 2017-01-01 2017-04-01
22 2017-04-01 2017-07-01
23 2017-07-01 2017-10-01
24 2017-10-01 2018-01-01
25 2018-01-01 2018-04-01
26 2018-04-01 2018-07-01
27 2018-07-01 2018-10-01
28 2018-10-01 2019-01-01
29 2019-01-01 2019-04-01
30 2019-04-01 2019-07-01
31 2019-07-01 2019-10-01
32 2019-10-01 2020-01-01
33 2020-01-01 2020-04-01
34 2020-04-01 2020-07-01
35 2020-07-01 2020-10-01
36 2020-10-01 2021-01-01
37 2021-01-01 2021-01-06
In case you're wondering: yes, the idea is to present the end_date as last_day of the month+one. But it could easily be adapted.
It's not pretty, but I think it's somehow easy to understand.

Following Start and End Date Columns

I have start and end date columns, and there are some where the start date equals the end date of the previous row without a gap. I'm trying to get it so that it would basically go from the Start Date row who's End Date is null and kinda "zig-zag" up going until the Start Date does not match the End Date.
I've tried CTEs, and ROW_NUMBER() OVER().
START_DTE END_DTE
2018-01-17 2018-01-19
2018-01-26 2018-02-22
2018-02-22 2018-08-24
2018-08-24 2018-09-24
2018-09-24 NULL
Expected:
START_DTE END_DTE
2018-01-26 2018-09-24
EDIT
Using a proposed solution with an added CTE to ensure dates don't have times with them.
WITH
CTE_TABLE_NAME AS
(
SELECT
ID_NUM,
CONVERT(DATE,START_DTE) START_DTE,
CONVERT(DATE,END_DTE) END_DTE
FROM
TABLE_NAME
WHERE ID_NUM = 123
)
select min(start_dte) as start_dte, max(end_dte) as end_dte, grp
from (select t.*,
sum(case when prev_end_dte = end_dte then 0 else 1 end) over (order by start_dte) as grp
from (select t.*,
lag(end_dte) over (order by start_dte) as prev_end_dte
from CTE_TABLE_NAME t
) t
) t
group by grp;
The following query provides these results:
start_dte end_dte grp
2014-08-24 2014-12-19 1
2014-08-31 2014-09-02 2
2014-09-02 2014-09-18 3
2014-09-18 2014-11-03 4
2014-11-18 2014-12-09 5
2014-12-09 2015-01-16 6
2015-01-30 2015-02-02 7
2015-02-02 2015-05-15 8
2015-05-15 2015-07-08 9
2015-07-08 2015-07-09 10
2015-07-09 2015-08-25 11
2015-08-31 2015-09-01 12
2015-10-06 2015-10-29 13
2015-11-10 2015-12-11 14
2015-12-11 2015-12-15 15
2015-12-15 2016-01-20 16
2016-01-29 2016-02-01 17
2016-02-01 2016-03-03 18
2016-03-30 2016-08-29 19
2016-08-30 2016-12-06 20
2017-01-27 2017-02-20 21
2017-02-20 2017-08-15 22
2017-08-15 2017-08-29 23
2017-08-29 2018-01-17 24
2018-01-17 2018-01-19 25
2018-01-26 2018-02-22 26
2018-02-22 2018-08-24 27
2018-08-24 2018-09-24 28
2018-09-24 NULL 29
I tried using having count (*) > 1 as suggested, but it provided no results
Expected example
START_DTE END_DTE
2017-01-27 2018-01-17
2018-01-26 2018-09-24
You can identify where groups of connected rows start by looking for where adjacent rows are not connected. A cumulative sum of these starts then gives you the groups.
select min(start_dte) as start_dte, max(end_dte) as end_dte
from (select t.*,
sum(case when prev_end_dte = start_dte then 0 else 1 end) over (order by start_dte) as grp
from (select t.*,
lag(end_dte) over (order by start_dte) as prev_end_dte
from t
) t
) t
group by grp;
If you want only multiply connected rows (as implied by your question), then add having count(*) > 1 to the outer query.
Here is a db<>fiddle.

PostgreSQL - rank over rows listed in blocks of 0 and 1

I have a table that looks like:
id code date1 date2 block
--------------------------------------------------
20 1234 2017-07-01 2017-07-31 1
15 1234 2017-06-01 2017-06-30 1
13 1234 2017-05-01 2017-05-31 0
11 1234 2017-03-01 2017-03-31 0
9 1234 2017-02-01 2017-02-28 1
8 1234 2017-01-01 2017-01-31 0
7 1234 2016-11-01 2016-11-31 0
6 1234 2016-10-01 2016-10-31 1
2 1234 2016-09-01 2016-09-31 1
I need to rank the rows according to the blocks of 0's and 1's, like:
id code date1 date2 block desired_rank
-------------------------------------------------------------------
20 1234 2017-07-01 2017-07-31 1 1
15 1234 2017-06-01 2017-06-30 1 1
13 1234 2017-05-01 2017-05-31 0 2
11 1234 2017-03-01 2017-03-31 0 2
9 1234 2017-02-01 2017-02-28 1 3
8 1234 2017-01-01 2017-01-31 0 4
7 1234 2016-11-01 2016-11-31 0 4
6 1234 2016-10-01 2016-10-31 1 5
2 1234 2016-09-01 2016-09-31 1 5
I've tried to use rank() and dense_rank(), but the result I end up with is:
id code date1 date2 block dense_rank()
-------------------------------------------------------------------
20 1234 2017-07-01 2017-07-31 1 1
15 1234 2017-06-01 2017-06-30 1 2
13 1234 2017-05-01 2017-05-31 0 1
11 1234 2017-03-01 2017-03-31 0 2
9 1234 2017-02-01 2017-02-28 1 3
8 1234 2017-01-01 2017-01-31 0 3
7 1234 2016-11-01 2016-11-31 0 4
6 1234 2016-10-01 2016-10-31 1 4
2 1234 2016-09-01 2016-09-31 1 5
In the last table, the rank doesn't care about the rows, it just takes all the 1's and 0's as a unit and sets an ascending count starting at the first 1 and 0.
My query goes like this:
CREATE TEMP TABLE data (id integer,code text, date1 date, date2 date, block integer);
INSERT INTO data VALUES
(20,'1234', '2017-07-01','2017-07-31',1),
(15,'1234', '2017-06-01','2017-06-30',1),
(13,'1234', '2017-05-01','2017-05-31',0),
(11,'1234', '2017-03-01','2017-03-31',0),
(9, '1234', '2017-02-01','2017-02-28',1),
(8, '1234', '2017-01-01','2017-01-31',0),
(7, '1234', '2016-11-01','2016-11-30',0),
(6, '1234', '2016-10-01','2016-10-31',1),
(2, '1234', '2016-09-01','2016-09-30',1);
SELECT *,dense_rank() OVER (PARTITION BY code,block ORDER BY date2 DESC)
FROM data
ORDER BY date2 DESC;
By the way, the database is in postgreSQL.
I hope there's a workaround... Thanks :)
Edit: Note that the blocks of 0's and 1's aren't equal.
There's no way to get this result using a single Window Function:
SELECT *,
Sum(flag) -- now sum the 0/1 to create the "rank"
Over (PARTITION BY code
ORDER BY date2 DESC)
FROM
(
SELECT *,
CASE
WHEN Lag(block) -- check if this is the 1st row of a new block
Over (PARTITION BY code
ORDER BY date2 DESC) = block
THEN 0
ELSE 1
END AS flag
FROM DATA
) AS dt

Skip Holidays in Business day Table

I am using the following script to determine what the business days are for each particular month.
DECLARE #startdate DATETIME
SET #startdate ='20170401'
;
WITH bd AS(
SELECT
DATEADD(DAY,
CASE
(DATEPART(WEEKDAY, DATEADD(MONTH, DATEDIFF(MONTH, 0, #startdate), 0)) + ##DATEFIRST - 1) % 7
WHEN 6 THEN 2
WHEN 7 THEN 1
ELSE 0
END,
DATEADD(MONTH, DATEDIFF(MONTH, 0, #startdate), 0)
) AS bd, 1 AS n
UNION ALL
SELECT DATEADD(DAY,
CASE
(DATEPART(WEEKDAY, bd.bd) + ##DATEFIRST - 1) % 7
WHEN 5 THEN 3
WHEN 6 THEN 2
ELSE 1
END,
bd.bd
) AS db,
bd.n+1
FROM bd WHERE MONTH(bd.bd) = MONTH(#startdate)
)
SELECT * INTO #BD
FROM (
SELECT 'BD'+ CAST(n AS VARCHAR(5)) AS Expected_Date_Rule, bd AS Expected_Calendar_Date
from bd
) AS x
The result of this table works fine. Bd is the the calendar days for the particular month and n is the business day number. The script does its job of not counting the weekend as a business day.
bd n
----------------------- -----------
2017-04-03 00:00:00.000 1
2017-04-04 00:00:00.000 2
2017-04-05 00:00:00.000 3
2017-04-06 00:00:00.000 4
2017-04-07 00:00:00.000 5
2017-04-10 00:00:00.000 6
2017-04-11 00:00:00.000 7
2017-04-12 00:00:00.000 8
2017-04-13 00:00:00.000 9
2017-04-14 00:00:00.000 10
2017-04-17 00:00:00.000 11
2017-04-18 00:00:00.000 12
2017-04-19 00:00:00.000 13
2017-04-20 00:00:00.000 14
2017-04-21 00:00:00.000 15
2017-04-24 00:00:00.000 16
2017-04-25 00:00:00.000 17
2017-04-26 00:00:00.000 18
2017-04-27 00:00:00.000 19
2017-04-28 00:00:00.000 20
2017-05-01 00:00:00.000 21
But then I notice that a potential issue will occur in July where the output will count the 4th of July as BD2 when it should be counted as BD3. Some had created a holiday table that is updated with all the holidays (excuse the bad spelling).
Holiday table
1 2017-01-01 New Year Day
4 2017-01-02 New Year Day-Follow
1 2017-01-16 MArtin Luther King Day
4 2017-01-17 MArtin Luther King Day-Follow
1 2017-02-20 Preseiednt Day
4 2017-02-21 Preseiednt Day-Follow
1 2017-05-29 Memorial Day
4 2017-05-30 Memorial Day-Follow
1 2017-07-04 Independence Day
4 2017-07-05 Independence Day-Follow
1 2017-09-04 Labour Day
4 2017-09-05 Labour Day-Follow
1 2017-10-09 Columbus Day
4 2017-10-10 Columbus Day-Follow
1 2017-11-10 Vetrans Day
4 2017-11-11 Vetrans Day-Follow
1 2017-11-23 ThanksGiving
1 2017-11-24 Day After Thanks Giving
4 2017-11-24 ThanksGiving-Follow
4 2017-11-25 Day After Thanks Giving-Follow
1 2017-12-25 Christmas
4 2017-12-26 Christmas-Follow
I was thinking there may be some way I can update my script to check the holiday table and skip the holiday and dont count it as a business day. Any tips?

Pulling Quarters from date range

Please help me how can I break a date range into quarters of a year.Ex date range 1st Jan 2012 to 31st October 2013 should give me a result set of all 8 quarters.The results should be in following format, I am using SQL server 2008 :
Quarter Month start Month end
1 Jan-12 Mar-12
2 Apr-12 Jun-12
3 Jul-12 Sep-12
4 Oct-12 Dec-12
1 Jan-13 Mar-13
2 Apr-13 Jun-13
3 Jul-13 Sep-13
4 Oct-13 Oct-13
You'd need to look at the DATEPART(QUARTER,date) and break them up that way. Something akin to this:
select datepart(year, dateTarget) as theYear, num as theQuarter, min(dateTarget) as startDate, max(dateTarget) as endDate
from numbers
join dates on datepart(quarter, dateper) = num
where num between 1 and 4
group by datepart(year, dateTarget),num
Where the dates table is the table you're looking at, and numbers is, well, a numbers table (something I find pretty useful to just have around).
This gives you quarter start dates for 12 quarrters:
with calendar as (
select
--DATEFROMPARTS(year(getdate()),1,1) as [start],
convert(datetime, convert(char(4), year(getdate()))+'0101') as [start],
qtrsBack = 1
union all
select
dateadd(mm,-3,[start]),
qtrsBack+1
from calendar
where qtrsback < 12
)
select * from calendar
producing:
start qtrsBack
---------- -----------
2013-01-01 1
2012-10-01 2
2012-07-01 3
2012-04-01 4
2012-01-01 5
2011-10-01 6
2011-07-01 7
2011-04-01 8
2011-01-01 9
2010-10-01 10
2010-07-01 11
2010-04-01 12