How do I get the month number with the maximum number of days from the date range? - sql

I have a table with 10 million rows, where there are two columns that contain the start date and the end date of the range. For example, 2019-09-25 and 2019-10-20. I want to extract the month number with the maximum number of days, in this example it will be 10. In addition to dates that are separated by one month, there are also such examples: 2019-07-01 and 2019-07-29 (within one month), as well as 2019-07-01 and 2019-09-05 (more than one month). How can I implement this?

Seems like you could do something like this:
SELECT CASE WHEN DATEDIFF(DAY, DATEFROMPARTS(YEAR(EndDate),MONTH(EndDate),1),EndDate) >= DATEDIFF(DAY, StartDate, EOMONTH(StartDate)) THEN DATEPART(MONTH,EndDate)
ELSE DATEPART(MONTH,StartDate)
END
FROM (VALUES('20190925','20191020'))V(StartDate,EndDate);
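The CASE expression above compares only the start month and the end month, so a month lying entirely inside a longer range is never considered. A rough sketch of a more general variant, assuming SQL Server 2012+ and ranges that touch at most 12 calendar months, counts the days each month contributes and keeps the largest. The CTE names (Ranges, Months, Overlap, Ranked) and the 0..11 offset list are illustrative and not part of the original answer; ties go to the earliest month, which is an arbitrary choice.

```sql
WITH Ranges AS (
    -- sample ranges taken from the question
    SELECT CAST(v.StartDate AS date) AS StartDate, CAST(v.EndDate AS date) AS EndDate
    FROM (VALUES ('20190925','20191020'),
                 ('20190701','20190729'),
                 ('20190701','20190905')) v(StartDate, EndDate)
),
Months AS (
    -- one row per calendar month touched by each range (assumption: at most 12)
    SELECT r.StartDate, r.EndDate,
           DATEADD(MONTH, n.n, DATEFROMPARTS(YEAR(r.StartDate), MONTH(r.StartDate), 1)) AS MonthStart
    FROM Ranges r
    JOIN (VALUES (0),(1),(2),(3),(4),(5),(6),(7),(8),(9),(10),(11)) n(n)
      ON n.n <= DATEDIFF(MONTH, r.StartDate, r.EndDate)
),
Overlap AS (
    -- days of the range that fall inside each month, both ends inclusive
    SELECT m.StartDate, m.EndDate, m.MonthStart,
           DATEDIFF(DAY,
                    CASE WHEN m.StartDate > m.MonthStart THEN m.StartDate ELSE m.MonthStart END,
                    CASE WHEN m.EndDate < EOMONTH(m.MonthStart) THEN m.EndDate ELSE EOMONTH(m.MonthStart) END
                   ) + 1 AS DaysInRange
    FROM Months m
),
Ranked AS (
    SELECT o.*,
           ROW_NUMBER() OVER (PARTITION BY o.StartDate, o.EndDate
                              ORDER BY o.DaysInRange DESC, o.MonthStart) AS rn
    FROM Overlap o
)
SELECT StartDate, EndDate, MONTH(MonthStart) AS MonthNumber, DaysInRange
FROM Ranked
WHERE rn = 1;
```

For the question's first example (2019-09-25 to 2019-10-20) this returns month 10 with 20 days in range.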

Does the following fit your requirements?
You can build a table of days-in-month (this would be permanent ideally)
and then join to it using the month numbers of your min and max dates.
declare @start date='20190925', @end date='20191020';
--declare @start date='20190701', @end date='20190729';
--declare @start date='20190701', @end date='20190905';
with dim as (
select m,DAY(DATEADD(DD,-1,DATEADD(mm, DATEDIFF(mm, 0, DateFromParts(Year(GetDate()),m,1) )+1, 0)))d
from (values(1),(2),(3),(4),(5),(6),(7),(8),(9),(10),(11),(12))m(m)
)
select top(1) with ties m
from dim
where m between Month(@start) and Month(@end)
order by d desc
You don't state how to pick a winner when several months have the same number of days, so WITH TIES returns all qualifying months.
Edit
I don't know whether there is a requirement to span years (the sample data suggests not). However, with a permanent list of dates and corresponding days-in-month values (often part of a calendar table), a slight tweak will accommodate it.
with dim as (
select Year(@start)*100 + m m, Day(DATEADD(DD,-1,DATEADD(mm, DATEDIFF(mm, 0, DateFromParts(Year(@start),m,1) )+1, 0)))d
from (values(1),(2),(3),(4),(5),(6),(7),(8),(9),(10),(11),(12))m(m)
union all
select Year(@end)*100 + m m, Day(DATEADD(DD,-1,DATEADD(mm, DATEDIFF(mm, 0, DateFromParts(Year(@end),m,1) )+1, 0)))d
from (values(1),(2),(3),(4),(5),(6),(7),(8),(9),(10),(11),(12))m(m)
)
select top(1) with ties m
from dim
where m between Year(@start)*100 + Month(@start) and Year(@end)*100 + Month(@end)
order by d desc
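On SQL Server 2012 or later, the days-in-month value could probably be computed more directly with EOMONTH, which the first answer already uses. A small sketch; @year is a hypothetical variable introduced only for illustration.

```sql
DECLARE @year int = 2019;  -- hypothetical year, pick whichever applies

SELECT m.m AS MonthNumber,
       -- last day of the month, taken as a day number = days in that month
       DAY(EOMONTH(DATEFROMPARTS(@year, m.m, 1))) AS DaysInMonth
FROM (VALUES (1),(2),(3),(4),(5),(6),(7),(8),(9),(10),(11),(12)) m(m);
```

This returns the same m/d pairs as the dim CTE above, without the nested DATEADD/DATEDIFF arithmetic.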

You could try something like this
with
l0(n) as (
select 1 n
from (values (1),(1),(1),(1),(1),(1),(1),(1)) as v(n))
select top(1) with ties
vTable.*, calc.dt month_with_most_days
from (values ('20190925','20191020'),
('20190925','20191120')) vTable(startdate, enddate)
cross apply (values (datediff(month, vTable.startdate, vTable.enddate))) diff(mo_count)
cross apply (select top (diff.mo_count+1)
row_number() over (order by (select null)) n
from l0 l1, l0 l2, l0 l3, l0 l4) tally /* 8^4 months possible */
cross apply (values (cast(case when tally.n=1 then startdate
when tally.n=diff.mo_count+1 then enddate
else eomonth(dateadd(month, tally.n-1, startdate)) end as date))) calc(dt)
order by row_number() over (partition by startdate, enddate
order by day(calc.dt) desc);
startdate  enddate   month_with_most_days
20190925   20191020  2019-09-25
20190925   20191120  2019-10-31

Related

SQL Server - Split year into 4 weekly periods

I would like to split up the year into 13 periods with 4 weeks in each
52 weeks a year / 4 = 13 even periods
I would like each period to start on a Saturday and end on a Friday.
It should look like the below image
Obviously I could do this manually, but the dates would change each year and I am looking for a way to automate this with SQL rather than manually do this for each upcoming year
Is there a way to produce this yearly split automatically?
In this previous answer I show an approach to create a numbers/date table. Such a table is very handy in many places.
With this approach you might try something like this:
CREATE TABLE dbo.RunningNumbers(Number INT NOT NULL,CalendarDate DATE NOT NULL, CalendarYear INT NOT NULL,CalendarMonth INT NOT NULL,CalendarDay INT NOT NULL, CalendarWeek INT NOT NULL, CalendarYearDay INT NOT NULL, CalendarWeekDay INT NOT NULL);
DECLARE @CountEntries INT = 100000;
DECLARE @StartNumber INT = 0;
WITH E1(N) AS(SELECT 1 FROM(VALUES (1),(1),(1),(1),(1),(1),(1),(1),(1),(1))t(N)), --10 ^ 1
E2(N) AS(SELECT 1 FROM E1 a CROSS JOIN E1 b), -- 10 ^ 2 = 100 rows
E4(N) AS(SELECT 1 FROM E2 a CROSS JOIN E2 b), -- 10 ^ 4 = 10,000 rows
E8(N) AS(SELECT 1 FROM E4 a CROSS JOIN E4 b), -- 10 ^ 8 = 10,000,000 rows
CteTally AS
(
SELECT TOP(ISNULL(@CountEntries,1000000)) ROW_NUMBER() OVER(ORDER BY(SELECT NULL)) -1 + ISNULL(@StartNumber,0) As Nmbr
FROM E8
)
INSERT INTO dbo.RunningNumbers
SELECT CteTally.Nmbr,CalendarDate.d,CalendarExt.*
FROM CteTally
CROSS APPLY
(
SELECT DATEADD(DAY,CteTally.Nmbr,{ts'1900-01-01 00:00:00'})
) AS CalendarDate(d)
CROSS APPLY
(
SELECT YEAR(CalendarDate.d) AS CalendarYear
,MONTH(CalendarDate.d) AS CalendarMonth
,DAY(CalendarDate.d) AS CalendarDay
,DATEPART(WEEK,CalendarDate.d) AS CalendarWeek
,DATEPART(DAYOFYEAR,CalendarDate.d) AS CalendarYearDay
,DATEPART(WEEKDAY,CalendarDate.d) AS CalendarWeekDay
) AS CalendarExt;
GO
NTILE - SQL Server 2008+ will create (almost) even chunks.
This is the actual query:
SELECT *,NTILE(13) OVER(ORDER BY CalendarDate) AS Periode
FROM RunningNumbers
WHERE CalendarWeekDay=6
AND CalendarDate>={d'2017-01-01'} AND CalendarDate <= {d'2017-12-31'};
GO
--Careful with existing data!
--DROP TABLE dbo.RunningNumbers;
Hint 1: Place indexes!
Hint 2: Read the link about NTILE, especially the Remark-section.
I think this will fit for this case. You might think about using Prdp's approach with ROW_NUMBER() in connection with INT division. But (big advantage!) NTILE would allow PARTITION BY CalendarYear.
Hint 3: You might add a column to the table where you set the period's number as a fixed value. This will make future queries very easy and would allow manual correction in special cases (53rd week...).
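To illustrate the PARTITION BY remark in Hint 2, the query could be extended over several years roughly like this. The sketch assumes the dbo.RunningNumbers table from above and keeps the answer's CalendarWeekDay = 6 filter, whose meaning depends on the DATEFIRST setting in effect when the table was filled (6 is Saturday under DATEFIRST 1, but Friday under DATEFIRST 7, the us_english default). The year range is an arbitrary example.

```sql
SELECT CalendarYear,
       CalendarDate AS WeekStart,
       -- numbering restarts at 1 for every calendar year
       NTILE(13) OVER (PARTITION BY CalendarYear ORDER BY CalendarDate) AS Periode
FROM dbo.RunningNumbers
WHERE CalendarWeekDay = 6                 -- see the DATEFIRST caveat above
  AND CalendarYear BETWEEN 2017 AND 2019  -- arbitrary example range
ORDER BY CalendarYear, CalendarDate;
```

In a year with 53 matching days, NTILE gives the first period five weeks instead of four, because larger groups come first.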
Here is one way using Calendar table
DECLARE @start DATE = '2017-04-01',
@end_date DATE = '2017-12-31'
SET DATEFIRST 7;
WITH Calendar
AS (SELECT 1 AS id,
@start AS start_date,
Dateadd(dd, 6, @start) AS end_date
UNION ALL
SELECT id + 1,
Dateadd(week, 1, start_date),
Dateadd(week, 1, end_date)
FROM Calendar
WHERE end_date < @end_date)
SELECT id,
( Row_number()OVER(ORDER BY id) - 1 ) / 4 + 1 AS Period,
start_date,
end_date
FROM Calendar
OPTION (maxrecursion 0)
I have generated the dates using a recursive CTE, but it is better to create a physical calendar table and use it in queries like this.
Firstly, you will never get 52 even weeks in a year, there are overlap weeks in most calendar standards. You will occasionally get a week 53.
You can tell SQL to use Saturday as the first day of the week with datefirst, then running a datepart on today's date with getdate() will tell you the week of the year:
SET datefirst 6 -- 6 is Saturday
SELECT datepart(ww,getdate()) as currentWeek
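You could then divide this by 4 and apply CEILING to get the 4-week split. Divide by 4.0 rather than 4, because DATEPART returns an int and integer division would truncate the result before CEILING is applied:
SET datefirst 6
SELECT DATEPART(ww,getdate()) as currentWeek,
CEILING(DATEPART(ww,getdate())/4.0) as four_week_split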

Group by contiguous dates and Count

I have a table which contains information about reports being accessed, along with the date. I need to group the accessed reports by date range and count them.
I'm using T-SQL
Table
EventId ReportId Date
60 4 11/24/2015
59 11 11/23/2015
58 6 11/22/2015
57 11 11/22/2015
56 9 11/21/2015
55 3 11/20/2015
54 5 11/20/2015
53 6 11/19/2015
52 5 11/19/2015
51 4 11/18/2015
50 3 11/17/2015
49 9 11/16/2015
If days' difference is 3 then I need result in the format
StartDate EndDate ReportsAccessed
11/22/2015 11/24/2015 4
11/19/2015 11/21/2015 5
11/16/2015 11/18/2015 3
but the difference between days could change.
Assuming you have values for all the dates, then you can calculate the difference in days between each date and the maximum (or minimum) date. Then divide this by three and use that for aggregation:
select min(date), max(date), count(*) as ReportsAccessed
from (select t.*, max(date) over () as maxd
from table t
) t
group by (datediff(day, date, maxd) / 3)
order by min(date);
"3" is what I think you are referring to as the "difference in days".
Those 2 blocks are simply for added clarity on what parameters you'd have to change
DECLARE @t as TABLE(
id int identity(1,1),
reportId int,
dateAccess date)
DECLARE @NumberOfDays int=3;
And here comes the actual select
Select StartDate, EndDate, COUNT(reportId) from
(
select *,
DATEADD(day, DATEDIFF(DAY, dateAccess, maxdate.maxdate)%@NumberOfDays, dateAccess) as EndDate,
DATEADD(day, DATEDIFF(DAY, dateAccess, maxdate.maxdate)%@NumberOfDays-@NumberOfDays+1, dateAccess) as StartDate
from @t, (select MAX(dateAccess) maxdate from @t t2) maxdate
) results
GROUP BY StartDate, EndDate
ORDER BY StartDate desc
There are a few places I'm unsure if it's optimized or not, for instance cross joining with select max(date) instead of using a subquery, but that returns the exact result from your OP.
Basically, I simply split the entries into groups based on how far they are from the MAX(date), and then use a COUNT. On that note, it might be more useful to use COUNT(DISTINCT ...); otherwise, if someone looks at document #9 three times, it will tell you that 3 documents were checked when only 1 was truly looked at.
The upside with using MAX(date) over MIN(date) is that your first group will always have the maximal amount of days. This will prove very useful if you want to compare the last few periods to the average. The downside is that you don't have stable data. With every new entry (assuming it's a new day), your query will cycle itself to produce a new set of results. If you wanted to graph the data, you'd be better comparing to MIN(date) that way the first days won't change when you add a new one.
Depending on the usage, it could even be useful to extrapolate the number of accesses done in the last period (in that case MIN(date) is also preferable).
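The MIN(date) variant discussed above could be sketched as follows, so that earlier periods stay stable as new days arrive. It reuses the question's sample rows in a table variable; the bucket and mind names, and computing the bucket in a CROSS APPLY, are illustrative choices rather than part of the original answer.

```sql
DECLARE @NumberOfDays int = 3;
DECLARE @t TABLE (reportId int, dateAccess date);
INSERT INTO @t (reportId, dateAccess) VALUES
 (9,'20151116'),(3,'20151117'),(4,'20151118'),
 (5,'20151119'),(6,'20151119'),(5,'20151120'),(3,'20151120'),(9,'20151121'),
 (11,'20151122'),(6,'20151122'),(11,'20151123'),(4,'20151124');

SELECT DATEADD(DAY, b.bucket * @NumberOfDays, m.mind)           AS StartDate,
       DATEADD(DAY, (b.bucket + 1) * @NumberOfDays - 1, m.mind) AS EndDate,
       COUNT(*)                                                 AS ReportsAccessed
FROM @t t
CROSS JOIN (SELECT MIN(dateAccess) AS mind FROM @t) m
-- bucket = number of whole periods between the earliest date and this row
CROSS APPLY (VALUES (DATEDIFF(DAY, m.mind, t.dateAccess) / @NumberOfDays)) b(bucket)
GROUP BY b.bucket, m.mind
ORDER BY StartDate;
```

With the sample rows this yields counts of 3, 5 and 4 for the periods starting 2015-11-16, 2015-11-19 and 2015-11-22. The boundaries coincide with the MAX-anchored result only because the data spans exactly nine days.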
Here's an adaptation of Gordon's answer that's probably much more optimized (it's at the very least much more aesthetic) :
-- EndDate is the last day of the row's 3-day period; StartDate is two days earlier
SELECT DateADD(day, -(datediff(day, dateAccess, maxdate)/3)*3, maxdate) as EndDate,
DateADD(day, -(datediff(day, dateAccess, maxdate)/3)*3 - 2, maxdate) as StartDate,
count(reportId)
from (select *, MAX(dateAccess) over() as maxdate from @t) t
GROUP BY datediff(day, dateAccess, maxdate)/3, maxdate
I would argue that the most efficient way of doing this is to use a tally table. That way you get sargable predicates, with all the benefits of indexes on the date column:
declare @c int = 3
;with minmax as(select min(date) as mind, max(date) as maxd from t),
tally as(select @c * (-1 + row_number() over(order by(select null))) as rn
from master..spt_values),
intervals as(select dateadd(dd, rn, mind) as f, dateadd(dd, rn + @c - 1, mind) t
from tally t cross join minmax m where dateadd(dd, rn, mind) <= maxd)
select i.f as [from], i.t as [to], count(*) as reeports
from intervals i
join t on t.date >= i.f and t.date <= i.t
group by i.f, i.t
Explanation: minmax selects the minimum and maximum dates from the table.
tally generates multiples of the interval length, starting at 0 (how many depends on the system, but it is enough to calculate the intervals). intervals selects the resulting intervals. The last part is a simple join on intervals to calculate the counts per interval.
Fiddle http://sqlfiddle.com/#!3/c61d1/5

How can I sum values per day and then plot them on calendar from start date to last date

I have a table, part of which is given below. It contains multiple values (durations) per day. I need two things: 1) the sum of durations per day, and 2) those sums plotted on a calendar whose start date is First_Date from the table and whose last date is Last_Update from the table. Dates with no duration should show 0. I think it will be something like the code below, but I need help.
;WITH AllDates AS(
SELECT @Fromdate As TheDate
UNION ALL
SELECT TheDate + 1
FROM AllDates
WHERE TheDate + 1 <= @ToDate
)SELECT UserId,
TheDate,
COALESCE(
SUM(
-- When the game starts and ends in the same date
CASE WHEN DATEDIFF(DAY, GameStartTime, GameEndTime) = 0
Here is what I am looking for
Another way to generate the date range you are after would be something like .....
;WITH DateLimits AS
(
SELECT MIN(First_Date) FirstDate
,MAX(Last_Update) LastDate
FROM TableName
),
DateRange AS
(
SELECT TOP (SELECT DATEDIFF(DAY,FirstDate,LastDate ) + 1 FROM DateLimits) -- +1 so both end dates are included
DATEADD(DAY
,ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) - 1 -- offset starts at 0 so FirstDate itself is returned
, (SELECT FirstDate FROM DateLimits)
) AS Dates
FROM master..spt_values a cross join master..spt_values b
)
SELECT * FROM DateRange --<-- you have the desired date range here
-- other query whatever you need.
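The "other query" step could look roughly like the following: left join the generated dates to the data and let COALESCE turn missing days into 0. The table variable @Durations and its columns EventDate and Duration are hypothetical stand-ins, since the question's actual table layout is not shown; the limits are read into variables here only to keep the sketch short.

```sql
DECLARE @Durations TABLE (EventDate date, Duration int);  -- hypothetical table and columns
INSERT INTO @Durations (EventDate, Duration)
VALUES ('20200101', 10), ('20200101', 5), ('20200104', 7);

DECLARE @FirstDate date, @LastDate date;
SELECT @FirstDate = MIN(EventDate), @LastDate = MAX(EventDate) FROM @Durations;

WITH DateRange AS (
    -- one row per day from @FirstDate to @LastDate, both inclusive
    SELECT TOP (DATEDIFF(DAY, @FirstDate, @LastDate) + 1)
           DATEADD(DAY, CAST(ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) AS int) - 1, @FirstDate) AS TheDate
    FROM master..spt_values
)
SELECT r.TheDate,
       COALESCE(SUM(d.Duration), 0) AS TotalDuration  -- 0 for days without rows
FROM DateRange r
LEFT JOIN @Durations d ON d.EventDate = r.TheDate
GROUP BY r.TheDate
ORDER BY r.TheDate;
```

With the three sample rows this returns 15 for 2020-01-01, 0 for the next two days, and 7 for 2020-01-04.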

SQL that lists x records with the Weeknumber and Monday's date for each week

I'm looking for an SQL query that would provide me a list of the Weeknumber and the Monday's date for that particular week.
For example:
WeekNumber DateMonday
39 2013-09-23
40 2013-09-30
... ...
The following just produces one week:
select
(DATEPART(ISO_WEEK,(CAST(getdate() as DATETIME)))) as WeekNumber,
DATEADD(wk, DATEDIFF(d, 0, CAST(getdate() as DATETIME)) / 7, 0) AS DateMonday
If you don't have a numbers table you can generate a list of sequential numbers on the fly using system tables:
e.g
SELECT Number = ROW_NUMBER() OVER(ORDER BY object_id)
FROM sys.all_objects;
If you need to extend this for more numbers you can CROSS JOIN tables:
SELECT Number = ROW_NUMBER() OVER(ORDER BY a.object_id)
FROM sys.all_objects a
CROSS JOIN sys.all_objects b;
Then you just need to add/subtract these number of weeks from your starting date:
DECLARE @Monday DATE = DATEADD(WEEK, DATEDIFF(WEEK, 0, GETDATE()), 0);
WITH Numbers AS
( SELECT Number = ROW_NUMBER() OVER(ORDER BY object_id)
FROM sys.all_objects
)
SELECT WeekNumber = DATEPART(ISO_WEEK, w.DateMonday),
w.DateMonday
FROM ( SELECT DateMonday = DATEADD(WEEK, - n.Number, @Monday)
FROM Numbers n
) w;
This is a verbose way of doing this for step by step clarity, it can be condensed to:
SELECT WeekNumber = DATEPART(ISO_WEEK, w.DateMonday),
w.DateMonday
FROM ( SELECT DateMonday = DATEADD(WEEK, DATEDIFF(WEEK, 0, GETDATE()) - ROW_NUMBER() OVER(ORDER BY object_id), 0)
FROM sys.all_objects
) w;
Example on SQL Fiddle
Aaron Bertrand has done some in-depth comparisons of ways to generate sequential lists of numbers:
Generate a set or sequence without loops – part 1
Generate a set or sequence without loops – part 2
Generate a set or sequence without loops – part 3
Of course, the easiest way to do this would be to create a calendar table.

Create a weekCount column in SQL Server 2012

I have this data:
id worked_date
-----------------
1 2013-09-25
2 2013-09-26
3 2013-10-01
4 2013-10-04
5 2013-10-07
I want to add a column called weekCount. The base date is 2013-09-25, so all rows with worked_date from 2013-09-25 to 2013-10-01 get weekCount 1, rows from 2013-10-02 to 2013-10-08 get weekCount 2, and so on. How can that be done?
Thanks.
Here's one way using DATEDIFF:
select id,
worked_date,
1 + (datediff(day, '2013-09-25', worked_date) / 7) weekCount
from yourtable
SQL Fiddle Demo
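As a quick check of the integer-division arithmetic, the same expression can be run against the question's five sample rows inlined as a VALUES list (the alias t is illustrative):

```sql
SELECT t.id,
       t.worked_date,
       -- whole weeks elapsed since the base date, shifted to start at 1
       1 + DATEDIFF(DAY, '20130925', t.worked_date) / 7 AS weekCount
FROM (VALUES (1, CAST('20130925' AS date)),
             (2, CAST('20130926' AS date)),
             (3, CAST('20131001' AS date)),
             (4, CAST('20131004' AS date)),
             (5, CAST('20131007' AS date))) t(id, worked_date);
```

Rows 1 to 3 are 0, 1 and 6 days after the base date and get weekCount 1; rows 4 and 5 are 9 and 12 days after it and get weekCount 2.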
Perhaps an approach like this will solve your problem.
I compute an in-memory table that contains the week's boundaries along with a monotonically increasing number (BuildWeeks). I then compare my worked_date values to my date boundaries. Based on your comment to @sgeddes, you need the reverse week number so I then use a DENSE_RANK function to calculate the ReverseWeekNumber.
WITH BOT(StartDate) AS
(
SELECT CAST('2013-09-25' AS date)
)
, BuildWeeks (WeekNumber, StartOfWeek, EndOfWeek) AS
(
SELECT
N.number AS WeekNumber
, DateAdd(week, N.number -1, B.StartDate) AS StartOfWeek
, DateAdd(d, -1, DateAdd(week, N.number, B.StartDate)) AS EndOfWeek
FROM
dbo.Numbers AS N
CROSS APPLY
BOT AS B
)
SELECT
M.*
, BW.*
, DENSE_RANK() OVER (ORDER BY BW.WeekNumber DESC) AS ReverseWeekNumber
FROM
dbo.MyTable M
INNER JOIN
BuildWeeks AS BW
ON M.worked_date BETWEEN BW.StartOfWeek AND BW.EndOfWeek
;
SQLFiddle
If you are looking for a Fiscal Week number, I would use a function that would calculate the week:
CREATE FUNCTION FiscalWeek(@FiscalStartDate datetime, @EvalDate datetime)
RETURNS INT
AS
BEGIN
DECLARE @weekNumber INT = (DATEDIFF(DAY, @FiscalStartDate, @EvalDate) / 7) + 1
RETURN (@weekNumber % 52)
END
GO
GO
If you used a fiscal starting date of '2013-09-25' and an evaluation date of '2014-09-25' you would get a week number of 1.
Using a function gives you a little more flexibility to do whatever you need.
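One thing to watch with the modulo as written: week 52 maps to 0, because 52 % 52 = 0. If a 1 to 52 numbering is wanted, applying the modulo before adding 1 is one possible adjustment. The comparison below is only a sketch; the column names AsWritten and OneBased are made up for illustration.

```sql
SELECT d.EvalDate,
       (DATEDIFF(DAY, '20130925', d.EvalDate) / 7 + 1) % 52 AS AsWritten, -- same arithmetic as the function
       (DATEDIFF(DAY, '20130925', d.EvalDate) / 7) % 52 + 1 AS OneBased   -- modulo first, then shift to 1..52
FROM (VALUES (CAST('20130925' AS date)),
             (CAST('20140917' AS date)),
             (CAST('20140925' AS date))) d(EvalDate);
```

For 2014-09-17 (357 days after the start) the first column gives 0 and the second gives 52; for the other two dates both columns give 1.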
Perhaps not the most elegant way but this works for me to get the top rank number:
WITH CTE AS (
SELECT employee_id, DENSE_RANK() OVER (ORDER BY DATEDIFF(DAY, '20130925', worked_date )/7 DESC) AS weekRank
FROM Timesheet
)
SELECT TOP (1) weekRank
FROM CTE
WHERE employee_id=@employee_id
ORDER BY weekRank DESC
This is how I can create weekRank column and pass a parameter dynamically:
WITH rank_cte AS (
SELECT timesheet_id,employee_id, worked_date,
dateadd(week, datediff(day,'20000105',worked_date) / 7, '20000105') AS WeekStart,
dateadd(week, datediff(day,'20000105',worked_date) / 7, '20000105')+6 AS WeekEnd,
DENSE_RANK() OVER (ORDER BY 1 + DATEDIFF(DAY, '20130925', worked_date )/7 DESC) AS weekRank
FROM Timesheet
)
SELECT timesheet_id, worked_date, WeekStart, WeekEnd, weekRank
FROM rank_cte rc
WHERE employee_id=@employee_id
AND weekRank=@weekRank
ORDER BY worked_date DESC
Thanks