How to set up a cumulative count grouped by month with underlying conditions - sql

I've come across an interesting scenario where I need to aggregate enrollment counts by month, counting each enrollment in its starting month and in every subsequent month leading up to the completion date. The initial count is already placed in the month when the enrollment began; now I need a cumulative sum that carries that single count forward.
Here are a couple of test records I'm working with.
I've set up the following query, which uses the CTE_Months CTE to generate the full 12 months from my start/end range variables. I've then joined it to my test table in order to place the counts.
DECLARE @EnrollmentDateStart DATETIME = '2020-01-01'
DECLARE @EnrollmentDateEnd DATETIME = '2020-12-01'
;WITH CTE_Months(year_month) AS
(
SELECT DATEADD(MONTH, n, DATEADD(MONTH, DATEDIFF(MONTH, 0, @EnrollmentDateStart), 0))
FROM ( SELECT TOP (DATEDIFF(MONTH, @EnrollmentDateStart, @EnrollmentDateEnd) + 1)
n = ROW_NUMBER() OVER (ORDER BY [object_id]) - 1
FROM sys.all_objects ORDER BY [object_id] ) AS n
)
SELECT
[Year] = YEAR(cm.year_month),
[Month] = DATENAME(MONTH, cm.year_month),
SUM(IIF(tt.[Enrollment Start Date] >= @EnrollmentDateStart,1,0)) AS EnrollmentCount
FROM CTE_Months cm
LEFT OUTER JOIN #TMP_Testing_Table tt
ON tt.[Enrollment Start Date] >= cm.year_month
AND tt.[Enrollment Start Date] < DATEADD(MONTH, 1, cm.year_month)
GROUP BY tt.Department, cm.year_month
At this stage I'm getting the following results, so the enrollment counts are now placed into the correct starting months derived from the Enrollment Start Date.
Now I'm trying to figure out the best way to carry the count into the additional months leading up to the completion date.
For example, the first user (UserId: 1) enrolled in March 2020 and completed in August 2020, so I'm looking to produce the following result, with a count for each month from March through July (the last month prior to completion):
January: 0
February: 0
March: 1
April: 1
May: 1
June: 1
July: 1
August: 0
September: 0
October: 0
November: 0
December: 0
I think a cumulative total should be able to carry the count month by month; however, I would then need to zero out the total for the completion month and all months after it for the record in question.
I'd like your thoughts or suggestions on how to address this scenario. Apologies if the explanation is confusing; please let me know and I'll do my best to elaborate.

....................
SELECT
[Year] = YEAR(cm.year_month),
[Month] = DATENAME(MONTH, cm.year_month),
count(tt.userid) AS EnrollmentCount
FROM CTE_Months cm
LEFT OUTER JOIN #TMP_Testing_Table tt on cm.year_month > eomonth([Enrollment Start Date], -1)
and cm.year_month <= tt.[Enrollment End Date]
GROUP BY cm.year_month
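If [Enrollment End Date] holds the completion date, the join above still counts the completion month itself, because of the `<=` comparison. Should the completion month read 0, as in the expected output of the question, one possible variant compares against the first day of the completion month with a strict `<`. This is only a sketch: the table variable @Enrollments and its column names are hypothetical stand-ins for the test table, the day-of-month values are assumed (the question only gives the months), and treating a NULL completion date as "still enrolled" is also an assumption.

```sql
DECLARE @EnrollmentDateStart date = '20200101',
        @EnrollmentDateEnd   date = '20201201';

-- Hypothetical stand-in for the test table; only the months come from the question.
DECLARE @Enrollments TABLE (UserId int, EnrollmentStart date, CompletionDate date);
INSERT INTO @Enrollments VALUES (1, '20200310', '20200820');

WITH Months(year_month) AS
(
    -- First day of every month in the requested range
    SELECT DATEFROMPARTS(YEAR(@EnrollmentDateStart), MONTH(@EnrollmentDateStart), 1)
    UNION ALL
    SELECT DATEADD(MONTH, 1, year_month)
    FROM Months
    WHERE year_month < DATEFROMPARTS(YEAR(@EnrollmentDateEnd), MONTH(@EnrollmentDateEnd), 1)
)
SELECT [Year]          = YEAR(m.year_month),
       [Month]         = DATENAME(MONTH, m.year_month),
       EnrollmentCount = COUNT(e.UserId)   -- counts matched rows only, so empty months give 0
FROM Months AS m
LEFT JOIN @Enrollments AS e
       -- active from the first day of the start month ...
       ON  m.year_month >= DATEFROMPARTS(YEAR(e.EnrollmentStart), MONTH(e.EnrollmentStart), 1)
       -- ... up to, but not including, the completion month (assumption: NULL means still enrolled)
       AND (e.CompletionDate IS NULL
            OR m.year_month < DATEFROMPARTS(YEAR(e.CompletionDate), MONTH(e.CompletionDate), 1))
GROUP BY m.year_month
ORDER BY m.year_month;
```

For the single sample row this returns 1 for March through July and 0 for every other month of 2020.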

Related

How do I include months that have no data?

I am trying to create a report that shows how many training records will expire within a chosen date range. However, when I run the report, it excludes months that have no training records going out of date. I have tried various solutions I've seen posted, but I haven't been able to get any of them to work in my case.
This is my query:
SELECT COUNT(ISNULL(TRAININGRECORDID, 0)) AS NUMBEROFRECORDS
,DEPARTMENTNUMBER
,DATENAME( Month, EXPIRY ) + '-' + DATENAME( Year, EXPIRY ) AS [MONTHYEAR]
FROM Training_Records TR
JOIN Departments TD ON TR.DEPARTMENTID = TD.DEPARTMENTID
WHERE TR.EXPIRY IS NOT NULL
AND TD.DEPARTMENTNUMBER IN (@DEPTNO)
AND TR.EXPIRY BETWEEN @StartDate AND @EndDate
GROUP BY TD.DEPARTMENTNUMBER, DATENAME(Year, TR.EXPIRY), DATENAME(Month, TR.EXPIRY)
ORDER BY TD.DEPARTMENTNUMBER, [MONTHYEAR]
An example of results from this query looks like this:
NUMBEROFRECORDS  DEPARTMENTNUMBER  MONTHYEAR
1                21                April-2023
4                23                June-2023
1                83                August-2023
I am displaying the results of this query in a matrix with MONTHYEAR as the columns. In the example above, the report will display April, June and August 2023 but skip May and July 2023 because no records go out of date in those months. I still want those months displayed in my report and returned by my query.
How would I go about including these months with no records going out of date?
You need to first get all of the months, and then outer join to them (not using BETWEEN). Here is an example that gets April, May, June, and July, and then shows how you would outer join that against your table.
DECLARE @StartDate date = '20220405',
@EndDate date = '20220708';
;WITH Months(TheMonth) AS
(
SELECT DATEFROMPARTS(YEAR(@StartDate), MONTH(@StartDate), 1)
UNION ALL
SELECT DATEADD(MONTH, 1, TheMonth)
FROM Months
WHERE TheMonth < DATEFROMPARTS(YEAR(@EndDate), MONTH(@EndDate), 1)
)
SELECT TheMonth -- , COALESCE(SUM({your table}.{column}),0)
FROM Months AS m
-- LEFT OUTER JOIN {your table}
-- ON {your table}.{date column} >= m.TheMonth
-- AND {your table}.{date column} < DATEADD(MONTH, 1, m.TheMonth);
Output:
TheMonth
2022-04-01
2022-05-01
2022-06-01
2022-07-01
If your range could last more than 100 months, you'll need to add:
OPTION (MAXRECURSION 0);
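To make the commented-out join concrete, here is a sketch that fills it in. The table variable @TrainingRecords and its rows are invented for illustration; only the column names are borrowed from the question. Note that it counts a column from the outer-joined table directly: wrapping it in ISNULL(..., 0), as the question does, would turn the NULL of an empty month into a countable value and report 1 instead of 0.

```sql
DECLARE @StartDate date = '20230401',
        @EndDate   date = '20230831';

-- Hypothetical sample data, invented for illustration only.
DECLARE @TrainingRecords TABLE (TRAININGRECORDID int, EXPIRY date);
INSERT INTO @TrainingRecords
VALUES (1, '20230412'), (2, '20230603'), (3, '20230620'), (4, '20230815');

WITH Months(TheMonth) AS
(
    SELECT DATEFROMPARTS(YEAR(@StartDate), MONTH(@StartDate), 1)
    UNION ALL
    SELECT DATEADD(MONTH, 1, TheMonth)
    FROM Months
    WHERE TheMonth < DATEFROMPARTS(YEAR(@EndDate), MONTH(@EndDate), 1)
)
SELECT MONTHYEAR       = DATENAME(MONTH, m.TheMonth) + '-' + DATENAME(YEAR, m.TheMonth),
       NUMBEROFRECORDS = COUNT(tr.TRAININGRECORDID)   -- NULLs from unmatched months are not counted
FROM Months AS m
LEFT OUTER JOIN @TrainingRecords AS tr
       ON  tr.EXPIRY >= m.TheMonth
       AND tr.EXPIRY <  DATEADD(MONTH, 1, m.TheMonth)  -- half-open range instead of BETWEEN
GROUP BY m.TheMonth
ORDER BY m.TheMonth;
```

With the sample rows this lists all five months from April to August 2023, with counts 1, 0, 2, 0 and 1. To get a row per department and month, the months would presumably need to be cross joined to the departments before the outer join.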

sql repeat rows for weekend and holidays

I have a table A that we import based on the day a file lands at a location. We don't receive files on weekends and public holidays, and the table holds data for multiple countries, so the public holidays vary. In essence, we are looking to duplicate a row until the next record for that ID is encountered (unless it's the max date for that ID). A typical set of records looks like this:
Account  Datekey   Balance
1        20181012  100
1        20181112  100
1        20181212  100
1        20181512  100
1        20181712  100
And needs to look like this (sat, sun & PH added to indicate the day of week):
Account  Datekey   Balance
1        20181012  100
1        20181112  100
1        20181212  100
1        20181312  100      Sat
1        20181412  100      Sun
1        20181512  100
1        20181612  100      PH
1        20181712  100
Also, Datekey is numeric and not a date. I tried a couple of suggested solutions but found that they simply duplicate the previous row multiple times without stopping when the next date's record is found. I need to run it as an update query that executes daily on table A and adds the missing records when it runs (sometimes 2 or 3 days later).
Hope you can assist.
Thanks
This question has multiple parts:
- Converting an obscene date format to a date
- Generating "in-between" rows
- Filling in the new rows with the previous value
- Determining the day of the week
The following does most of this. I refuse to regenerate the datekey format. You really need to fix that.
This also assumes that your settings are for English weekday names.
with t as (
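select Account, Datekey, Balance, convert(date, left(dkey, 4) + right(dkey, 2) + substring(dkey, 5, 2)) as proper_date
from yourtable
cross apply (select convert(varchar(8), Datekey) as dkey) as v -- Datekey is numeric (yyyyddmm), so turn it into a string first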
),
dates as (
select account, min(proper_date) as dte, max(proper_date) as max_dte
from t
group by account
union all
select account, dateadd(day, 1, dte), max_dte
from dates
where dte < max_dte
)
select d.account, d.dte, t.balance,
(case when t.proper_date = d.dte then '' -- a row was actually received for this day
when datename(weekday, d.dte) in ('Saturday', 'Sunday')
then left(datename(weekday, d.dte), 3)
else 'PH'
end) as indicator
from dates d cross apply
(select top (1) t.*
from t
where t.account = d.account and
t.proper_date <= d.dte
order by t.proper_date desc
) t
option (maxrecursion 0);
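If the Datekey column really is an integer in the yyyyddmm layout that the sample rows suggest (an assumption; the question only says it is numeric), the conversion can also be done with integer arithmetic rather than string slicing, and the same arithmetic in reverse rebuilds the key for the generated rows. A minimal sketch with a single hypothetical value:

```sql
-- Assumes Datekey is an integer laid out as yyyyddmm, e.g. 20181312 = 13 December 2018.
DECLARE @Datekey int = 20181312;

DECLARE @ProperDate date = DATEFROMPARTS(@Datekey / 10000,         -- yyyy (integer division)
                                         @Datekey % 100,           -- mm: last two digits
                                         (@Datekey / 100) % 100);  -- dd: middle two digits

SELECT ProperDate  = @ProperDate,
       -- Rebuild the yyyyddmm key from the real date
       DatekeyBack = YEAR(@ProperDate) * 10000 + DAY(@ProperDate) * 100 + MONTH(@ProperDate);
```

This returns 2018-12-13 and 20181312. DATEFROMPARTS raises an error for an impossible date, which may be a useful side effect for spotting malformed keys.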

Finding the first and last business day for every month sql

I am trying to find the first and last business day for every month since 1986.
Using the following, I can find the first day of a given month, but only for that month, and it does not take into consideration whether it is a business day. To keep it simple for now, a business day is any weekday; public holidays are not considered.
SELECT DATEADD(s,0,DATEADD(mm, DATEDIFF(m,0,getdate()),0))
But I am not able to get the correct business day, so I created a calendar table consisting of all the weekdays and thought I could extract the min(date) for each month, but I am currently stuck.
Date
---------------
1986-01-01
1986-01-02
1986-01-03
1986-01-06
...and so on
I have also tried to get the first day of every month instead, but it does not take into account whether the day falls on a weekend. It simply gives the first day of each month:
declare @DatFirst date = '20000101', @DatLast date = getdate();
declare @DatFirstOfFirstMonth date = dateadd(day,1-day(@DatFirst),@DatFirst);
select DatFirstOfMonth = dateadd(month,n,@DatFirstOfFirstMonth)
from (select top (datediff(month,@DatFirstOfFirstMonth,@DatLast)+1)
n=row_number() over (order by (select 1))-1
from (values (1),(1),(1),(1),(1),(1),(1),(1)) a (n)
cross join (values (1),(1),(1),(1),(1),(1),(1),(1)) b (n)
cross join (values (1),(1),(1),(1),(1),(1),(1),(1)) c (n)
cross join (values (1),(1),(1),(1),(1),(1),(1),(1)) d (n)
) x
I am wondering if anyone can perhaps shed some light as to how can I best approach this issue.
If you already have your calendar table with all available dates, then you just need to filter by weekday.
SET DATEFIRST 1 -- 1: Monday, 7: Sunday
SELECT
Year = YEAR(T.Date),
Month = MONTH(T.Date),
FirstBusinessDay = MIN(T.Date),
LastBusinessDay = MAX(T.Date)
FROM
Calendar AS T
WHERE
DATEPART(WEEKDAY, T.Date) BETWEEN 1 AND 5 -- 1: Monday, 5: Friday
GROUP BY
YEAR(T.Date),
MONTH(T.Date)
You should use the query to mark these days on your calendar table, so it's easy to access them afterwards.
This is how you can mix it up with the generation of the calendar table (with recursion).
SET DATEFIRST 1 -- 1: Monday, 7: Sunday
declare
@DatFirst date = '20000101',
@DatLast date = getdate();
;WITH AllDays AS
(
SELECT
Date = @DatFirst
UNION ALL
SELECT
Date = DATEADD(DAY, 1, D.Date)
FROM
AllDays AS D
WHERE
D.Date < @DatLast
),
BusinessLimitsByMonth AS
(
SELECT
Year = YEAR(T.Date),
Month = MONTH(T.Date),
FirstBusinessDay = MIN(T.Date),
LastBusinessDay = MAX(T.Date)
FROM
AllDays AS T
WHERE
DATEPART(WEEKDAY, T.Date) BETWEEN 1 AND 5 -- 1: Monday, 5: Friday
GROUP BY
YEAR(T.Date),
MONTH(T.Date)
)
SELECT
*
FROM
BusinessLimitsByMonth AS B
ORDER BY
B.Year,
B.Month
OPTION
(MAXRECURSION 0) -- 0: Unlimited
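Both queries above depend on SET DATEFIRST 1 having been run in the same session. If changing the session setting is not an option, a commonly used alternative is to normalise DATEPART(WEEKDAY, ...) with @@DATEFIRST so that Monday is always 1. The sketch below uses a hypothetical table variable with four known dates (1 January 2000 was a Saturday):

```sql
-- Hypothetical sample dates: Sat 2000-01-01, Sun 2000-01-02, Mon 2000-01-03, Fri 2000-01-07.
DECLARE @Dates TABLE (d date);
INSERT INTO @Dates VALUES ('20000101'), ('20000102'), ('20000103'), ('20000107');

SELECT d,
       -- 1 = Monday ... 7 = Sunday, whatever the session's DATEFIRST is
       IsoWeekday    = (DATEPART(WEEKDAY, d) + @@DATEFIRST - 2) % 7 + 1,
       IsBusinessDay = CASE WHEN (DATEPART(WEEKDAY, d) + @@DATEFIRST - 2) % 7 + 1 <= 5
                            THEN 1 ELSE 0 END
FROM @Dates;
```

The four rows come back with IsoWeekday 6, 7, 1 and 5, so only the last two are flagged as business days. The same expression could replace the BETWEEN 1 AND 5 filter in the WHERE clause.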
If you already have a table with only the weekdays:
select min(datecol), max(datecol)
from BusinessOnlyCalendar
group by year(datecol), month(datecol)
But you should expand your calendar to include all the calculations you might do on dates, like FirstDayOfWeek/Month/Quarter/Year, WeekNumber, etc.
When you have a column in your calendar indicating business day yes/no, it's as simple as:
select min(datecol), max(datecol)
from calendar
where businessday = 'y'
group by year(datecol), month(datecol)

Query to check number of records created in a month.

My table gets a new record with a timestamp each day when an integration is successful. I am trying to create a query (preferably automated) that compares the number of days in a month with the number of records in the table within a time frame.
For example, January has 31 days, so I would like to know on how many days in January my process was not successful. If the number of records is x, less than 31, then I know the job failed 31 - x times.
I tried the following but was not getting very far:
SELECT COUNT (DISTINCT CompleteDate)
FROM table
WHERE CompleteDate BETWEEN '01/01/2015' AND '01/31/2015'
Every 7 days the system executes the job twice, so I get two records on the same day. I am trying to determine the number of days on which nothing happened (failures), so I assume some truncation of the date field is needed.
One way to do this is to use a calendar/date table as the main source of dates in the range and left join with that and count the number of null values.
In absence of a proper date table you can generate a range of dates using a number sequence like the one found in the master..spt_values table:
select count(*) failed
from (
select dateadd(day, number, '2015-01-01') date
from master..spt_values where type='P' and number < 365
) a
left join your_table b on a.date = b.CompleteDate
where b.CompleteDate is null
and a.date BETWEEN '01/01/2015' AND '01/31/2015'
Assuming you have an Integers table*. This query will pull all dates where no record is found in the target table:
declare @StartDate datetime = '01/01/2013',
@EndDate datetime = '12/31/2013'
;with d as (
select *, date = dateadd(d, i - 1 , @StartDate)
from dbo.Integers
where i <= datediff(d, @StartDate, @EndDate) + 1
)
select d.date
from d
where not exists (
select 1 from <target> t
where DATEADD(dd, DATEDIFF(dd, 0, t.<timestamp>), 0) = DATEADD(dd, DATEDIFF(dd, 0, d.date), 0)
)
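BETWEEN is not safe here, because CompleteDate is a timestamp and an upper bound of '01/31/2015' stops at midnight at the start of that day. Use a half-open range instead: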
SELECT 31 - count(distinct(convert(date, CompleteDate)))
FROM table
WHERE CompleteDate >= '01/01/2015' AND CompleteDate < '02/01/2015'
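A small sketch of the difference, using a hypothetical table variable @Runs with invented rows (including a day with two runs, as described in the question). It also derives the number of days in the month with EOMONTH, which needs SQL Server 2012 or later, instead of hard-coding 31:

```sql
DECLARE @MonthStart date = '20150101';

-- Hypothetical log table, invented for illustration.
DECLARE @Runs TABLE (CompleteDate datetime);
INSERT INTO @Runs
VALUES ('20150101 02:00'), ('20150107 02:00'), ('20150107 14:00'), ('20150131 02:00');

SELECT DaysInMonth = DAY(EOMONTH(@MonthStart)),
       -- BETWEEN stops at 2015-01-31 00:00, so the 02:00 run on the 31st is missed
       WithBetween = (SELECT COUNT(DISTINCT CONVERT(date, CompleteDate))
                      FROM @Runs
                      WHERE CompleteDate BETWEEN '20150101' AND '20150131'),
       -- Half-open range: from the first of the month up to, not including, the next month
       HalfOpen    = (SELECT COUNT(DISTINCT CONVERT(date, CompleteDate))
                      FROM @Runs
                      WHERE CompleteDate >= @MonthStart
                        AND CompleteDate <  DATEADD(MONTH, 1, @MonthStart)),
       FailedDays  = DAY(EOMONTH(@MonthStart))
                     - (SELECT COUNT(DISTINCT CONVERT(date, CompleteDate))
                        FROM @Runs
                        WHERE CompleteDate >= @MonthStart
                          AND CompleteDate <  DATEADD(MONTH, 1, @MonthStart));
```

With these rows the result is DaysInMonth 31, WithBetween 2, HalfOpen 3 and FailedDays 28. Converting to date before COUNT(DISTINCT ...) is what collapses the two runs on 7 January into one successful day.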
You can use the following query:
SELECT DATEDIFF(day, t.d, dateadd(month, 1, t.d)) - COUNT(DISTINCT CompleteDate)
FROM mytable
CROSS APPLY (SELECT CAST(YEAR(CompleteDate) AS VARCHAR(4)) +
RIGHT('0' + CAST(MONTH(CompleteDate) AS VARCHAR(2)), 2) +
'01') t(d)
GROUP BY t.d
Explanation:
1. The value CROSS APPLY-ied, i.e. t.d, is the ANSI string of the first day of the month of CompleteDate, e.g. '20150101' for 12/01/2015 or 18/01/2015.
2. DATEDIFF uses the above-mentioned value, i.e. t.d, to calculate the number of days in the month that CompleteDate belongs to.
3. GROUP BY essentially groups by (Year, Month), hence COUNT(DISTINCT CompleteDate) returns the number of distinct records per month.
4. The values returned by the query are the differences [2] - [3], i.e. the number of failures per month, for each (Year, Month) of your initial data.
If you want to query a specific Year, Month then just simply add a WHERE clause to the above:
WHERE YEAR(CompleteDate) = 2015 AND MONTH(CompleteDate) = 1

MSSQL select count where condition is met across a date range

I have a table containing date, employeeID (int), and ShiftWorked (which can be night, day, weekend or evening). There is a row for each employee and date combination.
I would like to construct a query that gives me a count of how many people have worked a night shift in the week before and after each date in the roster period.
--------------------------------------------------------------------------
Date (yyyy-MM-dd) | CountOfNightshifts(for 1 week either side of date)
--------------------------------------------------------------------------
2012-1-1 | 8
2012-1-2 | 12
2012-1-3 | 11
2012-1-4 | 6
etc | etc
I hope this is clear. I have spent days trying to get this to work but I am not getting anywhere.
For example:
SELECT COUNT(id), [date]
FROM ROSTER
WHERE Shift = 'night' AND [date] BETWEEN DATEADD(D,-7,[date]) AND DATEADD(d,7,[date])
GROUP by [date]
This gives me a list of dates and a count of nights on each particular day, not all night shifts in the 7 days before and after the date.
The following query returns two columns: the reference (roster) date and the number of distinct people who worked a night shift from seven days before to seven days after the reference date.
SELECT tmain.date,
(
SELECT COUNT(DISTINCT taux.employeeId)
FROM roster taux
WHERE taux.shiftWorked = 'night'
AND taux.date >= DATEADD(DAY, -7, tmain.date)
AND taux.date <= DATEADD(DAY, 7, tmain.date)
) AS [number_of_distinct_people_with_night_shift]
FROM roster tmain
ORDER BY tmain.date;
Note 1: Usually I prefer joins over sub-queries, but I think this solution is easier to read.
Note 2: I am assuming the time component of the date values is irrelevant and all dates have the same time (i.e. '00:00:00.00'); if that is not the case, the date comparison needs further adjustment.
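Because the roster holds one row per employee and date, the outer query above returns each date once per employee. One possible way to list each date only once is to drive the sub-query from the distinct dates. This is a sketch against a hypothetical table variable with invented rows; the column names follow the query above.

```sql
-- Hypothetical roster rows, invented for illustration.
DECLARE @Roster TABLE ([date] date, employeeId int, shiftWorked varchar(10));
INSERT INTO @Roster VALUES
    ('20120101', 1, 'night'), ('20120101', 2, 'day'),
    ('20120105', 1, 'night'), ('20120105', 2, 'night'),
    ('20120120', 1, 'day'),   ('20120120', 2, 'day');

SELECT d.[date],
       -- Distinct people with a night shift within 7 days either side of the date
       NightShiftPeople = (SELECT COUNT(DISTINCT r.employeeId)
                           FROM @Roster AS r
                           WHERE r.shiftWorked = 'night'
                             AND r.[date] >= DATEADD(DAY, -7, d.[date])
                             AND r.[date] <= DATEADD(DAY,  7, d.[date]))
FROM (SELECT DISTINCT [date] FROM @Roster) AS d
ORDER BY d.[date];
```

For the sample rows this gives 2 for 2012-01-01, 2 for 2012-01-05 and 0 for 2012-01-20. If the number of night shifts rather than the number of people is wanted, COUNT(*) would take the place of COUNT(DISTINCT r.employeeId).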
How about this?
SELECT
[date]
,count(*)
FROM
Shifts as s
WHERE
s.Date > DATEADD(day,-7,GETDATE())
AND ShiftWorked = 'Night'
GROUP BY
date
http://sqlfiddle.com/#!3/e88cc/1
With a bit more data:
http://sqlfiddle.com/#!3/b7793/2
If you are only interested in a specific date then you could use:
DECLARE @target datetime
SET @target = GETDATE()
SELECT
count(*) as NightShifts
FROM
Shifts as s
WHERE
ShiftWorked = 'Night'
AND s.Date > DATEADD(day,-7,@target)
AND s.Date < DATEADD(day,7,@target)
http://sqlfiddle.com/#!3/b7793/20
But if you have another table that actually has the periods in it (e.g. billing or payroll dates):
DECLARE @target datetime
SET @target = GETDATE()
SELECT
p.periodDate
,count(*)
FROM
Shifts as s
INNER JOIN periods as p
ON s.date > dateadd(day,-7,p.periodDate)
AND s.date < dateadd(day,7,p.periodDate)
WHERE
ShiftWorked = 'Night'
GROUP BY p.periodDate
http://sqlfiddle.com/#!3/fc54d/2
Or, to get 0 when no night shift was worked:
SELECT
p.periodDate
,ISNULL(t.num,0) as nightShifts
FROM
periods as p
LEFT OUTER JOIN (
SELECT
p.periodDate
,count(*) as num
FROM
Shifts as s
INNER JOIN periods as P
ON s.date > dateadd(day,-7,p.periodDate)
AND s.date < dateadd(day,7,p.periodDate)
WHERE
ShiftWorked = 'Night'
GROUP BY p.periodDate
) as t
ON p.periodDate = t.periodDate
http://sqlfiddle.com/#!3/fc54d/11
You can pull it off by joining the ROSTER table to itself, thereby creating several result rows per employee and day. Otherwise your GROUP BY clause will group the resulting rows from the period you are after into the dates of the original table.
SELECT
r.[date],
COUNT(period.id)
FROM ROSTER r
JOIN ROSTER period
ON period.employeeID=r.employeeID
AND period.shift = 'night'
AND r.[date] BETWEEN DATEADD(d,-7,period.[date]) and DATEADD(d,7,period.[date])
WHERE
r.shift = 'night'
GROUP BY r.[date]