Normalization of Year bringing nulls back - sql

I have the following query:
SELECT DISTINCT
YEAR(DateRegistered) as Years,
Months.[MonthName],
COUNT(UserID)as totalReg
FROM
Months WITH(NOLOCK)
LEFT OUTER JOIN
UserProfile WITH(NOLOCK)
ON
Months.MonthID = MONTH(DateRegistered)
AND
DateRegistered > DATEADD(MONTH, -12,GETDATE())
GROUP BY YEAR(DateRegistered), Months.[MonthName]
ORDER BY Months.[MonthName]
As you can tell, this will always bring back 12 months' worth of data. It works, but there is a bug with this method.
It returns NULL in the Years column for months where there is no data. The row itself should exist (that is the whole point of the query), but I don't want the year to come back as NULL.
I understand why this happens: with no data for the month, how is the query supposed to know which year it is?
So my question is: is there any way to replace the NULLs? I suspect I will have to completely change my approach.
YEAR MONTH TOTAL
2013 April 1
2013 August 1
NULL December 0
2013 February 8
2013 January 1
2013 July 1
NULL June 0
2013 March 4
NULL May 0
NULL November 0
NULL October 0
2012 September 3

If you want 12 months of data, then construct a list of numbers from 1 to 12 and use these as offsets with getdate():
with nums as (
select 12 as level union all
select level - 1
from nums
where level > 1
)
select YEAR(thedate) as Years,
Months.[MonthName],
COUNT(UserID) as totalReg
FROM (select DATEADD(MONTH, - nums.level, GETDATE()) as thedate
from nums
) mon12 left outer join
Months WITH (NOLOCK)
on month(mon12.thedate) = months.monthid left outer join
UserProfile WITH (NOLOCK)
ON Months.MonthID = MONTH(DateRegistered) and
DateRegistered > DATEADD(MONTH, -12, GETDATE())
GROUP BY YEAR(thedate), Months.[MonthName]
ORDER BY Months.[MonthName];
I find something strange about the query though. You are defining the span from the current date. However, you seem to be splitting the months themselves on calendar boundaries. I also find the table months to be awkward. Why aren't you just using the datename() and month() functions?
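A minimal sketch of that suggestion, assuming calendar-month buckets are acceptable (the current month plus the 11 before it) rather than a rolling window from today. The table variable @UserProfile and its rows are hypothetical stand-ins for the real table, and @thisMonth is an illustrative name. Because the year is taken from the generated month rather than from DateRegistered, it is never NULL:

```sql
-- Hypothetical sample data standing in for UserProfile.
DECLARE @UserProfile TABLE (UserID int, DateRegistered datetime);
INSERT INTO @UserProfile VALUES
    (1, DATEADD(MONTH, -1, GETDATE())),
    (2, DATEADD(MONTH, -1, GETDATE())),
    (3, DATEADD(MONTH, -5, GETDATE()));

-- First day of the current month (month count since day 0, added back to day 0).
DECLARE @thisMonth datetime = DATEADD(MONTH, DATEDIFF(MONTH, 0, GETDATE()), 0);

WITH nums AS (
    SELECT 0 AS n
    UNION ALL
    SELECT n + 1 FROM nums WHERE n < 11
),
months AS (
    -- One row per calendar month: this month and the 11 before it.
    SELECT DATEADD(MONTH, -n, @thisMonth) AS MonthStart
    FROM nums
)
SELECT YEAR(m.MonthStart)            AS Years,
       DATENAME(MONTH, m.MonthStart) AS [MonthName],
       COUNT(up.UserID)              AS totalReg
FROM months m
LEFT JOIN @UserProfile up
       ON up.DateRegistered >= m.MonthStart
      AND up.DateRegistered <  DATEADD(MONTH, 1, m.MonthStart)
GROUP BY m.MonthStart
ORDER BY m.MonthStart;
```

Ordering by MonthStart also returns the months chronologically instead of alphabetically by name.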

Try this out:
;With dates as (
Select DateName(Month, getdate()) as [Month],
DatePart(Year, getdate()) as [Year],
1 as Iteration
Union All
Select DateName(Month,DATEADD(MONTH, -Iteration, getdate())),
DatePart(Year,DATEADD(MONTH, -Iteration, getdate())),
Iteration + 1
from dates
where Iteration < 12
)
SELECT DISTINCT
d.Year,
d.Month as [MonthName],
COUNT(up.UserID)as totalReg
FROM dates d
LEFT OUTER JOIN UserProfile up ON d.Month = DateName(Month, up.DateRegistered)
And d.Year = DatePart(Year, DateRegistered)
GROUP BY d.Year, d.Month
ORDER BY d.Year, d.Month

Here's my attempt at a solution:
declare @UserProfile table
(
id bigint not null identity(1,1) primary key clustered
, name nvarchar(32) not null
, dateRegistered datetime not null default(getutcdate())
)
insert @UserProfile
select 'person 1', '2011-01-23'
union select 'person 2', '2013-01-01'
union select 'person 3', '2013-05-27'
declare @yearMin int, @yearMax int
select @yearMin = year(MIN(dateRegistered))
, @yearMax= year(MAX(dateRegistered))
from @UserProfile
;with monthCte as
(
select 1 monthNo, DATENAME(month, '1900-01-01') Name
union all
select monthNo + 1, DATENAME(month, dateadd(month,monthNo,'1900-01-01'))
from monthCte
where monthNo < 12
)
, yearCte as
(
select @yearMin yearNo
union all
select yearNo + 1
from yearCte
where yearNo < @yearMax
)
select y.yearNo, m.Name, COUNT(up.id) UsersRegisteredThisPeriod
from yearCte y
cross join monthCte m
left outer join @UserProfile up
on year(up.dateRegistered) = y.yearNo
and month(up.dateRegistered) = m.monthNo
group by y.yearNo, m.monthNo, m.Name
order by y.yearNo, m.monthNo
SQL Fiddle Version: http://sqlfiddle.com/#!6/d41d8/6640

You have to calculate the counts in a Derived Table (or a CTE) first and then join
untested:
SELECT
COALESCE(dt.Years, YEAR(DATEADD(MONTH, -Months.MonthID, GETDATE()))),
Months.[MonthName],
COALESCE(dt.totalReg, 0)
FROM
Months WITH(NOLOCK)
LEFT OUTER JOIN
(
SELECT
YEAR(DateRegistered) AS Years,
MONTH(DateRegistered) AS Mon,
COUNT(UserID)AS totalReg
FROM UserProfile WITH(NOLOCK)
WHERE DateRegistered > DATEADD(MONTH, -12,GETDATE())
GROUP BY
YEAR(DateRegistered),
MONTH(DateRegistered)
) AS dt
ON Months.MonthID = dt.mon
ORDER BY 1, Months.MonthID
I changed the ORDER BY to use Months.MonthID instead of MonthName, and I added the year to it because the result might contain, say, both August 2012 and August 2013.

Related

Count number of days each employee take vacation in a month SQL Server

I have this table:
Vacationtbl:
ID Start End
-------------------------
01 04/10/17 04/12/17
01 04/27/17 05/02/17
02 04/13/17 04/15/17
02 04/17/17 04/20/17
03 06/14/17 06/22/17
Employeetbl:
ID Fname Lname
------------------
01 John AAA
02 Jeny BBB
03 Jeby CCC
I would like to count the number of vacation days each employee takes in April.
My query:
SELECT
SUM(DATEDIFF(DAY, Start, End) + 1) AS Days
FROM
Vacationtbl
GROUP BY
ID
01 returns 9 (not correct)
02 returns 7 (correct)
How do I fix the query so that it stops counting at the end of the month? For example, April has 30 days, so for the second row employee 01 should be counted from 4/27/17 through 4/30/17; 05/01/17 and 05/02/17 belong to May.
Thanks
The Tally/Calendar table is the way to go. However, you can use an ad-hoc tally table.
Example
Select Year = Year(D)
,Month = Month(D)
,ID
,Days = count(*)
From Vacationtbl A
Cross Apply (
Select Top (DateDiff(DAY,[Start],[End])+1) D=DateAdd(DAY,-1+Row_Number() Over (Order By (Select Null)),[Start])
From master..spt_values
) B
-- YOUR OPTIONAL WHERE STATEMENT HERE --
Group By ID,Year(D),Month(D)
Order By 1,2,3
Returns
Year Month ID Days
2017 4 01 7
2017 4 02 7
2017 5 01 2
EDIT - To Show All ID even if Zero Days
Select ID
,Year = Year(D)
,Month = Month(D)
,Days = sum(case when D between [Start] and [End] then 1 else 0 end)
From (
Select Top (DateDiff(DAY,'05/01/2017','05/31/2017')+1) D=DateAdd(DAY,-1+Row_Number() Over (Order By (Select Null)),'05/01/2017')
From master..spt_values
) D
Cross Join Vacationtbl B
Group By ID,Year(D),Month(D)
Order By 1,2,3
Returns
ID Year Month Days
1 2017 5 2
2 2017 5 0
dbFiddle if it Helps
EDIT - 2 Corrects for Overlaps (Gaps and Islands)
--Create Some Sample Data
----------------------------------------------------------------------
Declare @Vacationtbl Table ([ID] varchar(50),[Start] date,[End] date)
Insert Into @Vacationtbl Values
(01,'04/10/17','04/12/17')
,(01,'04/27/17','05/02/17')
,(02,'04/13/17','04/15/17')
,(02,'04/17/17','04/20/17')
,(02,'04/16/17','04/17/17') -- << Overlap
,(03,'05/16/17','05/17/17')
-- The Actual Query
----------------------------------------------------------------------
Select ID
,Year = Year(D)
,Month = Month(D)
,Days = sum(case when D between [Start] and [End] then 1 else 0 end)
From (Select Top (DateDiff(DAY,'04/01/2017','04/30/2017')+1) D=DateAdd(DAY,-1+Row_Number() Over (Order By (Select Null)),'04/01/2017') From master..spt_values ) D
Cross Join (
Select ID,[Start] = min(D),[End] = max(D)
From (
Select E.*,Grp = Dense_Rank() over (Order By D) - Row_Number() over (Partition By ID Order By D)
From (
Select Distinct A.ID,D
From @Vacationtbl A
Cross Apply (Select Top (DateDiff(DAY,A.[Start],A.[End])+1) D=DateAdd(DAY,-1+Row_Number() Over (Order By (Select Null)),A.[Start]) From master..spt_values ) B
) E
) G
Group By ID,Grp
) B
Group By ID,Year(D),Month(D)
Order By 1,2,3
Returns
ID Year Month Days
1 2017 4 7
2 2017 4 8
3 2017 4 0
Without a dates table, you could use
select Id
,sum(case when [end]>'20170430' and [start]<'20170401' then datediff(day,'20170401','20170430')+1
when [end]>'20170430' then datediff(day,[start],'20170430')+1
when [start]<'20170401' then datediff(day,'20170401',[end])+1
else datediff(day,[start],[end])+1
end) as VacationDays
from Vacationtbl
where [start] <= '20170430' and [end] >= '20170401'
group by Id
There are three cases here (the final ELSE covers vacations that fall entirely within the month):
1. The start is before the month and the end is after it. In this case, count every day of the month (month end minus month start, plus 1).
2. The end is after the month end and the start is within the month. In this case, count from the start date through the month end.
3. The start is before the month but the end is within the month. In this case, count from the month start through the end date.
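The cases above amount to clamping each vacation to the month boundaries before taking the difference. The sketch below shows this row by row, using the question's sample rows loaded into a table variable. @monthStart, @monthEnd, ClampedStart, ClampedEnd and DaysInMonth are illustrative names, not part of the original answer:

```sql
-- Sample rows copied from the question; the variable and column names are illustrative.
DECLARE @monthStart date = '20170401', @monthEnd date = '20170430';
DECLARE @Vacationtbl TABLE (ID int, [Start] date, [End] date);
INSERT INTO @Vacationtbl VALUES
    (1, '20170410', '20170412'),
    (1, '20170427', '20170502'),
    (2, '20170413', '20170415'),
    (2, '20170417', '20170420'),
    (3, '20170614', '20170622');

SELECT ID, [Start], [End],
       -- Pull a start that precedes the month forward to the month start.
       CASE WHEN [Start] < @monthStart THEN @monthStart ELSE [Start] END AS ClampedStart,
       -- Pull an end that runs past the month back to the month end.
       CASE WHEN [End]   > @monthEnd   THEN @monthEnd   ELSE [End]   END AS ClampedEnd,
       DATEDIFF(DAY,
                CASE WHEN [Start] < @monthStart THEN @monthStart ELSE [Start] END,
                CASE WHEN [End]   > @monthEnd   THEN @monthEnd   ELSE [End]   END) + 1 AS DaysInMonth
FROM @Vacationtbl
-- Keep only vacations that overlap the month at all.
WHERE [Start] <= @monthEnd AND [End] >= @monthStart
ORDER BY ID, [Start];
```

For the row 04/27/17 to 05/02/17, the end is clamped to 04/30/17, so DaysInMonth is 4. Employee 01 then totals 3 + 4 = 7 days for April, and employee 03 is filtered out.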
Edit: Based on the OP's comments that the future dates have to be included,
/*This recursive CTE generates the month start and end dates within a given time frame,
e.g. all the month start and end dates for 2017.
Change the start and end period as needed.*/
with dates (month_start_date,month_end_date) as
(select cast('2017-01-01' as date),cast(eomonth('2017-01-01') as date)
union all
select dateadd(month,1,month_start_date),eomonth(dateadd(month,1,month_start_date)) from dates
where month_start_date < '2017-12-01'
)
--End recursive cte
--Query logic is the same as above
select v.Id
,year(d.month_start_date) as yr,month(d.month_start_date) as mth
,sum(case when v.[end]>d.month_end_date and v.[start]<d.month_start_date then datediff(day,d.month_start_date,d.month_end_date)+1
when v.[end]>d.month_end_date then datediff(day,v.[start],d.month_end_date)+1
when v.[start]<d.month_start_date then datediff(day,d.month_start_date,v.[end])+1
else datediff(day,v.[start],v.[end])+1
end) as VacationDays
from dates d
join Vacationtbl v on v.[start] <= d.month_end_date and v.[end] >= d.month_start_date
group by v.id,year(d.month_start_date),month(d.month_start_date)
Assuming you want only one month and you want to count all days, you can do this with arithmetic. A separate calendar table is not necessary. The advantage is performance.
I think this would be easier if SQL Server supported least() and greatest(), but case will do:
select id,
sum(1 + datediff(day, news, newe)) as vacation_days_april
from Vacationtbl v cross apply
(values (case when [start] < '2017-04-01' then cast('2017-04-01' as date) else [start] end,
case when [end] >= '2017-05-01' then cast('2017-04-30' as date) else [end] end)
) c(news, newe)
where news <= newe
group by id;
You can readily extend this to any month:
with m as (
select cast('2017-04-01' as date) as month_start,
cast('2017-04-30' as date) as month_end
)
select id,
sum(1 + datediff(day, news, newe)) as vacation_days
from m cross join
Vacationtbl v cross apply
(values (case when [start] < m.month_start then m.month_start else [start] end,
case when [end] > m.month_end then m.month_end else [end] end)
) c(news, newe)
where news <= newe
group by id;
You can even use a similar idea to extend to multiple months, with a different row for each user and each month.
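One way the multi-month extension might look, as a sketch rather than the answerer's own query: list the month start dates, derive each month end, and clamp every vacation to every month it overlaps. The table variable, the three listed months, and the aliases x and c are assumptions made for illustration:

```sql
-- Sample rows copied from the question.
DECLARE @Vacationtbl TABLE (ID int, [Start] date, [End] date);
INSERT INTO @Vacationtbl VALUES
    (1, '20170410', '20170412'),
    (1, '20170427', '20170502'),
    (2, '20170413', '20170415'),
    (2, '20170417', '20170420'),
    (3, '20170614', '20170622');

WITH m AS (
    -- One row per month of interest; month_end is the day before the next month starts.
    SELECT month_start,
           DATEADD(DAY, -1, DATEADD(MONTH, 1, month_start)) AS month_end
    FROM (VALUES (CAST('20170401' AS date)),
                 (CAST('20170501' AS date)),
                 (CAST('20170601' AS date))) AS x(month_start)
)
SELECT v.ID, m.month_start,
       SUM(1 + DATEDIFF(DAY, c.news, c.newe)) AS vacation_days
FROM m
JOIN @Vacationtbl v
  ON v.[Start] <= m.month_end AND v.[End] >= m.month_start   -- overlapping pairs only
CROSS APPLY (VALUES (CASE WHEN v.[Start] < m.month_start THEN m.month_start ELSE v.[Start] END,
                     CASE WHEN v.[End]   > m.month_end   THEN m.month_end   ELSE v.[End]   END)
            ) AS c(news, newe)
GROUP BY v.ID, m.month_start
ORDER BY v.ID, m.month_start;
```

With this sample data, employee 1 gets 7 days for April and 2 for May, employee 2 gets 7 for April, and employee 3 gets 9 for June. Months in which an employee took no vacation produce no row; a left join from a cross join of employees and months would be needed to show zeros.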
You can use a Calendar or dates table for this sort of thing.
For only 152kb in memory, you can have 30 years of dates in a table with this:
/* dates table */
declare @fromdate date = '20000101';
declare @years int = 30;
/* 30 years, 19 used data pages ~152kb in memory, ~264kb on disk */
;with n as (select n from (values(0),(1),(2),(3),(4),(5),(6),(7),(8),(9)) t(n))
select top (datediff(day, @fromdate,dateadd(year,@years,@fromdate)))
[Date]=convert(date,dateadd(day,row_number() over(order by (select 1))-1,@fromdate))
into dbo.Dates
from n as deka cross join n as hecto cross join n as kilo
cross join n as tenK cross join n as hundredK
order by [Date];
create unique clustered index ix_dbo_Dates_date
on dbo.Dates([Date]);
Without taking the actual step of creating a table, you can use it inside a common table expression with just this:
declare @fromdate date = '20170401';
declare @thrudate date = '20170430';
;with n as (select n from (values(0),(1),(2),(3),(4),(5),(6),(7),(8),(9)) t(n))
, dates as (
select top (datediff(day, @fromdate, @thrudate)+1)
[Date]=convert(date,dateadd(day,row_number() over(order by (select 1))-1,@fromdate))
from n as deka cross join n as hecto cross join n as kilo
cross join n as tenK cross join n as hundredK
order by [Date]
)
select [Date]
from dates;
Use either like so:
select
v.Id
, count(*) as VacationDays
from Vacationtbl v
inner join Dates d
on d.Date >= v.[Start]
and d.Date <= v.[End]
where d.Date >= '20170401'
and d.Date <= '20170430'
group by v.Id
rextester demo (table): http://rextester.com/PLW73242
rextester demo (cte): http://rextester.com/BCY62752
returns:
+----+--------------+
| Id | VacationDays |
+----+--------------+
| 01 | 7 |
| 02 | 7 |
+----+--------------+
Try this,
declare @Vacationtbl table(ID int,Startdate date,Enddate date)
insert into @Vacationtbl VALUES
(1 ,'04/10/17','04/12/17')
,(1 ,'04/27/17','05/02/17')
,(2 ,'04/13/17','04/15/17')
,(2 ,'04/17/17','04/20/17')
-- somehow convert your input into first day of month
Declare @firstDayofGivenMonth date='2017-04-01'
Declare @LastDayofGivenMonth date=dateadd(day,-1,dateadd(month,datediff(month,0,@firstDayofGivenMonth)+1,0))
;with CTE as
(
select *
,case when Startdate<@firstDayofGivenMonth then @firstDayofGivenMonth else Startdate end NewStDT
,case when Enddate>@LastDayofGivenMonth then @LastDayofGivenMonth else Enddate end NewEDT
from @Vacationtbl
-- skip vacations that do not touch the given month at all
where Startdate<=@LastDayofGivenMonth and Enddate>=@firstDayofGivenMonth
)
SELECT
ID,
SUM(DATEDIFF(DAY, NewStDT, NewEDT) + 1) AS Days
FROM
CTE
GROUP BY
ID

For every row with data I need a row for each category

I have timesheet data that I need to create a report for by date range. I need a row for each person, for each day, and for each time type. If there's no entry for that time type on a given day, I want NULL data. I've tried a left join, but it doesn't seem to be working. A cross join gives erroneous data.
The tables I have are a Person table (personID, Name), a TimeLog table (TimeLogID, StartDate, EndDate, TimeLogTypeID), and a TimeLogType table (TimeLogTypeID, PersonID, Description, DeletedInd)
All I can get in the result set is the rows with data, and not the empty rows for each TimeLogType
Here's what I have so far:
DECLARE
@startDate DATE,
@endDate DATE
SET @startDate = '2014-05-01'
SET @endDate = '2014-05-30'
SELECT
CONVERT(DATE, TimeLog.StartDateTime, 101) AS TimeLogDay,
SUM(dbo.fnCalculateHoursAsDecimal(TimeLog.StartDateTime, TimeLog.EndDateTime)) AS Hours,
TimeLog.PersonID,
TimeLog.TimeLogTypeID
INTO #HourTable
FROM
TimeLog
WHERE
TimeLog.StartDateTime BETWEEN @startDate AND @endDate
GROUP BY
CONVERT(DATE, TimeLog.StartDateTime, 101),
TimeLog.TimeLogTypeID,
TimeLog.PersonID
SELECT
TimeLogType.Description,
#HourTable.*
FROM
TimeLogType LEFT JOIN
#HourTable ON TimeLogType.TimeLogTypeID = #HourTable.TimeLogTypeID
WHERE
ISNULL(TimeLogType.DeletedInd, 0) = 0
ORDER BY
PersonID, TimeLogDay, TimeLogType.TimeLogTypeID
The data goes something like this:
TimeLogType:
1, Billable
2, Non-Billable
Person:
1, Billy
2, Tom
TimeLog:
1, 1, 2014-05-01 08:00:00, 2014-05-01 09:00:00, 1, 0
2, 1, 2014-05-01 09:00:00, 2014-05-01 10:00:00, 1, 0
3, 2, 2014-05-01 08:00:00, 2014-05-01 08:30:00, 2, 0
4, 2, 2014-05-01 08:30:00, 2014-05-01 09:00:00, 1, 0
5, 1, 2014-05-02 08:00:00, 2014-05-02 09:00:00, 2, 0
Expected Output: (order by person, date, timelog type)
Day, Person, Bill Type, Total Hours
2014-05-01, Billy, Billable, 2.0
2014-05-01, Billy, Non-Billable, NULL
2014-05-02, Billy, Billable, 1.0
2014-05-02, Billy, Non-Billable, NULL
etc...
2014-05-01, Tom, Billable, 0.5
2014-05-01, Tom, Non-Billable, 0.5
etc...
You need to generate all the combinations first and then use left join to bring in the information you want. I think the query is like this:
with dates as (
select dateadd(day, v.number, r.mind) as thedate
from (select cast(min(StartDateTime) as date) as mind, cast(max(EndDateTime) as date) as endd
from TimeLog
) r join
master..spt_values v
on v.type = 'P' and dateadd(day, v.number, r.mind) <= r.endd
)
select p.PersonId, tlt.TimeLogTypeId, d.thedate
from Person p cross join
(select t.* from TimeLogType t where ISNULL(t.DeletedInd, 0) = 0
) tlt cross join
dates d left join
TimeLog tl
on tl.PersonID = p.PersonId and tl.TimeLogTypeId = tlt.TimeLogTypeId and
d.thedate >= cast(tl.StartDateTime as date) and d.thedate <= cast(tl.EndDateTime as date)
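To see the "all combinations first, then left join" pattern in isolation, here is a self-contained sketch using table variables loaded with the question's sample rows. The table variables, the two hard-coded days, and the DATEDIFF-based hours calculation are assumptions for illustration; the hours calculation stands in for the asker's dbo.fnCalculateHoursAsDecimal, which is not shown in the thread:

```sql
-- Hypothetical miniature versions of the tables described in the question.
DECLARE @Person TABLE (PersonID int, Name varchar(20));
DECLARE @TimeLogType TABLE (TimeLogTypeID int, Description varchar(20));
DECLARE @TimeLog TABLE (PersonID int, TimeLogTypeID int,
                        StartDateTime datetime, EndDateTime datetime);

INSERT INTO @Person VALUES (1, 'Billy'), (2, 'Tom');
INSERT INTO @TimeLogType VALUES (1, 'Billable'), (2, 'Non-Billable');
INSERT INTO @TimeLog VALUES
    (1, 1, '2014-05-01T08:00:00', '2014-05-01T09:00:00'),
    (1, 1, '2014-05-01T09:00:00', '2014-05-01T10:00:00'),
    (2, 2, '2014-05-01T08:00:00', '2014-05-01T08:30:00'),
    (2, 1, '2014-05-01T08:30:00', '2014-05-01T09:00:00'),
    (1, 2, '2014-05-02T08:00:00', '2014-05-02T09:00:00');

WITH days AS (
    -- Two hard-coded days keep the sketch short; a generated range works the same way.
    SELECT CAST('20140501' AS date) AS TheDay
    UNION ALL
    SELECT CAST('20140502' AS date)
)
SELECT d.TheDay, p.Name, t.Description,
       -- Stand-in for dbo.fnCalculateHoursAsDecimal; NULL when nothing was logged.
       SUM(DATEDIFF(MINUTE, tl.StartDateTime, tl.EndDateTime)) / 60.0 AS Hours
FROM @Person p
CROSS JOIN @TimeLogType t          -- every person with every time type...
CROSS JOIN days d                  -- ...on every day
LEFT JOIN @TimeLog tl              -- then attach whatever was actually logged
       ON tl.PersonID = p.PersonID
      AND tl.TimeLogTypeID = t.TimeLogTypeID
      AND tl.StartDateTime >= d.TheDay
      AND tl.StartDateTime <  DATEADD(DAY, 1, d.TheDay)
GROUP BY d.TheDay, p.Name, t.Description, p.PersonID, t.TimeLogTypeID
ORDER BY p.PersonID, d.TheDay, t.TimeLogTypeID;
```

The cross joins produce 2 people x 2 types x 2 days = 8 rows regardless of what was logged. The left join only fills in the Hours column, so the filter on the log rows has to live in the ON clause rather than in WHERE.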
After reading Gordon's answer here's what I've come up with. I created it in steps so I could see what was going on. I created the dates w/o the master..spt_values table. I also created a temp table of people so I could select just the ones that had a TimeLogRecord, and then re-use it to pull in details for the final select. Let me know if there's any way to make this run faster.
DECLARE
@startDate DATE,
@endDate DATE
SET @startDate = '2014-01-01'
SET @endDate = '2014-01-31'
-- create day rows --
;WITH dates(TimeLogDay) AS
(
SELECT @startDate AS TimeLogDay
UNION ALL
SELECT DATEADD(d, 1, TimeLogDay)
FROM dates
WHERE TimeLogDay < @endDate
)
-- create a type row for each day --
SELECT
dates.TimeLogDay,
tlt.TimeLogTypeID
INTO #TypeDate
FROM
dates CROSS JOIN
(SELECT
TimeLogType.TimeLogTypeID
FROM
TimeLogType
WHERE
ISNULL(TimeLogType.DeletedInd, 0) = 0
) AS TLT
-- create a temp person table for reference later --
SELECT * INTO #person FROM Person WHERE Person.personID IN
(SELECT Timelog.PersonID FROM TimeLog WHERE TimeLog.StartDateTime BETWEEN @startDate AND @endDate)
-- sum up the log times and tie in the date/type rows --
SELECT
#TypeDate.TimeLogDay,
#TypeDate.TimeLogTypeID,
#person.PersonID,
SUM(dbo.fnCalculateHoursAsDecimal(TimeLog.StartDateTime, TimeLog.EndDateTime)) AS Hours
INTO #Hours
FROM
#person CROSS JOIN
#TypeDate LEFT JOIN
TimeLog ON
TimeLog.PersonID = #person.PersonID AND
TimeLog.TimeLogTypeID = #TypeDate.TimeLogTypeID AND
#TypeDate.TimeLogDay = CONVERT(DATE, TimeLog.StartDateTime, 101)
GROUP BY
#TypeDate.TimeLogDay,
#TypeDate.TimeLogTypeID,
#person.PersonID
-- now tie in the details to complete --
SELECT
#Hours.TimeLogDay,
TimeLogType.Description,
Person.LastName,
Person.FirstName,
#Hours.Hours
FROM
#Hours LEFT JOIN
Person ON #Hours.PersonID = Person.PersonID LEFT JOIN
TimeLogType ON #Hours.TimeLogTypeID = TimeLogType.TimeLogTypeID
ORDER BY
Person.FirstName,
Person.LastName,
#Hours.TimeLogDay,
TimeLogType.SortOrder

Group a query by every month

I have the following query :
select
(select Sum(Stores) from XYZ where Year = '2013' and Month = '8' )
-
(select Sum(SalesStores) from ABC where Year = '2013' and Month = '8') as difference
In the above query, Year and Month are also columns of the tables.
I would like to know if there is a way to run the same query against every month of the year.
If there are months without data/rows within XYZ or ABC tables then I would use FULL OUTER JOIN:
SELECT ISNULL(x.[Month], y.[Month]) AS [Month],
ISNULL(x.Sum_Stores, 0) - ISNULL(y.Sum_SalesStores, 0) AS Difference
FROM
(
SELECT [Month], Sum(Stores) AS Sum_Stores
FROM XYZ
WHERE [Year] = '2013'
GROUP BY [Month]
) AS x
FULL OUTER JOIN
(
SELECT [Month], Sum(SalesStores) AS Sum_SalesStores
FROM ABC
WHERE [Year] = '2013'
GROUP BY [Month]
) AS y ON x.[Month] = y.[Month]
;WITH Months(Month) AS
(
SELECT 1
UNION ALL
SELECT Month + 1
FROM Months
where Month < 12
)
SELECT '2013' [Year], m.Month, COALESCE(SUM(Stores), 0) - COALESCE(SUM(SalesStores), 0) [Difference]
FROM months m
LEFT JOIN XYZ x ON m.Month = x.Month AND x.[Year] = '2013'
LEFT JOIN ABC a ON a.Month = m.Month AND a.[Year] = '2013'
GROUP BY m.Month
You could use GROUP BY in your inner queries, and then run a join, like this:
SELECT x.[Month], (x.total - COALESCE(a.total, 0)) AS difference
FROM (
SELECT [Month], SUM(Stores) AS total
FROM XYZ WHERE [Year] = '2013'
GROUP BY [Month]
) x
LEFT OUTER JOIN (
SELECT [Month], SUM(SalesStores) AS total
FROM ABC WHERE [Year] = '2013'
GROUP BY [Month]
) a ON x.[Month] = a.[Month]
Note the use of COALESCE. It preserves the value of the first SUM when there are no records for the month in the ABC table.
The following example uses the UNION ALL operator with a CTE:
;WITH cte AS
(SELECT SUM(Stores) AS Stores, [Month]
FROM dbo.XYZ
WHERE [Year] = '2013'
GROUP BY [Month]
UNION ALL
SELECT -1.00 * SUM(SalesStores), [Month]
FROM dbo.ABC
WHERE [Year] = '2013'
GROUP BY [Month]
)
SELECT [Month], SUM(Stores) AS Difference
FROM cte
GROUP BY [Month]
Demo on SQLFiddle
;WITH Months(Month) AS
(
SELECT 1
UNION ALL
SELECT Month + 1
FROM Months
where Month < 12
)
SELECT Months.Month,
(select isnull(Sum(Stores),0) from XYZ where Year = '2013' and Month = Months.Month) - (select isnull(Sum(SalesStores),0) from ABC where Year = '2013' and Month =Months.Month) as difference
FROM Months

How to count open records, grouped by hour and day in SQL-server-2008-r2

I have hospital patient admission data in Microsoft SQL Server 2008 R2 that looks something like this:
PatientID   AdmitDate        DischargeDate
Jones       1-jan-13 01:37   1-jan-13 17:45
Smith       1-jan-13 02:12   2-jan-13 02:14
Brooks      4-jan-13 13:54   5-jan-13 06:14
I would like to count the number of patients in the hospital day by day and hour by hour, i.e.:
1-jan-13 00:00   0
1-jan-13 01:00   0
1-jan-13 02:00   1
1-jan-13 03:00   2
And I need to include the hours when there are no patients admitted in the result.
I can't create tables so making a reference table listing all the hours and days is out, though.
Any suggestions?
To solve this problem, you need a list of date-hours. The following gets this from the admit dates cross joined to a table with 24 hours. The table of 24 hours is calculated from information_schema.columns, a trick for getting small sequences of numbers in SQL Server.
The rest is just a join between this table and the admissions; a left join keeps the hours in which no patient is present. This version counts the patients at the hour, so someone admitted and discharged in the same hour, for instance, is not counted. And in general someone is not counted until the next hour after they are admitted:
with dh as (
select DATEADD(hour, seqnum - 1, thedatehour ) as DateHour
from (select distinct cast(cast(AdmitDate as DATE) as datetime) as thedatehour
from Admissions a
) a cross join
(select ROW_NUMBER() over (order by (select NULL)) as seqnum
from INFORMATION_SCHEMA.COLUMNS
) hours
where hours.seqnum <= 24
)
select dh.DateHour, COUNT(a.PatientID) as NumPatients
from dh left join
Admissions a
on dh.DateHour between a.AdmitDate and a.DischargeDate
group by dh.DateHour
order by 1
This also assumes that there are admissions on every day. That seems like a reasonable assumption. If not, a calendar table would be a big help.
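If that assumption does not hold and no tables can be created, one option is to generate a contiguous series of hours at query time. The sketch below does this with a recursive CTE over the question's sample rows held in a table variable. @Admissions, @firstHour and @lastHour are illustrative names, and the hour range (midnight of the first admit day through the hour of the last discharge) is an assumption about what the report should cover:

```sql
-- Sample rows copied from the question.
DECLARE @Admissions TABLE (PatientID varchar(10), AdmitDate datetime, DischargeDate datetime);
INSERT INTO @Admissions VALUES
    ('Jones',  '2013-01-01T01:37:00', '2013-01-01T17:45:00'),
    ('Smith',  '2013-01-01T02:12:00', '2013-01-02T02:14:00'),
    ('Brooks', '2013-01-04T13:54:00', '2013-01-05T06:14:00');

DECLARE @firstHour datetime, @lastHour datetime;
SELECT @firstHour = DATEADD(DAY,  DATEDIFF(DAY,  0, MIN(AdmitDate)), 0),     -- midnight of the first admit day
       @lastHour  = DATEADD(HOUR, DATEDIFF(HOUR, 0, MAX(DischargeDate)), 0)  -- top of the last discharge hour
FROM @Admissions;

WITH hrs AS (
    SELECT @firstHour AS DateHour
    UNION ALL
    SELECT DATEADD(HOUR, 1, DateHour) FROM hrs WHERE DateHour < @lastHour
)
SELECT h.DateHour, COUNT(a.PatientID) AS NumPatients
FROM hrs h
LEFT JOIN @Admissions a
       ON h.DateHour BETWEEN a.AdmitDate AND a.DischargeDate
GROUP BY h.DateHour
ORDER BY h.DateHour
OPTION (MAXRECURSION 0);   -- the series exceeds the default limit of 100 recursions
```

For the sample data, the first four hours of 1-jan-13 return 0, 0, 1 and 2, matching the output asked for in the question, and the empty days of 2 to 4 January appear with a count of 0.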
Here is one (ugly) way:
;WITH DayHours AS
(
SELECT 0 DayHour
UNION ALL
SELECT DayHour+1
FROM DayHours
WHERE DayHour+1 <= 23
)
SELECT B.AdmitDate, A.DayHour, COUNT(DISTINCT PatientID) Patients
FROM DayHours A
CROSS JOIN (SELECT DISTINCT CONVERT(DATE,AdmitDate) AdmitDate
FROM YourTable) B
LEFT JOIN YourTable C
ON B.AdmitDate = CONVERT(DATE,C.AdmitDate)
AND A.DayHour = DATEPART(HOUR,C.AdmitDate)
GROUP BY B.AdmitDate, A.DayHour
This is a bit messy and includes a temp table with the test data you provided, but here it is:
CREATE TABLE #HospitalPatientData (PatientId NVARCHAR(MAX), AdmitDate DATETIME, DischargeDate DATETIME)
INSERT INTO #HospitalPatientData
SELECT 'Jones.', '1-jan-13 01:37:00.000', '1-jan-13 17:45:00.000' UNION
SELECT 'Smith', '1-jan-13 02:12:00.000', '2-jan-13 02:14:00.000' UNION
SELECT 'Brooks.', '4-jan-13 13:54:00.000', '5-jan-13 06:14:00.000'
;WITH DayHours AS
(
SELECT 0 DayHour
UNION ALL
SELECT DayHour+1
FROM DayHours
WHERE DayHour+1 <= 23
),
HospitalPatientData AS
(
SELECT CONVERT(nvarchar(max),AdmitDate,103) as AdmitDate ,DATEPART(hour,(AdmitDate)) as AdmitHour, COUNT(PatientID) as CountOfPatients
FROM #HospitalPatientData
GROUP BY CONVERT(nvarchar(max),AdmitDate,103), DATEPART(hour,(AdmitDate))
),
Results AS
(
SELECT MAX(h.AdmitDate) as Date, d.DayHour
FROM HospitalPatientData h
INNER JOIN DayHours d ON d.DayHour=d.DayHour
GROUP BY AdmitDate, CountOfPatients, DayHour
)
SELECT r.*, COUNT(h.PatientId) as CountOfPatients
FROM Results r
LEFT JOIN #HospitalPatientData h ON CONVERT(nvarchar(max),AdmitDate,103)=r.Date AND DATEPART(HOUR,h.AdmitDate)=r.DayHour
GROUP BY r.Date, r.DayHour
ORDER BY r.Date, r.DayHour
DROP TABLE #HospitalPatientData
This may get you started:
BEGIN TRAN
DECLARE @pt TABLE
(
PatientID VARCHAR(10)
, AdmitDate DATETIME
, DischargeDate DATETIME
)
INSERT INTO @pt
( PatientID, AdmitDate, DischargeDate )
VALUES ( 'Jones', '1-jan-13 01:37', '1-jan-13 17:45' ),
( 'Smith', '1-jan-13 02:12', '2-jan-13 02:14' )
, ( 'Brooks', '4-jan-13 13:54', '5-jan-13 06:14' )
DECLARE @StartDate DATETIME = '20130101'
, @FutureDays INT = 7
;
WITH dy
AS ( SELECT TOP (@FutureDays)
ROW_NUMBER() OVER ( ORDER BY name ) dy
FROM sys.columns c
) ,
hr
AS ( SELECT TOP 24
ROW_NUMBER() OVER ( ORDER BY name ) hr
FROM sys.columns c
)
SELECT refDate, COUNT(p.PatientID) AS PtCount
FROM ( SELECT DATEADD(HOUR, hr.hr - 1,
DATEADD(DAY, dy.dy - 1, @StartDate)) AS refDate
FROM dy
CROSS JOIN hr
) ref
LEFT JOIN @pt p ON ref.refDate BETWEEN p.AdmitDate AND p.DischargeDate
GROUP BY refDate
ORDER BY refDate
ROLLBACK

SQL calculate number of days of residency by month, by user, by location

I'm working on a query for a rehab organization where tenants (client/patients) live in a building when they first arrive, as they progress in their treatment they move to another building and as they near the end of treatment they are in a third building.
For funding purposes we need to know how many nights a tenant spent in each building in each month.
I can use DateDiff to get the total number of nights, but how do I get the total for each client in each month in each building?
For example, John Smith is in Building A from 9/12 to 11/3, moves to Building B from 11/3 to 11/15, and moves to Building C on 11/15, where he still is today.
What query returns a result that shows the number of nights he spent in:
Building A in September, October and November.
Building B in November
Building C in November
Two tables hold the client's name, building name and move-in date and move-out date
CREATE TABLE [dbo].[clients](
[ID] [nvarchar](50) NULL,
[First_Name] [nvarchar](100) NULL,
[Last_Name] [nvarchar](100) NULL
) ON [PRIMARY]
--populate w/ two records
insert into clients (ID,First_name, Last_name)
values ('A2938', 'John', 'Smith')
insert into clients (ID,First_name, Last_name)
values ('A1398', 'Mary', 'Jones')
CREATE TABLE [dbo].[Buildings](
[ID_U] [nvarchar](50) NULL,
[Move_in_Date_Building_A] [datetime] NULL,
[Move_out_Date_Building_A] [datetime] NULL,
[Move_in_Date_Building_B] [datetime] NULL,
[Move_out_Date_Building_B] [datetime] NULL,
[Move_in_Date_Building_C] [datetime] NULL,
[Move_out_Date_Building_C] [datetime] NULL,
[Building_A] [nvarchar](50) NULL,
[Building_B] [nvarchar](50) NULL,
[Building_C] [nvarchar](50) NULL
) ON [PRIMARY]
-- Populate the tables with two records
insert into buildings (ID_U,Move_in_Date_Building_A,Move_out_Date_Building_A, Move_in_Date_Building_B,
Move_out_Date_Building_B, Move_in_Date_Building_C, Building_A, Building_B, Building_C)
VALUES ('A2938','2010-9-12', '2010-11-3','2010-11-3','2010-11-15', '2010-11-15', 'Kalgan', 'Rufus','Waylon')
insert into buildings (ID_U,Move_in_Date_Building_A,Building_A)
VALUES ('A1398','2010-10-6', 'Kalgan')
Thanks for your help.
I'd use a properly normalized database schema, your Buildings table is not useful like this. After splitting it up I believe that getting your answer will be pretty easy.
Edit (and updated): Here's a CTE which will take this strange table structure and split it into a more normalized form, displaying the user id, building name, move in and move out dates. By grouping on the ones you want (and using DATEPART() etc.) you should be able to get the data you need with that.
WITH User_Stays AS (
SELECT
ID_U,
Building_A Building,
Move_in_Date_Building_A Move_In,
COALESCE(Move_out_Date_Building_A, CASE WHEN ((Move_in_Date_Building_B IS NULL) OR (Move_in_Date_Building_C<Move_in_Date_Building_B)) AND (Move_in_Date_Building_C>Move_in_Date_Building_A) THEN Move_in_Date_Building_C WHEN Move_in_Date_Building_B>=Move_in_Date_Building_A THEN Move_in_Date_Building_B END, GETDATE()) Move_Out
FROM dbo.Buildings
WHERE Move_in_Date_Building_A IS NOT NULL
UNION ALL
SELECT
ID_U,
Building_B,
Move_in_Date_Building_B,
COALESCE(Move_out_Date_Building_B, CASE WHEN ((Move_in_Date_Building_A IS NULL) OR (Move_in_Date_Building_C<Move_in_Date_Building_A)) AND (Move_in_Date_Building_C>Move_in_Date_Building_B) THEN Move_in_Date_Building_C WHEN Move_in_Date_Building_A>=Move_in_Date_Building_B THEN Move_in_Date_Building_A END, GETDATE())
FROM dbo.Buildings
WHERE Move_in_Date_Building_B IS NOT NULL
UNION ALL
SELECT
ID_U,
Building_C,
Move_in_Date_Building_C,
COALESCE(Move_out_Date_Building_C, CASE WHEN ((Move_in_Date_Building_B IS NULL) OR (Move_in_Date_Building_A<Move_in_Date_Building_B)) AND (Move_in_Date_Building_A>Move_in_Date_Building_C) THEN Move_in_Date_Building_A WHEN Move_in_Date_Building_B>=Move_in_Date_Building_C THEN Move_in_Date_Building_B END, GETDATE())
FROM dbo.Buildings
WHERE Move_in_Date_Building_C IS NOT NULL
)
SELECT *
FROM User_Stays
ORDER BY ID_U, Move_In
This query run on your sample data produces the following output:
ID_U Building Move_In Move_Out
-------- ----------- ----------------------- -----------------------
A1398 Kalgan 2010-10-06 00:00:00.000 2010-11-23 18:35:59.050
A2938 Kalgan 2010-09-12 00:00:00.000 2010-11-03 00:00:00.000
A2938 Rufus 2010-11-03 00:00:00.000 2010-11-15 00:00:00.000
A2938 Waylon 2010-11-15 00:00:00.000 2010-11-23 18:35:59.050
(4 row(s) affected)
As you can see, from here on it will be much easier to isolate the days per patient or building, and also to find the records for specific months and calculate the correct stay duration in that case. Note that the CTE displays the current date for patients which are still in a building.
Edit (again): In order to get all months including their start and end dates for all relevant years, you can use a CTE like this:
WITH User_Stays AS (
[...see above...]
)
,
Months AS (
SELECT m.IX,
y.[Year], dateadd(month,(12*y.[Year])-22801+m.ix,0) StartDate, dateadd(second, -1, dateadd(month,(12*y.[Year])-22800+m.ix,0)) EndDate
FROM (
SELECT 1 IX UNION ALL
SELECT 2 UNION ALL
SELECT 3 UNION ALL
SELECT 4 UNION ALL
SELECT 5 UNION ALL
SELECT 6 UNION ALL
SELECT 7 UNION ALL
SELECT 8 UNION ALL
SELECT 9 UNION ALL
SELECT 10 UNION ALL
SELECT 11 UNION ALL
SELECT 12
)
m
CROSS JOIN (
SELECT Datepart(YEAR, us.Move_In) [Year]
FROM User_Stays us UNION
SELECT Datepart(YEAR, us.Move_Out)
FROM User_Stays us
)
y
)
SELECT *
FROM months;
So since we now have a tabular representation of all date ranges which can be of interest, we simply join this together:
WITH User_Stays AS ([...]),
Months AS ([...])
SELECT m.[Year],
DATENAME(MONTH, m.StartDate) [Month],
us.ID_U,
us.Building,
DATEDIFF(DAY, CASE WHEN us.Move_In>m.StartDate THEN us.Move_In ELSE m.StartDate END, CASE WHEN us.Move_Out<m.EndDate THEN us.Move_Out ELSE DATEADD(DAY, -1, m.EndDate) END) Days
FROM Months m
JOIN User_Stays us ON (us.Move_In < m.EndDate) AND (us.Move_Out >= m.StartDate)
ORDER BY m.[Year],
us.ID_U,
m.Ix,
us.Move_In
Which finally produces this output:
Year Month ID_U Building Days
----------- ------------ -------- ---------- -----------
2010 October A1398 Kalgan 25
2010 November A1398 Kalgan 22
2010 September A2938 Kalgan 18
2010 October A2938 Kalgan 30
2010 November A2938 Kalgan 2
2010 November A2938 Rufus 12
2010 November A2938 Waylon 8
-- set the dates for which month you want
Declare @startDate datetime
declare @endDate datetime
set @StartDate = '09/01/2010'
set @EndDate = '09/30/2010'
select
-- determine if the stay occurred during this month
Case When @StartDate <= Move_out_Date_Building_A and @EndDate >= Move_in_Date_Building_A
Then
(DateDiff(d, @StartDate , @enddate+1)
)
-- drop the days off the front
- (Case When @StartDate < Move_in_Date_Building_A
Then datediff(d, @StartDate, Move_in_Date_Building_A)
Else 0
End)
-- drop the days off the end
- (Case When @EndDate > Move_out_Date_Building_A
Then datediff(d, Move_out_Date_Building_A, @EndDate)
Else 0
End)
Else 0
End AS Building_A_Days_Stayed
from Clients c
inner join Buildings b
on c.id = b.id_u
from Clients c
inner join Buildings b
on c.id = b.id_u
Try using a date table. For example, you could create one like so:
CREATE TABLE Dates
(
[date] datetime,
[year] smallint,
[month] tinyint,
[day] tinyint
)
INSERT INTO Dates(date)
SELECT dateadd(yy, 100, cast(row_number() over(order by s1.object_id) as datetime))
FROM sys.objects s1
CROSS JOIN sys.objects s2
UPDATE Dates
SET [year] = year(date),
[month] = month(date),
[day] = day(date)
Just modify the initial Dates population to meet your needs (on my test instance, the above yielded dates from 2000-01-02 to 2015-10-26). With a dates table, the query is pretty straightforward, something like this:
select c.First_name, c.Last_name,
b.Building_A BuildingName, dA.year, dA.month, count(distinct dA.day) daysInBuilding
from clients c
join Buildings b on c.ID = b.ID_U
left join Dates dA on dA.date between b.Move_in_Date_Building_A and isnull(b.Move_out_Date_Building_A, getDate())
group by c.First_name, c.Last_name,
b.Building_A, dA.year, dA.month
UNION
select c.First_name, c.Last_name,
b.Building_B, dB.year, dB.month, count(distinct dB.day)
from clients c
join Buildings b on c.ID = b.ID_U
left join Dates dB on dB.date between b.Move_in_Date_Building_B and isnull(b.Move_out_Date_Building_B, getDate())
group by c.First_name, c.Last_name,
b.Building_B, dB.year, dB.month
UNION
select c.First_name, c.Last_name,
b.Building_C, dC.year, dC.month, count(distinct dC.day)
from clients c
join Buildings b on c.ID = b.ID_U
left join Dates dC on dC.date between b.Move_in_Date_Building_C and isnull(b.Move_out_Date_Building_C, getDate())
group by c.First_name, c.Last_name,
b.Building_C, dC.year, dC.month
If you can't restructure the Building table you can create a query that will normalize it for you and allow for easier calculations:
SELECT ID_U, 'A' as Building, Building_A as Name, Move_in_Date_Building_A as MoveInDate,
Move_out_Date_Building_A As MoveOutDate
FROM Buildings
UNION
SELECT ID_U, 'B', Building_B, Move_in_Date_Building_B, Move_out_Date_Building_B
FROM Buildings
UNION
SELECT ID_U, 'C', Building_C, Move_in_Date_Building_C, Move_out_Date_Building_C
FROM Buildings