Get records based on Current Date and Configured Data - sql

I am working on automating the SMS sending part in my web application.
SQL Fiddle Link
The DurationType table stores whether the SMS should be sent at an interval of hours, days, weeks, or months. It is referenced by SMSConfiguration.
CREATE TABLE [dbo].[DurationType](
[Id] [int] NOT NULL PRIMARY KEY,
[DurationType] VARCHAR(10) NOT NULL
)
The Bookings table contains the original booking data. For each booking I need to send an SMS based on the configuration.
The SMSConfiguration table defines the configuration for sending automatic SMS messages. An SMS can be sent before or after the booking: Before=0, After=1. DurationType can be Hours=1, Days=2, Weeks=3, Months=4.
Now I need to find the list of bookings that need an SMS sent at the current time, based on the SMS configuration set.
Tried SQL using UNION
DECLARE @currentTime smalldatetime = '2014-07-12 11:15:00'
-- 'SMS CONFIGURED FOR HOURS BASIS'
SELECT B.Id AS BookingId,
B.StartTime AS BookingStartTime, @currentTime AS CurrentTime, SMS.SMSText
FROM Bookings B INNER JOIN
SMSConfiguration SMS ON SMS.CategoryId = B.CategoryId OR SMS.CategoryId IS NULL
WHERE (DATEDIFF(HOUR, @currentTime, B.StartTime) = SMS.Duration AND SMS.DurationType=1 AND BeforeAfter=0)
OR
(DATEDIFF(HOUR, B.StartTime, @currentTime) = SMS.Duration AND SMS.DurationType=1 AND BeforeAfter=1)
--'SMS CONFIGURED FOR DAYS BASIS'
UNION
SELECT B.Id AS BookingId,
B.StartTime AS BookingStartTime, @currentTime AS CurrentTime, SMS.SMSText
FROM Bookings B INNER JOIN
SMSConfiguration SMS ON SMS.CategoryId = B.CategoryId OR SMS.CategoryId IS NULL
WHERE (DATEDIFF(DAY, @currentTime, B.StartTime) = SMS.Duration AND SMS.DurationType=2 AND BeforeAfter=0)
OR
(DATEDIFF(DAY, B.StartTime, @currentTime) = SMS.Duration AND SMS.DurationType=2 AND BeforeAfter=1)
--'SMS CONFIGURED FOR WEEKS BASIS'
UNION
SELECT B.Id AS BookingId,
B.StartTime AS BookingStartTime, @currentTime AS CurrentTime, SMS.SMSText
FROM Bookings B INNER JOIN
SMSConfiguration SMS ON SMS.CategoryId = B.CategoryId OR SMS.CategoryId IS NULL
WHERE (DATEDIFF(DAY, @currentTime, B.StartTime)/7 = SMS.Duration AND SMS.DurationType=3 AND BeforeAfter=0)
OR
(DATEDIFF(DAY, B.StartTime, @currentTime)/7 = SMS.Duration AND SMS.DurationType=3 AND BeforeAfter=1)
--'SMS CONFIGURED FOR MONTHS BASIS'
UNION
SELECT B.Id AS BookingId,
B.StartTime AS BookingStartTime, @currentTime AS CurrentTime, SMS.SMSText
FROM Bookings B INNER JOIN
SMSConfiguration SMS ON SMS.CategoryId = B.CategoryId OR SMS.CategoryId IS NULL
WHERE (dbo.FullMonthsSeparation(@currentTime, B.StartTime) = SMS.Duration AND SMS.DurationType=4 AND BeforeAfter=0)
OR
(dbo.FullMonthsSeparation(B.StartTime, @currentTime) = SMS.Duration AND SMS.DurationType=4 AND BeforeAfter=1)
Result
Problem:
The SQL procedure will run every 15 minutes. The current query keeps returning the days/weeks/months records on every run, for example at '2014-07-12 11:30:00', '2014-07-12 11:45:00', etc.
I want a single query that takes care of all Hours/Days/Weeks/Months calculations, and I should get each record only once, when it reaches the correct time. Otherwise I will be sending the SMS again and again every 15 minutes for the matching day/week/month records.
It should handle the following scenarios:
Hour: if the booking is at 10:15 A.M. and the configuration is 1 hour before, send at 9:15 A.M. the same day.
Day (24-hour difference): if the booking is at 10:15 A.M. and the configuration is 3 days after, send at 10:15 A.M. on the third day.
Week: if the booking is at 10:15 A.M. today (Wednesday) and the configuration is 2 weeks after, send at 10:15 A.M. 14 days later.
Month: same logic as above.
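The repeats follow from how DATEDIFF works: it counts the datepart boundaries crossed between the two values, not elapsed time, so DATEDIFF(DAY, ...) returns the same number for every run on a given calendar day. Below is a minimal Python sketch of that behaviour. It is written in Python only so it can be run anywhere; the helper names datediff_day and datediff_hour are illustrative stand-ins, not SQL Server functions.

```python
from datetime import datetime

def datediff_day(start: datetime, end: datetime) -> int:
    """Mimic T-SQL DATEDIFF(DAY, start, end): midnights crossed, time of day ignored."""
    return (end.date() - start.date()).days

def datediff_hour(start: datetime, end: datetime) -> int:
    """Mimic T-SQL DATEDIFF(HOUR, start, end): hour boundaries crossed."""
    def floor_hour(t: datetime) -> datetime:
        return t.replace(minute=0, second=0, microsecond=0)
    return int((floor_hour(end) - floor_hour(start)).total_seconds() // 3600)

booking_start = datetime(2014, 7, 15, 10, 15)
runs = [datetime(2014, 7, 12, 11, 15),
        datetime(2014, 7, 12, 11, 30),
        datetime(2014, 7, 12, 11, 45)]

# Every 15-minute run on 12 July sees the same day difference,
# so a "3 days before" rule matches on each of those runs.
day_diffs = [datediff_day(now, booking_start) for now in runs]
print(day_diffs)
```

Comparing the current time against an exact computed send time, rather than against a day count, is one way to avoid the repeated matches.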

Try this simplified version (FIDDLE): I removed the UNION and used OR conditions.
LIST FOR HOURS - RUN Every 15 mins
DECLARE @currentTime smalldatetime = '2014-07-12 11:15:00'
SELECT B.Id AS BookingId, C.Id, C.Name,
B.StartTime AS BookingStartTime, @currentTime AS CurrentTime, SMS.SMSText
FROM Bookings B INNER JOIN
SMSConfiguration SMS ON SMS.CategoryId = B.CategoryId OR SMS.CategoryId IS NULL
LEFT JOIN Category C ON C.Id=B.CategoryId
WHERE
(DATEDIFF(MINUTE, @currentTime, B.StartTime) = SMS.Duration*60 AND SMS.DurationType=1 AND BeforeAfter=0)
OR
(DATEDIFF(MINUTE, B.StartTime, @currentTime) = SMS.Duration*60 AND SMS.DurationType=1 AND BeforeAfter=1)
Order BY B.Id
GO
LIST FOR DAYS/WEEKS/MONTHS - RUN Every Day Morning Once
DECLARE @currentTime smalldatetime = '2014-07-12 08:00:00'
SELECT B.Id AS BookingId, C.Id, C.Name,
B.StartTime AS BookingStartTime, @currentTime AS CurrentTime, SMS.SMSText
FROM Bookings B INNER JOIN
SMSConfiguration SMS ON SMS.CategoryId = B.CategoryId OR SMS.CategoryId IS NULL
LEFT JOIN Category C ON C.Id=B.CategoryId
WHERE
(((DATEDIFF(DAY, @currentTime, B.StartTime) = SMS.Duration AND SMS.DurationType=2)
OR (DATEDIFF(DAY, @currentTime, B.StartTime) = SMS.Duration*7 AND SMS.DurationType=3)
OR (DATEDIFF(DAY, @currentTime, B.StartTime) = SMS.Duration*30 AND SMS.DurationType=4))
AND BeforeAfter=0)
OR
-- "After" rules measure from the booking start to the current time
(((DATEDIFF(DAY, B.StartTime, @currentTime) = SMS.Duration AND SMS.DurationType=2)
OR (DATEDIFF(DAY, B.StartTime, @currentTime) = SMS.Duration*7 AND SMS.DurationType=3)
OR (DATEDIFF(DAY, B.StartTime, @currentTime) = SMS.Duration*30 AND SMS.DurationType=4))
AND BeforeAfter=1)
Order BY B.Id

You seem to have forgotten to pick some columns to return.
DECLARE @currentTime smalldatetime = '2014-07-12 09:15:00';
SELECT [col1], [col2], [etcetera]
FROM
Bookings B
INNER JOIN SMSConfiguration SMS ON SMS.CategoryId=B.CategoryId
WHERE DATEDIFF(hour,B.StartTime,@currentTime)=1

This works:
DECLARE @currentTime smalldatetime
SET @currentTime = '2014-07-12 09:15:00'
SELECT B.CategoryId FROM
Bookings B
INNER JOIN SMSConfiguration SMS ON SMS.CategoryId=B.CategoryId
WHERE DATEDIFF(hh,B.StartTime,@currentTime)=1
There were two problems in your SQL Fiddle:
1) You didn't set the value of @currentTime.
2) There were no columns in your SELECT statement.
Note: I used B.CategoryId as the column to fetch; change it as your requirements dictate.

Just swap the dates in the DATEDIFF function.
Change DATEDIFF(hour, B.StartTime, @currentTime) to DATEDIFF(hour, @currentTime, B.StartTime) (for HOUR only),
because your version returns a negative value (-1).
DECLARE @currentTime smalldatetime = '2014-07-12 09:15:00'
SELECT B.Id AS BookingId, @currentTime AS CurrentTime,
B.StartTime AS BookingStartTime, SMS.SMSText
FROM Bookings B INNER JOIN
SMSConfiguration SMS ON SMS.CategoryId = B.CategoryId OR SMS.CategoryId IS NULL
WHERE (SMS.CategoryId IS NOT NULL AND
DATEDIFF(HOUR, @currentTime, B.StartTime) = SMS.Duration) OR
(SMS.CategoryId IS NULL AND
(DATEDIFF(DAY, B.StartTime, @currentTime) = SMS.Duration OR
DATEDIFF(MONTH, B.StartTime, @currentTime) = SMS.Duration))

You can use nested CASE expressions to get your results in a single query. Please try this:
DECLARE @currentTime smalldatetime = '2014-07-12 11:15:00'
SELECT B.Id AS BookingId,
SMS.Duration,
B.StartTime AS BookingStartTime,
@currentTime AS CurrentTime,
SMS.SMSText
FROM Bookings B INNER JOIN
SMSConfiguration SMS
ON SMS.CategoryId = B.CategoryId
OR SMS.CategoryId IS NULL
WHERE
SMS.Duration =
CASE
WHEN SMS.DurationType = 1 THEN --'SMS CONFIGURED FOR HOURS BASIS'
CASE WHEN BeforeAfter = 0 THEN
DATEDIFF(HOUR, @currentTime, B.StartTime)
ELSE
DATEDIFF(HOUR, B.StartTime, @currentTime)
END
WHEN SMS.DurationType = 2 THEN --'SMS CONFIGURED FOR DAY BASIS'
CASE WHEN BeforeAfter = 0 THEN
DATEDIFF(DAY, @currentTime, B.StartTime)
ELSE
DATEDIFF(DAY, B.StartTime, @currentTime)
END
WHEN SMS.DurationType = 3 THEN --'SMS CONFIGURED FOR WEEK BASIS'
CASE WHEN BeforeAfter = 0 THEN
DATEDIFF(DAY, @currentTime, B.StartTime)/7
ELSE
DATEDIFF(DAY, B.StartTime, @currentTime)/7
END
ELSE
CASE WHEN BeforeAfter = 0 THEN -- 'SMS CONFIGURED FOR MONTH BASIS'
dbo.FullMonthsSeparation(@currentTime, B.StartTime)
ELSE
dbo.FullMonthsSeparation(B.StartTime, @currentTime)
END
END
ORDER BY BOOKINGID;
You can automate this with two schedules: one that fetches bookings for the hourly SMS rules and another for the daily/weekly/monthly rules.

Try:
DECLARE @currentTime smalldatetime = '2014-07-12 09:15:00'
SELECT B.Id AS BookingId,
B.StartTime AS BookingStartTime,
@currentTime AS CURRENT_DT,
SMS.SMSText
FROM Bookings B INNER JOIN
SMSConfiguration SMS ON SMS.CategoryId = B.CategoryId OR SMS.CategoryId IS NULL
WHERE ( SMS.DurationType = 1
AND @currentTime = DATEADD(HOUR, (2*SMS.BeforeAfter-1)*SMS.Duration, B.StartTime))
OR
( SMS.DurationType = 2
AND @currentTime = DATEADD(DAY, (2*SMS.BeforeAfter-1)*SMS.Duration, B.StartTime))
OR
( SMS.DurationType = 3
AND @currentTime = DATEADD(WEEK, (2*SMS.BeforeAfter-1)*SMS.Duration, B.StartTime))
OR
( SMS.DurationType = 4
AND @currentTime = DATEADD(MONTH, (2*SMS.BeforeAfter-1)*SMS.Duration, B.StartTime))
The expression 2*SMS.BeforeAfter-1 converts Before (0) to -1 and After (1) to 1, so DATEADD subtracts or adds the duration relative to the booking's StartTime.
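To make the sign trick concrete, here is a minimal Python sketch of the same calculation, written in Python only so that it can be run outside SQL Server. The names send_time and add_months are hypothetical helpers, and add_months is an assumption-level stand-in for DATEADD(MONTH, ...) that clamps the day to the length of the target month.

```python
import calendar
from datetime import datetime, timedelta

def add_months(dt: datetime, months: int) -> datetime:
    """Shift dt by whole months, clamping the day to the target month's length."""
    index = dt.year * 12 + (dt.month - 1) + months
    year, month0 = divmod(index, 12)
    month = month0 + 1
    day = min(dt.day, calendar.monthrange(year, month)[1])
    return dt.replace(year=year, month=month, day=day)

def send_time(start: datetime, before_after: int, duration: int, duration_type: int) -> datetime:
    """Compute when the SMS is due. before_after: 0 = Before, 1 = After."""
    sign = 2 * before_after - 1          # Before (0) -> -1, After (1) -> +1
    offset = sign * duration
    if duration_type == 1:               # Hours
        return start + timedelta(hours=offset)
    if duration_type == 2:               # Days
        return start + timedelta(days=offset)
    if duration_type == 3:               # Weeks
        return start + timedelta(weeks=offset)
    if duration_type == 4:               # Months
        return add_months(start, offset)
    raise ValueError(f"unknown duration type: {duration_type}")

start = datetime(2014, 7, 12, 10, 15)
print(send_time(start, 0, 1, 1))  # one hour before the booking
print(send_time(start, 1, 2, 3))  # two weeks after the booking
```

A booking is due exactly when the current run time equals the value returned by send_time, which is why each rule fires on a single 15-minute run only.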
[Edit]
The above solution should work with your data model as is. Below I propose simplifying the solution by changing your data model.
Option 1: Change Before and After meta data
http://sqlfiddle.com/#!3/6e9e9/1
Instead of Before = 0 and After = 1, use Before = -1 and After = 1. Then these can be used to change the direction of any date offset
For example:
INSERT INTO [dbo].[SMSConfiguration]
([CategoryId],[SMSText],[BeforeAfter],[Duration],[DurationType],[IsActive])
VALUES
(1,'Before 1 hour',-1,1,1,1)--
INSERT INTO [dbo].[SMSConfiguration]
([CategoryId],[SMSText],[BeforeAfter],[Duration],[DurationType],[IsActive])
VALUES
(NULL,'After 1 hour (no category id))',1,1,1,1)
Updated query:
DECLARE @currentTime smalldatetime = '2014-07-12 09:15:00'
SELECT B.Id AS BookingId,
B.StartTime AS BookingStartTime,
@currentTime AS CURRENT_DT,
SMS.SMSText
FROM Bookings B INNER JOIN
SMSConfiguration SMS ON SMS.CategoryId = B.CategoryId OR SMS.CategoryId IS NULL
WHERE ( SMS.DurationType = 1
AND @currentTime = DATEADD(HOUR, SMS.BeforeAfter*SMS.Duration, B.StartTime))
OR
( SMS.DurationType = 2
AND @currentTime = DATEADD(DAY, SMS.BeforeAfter*SMS.Duration, B.StartTime))
OR
( SMS.DurationType = 3
AND @currentTime = DATEADD(WEEK, SMS.BeforeAfter*SMS.Duration, B.StartTime))
OR
( SMS.DurationType = 4
AND @currentTime = DATEADD(MONTH, SMS.BeforeAfter*SMS.Duration, B.StartTime))
Option 2: Allow 30-days = 1 month
http://sqlfiddle.com/#!3/808cd/1
On top of the Before/After change, if a 30 day period is close enough to represent a month, then all duration types could be converted to hours. The question is how particular your customers will be in regard to precision. If you are off by a day or two on a "3 month before" rule, I don't think you will have many complaints.
So your DurationType table could look like this
CREATE TABLE [dbo].[DurationType](
[Id] [int] NOT NULL PRIMARY KEY,
[DurationType] VARCHAR(10) NOT NULL,
[HourConversion] [int] NOT NULL
)
GO
INSERT INTO [dbo].[DurationType]([Id],[DurationType],[HourConversion])
VALUES(1,'Hours',1)
INSERT INTO [dbo].[DurationType]([Id],[DurationType],[HourConversion])
VALUES(2,'Days',24)
INSERT INTO [dbo].[DurationType]([Id],[DurationType],[HourConversion])
VALUES(3,'Weeks',24*7)
INSERT INTO [dbo].[DurationType]([Id],[DurationType],[HourConversion])
VALUES(4,'Months',24*30)
With these changes, the query could be modified to:
DECLARE @currentTime smalldatetime = '2014-07-12 09:15:00'
SELECT B.Id AS BookingId,
B.StartTime AS BookingStartTime,
@currentTime AS CURRENT_DT,
SMS.SMSText
FROM Bookings B
INNER JOIN
SMSConfiguration SMS ON SMS.CategoryId = B.CategoryId OR SMS.CategoryId IS NULL
INNER JOIN
DurationType DT ON DT.Id = SMS.DurationType
WHERE @currentTime = DATEADD(HOUR, SMS.BeforeAfter*SMS.Duration*DT.HourConversion, B.StartTime)
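As a rough illustration of Option 2, the Python sketch below applies the same hour conversion outside the database. The HOUR_CONVERSION mapping mirrors the HourConversion column proposed above; the function name send_time_in_hours is hypothetical, and BeforeAfter is assumed to already be stored as -1 or +1.

```python
from datetime import datetime, timedelta

# Mirrors the proposed DurationType.HourConversion column (a month is treated as 30 days).
HOUR_CONVERSION = {1: 1, 2: 24, 3: 24 * 7, 4: 24 * 30}

def send_time_in_hours(start: datetime, before_after: int, duration: int, duration_type: int) -> datetime:
    """Compute the due time with every duration expressed in hours.

    before_after is assumed to be -1 (Before) or +1 (After), as in Option 1.
    """
    hours = before_after * duration * HOUR_CONVERSION[duration_type]
    return start + timedelta(hours=hours)

start = datetime(2014, 7, 12, 10, 15)
print(send_time_in_hours(start, -1, 1, 1))  # one hour before
print(send_time_in_hours(start, 1, 1, 4))   # "one month" after, i.e. 30 days later
```

The trade-off is the one already noted: a "month" becomes a fixed 30-day offset, so the result can differ from a calendar month by a day or two.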

Related

How can I write my query to exclude accounts that have had a payment posted within the last 45 days?

I'm working in SQL Server Management Studio and I'm trying to identify accounts that have not had a payment posted in the last 45 days. I tried declaring a min and max date with GETDATE() minus 45 days, but that didn't work. I also tried "where last payment !> getdate() -45", but that did not give me the results I was looking for. I need the query to identify only accounts that truly have not had a payment posted in the last 45 days, but I'm not sure how to state this in my WHERE clause. Here's my query:
DECLARE @minDate datetime;
DECLARE @maxDate datetime;
SET @minDate = GETDATE();
SET @maxDate = GETDATE() - 45;
SELECT DISTINCT
a.id,
a.FacilityCode AS [Facility],
a.accountUnit,
a.accountNum AS [Account Number],
a.amountBalance AS [Balance],
a.accountstatus AS [Status],
CONVERT(date, a.accountStatusDate) AS StatusDate,
pmtEntryDate AS [Pmt Entry Date],
v.LastLinkARpmtDate AS [Last Pmt Date],
CONVERT(date, a.accountnextactdate) AS NextActionDate,
CONVERT(date, a.rpcDate) AS RightpartyContactDate,
a.flag AS accountFlag,
pay.flag AS pmtFlag,
a.accounttype
FROM dbo.tbAccount a
JOIN dbo.tbpmtinfo pay ON a.id = pay.id
JOIN dbo.vLastPmtDate v ON v.id = a.id
FULL JOIN (SELECT id,
MAX(deDate) AS MFSLoadDate
FROM dbo.tbDataEvents (NOLOCK)
WHERE deNewVal = 'MFS'
GROUP BY id) m ON m.id = a.id
WHERE pay.flag IN ('D')
AND a.amountBalance > 0
AND a.FacilityCode IN ('PHKY', 'QANM', 'QAOH', 'QBCA', 'QBIL', 'QBTX', 'QCIL', 'QDAL', 'QEWY', 'QFAR', 'QFGA',
'QGIL', 'QHAR', 'QHIL', 'QHTN', 'QLAL', 'QLPA', 'QMIL', 'QMNC', 'QMNM', 'QMNV', 'QMOR',
'QMTN', 'QMTX', 'QMUT', 'QRIL', 'QRKY', 'QSPA', 'QTGA', 'QTKY', 'QUIL', 'QVIL', 'QWIL', 'LHFL')
AND a.amountBalance >= '1000'
AND a.accountStatus IN ('PA', 'PP', 'BPA')
AND a.flag = 'A'
AND a.accountType IN ('Resid', 'Slfpy')
--and LastLinkARpmtDate !> Getdate() -45 and accountNum = '40728597'
AND LastLinkARpmtDate BETWEEN @minDate AND @maxDate;
I would use a WHERE NOT EXISTS clause to eliminate recent activity:
SELECT DISTINCT a.id
,a.FacilityCode [Facility]
,a.accountUnit
,a.accountNum [Account Number]
,a.amountBalance [Balance]
,a.accountstatus [Status]
,convert(DATE, a.accountStatusDate) StatusDate
,pmtEntryDate [Pmt Entry Date]
,v.LastLinkARpmtDate [Last Pmt Date]
,convert(DATE, a.accountnextactdate) NextActionDate
,convert(DATE, a.rpcDate) RightpartyContactDate
,a.flag AS accountFlag
,pay.flag AS pmtFlag
,a.accounttype
FROM dbo.tbAccount AS a
JOIN dbo.tbpmtinfo AS pay ON a.id = pay.id
JOIN dbo.vLastPmtDate AS v ON v.id = a.id
FULL JOIN (
SELECT id
,Max(deDate) AS MFSLoadDate
FROM dbo.tbDataEvents(NOLOCK)
WHERE deNewVal = 'MFS'
GROUP BY id
) m ON m.id = a.id
WHERE pay.flag IN ('D')
AND a.amountBalance > 0
AND a.FacilityCode IN (
'PHKY'
,'QANM'
,'QAOH'
,'QBCA'
,'QBIL'
,'QBTX'
,'QCIL'
,'QDAL'
,'QEWY'
,'QFAR'
,'QFGA'
,'QGIL'
,'QHAR'
,'QHIL'
,'QHTN'
,'QLAL'
,'QLPA'
,'QMIL'
,'QMNC'
,'QMNM'
,'QMNV'
,'QMOR'
,'QMTN'
,'QMTX'
,'QMUT'
,'QRIL'
,'QRKY'
,'QSPA'
,'QTGA'
,'QTKY'
,'QUIL'
,'QVIL'
,'QWIL'
,'LHFL'
)
AND a.amountBalance >= '1000'
AND a.accountStatus IN (
'PA'
,'PP'
,'BPA'
)
AND a.flag = 'A'
AND a.accountType IN (
'Resid'
,'Slfpy'
)
AND NOT EXISTS (
SELECT NULL
FROM vLastPmtDate
WHERE v.id = a.id
AND CAST(v.LastLinkARpmtDate AS DATE) > CAST(dateadd(day, -45, getdate()) AS DATE)
)
AND accountNum = '40728597'
AND LastLinkARpmtDate BETWEEN @minDate
AND @maxDate;
The function DATEADD() is what you need:
DATEADD (datepart , number , date)
SET @maxDate = DATEADD(DAY, -45, GETDATE());
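The underlying rule is simple: compute a cutoff 45 days back and keep only accounts whose last payment is on or before it. A minimal Python sketch of that filter follows, written in Python only so it can be run stand-alone. The function name and the sample account ids are made up for illustration, and treating an account with no payment at all as qualifying is an assumption that the original inner join does not make.

```python
from datetime import date, timedelta

def accounts_without_recent_payment(last_payment_by_account, today, days=45):
    """Return account ids whose last payment is on or before today - days.

    Accounts with no recorded payment (None) are included; that is an
    assumption of this sketch, not something the original query does.
    """
    cutoff = today - timedelta(days=days)
    return sorted(
        account
        for account, last_paid in last_payment_by_account.items()
        if last_paid is None or last_paid <= cutoff
    )

# Hypothetical sample data: account id -> last payment date
sample = {"A": date(2024, 1, 1), "B": date(2024, 3, 1), "C": None}
stale = accounts_without_recent_payment(sample, today=date(2024, 3, 10))
print(stale)
```

In SQL the equivalent test is a single comparison of the last payment date against DATEADD(DAY, -45, GETDATE()); BETWEEN with the larger bound first matches nothing.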

Aggregate for each day over time series, without using non-equijoin logic

Initial Question
Given the following dataset paired with a dates table:
MembershipId | ValidFromDate | ValidToDate
==========================================
0001 | 1997-01-01 | 2006-05-09
0002 | 1997-01-01 | 2017-05-12
0003 | 2005-06-02 | 2009-02-07
How many Memberships were open on any given day or timeseries of days?
Initial Answer
Following this question being asked here, this answer provided the necessary functionality:
select d.[Date]
,count(m.MembershipID) as MembershipCount
from DIM.[Date] as d
left join Memberships as m
on(d.[Date] between m.ValidFromDateKey and m.ValidToDateKey)
where d.CalendarYear = 2016
group by d.[Date]
order by d.[Date];
though a commenter remarked that "there are other approaches when the non-equijoin takes too long."
Followup
As such, what would the equijoin only logic look like to replicate the output of the query above?
Progress So Far
From the answers provided so far I have come up with the below, which outperforms on the hardware I am working with across 3.2 million Membership records:
declare @s date = '20160101';
declare @e date = getdate();
with s as
(
select d.[Date] as d
,count(s.MembershipID) as s
from dbo.Dates as d
join dbo.Memberships as s
on d.[Date] = s.ValidFromDateKey
group by d.[Date]
)
,e as
(
select d.[Date] as d
,count(e.MembershipID) as e
from dbo.Dates as d
join dbo.Memberships as e
on d.[Date] = e.ValidToDateKey
group by d.[Date]
),c as
(
select isnull(s.d,e.d) as d
,sum(isnull(s.s,0) - isnull(e.e,0)) over (order by isnull(s.d,e.d)) as c
from s
full join e
on s.d = e.d
)
select d.[Date]
,c.c
from dbo.Dates as d
left join c
on d.[Date] = c.d
where d.[Date] between @s and @e
order by d.[Date]
;
Following on from that, to split this aggregate into constituent groups per day I have the following, which is also performing well:
declare @s date = '20160101';
declare @e date = getdate();
with s as
(
select d.[Date] as d
,s.MembershipGrouping as g
,count(s.MembershipID) as s
from dbo.Dates as d
join dbo.Memberships as s
on d.[Date] = s.ValidFromDateKey
group by d.[Date]
,s.MembershipGrouping
)
,e as
(
select d.[Date] as d
,e.MembershipGrouping as g
,count(e.MembershipID) as e
from dbo.Dates as d
join dbo.Memberships as e
on d.[Date] = e.ValidToDateKey
group by d.[Date]
,e.MembershipGrouping
),c as
(
select isnull(s.d,e.d) as d
,isnull(s.g,e.g) as g
,sum(isnull(s.s,0) - isnull(e.e,0)) over (partition by isnull(s.g,e.g) order by isnull(s.d,e.d)) as c
from s
full join e
on s.d = e.d
and s.g = e.g
)
select d.[Date]
,c.g
,c.c
from dbo.Dates as d
left join c
on d.[Date] = c.d
where d.[Date] between @s and @e
order by d.[Date]
,c.g
;
Can anyone improve on the above?
If most of your membership validity intervals are longer than a few days, have a look at the answer by Martin Smith. That approach is likely to be faster.
When you take calendar table (DIM.[Date]) and left join it with Memberships, you may end up scanning the Memberships table for each date of the range. Even if there is an index on (ValidFromDate, ValidToDate), it may not be super useful.
It is easy to turn it around.
Scan the Memberships table only once and for each membership find those dates that are valid using CROSS APPLY.
Sample data
DECLARE @T TABLE (MembershipId int, ValidFromDate date, ValidToDate date);
INSERT INTO @T VALUES
(1, '1997-01-01', '2006-05-09'),
(2, '1997-01-01', '2017-05-12'),
(3, '2005-06-02', '2009-02-07');
DECLARE @RangeFrom date = '2006-01-01';
DECLARE @RangeTo date = '2006-12-31';
Query 1
SELECT
CA.dt
,COUNT(*) AS MembershipCount
FROM
@T AS Memberships
CROSS APPLY
(
SELECT dbo.Calendar.dt
FROM dbo.Calendar
WHERE
dbo.Calendar.dt >= Memberships.ValidFromDate
AND dbo.Calendar.dt <= Memberships.ValidToDate
AND dbo.Calendar.dt >= @RangeFrom
AND dbo.Calendar.dt <= @RangeTo
) AS CA
GROUP BY
CA.dt
ORDER BY
CA.dt
OPTION(RECOMPILE);
OPTION(RECOMPILE) is not really needed. I include it in all queries when I compare execution plans, to be sure that I'm getting the latest plan while I experiment with the queries.
When I looked at the plan of this query I saw that the seek on Calendar.dt used only ValidFromDate and ValidToDate; @RangeFrom and @RangeTo were pushed to the residual predicate. That is not ideal. The optimiser is not smart enough to calculate the maximum of the two dates (ValidFromDate and @RangeFrom) and use that date as the starting point of the seek.
It is easy to help the optimiser:
Query 2
SELECT
CA.dt
,COUNT(*) AS MembershipCount
FROM
@T AS Memberships
CROSS APPLY
(
SELECT dbo.Calendar.dt
FROM dbo.Calendar
WHERE
dbo.Calendar.dt >=
CASE WHEN Memberships.ValidFromDate > @RangeFrom
THEN Memberships.ValidFromDate
ELSE @RangeFrom END
AND dbo.Calendar.dt <=
CASE WHEN Memberships.ValidToDate < @RangeTo
THEN Memberships.ValidToDate
ELSE @RangeTo END
) AS CA
GROUP BY
CA.dt
ORDER BY
CA.dt
OPTION(RECOMPILE)
;
In this query the seek is optimal and doesn't read dates that may be discarded later.
Finally, you may not need to scan the whole Memberships table.
We need only those rows where the given range of dates intersects with the valid range of the membership.
Query 3
SELECT
CA.dt
,COUNT(*) AS MembershipCount
FROM
@T AS Memberships
CROSS APPLY
(
SELECT dbo.Calendar.dt
FROM dbo.Calendar
WHERE
dbo.Calendar.dt >=
CASE WHEN Memberships.ValidFromDate > @RangeFrom
THEN Memberships.ValidFromDate
ELSE @RangeFrom END
AND dbo.Calendar.dt <=
CASE WHEN Memberships.ValidToDate < @RangeTo
THEN Memberships.ValidToDate
ELSE @RangeTo END
) AS CA
WHERE
Memberships.ValidToDate >= @RangeFrom
AND Memberships.ValidFromDate <= @RangeTo
GROUP BY
CA.dt
ORDER BY
CA.dt
OPTION(RECOMPILE)
;
Two intervals [a1;a2] and [b1;b2] intersect when
a2 >= b1 and a1 <= b2
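For readers who want to check the two conditions used in Query 2 and Query 3 by hand, here is a small Python sketch, written in Python only so it runs without a database. The function names intersects and clamp are illustrative; clamp corresponds to the two CASE expressions and intersects to the final WHERE filter.

```python
from datetime import date

def intersects(a1: date, a2: date, b1: date, b2: date) -> bool:
    """Closed intervals [a1; a2] and [b1; b2] overlap when a2 >= b1 and a1 <= b2."""
    return a2 >= b1 and a1 <= b2

def clamp(valid_from: date, valid_to: date, range_from: date, range_to: date):
    """Narrow a membership interval to the requested range (the two CASE expressions)."""
    return max(valid_from, range_from), min(valid_to, range_to)

range_from, range_to = date(2006, 1, 1), date(2006, 12, 31)

# Membership 1 from the sample data ends inside the range.
m1 = (date(1997, 1, 1), date(2006, 5, 9))
print(intersects(*m1, range_from, range_to), clamp(*m1, range_from, range_to))

# A membership that ended before the range starts is filtered out.
expired = (date(1997, 1, 1), date(2005, 12, 31))
print(intersects(*expired, range_from, range_to))
```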
These queries assume that Calendar table has an index on dt.
You should try and see what indexes are better for the Memberships table.
For the last query, if the table is rather large, most likely two separate indexes on ValidFromDate and on ValidToDate would be better than one index on (ValidFromDate, ValidToDate).
You should try different queries and measure their performance on the real hardware with real data. Performance may depend on the data distribution, how many memberships there are, what are their valid dates, how wide or narrow is the given range, etc.
I recommend to use a great tool called SQL Sentry Plan Explorer to analyse and compare execution plans. It is free. It shows a lot of useful stats, such as execution time and number of reads for each query. The screenshots above are from this tool.
On the assumption your date dimension contains all dates contained in all membership periods you can use something like the following.
The join is an equi join so can use hash join or merge join not just nested loops (which will execute the inside sub tree once for each outer row).
Assuming index on (ValidToDate) include(ValidFromDate) or reverse this can use a single seek against Memberships and a single scan of the date dimension. The below has an elapsed time of less than a second for me to return the results for a year against a table with 3.2 million members and general active membership of 1.4 million (script)
DECLARE @StartDate DATE = '2016-01-01',
@EndDate DATE = '2016-12-31';
WITH MD
AS (SELECT Date,
SUM(Adj) AS MemberDelta
FROM Memberships
CROSS APPLY (VALUES ( ValidFromDate, +1),
--Membership count decremented day after the ValidToDate
(DATEADD(DAY, 1, ValidToDate), -1) ) V(Date, Adj)
WHERE
--Members already expired before the time range of interest can be ignored
ValidToDate >= @StartDate
AND
--Members whose membership starts after the time range of interest can be ignored
ValidFromDate <= @EndDate
GROUP BY Date),
MC
AS (SELECT DD.DateKey,
SUM(MemberDelta) OVER (ORDER BY DD.DateKey ROWS UNBOUNDED PRECEDING) AS CountOfNonIgnoredMembers
FROM DIM_DATE DD
LEFT JOIN MD
ON MD.Date = DD.DateKey)
SELECT DateKey,
CountOfNonIgnoredMembers AS MembershipCount
FROM MC
WHERE DateKey BETWEEN @StartDate AND @EndDate
ORDER BY DateKey
Demo (uses extended period as the calendar year of 2016 isn't very interesting with the example data)
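The idea behind the query is a delta-and-running-total: each membership contributes +1 on its ValidFromDate and -1 on the day after its ValidToDate, and the count for any day is the sum of all deltas up to that day. A minimal Python sketch of the same technique follows. It is an illustration under assumptions, not the answer's implementation, and daily_membership_counts is a hypothetical name.

```python
from collections import defaultdict
from datetime import date, timedelta

def daily_membership_counts(memberships, start, end):
    """Count open memberships per day in [start, end] using +1/-1 deltas.

    memberships: iterable of (valid_from, valid_to) date pairs, both inclusive.
    """
    deltas = defaultdict(int)
    for valid_from, valid_to in memberships:
        # Memberships entirely outside the range of interest can be ignored.
        if valid_to < start or valid_from > end:
            continue
        deltas[valid_from] += 1
        # Decrement the day after the last valid day, so that day is still counted.
        deltas[valid_to + timedelta(days=1)] -= 1

    # Deltas dated before the range still contribute to the running total.
    running = sum(value for day, value in deltas.items() if day < start)
    counts = {}
    day = start
    while day <= end:
        running += deltas.get(day, 0)
        counts[day] = running
        day += timedelta(days=1)
    return counts

sample = [
    (date(1997, 1, 1), date(2006, 5, 9)),
    (date(1997, 1, 1), date(2017, 5, 12)),
    (date(2005, 6, 2), date(2009, 2, 7)),
]
counts = daily_membership_counts(sample, date(2006, 5, 8), date(2006, 5, 11))
for day, count in counts.items():
    print(day, count)
```

With the question's sample data the count stays at 3 through 2006-05-09, the last valid day of membership 0001, and drops to 2 on 2006-05-10.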
One approach is to first use an INNER JOIN to find the set of matches and COUNT() to project MemberCount GROUPed BY DateKey, then UNION ALL with the same set of dates, with a 0 on that projection for the count of members for each date. The last step is to SUM() the MemberCount of this union, and GROUP BY DateKey. As requested, this avoids LEFT JOIN and NOT EXISTS. As another member pointed out, this is not an equi-join, because we need to use a range, but I think it does what you intend.
This will serve up 1 year's worth of data with around 100k logical reads. On an ordinary laptop with a spinning disk, from cold cache, it serves 1 month in under a second (with correct counts).
Here is an example that creates 3.3 million rows of random duration. The query at the bottom returns one month's worth of data.
--Stay quiet for a moment
SET NOCOUNT ON
SET STATISTICS IO OFF
SET STATISTICS TIME OFF
--Clean up if re-running
DROP TABLE IF EXISTS DIM_DATE
DROP TABLE IF EXISTS FACT_MEMBER
--Date dimension
CREATE TABLE DIM_DATE
(
DateKey DATE NOT NULL
)
--Membership fact
CREATE TABLE FACT_MEMBER
(
MembershipId INT NOT NULL
, ValidFromDateKey DATE NOT NULL
, ValidToDateKey DATE NOT NULL
)
--Populate Date dimension from 2001 through end of 2018
DECLARE @startDate DATE = '2001-01-01'
DECLARE @endDate DATE = '2018-12-31'
;WITH CTE_DATE AS
(
SELECT @startDate AS DateKey
UNION ALL
SELECT
DATEADD(DAY, 1, DateKey)
FROM
CTE_DATE AS D
WHERE
D.DateKey < @endDate
)
INSERT INTO
DIM_DATE
(
DateKey
)
SELECT
D.DateKey
FROM
CTE_DATE AS D
OPTION (MAXRECURSION 32767)
--Populate Membership fact with members having a random membership length from 1 to 36 months
;WITH CTE_DATE AS
(
SELECT @startDate AS DateKey
UNION ALL
SELECT
DATEADD(DAY, 1, DateKey)
FROM
CTE_DATE AS D
WHERE
D.DateKey < @endDate
)
,CTE_MEMBER AS
(
SELECT 1 AS MembershipId
UNION ALL
SELECT MembershipId + 1 FROM CTE_MEMBER WHERE MembershipId < 500
)
,
CTE_MEMBERSHIP
AS
(
SELECT
ROW_NUMBER() OVER (ORDER BY NEWID()) AS MembershipId
, D.DateKey AS ValidFromDateKey
FROM
CTE_DATE AS D
CROSS JOIN CTE_MEMBER AS M
)
INSERT INTO
FACT_MEMBER
(
MembershipId
, ValidFromDateKey
, ValidToDateKey
)
SELECT
M.MembershipId
, M.ValidFromDateKey
, DATEADD(MONTH, FLOOR(RAND(CHECKSUM(NEWID())) * (36-1)+1), M.ValidFromDateKey) AS ValidToDateKey
FROM
CTE_MEMBERSHIP AS M
OPTION (MAXRECURSION 32767)
--Add clustered Primary Key to Date dimension
ALTER TABLE DIM_DATE ADD CONSTRAINT PK_DATE PRIMARY KEY CLUSTERED
(
DateKey ASC
)
--Index
--(Optimize in your spare time)
DROP INDEX IF EXISTS SK_FACT_MEMBER ON FACT_MEMBER
CREATE CLUSTERED INDEX SK_FACT_MEMBER ON FACT_MEMBER
(
ValidFromDateKey ASC
, ValidToDateKey ASC
, MembershipId ASC
)
RETURN
--Start test
--Emit stats
SET STATISTICS IO ON
SET STATISTICS TIME ON
--Establish range of dates
DECLARE
@rangeStartDate DATE = '2010-01-01'
, @rangeEndDate DATE = '2010-01-31'
--UNION the count of members for a specific date range with the "zero" set for the same range, and SUM() the counts
;WITH CTE_MEMBER
AS
(
SELECT
D.DateKey
, COUNT(*) AS MembershipCount
FROM
DIM_DATE AS D
INNER JOIN FACT_MEMBER AS M ON
M.ValidFromDateKey <= @rangeEndDate
AND M.ValidToDateKey >= @rangeStartDate
AND D.DateKey BETWEEN M.ValidFromDateKey AND M.ValidToDateKey
WHERE
D.DateKey BETWEEN @rangeStartDate AND @rangeEndDate
GROUP BY
D.DateKey
UNION ALL
SELECT
D.DateKey
, 0 AS MembershipCount
FROM
DIM_DATE AS D
WHERE
D.DateKey BETWEEN @rangeStartDate AND @rangeEndDate
)
SELECT
M.DateKey
, SUM(M.MembershipCount) AS MembershipCount
FROM
CTE_MEMBER AS M
GROUP BY
M.DateKey
ORDER BY
M.DateKey ASC
OPTION (RECOMPILE, MAXDOP 1)
Here's how I'd solve this problem with equijoin:
--data generation
declare @Membership table (MembershipId varchar(10), ValidFromDate date, ValidToDate date)
insert into @Membership values
('0001', '1997-01-01', '2006-05-09'),
('0002', '1997-01-01', '2017-05-12'),
('0003', '2005-06-02', '2009-02-07')
declare @startDate date, @endDate date
select @startDate = MIN(ValidFromDate), @endDate = max(ValidToDate) from @Membership
--in order to use equijoin I need all days between min date and max date from Membership table (both columns)
;with cte as (
select @startDate [date]
union all
select DATEADD(day, 1, [date]) from cte
where [date] < @endDate
)
--in this query, we will assign value to each day:
--one, if project started on that day
--minus one, if project ended on that day
--then, it's enough to (cumulative) sum all this values to get how many projects were ongoing on particular day
select [date],
sum(case when [DATE] = ValidFromDate then 1 else 0 end +
case when [DATE] = ValidToDate then -1 else 0 end)
over (order by [date] rows between unbounded preceding and current row)
from cte [c]
left join @Membership [m]
on [c].[date] = [m].ValidFromDate or [c].[date] = [m].ValidToDate
option (maxrecursion 0)
Here's another solution:
--data generation
declare @Membership table (MembershipId varchar(10), ValidFromDate date, ValidToDate date)
insert into @Membership values
('0001', '1997-01-01', '2006-05-09'),
('0002', '1997-01-01', '2017-05-12'),
('0003', '2005-06-02', '2009-02-07')
;with cte as (
select CAST('2016-01-01' as date) [date]
union all
select DATEADD(day, 1, [date]) from cte
where [date] < '2016-12-31'
)
select [date],
(select COUNT(*) from @Membership where ValidFromDate < [date]) -
(select COUNT(*) from @Membership where ValidToDate < [date]) [ongoing]
from cte
option (maxrecursion 0)
Note that I think @PittsburghDBA is right in saying that the current query returns the wrong result.
The last day of membership is not counted, so the final sum is lower than it should be.
I have corrected that in this version.
This should improve slightly on your progress so far:
declare @s date = '20160101';
declare @e date = getdate();
with
x as (
select d, sum(c) c
from (
select ValidFromDateKey d, count(MembershipID) c
from Memberships
group by ValidFromDateKey
union all
-- dateadd needed to count last day of membership too!!
select dateadd(dd, 1, ValidToDateKey) d, -count(MembershipID) c
from Memberships
group by ValidToDateKey
)x
group by d
),
c as
(
select d, sum(x.c) over (order by d) as c
from x
)
select d.day, c cnt
from calendar d
left join c on d.day = c.d
where d.day between @s and @e
order by d.day;
First of all, your query yields '1' as MembershipCount even if no active membership exists for the given date.
You should return SUM(CASE WHEN m.MembershipID IS NOT NULL THEN 1 ELSE 0 END) AS MembershipCount.
For optimal performance create an index on Memberships(ValidFromDateKey, ValidToDateKey, MembershipId) and another on DIM.[Date](CalendarYear, DateKey).
With that done, the optimal query shall be:
DECLARE @CalendarYear INT = 2000
SELECT dim.DateKey, SUM(CASE WHEN con.MembershipID IS NOT NULL THEN 1 ELSE 0 END) AS MembershipCount
FROM
DIM.[Date] dim
LEFT OUTER JOIN (
SELECT ValidFromDateKey, ValidToDateKey, MembershipID
FROM Memberships
WHERE
ValidFromDateKey <= CONVERT(DATETIME, CONVERT(VARCHAR, @CalendarYear) + '1231')
AND ValidToDateKey >= CONVERT(DATETIME, CONVERT(VARCHAR, @CalendarYear) + '0101')
) con
ON dim.DateKey BETWEEN con.ValidFromDateKey AND con.ValidToDateKey
WHERE dim.CalendarYear = @CalendarYear
GROUP BY dim.DateKey
ORDER BY dim.DateKey
Now, for your last question: what would the equivalent equijoin query be?
There is NO WAY you can rewrite this as an equijoin!
An equijoin doesn't imply using JOIN syntax. An equijoin implies using an equality predicate, whatever the syntax.
Your query requires a range comparison, so equality doesn't apply: a BETWEEN or similar is required.

Display Month Gaps for Each location

I have the following query, which takes in the opps and calculates the duration and revenue for each month. However, for some locations where there is no data, some months are missing. Essentially, I would like all months to appear for each location and record type. I tried a left outer join on the calendar, but that didn't seem to work either.
Here is the query:
;With DateSequence( [Date] ) as
(
Select CAST(@fromdate as DATE) as [Date]
union all
Select CAST(dateadd(day, 1, [Date]) as Date)
from DateSequence
where Date < @todate
)
INSERT INTO CalendarTemp (Date, Day, DayOfWeek, DayOfYear, WeekOfYear, Month, MonthName, Year)
Select
[Date] as [Date],
DATEPART(DAY,[Date]) as [Day],
DATENAME(dw, [Date]) as [DayOfWeek],
DATEPART(DAYOFYEAR,[Date]) as [DayOfYear],
DATEPART(WEEK,[Date]) as [WeekOfYear],
DATEPART(MONTH,[Date]) as [Month],
DATENAME(MONTH,[Date]) as [MonthName],
DATEPART(YEAR,[Date]) as [Year]
from DateSequence option (MaxRecursion 10000)
;
DELETE FROM CalendarTemp WHERE DayOfWeek IN ('Saturday', 'Sunday');
SELECT
AccountId
,AccountName
,Office
,Stage = (CASE WHEN StageName = 'Closed Won' THEN 'Closed Won'
ELSE 'Open'
END)
,Id
,Name
,RecordType= (CASE
WHEN recordtypeid = 'LAS1' THEN 'S'
END)
,Start_Date
,End_Date
,Probability
,Estimated_Revenue_Won = ISNULL(Amount, 0)
,ROW_NUMBER() OVER(PARTITION BY Name ORDER BY Name) AS Row
--,Revenue_Per_Day = CAST(ISNULL(Amount/NULLIF(dbo.CalculateNumberOFWorkDays(Start_Date, End_Date),0),0) as money)
,YEAR(c.Date) as year
,MONTH(c.Date) as Month
,c.MonthName
--, ISNULL(CAST(Sum((Amount)/NULLIF(dbo.CalculateNumberOFWorkDays(Start_Date, End_Date),0)) as money),0) As RevenuePerMonth
FROM SF_Extracted_Opps o
LEFT OUTER JOIN CalendarTemp c on o.Start_Date <= c.Date AND o.End_Date >= c.Date
WHERE
Start_Date <= @todate AND End_Date >= @fromdate
AND Office IN (@Location)
AND recordtypeid IN ('LAS1')
GROUP BY
AccountId
,AccountName
,Office
,(CASE WHEN StageName = 'Closed Won' THEN 'Closed Won'
ELSE 'Open'
END)
,Id
,Name
,(CASE
WHEN recordtypeid = 'LAS1' THEN 'S'
END)
,Amount
--, CAST(ISNULL(Amount/NULLIF(dbo.CalculateNumberOFWorkDays(Start_Date, End_Date),0),0) as money)
,Start_Date
,End_Date
,Probability
,YEAR(c.Date)
,Month(c.Date)
,c.MonthName
,dbo.CalculateNumberOFWorkDays(Start_Date, End_Date)
ORDER BY Office
, (CASE
WHEN recordtypeid = 'LAS1' THEN 'S'
END)
,(CASE WHEN StageName = 'Closed Won' THEN 'Closed Won'
ELSE 'Open'
END)
, [Start_Date], Month(c.Date), AccountName, Row;
I tried adding another left outer join to this, using it as a subquery and joining on the calendar by year and month, but that did not work either. Suggestions would be extremely appreciated.
--Date Calendar for each location:
;With DateSequence( [Date], Location) as
(
Select CAST(@fromdate as DATE) as [Date], oo.Office as location
union all
Select CAST(dateadd(day, 1, [Date]) as Date), oo.Office as location
from DateSequence dts
join Opportunity_offices oo on 1 = 1
where Date < @todate
)
--select result
INSERT INTO CalendarTemp (Location,Date, Day, DayOfWeek, DayOfYear, WeekOfYear, Month, MonthName, Year)
Select
location,
[Date] as [Date],
DATEPART(DAY,[Date]) as [Day],
DATENAME(dw, [Date]) as [DayOfWeek],
DATEPART(DAYOFYEAR,[Date]) as [DayOfYear],
DATEPART(WEEK,[Date]) as [WeekOfYear],
DATEPART(MONTH,[Date]) as [Month],
DATENAME(MONTH,[Date]) as [MonthName],
DATEPART(YEAR,[Date]) as [Year]
from DateSequence option (MaxRecursion 10000)
;
You have your LEFT JOIN backwards. If you want all records from CalendarTemp and only those that match from SF_Extracted_Opps, then CalendarTemp should be the table on the LEFT. Alternatively, you can switch LEFT JOIN to RIGHT JOIN. The other issue is that your WHERE clause filters on columns from SF_Extracted_Opps, which effectively turns the outer join back into an INNER JOIN.
Here is one way to fix it:
SELECT
.....
FROM
CalendarTemp c
LEFT JOIN SF_Extracted_Opps o
ON o.Start_Date <= c.Date AND o.End_Date >= c.Date
AND o.Start_Date <= @todate AND o.End_Date >= @fromdate
AND o.Office IN (@Location)
AND o.recordtypeid IN ('LAS1')
The other issue you might run into is that, because you remove weekends from CalendarTemp, not all dates are represented. I would test with the weekends both in and out and see if you get different results.
This line:
AND o.Start_Date <= @todate AND o.End_Date >= @fromdate
should not be needed either, because the dates are already limited by the line before it and by the values in CalendarTemp.
A note about your CalendarTemp table: you don't have to go back and delete those records. Simply filter on the day of week in the WHERE clause of the SELECT that populates the table.
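The point about the WHERE clause is easy to check on a toy example. This is only a sketch: the table variables, the 'Boston' office and the dates are made up, and just enough columns are included to show the effect.

```sql
-- Hypothetical sample data, for illustration only.
DECLARE @Calendar TABLE ([Date] DATE);
DECLARE @Opps TABLE (Id INT, Office VARCHAR(20), Start_Date DATE, End_Date DATE);

INSERT INTO @Calendar VALUES ('2014-01-15'), ('2014-02-15'), ('2014-03-15');
INSERT INTO @Opps VALUES (1, 'Boston', '2014-01-01', '2014-01-31');

-- Filter in WHERE: calendar rows without a match have o.Office = NULL,
-- fail the predicate and are discarded -> 1 row (January only).
SELECT c.[Date], o.Id
FROM @Calendar c
LEFT JOIN @Opps o
ON o.Start_Date <= c.[Date] AND o.End_Date >= c.[Date]
WHERE o.Office = 'Boston';

-- Same filter in ON: every calendar row is kept
-- -> 3 rows, with Id = NULL for February and March.
SELECT c.[Date], o.Id
FROM @Calendar c
LEFT JOIN @Opps o
ON o.Start_Date <= c.[Date] AND o.End_Date >= c.[Date]
AND o.Office = 'Boston';
```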
Edit: for all offices, you can cross join your offices table with CalendarTemp. Do it in your final query, not in the CTE that builds the calendar. The problem with doing it in the CTE is that the CTE is recursive, so you would have to do it in both the anchor and the recursive member.
SELECT
.....
FROM
CalendarTemp c
CROSS JOIN Opportunity_offices oo
LEFT JOIN SF_Extracted_Opps o
ON o.Start_Date <= c.Date AND o.End_Date >= c.Date
AND o.Start_Date <= @todate AND o.End_Date >= @fromdate
AND oo.office = o.Office
AND o.recordtypeid IN ('LAS1')
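As a rough illustration of what the cross join buys you, the sketch below rolls the result up to one row per office and month. The table variables and their contents are hypothetical, and OppCount is just a placeholder aggregate; the real query would compute revenue instead.

```sql
-- Hypothetical sample data, for illustration only.
DECLARE @Calendar TABLE ([Date] DATE);
DECLARE @Offices TABLE (Office VARCHAR(20));
DECLARE @Opps TABLE (Id INT, Office VARCHAR(20), Start_Date DATE, End_Date DATE);

INSERT INTO @Calendar VALUES ('2014-01-15'), ('2014-02-15');
INSERT INTO @Offices VALUES ('Boston'), ('Denver');
INSERT INTO @Opps VALUES (1, 'Boston', '2014-01-01', '2014-01-31');

-- Calendar x offices gives every (office, date) pair first;
-- the LEFT JOIN then attaches opportunities where they exist.
-- Result: Boston/1 -> 1, Boston/2 -> 0, Denver/1 -> 0, Denver/2 -> 0.
SELECT oo.Office,
YEAR(c.[Date]) AS [Year],
MONTH(c.[Date]) AS [Month],
COUNT(DISTINCT o.Id) AS OppCount
FROM @Calendar c
CROSS JOIN @Offices oo
LEFT JOIN @Opps o
ON o.Start_Date <= c.[Date] AND o.End_Date >= c.[Date]
AND o.Office = oo.Office
GROUP BY oo.Office, YEAR(c.[Date]), MONTH(c.[Date])
ORDER BY oo.Office, [Year], [Month];
```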

For every row with data I need a row for each category

I have timesheet data that I need to create a report for by date range. I need to have a row for each person, for each day, and for each time type. If there's no entry for that time type on a given day, I want null data. I've tried a left join, but it doesn't seem to be working. A cross join will give erroneous data.
The tables I have are a Person table (personID, Name), a TimeLog table (TimeLogID, StartDate, EndDate, TimeLogTypeID), and a TimeLogType table (TimeLogTypeID, PersonID, Description, DeletedInd)
All I can get in the result set is the rows with data, and not the empty rows for each TimeLogType
Here's what I have so far:
DECLARE
@startDate DATE,
@endDate DATE
SET @startDate = '2014-05-01'
SET @endDate = '2014-05-30'
SELECT
CONVERT(DATE, TimeLog.StartDateTime, 101) AS TimeLogDay,
SUM(dbo.fnCalculateHoursAsDecimal(TimeLog.StartDateTime, TimeLog.EndDateTime)) AS Hours,
TimeLog.PersonID,
TimeLog.TimeLogTypeID
INTO #HourTable
FROM
TimeLog
WHERE
TimeLog.StartDateTime BETWEEN @startDate AND @endDate
GROUP BY
CONVERT(DATE, TimeLog.StartDateTime, 101),
TimeLog.TimeLogTypeID,
TimeLog.PersonID
SELECT
TimeLogType.Description,
#HourTable.*
FROM
TimeLogType LEFT JOIN
#HourTable ON TimeLogType.TimeLogTypeID = #HourTable.TimeLogTypeID
WHERE
ISNULL(TimeLogType.DeletedInd, 0) = 0
ORDER BY
PersonID, TimeLogDay, TimeLogType.TimeLogTypeID
The data goes something like this:
TimeLogType:
1, Billable
2, Non-Billable
Person:
1, Billy
2, Tom
TimeLog:
1, 1, 2014-05-01 08:00:00, 2014-05-01 09:00:00, 1, 0
2, 1, 2014-05-01 09:00:00, 2014-05-01 10:00:00, 1, 0
3, 2, 2014-05-01 08:00:00, 2014-05-01 08:30:00, 2, 0
4, 2, 2014-05-01 08:30:00, 2014-05-01 09:00:00, 1, 0
5, 1, 2014-05-02 08:00:00, 2014-05-02 09:00:00, 2, 0
Expected Output: (order by person, date, timelog type)
Day, Person, Bill Type, Total Hours
2014-05-01, Billy, Billable, 2.0
2014-05-01, Billy, Non-Billable, NULL
2014-05-02, Billy, Billable, 1.0
2014-05-02, Billy, Non-Billable, NULL
etc...
2014-05-01, Tom, Billable, 0.5
2014-05-01, Tom, Non-Billable, 0.5
etc...
You need to generate all the combinations first and then use left join to bring in the information you want. I think the query is like this:
-- one row per day between the earliest start and the latest end in TimeLog
with dates as (
select dateadd(day, v.number, tlt.mind) as thedate
from (select min(StartDate) as mind, max(EndDate) as endd
from TimeLog
) tlt join
master..spt_values v
on v.type = 'P' and dateadd(day, v.number, tlt.mind) <= tlt.endd
)
-- every person x active time type x day, with the log rows attached where they exist
select p.PersonId, tlt.TimeLogTypeId, d.thedate
from Person p cross join
(select tlt.* from TimeLogType tlt where ISNULL(tlt.DeletedInd, 0) = 0
) tlt cross join
dates d left join
TimeLog tl
on tl.PersonId = p.PersonId and tl.TimeLogTypeId = tlt.TimeLogTypeId and
d.thedate >= tl.StartDate and d.thedate <= tl.EndDate
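Here is a small self-contained sketch of the same idea, run against an abridged copy of the sample data from the question. The table variables are stand-ins for the real tables, the hours are computed with DATEDIFF rather than the question's dbo.fnCalculateHoursAsDecimal, and the days are taken from the log itself instead of a generated calendar, so a day with no entries at all would not appear.

```sql
-- Abridged sample data from the question, loaded into hypothetical table variables.
DECLARE @Person TABLE (PersonID INT, Name VARCHAR(20));
DECLARE @TimeLogType TABLE (TimeLogTypeID INT, Description VARCHAR(20));
DECLARE @TimeLog TABLE (TimeLogID INT, PersonID INT,
StartDateTime DATETIME, EndDateTime DATETIME, TimeLogTypeID INT);

INSERT INTO @Person VALUES (1, 'Billy'), (2, 'Tom');
INSERT INTO @TimeLogType VALUES (1, 'Billable'), (2, 'Non-Billable');
INSERT INTO @TimeLog VALUES
(1, 1, '2014-05-01T08:00:00', '2014-05-01T09:00:00', 1),
(2, 1, '2014-05-01T09:00:00', '2014-05-01T10:00:00', 1),
(3, 2, '2014-05-01T08:00:00', '2014-05-01T08:30:00', 2),
(4, 2, '2014-05-01T08:30:00', '2014-05-01T09:00:00', 1);

-- Build day x person x type first, then LEFT JOIN the log rows.
-- Combinations without a log row keep Hours = NULL
-- (here: Billy / Non-Billable on 2014-05-01).
SELECT d.TimeLogDay, p.Name, t.Description,
SUM(DATEDIFF(MINUTE, tl.StartDateTime, tl.EndDateTime)) / 60.0 AS Hours
FROM (SELECT DISTINCT CONVERT(DATE, StartDateTime) AS TimeLogDay FROM @TimeLog) d
CROSS JOIN @Person p
CROSS JOIN @TimeLogType t
LEFT JOIN @TimeLog tl
ON tl.PersonID = p.PersonID
AND tl.TimeLogTypeID = t.TimeLogTypeID
AND CONVERT(DATE, tl.StartDateTime) = d.TimeLogDay
GROUP BY d.TimeLogDay, p.Name, t.Description
ORDER BY p.Name, d.TimeLogDay, t.Description;
```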
After reading Gordon's answer here's what I've come up with. I created it in steps so I could see what was going on. I created the dates w/o the master..spt_values table. I also created a temp table of people so I could select just the ones that had a TimeLogRecord, and then re-use it to pull in details for the final select. Let me know if there's any way to make this run faster.
DECLARE
@startDate DATE,
@endDate DATE
SET @startDate = '2014-01-01'
SET @endDate = '2014-01-31'
-- create day rows --
;WITH dates(TimeLogDay) AS
(
SELECT @startDate AS TimeLogDay
UNION ALL
SELECT DATEADD(d, 1, TimeLogDay)
FROM dates
WHERE TimeLogDay < @endDate
)
-- create a type row for each day --
SELECT
dates.TimeLogDay,
tlt.TimeLogTypeID
INTO #TypeDate
FROM
dates CROSS JOIN
(SELECT
TimeLogType.TimeLogTypeID
FROM
TimeLogType
WHERE
ISNULL(TimeLogType.DeletedInd, 0) = 0
) AS TLT
-- create a temp person table for reference later ---
SELECT * INTO #person FROM Person WHERE Person.personID IN
(SELECT Timelog.PersonID FROM TimeLog WHERE TimeLog.StartDateTime BETWEEN @startDate AND @endDate)
-- sum up the log times and tie in the date/type rows --
SELECT
#TypeDate.TimeLogDay,
#TypeDate.TimeLogTypeID,
#person.PersonID,
SUM(dbo.fnCalculateHoursAsDecimal(TimeLog.StartDateTime, TimeLog.EndDateTime)) AS Hours
INTO #Hours
FROM
#person CROSS JOIN
#TypeDate LEFT JOIN
TimeLog ON
TimeLog.PersonID = #person.PersonID AND
TimeLog.TimeLogTypeID = #TypeDate.TimeLogTypeID AND
#TypeDate.TimeLogDay = CONVERT(DATE, TimeLog.StartDateTime, 101)
GROUP BY
#TypeDate.TimeLogDay,
#TypeDate.TimeLogTypeID,
#person.PersonID
-- now tie in the details to complete --
SELECT
#Hours.TimeLogDay,
TimeLogType.Description,
Person.LastName,
Person.FirstName,
#Hours.Hours
FROM
#Hours LEFT JOIN
Person ON #Hours.PersonID = Person.PersonID LEFT JOIN
TimeLogType ON #Hours.TimeLogTypeID = TimeLogType.TimeLogTypeID
ORDER BY
Person.FirstName,
Person.LastName,
#Hours.TimeLogDay,
TimeLogType.SortOrder

Query is not returning proper values

I have the query below which is a mammoth:
DECLARE @Start Date, @End Date, @DaySpan int, @UserId int, @ProjectId int
SET @Start = '7/08/2014 12:00 AM -05:00';
SET @End = '7/27/2014 12:00 AM -05:00';
SET @DaySpan = 1;
SET @UserId = 102;
SET @ProjectId = 2065;
WITH T(StartDate, EndDate)
AS (
SELECT @Start StartDate, DATEADD(DAY, @DaySpan - 1, @Start) EndDate
UNION ALL
SELECT DATEADD(DAY, 1, EndDate) StartDate, DATEADD(DAY, @DaySpan, EndDate) FROM T WHERE DATEADD(DAY, @DaySpan, EndDate) <= @End
)
SELECT convert(datetimeoffset, T.StartDate) StartDate, T.EndDate, ISNULL(Completes, 0) Completes, SUM(h.Hours) Hours, SUM(h.Hours) / NULLIF(Completes, 0) HoursPerRecruit,
ISNULL(Completes, 0) / NULLIF(SUM(h.Hours), 0) RecruitsPerHour
FROM T LEFT JOIN (
SELECT StartDate, EndDate, COUNT(r.Id) Completes
FROM Respondents r JOIN T st ON r.RecruitedOn >= st.StartDate AND r.RecruitedOn < DATEADD(day, 1, st.EndDate)
WHERE r.RecruitingStatus = 7
AND RecruitedBy = @UserId
AND r.ProjectId = @ProjectId -- **REMOVE Line If you just want by User**
GROUP BY st.StartDate, st.EndDate
) c ON T.StartDate = c.StartDate AND T.EndDate = c.EndDate
LEFT JOIN (
SELECT st.StartDate, st.EndDate, SUM(Hours) Hours
FROM T st JOIN TimeEntries te ON te.Date >= CONVERT(DATE, st.StartDate) AND te.Date < DATEADD(day, DATEDIFF(day,0,CONVERT(DATE, st.EndDate)),1)
JOIN Users u ON te.HarvestUserId = u.HarvestId
--JOIN Projects PR ON te.HarvestProjectId = PR.Id
WHERE u.Id = #UserId
GROUP BY st.StartDate, st.EndDate
) h ON T.StartDate = h.StartDate AND T.EndDate = h.EndDate
GROUP BY T.StartDate, T.EndDate, Completes
ORDER BY T.StartDate
OPTION(MAXRECURSION 32767)
It returns results like below:
StartDate EndDate Completes Hours HoursPerRecruit RecruitsPerHour
2014-07-10 00:00:00.0000000 +00:00 2014-07-10 6 3.00 0.500000 2.00000000000000000000000000
It works great. But now I want to limit the hours returned by project. In the query you will see a commented-out line: JOIN Projects PR ON te.HarvestProjectId = PR.Id. When I add that line back, it completely messes up the calculations and returns nothing, like so:
StartDate EndDate Completes Hours HoursPerRecruit RecruitsPerHour
2014-07-10 00:00:00.0000000 +00:00 2014-07-10 6 NULL NULL NULL
What am I missing that is making HoursPerRecruit and RecruitsPerHour null? I can't seem to figure it out.
I know this isn't a complete answer, but I can't format code in a comment.
Look at just this bit of code, and execute it with some valid value for the parameter:
SELECT st.StartDate, st.EndDate, SUM(Hours) Hours
FROM T st JOIN TimeEntries te ON te.Date >= CONVERT(DATE, st.StartDate) AND te.Date < DATEADD(day, DATEDIFF(day,0,CONVERT(DATE, st.EndDate)),1)
JOIN Users u ON te.HarvestUserId = u.HarvestId
--JOIN Projects PR ON te.HarvestProjectId = PR.Id
WHERE u.Id = @UserId
GROUP BY st.StartDate, st.EndDate
Then un-comment the commented line. Does it return no rows? I'm guessing that will be the case based on what you describe.
If so, then look at the results of this:
SELECT * FROM Projects
and see if you can figure out why no rows from the Projects table are joining to the TimeEntries table. Maybe you're joining on the wrong columns, or there's a mismatch in the data format.
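One way to run that check is an anti-join that lists the time entries with no matching project. The sketch below uses made-up table variables, and the HarvestId column on Projects is purely an assumption, by analogy with Users.HarvestId in the question; the real Projects table may look different.

```sql
-- Hypothetical sample data; Projects.HarvestId is an assumed column.
DECLARE @Projects TABLE (Id INT, HarvestId INT);
DECLARE @TimeEntries TABLE (Id INT, HarvestProjectId INT, Hours DECIMAL(5,2));

INSERT INTO @Projects VALUES (2065, 900123);
INSERT INTO @TimeEntries VALUES (1, 900123, 3.00);

-- Time entries that find no project when joined on Projects.Id.
-- With this data the entry is returned, which means the inner join
-- in the big query would silently drop it.
SELECT te.Id, te.HarvestProjectId
FROM @TimeEntries te
LEFT JOIN @Projects pr ON te.HarvestProjectId = pr.Id
WHERE pr.Id IS NULL;

-- Joined on the assumed HarvestId column instead, the same check returns no rows.
SELECT te.Id, te.HarvestProjectId
FROM @TimeEntries te
LEFT JOIN @Projects pr ON te.HarvestProjectId = pr.HarvestId
WHERE pr.Id IS NULL;
```

If the first query returns every entry you expected to see, the join column is the likely culprit rather than the data.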
If you are trying to limit the results, the condition should go in the WHERE clause below your commented-out join. If Projects.Id is directly linked to TimeEntries.HarvestProjectId with no other conditions, you should be able to change the line
WHERE u.Id = @UserId
to:
WHERE u.Id = @UserId and te.HarvestProjectId = @ProjectId