Query is not returning proper values - sql

I have the query below which is a mammoth:
DECLARE @Start Date, @End Date, @DaySpan int, @UserId int, @ProjectId int
SET @Start = '7/08/2014 12:00 AM -05:00';
SET @End = '7/27/2014 12:00 AM -05:00';
SET @DaySpan = 1;
SET @UserId = 102;
SET @ProjectId = 2065;
WITH T(StartDate, EndDate)
AS (
SELECT @Start StartDate, DATEADD(DAY, @DaySpan - 1, @Start) EndDate
UNION ALL
SELECT DATEADD(DAY, 1, EndDate) StartDate, DATEADD(DAY, @DaySpan, EndDate) FROM T WHERE DATEADD(DAY, @DaySpan, EndDate) <= @End
)
SELECT convert(datetimeoffset, T.StartDate) StartDate, T.EndDate, ISNULL(Completes, 0) Completes, SUM(h.Hours) Hours, SUM(h.Hours) / NULLIF(Completes, 0) HoursPerRecruit,
ISNULL(Completes, 0) / NULLIF(SUM(h.Hours), 0) RecruitsPerHour
FROM T LEFT JOIN (
SELECT StartDate, EndDate, COUNT(r.Id) Completes
FROM Respondents r JOIN T st ON r.RecruitedOn >= st.StartDate AND r.RecruitedOn < DATEADD(day, 1, st.EndDate)
WHERE r.RecruitingStatus = 7
AND RecruitedBy = @UserId
AND r.ProjectId = @ProjectId -- **REMOVE Line If you just want by User**
GROUP BY st.StartDate, st.EndDate
) c ON T.StartDate = c.StartDate AND T.EndDate = c.EndDate
LEFT JOIN (
SELECT st.StartDate, st.EndDate, SUM(Hours) Hours
FROM T st JOIN TimeEntries te ON te.Date >= CONVERT(DATE, st.StartDate) AND te.Date < DATEADD(day, DATEDIFF(day,0,CONVERT(DATE, st.EndDate)),1)
JOIN Users u ON te.HarvestUserId = u.HarvestId
--JOIN Projects PR ON te.HarvestProjectId = PR.Id
WHERE u.Id = @UserId
GROUP BY st.StartDate, st.EndDate
) h ON T.StartDate = h.StartDate AND T.EndDate = h.EndDate
GROUP BY T.StartDate, T.EndDate, Completes
ORDER BY T.StartDate
OPTION(MAXRECURSION 32767)
It returns results like below:
StartDate EndDate Completes Hours HoursPerRecruit RecruitsPerHour
2014-07-10 00:00:00.0000000 +00:00 2014-07-10 6 3.00 0.500000 2.00000000000000000000000000
It works great. But now I want to limit the hours returned by project. In the query you will see a commented-out line: JOIN Projects PR ON te.HarvestProjectId = PR.Id. When I add that line, it completely messes up the calculations and the hours come back empty, like so:
StartDate EndDate Completes Hours HoursPerRecruit RecruitsPerHour
2014-07-10 00:00:00.0000000 +00:00 2014-07-10 6 NULL NULL NULL
What am I missing that is making HoursPerRecruit and RecruitsPerHour come back NULL? I can't seem to figure it out.

I know this isn't a complete answer, but I can't format code in a comment.
Look at just this bit of code, and execute it with some valid value for the parameter:
SELECT st.StartDate, st.EndDate, SUM(Hours) Hours
FROM T st JOIN TimeEntries te ON te.Date >= CONVERT(DATE, st.StartDate) AND te.Date < DATEADD(day, DATEDIFF(day,0,CONVERT(DATE, st.EndDate)),1)
JOIN Users u ON te.HarvestUserId = u.HarvestId
--JOIN Projects PR ON te.HarvestProjectId = PR.Id
WHERE u.Id = @UserId
GROUP BY st.StartDate, st.EndDate
Then un-comment the commented line. Does it return no rows? I'm guessing that will be the case based on what you describe.
If so, then look at the results of this:
SELECT * FROM Projects
and see if you can figure out why no rows from the Projects table are joining to the TimeEntries table. Maybe you're joining on the wrong columns, or there's a mis-match in the data format.
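One way to check that guess is an anti-join that lists the project ids in TimeEntries with no match in Projects. The sketch below is self-contained: the table variables and their sample rows are made up for illustration and only borrow column names from the question (multi-row VALUES needs SQL Server 2008 or later).

```sql
-- Hypothetical sample data; only the column names come from the question.
DECLARE @Projects table (Id int);
DECLARE @TimeEntries table (HarvestProjectId int, Hours decimal(9,2));

INSERT @Projects VALUES (2065), (2066);
INSERT @TimeEntries VALUES (5551234, 3.00), (5551234, 1.50), (2066, 2.00);

-- Anti-join: time entries whose HarvestProjectId has no matching Projects.Id.
-- An INNER JOIN on these two columns silently drops every entry counted here.
SELECT te.HarvestProjectId,
       COUNT(*)      AS UnmatchedEntries,
       SUM(te.Hours) AS UnmatchedHours
FROM @TimeEntries te
LEFT JOIN @Projects pr ON pr.Id = te.HarvestProjectId
WHERE pr.Id IS NULL
GROUP BY te.HarvestProjectId;
```

If the real tables return rows here for the hours you expected to see, the two columns probably do not hold the same kind of id. Once the hours subquery produces no row for a date, the outer LEFT JOIN leaves Hours as NULL, and any division involving it is NULL as well.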

If you are trying to limit the results, the condition should be in the WHERE clause below your commented-out join. If Projects.Id is directly linked to TimeEntries.HarvestProjectId with no other conditions, you should be able to change the line
WHERE u.Id = @UserId
to:
WHERE u.Id = @UserId AND te.HarvestProjectId = @ProjectId

Related

SQL with while loop to DAX conversion

I am trying to convert the SQL below, which uses a WHILE loop, into DAX. I need to build this query without temp tables, because access to the database is restricted and I only have views to work with. I believe the best option for me is to code it in DAX. Could someone help with it?
DECLARE @sd DATETIME
DECLARE @ed DATETIME
SELECT @sd = CONVERT(DATETIME, '2021-01-31')
SELECT @ed = GETDATE()
DECLARE @date DATETIME = EOMONTH(@sd)
WHILE ( (@date) <= @ed )
BEGIN
SELECT MONTH(#date) as Month, YEAR(#date) as Year, DAY(#date) as Day, A.*
FROM [people] A
WHERE A.effective_date = (SELECT MAX(B.effective_date)
FROM [people] B
WHERE B.employee_id = A.employee_id
AND B.record_id = A.record_id
AND B.effective_date <= @date)
AND A.effective_sequence = (SELECT MAX(C.effective_sequence)
FROM [people] C
WHERE C.employee_id = A.employee_id
AND C.record_id = A.record_id
AND C.effective_date = A.effective_date)
ORDER BY A.employee_id;
SET @date = EOMONTH(DATEADD(MONTH,1,@date))
END
While you could do this as a view, you would either have to hard-code the start and end dates or filter on them afterwards (which is likely to be inefficient). Instead, you can do this as an inline table-valued function.
We can use a virtual tally table (generated with a couple of cross joins) to generate a row for each month.
We can use row-numbering instead of the two subqueries.
CREATE FUNCTION dbo.GetData (@sd DATETIME, @ed DATETIME)
RETURNS TABLE AS RETURN
WITH L0 AS (
SELECT *
FROM (VALUES(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1)) v(n)
),
L1 AS (
SELECT 1 n FROM L0 a CROSS JOIN L0 b
)
SELECT
MONTH(p.Month) as Month, -- the derived table is aliased p, so m is not visible here
YEAR(p.Month) as Year,
DAY(p.Month) as Day,
p.* -- specify columns
FROM (
SELECT *,
ROW_NUMBER() OVER (PARTITION BY m.Month, p.employee_id, p.record_id ORDER BY p.effective_date DESC, p.effective_sequence DESC) AS rn -- DESC so rn = 1 is the latest record, matching MAX() in the original
FROM [people] p
CROSS JOIN (
SELECT TOP (DATEDIFF(month, @sd, @ed) + 1)
DATEADD(month, ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) - 1, EOMONTH(@sd)) AS Month
FROM L1
) m
WHERE p.effective_date <= m.Month
) p
WHERE p.rn = 1
;
Then in PowerBI you can just do for example
SELECT *
FROM dbo.GetData ('2021-01-31', GETDATE()) d
ORDER BY
d.employee_id
Note that you cannot put the ORDER BY within the function, it doesn't work.
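To see the tally-table idea in isolation, here is a minimal sketch that only produces the list of month-end dates, without touching the people table. The dates are sample values and the CTE names are arbitrary. It assumes SQL Server 2012 or later for EOMONTH, and it uses EOMONTH's second argument (months to add) so every generated date is a true month end whatever the start month.

```sql
DECLARE @sd date = '2021-01-31', @ed date = '2021-04-15';

WITH L0 AS (
    SELECT n FROM (VALUES (1),(1),(1),(1)) v(n)   -- 4 rows
),
L1 AS (
    SELECT 1 AS n FROM L0 a CROSS JOIN L0 b       -- 4 x 4 = 16 rows, enough for 16 months
),
months AS (
    -- Row numbers 1, 2, 3, ... become month offsets 0, 1, 2, ... from @sd
    SELECT EOMONTH(@sd, CAST(ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) AS int) - 1) AS MonthEnd
    FROM L1
)
SELECT MonthEnd
FROM months
WHERE MonthEnd <= @ed      -- same stop condition as the WHILE loop in the question
ORDER BY MonthEnd;
-- Returns 2021-01-31, 2021-02-28, 2021-03-31
```

Each of those rows then plays the role of one loop iteration once it is cross-joined to the data.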

For every row with data I need a row for each category

I have timesheet data that I need to report on by date range. I need a row for each person, for each day, and for each time type. If there's no entry for a time type on a given day, I want null data. I've tried a left join, but it doesn't seem to be working. A cross join will give erroneous data.
The tables I have are a Person table (personID, Name), a TimeLog table (TimeLogID, StartDate, EndDate, TimeLogTypeID), and a TimeLogType table (TimeLogTypeID, PersonID, Description, DeletedInd)
All I can get in the result set is the rows with data, and not the empty rows for each TimeLogType
Here's what I have so far:
DECLARE
@startDate DATE,
@endDate DATE
SET @startDate = '2014-05-01'
SET @endDate = '2014-05-30'
SELECT
CONVERT(DATE, TimeLog.StartDateTime, 101) AS TimeLogDay,
SUM(dbo.fnCalculateHoursAsDecimal(TimeLog.StartDateTime, TimeLog.EndDateTime)) AS Hours,
TimeLog.PersonID,
TimeLog.TimeLogTypeID
INTO #HourTable
FROM
TimeLog
WHERE
TimeLog.StartDateTime BETWEEN @startDate AND @endDate
GROUP BY
CONVERT(DATE, TimeLog.StartDateTime, 101),
TimeLog.TimeLogTypeID,
TimeLog.PersonID
SELECT
TimeLogType.Description,
#HourTable.*
FROM
TimeLogType LEFT JOIN
#HourTable ON TimeLogType.TimeLogTypeID = #HourTable.TimeLogTypeID
WHERE
ISNULL(TimeLogType.DeletedInd, 0) = 0
ORDER BY
PersonID, TimeLogDay, TimeLogType.TimeLogTypeID
The data goes something like this:
TimeLogType:
1, Billable
2, Non-Billable
Person:
1, Billy
2, Tom
TimeLog:
1, 1, 2014-05-01 08:00:00, 2014-05-01 09:00:00, 1, 0
2, 1, 2014-05-01 09:00:00, 2014-05-01 10:00:00, 1, 0
3, 2, 2014-05-01 08:00:00, 2014-05-01 08:30:00, 2, 0
4, 2, 2014-05-01 08:30:00, 2014-05-01 09:00:00, 1, 0
5, 1, 2014-05-02 08:00:00, 2014-05-02 09:00:00, 2, 0
Expected Output: (order by person, date, timelog type)
Day, Person, Bill Type, Total Hours
2014-05-01, Billy, Billable, 2.0
2014-05-01, Billy, Non-Billable, NULL
2014-05-02, Billy, Billable, 1.0
2014-05-02, Billy, Non-Billable, NULL
etc...
2014-05-01, Tom, Billable, 0.5
2014-05-01, Tom, Non-Billable, 0.5
etc...
You need to generate all the combinations first and then use left join to bring in the information you want. I think the query is like this:
with dates as (
select dateadd(day, v.number, tl.mind) as thedate
from (select min(StartDate) as mind, max(EndDate) as endd
from TimeLog
) tl join
master..spt_values v
on v.type = 'P' and dateadd(day, v.number, tl.mind) <= tl.endd
)
select p.PersonId, tlt.TimeLogTypeId, d.thedate
from Person p cross join
(select t.* from TimeLogType t where ISNULL(t.DeletedInd, 0) = 0
) tlt cross join
dates d left join
TimeLog tl
on tl.PersonId = p.PersonId and tl.TimeLogTypeId = tlt.TimeLogTypeId and
d.thedate >= tl.StartDate and d.thedate <= tl.EndDate
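The same idea can be tried end to end on the sample rows from the question. This is only a sketch under assumptions: the table variables stand in for the real tables, the TimeLog column layout (TimeLogID, PersonID, StartDateTime, EndDateTime, TimeLogTypeID) is guessed from the sample data, the days are taken from the logged rows rather than a calendar, and DATEDIFF in minutes replaces dbo.fnCalculateHoursAsDecimal, whose definition is not shown.

```sql
-- Sample rows from the question; the column layout is an assumption.
DECLARE @Person table (PersonID int, Name varchar(50));
DECLARE @TimeLogType table (TimeLogTypeID int, Description varchar(50));
DECLARE @TimeLog table (TimeLogID int, PersonID int, StartDateTime datetime,
                        EndDateTime datetime, TimeLogTypeID int);

INSERT @Person VALUES (1, 'Billy'), (2, 'Tom');
INSERT @TimeLogType VALUES (1, 'Billable'), (2, 'Non-Billable');
INSERT @TimeLog VALUES
    (1, 1, '2014-05-01T08:00:00', '2014-05-01T09:00:00', 1),
    (2, 1, '2014-05-01T09:00:00', '2014-05-01T10:00:00', 1),
    (3, 2, '2014-05-01T08:00:00', '2014-05-01T08:30:00', 2),
    (4, 2, '2014-05-01T08:30:00', '2014-05-01T09:00:00', 1),
    (5, 1, '2014-05-02T08:00:00', '2014-05-02T09:00:00', 2);

WITH days AS (
    SELECT DISTINCT CAST(StartDateTime AS date) AS TheDate FROM @TimeLog
)
SELECT d.TheDate, p.Name, t.Description,
       -- SUM over no matching rows is NULL, which is the "empty" row the question asks for
       SUM(DATEDIFF(minute, tl.StartDateTime, tl.EndDateTime)) / 60.0 AS TotalHours
FROM days d
CROSS JOIN @Person p            -- every day x every person
CROSS JOIN @TimeLogType t       -- ... x every time type
LEFT JOIN @TimeLog tl
       ON tl.PersonID = p.PersonID
      AND tl.TimeLogTypeID = t.TimeLogTypeID
      AND CAST(tl.StartDateTime AS date) = d.TheDate
GROUP BY d.TheDate, p.PersonID, p.Name, t.TimeLogTypeID, t.Description
ORDER BY p.PersonID, d.TheDate, t.TimeLogTypeID;
```

With two days, two people and two types this returns eight rows, with NULL in TotalHours wherever nothing was logged.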
After reading Gordon's answer here's what I've come up with. I created it in steps so I could see what was going on. I created the dates w/o the master..spt_values table. I also created a temp table of people so I could select just the ones that had a TimeLogRecord, and then re-use it to pull in details for the final select. Let me know if there's any way to make this run faster.
DECLARE
@startDate DATE,
@endDate DATE
SET @startDate = '2014-01-01'
SET @endDate = '2014-01-31'
-- create day rows --
;WITH dates(TimeLogDay) AS
(
SELECT @startDate AS TimeLogDay
UNION ALL
SELECT DATEADD(d, 1, TimeLogDay)
FROM dates
WHERE TimeLogDay < @endDate
)
-- create a type row for each day --
SELECT
dates.TimeLogDay,
tlt.TimeLogTypeID
INTO #TypeDate
FROM
dates CROSS JOIN
(SELECT
TimeLogType.TimeLogTypeID
FROM
TimeLogType
WHERE
ISNULL(TimeLogType.DeletedInd, 0) = 0
) AS TLT
-- create a temp person table for reference later ---
SELECT * INTO #person FROM Person WHERE Person.personID IN
(SELECT Timelog.PersonID FROM TimeLog WHERE TimeLog.StartDateTime BETWEEN @startDate AND @endDate)
-- sum up the log times and tie in the date/type rows --
SELECT
#TypeDate.TimeLogDay,
#TypeDate.TimeLogTypeID,
#person.PersonID,
SUM(dbo.fnCalculateHoursAsDecimal(TimeLog.StartDateTime, TimeLog.EndDateTime)) AS Hours
INTO #Hours
FROM
#person CROSS JOIN
#TypeDate LEFT JOIN
TimeLog ON
TimeLog.PersonID = #person.PersonID AND
TimeLog.TimeLogTypeID = #TypeDate.TimeLogTypeID AND
#TypeDate.TimeLogDay = CONVERT(DATE, TimeLog.StartDateTime, 101)
GROUP BY
#TypeDate.TimeLogDay,
#TypeDate.TimeLogTypeID,
#person.PersonID
-- now tie in the details to complete --
SELECT
#Hours.TimeLogDay,
TimeLogType.Description,
Person.LastName,
Person.FirstName,
#Hours.Hours
FROM
#Hours LEFT JOIN
Person ON #Hours.PersonID = Person.PersonID LEFT JOIN
TimeLogType ON #Hours.TimeLogTypeID = TimeLogType.TimeLogTypeID
ORDER BY
Person.FirstName,
Person.LastName,
#Hours.TimeLogDay,
TimeLogType.SortOrder

Changing comparator in WHERE clause has catastrophic results on query performance

I have a monster query that I'm running against a SQL SERVER 2005 database that is acting very strange. I have two conditions in the WHERE clause of the outermost select, comparing a field to a constant date. When the constant dates are either identical (down to the second) or their date parts are not equal, the query runs in under 2 seconds. When the date parts are the same but the time parts are different, the query takes around 7 minutes to complete. Specifically, having a WHERE clause of
WHERE
d.date >= '2011-11-07 00:00:00' AND
d.date <= '2011-11-08 11:59:59'
works well and as expected. Changing the WHERE clause to
WHERE
d.date >= '2011-11-07 00:00:00' AND
d.date <= '2011-11-07 11:59:59'
causes the query to take many minutes.
I also noticed that when I disabled the index on the Agent_Hours table, the bad case (same date, different times) dropped to 25 seconds. That is still far longer than when the dates are different, but not by as much.
Below is the full query for reference (the WHERE clause in question is at the very end):
SELECT
s.transaction_id AS 'transaction',
s.created_on AS transaction_date,
s.first_name + ' ' + s.Last_Name AS customer_name,
a.name AS agent_name,
a.phantom AS phantom,
a.team AS agent_team,
a.id AS agent_number,
h.hours,
h2.hours_today,
d.*
FROM
(SELECT
agents.first_name + ' ' + agents.last_name AS name,
agents.id AS id,
agents.phantom AS phantom,
transient.value AS team,
transient.start_date AS team_start_date,
transient.end_date AS team_end_date
FROM
Agents.dbo.Agent_Static AS agents
JOIN
Agents.dbo.Agent_Transient AS transient
ON transient.agent = agents.id
WHERE
transient.field = 'team') AS a
LEFT JOIN Agents.dbo.Agent_Daily AS d
ON d.agent = a.id
LEFT JOIN (SELECT
agent_hours.agent AS agent,
dates.date AS date,
CAST(COUNT(*) AS FLOAT) / 4 AS hours
FROM
Agents.dbo.Agent_Hours AS agent_hours
JOIN
(SELECT
DISTINCT CONVERT(
VARCHAR(10),
hour_worked,
101)
AS date
FROM
Agents.dbo.Agent_Hours) AS dates
ON dates.date = CONVERT(
VARCHAR(10),
agent_hours.hour_worked,
101)
WHERE
(status = 'Phone' OR
status = 'Meeting')
GROUP BY
agent_hours.agent,
dates.date) AS h
ON h.agent = a.id AND
h.date = d.date
LEFT JOIN (SELECT
agent_hours.agent AS agent,
dates.date AS date,
CAST(COUNT(*) AS FLOAT) / 4 AS hours_today
FROM
Agents.dbo.Agent_Hours AS agent_hours
JOIN
(SELECT
DISTINCT CONVERT(
VARCHAR(10),
hour_worked,
101)
AS date
FROM
Agents.dbo.Agent_Hours) AS dates
ON dates.date = CONVERT(
VARCHAR(10),
agent_hours.hour_worked,
101)
WHERE
(status = 'Phone' OR
status = 'Meeting') AND
CONVERT(
VARCHAR(10),
CAST('11/09/2011 13:01' AS DATETIME),
101) = CONVERT(
VARCHAR(10),
agent_hours.hour_worked,
101) AND
CONVERT(
VARCHAR(10),
CAST('11/09/2011 13:01' AS DATETIME),
114) > CONVERT(
VARCHAR(10),
agent_hours.hour_worked,
114)
GROUP BY
agent_hours.agent,
dates.date) AS h2
ON h2.agent = a.id AND
h2.date = d.date
LEFT JOIN sale_transactions AS s
ON a.id = s.agent_hermes_id AND
s.created_on >= a.team_start_date AND
s.created_on <= a.team_end_date AND
CONVERT(
VARCHAR(10),
d.date,
101) = CONVERT(
VARCHAR(10),
s.created_on,
101)
LEFT JOIN sold_phrases AS p
ON s.Transaction_ID = p.transaction_id
WHERE
d.date >= '2011-11-07 00:00:00' AND
d.date <= '2011-11-07 11:59:59'
As a general rule, always post your exact table definitions, including all indexes, when asking about performance problems in SQL.
I cannot see any difference between the two cases, but considering your explanation, this is what likely happens: the cardinality estimates for the date range may trigger the index tipping point and you get wildly different execution plans. Such issues are best addressed by using plan guides, see Optimizing Queries in Deployed Applications by Using Plan Guides. You should be able to confirm if the problem is indeed the plan, see Displaying Graphical Execution Plans (SQL Server Management Studio).
This may be a micro-optimization, but have you considered changing the way you get the date part from a datetime to DATEADD(dd, 0, DATEDIFF(dd, 0, datetime_format))? It is usually faster than the CONVERT function.
SELECT
s.transaction_id AS 'transaction',
s.created_on AS transaction_date,
s.first_name + ' ' + s.Last_Name AS customer_name,
a.name AS agent_name,
a.phantom AS phantom,
a.team AS agent_team,
a.id AS agent_number,
h.hours,
h2.hours_today,
d.*
FROM (SELECT
agents.first_name + ' ' + agents.last_name AS name,
agents.id AS id,
agents.phantom AS phantom,
transient.value AS team,
transient.start_date AS team_start_date,
transient.end_date AS team_end_date
FROM
Agents.dbo.Agent_Static AS agents
JOIN
Agents.dbo.Agent_Transient AS transient
ON transient.agent = agents.id
WHERE
transient.field = 'team'
) AS a
LEFT JOIN Agents.dbo.Agent_Daily AS d ON d.agent = a.id
LEFT JOIN (
SELECT
agent_hours.agent AS agent,
dates.date AS date,
COUNT(*) / 4.0 AS hours
FROM Agents.dbo.Agent_Hours AS agent_hours
JOIN (
SELECT DATEADD(dd, 0, DATEDIFF(dd, 0, hour_worked)) as date
FROM Agents.dbo.Agent_Hours GROUP BY DATEADD(dd, 0, DATEDIFF(dd, 0, hour_worked))
) AS dates ON dates.date = DATEADD(dd, 0, DATEDIFF(dd, 0, agent_hours.hour_worked))
WHERE (status = 'Phone' OR status = 'Meeting')
GROUP BY agent_hours.agent, dates.date
) AS h ON h.agent = a.id AND h.date = d.date
LEFT JOIN (
SELECT
agent_hours.agent AS agent,
dates.date AS date,
COUNT(*) / 4.0 AS hours_today
FROM Agents.dbo.Agent_Hours AS agent_hours
JOIN (
SELECT DATEADD(dd, 0, DATEDIFF(dd, 0, hour_worked)) as date
FROM Agents.dbo.Agent_Hours GROUP BY DATEADD(dd, 0, DATEDIFF(dd, 0, hour_worked))
) AS dates ON dates.date = DATEADD(dd, 0, DATEDIFF(dd, 0, agent_hours.hour_worked))
WHERE
(status = 'Phone' OR status = 'Meeting') AND
agent_hours.hour_worked >=
DATEADD(dd, 0, DATEDIFF(dd, 0, CAST('11/09/2011 13:01' AS DATETIME)))
AND
agent_hours.hour_worked <
CAST('11/09/2011 13:01' AS DATETIME)
GROUP BY agent_hours.agent, dates.date
) AS h2 ON h2.agent = a.id AND h2.date = d.date
LEFT JOIN sale_transactions AS s
ON a.id = s.agent_hermes_id AND
s.created_on >= a.team_start_date AND
s.created_on <= a.team_end_date AND
DATEADD(dd, 0, DATEDIFF(dd, 0, d.date))
=
DATEADD(dd, 0, DATEDIFF(dd, 0, s.created_on))
LEFT JOIN sold_phrases AS p
ON s.Transaction_ID = p.transaction_id
WHERE
d.date >= '2011-11-07 00:00:00' AND
d.date <= '2011-11-07 11:59:59'
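As a small illustration of the date-truncation expression used above, the sketch below compares it with the CONVERT-to-text form and then builds a half-open range for one whole day. The variable names and the table variable are stand-ins invented for the example, not part of the original schema.

```sql
DECLARE @dt datetime = '2011-11-07T11:59:59';

SELECT
    DATEADD(dd, 0, DATEDIFF(dd, 0, @dt)) AS MidnightAsDatetime, -- 2011-11-07 00:00:00.000, still a datetime
    CONVERT(varchar(10), @dt, 101)       AS DateAsText;         -- '11/07/2011', a string

-- Half-open range covering one whole day. The column itself is left untouched,
-- so an index on it (if one exists on the real table) can be used for a seek.
DECLARE @dayStart datetime = DATEADD(dd, 0, DATEDIFF(dd, 0, @dt));
DECLARE @Agent_Hours table (agent int, hour_worked datetime);   -- stand-in for the real table
INSERT @Agent_Hours VALUES (1, '2011-11-07T09:15:00'), (1, '2011-11-08T09:15:00');

SELECT agent, hour_worked
FROM @Agent_Hours
WHERE hour_worked >= @dayStart
  AND hour_worked <  DATEADD(dd, 1, @dayStart);   -- returns only the 2011-11-07 row
```

Comparing datetime values rather than strings also avoids sorting and joining on text such as '11/07/2011'.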
More important (as Remus Rusanu already wrote) are the indexes. Execute both queries, check which indexes the faster query uses, and force SQL Server to always use them. You can do that with the table hint WITH (INDEX(index_name)).
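For reference, this is roughly what the table hint looks like. The temp table and the index name are hypothetical stand-ins; on the real query you would use the name of whichever index the fast plan picked. Treat it as a sketch: a hint pins the access path, so it is worth revisiting if the data distribution changes.

```sql
-- Stand-in table and index; both names are hypothetical.
CREATE TABLE #Agent_Daily_Demo (agent int NOT NULL, [date] datetime NOT NULL);
CREATE INDEX IX_Agent_Daily_Demo_date ON #Agent_Daily_Demo ([date]);

INSERT #Agent_Daily_Demo VALUES (1, '2011-11-07T08:00:00'), (2, '2011-11-08T08:00:00');

-- The hint goes right after the table alias.
SELECT d.agent, d.[date]
FROM #Agent_Daily_Demo AS d WITH (INDEX(IX_Agent_Daily_Demo_date))
WHERE d.[date] >= '2011-11-07T00:00:00'
  AND d.[date] <= '2011-11-07T11:59:59';

DROP TABLE #Agent_Daily_Demo;
```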

SQL grouping and running total of open items for a date range

I have a table of items that, for sake of simplicity, contains the ItemID, the StartDate, and the EndDate for a list of items.
ItemID StartDate EndDate
1 1/1/2011 1/15/2011
2 1/2/2011 1/14/2011
3 1/5/2011 1/17/2011
...
My goal is to be able to join this table to a table with a sequential list of dates,
and say both how many items are open on a particular date, and also how many items are cumulatively open.
Date ItemsOpened CumulativeItemsOpen
1/1/2011 1 1
1/2/2011 1 2
...
I can see how this would be done with a WHILE loop,
but that has performance implications. I'm wondering how
this could be done with a set-based approach?
SELECT COUNT(CASE WHEN d.CheckDate = i.StartDate THEN 1 ELSE NULL END)
AS ItemsOpened
, COUNT(i.StartDate)
AS ItemsOpenedCumulative
FROM Dates AS d
LEFT JOIN Items AS i
ON d.CheckDate BETWEEN i.StartDate AND i.EndDate
GROUP BY d.CheckDate
This may give you what you want
SELECT DATE,
SUM(ItemOpened) AS ItemsOpened,
COUNT(StartDate) AS ItemsOpenedCumulative
FROM
(
SELECT d.Date, i.startdate, i.enddate,
CASE WHEN i.StartDate = d.Date THEN 1 ELSE 0 END AS ItemOpened
FROM Dates d
LEFT OUTER JOIN Items i ON d.Date BETWEEN i.StartDate AND i.EndDate
) AS x
GROUP BY DATE
ORDER BY DATE
This assumes that your date values are DATE data type. Or, the dates are DATETIME with no time values.
You may find this useful. The recursive part can be replaced with a table. To demonstrate that it works I had to populate some sort of date table. As you can see, the actual SQL is short and simple.
DECLARE @i table (itemid INT, startdate DATE, enddate DATE)
INSERT @i VALUES (1,'1/1/2011', '1/15/2011')
INSERT @i VALUES (2,'1/2/2011', '1/14/2011')
INSERT @i VALUES (3,'1/5/2011', '1/17/2011')
DECLARE @from DATE
DECLARE @to DATE
SET @from = '1/1/2011'
SET @to = '1/18/2011'
-- the recursive sql is strictly to make a datelist between @from and @to
;WITH cte(Date)
AS (
SELECT @from DATE
UNION ALL
SELECT DATEADD(day, 1, DATE)
FROM cte ch
WHERE DATE < @to
)
SELECT cte.Date, sum(case when cte.Date=i.startdate then 1 else 0 end) ItemsOpened, count(i.itemid) ItemsOpenedCumulative
FROM cte
left join @i i on cte.Date between i.startdate and i.enddate
GROUP BY cte.Date
OPTION( MAXRECURSION 0)
If you are on SQL Server 2005+, you could use a recursive CTE to obtain running totals, with the additional help of the ranking function ROW_NUMBER(), like this:
WITH grouped AS (
SELECT
d.Date,
ItemsOpened = COUNT(i.ItemID),
rn = ROW_NUMBER() OVER (ORDER BY d.Date)
FROM Dates d
LEFT JOIN Items i ON d.Date BETWEEN i.StartDate AND i.EndDate
WHERE d.Date BETWEEN @FilterStartDate AND @FilterEndDate
GROUP BY d.Date
),
cumulative AS (
SELECT
Date,
ItemsOpened,
ItemsOpenedCumulative = ItemsOpened
FROM grouped
WHERE rn = 1
UNION ALL
SELECT
g.Date,
g.ItemsOpened,
ItemsOpenedCumulative = c.ItemsOpenedCumulative + g.ItemsOpened
FROM grouped g
INNER JOIN cumulative c ON g.Date = DATEADD(day, 1, c.Date)
)
SELECT *
FROM cumulative
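On SQL Server 2012 or later, a windowed SUM is another set-based way to get a running total, without recursion. The sketch below is an alternative technique, not what any of the answers above use. It runs on the three sample items from the question, with a five-day table variable standing in for the Dates table, and shows three figures per day: items opened that day, items open on that day, and the running count of items opened so far.

```sql
DECLARE @Items table (ItemID int, StartDate date, EndDate date);
INSERT @Items VALUES (1, '20110101', '20110115'),
                     (2, '20110102', '20110114'),
                     (3, '20110105', '20110117');

-- Stand-in for the Dates table; only five days for brevity.
DECLARE @Dates table ([Date] date);
INSERT @Dates VALUES ('20110101'), ('20110102'), ('20110103'), ('20110104'), ('20110105');

WITH daily AS (
    SELECT d.[Date],
           SUM(CASE WHEN i.StartDate = d.[Date] THEN 1 ELSE 0 END) AS ItemsOpened,
           COUNT(i.ItemID) AS ItemsOpenOnDate
    FROM @Dates d
    LEFT JOIN @Items i ON d.[Date] BETWEEN i.StartDate AND i.EndDate
    GROUP BY d.[Date]
)
SELECT [Date], ItemsOpened, ItemsOpenOnDate,
       -- running total of ItemsOpened from the first date up to the current row
       SUM(ItemsOpened) OVER (ORDER BY [Date] ROWS UNBOUNDED PRECEDING) AS RunningOpened
FROM daily
ORDER BY [Date];
-- ItemsOpened per day: 1, 1, 0, 0, 1; RunningOpened: 1, 2, 2, 2, 3
```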

why does adding the where statement to this sql make it run so much slower?

I have inherited a stored procedure that takes a very long time to run (around 3 minutes). I have played around with it, and without the WHERE clause it takes only 12 seconds. None of the tables it references hold a lot of data. Can anybody see any reason why adding the main WHERE clause below makes it take so much longer?
ALTER Procedure [dbo].[MissingReadingsReport] @SiteID INT,
@FormID INT,
@StartDate Varchar(8),
@EndDate Varchar(8)
As
If @EndDate > GetDate()
Set @EndDate = Convert(Varchar(8), GetDate(), 112)
Select Dt.FormID,
DT.FormDAte,
DT.Frequency,
Dt.DayOfWeek,
DT.NumberOfRecords,
Dt.FormName,
dt.OrgDesc,
Dt.CDesc
FROM (Select MeterForms.FormID,
MeterForms.FormName,
MeterForms.SiteID,
MeterForms.Frequency,
DateTable.FormDate,
tblOrganisation.OrgDesc,
CDesc = ( COMPANY.OrgDesc ),
DayOfWeek = CASE Frequency
WHEN 'Day' THEN DatePart(dw, DateTable.FormDate)
WHEN 'WEEK' THEN
DatePart(dw, MeterForms.FormDate)
END,
NumberOfRecords = CASE Frequency
WHEN 'Day' THEN (Select TOP 1 RecordID
FROM MeterReadings
Where
MeterReadings.FormDate =
DateTable.FormDate
And MeterReadings.FormID =
MeterForms.FormID
Order By RecordID DESC)
WHEN 'WEEK' THEN (Select TOP 1 ( FormDate )
FROM MeterReadings
Where
MeterReadings.FormDate >=
DateAdd(d
, -4,
DateTable.FormDate)
And MeterReadings.FormDate
<=
DateAdd(d, 3,
DateTable.FormDate)
AND MeterReadings.FormID =
MeterForms.FormID)
END
FROM MeterForms
INNER JOIN DateTable
ON MeterForms.FormDate <= DateTable.FormDate
INNER JOIN tblOrganisation
ON MeterForms.SiteID = tblOrganisation.pkOrgId
INNER JOIN tblOrganisation COMPANY
ON tblOrganisation.fkOrgID = COMPANY.pkOrgID
/*this is what makes the query run slowly*/
Where DateTable.FormDAte >= @StartDAte
AND DateTable.FormDate <= @EndDate
AND MeterForms.SiteID = ISNULL(@SiteID, MeterForms.SiteID)
AND MeterForms.FormID = IsNull(@FormID, MeterForms.FormID)
AND MeterForms.FormID > 0)DT
Where ( Frequency = 'Day'
And dt.NumberofRecords IS NULL )
OR ( ( Frequency = 'Week'
AND DayOfWeek = DATEPART (dw, Dt.FormDate) )
AND ( FormDate <> NumberOfRecords
OR dt.NumberofRecords IS NULL ) )
Order By FormID
Based on what you've already mentioned, it looks like the tables are properly indexed for the columns in the join conditions but not for the columns in the WHERE clause.
If you're not willing to change the query, it may be worth looking into indexes on the WHERE clause columns, especially the ones wrapped in the NULL check.
Try replacing your select with this:
FROM
(select formid, formname, siteid, frequency, formdate from meterforms
where siteid = isnull(@siteid, siteid) and
meterforms.formid = isnull(@formid, formid) and formid >0
) MeterForms
INNER JOIN
(select formdate from datetable where formdate >= @startdate and formdate <= @enddate) DateTable
ON MeterForms.FormDate <= DateTable.FormDate
INNER JOIN tblOrganisation
ON MeterForms.SiteID = tblOrganisation.pkOrgId
INNER JOIN tblOrganisation COMPANY
ON tblOrganisation.fkOrgID = COMPANY.pkOrgID
/*this is what makes the query run slowly*/
)DT
I would be willing to bet that if you moved the Meterforms where clauses up to the from statement:
FROM (select [columns] from MeterForms WHERE SiteID= ISNULL [etc] ) MF
INNER JOIN [etc]
It would be faster, as the filtering would occur before the join. Also, having your INNER JOIN on your DateTable doing a <= down in your where clause may be returning more than you'd like ... try moving that between up to a subselect as well.
Have you run an execution plan on this yet to see where the bottleneck is?
Random suggestion, coming from an Oracle background:
What happens if you rewrite the following:
AND MeterForms.SiteID = ISNULL(@SiteID, MeterForms.SiteID)
AND MeterForms.FormID = IsNull(@FormID, MeterForms.FormID)
...to
AND (@SiteID is null or MeterForms.SiteID = @SiteID)
AND (@FormID is null or MeterForms.FormID = @FormID)
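The two forms are not exactly equivalent, which a small self-contained sketch can show. The table variable and its rows are invented for illustration. OPTION (RECOMPILE) is an addition that is not part of the suggestion above; it is a common companion to this optional-parameter pattern because it lets the optimizer plan for the actual parameter values, at the cost of compiling on every execution.

```sql
-- Hypothetical stand-in for MeterForms, including one row with a NULL SiteID.
DECLARE @Forms table (FormID int, SiteID int NULL);
INSERT @Forms VALUES (1, 10), (2, 20), (3, NULL);

DECLARE @SiteID int = NULL;   -- "no filter"

-- ISNULL form: with @SiteID NULL this becomes SiteID = SiteID,
-- which is not true for a NULL SiteID, so FormID 3 is dropped.
SELECT FormID
FROM @Forms
WHERE SiteID = ISNULL(@SiteID, SiteID);          -- returns FormID 1 and 2

-- OR form: a NULL parameter really means "all rows".
SELECT FormID
FROM @Forms
WHERE (@SiteID IS NULL OR SiteID = @SiteID)
OPTION (RECOMPILE);                              -- returns FormID 1, 2 and 3
```

If SiteID and FormID can never be NULL in the real table, the two forms return the same rows and only the query plan differs.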