Why does adding the WHERE clause to this SQL make it run so much slower? - sql

I have inherited a stored procedure that takes a very long time to run (around 3 minutes). I have played around with it, and without the WHERE clause it takes only 12 seconds. None of the tables it references hold much data. Can anybody see why adding the main WHERE clause below makes it take so much longer?
ALTER Procedure [dbo].[MissingReadingsReport] @SiteID INT,
@FormID INT,
@StartDate Varchar(8),
@EndDate Varchar(8)
As
If @EndDate > GetDate()
Set @EndDate = Convert(Varchar(8), GetDate(), 112)
Select Dt.FormID,
DT.FormDAte,
DT.Frequency,
Dt.DayOfWeek,
DT.NumberOfRecords,
Dt.FormName,
dt.OrgDesc,
Dt.CDesc
FROM (Select MeterForms.FormID,
MeterForms.FormName,
MeterForms.SiteID,
MeterForms.Frequency,
DateTable.FormDate,
tblOrganisation.OrgDesc,
CDesc = ( COMPANY.OrgDesc ),
DayOfWeek = CASE Frequency
WHEN 'Day' THEN DatePart(dw, DateTable.FormDate)
WHEN 'WEEK' THEN
DatePart(dw, MeterForms.FormDate)
END,
NumberOfRecords = CASE Frequency
WHEN 'Day' THEN (Select TOP 1 RecordID
FROM MeterReadings
Where
MeterReadings.FormDate =
DateTable.FormDate
And MeterReadings.FormID =
MeterForms.FormID
Order By RecordID DESC)
WHEN 'WEEK' THEN (Select TOP 1 ( FormDate )
FROM MeterReadings
Where
MeterReadings.FormDate >=
DateAdd(d
, -4,
DateTable.FormDate)
And MeterReadings.FormDate
<=
DateAdd(d, 3,
DateTable.FormDate)
AND MeterReadings.FormID =
MeterForms.FormID)
END
FROM MeterForms
INNER JOIN DateTable
ON MeterForms.FormDate <= DateTable.FormDate
INNER JOIN tblOrganisation
ON MeterForms.SiteID = tblOrganisation.pkOrgId
INNER JOIN tblOrganisation COMPANY
ON tblOrganisation.fkOrgID = COMPANY.pkOrgID
/*this is what makes the query run slowly*/
Where DateTable.FormDAte >= @StartDAte
AND DateTable.FormDate <= @EndDate
AND MeterForms.SiteID = ISNULL(@SiteID, MeterForms.SiteID)
AND MeterForms.FormID = IsNull(@FormID, MeterForms.FormID)
AND MeterForms.FormID > 0)DT
Where ( Frequency = 'Day'
And dt.NumberofRecords IS NULL )
OR ( ( Frequency = 'Week'
AND DayOfWeek = DATEPART (dw, Dt.FormDate) )
AND ( FormDate <> NumberOfRecords
OR dt.NumberofRecords IS NULL ) )
Order By FormID

Based on what you've already mentioned, it looks like the tables are indexed on the columns in the join conditions but not on the columns in the WHERE clause.
If you're not willing to change the query, it may be worth looking into indexes on the WHERE clause columns, especially those that have the NULL check.

Try replacing your select with this:
FROM
(select formid, formname, siteid, frequency, formdate from meterforms
where siteid = isnull(@siteid, siteid) and
formid = isnull(@formid, formid) and formid > 0
) MeterForms
INNER JOIN
(select formdate from datetable where formdate >= @startdate and formdate <= @enddate) DateTable
ON MeterForms.FormDate <= DateTable.FormDate
INNER JOIN tblOrganisation
ON MeterForms.SiteID = tblOrganisation.pkOrgId
INNER JOIN tblOrganisation COMPANY
ON tblOrganisation.fkOrgID = COMPANY.pkOrgID
)DT

I would be willing to bet that if you moved the MeterForms WHERE conditions up into the FROM clause:
FROM (select [columns] from MeterForms WHERE SiteID = ISNULL [etc] ) MF
INNER JOIN [etc]
it would be faster, as the filtering would occur before the join. Also, your INNER JOIN on DateTable uses <= and is only bounded by the date range down in the WHERE clause, so it may be matching more rows than you'd like; try moving that date range up into a subselect as well.
Have you looked at the execution plan yet to see where the bottleneck is?

Random suggestion, coming from an Oracle background:
What happens if you rewrite the following:
AND MeterForms.SiteID = ISNULL(@SiteID, MeterForms.SiteID)
AND MeterForms.FormID = IsNull(@FormID, MeterForms.FormID)
...to
AND (@SiteID is null or MeterForms.SiteID = @SiteID)
AND (@FormID is null or MeterForms.FormID = @FormID)
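One thing worth checking before swapping the two forms: they only return the same rows when the filtered column is never NULL. The sketch below is a minimal illustration of that semantic difference, not a performance test. It runs on SQLite through Python's sqlite3 module rather than SQL Server (IFNULL stands in for ISNULL), and the table contents are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE MeterForms (FormID INTEGER, SiteID INTEGER)")
conn.executemany(
    "INSERT INTO MeterForms VALUES (?, ?)",
    [(1, 10), (2, 20), (3, None)],  # hypothetical data: form 3 has no site
)

# Pattern from the original procedure (IFNULL is SQLite's spelling of ISNULL).
isnull_form = (
    "SELECT FormID FROM MeterForms "
    "WHERE SiteID = IFNULL(:site, SiteID) ORDER BY FormID"
)
# Suggested rewrite.
or_form = (
    "SELECT FormID FROM MeterForms "
    "WHERE (:site IS NULL OR SiteID = :site) ORDER BY FormID"
)

def form_ids(sql, site):
    """Run one of the two filters with the given optional site parameter."""
    return [row[0] for row in conn.execute(sql, {"site": site})]

# With no site supplied, NULL = NULL is not true, so the first form drops form 3.
print(form_ids(isnull_form, None))  # [1, 2]
print(form_ids(or_form, None))      # [1, 2, 3]
```

If SiteID and FormID are declared NOT NULL, the two forms are interchangeable and only the execution plan differs.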


SQL with while loop to DAX conversion

I am trying to convert SQL that uses a WHILE loop into DAX. I need to build this query without temp tables, because access to the database is restricted and I only have views to work with. I believe the best option for me is to write it in DAX. Could someone help with it?
DECLARE @sd DATETIME
DECLARE @ed DATETIME
SELECT @sd = CONVERT(DATETIME, '2021-01-31')
SELECT @ed = GETDATE()
DECLARE @date DATETIME = EOMONTH(@sd)
WHILE ( (@date) <= @ed )
BEGIN
SELECT MONTH(@date) as Month, YEAR(@date) as Year, DAY(@date) as Day, A.*
FROM [people] A
WHERE A.effective_date = (SELECT MAX(B.effective_date)
FROM [people] B
WHERE B.employee_id = A.employee_id
AND B.record_id = A.record_id
AND B.effective_date <= @date)
AND A.effective_sequence = (SELECT MAX(C.effective_sequence)
FROM [people] C
WHERE C.employee_id = A.employee_id
AND C.record_id = A.record_id
AND C.effective_date = A.effective_date)
ORDER BY A.employee_id;
SET @date = EOMONTH(DATEADD(MONTH,1,@date))
END
While you could do this as a view, you would either have to hard-code the start and end dates, or filter them afterwards (which is likely to be inefficient). Instead you can do this as an inline Table Valued Function:
We can use a virtual tally-table (generated with a couple cross-joins) to generate a row for each month.
We can use row-numbering instead of the two subqueries.
CREATE FUNCTION dbo.GetData (@sd DATETIME, @ed DATETIME)
RETURNS TABLE AS RETURN
WITH L0 AS (
SELECT *
FROM (VALUES(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1)) v(n)
),
L1 AS (
SELECT 1 n FROM L0 a CROSS JOIN L0 b
)
SELECT
MONTH(p.MonthEnd) as Month,
YEAR(p.MonthEnd) as Year,
DAY(p.MonthEnd) as Day,
p.* -- specify columns
FROM (
SELECT *,
-- rn = 1 marks the latest row per employee and record as of each month end
ROW_NUMBER() OVER (PARTITION BY m.MonthEnd, p.employee_id, p.record_id ORDER BY p.effective_date DESC, p.effective_sequence DESC) AS rn
FROM [people] p
CROSS JOIN (
SELECT TOP (DATEDIFF(month, @sd, @ed) + 1)
-- EOMONTH is applied after adding months so every row lands on a month end
EOMONTH(DATEADD(month, ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) - 1, @sd)) AS MonthEnd
FROM L1
) m
WHERE p.effective_date <= m.MonthEnd
) p
WHERE p.rn = 1
;
Then in PowerBI you can just do for example
SELECT *
FROM dbo.GetData ('2021-01-31', GETDATE()) d
ORDER BY
d.employee_id
Note that you cannot put the ORDER BY within the function: SQL Server rejects ORDER BY in an inline function unless TOP, OFFSET or FOR XML is also specified.
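The row-numbering idea can be tried outside SQL Server. The following is a small sketch using SQLite through Python's sqlite3 module (window functions need SQLite 3.25 or later). The people rows, the grade column and the month_ends table are invented for illustration; month_ends stands in for the tally-table output.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE people (employee_id INTEGER, record_id INTEGER,
                     effective_date TEXT, effective_sequence INTEGER,
                     grade TEXT);
INSERT INTO people VALUES
  (1, 1, '2021-01-10', 1, 'A'),
  (1, 1, '2021-02-15', 1, 'B'),
  (1, 1, '2021-02-15', 2, 'C'),
  (2, 1, '2021-03-01', 1, 'X');
-- Hypothetical stand-in for the generated month-end rows.
CREATE TABLE month_ends (month_end TEXT);
INSERT INTO month_ends VALUES ('2021-01-31'), ('2021-02-28'), ('2021-03-31');
""")

# For each month end, number the rows effective on or before it, newest first,
# then keep row 1: the latest date and, within that date, the highest sequence.
sql = """
SELECT month_end, employee_id, grade
FROM (
    SELECT m.month_end, p.employee_id, p.grade,
           ROW_NUMBER() OVER (
               PARTITION BY m.month_end, p.employee_id, p.record_id
               ORDER BY p.effective_date DESC, p.effective_sequence DESC
           ) AS rn
    FROM people p
    JOIN month_ends m ON p.effective_date <= m.month_end
) ranked
WHERE rn = 1
ORDER BY month_end, employee_id
"""
snapshots = conn.execute(sql).fetchall()
for row in snapshots:
    print(row)
```

Employee 1 shows grade A at the end of January and grade C from February onward, which matches what the two MAX subqueries in the original loop would pick.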

Trying Exclude 2 days within the year

I added my dates to the query so you can see what dates I'm running. From these dates I'm trying to exclude 2 days, 10/6/2016 and 10/7/2016.
DECLARE @Startdate AS DATETIME
DECLARE @EndDate AS DATETIME
SET @Startdate = '10/1/2015'
SET @EndDate = '9/30/2016'
SELECT A.agent_name, COUNT(*) AS CH, (CAST(SUM(reporting_call_matrix.talk_time) / COUNT(reporting_call_matrix.answer_time) AS float)
+ CAST(SUM(reporting_call_matrix.hold_time) AS float) / COUNT(reporting_call_matrix.answer_time)) +
CAST(SUM(reporting_call_matrix.work_time) AS float) / COUNT(reporting_call_matrix.answer_time) AS AHT,
answer_agent_id
FROM reporting_call_matrix INNER JOIN
reporting_agents AS A ON reporting_call_matrix.answer_agent_id = A.agent_id INNER JOIN
reporting_split_info ON reporting_call_matrix.split = reporting_split_info.split
WHERE (reporting_call_matrix.answer_agent_id IS NOT NULL) AND (reporting_call_matrix.split IN (9,23)) AND
(reporting_call_matrix.queued_time >= @StartDate) AND (reporting_call_matrix.queued_time < DATEADD(d, 1, @EndDate)) AND
(reporting_call_matrix.answer_time IS NOT NULL)
GROUP BY A.agent_name,answer_agent_id
No idea what you are actually trying to do here, but I cleaned up this query so it is more legible. I also greatly simplified that nightmarish calculation. You don't have to cast everything to a float; simply multiply by 1.0 before dividing. Also, float is probably not a great choice if you want accuracy, because it is an approximate datatype, whereas 1.0 is numeric, which is an exact datatype.
With some aliases and formatting this is so much easier to read.
SELECT A.agent_name
, COUNT(*) AS CH
, (SUM(rcm.talk_time) + SUM(rcm.hold_time) + SUM(rcm.work_time)) * 1.0 / COUNT(rcm.answer_time) AS AHT
, rcm.answer_agent_id
FROM reporting_call_matrix rcm
INNER JOIN reporting_agents AS A ON rcm.answer_agent_id = A.agent_id
INNER JOIN reporting_split_info rsi ON rcm.split = rsi.split
WHERE rcm.answer_agent_id IS NOT NULL
AND rcm.split IN (9,23)
AND rcm.queued_time >= @StartDate
AND rcm.queued_time < DATEADD(day, 1, @EndDate)
AND rcm.answer_time IS NOT NULL
GROUP BY A.agent_name
, rcm.answer_agent_id
Now, what is the actual question here?
With your recent update I will hazard a guess that queued_time is a date datatype?
Why not simply add one more predicate to your where clause?
AND rcm.queued_time not in ('2016-10-06', '2016-10-07')
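The NOT IN predicate above relies on the guess that queued_time holds bare dates. If the column turns out to carry a time of day, a half-open range is one way to exclude whole days. The sketch below illustrates the difference with SQLite through Python's sqlite3 module; the calls table and its rows are invented, and SQLite compares these values as ISO-formatted text rather than as a datetime type.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE calls (call_id INTEGER, queued_time TEXT)")
conn.executemany(
    "INSERT INTO calls VALUES (?, ?)",
    [
        (1, "2016-10-05 09:15:00"),
        (2, "2016-10-06 14:30:00"),
        (3, "2016-10-07 08:00:00"),
        (4, "2016-10-08 11:45:00"),
    ],
)

def call_ids(where):
    """Return the call ids that survive the given WHERE condition."""
    sql = f"SELECT call_id FROM calls WHERE {where} ORDER BY call_id"
    return [row[0] for row in conn.execute(sql)]

# Bare dates never equal a value that has a time part, so nothing is excluded.
not_in = call_ids("queued_time NOT IN ('2016-10-06', '2016-10-07')")

# Half-open range: drop everything from the start of the 6th up to,
# but not including, the start of the 8th.
ranged = call_ids(
    "NOT (queued_time >= '2016-10-06' AND queued_time < '2016-10-08')"
)

print(not_in)  # [1, 2, 3, 4]
print(ranged)  # [1, 4]
```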

Running Total as SUM () OVER () in SQL when values are missing

I'm trying to calculate a running total, and it works correctly, but only when the values I'm conditioning on are available. When some are unavailable the calculation goes wrong: NULLs appear and the final running total is incorrect. Here's an example of such a situation:
I would like the result to look like the screenshot below (with the missing months added), which should give the correct running total (named backlog here) at the end:
Is there any way to make the full_year and month_number rows appear with a value of 0 when there was no data for that month?
My current query is as below:
IF OBJECT_ID('tempdb..#Temp4') IS NOT NULL BEGIN
drop table #Temp4
end
SELECT * into #Temp4
from (
SELECT
datepart(yy, t3.[datestamp]) AS full_year
,datepart(mm, t3.[datestamp]) AS month_number
,count(*) as number_of_activities
,t2.affected_item
FROM [sm70prod].[dbo].[ACTSVCMGTM1] AS t3
JOIN [sm70prod].[dbo].[INCIDENTSM1] AS t2 ON t3.number = t2.incident_id
WHERE
t2.affected_item like 'service'
AND (t3.[type] LIKE 'Open')
GROUP BY t2.affected_item, datepart(yy, t3.[datestamp]), datepart(mm, t3.[datestamp])
)
as databases (full_year, month_number, number_of_activities, affected_item)
;
IF OBJECT_ID('tempdb..#Temp5') IS NOT NULL BEGIN
drop table #Temp5
end
SELECT * into #Temp5
from (
SELECT
datepart(yy, t3.[datestamp]) AS full_year
,datepart(mm, t3.[datestamp]) AS month_number
,count(*) as number_of_activities
,t2.affected_item
FROM [sm70prod].[dbo].[ACTSVCMGTM1] AS t3
JOIN [sm70prod].[dbo].[INCIDENTSM1] AS t2 ON t3.number = t2.incident_id
WHERE
t2.affected_item like 'service'
AND (t3.[type] LIKE 'Closed')
GROUP BY t2.affected_item, datepart(yy, t3.[datestamp]), datepart(mm, t3.[datestamp])
)
as databases (full_year, month_number, number_of_activities, affected_item)
select * from (select o.full_year
,o.month_number
,o.number_of_activities as [open]
,c.number_of_activities as [close]
,sum(o.number_of_activities - c.number_of_activities) over (ORDER BY c.full_year, c.month_number) as [backlog]
from #Temp4 o full join #Temp5 c on o.full_year = c.full_year and o.month_number = c.month_number) as sub
order by full_year, month_number
https://msdn.microsoft.com/en-gb/library/ms190349%28v=sql.110%29.aspx
Try using the COALESCE function:
COALESCE(someattribute, 0);
If the attribute is NULL, the value zero will be used instead.
Also note:
When comparing varchars without wildcards you should use the = operator and not the LIKE operator.
I found the answer. For the missing months I can use the code below:
If(OBJECT_ID('tempdb..#Temp6') Is Not Null)
Begin
Drop Table #Temp6
End
create table #Temp6
(
full_year int
,month_number int
)
; WITH cteStartDate AS
(SELECT StartDate = '2007-01-01'),
cteSequence(SeqNo)
AS (SELECT 0
UNION ALL
SELECT SeqNo + 1
FROM cteSequence
WHERE SeqNo < DATEDIFF(MM,(SELECT StartDate FROM cteStartDate),getdate()))
INSERT INTO #Temp6
SELECT datepart(yy, DATEADD(MM,SeqNo,(SELECT StartDate FROM cteStartDate))) AS full_year
,datepart(mm, DATEADD(MM,SeqNo,(SELECT StartDate FROM cteStartDate))) AS month_number
FROM cteSequence
OPTION (MAXRECURSION 0)
Then I can full join this with #Temp4 using ISNULL(o.number_of_activities, 0), do the same for #Temp5, and it will work.
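Put together, the approach is: start from a complete list of months, left join the open and closed counts onto it, replace missing counts with 0, and only then accumulate. Here is a minimal sketch of that shape using SQLite through Python's sqlite3 module (window functions need SQLite 3.25 or later). The months, opened and closed tables and their numbers are invented stand-ins for #Temp6, #Temp4 and #Temp5.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE months (full_year INTEGER, month_number INTEGER);
INSERT INTO months VALUES (2015, 1), (2015, 2), (2015, 3), (2015, 4);
CREATE TABLE opened (full_year INTEGER, month_number INTEGER, n INTEGER);
INSERT INTO opened VALUES (2015, 1, 5), (2015, 3, 4);
CREATE TABLE closed (full_year INTEGER, month_number INTEGER, n INTEGER);
INSERT INTO closed VALUES (2015, 1, 2), (2015, 2, 1), (2015, 4, 3);
""")

# Every month appears once; COALESCE turns a missing count into 0 so the
# running SUM never sees a NULL.
sql = """
SELECT m.full_year, m.month_number,
       COALESCE(o.n, 0) AS opened,
       COALESCE(c.n, 0) AS closed,
       SUM(COALESCE(o.n, 0) - COALESCE(c.n, 0))
           OVER (ORDER BY m.full_year, m.month_number) AS backlog
FROM months m
LEFT JOIN opened o
       ON o.full_year = m.full_year AND o.month_number = m.month_number
LEFT JOIN closed c
       ON c.full_year = m.full_year AND c.month_number = m.month_number
ORDER BY m.full_year, m.month_number
"""
rows = conn.execute(sql).fetchall()
for row in rows:
    print(row)
```

Ordering the window by the calendar table's columns, rather than by one side of the join, also avoids sorting on NULL year and month values.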

TSQL how to use if else in Where clause

I want to create a report with parameters for the user to select:
-IsApprovedDate
-IsCatchDate
I would like to know how to use if/else in the WHERE clause.
For example, if the user selects IsApprovedDate the report will look up based on the approved date; otherwise it will look up based on the catch date. My query gets the top 10 fish per award, ordered by weight. Here is my query:
;WITH CTE AS
(
select Rank() OVER (PARTITION BY c.trophyCatchCertificateTypeId order by c.catchWeight desc ) as rnk
,c.id,c.customerId, Cust.firstName + ' '+Cust.lastName as CustomerName
,CAST(CONVERT(varchar(10),catchWeightPoundsComponent)+'.'+CONVERT(varchar(10),catchWeightOuncesComponent) as numeric(6,2) ) WLBS
,c.catchGirth,c.catchLength,ct.description county
,t.description award--
,c.trophyCatchCertificateTypeId
,s.specificSpecies--
,c.speciesId
from Catches c
INNER JOIN TrophyCatchCertificateTypes t on c.trophyCatchCertificateTypeId = t.id
INNER JOIN Species s on c.speciesId = s.id
INNER JOIN Counties ct on c.countyId = ct.id
INNER JOIN Customers Cust on c.customerId = cust.id
Where c.bigCatchCertificateTypeId is not null
and c.catchStatusId =1
and c.speciesId =1 and c.isTrophyCatch =1
and c.catchDate >= @startDay and c.catchDate<=@endDay
)
Select * from CTE c1
Where rnk <=10
Just use conditional logic for this:
where . . . and
((@IsApprovedDate = 1 and c.ApprovedDate >= @startDay and c.ApprovedDate <= @endDay) or
(@IsCatchDate = 1 and c.catchDate >= @startDay and c.catchDate <= @endDay)
)
EDIT:
I would actually write this as:
where . . . and
((@IsApprovedDate = 1 and c.ApprovedDate >= @startDay and c.ApprovedDate < dateadd(day, 1, @endDay)) or
(@IsCatchDate = 1 and c.catchDate >= @startDay and c.catchDate < dateadd(day, 1, @endDay))
)
This is a safer construct because it works both when the date values have times and when they do not.
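To see the flag-driven filter and the half-open end date working together, here is a small sketch on SQLite through Python's sqlite3 module. It simplifies the two report parameters to a single hypothetical use_approved flag, uses date(:end_day, '+1 day') in place of T-SQL's dateadd, and the Catches rows are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Catches (id INTEGER, catchDate TEXT, approvedDate TEXT)"
)
conn.executemany(
    "INSERT INTO Catches VALUES (?, ?, ?)",
    [
        (1, "2016-03-31 18:20:00", "2016-04-02 09:00:00"),
        (2, "2016-03-15 07:05:00", "2016-03-31 16:45:00"),
        (3, "2016-04-01 06:00:00", "2016-04-03 10:30:00"),
    ],
)

# Only the branch whose flag test is true can match; each branch uses
# "before the day after end_day" so times on the last day are included.
sql = """
SELECT id FROM Catches
WHERE (:use_approved = 1
       AND approvedDate >= :start_day
       AND approvedDate < date(:end_day, '+1 day'))
   OR (:use_approved = 0
       AND catchDate >= :start_day
       AND catchDate < date(:end_day, '+1 day'))
ORDER BY id
"""

def catch_ids(use_approved):
    """Filter March 2016 by approved date or by catch date."""
    params = {
        "use_approved": int(use_approved),
        "start_day": "2016-03-01",
        "end_day": "2016-03-31",
    }
    return [row[0] for row in conn.execute(sql, params)]

print(catch_ids(False))  # [1, 2]
print(catch_ids(True))   # [2]
```

Catch 1 was made at 18:20 on the last day of the range; a plain <= comparison against the end date would have dropped it.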
Performance is likely to be better if you build the WHERE clause dynamically in your code and then execute it, because the query then contains only the date predicate that is actually needed.

Query is not returning proper values

I have the query below which is a mammoth:
DECLARE @Start Date, @End Date, @DaySpan int, @UserId int, @ProjectId int
SET @Start = '7/08/2014 12:00 AM -05:00';
SET @End = '7/27/2014 12:00 AM -05:00';
SET @DaySpan = 1;
SET @UserId = 102;
SET @ProjectId = 2065;
WITH T(StartDate, EndDate)
AS (
SELECT @Start StartDate, DATEADD(DAY, @DaySpan - 1, @Start) EndDate
UNION ALL
SELECT DATEADD(DAY, 1, EndDate) StartDate, DATEADD(DAY, @DaySpan, EndDate) FROM T WHERE DATEADD(DAY, @DaySpan, EndDate) <= @End
)
SELECT convert(datetimeoffset, T.StartDate) StartDate, T.EndDate, ISNULL(Completes, 0) Completes, SUM(h.Hours) Hours, SUM(h.Hours) / NULLIF(Completes, 0) HoursPerRecruit,
ISNULL(Completes, 0) / NULLIF(SUM(h.Hours), 0) RecruitsPerHour
FROM T LEFT JOIN (
SELECT StartDate, EndDate, COUNT(r.Id) Completes
FROM Respondents r JOIN T st ON r.RecruitedOn >= st.StartDate AND r.RecruitedOn < DATEADD(day, 1, st.EndDate)
WHERE r.RecruitingStatus = 7
AND RecruitedBy = @UserId
AND r.ProjectId = @ProjectId -- **REMOVE Line If you just want by User**
GROUP BY st.StartDate, st.EndDate
) c ON T.StartDate = c.StartDate AND T.EndDate = c.EndDate
LEFT JOIN (
SELECT st.StartDate, st.EndDate, SUM(Hours) Hours
FROM T st JOIN TimeEntries te ON te.Date >= CONVERT(DATE, st.StartDate) AND te.Date < DATEADD(day, DATEDIFF(day,0,CONVERT(DATE, st.EndDate)),1)
JOIN Users u ON te.HarvestUserId = u.HarvestId
--JOIN Projects PR ON te.HarvestProjectId = PR.Id
WHERE u.Id = @UserId
GROUP BY st.StartDate, st.EndDate
) h ON T.StartDate = h.StartDate AND T.EndDate = h.EndDate
GROUP BY T.StartDate, T.EndDate, Completes
ORDER BY T.StartDate
OPTION(MAXRECURSION 32767)
It returns results like below:
StartDate EndDate Completes Hours HoursPerRecruit RecruitsPerHour
2014-07-10 00:00:00.0000000 +00:00 2014-07-10 6 3.00 0.500000 2.00000000000000000000000000
It works great. But now I want to limit the hours returned by project. In the query you will see a commented-out line, JOIN Projects PR ON te.HarvestProjectId = PR.Id. When I add that bit of code it completely messes up the calculations and the hours come back empty, like so:
StartDate EndDate Completes Hours HoursPerRecruit RecruitsPerHour
2014-07-10 00:00:00.0000000 +00:00 2014-07-10 6 NULL NULL NULL
What am I missing that is making HoursPerRecruit and RecruitsPerHour NULL? I can't seem to figure it out.
I know this isn't a complete answer, but I can't format code in a comment.
Look at just this bit of code, and execute it with some valid value for the parameter:
SELECT st.StartDate, st.EndDate, SUM(Hours) Hours
FROM T st JOIN TimeEntries te ON te.Date >= CONVERT(DATE, st.StartDate) AND te.Date < DATEADD(day, DATEDIFF(day,0,CONVERT(DATE, st.EndDate)),1)
JOIN Users u ON te.HarvestUserId = u.HarvestId
--JOIN Projects PR ON te.HarvestProjectId = PR.Id
WHERE u.Id = @UserId
GROUP BY st.StartDate, st.EndDate
Then un-comment the commented line. Does it return no rows? I'm guessing that will be the case based on what you describe.
If so, then look at the results of this:
SELECT * FROM Projects
and see if you can figure out why no rows from the Projects table are joining to the TimeEntries table. Maybe you're joining on the wrong columns, or there's a mis-match in the data format.
If you are trying to limit the results, the condition should go in the WHERE clause below your commented-out join. If Projects.Id is directly linked to TimeEntries.HarvestProjectId with no other conditions, you should be able to change the line
WHERE u.Id = @UserId
to:
WHERE u.Id = @UserId and te.HarvestProjectId=@ProjectID