I am running a stored procedure in which i have the following query
SELECT TOP 1 @TopValue = t2.myDecimal
from table1 t1, table2 t2
where t1.ID = t2.table1ID
and CONVERT( VARCHAR(8), t1.Created, 112 ) = @stringYYYYMMDD
and t1.Term = @Label
@TopValue is decimal(8,6)
@stringYYYYMMDD is a string representing a date in the format YYYYMMDD
@Label is a simple varchar(8)
This query is called for every row of my data set, which can range from 10 to 5000 rows. If I comment out this query, the procedure executes in under 2 seconds. With the query included, it just never ends.
I am using Microsoft SQL Server Management Studio 2008 R2.
Thank you for your help
First, you should use explicit join syntax. Second, a TOP without an ORDER BY is always suspicious. So, your query as I see it is:
select TOP 1 @TopValue = t2.myDecimal
from table1 t1 join
     table2 t2
     on t1.ID = t2.table1ID
where CONVERT( VARCHAR(8), t1.Created, 112 ) = @stringYYYYMMDD and t1.Term = @Label
You can speed this up with some indexes. But, before doing that, you want to change the date comparison:
where t1.Created >= convert(datetime, @stringYYYYMMDD, 112) and
      t1.Created < convert(datetime, @stringYYYYMMDD, 112) + 1 and
      t1.Term = @Label
Moving the function from the column to the constant makes the comparison "sargable", meaning that indexes can be used for it.
Next, create the indexes table1(Term, Created, Id) and table2(table1Id). These indexes should boost performance.
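For concreteness, the index definitions would look something like this (the index names are my own, chosen for illustration):

create index ix_table1_Term_Created_Id on table1 (Term, Created, Id);
create index ix_table2_table1ID on table2 (table1ID);

With the rewritten date comparison, the first index covers both the Term equality and the Created range, with Id available for the join; the second supports the join from the other side.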
I have this SQL statement for creating a view:
create view RecordYearsTwo
as
select
Record.RecordID, RecordValue.Value
from
Record
join
RecordValue on Record.RecordID = RecordValue.RecordID
where
len(RecordValue.Value) = 4
and RecordValue.Value like '[16-20][0-9][0-9][0-9]%'
and RecordValue.Value like '%[16-20][0-9][0-9][0-9]'
and RecordValue.Value != '26 Mar 1850';
When I then run
select *
from Record
join RecordYearsTwo on Record.RecordID = RecordYearsTwo.RecordID
where cast(RecordYearsTwo.Value as int) >= 1800
I get this error
Conversion failed when converting the nvarchar value '26 Mar 1850' to data type int.
My understanding is that '26 Mar 1850' shouldn't even exist in my view, because everything in my view should have a length of 4, and I specifically said it should not equal '26 Mar 1850'.
Any ideas?
The criterion for that datestamp isn't needed, because even the first criterion wouldn't accept it (too long). And those LIKE criteria don't need the % if only 4 characters are expected.
create view RecordYearsTwo as
select rec.RecordID, val.Value
from Record rec
join RecordValue val on val.RecordID = rec.RecordID
where len(val.Value) = 4
and (val.Value like '1[6-9][0-9][0-9]' or val.Value like '20[0-9][0-9]')
And to avoid the error you could use TRY_CAST instead.
select *
from Record
join RecordYearsTwo on Record.RecordID = RecordYearsTwo.RecordID
where try_cast(RecordYearsTwo.Value as int) >= 1800
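For illustration: TRY_CAST returns NULL instead of raising an error when a conversion fails, and NULL never satisfies the >= comparison, so stray values simply drop out:

select try_cast('26 Mar 1850' as int) as bad_value,  -- NULL
       try_cast('1850' as int) as good_value         -- 1850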
With multiple expressions in the where clause, you can't guarantee what order they get executed in. Try:
create view RecordYearsTwo as
select r.RecordID, v.Value
from Record r
join (
select *
from RecordValue
where len(Value) = 4
) v on r.RecordID = v.RecordID
where v.Value like '[16-20][0-9][0-9][0-9]%'
and v.Value like '%[16-20][0-9][0-9][0-9]'
First, your view conditions do not make sense. I think you want:
create view RecordYearsTwo as
select r.RecordID, rv.Value
from Record r join
RecordValue rv
on r.RecordID = rv.RecordID
where len(rv.Value) = 4 and
      try_convert(int, rv.Value) >= 1600 and
      try_convert(int, rv.Value) < 2100;
Your logic doesn't make sense. I find this amusing: v.Value like '[16-20][0-9][0-9][0-9]%'. That LIKE pattern matches a single character that is '1', anything between '6' and '2' (which is nothing), or '0', not a number from 16 to 20. I understand what you mean, but SQL Server does not.
Then, the view does not get executed first. You have no idea what the order of execution is, so for your query, you want try_convert() again:
select *
from Record r join
     RecordYearsTwo ry2
on r.RecordID = ry2.RecordID
where try_convert(int, ry2.Value) >= 1800
How do I coerce SQL into seeking against my index in this scenario? I have a cross apply which, if fed static values, seeks correctly. If fed input from the outer rows, it fails to generate a plan. What's the difference... shouldn't it be able to take the rows from the topmost operator and feed them into the cross apply?
select *
from AccessorGrantPermissableAssociations a
cross apply
(
select z.AccessorId, z.PermissableId, max(z.CreatedDate) CreatedDate
-- notice forceseek (cannot generate a query plan when using a reference to alias 'a';
-- works fine when provided static values)
from AccessorGrant z (forceseek)
where
-- works
z.AccessorId = 1 and z.PermissableId = 1
-- doesn't work
--z.AccessorId = a.AccessorId and z.PermissableId = a.PermissableId
and z.CreatedDate <= cast(switchoffset(@asOfMoment, '-00:00') as datetime2)
group by z.AccessorId, z.PermissableId
) b
I can prove that the index works because I can execute the following with a fast seek:
select z.AccessorId, z.PermissableId, max(z.CreatedDate) CreatedDate
from AccessorGrant z (forceseek)
where z.AccessorId = 1 and z.PermissableId = 1
and z.CreatedDate <= cast(switchoffset(@asOfMoment, '-00:00') as datetime2)
group by z.AccessorId, z.PermissableId
For your info, there is an index on AccessorGrant:
(AccessorId, PermissableId, CreatedDate)
To reiterate the question:
Why doesn't the same query work in a cross apply when it does work if provided static values? How can I get the most recent date for every pair of AccessorId and PermissableId with an efficient plan?
Update: plans (PastePlan didn't work for me).
Here is a plan using z.AccessorId = 1 and z.PermissableId = 1:
Here is a plan using z.AccessorId = a.AccessorId and z.PermissableId = a.PermissableId:
This looks like a slight variation of a classic top-n-per-group problem.
It can be done with CROSS APPLY or with ROW_NUMBER. The best method depends on your data distribution.
If we keep the CROSS APPLY approach, I would rewrite your query like this:
select *
from
AccessorGrantPermissableAssociations AS a
cross apply
(
select TOP(1)
z.AccessorId, z.PermissableId, z.CreatedDate
from
AccessorGrant AS z
where
z.AccessorId = a.AccessorId
and z.PermissableId = a.PermissableId
and z.CreatedDate <= cast(switchoffset(@asOfMoment, '-00:00') as datetime2)
ORDER BY z.CreatedDate DESC
) AS b
;
It produces the same result, but with an explicit instruction to the server to get only one row from AccessorGrant for each row from AccessorGrantPermissableAssociations. It looks like the optimizer is not smart enough to convert MAX into TOP(1) when it is buried in a subquery like this; it can do that transformation in the simple query, but not here.
If it still doesn't do seek, change the index to match the query exactly: (AccessorId, PermissableId, CreatedDate DESC).
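A sketch of that index, using the table from the question (the index name is my own):

create nonclustered index IX_AccessorGrant_Accessor_Permissable_Created
on AccessorGrant (AccessorId, PermissableId, CreatedDate desc);

Storing CreatedDate descending lets the TOP(1) ... ORDER BY z.CreatedDate DESC read the first qualifying row without a sort.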
Most likely if you write the query in this form you would not need a FORCESEEK hint.
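For comparison, the ROW_NUMBER variant mentioned at the start might look like this (a sketch against the same tables, keeping the newest grant per pair):

select a.*, b.CreatedDate
from AccessorGrantPermissableAssociations as a
join (
    select z.AccessorId, z.PermissableId, z.CreatedDate,
           ROW_NUMBER() over (partition by z.AccessorId, z.PermissableId
                              order by z.CreatedDate desc) as rn
    from AccessorGrant as z
    where z.CreatedDate <= cast(switchoffset(@asOfMoment, '-00:00') as datetime2)
) as b
  on b.AccessorId = a.AccessorId
 and b.PermissableId = a.PermissableId
 and b.rn = 1;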
Go back to z.AccessorId = a.AccessorId and z.PermissableId = a.PermissableId in your b subquery, but add WHERE a.AccessorId = 1 and a.PermissableId = 1 to the main query, and get rid of the FORCESEEK table hint.
select *
from AccessorGrantPermissableAssociations a
cross apply
(
select z.AccessorId, z.PermissableId, max(z.CreatedDate) CreatedDate
from AccessorGrant z
where
z.AccessorId = a.AccessorId and z.PermissableId = a.PermissableId
and z.CreatedDate <= cast(switchoffset(@asOfMoment, '-00:00') as datetime2)
group by z.AccessorId, z.PermissableId
) b
WHERE a.AccessorId = 1 and a.PermissableId = 1
Your CROSS APPLY is evaluated for every row in your outer query, so you want to limit the outer query as much as possible.
But really, are you not after just the max created date? It should more properly be this:
select *
from AccessorGrantPermissableAssociations a
cross apply
(
select max(z.CreatedDate) CreatedDate
from AccessorGrant z
where
z.AccessorId = a.AccessorId and z.PermissableId = a.PermissableId
and z.CreatedDate <= cast(switchoffset(@asOfMoment, '-00:00') as datetime2)
) b
WHERE a.AccessorId = 1 and a.PermissableId = 1
I'm trying to use the ISNULL, COALESCE or CASE on this ON clause but it gives me an error.
Incorrect syntax near the keyword 'between'.
I need to check whether a date from one table falls between two dates from the other one, but if the date doesn't exist in the second table, then use the previous date.
table1 t1 JOIN table2 t2
ON t1.code = t2.code
AND ISNULL(cast(dateadd(d,-1,t1.UtcFinishTime) as date) BETWEEN t2.TransactionFirstDate and t2.TransactionLastDate,cast(dateadd(d,-2,t1.UtcFinishTime) as date) BETWEEN t2.TransactionFirstDate and t2.TransactionLastDate)
Many thanks.
Your syntax is incorrect because ISNULL returns a value, so it has to wrap the two candidate values, not the entire BETWEEN predicate. You're likely after something like:
AND ISNULL(
cast(dateadd(d,-1,t1.UtcFinishTime) as date),
cast(dateadd(d,-2,t1.UtcFinishTime) as date)
)
BETWEEN t2.TransactionFirstDate and t2.TransactionLastDate
Or use CASE to the same effect. But for performance's sake, I'd advise you to do the check beforehand and then run a specific query for either situation.
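For completeness, since the question also mentions COALESCE, the same fix with it would be (same tables and columns as above):

AND COALESCE(
    cast(dateadd(d,-1,t1.UtcFinishTime) as date),
    cast(dateadd(d,-2,t1.UtcFinishTime) as date)
)
BETWEEN t2.TransactionFirstDate and t2.TransactionLastDate

With two arguments, ISNULL and COALESCE behave the same here; COALESCE just also accepts more than two fallbacks.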
I couldn't check it, but off the top of my head the query below may help:
SELECT *
from table1 t1 JOIN table2 t2
ON t1.code = t2.code AND
CASE
    WHEN dateadd(day,-1,t1.UtcFinishTime) is null
    THEN cast(dateadd(day,-2,t1.UtcFinishTime) as date)
    ELSE cast(dateadd(day,-1,t1.UtcFinishTime) as date)
END
BETWEEN t2.TransactionFirstDate and t2.TransactionLastDate
This query returns a set of dates from tblValue, whose FieldValue column is of type nvarchar(4000):
SELECT t1.FieldValue FROM (SELECT FieldValue
FROM tblValue
WHERE FieldID = 4) t1
WHERE DateAdd(day, -90, t1.FieldValue) <= GETDATE()
This works, but instead of hard-coding the FieldID of 4, I'd like to get all FieldValues for those which have the type "Expiration".
This query returns 4.
SELECT FieldID FROM tblField WHERE FieldType = 'Expiration'
So, I expect this query's innermost subquery to return 4, and then to have the DateAdd applied only to those Expiration values which are yielded from t1 in the outermost subquery, which is what happens in the working first example.
SELECT t1.FieldValue FROM (SELECT FieldValue
FROM tblValue
WHERE FieldID = (SELECT FieldID FROM tblField WHERE FieldType = 'Expiration')) t1
WHERE DateAdd(day, -90, t1.FieldValue) <= GETDATE()
But I get the error
"Conversion failed when converting date and/or time from character string."
which to me suggests that the DateAdd is being applied to all values of tblValue, not only to those which are yielded by the subquery which returns t1. There is probably a technical reason for it, but it doesn't seem right to me. For some reason
WHERE FieldID = 4) t1
is not equivalent to
WHERE FieldID = (SELECT FieldID FROM tblField WHERE FieldType = 'Expiration')) t1
It just so happens that if I leave off the final WHERE clause of the erroring query I get the same set of dates as in the working query. So t1 should not be presenting any values which the DateAdd should have a problem with. But there it is. I'm puzzled as to why.
This happens because of the particular execution plan that the optimizer produces. Depending on how it chooses to combine the comparison and filtering operations of the various clauses, it can do either one or the other first.
In this case, it's trying to perform the date conversion and comparison before applying the FieldType filter.
It's a well-known issue but inherent to the behavior of the SQL optimizer -- this is a similar issue with a different datatype: https://connect.microsoft.com/SQLServer/feedback/details/333312/error-8114-converting-data-type-varchar-to-numeric
There are ways around this, but they are not always straightforward and usually require you to force specific order of execution.
The below works for me, although I understand that the CASE technique is not always 100% effective. From this fiddle:
SELECT t1.FieldValue FROM (SELECT FieldValue
FROM tblValue
WHERE FieldID = (SELECT FieldID FROM tblField WHERE FieldType = 'Expiration')) t1
WHERE CASE WHEN ISDATE(t1.FieldValue) = 1 THEN DateAdd(day, -90, t1.FieldValue) ELSE '1/1/2900' END <= GETDATE()
I guess you want this?
SELECT * FROM tblValue v
JOIN tblField f ON v.FieldID = f.FieldID
WHERE f.FieldType = 'Expiration' AND DateAdd(day, -90, v.FieldValue) <= GETDATE()
To categorize this as wrongly applied is not fair. You don't get to control which rows T-SQL will evaluate. With a hard-coded 4, the optimizer applied that filter first; without it, the optimizer had to be ready for anything and moved the filter later. The query optimizer even considers a derived table fair game to optimize. If you just look at the query plan, you can see the order.
Try
SELECT *
FROM tblValue v
JOIN tblField f
ON v.FieldID = f.FieldID
AND f.FieldType = 'Expiration'
AND DateAdd(day, -90, v.FieldValue) <= GETDATE()
I have the following SQL query, executed in SQL Server Management Studio, and it takes more than 5 seconds to run. The tables joined by the inner joins have just a few tens of thousands of records. Why does it take so long?
The highest costs in the query plan are:
- Clustered Index Scan [MyDB].[dbo].[LinPresup].[PK_LinPresup_Linea_IdPresupuesto_IdPedido]: 78%
- Clustered Index Seek [MyDB].[dbo].[Pedidos].[PK_Pedidos_IdPedido]: 19%
Thank you.
Declare @FILTROPAG bigint
set @FILTROPAG = 1
Declare @FECHATRABAJO DATETIME
set @FECHATRABAJO = getDate()
Select * from(
SELECT distinct Linpresup.IdCliente, Linpresup.IdPedido, Linpresup.FSE, Linpresup.IdArticulo,
Linpresup.Des, ((Linpresup.can*linpresup.mca)-(linpresup.srv*linpresup.mca)) as Pendiente,
Linpresup.IdAlmacen, linpresup.IdPista, articulos.Tip, linpresup.Linea,
ROW_NUMBER() OVER(ORDER BY CONVERT(Char(19), Linpresup.FSE, 120) +
Linpresup.IdPedido + CONVERT(char(2), linpresup.Linea) DESC) as NUM_REG
FROM Linpresup INNER JOIN Pedidos on LinPresup.IdPedido = Pedidos.IdPedido
INNER JOIN Articulos ON Linpresup.IdArticulo = Articulos.IdArticulo
where pedidos.Cerrado = 'false' and linpresup.IdPedido <> '' and linpresup.can <> linpresup.srv
and Linpresup.FecAnulacion is null and Linpresup.Fse <= @FECHATRABAJO
and LinPresup.IdCliente not in (Select IdCliente from Clientes where Ctd = '4')
and Substring(LinPresup.IdPedido, 5, 2) LIKE '11' or Substring(LinPresup.IdPedido, 5, 2) LIKE '10'
) as TablaTemp
WHERE NUM_REG BETWEEN @FILTROPAG AND 1500
order by NUM_REG ASC
----------
This is the new query with the changes applied:
CHECKPOINT;
go
dbcc freeproccache
go
dbcc dropcleanbuffers
go
Declare @FILTROPAG bigint
set @FILTROPAG = 1
Declare @FECHATRABAJO DATETIME
set @FECHATRABAJO = getDate()
SELECT Linpresup.IdCliente, Linpresup.IdPedido, Linpresup.FSE, Linpresup.IdArticulo,
Linpresup.Des, Linpresup.can, linpresup.mca, linpresup.srv,
Linpresup.IdAlmacen, linpresup.IdPista, linpresup.Linea
into #TEMPREP
FROM Linpresup
where Linpresup.FecAnulacion is null and linpresup.IdPedido <> ''
and (linpresup.can <> linpresup.srv) and Linpresup.Fse <= @FECHATRABAJO
Select *, ((can*mca)-(srv*mca)) as Pendiente
From(
Select tablaTemp.*, ROW_NUMBER() OVER(ORDER BY FSECONVERT + IDPedido + LINCONVERT DESC) as NUM_REG, Articulos.Tip
From(
Select #TEMPREP.*,
Substring(#TEMPREP.IdPedido, 5, 2) as NewCol,
CONVERT(Char(19), #TEMPREP.FSE, 120) as FSECONVERT, CONVERT(char(2), #TEMPREP.Linea) as LINCONVERT
from #TEMPREP INNER JOIN Pedidos on #TEMPREP.IdPedido = Pedidos.IdPedido
where Pedidos.Cerrado = 'false'
and #TEMPREP.IdCliente not in (Select IdCliente from Clientes where Ctd = '4')) as tablaTemp
inner join Articulos on tablaTemp.IDArticulo = Articulos.IdArticulo
where (NewCol = '10' or NewCol = '11')) as TablaTemp2
where NUM_REG BETWEEN @FILTROPAG AND 1500
order by NUM_REG ASC
DROP TABLE #TEMPREP
The total execution time has decreased from 5336 to 3978, and the wait time for a server response has dropped from 5309 to 2730. It's something.
This part of your query is not SARGable, so an index scan will be performed instead of a seek:
and Substring(LinPresup.IdPedido, 5, 2) LIKE '11'
or Substring(LinPresup.IdPedido, 5, 2) LIKE '10'
Functions wrapped around column names will, in general, lead to an index scan.
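As a generic illustration (a sketch only, assuming a datetime column named Created with an index on it), the same filter written both ways:

-- Not sargable: the function hides the column, forcing a scan
where CONVERT(varchar(8), Created, 112) = '20130101'

-- Sargable: a range on the bare column allows an index seek
where Created >= '20130101' and Created < '20130102'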
Without seeing your execution plan it's hard to say. That said the following jumps out at me as a potential danger point:
and Substring(LinPresup.IdPedido, 5, 2) LIKE '11'
or Substring(LinPresup.IdPedido, 5, 2) LIKE '10'
I suspect that using the substring function here will prevent any potentially useful indexes from being used. Also, why are you using LIKE here? I'm guessing it probably gets optimized out, but it seems like a standard = would work...
I can't imagine why you would think such a query would run quickly. You are:
- ordering the recordset twice (and once with an ORDER BY that uses concatenation and functions),
- using functions (which are not sargable) and ORs in your where clause, both of which are almost always slow,
- using NOT IN where NOT EXISTS would probably be faster,
- doing math calculations.
And you haven't mentioned your indexing (which may or may not be helpful) or what the execution plan shows as the spots that affect performance the most.
I would probably start by pulling the distinct data into a CTE or temp table (you can index temp tables) without the calculations, to ensure that when you do the calcs later they run against the smallest data set. Then I would convert the two Substring tests to a single LinPresup.IdPedido LIKE '____1[01]%' (characters 5 and 6 are '10' or '11'). I would convert the NOT IN to NOT EXISTS, and I would put the math in the outer query so that it is only done on the smallest data set. A sketch of that rewrite follows below.
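A sketch of that rewrite, assuming the table and column names from the question (the temp table and index names are illustrative):

SELECT l.IdCliente, l.IdPedido, l.FSE, l.IdArticulo, l.Des,
       l.can, l.mca, l.srv, l.IdAlmacen, l.IdPista, l.Linea
INTO #Pend
FROM Linpresup l
WHERE l.FecAnulacion IS NULL
  AND l.IdPedido <> ''
  AND l.can <> l.srv
  AND l.Fse <= @FECHATRABAJO
  AND l.IdPedido LIKE '____1[01]%'  -- replaces the two Substring tests

CREATE INDEX IX_Pend_IdPedido ON #Pend (IdPedido)

SELECT *, ((can * mca) - (srv * mca)) AS Pendiente  -- math only on the final rows
FROM (
    SELECT t.*, a.Tip,
           ROW_NUMBER() OVER (ORDER BY CONVERT(char(19), t.FSE, 120) +
               t.IdPedido + CONVERT(char(2), t.Linea) DESC) AS NUM_REG
    FROM #Pend t
    INNER JOIN Pedidos p ON p.IdPedido = t.IdPedido AND p.Cerrado = 'false'
    INNER JOIN Articulos a ON a.IdArticulo = t.IdArticulo
    WHERE NOT EXISTS (SELECT 1 FROM Clientes c  -- NOT EXISTS instead of NOT IN
                      WHERE c.IdCliente = t.IdCliente AND c.Ctd = '4')
) AS x
WHERE NUM_REG BETWEEN @FILTROPAG AND 1500
ORDER BY NUM_REG

DROP TABLE #Pend

(The original DISTINCT is dropped here: once ROW_NUMBER makes every row unique, DISTINCT over the same column list is a no-op.)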