Excessive SQL runtime. How can I improve it? - sql

I have the following SQL query, which I run in SQL Server Management Studio, and it takes more than 5 seconds to execute. The tables joined by the inner joins have just a few tens of thousands of records each. Why does it take so long?
The highest costs in the query plan are:
- Clustered Index Scan on [MyDB].[dbo].[LinPresup].[PK_LinPresup_Linea_IdPresupuesto_IdPedido]: 78%
- Clustered Index Seek on [MyDB].[dbo].[Pedidos].[PK_Pedidos_IdPedido]: 19%
Thank you.
Declare @FILTROPAG bigint
set @FILTROPAG = 1
Declare @FECHATRABAJO DATETIME
set @FECHATRABAJO = getDate()
Select * from(
SELECT distinct Linpresup.IdCliente, Linpresup.IdPedido, Linpresup.FSE, Linpresup.IdArticulo,
Linpresup.Des, ((Linpresup.can*linpresup.mca)-(linpresup.srv*linpresup.mca)) as Pendiente,
Linpresup.IdAlmacen, linpresup.IdPista, articulos.Tip, linpresup.Linea,
ROW_NUMBER() OVER(ORDER BY CONVERT(Char(19), Linpresup.FSE, 120) +
Linpresup.IdPedido + CONVERT(char(2), linpresup.Linea) DESC) as NUM_REG
FROM Linpresup INNER JOIN Pedidos on LinPresup.IdPedido = Pedidos.IdPedido
INNER JOIN Articulos ON Linpresup.IdArticulo = Articulos.IdArticulo
where pedidos.Cerrado = 'false' and linpresup.IdPedido <> '' and linpresup.can <> linpresup.srv
and Linpresup.FecAnulacion is null and Linpresup.Fse <= @FECHATRABAJO
and LinPresup.IdCliente not in (Select IdCliente from Clientes where Ctd = '4')
and Substring(LinPresup.IdPedido, 5, 2) LIKE '11' or Substring(LinPresup.IdPedido, 5, 2) LIKE '10'
) as TablaTemp
WHERE NUM_REG BETWEEN @FILTROPAG AND 1500
order by NUM_REG ASC
----------
This is the new query with the changes applied:
CHECKPOINT;
go
dbcc freeproccache
go
dbcc dropcleanbuffers
go
Declare @FILTROPAG bigint
set @FILTROPAG = 1
Declare @FECHATRABAJO DATETIME
set @FECHATRABAJO = getDate()
SELECT Linpresup.IdCliente, Linpresup.IdPedido, Linpresup.FSE, Linpresup.IdArticulo,
Linpresup.Des, Linpresup.can, linpresup.mca, linpresup.srv,
Linpresup.IdAlmacen, linpresup.IdPista, linpresup.Linea
into #TEMPREP
FROM Linpresup
where Linpresup.FecAnulacion is null and linpresup.IdPedido <> ''
and (linpresup.can <> linpresup.srv) and Linpresup.Fse <= @FECHATRABAJO
Select *, ((can*mca)-(srv*mca)) as Pendiente
From(
Select tablaTemp.*, ROW_NUMBER() OVER(ORDER BY FSECONVERT + IDPedido + LINCONVERT DESC) as NUM_REG, Articulos.Tip
From(
Select #TEMPREP.*,
Substring(#TEMPREP.IdPedido, 5, 2) as NewCol,
CONVERT(Char(19), #TEMPREP.FSE, 120) as FSECONVERT, CONVERT(char(2), #TEMPREP.Linea) as LINCONVERT
from #TEMPREP INNER JOIN Pedidos on #TEMPREP.IdPedido = Pedidos.IdPedido
where Pedidos.Cerrado = 'false'
and #TEMPREP.IdCliente not in (Select IdCliente from Clientes where Ctd = '4')) as tablaTemp
inner join Articulos on tablaTemp.IDArticulo = Articulos.IdArticulo
where (NewCol = '10' or NewCol = '11')) as TablaTemp2
where NUM_REG BETWEEN @FILTROPAG AND 1500
order by NUM_REG ASC
DROP TABLE #TEMPREP
The total execution time has decreased from 5336 ms to 3978 ms, and the time waiting for a server response has gone from 5309 ms to 2730 ms. It's something.

This part of your query is not SARGable, so an index scan will be performed instead of a seek:
and Substring(LinPresup.IdPedido, 5, 2) LIKE '11'
or Substring(LinPresup.IdPedido, 5, 2) LIKE '10'
In general, wrapping column names in functions will lead to an index scan.
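For illustration, the two SUBSTRING predicates can at least be collapsed into a single LIKE over the whole column (a sketch against the question's table; the leading `_` wildcards still rule out an index seek on IdPedido, but the per-row function calls go away):

```sql
-- Equivalent rewrite of the two SUBSTRING predicates (assuming IdPedido is a
-- string column): characters 5-6 must be '10' or '11', i.e. four arbitrary
-- characters, then '1', then '0' or '1'.
WHERE LinPresup.IdPedido LIKE '____1[01]%'
```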

Without seeing your execution plan it's hard to say. That said the following jumps out at me as a potential danger point:
and Substring(LinPresup.IdPedido, 5, 2) LIKE '11'
or Substring(LinPresup.IdPedido, 5, 2) LIKE '10'
I suspect that using the SUBSTRING function here will prevent any potentially useful indexes from being used. Also, why are you using LIKE here? I'm guessing it probably gets optimized out, but it seems like a standard = would work...

I can't imagine why you would think such a query would run quickly. You are:
ordering the recordset twice (once with an ORDER BY that uses
concatenation and functions),
using functions (which are not sargable) and ORs in your WHERE clause,
which are almost always slow,
using NOT IN where NOT EXISTS would probably be faster,
doing math calculations.
And you haven't mentioned your indexing (which may or may not be helpful) or what the execution plan shows as the spots that are affecting performance the most.
I would probably start by pulling the distinct data into a CTE or temp table (you can index temp tables) without the calculations (to ensure that when you do the calcs later it is against the smallest data set). Then I would convert the substrings to LinPresup.IdPedido LIKE '____1[0-1]%' (the SUBSTRING starts at character 5, so four leading _ wildcards are needed). I would convert the NOT IN to NOT EXISTS. I would put the math in the outer query so that it is only done on the smallest data set.
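The NOT IN to NOT EXISTS rewrite mentioned above could look like this against the question's tables (a sketch, not tested against the real schema):

```sql
-- Instead of:
--   LinPresup.IdCliente NOT IN (SELECT IdCliente FROM Clientes WHERE Ctd = '4')
-- use a correlated NOT EXISTS, which also behaves sanely if the subquery
-- can return NULLs:
WHERE NOT EXISTS (
    SELECT 1
    FROM Clientes c
    WHERE c.IdCliente = LinPresup.IdCliente
      AND c.Ctd = '4'
)
```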

Related

Main T-SQL WHERE function seems to be wrongly applied to a subquery

This query returns a set of dates from tblValue whose FieldValue is of type nvarchar(4000):
SELECT t1.FieldValue FROM (SELECT FieldValue
FROM tblValue
WHERE FieldID = 4) t1
WHERE DateAdd(day, -90, t1.FieldValue) <= GETDATE()
This works, but instead of hard-coding the FieldID of 4, I'd like to get all FieldValues for those which have the type "Expiration".
This query returns 4.
SELECT FieldID FROM tblField WHERE FieldType = 'Expiration'
So, I expect this query's innermost subquery to return 4, and then to have the DateAdd applied only to those Expiration values which are yielded from t1 in the outermost subquery, which is what happens in the working first example.
SELECT t1.FieldValue FROM (SELECT FieldValue
FROM tblValue
WHERE FieldID = (SELECT FieldID FROM tblField WHERE FieldType = 'Expiration')) t1
WHERE DateAdd(day, -90, t1.FieldValue) <= GETDATE()
But I get the error
"Conversion failed when converting date and/or time from character string."
which to me suggests that the DateAdd is being applied to all values of tblValue, not only to those which are yielded by the subquery which returns t1. There is probably a technical reason for it, but it doesn't seem right to me. For some reason
WHERE FieldID = 4) t1
is not equivalent to
WHERE FieldID = (SELECT FieldID FROM tblField WHERE FieldType = 'Expiration')) t1
It just so happens that if I leave off the final WHERE clause of the erroring query I get the same set of dates as in the working query. So t1 should not be presenting any values which the DateAdd should have a problem with. But there it is. I'm puzzled as to why.
This happens because of the particular execution plan that the optimizer produces. Depending on how it chooses to combine the comparison and filtering operations of the various clauses, it can do either one or the other first.
In this case, it's trying to perform the date conversion and comparison before applying the FieldType filter.
It's a well-known issue, but it's inherent to the behavior of the SQL optimizer -- this is a similar issue with a different datatype: https://connect.microsoft.com/SQLServer/feedback/details/333312/error-8114-converting-data-type-varchar-to-numeric
There are ways around this, but they are not always straightforward and usually require you to force specific order of execution.
The below works for me, although I understand that the CASE technique is not always 100% effective. From this fiddle:
SELECT t1.FieldValue FROM (SELECT FieldValue
FROM tblValue
WHERE FieldID = (SELECT FieldID FROM tblField WHERE FieldType = 'Expiration')) t1
WHERE CASE WHEN ISDATE(t1.FieldValue) = 1 THEN DateAdd(day, -90, t1.FieldValue) ELSE '1/1/2900' END <= GETDATE()
I guess you want this?
SELECT * FROM tblValue v
JOIN tblField f ON v.FieldID = f.FieldID
WHERE f.FieldType = 'Expiration' AND DateAdd(day, -90, v.FieldValue) <= GETDATE()
To categorize this as "wrongly applied" is not fair.
You don't get to control which rows T-SQL will evaluate.
With a hard-coded 4, the optimizer did that filter first.
Without a hard-coded 4, the query optimizer had to be ready for anything and moved it to later.
The query optimizer even considers a derived table fair game to optimize.
If you just look at the query plan you can see the order.
Try
SELECT *
FROM tblValue v
JOIN tblField f
ON v.FieldID = f.FieldID
AND f.FieldType = 'Expiration'
AND DateAdd(day, -90, v.FieldValue) <= GETDATE()

SQL query taking too long - help for better performance

I am running a stored procedure in which i have the following query
SELECT TOP 1 @TopValue = t2.myDecimal
from table1 t1, table2 t2
where t1.ID = t2.table1ID
and CONVERT( VARCHAR(8), t1.Created, 112 ) = @stringYYYYMMDD
and t1.Term = @Label
@TopValue is DECIMAL(8,6)
@stringYYYYMMDD is a string representing a date in the format YYYYMMDD
@Label is a simple VARCHAR(8)
This query is called for every row of my data set, which can contain from 10 to 5000 rows. If I comment out this query, the procedure execution time is under 2 seconds. With the query included, it just never ends.
I am using Microsoft SQL Server Management Studio 2008 R2.
Thank you for your help
First, you should use explicit JOIN syntax. Second, it is suspicious whenever you have a TOP without an ORDER BY. So, your query as I see it is:
select TOP 1 @TopValue = t2.myDecimal
from table1 t1 join
table2 t2
on t1.ID = t2.table1ID
where CONVERT( VARCHAR(8), t1.Created, 112 ) = @stringYYYYMMDD and t1.Term = @Label
You can speed this up with some indexes. But, before doing that, you want to change the date comparison:
where t1.Created >= convert(datetime, @stringYYYYMMDD, 112) and
t1.Created < convert(datetime, @stringYYYYMMDD, 112) + 1 and
t1.Term = @Label
Moving the function from the column to the constant makes the comparison "sargable", meaning that indexes can be used for it.
Next, create the indexes table1(Term, Created, Id) and table2(table1Id). These indexes should boost performance.
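The suggested indexes could be created like this (index names are illustrative, not from the original):

```sql
-- Covering index for the WHERE clause (Term equality, Created range) plus
-- the join key as a trailing column:
CREATE INDEX IX_table1_Term_Created_ID ON table1 (Term, Created, ID);

-- Index on the foreign key used in the join:
CREATE INDEX IX_table2_table1ID ON table2 (table1ID);
```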

Too Long Query Duration in SQL Server 2008 R2 DataCenter

We have a query and this is an actual execution plan
As you can see - Clustered Index Seek takes 99%.
Also it seeks on primary keys (type int).
Table Source has 275,000 rows.
Table AuthorSource has 2,275,000 rows.
No partitioning or compression is used.
The problem is that the first execution takes 25-40 seconds, but the second and subsequent runs take only 1-2 seconds.
Also we have replication, queue reader, log reader agents running on this server.
Amount of RAM: 4GB
Sql Server uses: 3.7GB
We think that SQL Server caches the query after the first execution for some period of time, and this is the reason the second run takes only 1-2 seconds.
But irrespective of caching and other reasons, it is very strange that a primary-key index seek query takes 20-40 seconds.
This issue is repeatable. Whatever parameters we provide to the query, we get the same results: a very slow first run, and fast second and subsequent runs.
Maybe there are some additional settings or a Resource Governor feature we have to use?
exec sp_executesql N'
SELECT [Project1].[C1] AS [C1]
FROM ( SELECT CAST(1 AS bit) AS X
) AS [SingleRowTable1]
LEFT OUTER JOIN
(SELECT [GroupBy1].[A1] AS [C1]
FROM ( SELECT COUNT(CAST(1 AS bit)) AS [A1]
FROM (SELECT [Extent1].[Mention_ID] AS [Mention_ID] ,
[Extent1].[Theme_ID] AS [Theme_ID] ,
[Extent1].[Mention_Weight] AS [Mention_Weight] ,
[Extent1].[AuthorSource_ID] AS [AuthorSource_ID1] ,
[Extent1].[Mention_CreationDate] AS [Mention_CreationDate] ,
[Extent1].[Mention_DeletedMark] AS [Mention_DeletedMark] ,
[Extent1].[Mention_AuthorTags] AS [Mention_AuthorTags] ,
[Extent1].[Mention_Tonality] AS [Mention_Tonality] ,
[Extent1].[Mention_Comment] AS [Mention_Comment] ,
[Extent1].[Mention_AdditionDate] AS [Mention_AdditionDate] ,
[Extent1].[UserToAnswer_ID] AS [UserToAnswer_ID] ,
[Extent1].[GeoName_ID] AS [GeoName_ID] ,
[Extent1].[Geo_ID] AS [Geo_ID] ,
[Extent1].[Mention_PermaLinkHash] AS [Mention_PermaLinkHash] ,
[Extent1].[Mention_IsFiltredByAuthor] AS [Mention_IsFiltredByAuthor] ,
[Extent1].[Mention_IsFiltredByGeo] AS [Mention_IsFiltredByGeo] ,
[Extent1].[Mention_IsFiltredBySource] AS [Mention_IsFiltredBySource] ,
[Extent1].[Mention_IsFiltredBySourceType] AS [Mention_IsFiltredBySourceType] ,
[Extent1].[GengineLog_InstanceId] AS [GengineLog_InstanceId] ,
[Extent1].[Mention_PermaLinkBinaryHash] AS [Mention_PermaLinkBinaryHash] ,
[Extent1].[Mention_APIType] AS [Mention_APIType] ,
[Extent1].[Mention_IsFilteredByAuthorSource] AS [Mention_IsFilteredByAuthorSource],
[Extent1].[Mention_IsFavorite] AS [Mention_IsFavorite] ,
[Extent1].[Mention_SpamType] AS [Mention_SpamType] ,
[Extent1].[MentionContent_ID] AS [MentionContent_ID] ,
[Extent2].[AuthorSource_ID] AS [AuthorSource_ID2] ,
[Extent2].[Author_ID] AS [Author_ID] ,
[Extent2].[Source_ID] AS [Source_ID] ,
[Extent2].[Author_Nick] AS [Author_Nick] ,
[Extent2].[Author_UrlBinaryHash] AS [Author_UrlBinaryHash] ,
[Extent2].[AuthorSource_Type] AS [AuthorSource_Type] ,
[Extent2].[Author_Url] AS [Author_Url] ,
[Extent2].[AuthorSource_Description] AS [AuthorSource_Description] ,
[Extent2].[AuthorSource_Gender] AS [AuthorSource_Gender]
FROM [dbo].[Mention] AS [Extent1]
LEFT OUTER JOIN [dbo].[AuthorSource] AS [Extent2]
ON [Extent1].[AuthorSource_ID] = [Extent2].[AuthorSource_ID]
WHERE (
[Extent1].[Mention_DeletedMark] <> CAST(1 AS bit)
)
AND
(
[Extent1].[Mention_IsFiltredByAuthor] <> CAST(1 AS bit)
)
AND
(
[Extent1].[Mention_IsFilteredByAuthorSource] <> CAST(1 AS bit)
)
AND
(
[Extent1].[Mention_IsFiltredByGeo] <> CAST(1 AS bit)
)
AND
(
[Extent1].[Mention_IsFiltredBySource] <> CAST(1 AS bit)
)
AND
(
[Extent1].[Mention_IsFiltredBySourceType] <> CAST(1 AS bit)
)
) AS [Filter1]
LEFT OUTER JOIN [dbo].[Source] AS [Extent3]
ON [Filter1].[Source_ID] = [Extent3].[Source_ID]
WHERE (
[Filter1].[Theme_ID] = @p__linq__49557
)
AND
(
[Extent3].[Source_Type] <> @p__linq__49558
)
) AS [GroupBy1]
) AS [Project1]
ON 1 = 1
',N'@p__linq__49557 int,@p__linq__49558 int',@p__linq__49557=7966,@p__linq__49558=8
IndexSeeking Performance Information
Also we wrote query manually in sql with this simple code:
Select COUNT(1) from Mention m inner join AuthorSource auth on m.AuthorSource_ID = auth.AuthorSource_ID inner join
Source s on auth.Source_ID = s.Source_ID where
m.Mention_DeletedMark = 0 AND m.Mention_IsFilteredByAuthorSource = 0 AND m.Mention_IsFiltredByAuthor = 0
AND m.Mention_IsFiltredByGeo = 0 AND m.Mention_IsFiltredBySource = 0 AND m.Mention_IsFiltredBySourceType = 0
AND m.Theme_ID = 7966
and s.Source_Type <> 8
and execution plan is the same that we posted.
The query is quite hairy, but it looks like you are missing an index on Mention.Theme_ID?
SQL Server is having problems because a lot of <> comparisons are used, meaning that it cannot use an index and must fetch everything and then filter it out.
Following Martin's advice in the comments on the question, the answer lies in understanding how SQL Server builds execution plans and counting the disk read operations needed for the first run of the query.
In our particular situation, forcing a hash join instead of a plain inner join gave us the result we expected, and a different execution plan from the one SQL Server chooses by default.
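A sketch of what the forced hash join could look like, based on the manual query posted above (the HASH keyword between INNER and JOIN is a T-SQL join hint; note that using a join hint also fixes the join order):

```sql
SELECT COUNT(1)
FROM Mention m
INNER HASH JOIN AuthorSource auth          -- force a hash physical join
    ON m.AuthorSource_ID = auth.AuthorSource_ID
INNER JOIN Source s
    ON auth.Source_ID = s.Source_ID
WHERE m.Mention_DeletedMark = 0
  AND m.Mention_IsFilteredByAuthorSource = 0
  AND m.Mention_IsFiltredByAuthor = 0
  AND m.Mention_IsFiltredByGeo = 0
  AND m.Mention_IsFiltredBySource = 0
  AND m.Mention_IsFiltredBySourceType = 0
  AND m.Theme_ID = 7966
  AND s.Source_Type <> 8;
```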

Problem with sql select query

I'm having a little problem with the [PortfelID] column. I need its ID to be able to use it in a function which returns the Type of Strategy per client. However, to do this I need to put [PortfelID] in the GROUP BY, which complicates the results a lot.
I'm looking for a way to find the Type of Strategy and the sum of money each strategy has. However, if I use GROUP BY [PortfelID], I get multiple entries per strategy, actually over 700 rows (because there are 700 [PortfelID] values). All I want is one row per strategy with the sum of [WycenaWartosc] for that strategy, so in total I would get 15 rows or so.
Is there a way to use that function without having to add [PortfelID] to the GROUP BY?
DECLARE @data DateTime
SET @data = '20100930'
SELECT [dbo].[ufn_TypStrategiiDlaPortfelaDlaDaty] ([PortfelID], @data)
,SUM([WycenaWartosc]) AS 'Wycena'
FROM [dbo].[Wycena]
LEFT JOIN [KlienciPortfeleKonta]
ON [Wycena].[KlienciPortfeleKontaID] = [KlienciPortfeleKonta].[KlienciPortfeleKontaID]
WHERE [WycenaData] = @data
GROUP BY [PortfelID]
Where [dbo].[ufn_TypStrategiiDlaPortfelaDlaDaty] is defined like this:
ALTER FUNCTION [dbo].[ufn_TypStrategiiDlaPortfelaDlaDaty]
(
@portfelID INT,
@data DATETIME
)
RETURNS NVARCHAR(MAX)
AS BEGIN
RETURN ( SELECT TOP 1
[TypyStrategiiNazwa]
FROM [dbo].[KlienciPortfeleUmowy]
INNER JOIN [dbo].[TypyStrategii]
ON dbo.KlienciPortfeleUmowy.TypyStrategiiID = dbo.TypyStrategii.TypyStrategiiID
WHERE [PortfelID] = @portfelID
AND ( [KlienciUmowyDataPoczatkowa] <= @data
AND ([KlienciUmowyDataKoncowa] >= @data
OR KlienciUmowyDataKoncowa IS NULL)
)
ORDER BY [KlienciUmowyID] ASC
)
end
EDIT:
As per suggestion (Roopesh Majeti) I've made something like this:
SELECT SUM(CASE WHEN [dbo].[ufn_TypStrategiiDlaPortfelaDlaDaty] ([PortfelID], @data) = 'portfel energetyka' THEN [WycenaWartosc] ELSE 0 END) AS 'Strategy 1'
,SUM(CASE WHEN [dbo].[ufn_TypStrategiiDlaPortfelaDlaDaty] ([PortfelID], @data) = 'banków niepublicznych' THEN [WycenaWartosc] ELSE 0 END) AS 'Strategy 2'
FROM [dbo].[Wycena]
LEFT JOIN [KlienciPortfeleKonta]
ON [Wycena].[KlienciPortfeleKontaID] = [KlienciPortfeleKonta].[KlienciPortfeleKontaID]
WHERE [WycenaData] = @data
But this seems like a bit of overkill, and too much manual work is required. AlexS's solution seems to do exactly what I need :-)
Here's an idea of how you can do this.
DECLARE @data DateTime
SET @data = '20100930'
SELECT
TypID,
SUM([WycenaWartosc]) AS 'Wycena'
FROM
(
SELECT [dbo].[ufn_TypStrategiiDlaPortfelaDlaDaty] ([PortfelID], @data) as TypID
,[WycenaWartosc]
FROM [dbo].[Wycena]
LEFT JOIN [KlienciPortfeleKonta]
ON [Wycena].[KlienciPortfeleKontaID] = [KlienciPortfeleKonta].[KlienciPortfeleKontaID]
WHERE [WycenaData] = @data
) as Q
GROUP BY [TypID]
So basically there's no need to group by PortfelID (since what you really need to group by is the output of [dbo].[ufn_TypStrategiiDlaPortfelaDlaDaty]).
This query is not optimal, though. The join can be pushed to the outer query if PortfelID and WycenaData are not in the [KlienciPortfeleKonta] table.
UPDATE: fixed select list and aggregation function application
How about using the CASE statement in SQL?
Check the link below for an example:
http://www.1keydata.com/sql/sql-case.html
Hope this helps.

SQL Server 2005 Setting a variable to the result of a select query

How do I set a variable to the result of select query without using a stored procedure?
I want to do something like:
OOdate DATETIME
SET OOdate = Select OO.Date
FROM OLAP.OutageHours as OO
WHERE OO.OutageID = 1
Then I want to use OOdate in this query:
SELECT COUNT(FF.HALID) from Outages.FaultsInOutages as OFIO
INNER join Faults.Faults as FF ON FF.HALID = OFIO.HALID
WHERE CONVERT(VARCHAR(10),OO.Date,126) = CONVERT(VARCHAR(10),FF.FaultDate,126))
AND
OFIO.OutageID = 1
You can use something like
SET @cnt = (SELECT COUNT(*) FROM User)
or
SELECT @cnt = COUNT(*) FROM User
For the SET form to work, the SELECT must return a single column and a single row, and the SELECT statement must be in parentheses.
Edit: Have you tried something like this?
DECLARE @OOdate DATETIME
SET @OOdate = (Select OO.Date from OLAP.OutageHours as OO where OO.OutageID = 1)
Select COUNT(FF.HALID)
from Outages.FaultsInOutages as OFIO
inner join Faults.Faults as FF
ON FF.HALID = OFIO.HALID
WHERE @OOdate = FF.FaultDate
AND OFIO.OutageID = 1
-- Sql Server 2005 Management studio
use Master
go
DECLARE @MyVar bigint
SET @myvar = (SELECT count(*) FROM spt_values);
SELECT @myvar
Result: 2346 (in my db)
-- Note: variable names are case-insensitive, so @myvar and @MyVar are the same variable
You could use:
declare @foo as nvarchar(25)
select @foo = 'bar'
select @foo
You could also just put the first SELECT in a subquery. Since most optimizers will fold it into a constant anyway, there should not be a performance hit on this.
Incidentally, since you are using a predicate like this:
CONVERT(...) = CONVERT(...)
that predicate expression cannot be optimized properly, nor can it use indexes on the columns referenced by the CONVERT() function.
Here is one way to make the original query somewhat better:
DECLARE @ooDate datetime
SELECT @ooDate = OO.Date FROM OLAP.OutageHours AS OO where OO.OutageID = 1
SELECT
COUNT(FF.HALID)
FROM
Outages.FaultsInOutages AS OFIO
INNER JOIN Faults.Faults as FF ON
FF.HALID = OFIO.HALID
WHERE
FF.FaultDate >= @ooDate AND
FF.FaultDate < DATEADD(day, 1, @ooDate) AND
OFIO.OutageID = 1
This version can leverage an index that involves FaultDate, and achieves the same goal.
Here it is, rewritten to use a subquery to avoid the variable declaration and subsequent SELECT.
SELECT
COUNT(FF.HALID)
FROM
Outages.FaultsInOutages AS OFIO
INNER JOIN Faults.Faults as FF ON
FF.HALID = OFIO.HALID
WHERE
CONVERT(varchar(10), FF.FaultDate, 126) = (SELECT CONVERT(varchar(10), OO.Date, 126) FROM OLAP.OutageHours AS OO where OO.OutageID = 1) AND
OFIO.OutageID = 1
Note that this approach has the same index usage issue as the original, because of the use of CONVERT() on FF.FaultDate. This could be remedied by adding the subquery twice, but you would be better served with the variable approach in this case. This last version is only for demonstration.
Regards.
This will work for original question asked:
DECLARE @Result INT;
SELECT @Result = COUNT(*)
FROM TableName
WHERE Condition
What do you mean exactly? Do you want to reuse the result of your query in another query?
In that case, why don't you combine both queries by making the second query search inside the results of the first one (SELECT xxx IN (SELECT yyy ...))?
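For example, the two queries from the question could be combined like this (a sketch using the question's tables; note it compares the full datetime values, unlike the CONVERT-based date-only comparison in the question):

```sql
-- Fold the first SELECT into the second as a subquery; wrap both sides in
-- CONVERT(varchar(10), ..., 126) if only the date part should match.
SELECT COUNT(FF.HALID)
FROM Outages.FaultsInOutages AS OFIO
INNER JOIN Faults.Faults AS FF
    ON FF.HALID = OFIO.HALID
WHERE FF.FaultDate IN (SELECT OO.[Date]
                       FROM OLAP.OutageHours AS OO
                       WHERE OO.OutageID = 1)
  AND OFIO.OutageID = 1;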