The following query will be ran in about 22 seconds:
DECLARE #i INT, #x INT
SET #i = 156567
SELECT
TOP 1
#x = AncestorId
FROM
dbo.tvw_AllProjectStructureParents_ChildView a
WHERE
ProjectStructureId = #i AND
a.NodeTypeCode = 42 AND
a.AncestorTypeDiffLevel = 1
OPTION (RECOMPILE)
The problem is with variable assignment (indeed this line: #x = AncestorId). when removing the assignment, it speeds up to about 15 miliseconds!
I solved it with inserting the result to a temp table but I think it is a bad way.
Can anyone help me what the source of problem is?!
P.S.
bad Execution plan (22s) : https://www.brentozar.com/pastetheplan/?id=Sy6a4c9bW
good execution plan (20ms) :https://www.brentozar.com/pastetheplan/?id=Byg8Hc5ZZ
When you use OPTION (RECOMPILE) SQL Server can generally perform parameter embedding optimisation.
The plan it is compiling is single use so it can sniff the values of all variables and parameters and treat them as constants.
A trivial example showing the parameter embedding optimisation in action and the effect of assigning to a variable is below (actual execution plans not estimated).
DECLARE #A INT = 1,
#B INT = 2,
#C INT;
SELECT TOP (1) number FROM master..spt_values WHERE #A > #B;
SELECT TOP (1) number FROM master..spt_values WHERE #A > #B OPTION (RECOMPILE);
SELECT TOP (1) #C = number FROM master..spt_values WHERE #A > #B OPTION (RECOMPILE);
The plans for this are below
Note the middle one does not even touch the table at all as SQL Server can deduce at compile time that #A > #B is not true. But plan 3 is back to including the table in the plan as the variable assignment evidently prevents the effect of OPTION (RECOMPILE) shown in plan 2.
(As an aside the third plan is not really 4-5 times as expensive as the first. Assigning to a variable also seems to suppress the usual row goal logic where the costs of the index scan would be scaled down to reflect the TOP 1)
In your good plan the #i value of 156567 is pushed right into the seek in the anchor leg of the recursive CTE, it returned 0 rows and so the recursive part had to do no work.
In your bad plan the recursive CTE gets fully materialised with 627,393 executions of the recursive sub tree and finally the predicate is applied on the resulting 627,393 rows (discarding all of them) at the end
I'm not sure why SQL Server can't push the predicate with a variable down. You haven't supplied the definitions of your tables - or the view with the recursive CTE. There is a similar issue with predicate pushing, views, and window functions though.
One solution would be to change the view to an inline table valued function that accepts a parameter for mainid and then add that in to the WHERE clause in the anchor part of the definition. Rather than relying on SQL Server to push the predicate down for you.
The difference comes probably from SELECT TOP 1.
When you have only field, SQL Server will take only first row. When you have variable assignment SQL Server is fetching all results but use only the top one.
I checked on different queries and it is not always a case, but probably here SQL Server optimization fails because of complexity of views/tables.
You can try following workaround:
DECLARE #i INT, #x INT
SET #i = 156567
SET #x = (SELECT
TOP 1
AncestorId
FROM
dbo.tvw_AllProjectStructureParents_ChildView a
WHERE
ProjectStructureId = #i AND
a.NodeTypeCode = 42 AND
a.AncestorTypeDiffLevel = 1)
Related
Take the (simplified) stored procedure defined here:
create procedure get_some_stuffs
#max_records int = null
as
begin
set NOCOUNT on
select top (#max_records) *
from my_table
order by mothers_maiden_name
end
I want to restrict the number of records selected only if #max_records is provided.
Problems:
The real query is nasty and large; I want to avoid having it duplicated similar to this:
if(#max_records is null)
begin
select *
from {massive query}
end
else
begin
select top (#max_records)
from {massive query}
end
An arbitrary sentinel value doesn't feel right:
select top (ISNULL(#max_records, 2147483647)) *
from {massive query}
For example, if #max_records is null and {massive query} returns less than 2147483647 rows, would this be identical to:
select *
from {massive query}
or is there some kind of penalty for selecting top (2147483647) * from a table with only 50 rows?
Are there any other existing patterns that allow for an optionally count-restricted result set without duplicating queries or using sentinel values?
I was thinking about this, and although I like the explicitness of the IF statement in your Problem 1 statement, I understand the issue of duplication. As such, you could put the main query in a single CTE, and use some trickery to query from it (the bolded parts being the highlight of this solution):
CREATE PROC get_some_stuffs
(
#max_records int = NULL
)
AS
BEGIN
SET NOCOUNT ON;
WITH staged AS (
-- Only write the main query one time
SELECT * FROM {massive query}
)
-- This part below the main query never changes:
SELECT *
FROM (
-- A little switcheroo based on the value of #max_records
SELECT * FROM staged WHERE #max_records IS NULL
UNION ALL
SELECT TOP(ISNULL(#max_records, 0)) * FROM staged WHERE #max_records IS NOT NULL
) final
-- Can't use ORDER BY in combination with a UNION, so move it out here
ORDER BY mothers_maiden_name
END
I looked at the actual query plans for each and the optimizer is smart enough to completely avoid the part of the UNION ALL that doesn't need to run.
The ISNULL(#max_records, 0) is in there because TOP NULL isn't valid, and it will not compile.
You could use SET ROWCOUNT:
create procedure get_some_stuffs
#max_records int = null
as
begin
set NOCOUNT on
IF #max_records IS NOT NULL
BEGIN
SET ROWCOUNT #max_records
END
select top (#max_records) *
from my_table
order by mothers_maiden_name
SET ROWCOUNT 0
end
There are a few methods, but as you probably notice these all look ugly or are unnecessarily complicated. Furthermore, do you really need that ORDER BY?
You could use TOP (100) PERCENT and a View, but the PERCENT only works if you do not really need that expensive ORDER BY, since SQL Server will ignore your ORDER BY if you try it.
I suggest taking advantage of stored procedures, but first lets explain the difference in the type of procs:
Hard Coded Parameter Sniffing
--Note the lack of a real parametrized column. See notes below.
IF OBJECT_ID('[dbo].[USP_TopQuery]', 'U') IS NULL
EXECUTE('CREATE PROC dbo.USP_TopQuery AS ')
GO
ALTER PROC [dbo].[USP_TopQuery] #MaxRows NVARCHAR(50)
AS
BEGIN
DECLARE #SQL NVARCHAR(4000) = N'SELECT * FROM dbo.ThisFile'
, #Option NVARCHAR(50) = 'TOP (' + #MaxRows + ') *'
IF ISNUMERIC(#MaxRows) = 0
EXEC sp_executesql #SQL
ELSE
BEGIN
SET #SQL = REPLACE(#SQL, '*', #Option)
EXEC sp_executesql #SQL
END
END
Local Variable Parameter Sniffing
IF OBJECT_ID('[dbo].[USP_TopQuery2]', 'U') IS NULL
EXECUTE('CREATE PROC dbo.USP_TopQuery2 AS ')
GO
ALTER PROC [dbo].[USP_TopQuery2] #MaxRows INT NULL
AS
BEGIN
DECLARE #Rows INT;
SET #Rows = #MaxRows;
IF #MaxRows IS NULL
SELECT *
FROM dbo.THisFile
ELSE
SELECT TOP (#Rows) *
FROM dbo.THisFile
END
No Parameter Sniffing, old method
IF OBJECT_ID('[dbo].[USP_TopQuery3]', 'U') IS NULL
EXECUTE('CREATE PROC dbo.USP_TopQuery3 AS ')
GO
ALTER PROC [dbo].[USP_TopQuery3] #MaxRows INT NULL
AS
BEGIN
IF #MaxRows IS NULL
SELECT *
FROM dbo.THisFile
ELSE
SELECT TOP (#MaxRows) *
FROM dbo.THisFile
END
PLEASE NOTE ABOUT PARAMETER SNIFFING:
SQL Server initializes variables in Stored Procs at the time of compile, not when it parses.
This means that SQL Server will be unable to guess the query and will
choose the last valid execution plan for the query, regardless of
whether it is even good.
There are two methods, hard coding an local variables that allow the Optimizer to guess.
Hard Coding for Parameter Sniffing
Use sp_executesql to not only reuse the query, but prevent SQL Injection.
However, in this type of query, will not always perform substantially better since a TOP Operator is not a column or table (so the statement effectively has no variables in this version I used)
Statistics at the time of the creation of your compiled plan will dictate how affective the method is if you are not using a variable on a predicate (ON, WHERE, HAVING)
Can use options or hint to RECOMPILE to overcome this issue.
Variable Parameter Sniffing
Variable Paramter sniffing, on the other hand, is flexible enough to work witht the statistics here, and in my own testing it seemed the variable parameter had the advantage of the query using statistics (particularly after I updated the statistics).
Ultimately, the issue of performance is about which method will use the least amount of steps to traverse through the leaflets. Statistics, the rows in your table, and the rules for when SQL Server will decide to use a Scan vs Seek impact the performance.
Running different values will show performances change significantly, though typically better than USP_TopQuery3. So DO NOT ASSUME one method is necessarily better than the other.
Also note you can use a table-valued function to do the same, but as Dave Pinal would say:
If you are going to answer that ‘To avoid repeating code, you use
Function’ ‑ please think harder! Stored procedure can do the same...
if you are going to answer
with ‘Function can be used in SELECT, whereas Stored Procedure cannot
be used’ ‑ again think harder!
SQL SERVER – Question to You – When to use Function and When to use Stored Procedure
You could do it like this (using your example):
create procedure get_some_stuffs
#max_records int = null
as
begin
set NOCOUNT on
select top (ISNULL(#max_records,1000)) *
from my_table
order by mothers_maiden_name
end
I know you don't like this (according to your point 2), but that's pretty much how it's done (in my experience).
How about something like this (you're have to really look at execution plans and I didn't have time to set anything up)?
create procedure get_some_stuffs
#max_records int = null
as
begin
set NOCOUNT on
select *, ROW_NUMBER(OVER order by mothers_maiden_name) AS row_num
from {massive query}
WHERE #max_records IS NULL OR row_num < #max_records
end
Another thing you can do with {massive query} is make a view or inline table-valued function (it it's parametrized), which is generally a pretty good practice for anything big and repetitively used.
I've got a query in SQL (2008) that I can't understand why it's taking so much longer to evaluate if I include a clause in a WHERE statement that shouldn't affect the result. Here is an example of the query:
declare #includeAll bit = 0;
SELECT
Id
,Name
,Total
FROM
MyTable
WHERE
#includeAll = 1 OR Id = 3926
Obviously, in this case, the #includeAll = 1 will evaluate false; however, including that increases the time of the query as if it were always true. The result I get is correct with or without that clause: I only get the 1 entry with Id = 3926, but (in my real-world query) including that line increases the query time from < 0 seconds to about 7 minutes...so it seems it's running the query as if the statement were true, even though it's not, but still returning the correct results.
Any light that can be shed on why would be helpful. Also, if you have a suggestion on working around it I'd take it. I want to have a clause such as this one so that I can include a parameter in a stored procedure that will make it disregard the Id that it has and return all results if set to true, but I can't allow that to affect the performance when simply trying to get one record.
You'd need to look at the query plan to be sure, but using OR will often make it scan like this in some DBMS.
Also, read #Bogdan Sahlean's response for some great details as why this happening.
This may not work, but you can try something like if you need to stick with straight SQL:
SELECT
Id
,Name
,Total
FROM
MyTable
WHERE Id = 3926
UNION ALL
SELECT
Id
,Name
,Total
FROM
MyTable
WHERE Id <> 3926
AND #includeAll = 1
If you are using a stored procedure, you could conditionally run the SQL either way instead which is probably more efficient.
Something like:
if #includeAll = 0 then
SELECT
Id
,Name
,Total
FROM
MyTable
WHERE Id = 3926
else
SELECT
Id
,Name
,Total
FROM
MyTable
Obviously, in this case, the #includeAll = 1 will evaluate false;
however, including that increases the time of the query as if it were
always true.
This happens because those two predicates force SQL Server to choose an Index|Table Scan operator. Why ?
The execution plan is generated for all possible values of #includeAll variable / parameter. So, the same execution plan is used when #includeAll = 0 and when #includeAll = 1. If #includeAll = 0 is true and if there is an index on Id column then SQL Server could use an Index Seek or Index Seek + Key|RID Lookup to find the rows. But if #includeAll = 1 is true the optimal data access operator is Index|Table Scan. So if the execution plan must be usable for all values of #includeAll variable what is the data access operator used by SQL Server: Seek or Scan ? The answer is bellow where you can find a similar query:
DECLARE #includeAll BIT = 0;
-- Initial solution
SELECT p.ProductID, p.Name, p.Color
FROM Production.Product p
WHERE #includeAll = 1 OR p.ProductID = 345
-- My solution
DECLARE #SqlStatement NVARCHAR(MAX);
SET #SqlStatement = N'
SELECT p.ProductID, p.Name, p.Color
FROM Production.Product p
' + CASE WHEN #includeAll = 1 THEN '' ELSE 'WHERE p.ProductID = #ProductID' END;
EXEC sp_executesql #SqlStatement, N'#ProductID INT', #ProductID = 345;
These queries have the following execution plans:
As you can see, first execution plan includes an Clustered Index Scan with two not optimized predicates.
My solution is based on dynamic queries and it generates two different queries depending on the value of #includeAll variable thus:
[ 1 ] When #includeAll = 0 the generated query (#SqlStatement) is
SELECT p.ProductID, p.Name, p.Color
FROM Production.Product p
WHERE p.ProductID = #ProductID
and the execution plan includes an Index Seek (as you can see in image above) and
[ 2 ] When #includeAll = 1 the generated query (#SqlStatement) is
SELECT p.ProductID, p.Name, p.Color
FROM Production.Product p
and the execution plan includes an Clustered Index Scan. These two generated queries have different optimal execution plan.
Note: I've used Adventure Works 2012 sample database
My guess would be parameter sniffing - the procedure compiled when #includeAll was 1, and this is the query plan that has been cached. Meaning that when it is false you are still doing a full table scan when potentially and index seek and key lookup would be faster.
I think the best way of doing this is:
declare #includeAll bit = 0;
if #includeAll = 1
BEGIN
SELECT Id, Name,Total
FROM MyTable;
END
ELSE
BEGIN
SELECT Id, Name,Total
FROM MyTable
WHERE Id = 3926;
END
Or you can force recomplilation each time it is run:
SELECT Id, Name,Total
FROM MyTable
WHERE Id = 3926
OR #IncludeAll = 1
OPTION (RECOMPILE);
To demonstrate this further, I set up a very simple table and filled it with nonsense data:
CREATE TABLE dbo.T (ID INT, Filler CHAR(1000));
INSERT dbo.T (ID)
SELECT TOP 100000 a.Number
FROM master..spt_values a, master..spt_values b
WHERE a.type = 'P'
AND b.Type = 'P'
AND b.Number BETWEEN 1 AND 100;
CREATE NONCLUSTERED INDEX IX_T_ID ON dbo.T (ID);
I then ran the same query 4 times.
With #IncludeAll set to 1, query plan uses a table scan and plan is cached
Same query with #IncludeAll set to false, the plan with the table scan is still cached so that is used.
Clear the cache of plans, and run the query again with #IncludeAll false, so that the plan is now compiled and stored with an Index seek and bookmark lookup.
Run with #IncludeAll set to true. The Index seek and lookup are again used.
DECLARE #SQL NVARCHAR(MAX) = 'SELECT COUNT(Filler) FROM dbo.T WHERE #IncludeAll = 1 OR ID = 2;',
#ParamDefinition NVARCHAR(MAX) = '#IncludeAll BIT',
#PlanHandle VARBINARY(64);
EXECUTE sp_executesql #SQL, #ParamDefinition, 1;
EXECUTE sp_executesql #SQL, #ParamDefinition, 0;
SELECT #PlanHandle = cp.Plan_Handle
FROM sys.dm_exec_cached_plans cp
CROSS APPLY sys.dm_exec_sql_text(plan_handle) AS st
WHERE st.text LIKE '%' + #SQL;
DBCC FREEPROCCACHE (#PlanHandle); -- CLEAR THE CACHE
EXECUTE sp_executesql #SQL, #ParamDefinition, 0;
EXECUTE sp_executesql #SQL, #ParamDefinition, 1;
DBCC FREEPROCCACHE (#PlanHandle); -- CLEAR THE CACHE
Inspecting the execution plans show that once the query has been compiled it will reuse the same plan regardless of parameter value, and that it will cache the plan that is appropriate for the value passed when it is run first, not on a most flexible basis.
Putting an OR statement in your SQL like this will cause a scan. It is probably scanning your entire table or index which will be extremely inefficient in a very large table.
If you look at the query plan for a query without the #includeAll portion at all, you would probably see an Index seek operation. As soon as you add that OR you are more than likely changing the query plan to a table/index scan. You should take a look at the plan for your query to see what exactly it is doing.
Where Sql Server generates a queryplan, it must create a plan that would work for any possible values of any embedded variables. In your case an index seek would give you the best results when #IncludeAll = 0 however an index seek can not be used in the event #IncludeAll = 1. So the query optimizer has no choice but use the query plan that would work for either values of #IncludeAll. That results in a table scan.
It is because sql server does not short circuit when you have OR in where clause.
Normally in other programming languages if we have a condition like
IF (#Param1 == 1 || #Param2 == 2)
If Param1 is = 1 it would not even bother evaluating the other expression.
In Sql Server if you have similar OR in your where clause like you have in your query
WHERE #includeAll = 1 OR Id = 3926
Even if #includeAll = 1 evaluates to true it might go ahead and check for the second condition anyway.
And changing the Order in your where clause that which expression is evaluated 1st doesnt make any difference because this is something Tunning optimiser will decied at run time. You cannot control this behaviour of sql server. Short circuiting expression evaluation or Order in which expressions are evaluated.
I am using SQL Server 2008 and I have the following query:
SELECT [Id] FROM [dbo].[Products] WHERE [dbo].GetNumOnOrder([Id]) = 0
With the following "GetNumOnOrder" Scalar-valued function:
CREATE FUNCTION [dbo].[GetNumOnOrder]
(
#ProductId INT
)
RETURNS INT
AS
BEGIN
DECLARE #NumOnOrder INT
SELECT #NumOnOrder = SUM([NumOrdered] - [NumReceived])
FROM [dbo].[PurchaseOrderDetails]
INNER JOIN [dbo].[PurchaseOrders]
ON [PurchaseOrderDetails].[PurchaseOrderId] = [PurchaseOrders].[Id]
WHERE [PurchaseOrders].[StatusId] <> 5
AND [PurchaseOrderDetails].[ProductId] = #ProductId
RETURN CASE WHEN #NumOnOrder IS NOT NULL THEN #NumOnOrder ELSE 0 END
END
However it takes around 6 seconds to execute. Unfortunately I have no control over the initial SQL generated but I can change the function. Is there any way the function can be modified to speed this up? I'd appreciate the help. Thanks
If you have the rights to add indexes to the tables (and dependant on the version of SQL Server you are using), I would investigate what performance gain adding the following would have:-
create index newindex1 on PurchaseOrders (id)
include (StatusId);
create index newindex2 on PurchaseOrderDetails (PurchaseOrderId)
include (ProductId,NumOrdered,NumReceived);
You probably already have indexes on these columns - but the indexes above will support just the query in your function in the most efficient way possible (reducing the number of page reads to a minimum). If the performance of this function is important enough, you could also consider adding a calculated column into your table - for NumOrdered-NumReceived (and then only include the result column in the index above - and your query). You could also consider doing this in an indexed view rather than the table - but schema binding a view can by tiresome and inconvenient. Obviously, the wider the tables in question are - the greater the improvement in performance will be.
If you still want to use a function and can not live without it, use a in-line table value version. It is a-lot faster. Check out these articles from some experts.
http://aboutsqlserver.com/2011/10/23/sunday-t-sql-tip-inline-vs-multi-statement-table-valued-functions/
http://dataeducation.com/scalar-functions-inlining-and-performance-an-entertaining-title-for-a-boring-post/
I have had a couple MVP friends say that this the only function they ever write since scalar functions are treated as a bunch of Stored Procedure calls.
Re-write using in-line table value function. Check the syntax since I did not. Use the Coalesce function to convert NULL to Zero.
--
-- Table value function
--
CREATE FUNCTION [dbo].[GetNumOnOrder] ( #ProductId INT )
RETURNS TABLE
AS
RETURN
(
SELECT
COALESCE(SUM([NumOrdered] - [NumReceived]), 0) AS Num
FROM
[dbo].[PurchaseOrderDetails]
INNER JOIN [dbo].[PurchaseOrders]
ON [PurchaseOrderDetails].[PurchaseOrderId] = [PurchaseOrders].[Id]
WHERE [PurchaseOrders].[StatusId] <> 5
AND [PurchaseOrderDetails].[ProductId] = #ProductId
);
--
-- Sample call with cross apply
--
SELECT [Id]
FROM [dbo].[Products] P
CROSS APPLY [dbo].[GetNumOnOrder] (C.Id) AS CI
WHERE CI.Num = 0;
If the data is unevenly distributed data in table PurchaseOrderDetails then cached query plans might impact your query performance. This is a case where "Parameter Sniffing" might create bad query plans. Actually SQL Server supports an optimization called "parameter sniffing", where it will choose different plan based on the particular values in #ProductId variable.
So to improve performance of your query you can re-write your function as:
CREATE FUNCTION [dbo].[GetNumOnOrder]
(
#ProductId INT
)
RETURNS INT
AS
BEGIN
DECLARE #NumOnOrder INT,#v_ProductId INT
SET #v_ProductId = #ProductId;
SELECT #NumOnOrder = SUM([NumOrdered] - [NumReceived])
FROM [dbo].[PurchaseOrderDetails]
INNER JOIN [dbo].[PurchaseOrders]
ON [PurchaseOrderDetails].[PurchaseOrderId] = [PurchaseOrders].[Id]
WHERE [PurchaseOrders].[StatusId] <> 5
AND [PurchaseOrderDetails].[ProductId] = #v_ProductId
RETURN CASE WHEN #NumOnOrder IS NOT NULL THEN #NumOnOrder ELSE 0 END
END
or you can include a Recomplie hint.
In my SELECT statement i use optional parameters in a way like this:
DECLARE #p1 INT = 1
DECLARE #p2 INT = 1
SELECT name FROM some_table WHERE (id = #p1 OR #p1 IS NULL) AND (name = #p2 OR #p2 IS NULL)
In this case the optimizer generates "index scan" (not seek) operations for the entity which is not most effective when parameters are supplied with not null values.
If i add the RECOMPILE hint to the query the optimizer builds more effective plan which uses "seek". It works on my MSSQL 2008 R2 SP1 server and it also means that the optimizer CAN build a plan which consider only one logic branch of my query.
How can i make it to use that plan everywhere i want with no recompiling? The USE PLAN hint seemes not to work in this case.
Below is test code:
-- see plans
CREATE TABLE test_table(
id INT IDENTITY(1,1) NOT NULL,
name varchar(10),
CONSTRAINT [pk_test_table] PRIMARY KEY CLUSTERED (id ASC))
GO
INSERT INTO test_table(name) VALUES ('a'),('b'),('c')
GO
DECLARE #p INT = 1
SELECT name FROM test_table WHERE id = #p OR #p IS NULL
SELECT name FROM test_table WHERE id = #p OR #p IS NULL OPTION(RECOMPILE)
GO
DROP TABLE test_table
GO
Note that not all versions of SQL server will change the plan the way i shown.
The reason you get a scan is because the predicate will not short-circuit and both statements will always be evaluated. As you have already stated it will not work well with the optimizer and force a scan. Even though with recompile appears to help sometimes, it's not consistent.
If you have a large table where seeks are a must then you have two options:
Dynamic sql.
If statements separating your queries and thus creating separate execution plans (when #p is null you will of course always get a scan).
Response to Comment on Andreas' Answer
The problem is that you need two different plans.
If #p1 = 1 then you can use a SEEK on the index.
If #p1 IS NULL, however, it is not a seek, by definition it's a SCAN.
This means that when the optimiser is generating a plan Prior to knowledge of the parameters, it needs to create a plan that can fullfil all possibilities. Only a Scan can cover the needs of Both #p1 = 1 And #p1 IS NULL.
It also means that if the plan is recompiled at the time when the parameters are known, and #p1 = 1, a SEEK plan can be created.
This is the reason that, as you mention in your comment, IF statements resolve your problem; Each IF block represents a different portion of the problem space, and each can be given a different execution plan.
See Dynamic Search Conditions in T-SQL.
This explains comprehensively the versions where the RECOMPILE option works and alternatives where it doesn't.
Look at this article http://www.bigresource.com/Tracker/Track-ms_sql-fTP7dh01/
It seems that you can try to use proposal solution:
`SELECT * FROM <table> WHERE IsNull(column, -1) = IsNull(#value, -1)`
or
`SELECT * FROM <table> WHERE COALESCE(column, -1) = COALESCE(#value, -1)`
I have an SP that takes 10 seconds to run about 10 times (about a second every time it is ran). The platform is asp .net, and the server is SQL Server 2005. I have indexed the table (not on the PK also), and that is not the issue. Some caveats:
usp_SaveKeyword is not the issue. I commented out that entire SP and it made not difference.
I set #SearchID to 1 and the time was significantly reduced, only taking about 15ms on average for the transaction.
I commented out the entire stored procedure except the insert into tblSearches and strangely it took more time to execute.
Any ideas of what could be going on?
set ANSI_NULLS ON
go
ALTER PROCEDURE [dbo].[usp_NewSearch]
#Keyword VARCHAR(50),
#SessionID UNIQUEIDENTIFIER,
#time SMALLDATETIME = NULL,
#CityID INT = NULL
AS
BEGIN
SET NOCOUNT ON;
IF #time IS NULL SET #time = GETDATE();
DECLARE #KeywordID INT;
EXEC #KeywordID = usp_SaveKeyword #Keyword;
PRINT 'KeywordID : '
PRINT #KeywordID
DECLARE #SearchID BIGINT;
SELECT TOP 1 #SearchID = SearchID
FROM tblSearches
WHERE SessionID = #SessionID
AND KeywordID = #KeywordID;
IF #SearchID IS NULL BEGIN
INSERT INTO tblSearches
(KeywordID, [time], SessionID, CityID)
VALUES
(#KeywordID, #time, #SessionID, #CityID)
SELECT Scope_Identity();
END
ELSE BEGIN
SELECT #SearchID
END
END
Why are you using top 1 #SearchID instead of max (SearchID) or where exists in this query? top requires you to run the query and retrieve the first row from the result set. If the result set is large this could consume quite a lot of resources before you get out the final result set.
SELECT TOP 1 #SearchID = SearchID
FROM tblSearches
WHERE SessionID = #SessionID
AND KeywordID = #KeywordID;
I don't see any obvious reason for this - either of aforementioned constructs should get you something semantically equivalent to this with a very cheap index lookup. Unless I'm missing something you should be able to do something like
select #SearchID = isnull (max (SearchID), -1)
from tblSearches
where SessionID = #SessionID
and KeywordID = #KeywordID
This ought to be fairly efficient and (unless I'm missing something) semantically equivalent.
Enable "Display Estimated Execution Plan" in SQL Management Studio - where does the execution plan show you spending the time? It'll guide you on the heuristics being used to optimize the query (or not in this case). Generally the "fatter" lines are the ones to focus on - they're ones generating large amounts of I/O.
Unfortunately even if you tell us the table schema, only you will be able to see actually how SQL chose to optimize the query. One last thing - have you got a clustered index on tblSearches?
Triggers!
They are insidious indeed.
What is the clustered index on tblSearches? If the clustered index is not on primary key, the database may be spending a lot of time reordering.
How many other indexes do you have?
Do you have any triggers?
Where does the execution plan indicate the time is being spent?