Different execution plan when executing statement directly and from stored procedure - sql

While developing a new query at work I wrote it and profiled it in SQL Query Analyzer. The query was performing really well, without any table scans, but when I encapsulated it within a stored procedure the performance was horrible. When I looked at the execution plan I could see that SQL Server picked a different plan that used a table scan instead of an index seek on TableB (I've been forced to obfuscate the table and column names a bit, but none of the query logic has changed).
Here's the query:
SELECT
DATEADD(dd, 0, DATEDIFF(dd, 0, TableA.Created)) AS Day,
DATEPART(hh, TableA.Created) AS [Hour],
SUM(TableB.Quantity) AS Quantity,
SUM(TableB.Amount) AS Amount
FROM
TableA
INNER JOIN TableB ON TableA.BID = TableB.ID
WHERE
(TableA.ShopId = @ShopId)
GROUP BY
DATEADD(dd, 0, DATEDIFF(dd, 0, TableA.Created)),
DATEPART(hh, TableA.Created)
ORDER BY
DATEPART(hh, TableA.Created)
When I run the query "raw" I get the following trace stats
Event Class Duration CPU Reads Writes
SQL:StmtCompleted 75 41 7 0
And when I run the query as a stored proc using the following command
DECLARE @ShopId int
SELECT @ShopId = 1
EXEC spStats_GetSalesStatsByHour @ShopId
I get the following trace stats
Event Class Duration CPU Reads Writes
SQL:StmtCompleted 222 10 48 0
I also get the same result if I store the query in an nvarchar and execute it using sp_executesql like this (it performs like the sproc)
DECLARE @SQL nvarchar(2000)
SET @SQL = 'SELECT DATEADD(dd, ...'
exec sp_executesql @SQL
The stored procedure does not contain anything except for the select statement above. What would cause SQL Server to pick an inferior execution plan just because the statement is executed as a stored procedure?
We're currently running on SQL Server 2000.

This generally has something to do with parameter sniffing. It can be very frustrating to deal with. Sometimes it can be solved by recompiling the stored procedure, and sometimes you can even use a duplicate variable inside the stored procedure like this:
alter procedure p_myproc (@p1 int) as
declare @p1_copy int;
set @p1_copy = @p1;
And then use @p1_copy in the query. Seems ridiculous but it works.
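If you would rather not duplicate the parameter, forcing a fresh compilation sometimes does the trick. A minimal sketch against the procedure from the question (sp_recompile and EXECUTE ... WITH RECOMPILE are standard options; whether they help depends on which parameter value the next compilation happens to sniff):
-- Drop the cached plan so the next call compiles against the current parameter value
EXEC sp_recompile N'spStats_GetSalesStatsByHour'

-- Or compile a throw-away plan for this one call only
DECLARE @ShopId int
SELECT @ShopId = 1
EXEC spStats_GetSalesStatsByHour @ShopId WITH RECOMPILE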
Check my recent question on the same topic:
Why does the SqlServer optimizer get so confused with parameters?

Yes -- I had seen this on Oracle Database 11g as well: the same query ran fast on both nodes of the DB server at the SQL prompt, but when called from a package it literally hung!
I had to clear the shared pool to get identical behaviour: some job/script had been running that kept an older copy, with an inferior execution plan, locked in the library cache/memory on one node.
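For reference, "clearing the shared pool" on Oracle is a single statement (standard Oracle syntax, shown here only to illustrate the comment above; it throws away all cached cursors, so be careful on a busy system):
-- Oracle: discard cached cursors/plans so statements are re-optimized on their next execution
ALTER SYSTEM FLUSH SHARED_POOL;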

Related

SELECT hangs when using a variable

SQL Server 2014 (v13.0.4001.0) - this sample script hangs:
DECLARE @from int = 0
DECLARE @to int = 1000
select
*
from
TaskNote dtn
join
Participants tp on dtn.Task_ID = tp.TaskId
where
dtn.TaskNote_ID between @from and @to
But if I change variables to constants - it is all OK.
Like this:
where
dtn.TaskNote_ID between 0 and 1000
Also, if I remove the join, all is OK.
Can't figure out where the problem is.
A possible cause for the problem you mention, in case your query lies within a stored procedure, is parameter sniffing. SQL Server compiles the query for the first time using the initial values of the parameters. In subsequent calls to the procedure the engine uses the cached execution plan which is probably not optimal for the current variable values.
One workaround for this problem is to use OPTION (RECOMPILE):
select *
from TaskNote dtn
join Participants tp on dtn.Task_ID = tp.TaskId
where dtn.TaskNote_ID between @from and @to
option (recompile)
This way the query is compiled every time the procedure is executed, using the current parameter values.
Further reading:
Parameter Sniffing Problem and Possible Workarounds

Performance issue when inserting large XML parameters into temp table [duplicate]

I'm trying to insert some data from an XML document into a variable table. What blows my mind is that the same select-into (bulk) runs in no time, while the insert-select takes ages and pegs the SQL Server process at 100% CPU usage while the query executes.
I took a look at the execution plan and INDEED there's a difference. The insert-select adds an extra "Table spool" node even though it doesn't assign cost. The "Table Valued Function [XML Reader]" then gets 92%. With select-into, the two "Table Valued Function [XML Reader]" get 49% each.
Please explain "WHY is this happening" and "HOW to resolve this (elegantly)" as I can indeed bulk insert into a temporary table and then in turn insert into variable table, but that's just creepy.
I tried this on SQL Server 10.50.1600 (2008 R2) and 10.00.2531 (2008), with the same results.
Here's a test case:
declare @xColumns xml
declare @columns table(name nvarchar(300))
if OBJECT_ID('tempdb.dbo.#columns') is not null drop table #columns
insert @columns select name from sys.all_columns
set @xColumns = (select name from @columns for xml path('columns'))
delete @columns
print 'XML data size: ' + cast(datalength(@xColumns) as varchar(30))
--raiserror('selecting', 10, 1) with nowait
--select ColumnNames.value('.', 'nvarchar(300)') name
--from @xColumns.nodes('/columns/name') T1(ColumnNames)
raiserror('selecting into #columns', 10, 1) with nowait
select ColumnNames.value('.', 'nvarchar(300)') name
into #columns
from @xColumns.nodes('/columns/name') T1(ColumnNames)
raiserror('inserting @columns', 10, 1) with nowait
insert @columns
select ColumnNames.value('.', 'nvarchar(300)') name
from @xColumns.nodes('/columns/name') T1(ColumnNames)
Thanks a bunch!!
This is a bug in SQL Server 2008.
Use
insert @columns
select ColumnNames.value('.', 'nvarchar(300)') name
from @xColumns.nodes('/columns/name') T1(ColumnNames)
OPTION (OPTIMIZE FOR ( @xColumns = NULL ))
This workaround is from an item on the Microsoft Connect site, which also mentions that a hotfix for this Eager Spool / XML Reader issue is available (under trace flag 4130).
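If you go the hotfix route, the fix only takes effect while the trace flag is enabled. A minimal sketch, assuming the hotfix from the Connect item is installed (-1 makes the flag server-wide; it can also be set as a -T4130 startup parameter):
DBCC TRACEON (4130, -1)   -- enable the Eager Spool / XML Reader fix globally
DBCC TRACESTATUS (4130)   -- verify the flag is active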
The reason for the performance regression is explained in a different connect item
The spool was introduced due to a general halloween protection logic
(that is not needed for the XQuery expressions).
Looks to be an issue specific to SQL Server 2008. When I run the code in SQL Server 2005, both inserts run quickly and produce identical execution plans that start with the fragment shown below as Plan 1. In 2008, the first insert uses Plan 1 but the second insert produces Plan 2. The remainder of both plans beyond the fragment shown are identical.
Plan 1
Plan 2

Does assigning stored procedure input parameters to local variables help optimize the query?

I have a stored procedure that takes 5 input parameters. The procedure is a bit complicated and takes around 2 minutes to execute. I am in the process of optimizing the query.
So, my question is, does it always help to assign input parameters to local variables and then use local variables in the procedure?
If so, how does it help?
I will not try to explain the full details of parameter sniffing, but in short: no, it does not always help (and it can hinder).
Imagine a table (T) with a primary key and an indexed Date column (A). The table holds 1,000 rows; 400 have the same value of A (let's say today, 20130122), and the remaining 600 rows are the next 600 days, so only 1 record per date.
This query:
SELECT *
FROM T
WHERE A = '20130122';
Will yield a different execution plan to:
SELECT *
FROM T
WHERE A = '20130123';
Since the statistics indicate that the first query will return 400 out of 1,000 rows, the optimiser should recognise that a table scan will be more efficient than a bookmark lookup, whereas the second query will yield only 1 row, so a bookmark lookup will be much more efficient.
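You can see the skew the optimiser is working from by looking at the statistics histogram on the index; a minimal sketch, assuming the index on column A is named IX_T as in the demo script further down:
-- The histogram shows roughly 400 rows for 20130122 and 1 row for every other date,
-- which is exactly what drives the scan-versus-lookup choice described above
DBCC SHOW_STATISTICS ('T', 'IX_T') WITH HISTOGRAM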
Now, back to your question, if we made this a procedure:
CREATE PROCEDURE dbo.GetFromT @Param DATE
AS
SELECT *
FROM T
WHERE A = @Param
Then run
EXECUTE dbo.GetFromT '20130122'; --400 rows
The query plan with the table scan will be used; if instead you use '20130123' as the parameter the first time you run it, the bookmark lookup plan will be stored. Until the procedure is recompiled the plan will remain the same. Doing something like this:
CREATE PROCEDURE dbo.GetFromT @Param DATE
AS
DECLARE @Param2 DATE = @Param;
SELECT *
FROM T
WHERE A = @Param2
Then this is run:
EXECUTE dbo.GetFromT '20130122';
While the procedure is compiled in one go, the parameter value does not flow through it, so the query plan created at the first compilation has no idea that @Param2 will take the same value as @Param. The optimiser, with no knowledge of how many rows to expect, assumes 300 will be returned (30%), and as such deems a table scan more efficient than a bookmark lookup. If you ran the same procedure with '20130123' as a parameter it would yield the same plan (regardless of which parameter it was first invoked with), because the statistics cannot be used for an unknown value. So running this procedure for '20130122' would be more efficient, but for all other values it would be less efficient than without local variables (assuming the procedure without local variables was first invoked with anything but '20130122').
Some queries to demonstrate, so you can view the execution plans for yourself.
Create schema and sample data
CREATE TABLE T (ID INT IDENTITY(1, 1) PRIMARY KEY, A DATE NOT NULL, B INT,C INT, D INT, E INT);
CREATE NONCLUSTERED INDEX IX_T ON T (A);
INSERT T (A, B, C, D, E)
SELECT TOP 400 CAST('20130122' AS DATE), number, 2, 3, 4
FROM Master..spt_values
WHERE type = 'P'
UNION ALL
SELECT TOP 600 DATEADD(DAY, number, CAST('20130122' AS DATE)), number, 2, 3, 4
FROM Master..spt_values
WHERE Type = 'P';
GO
CREATE PROCEDURE dbo.GetFromT @Param DATE
AS
SELECT *
FROM T
WHERE A = @Param
GO
CREATE PROCEDURE dbo.GetFromT2 @Param DATE
AS
DECLARE @Param2 DATE = @Param;
SELECT *
FROM T
WHERE A = @Param2
GO
Run procedures (showing actual execution plan):
EXECUTE GetFromT '20130122';
EXECUTE GetFromT '20130123';
EXECUTE GetFromT2 '20130122';
EXECUTE GetFromT2 '20130123';
GO
EXECUTE SP_RECOMPILE GetFromT;
EXECUTE SP_RECOMPILE GetFromT2;
GO
EXECUTE GetFromT '20130123';
EXECUTE GetFromT '20130122';
EXECUTE GetFromT2 '20130123';
EXECUTE GetFromT2 '20130122';
You will see that the first time GetFromT is compiled it uses a table scan, and it retains this plan when run with the parameter '20130122'; GetFromT2 also uses a table scan and retains that plan for '20130122'.
After the procedures have been marked for recompilation and run again (note in a different order), GetFromT uses a bookmark lookup and retains that plan for '20130122', despite having previously deemed a table scan the more appropriate plan. GetFromT2 is unaffected by the order and has the same plan as before the recompilation.
So, in summary, whether a procedure will benefit from using local variables depends on the distribution of your data, your indexes, your frequency of recompilation, and a bit of luck. It certainly does not always help.
Hopefully I have shed some light on the effect of local variables on execution plans and stored procedure compilation. If I have failed completely, or missed a key point, a much more in-depth explanation can be found here:
http://www.sommarskog.se/query-plan-mysteries.html
I don't believe so. Modern computer architectures have plenty of cache close to the processor for putting in stored procedure values. Essentially, you can consider these as being on a "stack" which gets loaded into local cache memory.
If you have output parameters, then possibly copying input values to a local variable would eliminate one step of indirection. However, the first time that indirection is executed, the destination memory will be put in the local cache and it will probably remain there.
So, no, I don't think this is an important optimization.
But, you could always time different variants of a stored procedure to see if this would help.
It does help.
The links below contain more details about parameter sniffing.
http://blogs.msdn.com/b/turgays/archive/2013/09/10/parameter-sniffing-problem-and-workarounds.aspx
http://sqlperformance.com/2013/08/t-sql-queries/parameter-sniffing-embedding-and-the-recompile-options
When you execute an SP with parameters for the first time, the query optimizer creates the query plan based on the values of those parameters.
The optimizer uses the statistics for those particular values to decide on the best query plan, but cardinality issues can affect this: if you execute the same SP with a different parameter value, the previously generated query plan may not be the best plan.
By assigning the parameters to local variables we hide the parameter values from the query optimizer, so it creates the query plan for the general case.
This is the same as using the OPTIMIZE FOR UNKNOWN hint in the SP.
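A minimal sketch of that hint (the procedure, table, and column names here are made up for illustration; OPTIMIZE FOR UNKNOWN is available from SQL Server 2008 onwards and makes the optimizer plan for the average data distribution instead of the sniffed value):
-- Hypothetical example: the plan is built for the "average" @Status value,
-- not for whichever value happens to be passed on the first execution
CREATE PROCEDURE dbo.GetOrdersByStatus @Status int
AS
SELECT *
FROM dbo.Orders
WHERE Status = @Status
OPTION (OPTIMIZE FOR UNKNOWN)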

Writing LOGS in stored procedure

I am using SQL Server. I'm writing a stored procedure that executes a series of queries. I want to log the execution time of every query. Is it possible? Please help.
Example for using a logging table:
create procedure procedure_name as
begin
    declare @start_date datetime = getdate(),
            @execution_time_in_seconds int

    /* your procedure code here */

    set @execution_time_in_seconds = datediff(SECOND, @start_date, getdate())

    insert into your_logging_table (execution_time_column)
    values (@execution_time_in_seconds)
end
The engine already keeps execution stats in sys.dm_exec_query_stats. Before you add heavy logging, like inserts into a log table inside your procedure, consider what you can extract from these stats. They contain values for:
execution count
execution time (elapsed)
work time (non-blocked actual CPU time across all CPUs in parallel queries)
logical reads/writes
physical reads
number of rows returned
This kind of information is significantly richer and more useful for performance investigation than what you would log with a naive approach. Most metrics contain the min, max and total value (and with the execution count you also have the average). You can immediately get a clue which queries are expensive (large average elapsed time), which queries block often (elapsed time much higher than work time), which cause many reads or writes, which return large results, and so on.
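A minimal sketch of pulling those counters out of the DMV (sys.dm_exec_query_stats and sys.dm_exec_sql_text are standard DMVs from SQL Server 2005 on; the time columns are in microseconds):
-- Top statements by average elapsed time, with the counters listed above
SELECT TOP (20)
    qs.execution_count,
    qs.total_elapsed_time / qs.execution_count AS avg_elapsed_time,
    qs.total_worker_time / qs.execution_count AS avg_worker_time,
    qs.total_logical_reads,
    qs.total_logical_writes,
    qs.total_physical_reads,
    SUBSTRING(st.text, (qs.statement_start_offset / 2) + 1,
        ((CASE qs.statement_end_offset WHEN -1 THEN DATALENGTH(st.text)
          ELSE qs.statement_end_offset END - qs.statement_start_offset) / 2) + 1) AS statement_text
FROM sys.dm_exec_query_stats AS qs
CROSS APPLY sys.dm_exec_sql_text(qs.sql_handle) AS st
ORDER BY avg_elapsed_time DESC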
You can keep track of timestamps via CURRENT_TIMESTAMP: log one before and one after each statement you want to measure, with meaningful messages indicating what started and finished when, and compare them later. Or, if you want to see the timings directly, you can use SET STATISTICS TIME ON and SET STATISTICS TIME OFF; this one I use in Query Analyzer.
Depending on what exactly you want, you need to figure out where to store these messages for logging, like a table or something else.
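A minimal sketch of the SET STATISTICS TIME option (the timings go to the Messages tab, or to the client's info-message stream, rather than to a table):
SET STATISTICS TIME ON

-- the statement you want to measure, for example:
SELECT COUNT(*) FROM sys.objects

SET STATISTICS TIME OFF
-- Messages output: "SQL Server Execution Times: CPU time = ... ms, elapsed time = ... ms."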
Use PRINT and GETDATE to log the execution time info.
Example:
DECLARE @Time1 DATETIME
DECLARE @Time2 DATETIME
DECLARE @Time3 DATETIME
DECLARE @STR_TIME1 VARCHAR(255)
DECLARE @STR_TIME2 VARCHAR(255)
DECLARE @STR_TIME3 VARCHAR(255)
--{ execute query 1 }
SET @Time1 = GETDATE()
--{ execute query 2 }
SET @Time2 = GETDATE()
--{ execute query 3 }
SET @Time3 = GETDATE()
SET @STR_TIME1 = CONVERT(varchar(255), @Time1, 109)
SET @STR_TIME2 = CONVERT(varchar(255), @Time2, 109)
SET @STR_TIME3 = CONVERT(varchar(255), @Time3, 109)
PRINT 'T1 is ' + @STR_TIME1
PRINT 'T2 is ' + @STR_TIME2
PRINT 'T3 is ' + @STR_TIME3
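Since @Time1, @Time2 and @Time3 are captured right after each query, the differences between consecutive timestamps give you durations directly; for example, the elapsed times of queries 2 and 3 in milliseconds:
PRINT 'Query 2 took ' + CAST(DATEDIFF(ms, @Time1, @Time2) AS varchar(20)) + ' ms'
PRINT 'Query 3 took ' + CAST(DATEDIFF(ms, @Time2, @Time3) AS varchar(20)) + ' ms'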

Query times out in .Net SqlCommand.ExecuteNonQuery, works in SQL Server Management Studio

Update: Problem solved, and staying solved. If you want to see the site in action, visit Tweet08
I've got several queries that act differently in SSMS versus when run inside my .Net application. In SSMS they execute fine in under a second; the .Net call times out after 120 seconds (the connection's default timeout).
I did a SQL Trace (and collected everything) and saw that the connection options are the same (and match SQL Server's defaults). The SHOWPLAN ALL output, however, shows a huge difference in the row estimates, and thus the working version does an aggressive Table Spool whereas the failing call does not.
In SSMS, the data types of the local variables are based on the SqlParameters generated in .Net, so they are the same.
The failure executes under Cassini in a VS2008 debug session. The success is under SSMS 2008. Both are running against the same destination server, from the same network, on the same machine.
Query in SSMS:
DECLARE @ContentTableID0 TINYINT
DECLARE @EntryTag1 INT
DECLARE @ContentTableID2 TINYINT
DECLARE @FieldCheckId3 INT
DECLARE @FieldCheckValue3 VARCHAR(128)
DECLARE @FieldCheckId5 INT
DECLARE @FieldCheckValue5 VARCHAR(128)
DECLARE @FieldCheckId7 INT
DECLARE @FieldCheckValue7 VARCHAR(128)
SET @ContentTableID0= 3
SET @EntryTag1= 8
SET @ContentTableID2= 2
SET @FieldCheckId3= 14
SET @FieldCheckValue3= 'igor'
SET @FieldCheckId5= 33
SET @FieldCheckValue5= 'a'
SET @FieldCheckId7= 34
SET @FieldCheckValue7= 'a'
SELECT COUNT_BIG(*)
FROM dbo.ContentEntry AS mainCE
WHERE GetUTCDate() BETWEEN mainCE.CreatedOn AND mainCE.ExpiredOn
AND (mainCE.ContentTableID=@ContentTableID0)
AND ( EXISTS (SELECT *
FROM dbo.ContentEntryLabel
WHERE ContentEntryID = mainCE.ID
AND GetUTCDate() BETWEEN CreatedOn AND ExpiredOn
AND LabelFacetID = @EntryTag1))
AND (mainCE.OwnerGUID IN (SELECT TOP 1 Name
FROM dbo.ContentEntry AS innerCE1
WHERE GetUTCDate() BETWEEN innerCE1.CreatedOn AND innerCE1.ExpiredOn
AND (innerCE1.ContentTableID=@ContentTableID2
AND EXISTS (SELECT *
FROM dbo.ContentEntryField
WHERE ContentEntryID = innerCE1.ID
AND (ContentTableFieldID = @FieldCheckId3
AND DictionaryValueID IN (SELECT dv.ID
FROM dbo.DictionaryValue AS dv
WHERE dv.Word LIKE '%' + @FieldCheckValue3 + '%'))
)
)
)
OR EXISTS (SELECT *
FROM dbo.ContentEntryField
WHERE ContentEntryID = mainCE.ID
AND ( (ContentTableFieldID = @FieldCheckId5
AND DictionaryValueID IN (SELECT dv.ID
FROM dbo.DictionaryValue AS dv
WHERE dv.Word LIKE '%' + @FieldCheckValue5 + '%')
)
OR (ContentTableFieldID = @FieldCheckId7
AND DictionaryValueID IN (SELECT dv.ID
FROM dbo.DictionaryValue AS dv
WHERE dv.Word LIKE '%' + @FieldCheckValue7 + '%')
)
)
)
)
Trace's version of .Net call (some formatting added):
exec sp_executesql N'SELECT COUNT_BIG(*) ...'
,N'@ContentTableID0 tinyint
,@EntryTag1 int
,@ContentTableID2 tinyint
,@FieldCheckId3 int
,@FieldCheckValue3 varchar(128)
,@FieldCheckId5 int
,@FieldCheckValue5 varchar(128)
,@FieldCheckId7 int
,@FieldCheckValue7 varchar(128)'
,@ContentTableID0=3
,@EntryTag1=8
,@ContentTableID2=2
,@FieldCheckId3=14
,@FieldCheckValue3='igor'
,@FieldCheckId5=33
,@FieldCheckValue5='a'
,@FieldCheckId7=34
,@FieldCheckValue7='a'
It is not your indexes.
This is parameter sniffing, as usually happens with parameterized stored procedures. It is not widely known, even among those who know about parameter sniffing, that it can also happen when you use parameters through sp_executesql.
You will note that the version you are testing in SSMS and the version the profiler is showing are not identical, because the profiler version shows that your .Net application is executing it through sp_executesql. If you extract and execute the full SQL text that is actually being run for your application, then I believe you will see the same performance problem with the same query plan.
FYI: the query plans being different is the key indicator of parameter sniffing.
FIX: The easiest way to fix this one, assuming it is executing on SQL Server 2005 or 2008, is to add the clause "OPTION (RECOMPILE)" as the last line of your SELECT statement. Be forewarned, you may have to execute it twice before it works, and it does not always work on SQL Server 2005. If that happens, there are other steps that you can take, but they are a little bit more involved.
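Concretely, that means the statement text your application sends through sp_executesql would end with the hint. A trimmed-down, hypothetical version of the query, shown only to illustrate the placement:
SELECT COUNT_BIG(*)
FROM dbo.ContentEntry AS mainCE
WHERE mainCE.ContentTableID = @ContentTableID0
OPTION (RECOMPILE)   -- recompile per execution, so a plan sniffed for other values cannot stick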
One thing that you could try is to check whether "Forced Parameterization" has been turned on for your database (it should be in the SSMS database properties, under the Options page). To turn Forced Parameterization off, execute this command:
ALTER DATABASE [yourDB] SET PARAMETERIZATION SIMPLE
I ran into this situation today and the fix that solved my problem is to use WITH (NOLOCK) while doing a select on tables:
Eg: If your stored proc has T-SQL that looks like below:
SELECT * FROM [dbo].[Employee]
Change it to
SELECT * FROM [dbo].[Employee] WITH (NOLOCK)
Hope this helps.
I've had off-hours jobs fubar my indexes before and I've gotten the same result as you describe. sp_recompile can recompile a sproc... or, if that doesn't work, sp_recompile can be run on the table, and all sprocs that act on that table will be recompiled -- works for me every time.
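A minimal sketch of the table-level variant mentioned above, using one of the tables from the question (every procedure and trigger that references the table is marked for recompilation on its next use):
EXEC sp_recompile N'dbo.ContentEntry'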
I ran into this problem before as well. Sounds like your indexes are out of whack. To get the same behavior in SSMS, add this before the script
SET ARITHABORT OFF
Does it time out as well? If so, it's your indexing and statistics.
It's most likely index-related. I had a similar issue with a .Net app vs SSMS (specifically on a proc using a temp table with < 100 rows). We added a clustered index on the table and it flew from .Net thereafter.
I checked, and this server, a development server, was not running SQL Server 2005 SP3. I tried to install it (with the necessary reboot), but it didn't install. Oddly, now both the code and SSMS return in subsecond time.
Woot this is a HEISENBUG.
I've seen this behavior before and it can be a big problem with o/r mappers that use sp_executesql. If you examine the execution plans you'll likely find that the sp_executesql query is not making good use of indexes. I spent a fair amount of time trying to find a fix or explanation for this behavior but never got anywhere.
Most likely your .Net program passes the variables as NVARCHAR, not as VARCHAR. Your indexes are on VARCHAR columns I assume (judging from your script), and a condition like ascii_column = @unicodeVariable is actually not SARG-able. The plan has to generate a scan in this case, whereas SSMS would generate a seek because the variable is the right type.
Make sure you pass all your strings as VARCHAR parameters, or modify your query to explicitly cast the variables, like this:
SELECT dv.ID
FROM dbo.DictionaryValue AS dv
WHERE dv.Word LIKE '%' + CAST(@FieldCheckValue5 AS VARCHAR(128)) + '%'