Why is this Sproc Scanning Table [closed]

This system was built years ago on SQL Server 7 and currently runs on SQL Server 2005.
We have a table tProducts with a productid GUID as the PK/clustered index. I realize that for optimum performance we should modify the table, but that's not a possibility at this point. I'm joining it to the tProductSpecial table, which also has a PK/clustered index on productid, which is also the FK in this relationship. We have about 50k records in the tProducts table and about 35k records in the tProductSpecial table (some products have special information and some do not). There is one more piece: I am using a temporary table in the sproc to grab the logged-in user's security roles and loading them; this also joins to the tProducts table, and roleid is a non-clustered index in tProducts. I've included some of the WHERE conditions that access these tables.
SELECT *
FROM tProducts
JOIN tProductSpecial ON tProducts.productid = tProductSpecial.productid
JOIN #tRoles ON tProducts.roleid = #tRoles.roleid
WHERE
    (tProducts.productSKU = @sku AND tProducts.productStatus = 1) -- DIRECT MATCH
    OR
    ( -- KEYWORD SEARCH
        CONTAINS(tProducts.*, 'FORMSOF(INFLECTIONAL,''' + @lookuptext + ''')')
        AND
        (
            @productStatus IS NULL
            OR
            (
                @productStatus IS NOT NULL
                AND tProducts.productStatus = @productStatus
            )
        )
        AND
        ( -- item on sale
            @bOnSale IS NULL
            OR @bOnSale = 0
            OR
            (
                @bOnSale = 1
                AND tProducts.productOnSale = 1
            )
        )
        AND
        ( -- from price
            @from = 0
            OR @from IS NULL
            OR
            (
                @from <> 0
                AND tProducts.customerCost >= @from
            )
        )
        AND
        ( -- to price
            @to = 0
            OR @to IS NULL
            OR
            (
                @to <> 0
                AND tProducts.customerCost <= @to
            )
        )
        AND
        ( -- how old is product
            @age IS NULL
            OR @age = 0
            OR
            (
                @age IS NOT NULL
                AND @age > 0
                AND DATEDIFF(day, tProducts.productCreated, GETDATE()) <= CONVERT(varchar(10), @age)
            )
        )
    )
ORDER BY tProducts.productSKU

The problem is probably parameter sniffing: the same plan is reused over and over even when the parameters differ, and only sometimes is a scan the best approach. In SQL Server 2008 you could simply add OPTION (RECOMPILE) or OPTIMIZE FOR UNKNOWN, but in SQL Server 2005, try altering the stored procedure to add the WITH RECOMPILE option. This forces SQL Server to consider a new plan on every execution, based on the incoming parameters.
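For example (a minimal sketch; the procedure name, parameter types, and body are assumptions, since the actual sproc definition wasn't posted):

-- Sketch only: proc name, parameter types, and body are assumed.
ALTER PROCEDURE dbo.spSearchProducts
    @sku varchar(50),
    @productStatus int = NULL
WITH RECOMPILE  -- SQL 2005: build a fresh plan on every execution
AS
BEGIN
    SELECT p.*
    FROM tProducts p
    WHERE p.productSKU = @sku
       OR (@productStatus IS NOT NULL AND p.productStatus = @productStatus);
END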
Another option would be to build up the query dynamically based on whether @bOnSale, @from, etc. are populated. In 2005 this will lead to plan cache bloat, but you may be better off overall. It could completely avoid the full-text access, for example, when @sku is populated. Again, this is better in SQL Server 2008, where you can stave off some of the plan cache bloat by using Optimize for ad hoc workloads.
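A rough sketch of the dynamic approach, assuming it runs inside the sproc where @sku, @bOnSale, and @from are parameters (the types are guesses):

DECLARE @sql nvarchar(max);
SET @sql = N'SELECT p.* FROM tProducts p
             JOIN tProductSpecial ps ON p.productid = ps.productid
             WHERE 1 = 1';

-- Append only the branches that apply, so each variant gets its own plan
-- and the full-text predicate is skipped entirely on a direct SKU match.
IF @sku IS NOT NULL
    SET @sql = @sql + N' AND p.productSKU = @sku AND p.productStatus = 1';
IF @bOnSale = 1
    SET @sql = @sql + N' AND p.productOnSale = 1';
IF ISNULL(@from, 0) <> 0
    SET @sql = @sql + N' AND p.customerCost >= @from';

EXEC sp_executesql @sql,
    N'@sku varchar(50), @from money',
    @sku = @sku, @from = @from;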

A stored procedure of long standing can suddenly develop a case of the poor plan for all sorts of reasons. Just off the top of my head here are a few that stand out:
Index Changes
Dropping/Modifying indices on tables can cause queries that have had a perfectly fine execution plan for years to go south. Solution: don't do that. If you are going to do that, examine the dependencies on that table and make sure that everything is still getting a good execution plan.
Table Statistics
Index statistics can get fouled up for a number of reasons. For instance, large bulk loads of data into table(s) referenced by the stored procedure can bollux up index statistics on those tables and thus mislead the query optimizer into constructing a poor execution plan. Solutions include one or more of the following (sketched below):
Update statistics on the table(s) involved.
Run DBCC FREEPROCCACHE to flush the stored procedure cache.
Execute sp_recompile on the offending stored procedure.
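In T-SQL, those look like this (the table and proc names are placeholders):

UPDATE STATISTICS dbo.tProducts;        -- refresh stats on the affected table
DBCC FREEPROCCACHE;                     -- flush the entire plan cache (server-wide!)
EXEC sp_recompile N'dbo.uspMyProc';     -- mark a single proc for recompilation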
Direct References To Stored Procedure Parameters
Stored procedures whose queries directly use the parameters passed into the stored procedure get an execution plan based on the parameter values at the stored procedure's first execution, when the plan is created and cached. If those parameter values are a little outré, the cached execution plan may perform well for the oddball case but extremely poorly for the general case. The short-term solution is to run sp_recompile, or to modify the stored procedure so that it has the WITH RECOMPILE option. (Note, however, that this has a significant performance impact, since the stored procedure gets recompiled on every execution. That means compile locks are taken during the compilation process, and that can (and does) cause blocking.)
The correct fix for this problem is to declare a local variable within the stored procedure and set it to the value of the parameter. That turns the parameter value into an expression and breaks the plan's dependency on the parameter value. Floored me the first time I got hit with this problem.
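A minimal sketch of that technique (the proc and column names are assumed):

CREATE PROCEDURE dbo.uspGetProductsByStatus
    @status int
AS
BEGIN
    -- The optimizer cannot sniff a local variable, so the plan is built
    -- from average density statistics rather than the first caller's value.
    DECLARE @localStatus int;
    SET @localStatus = @status;

    SELECT * FROM tProducts WHERE productStatus = @localStatus;
END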
Good luck!

Related

SQL Server table-valued function executed code

Based on Row level security I have created a table-valued function:
CREATE FUNCTION Security.userAccessPredicate(@ValueId int)
RETURNS TABLE
WITH SCHEMABINDING
AS
RETURN
(
    SELECT 1 AS accessResult
    WHERE @ValueId =
    (
        SELECT Value
        FROM dbo.[Values]
        WHERE UserId = CAST(SESSION_CONTEXT(N'UserId') AS nvarchar(50))
    )
    OR NULLIF(CAST(SESSION_CONTEXT(N'UserId') AS nvarchar(50)), '') IS NULL
);

CREATE SECURITY POLICY Security.userSecurityPolicy
ADD FILTER PREDICATE Security.userAccessPredicate(ValueId) ON dbo.MainTable;
Let's say MainTable contains millions of rows. Is userAccessPredicate evaluating SELECT Value FROM dbo.[Values] for every row independently? If so, I guess it's inefficient. How can I check what exact code is generated when the table-valued function executes? SQL Server Profiler isn't an option because I am using an Azure DB.
I am using SQL Server 2016 Management Studio.
The best way is to look at an execution plan with the policy turned off, then turned on. You'll see the extra work it's doing as a consequence. You're adding another table to the query, so it's similar to doing a join, but probably more efficient.
To answer your question: if you see the addition of a Nested Loops operator in the plan when the policy is on, then yes, it's going row by row.
Also do the same with SET STATISTICS IO to get a look at the resource hits. With smaller tables (< 100,000 rows in a similar implementation) I never saw any noticeable performance hits.
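One way to run that comparison, assuming you have permission to alter the policy (a sketch):

ALTER SECURITY POLICY Security.userSecurityPolicy WITH (STATE = OFF);
SET STATISTICS IO ON;
SELECT COUNT(*) FROM dbo.MainTable;   -- baseline reads, no predicate

ALTER SECURITY POLICY Security.userSecurityPolicy WITH (STATE = ON);
SELECT COUNT(*) FROM dbo.MainTable;   -- reads with the filter predicate applied
SET STATISTICS IO OFF;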
I found this link useful when getting into this before.
https://www.mssqltips.com/sqlservertip/4005/sql-server-2016-row-level-security-limitations-performance-and-troubleshooting/

How can a stored proc have multiple execution plans?

I am working with MS SQL Server 2008 R2. I have a stored procedure named rpt_getWeeklyScheduleData. This is the query I used to look up its execution plan in a specific database:
select
    *
from
    sys.dm_exec_cached_plans cp
    CROSS APPLY sys.dm_exec_sql_text(cp.plan_handle) st
where
    OBJECT_NAME(st.objectid, st.dbid) = 'rpt_getWeeklyScheduleData' and
    st.dbid = DB_ID()
The above query returns me 9 rows. I was expecting 1 row.
This stored procedure has been modified multiple times, so I believe SQL Server has been building a new execution plan for it whenever it was modified and run. Is that the correct explanation? If not, how can you explain this?
Also, is it possible to see when each plan was created? If yes, then how?
UPDATE:
This is the stored proc's signature:
CREATE procedure [dbo].[rpt_getWeeklyScheduleData]
(
    @a_paaipk int,
    @a_location_code int,
    @a_department_code int,
    @a_week_start_date varchar(12),
    @a_week_end_date varchar(12),
    @a_language_code int,
    @a_flag int
)
as
begin
    ...
end
The stored proc is long; it has only two if conditions, both on the @a_flag parameter.
if @a_flag = 0
begin
    ...
end
if @a_flag = 1
begin
    ...
end
Depending on the nature of the stored procedure (which wasn't provided), this is very possible for any number of reasons (and most likely not limited to the below):
Does the proc use a lot of "if this, then this select; else, this other select/update" branching?
Does the proc contain dynamic SQL?
Are you executing the SP from both the web and SSMS? Then you're likely executing the SP with different connection (SET) settings.
Does the stored proc have parameters? Sometimes a difference in parameters can make one execution plan terrible for a specific set, so a different plan is used.
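To see what actually distinguishes the nine cached copies, and when each one was compiled, a query along these lines should help (a sketch; dm_exec_query_stats has one row per cached statement, so expect some duplication):

SELECT cp.plan_handle,
       qs.creation_time,           -- when the plan was compiled
       pa.attribute, pa.value      -- cache-key attributes, e.g. differing SET options
FROM sys.dm_exec_cached_plans cp
CROSS APPLY sys.dm_exec_sql_text(cp.plan_handle) st
CROSS APPLY sys.dm_exec_plan_attributes(cp.plan_handle) pa
LEFT JOIN sys.dm_exec_query_stats qs ON qs.plan_handle = cp.plan_handle
WHERE OBJECT_NAME(st.objectid, st.dbid) = 'rpt_getWeeklyScheduleData'
  AND st.dbid = DB_ID()
  AND pa.is_cache_key = 1
ORDER BY cp.plan_handle, pa.attribute;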
Going to try an analogy which might help... maybe...
Say you have a stored procedure for your weekend shopping.
You typically need to get groceries, sometimes an air filter, and even less often a big pack of something that needs replacing 4 times a year.
The grocery store can handle groceries, and is the closest to your house (5 minutes).
Target can handle the air filter and groceries, but adds 25 minutes of travel time.
"Big place of everything" has everything you'd possibly need, but is an hour's drive away.
So here, depending on your parameters, @needsAirFilter and @needsBigPackOfSomething could vastly change the "execution plan" of your "shopping" stored procedure.
If @needsAirFilter and @needsBigPackOfSomething are false, there's no reason to make the 30-minute or hour drive, as everything you need is at the grocery store.
Once a month, @needsAirFilter is true; in that case we need to go to Target, as the grocery store's execution plan is insufficient.
Four times a year @needsBigPackOfSomething is true, and we need to make the hour drive to get the big pack of something, while grabbing groceries and an air filter since we're there.
Sure... we could make the hour drive every time to get groceries, and the other things when needed (imagine a single execution plan). But that is in no way the most efficient way to do it. In instances like this, we have different execution plans for what information/goods are actually needed.
No idea if that helps... but I had fun :D
Typically SQL Server will generate a new query plan depending on the values of the parameters being passed in (these can determine which indexes, if any, it will use). If indexes are added, changed, or updated on the tables/views used in the proc, SQL Server may decide that it is more effective to use one or more indexes that it previously ignored. The more involved the SQL in the proc, the more work SQL Server does as it attempts to optimize the query. If the data changes (suddenly you have many more customers in NJ, and there is a query and an index on state), it may decide to use that index, and the query plan changes. Any schema change to the tables or views involved in the query will also invalidate an existing plan and result in a new plan being generated.

Need help with SQL query on SQL Server 2005

We're seeing strange behavior when running two versions of a query on SQL Server 2005:
version A:
SELECT otherattributes.* FROM listcontacts JOIN otherattributes
ON listcontacts.contactId = otherattributes.contactId WHERE listcontacts.listid = 1234
ORDER BY name ASC
version B:
DECLARE @Id AS INT;
SET @Id = 1234;
SELECT otherattributes.* FROM listcontacts JOIN otherattributes
ON listcontacts.contactId = otherattributes.contactId
WHERE listcontacts.listid = @Id
ORDER BY name ASC
Both queries return 1000 rows; version A takes on average 15s; version B on average takes 4s.
Could anyone help us understand the difference in execution times of these two versions of SQL?
If we invoke this query via named parameters using NHibernate, we see the following query via SQL Server profiler:
EXEC sp_executesql N'SELECT otherattributes.* FROM listcontacts JOIN otherattributes ON listcontacts.contactId = otherattributes.contactId WHERE listcontacts.listid = @id ORDER BY name ASC',
N'@id INT',
@id = 1234;
...and this tends to perform as badly as version A.
Try taking a look at the execution plan for your query. That should give you some more insight into how the query is executed.
I've not seen the execution plans, but I strongly suspect that they are different in these two cases. The issue that you are having is that in case A (the faster query) the optimiser knows the value that you are using for the list id (1234) and using a combination of the distribution statistics and the indexes chooses an optimal plan.
In the second case, the optimiser is not able to sniff the value of the ID and so produces a plan that would be acceptable for any passed in list id. And where I say acceptable I do not mean optimal.
So what can you do to improve the scenario? There are a couple of alternatives here:
1) Create a stored procedure to perform the query as below:
CREATE PROCEDURE Foo
    @Id INT
AS
SELECT otherattributes.* FROM listcontacts JOIN otherattributes
ON listcontacts.contactId = otherattributes.contactId WHERE listcontacts.listid = @Id
ORDER BY name ASC
GO
This will allow the optimiser to sniff the value of the input parameter when passed in and produce an appropriate execution plan for the first execution. Unfortunately it will cache that plan for reuse later, so unless you generally call the sproc with similarly selective values, this may not help you too much.
2) Create a stored procedure as above, but specify it to be WITH RECOMPILE. This will ensure that the stored procedure is recompiled each time it is executed and hence produce a new plan optimised for this input value
3) Add OPTION (RECOMPILE) to the end of the SQL Statement. Forces recompilation of this statement, and is able to optimise for the input value
4) Add OPTION (OPTIMIZE FOR (@Id = 1234)) to the end of the SQL statement. This will cause the plan that gets cached to be optimised for this specific input value. Great if this is a highly common value, or if most common values are similarly selective, but not so great if the distribution of selectivity is more widely spread.
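For example, options 3 and 4 applied to version B look like this (the same query as above, with the hint appended):

DECLARE @Id AS INT;
SET @Id = 1234;

SELECT otherattributes.* FROM listcontacts JOIN otherattributes
ON listcontacts.contactId = otherattributes.contactId
WHERE listcontacts.listid = @Id
ORDER BY name ASC
OPTION (RECOMPILE);                   -- option 3: fresh plan on every execution

SELECT otherattributes.* FROM listcontacts JOIN otherattributes
ON listcontacts.contactId = otherattributes.contactId
WHERE listcontacts.listid = @Id
ORDER BY name ASC
OPTION (OPTIMIZE FOR (@Id = 1234));   -- option 4: cached plan tuned for 1234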
It's possible that instead of casting 1234 to the same type as listcontacts.listid and then doing the comparison once, it is casting the value in each row to the same type as 1234. The first requires just one cast; the second needs a cast per row (and that's probably on far more than 1000 rows; it may be for every row in the table). I'm not sure what type that constant will be interpreted as, but it may be 'numeric' rather than 'int'.
If this is the cause, the second version is faster because it's forcing 1234 to be interpreted as an int and thus removing the need to cast the value in every row.
However, as the previous poster suggests, the query plan shown in SQL Server Management Studio may indicate an alternative explanation.
The best way to see what is happening is to compare the execution plans, everything else is speculation based on the limited details presented in the question.
To see the execution plan, go into SQL Server Management Studio and run SET SHOWPLAN_XML ON, then run query version A; the query will not actually execute, but the execution plan will be displayed as XML. Then run query version B and see its execution plan. If you still can't tell the difference or solve the problem, post both execution plans and someone here will explain it.
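Concretely (each SET must be alone in its batch, hence the GO separators):

SET SHOWPLAN_XML ON;
GO
SELECT otherattributes.* FROM listcontacts JOIN otherattributes
ON listcontacts.contactId = otherattributes.contactId
WHERE listcontacts.listid = 1234   -- version A; repeat with the variable for version B
ORDER BY name ASC;
GO
SET SHOWPLAN_XML OFF;
GO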

strange SQL server report performance problem related with update statistics

I have a complex report using Reporting Services. The report connects to a SQL 2005 database and calls a number of stored procedures and functions. It worked OK initially, but after a few months (as the data grew) it started running into timeout errors.
I created a few indexes to improve the performance, and the strange thing is that it worked right after the indexes were created but threw the same error the next day. Then I tried updating the statistics on the database, and it worked again (the running time of the query improved 10 times). But again, it stopped working the next day.
For now, the temporary solution is that I run the statistics update every hour. But I can't find a reasonable explanation for this behaviour. The database is not very busy; there isn't a lot of data being updated in one day. How can updating statistics make so much difference?
I suspect you have parameter sniffing. Updating statistics merely forces all query plans to be discarded, so it appears to work for a time.
CREATE PROC dbo.MyReport
    @SignatureParam varchar(10),
    ...
AS
    ...
    DECLARE @MaskedParam varchar(10), ...
    SELECT @MaskedParam = @SignatureParam, ...

    SELECT ... WHERE column = @MaskedParam AND ...
    ...
GO
I've seen this problem when the indexes on the underlying tables need to be adjusted or the SQL needs work.
Rebuilding the index and updating the statistics read the table into the cache, which improves performance. The next day, the table has been flushed out of the cache and the performance problems return.
SQL Profiler is very useful in these situations to identify what changes from run to run.

add SQL Server index but how to recompile only affected stored procedures?

I need to add an index to a table, and I want to recompile only/all the stored procedures that make reference to this table. Is there any quick and easy way?
EDIT:
from SQL Server 2005 Books Online, Recompiling Stored Procedures:
As a database is changed by such actions as adding indexes or changing data in indexed columns, the original query plans used to access its tables should be optimized again by recompiling them. This optimization happens automatically the first time a stored procedure is run after Microsoft SQL Server 2005 is restarted. It also occurs if an underlying table used by the stored procedure changes. But if a new index is added from which the stored procedure might benefit, optimization does not happen until the next time the stored procedure is run after Microsoft SQL Server is restarted. In this situation, it can be useful to force the stored procedure to recompile the next time it executes
Another reason to force a stored procedure to recompile is to counteract, when necessary, the "parameter sniffing" behavior of stored procedure compilation. When SQL Server executes stored procedures, any parameter values used by the procedure when it compiles are included as part of generating the query plan. If these values represent the typical ones with which the procedure is called subsequently, then the stored procedure benefits from the query plan each time it compiles and executes. If not, performance may suffer
You can execute sp_recompile and supply the name of the table you've just indexed. All procs that depend on that table will be flushed from the stored proc cache and be "compiled" the next time they are executed.
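For example (the table name is a placeholder):

EXEC sp_recompile N'dbo.YourTable';   -- marks every proc/trigger referencing the table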
See this from the msdn docs:
sp_recompile (Transact-SQL)
They are generally recompiled automatically. I guess I don't know if this is guaranteed, but it is what I have observed: if you change the objects referenced by the sproc (e.g. add an index), then it recompiles.
create table mytable (i int identity)
insert mytable default values
go 100
create proc sp1 as select * from mytable where i = 17
go
exec sp1
If you look at the plan for this execution, it shows a table scan as expected.
create index mytablei on mytable(i)
exec sp1
The plan has changed to an index seek.
EDIT: OK, I came up with a query that appears to work. This gives you all sproc names that have a reference to a given table in the plan cache. You can concatenate the sproc name with the sp_recompile syntax to generate a bunch of sp_recompile statements you can then execute (see the variation after the query).
;WITH XMLNAMESPACES (default 'http://schemas.microsoft.com/sqlserver/2004/07/showplan'),
TableRefs (SProcName, ReferencedTableName) as
(
    select
        object_name(qp.objectid) as SProcName,
        objNodes.objNode.value('@Database', 'sysname') + '.' +
        objNodes.objNode.value('@Schema', 'sysname') + '.' +
        objNodes.objNode.value('@Table', 'sysname') as ReferencedTableName
    from sys.dm_exec_cached_plans cp
    outer apply sys.dm_exec_sql_text(cp.plan_handle) st
    outer apply sys.dm_exec_query_plan(cp.plan_handle) as qp
    outer apply qp.query_plan.nodes('//Object[@Table]') as objNodes(objNode)
    where cp.cacheobjtype = 'Compiled Plan'
      and cp.objtype = 'Proc'
)
select *
from TableRefs
where SProcName is not null
  and isnull(ReferencedTableName,'') = '[db].[schema].[table]'
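To generate the statements directly, you could swap that final SELECT for something like this (a sketch against the same TableRefs CTE and placeholder table name):

select distinct
    'EXEC sp_recompile N''' + SProcName + ''';'   -- one ready-to-run statement per proc
from TableRefs
where SProcName is not null
  and isnull(ReferencedTableName,'') = '[db].[schema].[table]'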
I believe that the stored procedures that would potentially benefit from the presence of the index in question will automatically have a new query plan generated, provided the auto generate statistics option has been enabled.
See the section entitled Recompiling Execution Plans for details of what eventualities cause an automatic recompilation.
http://technet.microsoft.com/en-us/library/ms181055(SQL.90).aspx