Increased Execution Duration of Procedure When Using Variables in WHERE Clause

I have a procedure running on SQL Server 2008 R2; the script is:
DECLARE @LocalVar SMALLINT = GetLocalVarFunction();
SELECT
[TT].[ID],
[TT].[Title]
FROM [TargetTable] AS [TT]
LEFT JOIN [AcceccTable] AS [AT] ON [AT].[AccessID] = [TT].[ID]
WHERE
(
(@LocalVar = 1) AND
([AT].[Access] = 0 OR [AT].[Access] Is Null) AND
([TT].[Level] > 7)
);
GO
This procedure executes in 16 seconds.
But when I change the WHERE clause to:
WHERE
(
(1 = 1) AND
([AT].[Access] = 0 OR [AT].[Access] Is Null) AND
([TT].[Level] > 7)
);
The procedure executes in less than 1 second.
As you can see, I just removed the local variable.
So where is the problem? Is there anything I am missing about using local variables in a WHERE clause? Any suggestions for improving execution time when I use a local variable in the WHERE clause?
Update:
I also thought about adding an IF statement before the script and splitting the procedure into 2 procedures, but I have 4 or 5 variables like the one above, and using IF statements for all of them gets very complex.
Update2:
I changed the assignment of @LocalVar to:
DECLARE @LocalVar SMALLINT = 1;
There is no change in execution time.

When you use local variables in a WHERE filter, it tends to cause a full table scan. The value of the local variable is not known to SQL Server at compile time, so SQL Server builds the execution plan from a generic estimate for that column rather than for the actual value.
As you have seen, when you write 1=1 SQL Server knows the value and the performance is not degraded. But the moment you use a local variable, the value is unknown at compile time.
One solution may be to add OPTION (RECOMPILE) at the end of your SQL query.
You can also check out the OPTIMIZE FOR UNKNOWN query hint.
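For illustration, a sketch of both hints applied to the query from the question (the literal 1 stands in for the function call; only one OPTION clause can be used at a time):
DECLARE @LocalVar SMALLINT = 1;
SELECT
[TT].[ID],
[TT].[Title]
FROM [TargetTable] AS [TT]
LEFT JOIN [AcceccTable] AS [AT] ON [AT].[AccessID] = [TT].[ID]
WHERE
(@LocalVar = 1) AND
([AT].[Access] = 0 OR [AT].[Access] Is Null) AND
([TT].[Level] > 7)
OPTION (RECOMPILE);
-- or keep a single cached plan built for an "unknown" value:
-- OPTION (OPTIMIZE FOR (@LocalVar UNKNOWN));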

When you use a local variable in a WHERE clause, the optimizer doesn't know its value and can't optimize for it.
You may check this link.
What you could do in your case is run the query with the actual execution plan displayed in both cases and see how SQL Server is treating them.

It seems that you are using @LocalVar as a branch condition, as follows:
If @LocalVar is 1, then apply a filter to the query.
If @LocalVar is 0, then return an empty result set.
IMO you would be better off writing this condition explicitly, as then SQL Server will be in a position to optimize separate plans for the 2 branches, i.e.
DECLARE @LocalVar SMALLINT = GetLocalVarFunction();
IF (@LocalVar = 1)
SELECT
[TT].[ID],
[TT].[Title]
FROM [TargetTable] AS [TT]
LEFT JOIN [AcceccTable] AS [AT] ON [AT].[AccessID] = [TT].[ID]
WHERE
(
([AT].[Access] = 0 OR [AT].[Access] Is Null) AND
([TT].[Level] > 7)
)
ELSE
SELECT
[TT].[ID],
[TT].[Title]
FROM [TargetTable] AS [TT]
WHERE 1=2 -- Or any invalid filter, to retain the empty result
And then, because there are now 2 branches through your stored procedure with radically different query plans, you should add WITH RECOMPILE to the stored proc.
Edit
Just to clarify the comments:
Note that placing OPTION(RECOMPILE) after a query means that the query plan is never cached - this might not be a good idea if your query is called frequently.
WITH RECOMPILE at the proc level stops the proc's plan from being cached at all, so each call is compiled for the branch actually taken. It is not the same as OPTION(RECOMPILE) at query level.
If there are a large number of permutations of filter in your query, then the 'branching' technique above doesn't scale very well - your code quickly becomes unmaintainable.
You might unfortunately then need to consider using parameterized dynamic SQL. SQL will then at least cache a separate plan for each permutation.
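A rough sketch of that idea, using sp_executesql and the names from the question (the SELECT list is abbreviated, and @LocalVar here is just a stand-in for the function call):
DECLARE @LocalVar SMALLINT = 1;
DECLARE @sql NVARCHAR(MAX) = N'SELECT [TT].[ID], [TT].[Title]
FROM [TargetTable] AS [TT]
LEFT JOIN [AcceccTable] AS [AT] ON [AT].[AccessID] = [TT].[ID]
WHERE ([TT].[Level] > 7)';
IF @LocalVar = 1
SET @sql = @sql + N' AND ([AT].[Access] = 0 OR [AT].[Access] Is Null)';
-- each distinct statement text gets its own cached plan
EXEC sp_executesql @sql;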

Related

SQL: short-circuiting not working. "Null or empty full-text predicate" after upgrading to SQL Server 2012

I have the following query in SQL Server 2005 which works fine:
DECLARE @venuename NVARCHAR(100)
DECLARE @town NVARCHAR(100)
SET @venuename = NULL -- normally these are parameters in the stored proc.
SET @town = 'London'
SELECT COUNT(*) FROM dbo.Venue
WHERE
(@VenueName IS NULL OR CONTAINS((Venue.VenueName), @VenueName))
AND
(@Town IS NULL OR Town LIKE @Town + '%')
It uses short-circuiting when null values are passed for the parameters (there are many more in the real SP than shown in my example).
However, after upgrading to SQL 2012, running this query with NULL passed for @VenueName fails with the error "Null or empty full-text predicate", as SQL Server seems to be running (or evaluating) the CONTAINS predicate for @VenueName even when @VenueName is set to NULL.
Is there a way to use short-circuiting in 2012 or is this no longer possible? I'd hate to have to rewrite all of my SPs as we've used this technique in dozens of stored procedures across multiple projects over the years.
I do not know much about SQL 2012, but can you please try the following:
DECLARE @venuename NVARCHAR(100)
DECLARE @town NVARCHAR(100)
SET @venuename = '""' -- **Yes, '""' instead of NULL**
SET @town = 'London'
SELECT COUNT(*) FROM dbo.Venue
WHERE
(@VenueName = '""' OR CONTAINS((Venue.VenueName), @VenueName))
AND
(@Town IS NULL OR Town LIKE @Town + '%')
Check out this thread: OR Operator Short-circuit in SQL Server. Within SQL Server there is no guarantee that an OR clause breaks early; it has always been that way, so I guess you've just been lucky that it worked in SQL Server 2005.
To work around your problem, consider using the ISNULL function every time you supply a parameter value that might be NULL to the CONTAINS function.
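One way to apply that, combined with the '""' sentinel from the answer above (a sketch, not verified against 2012):
DECLARE @venuename NVARCHAR(100)
DECLARE @town NVARCHAR(100)
SET @venuename = NULL
SET @town = 'London'
-- normalise the parameter up front so CONTAINS never receives NULL
SET @venuename = ISNULL(@venuename, N'""')
SELECT COUNT(*) FROM dbo.Venue
WHERE
(@venuename = N'""' OR CONTAINS((Venue.VenueName), @venuename))
AND
(@town IS NULL OR Town LIKE @town + '%')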
Let's examine these two statements:
IF (CONDITION 1) OR (CONDITION 2)
..
IF (CONDITION 3) AND (CONDITION 4)
...
If CONDITION 1 is TRUE, will CONDITION 2 be checked?
If CONDITION 3 is FALSE, will CONDITION 4 be checked?
What about conditions on WHERE: does the SQL Server engine optimize all conditions in a WHERE clause? Should programmers place conditions in the right order to be sure that the SQL Server optimizer resolves it in the right manner?
ADDED:
Thanks to Jack for the link; a surprise from the T-SQL code:
IF 1/0 = 1 OR 1 = 1
SELECT 'True' AS result
ELSE
SELECT 'False' AS result
IF 1/0 = 1 AND 1 = 0
SELECT 'True' AS result
ELSE
SELECT 'False' AS result
A divide-by-zero exception is not raised in either case.
CONCLUSION:
If C++/C#/VB has short-circuiting why can't SQL Server have it?
To truly answer this let's take a look at how both work with conditions. C++/C#/VB all have short circuiting defined in the language specifications to speed up code execution. Why bother evaluating N OR conditions when the first one is already true or M AND conditions when the first one is already false.
We as developers have to be aware that SQL Server works differently. It is a cost-based system. To get the optimal execution plan for our query, the query processor has to evaluate every WHERE condition and assign it a cost. These costs are combined into an overall plan cost, which is compared against the threshold SQL Server uses to decide whether a plan is good enough. If the cost is lower than the threshold the plan is used; if not, the whole process is repeated with a different mix of condition costs. Cost here means a scan, a seek, a merge join, a hash join, etc. Because of this, short-circuiting as it is available in C++/C#/VB simply isn't possible. You might think that forcing the use of an index on a column counts as short-circuiting, but it doesn't: it only forces the use of that index and thereby shortens the list of possible execution plans. The system is still cost based.
As a developer you must be aware that SQL Server does not do short-circuiting like it is done in other programming languages and there's nothing you can do to force it to.

SQL Server 2008, different WHERE clauses with one query

I have a stored procedure which selects the same columns but with different WHERE clauses.
Something like this.
SELECT
alarms.startt, alarms.endt, clients.code, clients.Plant,
alarms.controller, alarmtype.atype, alarmstatus.[text]
FROM alarms
INNER JOIN clients ON alarms.clientid = clients.C_id
INNER JOIN alarmstatus ON alarms.statusid = alarmstatus.AS_id
INNER JOIN alarmtype ON alarms.typeid = alarmtype.AT_id
and I put the same query in 3 IFs (conditions), where the WHERE clause changes according to the parameter passed in a variable.
Do I have to write the whole query over and over for each condition in every IF?
Or can I write it once, with only the WHERE clause changing?
You don't have to, you can get around it by doing something like
SELECT *
FROM [Query]
WHERE (@Parameter = 1 AND Column1 = 8)
OR (@Parameter = 2 AND Column2 = 8)
OR (@Parameter = 3 AND Column3 = 8)
However, just because you can do something, does not mean you should. Less verbose SQL does not mean better performance, so using something like:
IF @Parameter = 1
BEGIN
SELECT *
FROM [Query]
WHERE Column1 = 8
END
ELSE IF @Parameter = 2
BEGIN
SELECT *
FROM [Query]
WHERE Column2 = 8
END
ELSE IF @Parameter = 3
BEGIN
SELECT *
FROM [Query]
WHERE Column3 = 8
END
while equivalent to the first query, should result in better performance, as it will be optimized better.
You can avoid repeating the code if you do something like:
WHERE (col1 = @var1 AND @var1 IS NOT NULL)
OR ...
OPTION (RECOMPILE);
You can also have some effect on this behavior with the parameterization setting of the database (simple vs. forced).
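For reference, that setting is changed per database like this (the database name is illustrative):
ALTER DATABASE [YourDatabase] SET PARAMETERIZATION FORCED;
-- or back to the default:
ALTER DATABASE [YourDatabase] SET PARAMETERIZATION SIMPLE;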
Something that avoids repeating the code and avoids sub-optimal plans due to parameter sniffing is to use dynamic SQL:
DECLARE @sql NVARCHAR(MAX) = N'SELECT ...';
IF @var1 IS NOT NULL
SET @sql = @sql + ' WHERE ...';
This may work better if you have the server setting "optimize for ad hoc workloads" enabled.
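That option is an advanced server setting, so enabling it looks like this:
EXEC sp_configure 'show advanced options', 1;
RECONFIGURE;
EXEC sp_configure 'optimize for ad hoc workloads', 1;
RECONFIGURE;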
I would probably stick with repeating the whole SQL Statement, but have resorted to this in the past...
WHERE (@whichwhere=1 AND mytable.field1=@id)
OR (@whichwhere=2 AND mytable.field2=@id)
OR (@whichwhere=3 AND mytable.field3=@id)
Not particularly readable, and you will have to check the execution plan if it is slow, but it keeps you from repeating the code.
Since no one has suggested this: you can put the original query in a view and then access the view with different WHERE clauses.
To improve performance, you can even add indexes to the view if you know what columns will be commonly used in the WHERE clause (check out http://msdn.microsoft.com/en-us/library/dd171921(v=sql.100).aspx).
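A minimal sketch of that idea (the view name and the @Controller parameter are purely illustrative):
CREATE VIEW dbo.vw_AlarmReport
AS
SELECT
alarms.startt, alarms.endt, clients.code, clients.Plant,
alarms.controller, alarmtype.atype, alarmstatus.[text]
FROM alarms
INNER JOIN clients ON alarms.clientid = clients.C_id
INNER JOIN alarmstatus ON alarms.statusid = alarmstatus.AS_id
INNER JOIN alarmtype ON alarms.typeid = alarmtype.AT_id;
GO
-- each branch of the proc then only has to vary the WHERE clause:
SELECT * FROM dbo.vw_AlarmReport WHERE controller = @Controller;
Indexing the view is a further step (it requires SCHEMABINDING and a unique clustered index); see the link above.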
Well, like most things in SQL: it depends. There are a few considerations here.
Would the different WHEREs lead to substantially different query plans, e.g. one of the columns indexed but not the other two?
Is the query likely to change over time, i.e. customer requirements needing other columns?
Is the WHERE likely to become 4, then 8, then 16, etc. options?
One approach is to exec different procs into a temp table. Each proc would then have its own query plan.
Another approach would be to use dynamic SQL; once again, each "query" would be assigned its own plan.
A third approach would be to write an app that generates the SQL for each option; this could be either a stored proc or a SQL string.
Then have a data set and do test driven development against it (this is true for each approach).
In the end the best learning solution is probably to:
a) read up on SQL internals; Kalen Delaney (Inside SQL Server) is an acknowledged expert, and
b) test your own solutions against your own data.
I would go this way:
WHERE 8 = CASE @parameter
WHEN 1 THEN Column1
WHEN 2 THEN Column2
...
END

How to structure a query with a large, complex where clause?

I have an SQL query that takes these parameters:
@SearchFor nvarchar(200) = null
,@SearchInLat Decimal(18,15) = null
,@SearchInLng Decimal(18,15) = null
,@SearchActivity int = null
,@SearchOffers bit = null
,@StartRow int
,@EndRow int
The variables @SearchFor, @SearchActivity, and @SearchOffers can be either null or not null. @SearchInLat and @SearchInLng must both be null, or both have values.
I'm not going to post the whole query as it's boring and hard to read, but the WHERE clause is shaped like this:
( -- filter by activity --
(@SearchActivity IS NULL)
OR (@SearchActivity = Activities.ActivityID)
)
AND ( -- filter by Location --
(@SearchInLat is NULL AND @SearchInLng is NULL)
OR ( ... )
)
AND ( -- filter by activity --
@SearchActivity is NULL
OR ( ... )
)
AND ( -- filter by has offers --
@SearchOffers is NULL
OR ( ... )
)
AND (
... -- more stuff
)
I have read that this is a bad way to structure a query - that SQL Server has trouble working out an efficient execution plan with lots of clauses like this - so I'm looking for other ways to do it.
I see two ways of doing this:
Construct the query as a string in my client application, so that the WHERE clause only contains filters for the relevant parameters. The problem with this is it means not accessing the database through stored procedures, as everything else is at the moment.
Change the stored procedure so that it examines which arguments are null, and executes child procedures depending on which arguments it is passed. The problem here is that it would mean repeating myself a lot in the definition of the procs, and thus be harder to maintain.
What should I do? Or should I just keep on as I am currently doing? I have OPTION (RECOMPILE) set for the procedures, but I've heard that this doesn't work right in SQL Server 2005. Also, I plan to add more parameters to this proc, so I want to make sure whatever solution I have is fairly scalable.
The answer is to use dynamic SQL (be it in the client, or in an SP using sp_executesql), but the reason why is long, so here's a link...
Dynamic Search Conditions in T-SQL
A very short version is that one-size does not fit all. And as the optimiser creates one plan for one query, it's slow. So the solution is to continue using parameterised queries (for execution plan caching), but to have many queries, for the different types of search that can happen.
Perhaps an alternative might be to perform several separate select statements?
e.g.
-- filter by activity --
if @SearchActivity is not null
insert into tmpTable (<columns>)
select *
from myTable
where (@SearchActivity = Activities.ActivityID)

-- filter by Location --
if @SearchInLat is not null and @SearchInLng is not null
insert into tmpTable (<columns>)
select *
from myTable
where (latCol = @SearchInLat AND lngCol = @SearchInLng)
etc...
then select the temp table to return the final result set.
I'm not sure how this would work with respect to the optimiser and the query plans, but each individual select would be very straightforward and could utilise the indexes that you would have created on each column which should make them very quick.
Depending on your requirements it also may make sense to create a primary key on the temp table to allow you to join to it on each select (to avoid duplicates).
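For example, the temp table could be keyed on the main table's ID to prevent duplicates when a row matches more than one criterion (a sketch; it is written here as a local temp table, and ID is assumed to be myTable's key):
CREATE TABLE #tmpTable
(
ID int NOT NULL PRIMARY KEY -- assumed key column of myTable
-- plus whichever columns you need to return
);
-- each IF-guarded INSERT above then skips rows already collected, e.g.
-- ... AND myTable.ID NOT IN (SELECT ID FROM #tmpTable)
SELECT * FROM #tmpTable; -- final result set
DROP TABLE #tmpTable;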
Look at the performance first, like others have said.
If possible, you can use IF clauses to simplify the queries based on what parameters are provided.
You could also use functions or views to encapsulate some of the code if you find you are repeating it often.

Performance implications of sql 'OR' conditions when one alternative is trivial?

I'm creating a stored procedure for searching some data in my database according to some criteria input by the user.
My sql code looks like this:
Create Procedure mySearchProc
(
@IDCriteria bigint=null,
...
@MaxDateCriteria datetime=null
)
as
select Col1,...,Coln from MyTable
where (@IDCriteria is null or ID=@IDCriteria)
...
and (@MaxDateCriteria is null or Date<@MaxDateCriteria)
Edit: I have around 20 possible parameters, and any combination of n non-null parameters can happen.
Is it ok performance-wise to write this kind of code? (I'm using MS SQL Server 2008)
Would generating SQL code containing only the needed where clauses be notably faster?
OR clauses are notorious for causing performance issues, mainly because they tend to force scans rather than index seeks. If you can write the query without ORs you'll be better off.
where (@IDCriteria is null or ID=@IDCriteria)
and (@MaxDateCriteria is null or Date<@MaxDateCriteria)
If you write these criteria, SQL Server will not know whether it is better to use the index on IDs or the index on Dates.
For proper optimization, it is far better to write separate queries for each case and use IF to guide you to the correct one.
IF @IDCriteria is not null and @MaxDateCriteria is not null
--query
WHERE ID = @IDCriteria and Date < @MaxDateCriteria
ELSE IF @IDCriteria is not null
--query
WHERE ID = @IDCriteria
ELSE IF @MaxDateCriteria is not null
--query
WHERE Date < @MaxDateCriteria
ELSE
--query
WHERE 1 = 1
If you expect to need different plans out of the optimizer, you need to write different queries to get them!!
Would generating SQL code containing only the needed where clauses be notably faster?
Yes - if you expect the optimizer to choose between different plans.
Edit:
DECLARE @CustomerNumber int, @CustomerName varchar(30)
SET @CustomerNumber = 123
SET @CustomerName = '123'
SELECT * FROM Customers
WHERE (CustomerNumber = @CustomerNumber OR @CustomerNumber is null)
AND (CustomerName = @CustomerName OR @CustomerName is null)
CustomerName and CustomerNumber are indexed. The optimizer says: "Clustered Index Scan with parallelization". You can't write a worse single-table query.
Edit: I have around 20 possible parameters, and any combination of n non-null parameters can happen.
We had a similar "search" functionality in our database. When we looked at the actual queries issued, 99.9% of them used an AccountIdentifier. In your case, I suspect either one column is always supplied, or one of two columns is always supplied. This would lead to 2 or 3 cases respectively.
It's not important to remove ORs from the whole structure. It is important to remove ORs from the column(s) that you expect the optimizer to use to access the indexes.
So, to boil down the above comments:
Create a separate sub-procedure for each of the most popular combinations of parameters, and within a dispatcher procedure call the appropriate one from an IF ELSE structure, the final ELSE clause of which builds a query dynamically to cover the remaining cases.
Perhaps only one or two cases will be specifically coded at first, but as time goes by and particular combinations of parameters are identified as statistically significant, implementation procedures can be written and the master IF ELSE construct extended to identify those cases and call the appropriate sub-procedure.
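In outline, the dispatcher might look like this (procedure names and the chosen "popular" cases are purely illustrative):
CREATE PROCEDURE dbo.usp_Search
@SearchActivity int = NULL,
@SearchOffers bit = NULL
-- ...remaining parameters...
AS
BEGIN
IF @SearchActivity IS NOT NULL AND @SearchOffers IS NULL
EXEC dbo.usp_Search_ByActivity @SearchActivity;
ELSE IF @SearchOffers IS NOT NULL AND @SearchActivity IS NULL
EXEC dbo.usp_Search_ByOffers @SearchOffers;
ELSE
EXEC dbo.usp_Search_Generic @SearchActivity, @SearchOffers; -- dynamic SQL inside
END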
Regarding "Would generating SQL code containing only the needed where clauses be notably faster?"
I don't think so, because this way you effectively remove the positive effects of query plan caching.
You could run selective queries, in order of the most common / most efficient (indexed, etc.) parameters, and add the matching PK(s) to a temporary table.
That would create a (hopefully small!) subset of data.
Then join that temporary table to the main table, using a full WHERE clause in the style of
SELECT ...
FROM #TempTable AS T
JOIN dbo.MyTable AS M
ON M.ID = T.ID
WHERE (@IDCriteria IS NULL OR M.ID=@IDCriteria)
...
AND (@MaxDateCriteria IS NULL OR M.Date<@MaxDateCriteria)
to refine the (small) subset.
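A sketch of the first step that feeds that join (treating @IDCriteria as the most selective indexed criterion is an assumption):
CREATE TABLE #TempTable (ID bigint NOT NULL PRIMARY KEY);
IF @IDCriteria IS NOT NULL
INSERT INTO #TempTable (ID)
SELECT ID FROM dbo.MyTable WHERE ID = @IDCriteria;
ELSE IF @MaxDateCriteria IS NOT NULL
INSERT INTO #TempTable (ID)
SELECT ID FROM dbo.MyTable WHERE Date < @MaxDateCriteria;
-- ...fall through to other selective criteria, then run the SELECT/JOIN shown above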
What if constructs like these were replaced:
WHERE (@IDCriteria IS NULL OR @IDCriteria=ID)
AND (@MaxDateCriteria IS NULL OR Date<@MaxDateCriteria)
AND ...
with ones like these:
WHERE ID = ISNULL(@IDCriteria, ID)
AND Date < ISNULL(@MaxDateCriteria, DATEADD(millisecond, 1, Date))
AND ...
or is this just coating the same unoptimizable query in syntactic sugar?
Choosing the right index is hard for the optimizer. IMO, this is one of few cases where dynamic SQL is the best option.
This is one of the cases where I use code building or a sproc for each search option.
Since your search is so complex, I'd go with code building.
You can do this either in application code or with dynamic SQL.
Just be careful of SQL injection.
I suggest one step further than some of the other suggestions - think about degeneralizing at a much higher abstraction level, preferably the UI structure. Usually this seems to happen when the problem is being pondered in data mode rather than user domain mode.
In practice, I've found that almost every such query has one or more non-null, fairly selective columns that would be reasonably optimizable, if one (or more) were specified. Furthermore, these are usually reasonable assumptions that users can understand.
Example: Find Orders by Customer; or Find Orders by Date Range; or Find Orders By Salesperson.
If this pattern applies, then you can decompose your hypergeneralized query into more purposeful subqueries that also make sense to users, and you can reasonably prompt for required values (or ranges), and not worry too much about crafting efficient expressions for subsidiary columns.
You may still end up with an "All Others" category. But at least then if you provide what is essentially an open-ended Query By Example form, then users will have some idea what they're getting into. Doing what you describe really puts you in the role of trying to out-think the query optimizer, which is folly IMHO.
I'm currently working with SQL 2005, so I don't know if the 2008 optimizer acts differently. That being said, I've found that you need to do a couple of things...
Make sure that the query is recompiled on each execution (OPTION (RECOMPILE) on the query, or WITH RECOMPILE on the procedure).
Use CASE statements to cause short-circuiting of the logic. At least in 2005 this is NOT done with OR statements. For example:
SELECT
...
FROM
...
WHERE
(1 =
CASE
WHEN @my_column IS NULL THEN 1
WHEN my_column = @my_column THEN 1
ELSE 0
END
)
The CASE statement will cause the SQL Server optimizer to recognize that it doesn't need to continue past the first WHEN. In this example it's not a big deal, but in my search procs a non-null parameter often meant searching in another table through a subquery for existence of a matching row, which got costly. Once I made this change the search procs started running much faster.
My suggestion is to build the SQL string dynamically. You will gain maximum performance from indexes and still reuse execution plans.
DECLARE @sql nvarchar(4000);
SET @sql = N''
IF @param1 IS NOT NULL
SET @sql = @sql + CASE WHEN @sql = N'' THEN N'' ELSE N' AND ' END + N'param1 = @param1';
IF @param2 IS NOT NULL
SET @sql = @sql + CASE WHEN @sql = N'' THEN N'' ELSE N' AND ' END + N'param2 = @param2';
...
IF @paramN IS NOT NULL
SET @sql = @sql + CASE WHEN @sql = N'' THEN N'' ELSE N' AND ' END + N'paramN = @paramN';
IF @sql <> N''
SET @sql = N' WHERE ' + @sql;
SET @sql = N'SELECT ... FROM myTable' + @sql;
EXEC sp_executesql @sql, N'@param1 type, @param2 type, ..., @paramN type', @param1, @param2, ..., @paramN;
Each time the procedure is called, passing different parameters, there is a different optimal execution plan for getting the data. The problem being, that SQL has cached an execution plan for your procedure and will use a sub-optimal (read terrible) execution plan.
I would recommend:
Create specific SPs for frequently run execution paths (i.e. passed parameter sets) optimised for each scenario.
Keep your main generic SP for edge cases (presuming they are rarely run), but use the WITH RECOMPILE clause to cause a new execution plan to be created each time the procedure is run.
We use OR clauses checking against NULLs for optional parameters to great effect. It works very well without the RECOMPILE option so long as the execution path is not drastically altered by passing different parameters.
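The second recommendation, in outline (the body is just the generic query from the question, and the procedure name is illustrative):
CREATE PROCEDURE dbo.mySearchProc_Generic
(
@IDCriteria bigint = null,
-- ...the other optional parameters...
@MaxDateCriteria datetime = null
)
WITH RECOMPILE -- fresh plan every call; acceptable only because these edge cases run rarely
AS
SELECT Col1, Coln FROM MyTable
WHERE (@IDCriteria IS NULL OR ID = @IDCriteria)
AND (@MaxDateCriteria IS NULL OR Date < @MaxDateCriteria);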

Stored procedure bit parameter activating additional where clause to check for null

I have a stored procedure that looks like:
CREATE PROCEDURE dbo.usp_TestFilter
@AdditionalFilter BIT = 1
AS
SELECT *
FROM dbo.SomeTable T
WHERE
T.Column1 IS NOT NULL
AND CASE WHEN @AdditionalFilter = 1 THEN
T.Column2 IS NOT NULL
Needless to say, this doesn't work. How can I activate the additional WHERE condition that checks the @AdditionalFilter parameter? Thanks for any help.
CREATE PROCEDURE dbo.usp_TestFilter
@AdditionalFilter BIT = 1
AS
SELECT *
FROM dbo.SomeTable T
WHERE
T.Column1 IS NOT NULL
AND (@AdditionalFilter = 0 OR
T.Column2 IS NOT NULL)
If @AdditionalFilter is 0, the column won't be evaluated, since it can't affect the outcome of the part between the brackets. If it's anything other than 0, the column condition will be evaluated.
This practice tends to confuse the query optimizer. I've seen SQL Server 2000 build the execution plan exactly the opposite way round and use an index on Column1 when the flag was set and vice-versa. SQL Server 2005 seemed to at least get the execution plan right on first compilation, but you then have a new problem. The system caches compiled execution plans and tries to reuse them. If you first use the query one way, it will still execute the query that way even if the extra parameter changes, and different indexes would be more appropriate.
You can force a stored procedure to be recompiled on this execution by using WITH RECOMPILE in the EXEC statement, or every time by specifying WITH RECOMPILE on the CREATE PROCEDURE statement. There will be a penalty as SQL Server re-parses and optimizes the query each time.
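For reference, the two forms mentioned, using the proc from the question:
-- recompile just this one execution:
EXEC dbo.usp_TestFilter @AdditionalFilter = 0 WITH RECOMPILE;
-- or recompile on every execution, declared on the proc itself:
-- CREATE PROCEDURE dbo.usp_TestFilter @AdditionalFilter BIT = 1
-- WITH RECOMPILE
-- AS ...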
In general, if the form of your query is going to change, use dynamic SQL generation with parameters. SQL Server will also cache execution plans for parameterized queries and auto-parameterized queries (where it tries to deduce which arguments are parameters), and even regular queries, but it gives most weight to stored procedure execution plans, then parameterized, auto-parameterized and regular queries in that order. The higher the weight, the longer it can stay in RAM before the plan is discarded, if the server needs the memory for something else.
CREATE PROCEDURE dbo.usp_TestFilter
@AdditionalFilter BIT = 1
AS
SELECT *
FROM dbo.SomeTable T
WHERE
T.Column1 IS NOT NULL
AND (NOT @AdditionalFilter = 1 OR T.Column2 IS NOT NULL)
select *
from SomeTable t
where t.Column1 is not null
and (@AdditionalFilter = 0 or t.Column2 is not null)