T-SQL Conditional Query

Is there a difference in performance when querying something like this?
Query 1:
SELECT * FROM Customer WHERE Name=Name
Query 2:
SELECT * FROM Customer
I will use it in a conditional "select all":
SELECT * FROM Customer
WHERE Name = CASE WHEN @QueryAll='true' THEN Name ELSE @SearchValue END
If there's no performance issue in Query 1 and 2, I think it is a shorter way of writing this:
IF @QueryAll='true'
SELECT * FROM Customer
ELSE
SELECT * FROM Customer WHERE Name=@SearchValue

You should read Dynamic Search Conditions in T‑SQL by Erland Sommarskog.
If you use SQL Server 2008 or later, then use OPTION(RECOMPILE) and write the query like this:
SELECT *
FROM Customer
WHERE
(Name = @SearchValue OR @QueryAll='true')
OPTION (RECOMPILE);
I usually pass NULL for @SearchValue to indicate that this parameter should be ignored, rather than using a separate parameter @QueryAll. With this convention the query becomes this:
SELECT *
FROM Customer
WHERE
(Name = @SearchValue OR @SearchValue IS NULL)
OPTION (RECOMPILE);
Edit
For details see the link above. In short, OPTION(RECOMPILE) instructs SQL Server to recompile the execution plan of the query every time it is run, and not to cache the generated plan. Recompilation also means that the values of any variables are effectively inlined into the query, so the optimizer knows them.
So, if @SearchValue is NULL, the optimizer is smart enough to generate the plan as if the query were this:
SELECT *
FROM Customer
If @SearchValue has a non-NULL value 'abc', the optimizer is smart enough to generate the plan as if the query were this:
SELECT *
FROM Customer
WHERE (Name = 'abc')
The obvious drawback of OPTION(RECOMPILE) is the added recompilation overhead (usually around a few hundred milliseconds), which can be significant if you run the query very often.

Query 2 would be faster with an index on the Name column, and you should select only the columns you actually need, not all of them.
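As a rough sketch of that advice (the index name and the extra column are hypothetical, since the question only shows Name):
CREATE INDEX IX_Customer_Name ON Customer (Name);
-- select only the columns you actually need
SELECT CustomerId, Name
FROM Customer
WHERE Name = @SearchValue;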
For some guidance in optional parameter queries, take a look here: Sometimes the Simplest Solution Isn't the Best Solution (The Optional Parameter Problem)

Related

SQL Server 2008, different WHERE clauses with one query

I have a stored procedure which selects the same columns but with a different WHERE clause.
Something like this.
SELECT
alarms.startt, alarms.endt, clients.code, clients.Plant,
alarms.controller, alarmtype.atype, alarmstatus.[text]
FROM alarms
INNER JOIN clients ON alarms.clientid = clients.C_id
INNER JOIN alarmstatus ON alarms.statusid = alarmstatus.AS_id
INNER JOIN alarmtype ON alarms.typeid = alarmtype.AT_id
and I put the same query in 3 ifs (conditions) where the WHERE clause changes according to the parameter passed in a variable.
Do I have to write the whole string over and over for each condition in every if?
Or can I optimize it to one time and the only thing what will change will be the WHERE clause?
You don't have to; you can get around it by doing something like:
SELECT *
FROM [Query]
WHERE (@Parameter = 1 AND Column1 = 8)
OR (@Parameter = 2 AND Column2 = 8)
OR (@Parameter = 3 AND Column3 = 8)
However, just because you can do something, does not mean you should. Less verbose SQL does not mean better performance, so using something like:
IF @Parameter = 1
BEGIN
SELECT *
FROM [Query]
WHERE Column1 = 8
END
ELSE IF @Parameter = 2
BEGIN
SELECT *
FROM [Query]
WHERE Column2 = 8
END
ELSE IF @Parameter = 3
BEGIN
SELECT *
FROM [Query]
WHERE Column3 = 8
END
while equivalent to the first query, should result in better performance as it will be optimised better.
You can avoid repeating the code if you do something like:
WHERE (col1 = @var1 AND @var1 IS NOT NULL)
OR ...
OPTION (RECOMPILE);
You can also have some effect on this behavior with the parameterization setting of the database (simple vs. forced).
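For reference, that setting is changed per database; a minimal sketch (the database name is a placeholder):
ALTER DATABASE MyDatabase SET PARAMETERIZATION FORCED;
-- revert to the default behaviour
ALTER DATABASE MyDatabase SET PARAMETERIZATION SIMPLE;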
Something that avoids repeating the code and avoids sub-optimal plans due to parameter sniffing is to use dynamic SQL:
DECLARE @sql NVARCHAR(MAX) = N'SELECT ...';
IF @var1 IS NOT NULL
SET @sql = @sql + ' WHERE ...';
This may work better if you have the server setting "optimize for ad hoc workloads" enabled.
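Enabling that server setting is a one-off configuration step, roughly:
EXEC sp_configure 'show advanced options', 1;
RECONFIGURE;
EXEC sp_configure 'optimize for ad hoc workloads', 1;
RECONFIGURE;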
I would probably stick with repeating the whole SQL Statement, but have resorted to this in the past...
WHERE (@whichwhere=1 AND mytable.field1=@id)
OR (@whichwhere=2 AND mytable.field2=@id)
OR (@whichwhere=3 AND mytable.field3=@id)
Not particularly readable, and you will have to check the execution plan if it is slow, but it keeps you from repeating the code.
Since no one has suggested this: you can put the original query in a view and then access the view with different WHERE clauses.
To improve performance, you can even add indexes to the view if you know what columns will be commonly used in the WHERE clause (check out http://msdn.microsoft.com/en-us/library/dd171921(v=sql.100).aspx).
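A minimal sketch of the view approach, reusing the query from the question (the view name and the example filter value are made up; note that indexing a view additionally requires WITH SCHEMABINDING and a unique clustered index):
CREATE VIEW dbo.vAlarmDetails
AS
SELECT
alarms.startt, alarms.endt, clients.code, clients.Plant,
alarms.controller, alarmtype.atype, alarmstatus.[text]
FROM alarms
INNER JOIN clients ON alarms.clientid = clients.C_id
INNER JOIN alarmstatus ON alarms.statusid = alarmstatus.AS_id
INNER JOIN alarmtype ON alarms.typeid = alarmtype.AT_id;
GO
-- each caller supplies its own WHERE clause
SELECT * FROM dbo.vAlarmDetails WHERE atype = 'Critical';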
Well, like most things in SQL: it depends. There are a few considerations here.
Would the different WHEREs lead to substantially different query plans for execution, e.g. one of the columns indexed but not the other two?
Is the query likely to change over time, i.e. customer requirements needing other columns?
Is the WHERE likely to become 4, then 8, then 16, etc. options?
One approach is to exec different procs into a temp table. Each proc would then have its own query plan.
Another approach would be to use dynamic SQL; once again, each "query" would be assigned its own plan.
A third approach would be to write an app that generates the SQL for each option; this could be either a stored proc or a SQL string.
Then have a data set and do test driven development against it (this is true for each approach).
In the end, the best learning solution is probably to
a) read about SQL (Kalen Delaney, an acknowledged expert, wrote the Inside SQL Server books), and
b) test your own solutions against your own data.
I would go this way:
WHERE 8 = CASE @parameter
WHEN 1 THEN Column1
WHEN 2 THEN Column2
.
.
.

SQL Parameter Slows Down Query

I have a query which I'm using with SQL Server 2008 R2 via ADO.NET. When I use a LIKE clause inline, it works in less than a second, with 5 rows returned from 2 million. If I declare the parameter at the start of the query in SSMS, as I do in .NET, it takes forever.
It's the same query, but parameterized.
The first (which works fine) is:
;WITH Results_CTE AS (
SELECT ld.* , ROW_NUMBER() OVER (ORDER BY PK_ID) AS RowNum
FROM list..List_Data ld
WHERE Name IS NOT NULL AND
Postcode LIKE 'SW14 1xx%'
) SELECT * FROM Results_CTE
The second, which takes forever, is:
declare @postcode varchar(10) = 'SW14 1xx'
;WITH Results_CTE AS (
SELECT ld.* , ROW_NUMBER() OVER (ORDER BY PK_ID) AS RowNum
FROM list..List_Data ld
WHERE Name IS NOT NULL AND
Postcode LIKE @postcode +'%'
) SELECT * FROM Results_CTE
I believe this has something to do with the inner workings of SQL Server but I really have no idea.
I was googling for potential problems with SqlCommand.Parameters.Add() in C#, and I found this page. I know this is an SQL Server post, but others might find it through google, and it may help them with C#.
For me, none of the above answers worked, so I tried another method.
Instead of:
cmd.Parameters.Add(new SqlParameter("@postcode", postcode));
I used this instead:
// Replace SqlDbType enumeration with whatever SQL Data Type you're using.
cmd.Parameters.Add("@postcode", SqlDbType.VarChar).Value = postcode;
And don't forget the namespace:
using System.Data;
Hope this helps someone!
Use
SELECT *
FROM Results_CTE
OPTION (RECOMPILE)
SQL Server does not sniff the value of a variable, so it has no idea how selective it will be. It will probably assume that the query returns significantly more rows than is actually the case and give you a plan optimised for that.
In your case I'm pretty sure that in the good plan you will find it is using a non-covering non-clustered index to evaluate the Postcode predicate, plus some lookups to retrieve the missing columns, whereas in the bad plan (as it guesses the query will return a greater number of rows) it avoids this in favour of a full table scan.
You can use optimize for to have the parameterized query use the same execution plan as the one with a specific parameter:
SELECT *
FROM Results_CTE
OPTION (OPTIMIZE FOR (@postcode = 'SW14 1xx'))
This looks like a problem caused by parameter sniffing - during plan compilation SQL Server "sniffs" the current parameter values and uses them to optimise the query. The most common problem this causes is when the query is first run / compiled with an "odd" parameter value, in which case the plan is optimised for that value; parameter sniffing can cause other problems as well, however.
In your case, if the query is run with an empty / NULL value for @postcode then the query is using a LIKE '%' clause, which is very likely to cause a table scan since a LIKE wildcard is being used at the start of the filter. It looks like either the plan was initially run / compiled with an empty @postcode parameter, or SQL Server is somehow getting confused by this parameter.
There are a couple of things you can try:
Mark the query for recompilation and then run the query again with a non-null value for @postcode (a minimal sp_recompile sketch appears after the example below).
"Mask" the parameter to try and prevent parameter sniffing,
for example:
declare @postcode varchar(10) = 'SW14 1xx'
declare @postcode_filter varchar(10) = @postcode + '%'
-- Run the query using @postcode_filter instead of @postcode
Although this query looks like it should behave in exactly the same way, I've found that SQL Server deals with parameters in strange ways - the rules on when exactly parameter sniffing is used can be a tad strange at times, so you may want to play around with variations on the above.
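For the first suggestion, a minimal sketch (the stored procedure name is hypothetical; sp_recompile simply invalidates the cached plan so the next execution compiles a fresh one):
EXEC sp_recompile N'dbo.usp_SearchByPostcode';
-- then run it again with a representative, non-null postcode
EXEC dbo.usp_SearchByPostcode @postcode = 'SW14 1xx';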

Does SQL Server optimize LIKE ('%%') query?

I have a stored proc which performs a search on records.
The problem is that some of the search criteria, which come from the UI, may be empty strings.
So, when a criterion is not specified, the LIKE statement becomes redundant.
How can I effectively perform that search on SQL Server? Or does it optimize a LIKE('%%') query, since it means there is nothing to compare?
The stored proc is like this:
ALTER PROC [FRA].[MCC_SEARCH]
@MCC_Code varchar(4),
@MCC_Desc nvarchar(50),
@Detail nvarchar(50)
AS
BEGIN
SELECT
MCC_Code,
MCC_Desc,
CreateDate,
CreatingUser
FROM
FRA.MCC (NOLOCK)
WHERE
MCC_Code LIKE ('%' + @MCC_Code + '%')
AND MCC_Desc LIKE ('%' + @MCC_Desc + '%')
AND Detail LIKE ('%' + @Detail + '%')
ORDER BY MCC_Code
END
With regard to an optimal, index-using execution plan - no. The prefixing wildcard prevents an index from being used, resulting in a scan instead.
If you do not have a wildcard on the end of the search term as well, then that scenario can be optimised - something I blogged about a while back: Optimising wildcard prefixed LIKE conditions
Update
To clarify my point:
LIKE 'Something%' - is able to use an index
LIKE '%Something' - is not able to use an index out-of-the-box. But you can optimise this to allow it to use an index by following the "REVERSE technique" I linked to.
LIKE '%Something%' - is not able to use an index, and there is nothing you can do to optimise the LIKE in that case.
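For reference, the "REVERSE technique" from that post works roughly like this (a hedged sketch against a table like the one in the earlier postcode question; the column and index names are made up):
-- persisted computed column holding the reversed value, which can then be indexed
ALTER TABLE dbo.List_Data ADD PostcodeReversed AS REVERSE(Postcode) PERSISTED;
CREATE INDEX IX_List_Data_PostcodeReversed ON dbo.List_Data (PostcodeReversed);
-- a leading-wildcard search such as Postcode LIKE '%1xx' becomes an index-friendly seek
SELECT *
FROM dbo.List_Data
WHERE PostcodeReversed LIKE REVERSE('1xx') + '%';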
The short answer is - no
The long answer is - absolutely not
Does it optimize LIKE('%%') query since it means there is nothing to compare?
The statement is untrue, because there is something to compare. The following are equivalent
WHERE column LIKE '%%'
WHERE column IS NOT NULL
IS NOT NULL requires a table scan, unless there are very few non-null values in the column and it is well indexed.
EDIT
Resource on Dynamic Search procedures in SQL Server:
You simply must read this article by Erland Sommarskog, SQL Server MVP http://www.sommarskog.se/dyn-search.html (pick your version, or read both)
Otherwise, if you need good performance on CONTAINS-style searches, consider using the SQL Server full-text engine.
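A rough sketch of the full-text alternative, assuming a default full-text catalog exists and that PK_MCC is the name of a unique index on FRA.MCC (both are assumptions):
CREATE FULLTEXT INDEX ON FRA.MCC (MCC_Desc, Detail) KEY INDEX PK_MCC;
GO
SELECT MCC_Code, MCC_Desc, CreateDate, CreatingUser
FROM FRA.MCC
WHERE CONTAINS(MCC_Desc, @MCC_Desc);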
If you use a LIKE clause and specify a wildcard character (%) as a prefix of the search string, SQL Server (and all other DBMSes, I guess) will not be able to use indexes that might exist on that column.
I don't know if it optimizes the query if you use an empty search argument ... Perhaps your question may be answered if you look at the execution plan?
Edit: I've just checked this out, and the execution plan of this statement:
select * from mytable
is exactly the same as the exec plan of this statement:
select * from mytable where description like '%'
Both SQL statements simply use a clustered index scan.

Performance implications of sql 'OR' conditions when one alternative is trivial?

I'm creating a stored procedure for searching some data in my database according to some criteria input by the user.
My SQL code looks like this:
Create Procedure mySearchProc
(
@IDCriteria bigint=null,
...
@MaxDateCriteria datetime=null
)
as
select Col1,...,Coln from MyTable
where (@IDCriteria is null or ID=@IDCriteria)
...
and (@MaxDateCriteria is null or Date<@MaxDateCriteria)
Edit: I have around 20 possible parameters, and any combination of n non-null parameters can happen.
Is it ok performance-wise to write this kind of code? (I'm using MS SQL Server 2008)
Would generating SQL code containing only the needed where clauses be notably faster?
OR clauses are notorious for causing performance issues mainly because they require table scans. If you can write the query without ORs you'll be better off.
where (@IDCriteria is null or ID=@IDCriteria)
and (@MaxDateCriteria is null or Date<@MaxDateCriteria)
If you write the criteria this way, SQL Server will not know whether it is better to use the index on ID or the index on Date.
For proper optimization, it is far better to write separate queries for each case and use IF to guide you to the correct one.
IF @IDCriteria is not null and @MaxDateCriteria is not null
--query
WHERE ID = @IDCriteria and Date < @MaxDateCriteria
ELSE IF @IDCriteria is not null
--query
WHERE ID = @IDCriteria
ELSE IF @MaxDateCriteria is not null
--query
WHERE Date < @MaxDateCriteria
ELSE
--query
WHERE 1 = 1
If you expect to need different plans out of the optimizer, you need to write different queries to get them!!
Would generating SQL code containing only the needed where clauses be notably faster?
Yes - if you expect the optimizer to choose between different plans.
Edit:
DECLARE @CustomerNumber int, @CustomerName varchar(30)
SET @CustomerNumber = 123
SET @CustomerName = '123'
SELECT * FROM Customers
WHERE (CustomerNumber = @CustomerNumber OR @CustomerNumber is null)
AND (CustomerName = @CustomerName OR @CustomerName is null)
CustomerName and CustomerNumber are indexed. The optimizer says: "Clustered Index Scan with parallelization". You can't write a worse single-table query.
Edit: I have around 20 possible parameters, and any combination of n non-null parameters can happen.
We had a similar "search" functionality in our database. When we looked at the actual queries issued, 99.9% of them used an AccountIdentifier. In your case, I suspect either one column is always supplied, or one of two columns is always supplied. This would lead to 2 or 3 cases respectively.
It's not important to remove OR's from the whole structure. It is important to remove OR's from the column/s that you expect the optimizer to use to access the indexes.
So, to boil down the above comments:
Create a separate sub-procedure for each of the most popular variations of specific combinations of parameters, and within a dispatcher procedure call the appropriate one from an IF ELSE structure, the penultimate ELSE clause of which builds a query dynamically to cover the remaining cases.
Perhaps only one or two cases may be specifically coded at first, but as time goes by and particular combinations of parameters are identified as being statistically significant, implementation procedures may be written and the master IF ELSE construct extended to identify those cases and call the appropriate sub-procedure.
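A rough sketch of such a dispatcher, with hypothetical sub-procedure names:
CREATE PROCEDURE dbo.usp_Search_Dispatch
@IDCriteria bigint = NULL,
@MaxDateCriteria datetime = NULL
AS
IF @IDCriteria IS NOT NULL
EXEC dbo.usp_Search_ByID @IDCriteria;            -- specialised, statistically common case
ELSE IF @MaxDateCriteria IS NOT NULL
EXEC dbo.usp_Search_ByMaxDate @MaxDateCriteria;  -- another specialised case
ELSE
EXEC dbo.usp_Search_Generic @IDCriteria, @MaxDateCriteria;  -- generic fallback for remaining cases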
Regarding "Would generating SQL code containing only the needed where clauses be notably faster?"
I don't think so, because this way you effectively remove the positive effects of query plan caching.
You could perform selective queries, in order of the most common / efficient (indexed, etc.) parameters, and add the PK(s) to a temporary table.
That would create a (hopefully small!) subset of data.
Then join that temporary table with the main table, using a full WHERE clause in the
SELECT ...
FROM #TempTable AS T
JOIN dbo.MyTable AS M
ON M.ID = T.ID
WHERE (@IDCriteria IS NULL OR M.ID=@IDCriteria)
...
AND (@MaxDateCriteria IS NULL OR M.Date<@MaxDateCriteria)
style to refine the (small) subset.
What if constructs like these were replaced:
WHERE (@IDCriteria IS NULL OR @IDCriteria=ID)
AND (@MaxDateCriteria IS NULL OR Date<@MaxDateCriteria)
AND ...
with ones like these:
WHERE ID = ISNULL(@IDCriteria, ID)
AND Date < ISNULL(@MaxDateCriteria, DATEADD(millisecond, 1, Date))
AND ...
or is this just coating the same unoptimizable query in syntactic sugar?
Choosing the right index is hard for the optimizer. IMO, this is one of the few cases where dynamic SQL is the best option.
This is one of the cases where I use code building or a sproc for each search option.
Since your search is so complex, I'd go with code building.
You can do this either in code or with dynamic SQL.
Just be careful of SQL injection.
I suggest one step further than some of the other suggestions - think about degeneralizing at a much higher abstraction level, preferably the UI structure. Usually this seems to happen when the problem is being pondered in data mode rather than user domain mode.
In practice, I've found that almost every such query has one or more non-null, fairly selective columns that would be reasonably optimizable, if one (or more) were specified. Furthermore, these are usually reasonable assumptions that users can understand.
Example: Find Orders by Customer; or Find Orders by Date Range; or Find Orders By Salesperson.
If this pattern applies, then you can decompose your hypergeneralized query into more purposeful subqueries that also make sense to users, and you can reasonably prompt for required values (or ranges), and not worry too much about crafting efficient expressions for subsidiary columns.
You may still end up with an "All Others" category. But at least then if you provide what is essentially an open-ended Query By Example form, then users will have some idea what they're getting into. Doing what you describe really puts you in the role of trying to out-think the query optimizer, which is folly IMHO.
I'm currently working with SQL 2005, so I don't know if the 2008 optimizer acts differently. That being said, I've found that you need to do a couple of things...
Make sure that you are using WITH (RECOMPILE) for your query
Use CASE statements to cause short-circuiting of the logic. At least in 2005 this is NOT done with OR statements. For example:
SELECT
...
FROM
...
WHERE
(1 =
CASE
WHEN @my_column IS NULL THEN 1
WHEN my_column = @my_column THEN 1
ELSE 0
END
)
The CASE statement will cause the SQL Server optimizer to recognize that it doesn't need to continue past the first WHEN. In this example it's not a big deal, but in my search procs a non-null parameter often meant searching in another table through a subquery for existence of a matching row, which got costly. Once I made this change the search procs started running much faster.
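As an illustration of that pattern, a hedged sketch where a non-null parameter triggers an existence check against another table (the table and column names are made up):
SELECT o.*
FROM Orders o
WHERE (1 =
CASE
WHEN @CustomerName IS NULL THEN 1
WHEN EXISTS (SELECT 1
FROM Customers c
WHERE c.CustomerID = o.CustomerID
AND c.Name = @CustomerName) THEN 1
ELSE 0
END
)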
My suggestion is to build the SQL string. You will gain maximum performance from indexes and reuse of execution plans.
DECLARE @sql nvarchar(4000);
SET @sql = N''
IF @param1 IS NOT NULL
SET @sql = @sql + CASE WHEN @sql = N'' THEN N'' ELSE N' AND ' END + N'param1 = @param1';
IF @param2 IS NOT NULL
SET @sql = @sql + CASE WHEN @sql = N'' THEN N'' ELSE N' AND ' END + N'param2 = @param2';
...
IF @paramN IS NOT NULL
SET @sql = @sql + CASE WHEN @sql = N'' THEN N'' ELSE N' AND ' END + N'paramN = @paramN';
IF @sql <> N''
SET @sql = N' WHERE ' + @sql;
SET @sql = N'SELECT ... FROM myTable' + @sql;
EXEC sp_executesql @sql, N'@param1 type, @param2 type, ..., @paramN type', @param1, @param2, ..., @paramN;
Each time the procedure is called with different parameters, there is a different optimal execution plan for getting the data. The problem is that SQL Server has cached an execution plan for your procedure and will reuse a sub-optimal (read: terrible) execution plan.
I would recommend:
Create specific SPs for frequently run execution paths (i.e. passed parameter sets) optimised for each scenario.
Keep your main generic SP for edge cases (presuming they are rarely run), but use the WITH RECOMPILE clause to cause a new execution plan to be created each time the procedure is run.
We use OR clauses checking against NULLs for optional parameters to great effect. It works very well without the RECOMPILE option so long as the execution path is not drastically altered by passing different parameters.
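A minimal sketch of the generic fallback procedure with WITH RECOMPILE, reusing the shape from the question (the procedure name and column list are illustrative):
CREATE PROCEDURE dbo.mySearchProc_Generic
@IDCriteria bigint = NULL,
@MaxDateCriteria datetime = NULL
WITH RECOMPILE
AS
SELECT Col1
FROM MyTable
WHERE (@IDCriteria IS NULL OR ID = @IDCriteria)
AND (@MaxDateCriteria IS NULL OR Date < @MaxDateCriteria);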

Stored procedure bit parameter activating additional where clause to check for null

I have a stored procedure that looks like:
CREATE PROCEDURE dbo.usp_TestFilter
@AdditionalFilter BIT = 1
AS
SELECT *
FROM dbo.SomeTable T
WHERE
T.Column1 IS NOT NULL
AND CASE WHEN @AdditionalFilter = 1 THEN
T.Column2 IS NOT NULL
Needless to say, this doesn't work. How can I activate the additional where clause that checks for the @AdditionalFilter parameter? Thanks for any help.
CREATE PROCEDURE dbo.usp_TestFilter
@AdditionalFilter BIT = 1
AS
SELECT *
FROM dbo.SomeTable T
WHERE
T.Column1 IS NOT NULL
AND (@AdditionalFilter = 0 OR
T.Column2 IS NOT NULL)
If @AdditionalFilter is 0, the column won't be evaluated since it can't affect the outcome of the part between brackets. If it's anything other than 0, the column condition will be evaluated.
This practice tends to confuse the query optimizer. I've seen SQL Server 2000 build the execution plan exactly the opposite way round and use an index on Column1 when the flag was set and vice-versa. SQL Server 2005 seemed to at least get the execution plan right on first compilation, but you then have a new problem. The system caches compiled execution plans and tries to reuse them. If you first use the query one way, it will still execute the query that way even if the extra parameter changes, and different indexes would be more appropriate.
You can force a stored procedure to be recompiled on this execution by using WITH RECOMPILE in the EXEC statement, or every time by specifying WITH RECOMPILE on the CREATE PROCEDURE statement. There will be a penalty as SQL Server re-parses and optimizes the query each time.
In general, if the form of your query is going to change, use dynamic SQL generation with parameters. SQL Server will also cache execution plans for parameterized queries and auto-parameterized queries (where it tries to deduce which arguments are parameters), and even regular queries, but it gives most weight to stored procedure execution plans, then parameterized, auto-parameterized and regular queries in that order. The higher the weight, the longer it can stay in RAM before the plan is discarded, if the server needs the memory for something else.
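For example, forcing a one-off recompile at execution time looks like this:
EXEC dbo.usp_TestFilter @AdditionalFilter = 1 WITH RECOMPILE;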
CREATE PROCEDURE dbo.usp_TestFilter
@AdditionalFilter BIT = 1
AS
SELECT *
FROM dbo.SomeTable T
WHERE
T.Column1 IS NOT NULL
AND (@AdditionalFilter = 0 OR T.Column2 IS NOT NULL)
select *
from SomeTable t
where t.Column1 is not null
and (@AdditionalFilter = 0 or t.Column2 is not null)