SQL Query Where clause slows down query

I used to write queries like this:
SELECT *
FROM myTable
WHERE Col1 = @Param1 OR @Param1 IS NULL
The above used to execute fine at a decent speed, but now it takes 17 seconds.
If I remove the
OR @Param1 IS NULL
it executes in less than 1 second:
SELECT *
FROM myTable
WHERE Col1 = @Param1
Any idea why the OR @Param1 IS NULL would add 16 seconds to the execution?
I've been using this style of query for many years and haven't noticed any performance hit.
The query is basically saying: give me ALL records if @Param1 is null, otherwise give me only the records that match @Param1.

Since you are passing in @Param1, you could take a procedure-based approach, as outlined in the link in my original comment. This would look like the following:
CREATE PROC getData
    @Param1 varchar(255) = NULL
AS
BEGIN
    IF (@Param1 IS NULL)
    BEGIN
        SELECT *
        FROM myTable
    END
    ELSE
    BEGIN
        SELECT *
        FROM myTable
        WHERE Col1 = @Param1
    END;
END;
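For completeness, a quick usage sketch of the procedure above (the parameter value is made up for illustration):
-- Returns every row, because @Param1 defaults to NULL
EXEC getData;
-- Returns only the rows where Col1 matches the supplied value
EXEC getData @Param1 = 'someValue';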

Try the below and let us know if this helps:
SELECT *
FROM myTable
WHERE Col1 = nvl(@Param1, Col1);
By using nvl, the OR disappears from the predicate, which gives the optimizer a simpler plan and might help performance.
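Note that nvl is Oracle syntax; since the rest of this thread is T-SQL, a minimal equivalent sketch for SQL Server (an assumption on my part) would use ISNULL or COALESCE:
SELECT *
FROM myTable
WHERE Col1 = ISNULL(@Param1, Col1); -- when @Param1 is NULL this becomes Col1 = Col1, which matches every row whose Col1 is not NULL
Be aware that rows where Col1 itself is NULL are filtered out by this form, unlike the original OR version.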

Related

Why does swapping out a = comparison for an unused IN slow down this T-SQL query so much?

Apologies but I can't offer a reproducible example for this. It won't reproduce with a simple example and thus must be related to our data structures and volume: I'm hoping someone can see a pattern here and offer some advice.
We previously had a stored procedure, which felt oddly written to me but worked fine, which ran code equivalent to the following pseudocode:
DECLARE @HasResults BIT = 0;
IF
    (SELECT COUNT(*) FROM myTable t
     WHERE
        t.field1 = @param1
        OR t.field2 = @param2
        OR t.field3 = @param3
        OR t.field4 = @param4) > 0
    SET @HasResults = 1
SELECT @HasResults AS HasResults
All of the fields and params were originally integers, and in normal usage all but one of them will be NULL. I had to change one of the params to an nvarchar(max) so that it would take a list, which I split with a fairly standard splitting function and then use in an IN statement:
DECLARE @HasResults BIT = 0;
IF
    (SELECT COUNT(*) FROM myTable t
     WHERE
        t.field1 = @param1
        OR t.field2 = @param2
        OR t.field3 IN (SELECT ID FROM fnSplit(@param3))
        OR t.field4 = @param4) > 0
    SET @HasResults = 1
SELECT @HasResults AS HasResults
This resulted in the query, in some circumstances, going from sub-second to over a minute. Now you might expect that from an IN comparison, but what baffles me is that if there's data in @param3 it works fine - it's when @param3 is NULL that the query is slow. If you comment out the IN clause, it goes back to sub-second speed.
The splitting function isn't the problem here - it's very fast, and I've experimented with it, but nothing improves.
To further confuse me, I discovered that you can significantly improve the situation by removing that unnecessary IF statement. This takes about 10 seconds to run, which is much slower than the original query but much faster than using the IF:
SELECT COUNT(*) FROM myTable t
WHERE
    t.field1 = @param1
    OR t.field2 = @param2
    OR t.field3 = @param3
    OR t.field4 = @param4
Why is this query running so much slower when I try to split a NULL value and use the results in an IN statement, and why does the IF have such an impact?
EDIT: split function as requested:
CREATE FUNCTION [dbo].[fnSplit](
    @sInputList VARCHAR(MAX)
  , @sDelimiter VARCHAR(MAX) = ','
) RETURNS @List TABLE (item VARCHAR(MAX))
BEGIN
    DECLARE @sItem VARCHAR(MAX)
    WHILE CHARINDEX(@sDelimiter, @sInputList, 0) <> 0
    BEGIN
        SELECT
            @sItem = RTRIM(LTRIM(SUBSTRING(@sInputList, 1, CHARINDEX(@sDelimiter, @sInputList, 0) - 1))),
            @sInputList = RTRIM(LTRIM(SUBSTRING(@sInputList, CHARINDEX(@sDelimiter, @sInputList, 0) + LEN(@sDelimiter), LEN(@sInputList))))
        IF LEN(@sItem) > 0
            INSERT INTO @List SELECT @sItem
    END
    IF LEN(@sInputList) > 0
        INSERT INTO @List SELECT @sInputList
    RETURN
END
GO
Understanding why a scalar UDF or a multi-statement table-valued function can kill performance is very simple.
UDFs are Transact-SQL, which is an interpreted language, not a compiled one, so the function will be called on every row. This has three consequences:
executing the function for every row (the RBAR effect)
forbidding any parallel processing, because of potential side effects
forbidding the use of indexes, because indexes cannot be used once the data has been transformed
So if you want performance, avoid UDFs whenever there is another solution.
You can instead use STRING_SPLIT, which I think is the fastest, or XML operations.
In fact, in queries, which operate on sets and not on individual values, anything that involves an iterative process will kill performance. Recursive queries included...
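As a minimal sketch of that suggestion (assuming SQL Server 2016 or later, where STRING_SPLIT is available, and that field3 is an integer as in the original question), the UDF call could be replaced like this:
DECLARE @HasResults BIT = 0;
IF
    (SELECT COUNT(*) FROM myTable t
     WHERE
        t.field1 = @param1
        OR t.field2 = @param2
        -- STRING_SPLIT returns a single column named value; cast it back to int for the comparison
        OR t.field3 IN (SELECT TRY_CAST(value AS int) FROM STRING_SPLIT(@param3, ','))
        OR t.field4 = @param4) > 0
    SET @HasResults = 1
SELECT @HasResults AS HasResults
This only removes the UDF; the OR-of-optional-filters pattern discussed elsewhere in this thread is still present.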

SQL 2014 Query runs much faster in sp_executesql than as plain query

Very strange issue; I have looked, but as best I can see this is not a known issue.
I am debugging a query that someone else wrote for an SSRS report.
When run from SSRS it returns 13,730 rows. I captured this query execution in SQL Profiler, then ran it in SSMS, which returned 11,940 rows. The report does nothing with the result set; it just shows each row.
I then took the query out of sp_executesql and converted the sp_executesql params into normal params. When this is run in SSMS, the query just endlessly spins with a CXPACKET wait type.
I don't understand how essentially the same query can have such vastly different results and execution plans depending on how it is run.
I can't copy the whole query here, but the gist of it is:
/* Example 1
This returns different results when run from the report (SQL Server Reporting Services 2014) as opposed to when run in SQL Server Management Studio. The actual execution from SSRS was captured using Profiler.
Either execution completed in about 15 seconds.
*/
exec sp_executesql N'
CREATE TABLE #tmpTable (
    field1 varchar(250),
    field2 int,
    field3 varchar(max)
)
Insert into #tmpTable
SELECT
    somestuff as field1,
    someotherstuff as field2,
    morestuff as field3
FROM
    RealTable
WHERE
    somestuff = ''stuff that matters for the first query''
AND
    otherstuff = @param1
Insert into #tmpTable
SELECT
    somestuff as field1,
    someotherstuff as field2,
    morestuff as field3
FROM
    RealTable2
WHERE
    somestuff = ''stuff that matters for the second query''
AND
    otherstuff = @param2
SELECT * FROM #tmpTable
', N'@param1 nvarchar(4000), @param2 nvarchar(4000)', @param1 = N'value1', @param2 = N'value2'
/* Example 2
This one just runs forever; I cancelled it after 15 minutes.
*/
Declare @param1 nvarchar(4000),
        @param2 nvarchar(4000)
SELECT @param1 = 'value1',
       @param2 = 'value2'
CREATE TABLE #tmpTable (
    field1 varchar(250),
    field2 int,
    field3 varchar(max)
)
Insert into #tmpTable
SELECT
    somestuff as field1,
    someotherstuff as field2,
    morestuff as field3
FROM
    RealTable
WHERE
    somestuff = 'stuff that matters for the first query'
AND
    otherstuff = @param1
Insert into #tmpTable
SELECT
    somestuff as field1,
    someotherstuff as field2,
    morestuff as field3
FROM
    RealTable2
WHERE
    somestuff = 'stuff that matters for the second query'
AND
    otherstuff = @param2
SELECT * FROM #tmpTable
Interesting. You may have uncovered something not published about SQL Server's internal behavior.
You mentioned that when you run the query without sp_executesql, it spins with a CXPACKET wait type. This means that the query optimizer parallelized your query; the CXPACKET wait type is what you see when the processes running the query are exchanging information.
When you use sp_executesql, it just runs right away, which suggests that sp_executesql (no CXPACKET wait type) is not parallelizing your query, or not parallelizing it as much.
You can test this theory easily. At the end of each statement in the version that is NOT using sp_executesql, put:
OPTION (MAXDOP 1);
This effectively tells SQL Server to use only one core for that statement, and should guarantee no CXPACKET waits.
If this helps, then test it with MAXDOP 2, MAXDOP 4, etc...
NOTE: In general, the query optimizer does a good job of figuring out when to parallelize and the degree to do so (i.e., it figures out the optimal MAXDOP value), but sometimes it doesn't, and after some experimenting, hints like this can speed things up.
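For example, against the poster's Example 2 the hint would go at the end of each statement you want to constrain (the table and column names are the poster's placeholders):
Insert into #tmpTable
SELECT somestuff as field1, someotherstuff as field2, morestuff as field3
FROM RealTable
WHERE somestuff = 'stuff that matters for the first query'
AND otherstuff = @param1
OPTION (MAXDOP 1); -- force a serial plan for this statement only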

SQL IF EXISTS with OR Condition

I am facing a performance issue with a SQL statement. I have noticed a performance degradation in one SQL statement in one of my procedures.
SQL Statement:
IF EXISTS(SELECT TOP 1 1 FROM TABLE1 WHERE COLUMN1 = 'XYZ') OR @ISALLOWED = 1
BEGIN
    -- SQL Statements
END
I am not able to understand why the OR combined with the IF EXISTS statement is causing a performance issue in the above query, because if I rewrite the statement like this:
DECLARE @ISVALUEEXISTS BIT
SET @ISVALUEEXISTS = 0
IF EXISTS(SELECT TOP 1 1 FROM TABLE1 WHERE COLUMN1 = 'XYZ')
    SET @ISVALUEEXISTS = 1
IF (@ISVALUEEXISTS = 1 OR @ISALLOWED = 1)
BEGIN
    -- SQL Statements
END
then the performance issue is gone. So I can't understand how and why the OR condition with the IF EXISTS statement is causing the problem.
Does anyone have any idea about this?
If you have this query inside a stored procedure, this could happen because of parameter sniffing.
Try something like this to check it:
DECLARE @ISALLOWED_internal BIT
SELECT @ISALLOWED_internal = @ISALLOWED
IF EXISTS(SELECT TOP 1 1 FROM TABLE1 WHERE COLUMN1 = 'XYZ') OR @ISALLOWED_internal = 1
BEGIN
    -- SQL Statements
END
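To make the stored-procedure context concrete, here is a minimal sketch of how that workaround sits inside a procedure (the procedure name is an assumption for illustration):
CREATE PROC dbo.CheckSomething
    @ISALLOWED BIT
AS
BEGIN
    -- Copy the parameter into a local variable so the plan is not built around
    -- the specific value sniffed on first execution
    DECLARE @ISALLOWED_internal BIT
    SELECT @ISALLOWED_internal = @ISALLOWED
    IF EXISTS(SELECT TOP 1 1 FROM TABLE1 WHERE COLUMN1 = 'XYZ') OR @ISALLOWED_internal = 1
    BEGIN
        -- SQL Statements
        PRINT 'branch taken'
    END
END;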

SQL Search using case or if

Everyone has been a super help so far. My next question is: what is the best way for me to approach this? If I have 7 fields that a user can search, what is the best way to conduct this search? They can use any combination of the 7 fields, which is 2^7 = 128 combinations, far too many to code individually. So how do I account for when the user selects field 1 and field 3, or selects field 1, field 2, and field 7? Is there any easy way to do this with SQL? I don't know if I should approach this using an IF statement or go towards a CASE in the select statement. Or should I go in a completely different direction? Well, if anyone has any helpful pointers, I would greatly appreciate it.
Thank you
You'll probably want to look into using dynamic SQL for this. See: Dynamic Search Conditions in T-SQL and Catch-all queries for good articles on this topic.
Select f1,f2 from table where f1 like '%val%' or f2 like '%val%'
You could write a stored procedure that accepts each parameter as null and then write your WHERE clause like:
WHERE (field1 = @param1 OR @param1 IS NULL)
  AND (field2 = @param2 OR @param2 IS NULL) etc...
But I wouldn't recommend it. It can definitely affect performance doing it this way, depending on the number of parameters you have. I second Joe Stefanelli's answer about looking into dynamic SQL in this case.
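Since two answers point at dynamic SQL without showing it, here is a minimal sketch of the idea for two of the search fields (the procedure, table, and column names are made up for illustration). The WHERE clause is built only from the parameters that were actually supplied, and the values are still passed to sp_executesql as parameters, so the query stays safe from injection and plan-cache friendly:
CREATE PROC dbo.SearchExample
    @param1 INT = NULL,
    @param2 VARCHAR(50) = NULL
AS
BEGIN
    DECLARE @sql NVARCHAR(MAX) = N'SELECT * FROM dbo.MyTable WHERE 1 = 1'
    -- Append a predicate only for the filters the caller supplied
    IF @param1 IS NOT NULL
        SET @sql = @sql + N' AND field1 = @param1'
    IF @param2 IS NOT NULL
        SET @sql = @sql + N' AND field2 = @param2'
    -- Parameters that were not appended are simply ignored by the generated statement
    EXEC sp_executesql @sql,
        N'@param1 INT, @param2 VARCHAR(50)',
        @param1 = @param1, @param2 = @param2
END;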
Depends on:
what your data looks like,
how big it is,
how exact a result is expected (all matching records, or is the top 100 enough),
how many resources your database has.
you can try something like:
CREATE PROC dbo.Search(
    @param1 INT = NULL,
    @param2 VARCHAR(3) = NULL
)
AS
BEGIN
    SET NOCOUNT ON
    -- create a temporary table to keep the (primary) keys of matching records from the searched table
    CREATE TABLE #results (k INT)
    INSERT INTO
        #results(k)
    SELECT -- you can use TOP here to narrow results
        [key]
    FROM
        [table]
    -- you can use WHERE if there are some default conditions
    PRINT @@ROWCOUNT
    -- if @param1 is set, filter #results
    IF @param1 IS NOT NULL BEGIN
        PRINT '@param1'
        ;WITH d AS (
            SELECT
                [key]
            FROM
                [table]
            WHERE
                param1 <> @param1
        )
        DELETE FROM
            #results
        WHERE
            k IN (SELECT [key] FROM d)
        PRINT @@ROWCOUNT
    END
    -- if @param2 is set, filter #results
    IF @param2 IS NOT NULL BEGIN
        PRINT '@param2'
        ;WITH d AS (
            SELECT
                [key]
            FROM
                [table]
            WHERE
                param2 <> @param2
        )
        DELETE FROM
            #results
        WHERE
            k IN (SELECT [key] FROM d)
        PRINT @@ROWCOUNT
    END
    -- return what is left in the #results table
    SELECT
        [table].* -- or better, only the columns you need
    FROM
        #results r
    JOIN
        [table]
    ON
        [table].[key] = r.k
END
I use this technique on a large database (millions of records, but running on a large server) to filter data against some predefined conditions, and it works pretty well.
However, I don't need all matching records -- depending on the query, 10-3000 matching records are enough.
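A quick usage sketch of the procedure above (the values are made up; any parameter left at its NULL default never shrinks the #results set):
-- Filter on the first search field only
EXEC dbo.Search @param1 = 42;
-- Filter on both search fields
EXEC dbo.Search @param1 = 42, @param2 = 'abc';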
If you are using a stored procedure you can use this method:
CREATE PROCEDURE dbo.foo
    @param1 VARCHAR(32) = NULL,
    @param2 INT = NULL
AS
BEGIN
    SET NOCOUNT ON
    SELECT * FROM MyTable AS t
    WHERE (@param1 IS NULL OR t.Column1 = @param1)
      AND (@param2 IS NULL OR t.Column2 = @param2)
END
GO
These are usually called optional parameters. The idea is that if you don't pass one in it gets the default value (null) and that section of your where clause always returns true.
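A brief usage sketch (values invented for illustration): any parameter you leave out defaults to NULL, so its predicate collapses to true and that filter is effectively skipped.
-- No filters: both predicates pass every row
EXEC dbo.foo;
-- Filter on Column1 only; the Column2 predicate still passes everything
EXEC dbo.foo @param1 = 'abc';
-- Filter on both columns
EXEC dbo.foo @param1 = 'abc', @param2 = 7;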

Proper way to handle 'optional' where clause filters in SQL?

Let's say you have a stored procedure, and it takes an optional parameter. You want to use this optional parameter in the SQL query. Typically this is how I've seen it done:
SELECT * FROM dbo.MyTableName t1
WHERE t1.ThisField = 'test'
AND (@MyOptionalParam IS NULL OR t1.MyField = @MyOptionalParam)
This seems to work well; however, it causes a high number of logical reads if you run the query with STATISTICS IO ON. I've also tried the following variant:
SELECT * FROM dbo.MyTableName t1
WHERE t1.ThisField = 'test'
AND t1.MyField = CASE WHEN @MyOptionalParam IS NULL THEN t1.MyField ELSE @MyOptionalParam END
And it yields the same high number of reads. If we convert the SQL to a string, then call sp_ExecuteSQL on it, the reads are almost nil:
DECLARE @sql nvarchar(max)
SELECT @sql = 'SELECT * FROM dbo.MyTableName t1
WHERE t1.ThisField = ''test'''
IF @MyOptionalParam IS NOT NULL
BEGIN
    SELECT @sql = @sql + ' AND t1.MyField = @MyOptionalParam '
END
EXECUTE sp_ExecuteSQL @sql, N'@MyOptionalParam varchar(50)', @MyOptionalParam
Am I crazy? Why are optional where clauses so hard to get right?
Update: I'm basically asking if there's a way to keep the standard syntax inside of a stored procedure and get low logical reads, like the sp_ExecuteSql method does. It seems completely crazy to me to build up a string... not to mention it makes it harder to maintain, debug, visualize..
If we convert the SQL to a string, then call sp_ExecuteSQL on it, the reads are almost nil...
Because your query is no longer evaluating an OR, which as you can see kills sargability
The query plan is cached when using sp_executesql; SQL Server doesn't have to do a hard parse...
Excellent resource: The Curse & Blessing of Dynamic SQL
As long as you are using parameterized queries, you should be safe from SQL injection attacks.
This is another variation on the optional parameter technique:
SELECT * FROM dbo.MyTableName t1
WHERE t1.ThisField = 'test'
AND t1.MyField = COALESCE(@MyOptionalParam, t1.MyField)
I'm pretty sure it will have the same performance problem though. If performance is #1 then you'll probably be stuck with forking logic and near duplicate queries or building strings which is equally painful in TSQL.
You're using "OR" clause (implicitly and explicitly) on the first two SQL statements. Last one is an "AND" criteria. "OR" is always more expensive than "AND" criteria. No you're not crazy, should be expected.
EDIT: Adding link to similar question/answer with context as to why the union / if...else approach works better than OR logic (FYI, Remus, the answerer in this link, used to work on the SQL Server team developing service broker and other technologies)
Change from using the "or" syntax to a union approach, you'll see 2 seeks that should keep your logical read count as low as possible:
SELECT * FROM dbo.MyTableName t1
WHERE t1.ThisField = 'test'
AND @MyOptionalParam IS NULL
union all
SELECT * FROM dbo.MyTableName t1
WHERE t1.ThisField = 'test'
AND t1.MyField = @MyOptionalParam
If you want to de-duplicate the results, use a "union" instead of "union all".
EDIT: Demo showing that the optimizer is smart enough to rule out scan with a null variable value in UNION:
if object_id('tempdb..#data') > 0
drop table #data
go
-- Put in some data
select top 1000000
cast(a.name as varchar(100)) as thisField, cast(newid() as varchar(50)) as myField
into #data
from sys.columns a
cross join sys.columns b
cross join sys.columns c;
go
-- Show count
select count(*) from #data;
go
-- Index on thisField
create clustered index ixc__blah__temp on #data (thisField);
go
set statistics io on;
go
-- Query with a null parameter value
declare @MyOptionalParam varchar(50);
select *
from #data d
where d.thisField = 'test'
and @MyOptionalParam is null;
go
-- Union query
declare @MyOptionalParam varchar(50);
select *
from #data d
where d.thisField = 'test'
and @MyOptionalParam is null
union all
select *
from #data d
where d.thisField = 'test'
and d.myField = '5D25E9F8-EA23-47EE-A954-9D290908EE3E';
go
-- Union query with value
declare @MyOptionalParam varchar(50);
select @MyOptionalParam = '5D25E9F8-EA23-47EE-A954-9D290908EE3E'
select *
from #data d
where d.thisField = 'test'
and @MyOptionalParam is null
union all
select *
from #data d
where d.thisField = 'test'
and d.myField = '5D25E9F8-EA23-47EE-A954-9D290908EE3E';
go
if object_id('tempdb..#data') > 0
drop table #data
go
Change from using the "or" syntax to a two query approach, you'll see 2 different plans that should keep your logical read count as low as possible:
IF @MyOptionalParam is null
BEGIN
    SELECT *
    FROM dbo.MyTableName t1
END
ELSE
BEGIN
    SELECT *
    FROM dbo.MyTableName t1
    WHERE t1.MyField = @MyOptionalParam
END
You need to fight your programmer's urge to reduce duplication here. Realize you are asking for two fundamentally different execution plans and require two queries to produce two plans.