Prevent parameter sniffing in stored procedures SQL Server 2008?

Prevent parameter sniffing in stored procedures SQL Server 2008? - sql

I have started creating a stored procedure that will search through my database table based on the passed parameters. So far I already heard about potential problems with kitchen sink parameter sniffing. There are a few articles that helped understand the problem but I'm still not 100% that I have a good solution. I have a few screens in the system that will search different tables in my database. All of them have three different criteria that the user will select and search on. First criteria are Status that can be Active,Inactive or All. Next will be Filter By, this can offer different options to the user depends on the table and the number of columns. Usually, users can select to filter by Name,Code,Number,DOB,Email,UserName or Show All. Each search screen will have at least 3 filters and one of them will be Show All. I have created a stored procedure where the user can search Status and Filter By Name,Code or Show All. One problem that I have is Status filter. Seems that SQL will check all options in where clause so If I pass parameter 1 SP returns all active records if I pass 0 then only inactive records. The problem is if I pass 2 SP should return all records (active and inactive) but I see only active records. Here is an example:
USE [TestDB]
GO
SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO
ALTER PROC [dbo].[Search_Master]
#Status BIT = NULL,
#FilterBy INT = NULL,
#Name VARCHAR(50) = NULL,
#Code CHAR(2) = NULL
WITH RECOMPILE
AS
DECLARE #MasterStatus INT;
DECLARE #MasterFilter INT;
DECLARE #MasterName VARCHAR(50);
DECLARE #MasterCode CHAR(2);
SET #MasterStatus = #Status;
SET #MasterFilter = #FilterBy;
SET #MasterName = #Name;
SET #MasterCode = #Code;
SELECT RecID, Status, Code, Name
FROM Master
WHERE
(
(#MasterFilter = 1 AND Name LIKE '%'+#MasterName+'%')
OR
(#MasterFilter = 2 AND Code = #MasterCode)
OR
(#MasterFilter = 3 AND #MasterName IS NULL AND #MasterCode IS NULL)
)
AND
(
(#MasterStatus != 2 AND MasterStatus = #Status)
OR
(#MasterStatus = 2 AND 1=1)
);
Other than problem with Status filter I'm wondering if there is any other issues that I might have with parameter sniffing? I found a blog that talks about preventing sniffing and one way to do that is by declaring local variables. If anyone have suggestions or solution for Status filter please let me know.

On your Status issue, I believe the problem is that your BIT parameter isn't behaving as you're expecting it to. Here's a quick test to demonstrate:
DECLARE #bit BIT;
SET #bit = 2
SELECT #bit AS [2=What?];
--Results
+---------+
| 2=What? |
+---------+
| 1 |
+---------+
From our friends at Microsoft:
Converting to bit promotes any nonzero value to 1.
When you pass in your parameter as 2, the engine does an implicit conversion from INTEGER to BIT, and your non-zero value becomes a 1.
You'll likely want to change the data type on that parameter, then use some conditional logic inside your procedure to deal with the various possible values as you want them to be handled.
On the issue of parameter sniffing, 1) read the article Sean suggests in the comments, but 2) if you keep that WITH RECOMPILE on your procedure, parameter sniffing can't happen.
The issue (but still read the article) is that SQL Server uses the first set of parameters you send through the proc to store an execution plan, but subsequent parameters require substantially different plans. Adding WITH RECOMPILE is forcing a new execution plan on every iteration, which has some overhead, but is may well be exactly what you want to do in your situation.
As a closing thought, SQL Server 2008 ended mainstream support in 2014 and extended support ends on 7/9/2019. An upgrade might be a good idea.

Related

Same query, different result after removing USE DatabaseName GO

My function GetProductDesc (when called) returns a different result after commenting out USE DatabaseName GO. I don't even know where to start debugging this. The pictures tell the story. I had to blur out a lot but you can see that the results are clearly different. Keep in mind that the pictures are not the function code, they are calling the function GetProductDesc
So strange. Any suggestions? I have an expert helping me later today but I had to share.
EDIT:
The function uses another lookup table in the same database. There is no Top or Order By clause. It calculates the product description based on the input components (numbers). It will return a different result if the input numbers are different, but here the input numbers are the same!
The function has been in place and working for over 5 years. I believe the problem started at about the time the version of SQL Server was updated recently.
EDIT 2 with partial answer:
The problem is caused by ##RowCount. It appears to be a breaking change caused by our recent migration to SQL Server 2019 although I haven't found the problem documented. The function returns a different product description based on ##RowCount following a Select statement. Internally the function does something like this:
SELECT Fields FROM Table WHERE Field = #Variable
IF ##Rowcount = 1
Return ProdDesc1
ELSE
Return ProdDesc2
After the SQL Server migration ##RowCount here was different depending on whether
USE DatabaseName
GO
was present.
The solution was to replace ##Rowcount with a variable #RowCount. This new code works:
DECLARE #RowCount INT = 0
SELECT Fields, #RowCount = #RowCount + 1
FROM Table WHERE Field = #Variable
IF #RowCount = 1
Return ProdDesc1
ELSE
Return ProdDesc2
If you have SQL Server 2019 installed try this to recreate the problem:
USE Master
GO
Select ##ROWCOUNT
The result here is ##ROWCOUNT = 0
Now comment out the two top lines:
--USE Master
--GO
Select ##ROWCOUNT
The result is now ##ROWCOUNT = 1
Anybody know why?

There is a SQL Server 2019 cumulative update from Microsoft that fixes this problem.

Correct usage of WHERE and AND in procedure with optional parameters

Pre-question info:
I'm writing a stored-procedure that would take some parameters and depending on those parameters(if they are filled - because they don't have to be) I'm adding few where clauses. The thing is I don't know if I'm gonna even use the where clause from start because I don't know if any of my params is going to be non-empty/not-null.
The inside of procedure looks cca like:
BEGIN
DECLARE #strMySelect varchar(max)
SET #strMySelect ='SELECT myparams FROM mytable'
// add some WHERE statement(*)
IF(ISNULL(#myParamDate1,'')<>'')BEGIN
SET #strMySelect =#strMySelect +'
AND param1 >='''+CAST(#myParamDate1 as varchar(30))+''''
END
IF(ISNULL(#myParamDate2,'')<>'')BEGIN
SET #strMySelect =#strMySelect +'
AND param1 <='''+CAST(#myParamDate2 as varchar(30))+''''
END
//... bit more of these "AND"s
EXECUTE(#strExec)
QUESTION:
Is it ok(correct way of doing this) to put in my query some WHERE statement that I know that will be always true so I can use in my parameter cases AND always? OR do I have to check for each param if it's first one that is filled or is there an easy way of checking in SQL that at least one of my parameters isn't NULL/empty?

I handle optional parameters like this:
where (
(#optionalParameter is not null and someField = #optionalParameter )
or
#optionalParameter is null
)
etc
I find it simpler.

Your extra where clause is not a problem from a performance point-of-view, since the query optimizer will (likely) remove the 1 = 1 condition anyway.
However, I would recommend a solution along the lines of what Dan Bracuk suggested for two reasons:
It is easier to read, write and debug.
You avoid the possibility of SQL injection attacks.
There are cases where you have to custom-build your query-string (e.g. when given a table name as parameter), but I would avoid it whenever possible.

You don't need to use EXEC function to check for parameters. A good practice is using case to check for parameter value for example
CREATE PROCEDURE MyProc
#Param1 int = 0
AS
BEGIN
SELECT * FROM MyTable WHERE CASE #param1 WHEN 0 THEN #param1 ELSE MyField END = #Param1
END
GO
In case that #param1 has no value (default 0) then you have #param1=#param1 which gives always true, in case you have #param with value then condition is MyField=#param1.

Update a variable number of parameters in a Stored Procedure

Assume the following please:
I have a table that has ~50 columns. I receive an import file that has information that needs to be updated in a variable number of these columns. Is there a method via a stored procedure where I can only update the columns that need to be updated and the rest retain their current value unchanged. Keep in mind I am not saying that the unaffected columns return to some default value but actually maintain the current value stored in them.
Thank you very much for any response on this.
Is COALESCE something that could be applied here as I have read in this thread: Updating a Table through a stored procedure with variable parameters
Or am I looking at a method more similar to what is being explained here:
SQL Server stored procedure with optional parameters updates wrong columns
For the record my sql skills are quite weak, I am just beginning to dive into this end of the pool
JD

Yes, you can use COALESCE to do this. Basically if the parameter is passed in as NULL then it will use the original value. Below is a general pattern that you can adapt.
DECLARE #LastName NVARCHAR(50) = 'John'
DECLARE #FirstName NVARCHAR(50) = NULL;
DECLARE #ID INT = 1;
UPDATE dbo.UpdateExample
SET LastName = COALESCE(#LastName, LastName), FirstName = COALESCE(#FirstName, FirstName),
WHERE ID = #ID
Also, have a read of this article, titled: The Impact of Non-Updating Updates
http://web.archive.org/web/20180406220621/http://sqlblog.com:80/blogs/paul_white/archive/2010/08/11/the_2D00_impact_2D00_of_2D00_update_2D00_statements_2D00_that_2D00_don_2D00_t_2D00_change_2D00_data.aspx
Basically,
"SQL Server contains a number of optimisations to avoid unnecessary logging or page flushing when processing an UPDATE operation that will not result in any change to the persistent database."

No query plan for procedure in SQL Server 2005

We have a SQL Server DB with 150-200 stored procs, all of which produce a viewable query plan in sys.dm_exec_query_plan except for one. According to http://msdn.microsoft.com/en-us/library/ms189747.aspx:
Under the following conditions, no Showplan output is returned in the query_plan column of the returned table for sys.dm_exec_query_plan:
If the query plan that is specified by using plan_handle has been evicted from the plan cache, the query_plan column of the returned table is null. For example, this condition may occur if there is a time delay between when the plan handle was captured and when it was used with sys.dm_exec_query_plan.
Some Transact-SQL statements are not cached, such as bulk operation statements or statements containing string literals larger than 8 KB in size. XML Showplans for such statements cannot be retrieved by using sys.dm_exec_query_plan unless the batch is currently executing because they do not exist in the cache.
If a Transact-SQL batch or stored procedure contains a call to a user-defined function or a call to dynamic SQL, for example using EXEC (string), the compiled XML Showplan for the user-defined function is not included in the table returned by sys.dm_exec_query_plan for the batch or stored procedure. Instead, you must make a separate call to sys.dm_exec_query_plan for the plan handle that corresponds to the user-defined function.
And later..
Due to a limitation in the number of nested levels allowed in the xml data type, sys.dm_exec_query_plan cannot return query plans that meet or exceed 128 levels of nested elements.
I'm confident that none of these apply to this procedure. The result never has a query plan, no matter what the timing, so 1 doesn't apply. There are no long string literals or bulk operations, so 2 doesn't apply. There are no user defined functions or dynamic SQL, so 3 doesn't apply. And there's little nesting, so the last doesn't apply. In fact, it's a very simple proc, which I'm including in full (with some table names changed to protect the innocent). Note that the parameter-sniffing shenanigans postdate the problem. It still happens even if I use the parameters directly in the query. Any ideas on why I don't have a viewable query plan for this proc?
ALTER PROCEDURE [dbo].[spGetThreadComments]
#threadId int,
#stateCutoff int = 80,
#origin varchar(255) = null,
#includeComments bit = 1,
#count int = 100000
AS
if (#count is null)
begin
select #count = 100000
end
-- copy parameters to local variables to avoid parameter sniffing
declare #threadIdL int, #stateCutoffL int, #originL varchar(255), #includeCommentsL bit, #countL int
select #threadIdL = #threadId, #stateCutoffL = #stateCutoff, #originL = #origin, #includeCommentsL = #includeComments, #countL = #count
set rowcount #countL
if (#originL = 'Foo')
begin
select * from FooComments (nolock) where threadId = #threadId and statusCode <= #stateCutoff
order by isnull(parentCommentId, commentId), dateCreated
end
else
begin
if (#includeCommentsL = 1)
begin
select * from Comments (nolock)
where threadId = #threadIdL and statusCode <= #stateCutoffL
order by isnull(parentCommentId, commentId), dateCreated
end
else
begin
select userId, commentId from Comments (nolock)
where threadId = #threadIdL and statusCode <= #stateCutoffL
order by isnull(parentCommentId, commentId), dateCreated
end
end

Hmm, perhaps the tables aren't really tables. They could be views or something else.

try putting dbo. or whatever the schema is in front of all of the table names, and then check again.
see this article:
http://www.sommarskog.se/dyn-search-2005.html
quote from the article:
As you can see, I refer to all tables
in two-part notation. That is, I also
specify the schema (which in SQL
7/2000 parlance normally is referred
to as owner.) If I would leave out the
schema, each user would get his own
his own private version of the query
plan

Can we write a sub function or procedure inside another stored procedure

I want to check if SQL Server(2000/05/08) has the ability to write a nested stored procedure, what I meant is - WRITING a Sub Function/procedure inside another stored procedure. NOT calling another SP.
Why I was thinking about it is- One of my SP is having a repeated lines of code and that is specific to only this SP.So if we have this nested SP feature then I can declare another sub/local procedure inside my main SP and put all the repeating lines in that. and I can call that local sp in my main SP. I remember such feature is available in Oracle SPs.
If SQL server is also having this feature, can someone please explain some more details about it or provide a link where I can find documentation.
Thanks in advance
Sai

I don't recommend doing this as each time it is created a new execution plan must be calculated, but YES, it definitely can be done (Everything is possible, but not always recommended).
Here is an example:
CREATE PROC [dbo].[sp_helloworld]
AS
BEGIN
SELECT 'Hello World'
DECLARE #sSQL VARCHAR(1000)
SET #sSQL = 'CREATE PROC [dbo].[sp_helloworld2]
AS
BEGIN
SELECT ''Hello World 2''
END'
EXEC (#sSQL)
EXEC [sp_helloworld2];
DROP PROC [sp_helloworld2];
END
You will get the warning
The module 'sp_helloworld' depends on the missing object 'sp_helloworld2'.
The module will still be created; however, it cannot run successfully until
the object exists.
You can bypass this warning by using EXEC('sp_helloworld2') above.
But if you call EXEC [sp_helloworld] you will get the results
Hello World
Hello World 2

It does not have that feature. It is hard to see what real benefit such a feature would provide, apart from stopping the code in the nested SPROC from being called from elsewhere.

Oracle's PL/SQL is something of a special case, being a language heavily based on Ada, rather than simple DML with some procedural constructs bolted on. Whether or not you think this is a good idea probably depends on your appetite for procedural code in your DBMS and your liking for learning complex new languages.
The idea of a subroutine, to reduce duplication or otherwise, is largely foreign to other database platforms in my experience (Oracle, MS SQL, Sybase, MySQL, SQLite in the main).
While the SQL-building proc would work, I think John's right in suggesting that you don't use his otherwise-correct answer!
You don't say what form your repeated lines take, so I'll offer three potential alternatives, starting with the simplest:
Do nothing. Accept that procedural
SQL is a primitive language lacking
so many "essential" constructs that
you wouldn't use it at all if it
wasn't the only option.
Move your procedural operations outside of the DBMS and execute them in code written in a more sophisticated language. Consider ways in which your architecture could be adjusted to extract business logic from your data storage platform (hey, why not redesign the whole thing!)
If the repetition is happening in DML, SELECTs in particular, consider introducing views to slim down the queries.
Write code to generate, as part of your build process, the stored procedures. That way if the repeated lines ever need to change, you can change them in one place and automatically generate the repetition.
That's four. I thought of another one as I was typing; consider it a bonus.

CREATE TABLE #t1 (digit INT, name NVARCHAR(10));
GO
CREATE PROCEDURE #insert_to_t1
(
#digit INT
, #name NVARCHAR(10)
)
AS
BEGIN
merge #t1 AS tgt
using (SELECT #digit, #name) AS src (digit,name)
ON (tgt.digit = src.digit)
WHEN matched THEN
UPDATE SET name = src.name
WHEN NOT matched THEN
INSERT (digit,name) VALUES (src.digit,src.name);
END;
GO
EXEC #insert_to_t1 1,'One';
EXEC #insert_to_t1 2,'Two';
EXEC #insert_to_t1 3,'Three';
EXEC #insert_to_t1 4,'Not Four';
EXEC #insert_to_t1 4,'Four'; --update previous record!
SELECT * FROM #t1;
What we're doing here is creating a procedure that lives for the life of the connection and which is then later used to insert some data into a table.

John's sp_helloworld does work, but here's the reason why you don't see this done more often.
There is a very large performance impact when a stored procedure is compiled. There's a Microsoft article on troubleshooting performance problems caused by a large number of recompiles, because this really slows your system down quite a bit:
http://support.microsoft.com/kb/243586
Instead of creating the stored procedure, you're better off just creating a string variable with the T-SQL you want to call, and then repeatedly executing that string variable.
Don't get me wrong - that's a pretty bad performance idea too, but it's better than creating stored procedures on the fly. If you can persist this code in a permanent stored procedure or function and eliminate the recompile delays, SQL Server can build a single execution plan for the code once and then reuse that plan very quickly.

I just had a similar situation in a SQL Trigger (similar to SQL procedure) where I basically had same insert statement to be executed maximum 13 times for 13 possible key values that resulted of 1 event. I used a counter, looped it 13 times using DO WHILE and used CASE for each of the key values processing, while kept a flag to figure out when I need to insert and when to skip.

it would be very nice if MS develops GOSUB besides GOTO, an easy thing to do!
Creating procedures or functions for "internal routines" polute objects structure.
I "implement" it like this
BODY1:
goto HEADER HEADER_RET1:
insert into body ...
goto BODY1_RET
BODY2:
goto HEADER HEADER_RET2:
INSERT INTO body....
goto BODY2_RET
HEADER:
insert into header
if #fork=1 goto HEADER_RET1
if #fork=2 goto HEADER_RET2
select 1/0 --flow check!

I too had need of this. I had two functions that brought back case counts to a stored procedure, which was pulling a list of all users, and their case counts.
Along the lines of
select name, userID, fnCaseCount(userID), fnRefCount(UserID)
from table1 t1
left join table2 t2
on t1.userID = t2.UserID
For a relatively tiny set (400 users), it was calling each of the two functions one time. In total, that's 800 calls out from the stored procedure. Not pretty, but one wouldn't expect a sql server to have a problem with that few calls.
This was taking over 4 minutes to complete.
Individually, the function call was nearly instantaneous. Even 800 near instantaneous calls should be nearly instantaneous.
All indexes were in place, and SSMS suggested no new indexes when the execution plan was analyzed for both the stored procedure and the functions.
I copied the code from the function, and put it into the SQL query in the stored procedure. But it appears the transition between sp and function is what ate up the time.
Execution time is still too high at 18 seconds, but allows the query to complete within our 30 second time out window.
If I could have had a sub procedure it would have made the code prettier, but still may have added overhead.
I may next try to move the same functionality into a view I can use in a join.
select t1.UserID, t2.name, v1.CaseCount, v2.RefCount
from table1 t1
left join table2 t2
on t1.userID = t2.UserID
left join vwCaseCount v1
on v1.UserID = t1.UserID
left join vwRefCount v2
on v2.UserID = t1.UserID
Okay, I just created views from the functions, so my execution time went from over 4 minutes, to 18 seconds, to 8 seconds. I'll keep playing with it.

I agree with andynormancx that there doesn't seem to be much point in doing this.
If you really want the shared code to be contained inside the SP then you could probably cobble something together with GOTO or dynamic SQL, but doing it properly with a separate SP or UDF would be better in almost every way.

For whatever it is worth, here is a working example of a GOTO-based internal subroutine. I went that way in order to have a re-useable script without side effects, external dependencies, and duplicated code:
DECLARE #n INT
-- Subroutine input parameters:
DECLARE #m_mi INT -- return selector
-- Subroutine output parameters:
DECLARE #m_use INT -- instance memory usage
DECLARE #m_low BIT -- low memory flag
DECLARE #r_msg NVARCHAR(max) -- low memory description
-- Subroutine internal variables:
DECLARE #v_low BIT, -- low virtual memory
#p_low BIT -- low physical memory
DECLARE #m_gra INT
/* ---------------------- Main script ----------------------- */
-- 1. First subroutine invocation:
SET #m_mi = 1 GOTO MemInfo
MI1: -- return here after invocation
IF #m_low = 1 PRINT '1:Low memory'
ELSE PRINT '1:Memory OK'
SET #n = 2
WHILE #n > 0
BEGIN
-- 2. Second subroutine invocation:
SET #m_mi = 2 GOTO MemInfo
MI2: -- return here after invocation
IF #m_low = 1 PRINT '2:Low memory'
ELSE PRINT '2:Memory OK'
SET #n = #n - 1
END
GOTO EndOfScript
MemInfo:
/* ------------------- Subroutine MemInfo ------------------- */
-- IN : #m_mi: return point: 1 for label MI1 and 2 for label MI2
-- OUT: #m_use: RAM used by isntance,
-- #m_low: low memory condition
-- #r_msg: low memory message
SET #m_low = 1
SELECT #m_use = physical_memory_in_use_kb/1024,
#p_low = process_physical_memory_low ,
#v_low = process_virtual_memory_low
FROM sys.dm_os_process_memory
IF #p_low = 1 OR #v_low = 1 BEGIN
SET #r_msg = 'Low memory.' GOTO LOWMEM END
SELECT #m_gra = cntr_value
FROM sys.dm_os_performance_counters
WHERE counter_name = N'Memory Grants Pending'
IF #m_gra > 0 BEGIN
SET #r_msg = 'Memory grants pending' GOTO LOWMEM END
SET #m_low = 0
LOWMEM:
IF #m_mi = 1 GOTO MI1
IF #m_mi = 2 GOTO MI2
EndOfScript:

Thank you all for your replies!
I'm better off then creating one more SP with the repeating code and call that, which is the best way interms of performance and look wise.

We Keep Coding

sql objective-c vba vb.net react-native apache vue.js tensorflow api pandas