Procedure Advice - sql-server-2012

I'm looking to boost performance on one of our processes within a production database. We have 2 sets of SPs which are driven by configuration settings stored within a configuration table.
An example syntax would be:
DECLARE @SWITCH INT

IF @SWITCH = 1
    INSERT INTO DEST_TABLE_A
    SELECT VALUES
    FROM SOURCE_TABLE

IF @SWITCH = 2
    INSERT INTO DEST_TABLE_B
    SELECT VALUES
    FROM SOURCE_TABLE
Would it be better practice in this instance to move the IF logic into the WHERE clause, creating a standardized statement instead of branching with a conditional?
E.g.
INSERT INTO DEST_TABLE_A
SELECT VALUES
FROM SOURCE_TABLE
WHERE @SWITCH = 1

INSERT INTO DEST_TABLE_B
SELECT VALUES
FROM SOURCE_TABLE
WHERE @SWITCH = 2
I appreciate this might be an opinion piece but I was curious to see if anyone else has had experience with this scenario.

The second example might lead you to a parameter sniffing problem (longer explanation here).
This issue is caused by the query optimizer, which generates an execution plan optimized for one of the values of the switch variable (the value you send the first time you call the stored procedure).
In your case, if you call the stored procedure with @SWITCH = 1 the first time, an execution plan is generated for that parameter. Subsequent calls with @SWITCH = 2 might take significantly longer to process.
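If you keep the single-statement form, one common mitigation is to add OPTION (RECOMPILE) so each INSERT is compiled with the current value of @SWITCH. A minimal sketch, assuming hypothetical destination and source tables with a single COL1 column:
DECLARE @SWITCH INT = 2;

INSERT INTO DEST_TABLE_A (COL1)
SELECT COL1
FROM SOURCE_TABLE
WHERE @SWITCH = 1
OPTION (RECOMPILE); -- plan compiled with the runtime value of @SWITCH

INSERT INTO DEST_TABLE_B (COL1)
SELECT COL1
FROM SOURCE_TABLE
WHERE @SWITCH = 2
OPTION (RECOMPILE);
The recompile adds a small compilation cost on every call, so it is mainly worthwhile when the two plans differ significantly.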

Is there a short hand method of retrieving a value from a function in an insert?

I have some tSQLt tests which are using magic numbers for some static data IDs, and I'm trying to make them more self-documenting by using a function.
Currently I'm using this, but it's a bit wordier than I would like, and I was hoping there was a short-form method I could use without the extra brackets around the function. I know I could do this more efficiently by declaring the IDs as variables, but as this is for tests my priority is more on the readability/self-documenting side.
INSERT INTO dbo.tAccessProfileAreaRight (id, AccessProfileId, AccessAreaRightId)
VALUES (1, 1, (SELECT dbo.GetAccessAreaRightId('Purchase Orders', 'Authorise'))),
(2, 1, (SELECT dbo.GetAccessAreaRightId('Purchase Invoices', 'Authorise'))),
You want your test to be more readable, which is a great goal to aim at. However, your chosen way might not be optimal.
In general, a test should insert the data needed for the test in all tables accessed by the code under test. So in this case I suggest you insert a row into dbo.tAccessAreaRight and a row into dbo.tAccessProfile before inserting into dbo.tAccessProfileAreaRight.
To not be hindered by existing or potential future constraints on the tables, use tSQLt.FakeTable.
That would make your test look something like this:
CREATE PROCEDURE MyTestClass.[test ... is doing ... when ...]
AS
BEGIN
    -- Assemble
    EXEC tSQLt.FakeTable 'dbo.tAccessProfileAreaRight';
    EXEC tSQLt.FakeTable 'dbo.tAccessAreaRight';
    EXEC tSQLt.FakeTable 'dbo.tAccessProfile';

    INSERT INTO dbo.tAccessProfile (id) VALUES (1042);
    INSERT INTO dbo.tAccessAreaRight (id) VALUES (5013), (5017);
    INSERT INTO dbo.tAccessProfileAreaRight (id, AccessProfileId, AccessAreaRightId)
    VALUES (1, 1042, 5013),
           (2, 1042, 5017);

    -- Act
    INSERT INTO #SomeTableThatYouNeedToCreateFirst
    EXEC dbo.ProcedureUnderTest @AccessProfileId = 1042;

    -- Assert
    -- Do what you need to do to make sure the code behaves correctly,
    -- for example using EXEC tSQLt.AssertEqualsTable.
END
GO
Because we are using tSQLt.FakeTable, you do not need to worry about the columns you do not need, so in the three inserts above, just include the columns that are actually accessed by the code under test. For that same reason, you do want to explicitly list the columns, even if you end up using all columns in a table. That way, if an unrelated piece of functionality later requires an additional column in the table, this test will be unaffected by that change.
I find this pattern leads not only to more immediately understandable tests, as it is clear where each "magic number" came from; it also makes your test independent of an unrelated piece of code that might at some point change or simply stop being maintained, which could lead to random test failures, something we should strive to avoid.
You could do an Insert Into with a Select.
INSERT INTO dbo.tAccessProfileAreaRight (id, AccessProfileId, AccessAreaRightId)
select 1,1,dbo.GetAccessAreaRightId('Purchase Orders', 'Authorise')
UNION select 2,1,dbo.GetAccessAreaRightId('Purchase Invoices', 'Authorise')
UNION select ...
Also you don't have to specify the columns for the INSERT INTO if the select columns match the table columns exactly.
INSERT INTO dbo.tAccessProfileAreaRight
select 1,1,dbo.GetAccessAreaRightId('Purchase Orders', 'Authorise')
UNION select 2,1,dbo.GetAccessAreaRightId('Purchase Invoices', 'Authorise')

T-Sql - Select query in another select query takes long time

I have a procedure with arguments, but calling it takes a very long time. I decided to check what is wrong with my query and came to the conclusion that the problem is Column IN (SELECT [...]).
Both queries return 1500 rows.
First query: 45 seconds
Second query: 0 seconds
1.
declare @FILTER_OPTION int
declare @ID_DISTRIBUTOR type_int_value
declare @ID_DATA_TYPE type_bigint_value
declare @ID_AGGREGATION_TYPE type_int_value

set @FILTER_OPTION = 8
insert into @ID_DISTRIBUTOR values (19)
insert into @ID_DATA_TYPE values (30025)
insert into @ID_AGGREGATION_TYPE values (10)

SELECT * FROM dbo.[DATA] WHERE
    [ID_DISTRIBUTOR] IN (select [VALUE] from @ID_DISTRIBUTOR)
    AND [ID_DATA_TYPE] IN (select [VALUE] from @ID_DATA_TYPE)
    AND [ID_AGGREGATION_TYPE] IN (select [VALUE] from @ID_AGGREGATION_TYPE)
2.
select * FROM dbo.[DATA] WHERE
[ID_DISTRIBUTOR] IN (19)
AND [ID_DATA_TYPE] IN (30025)
AND [ID_AGGREGATION_TYPE] IN (10)
Why is this happening?
How should I create a stored procedure that takes an array of arguments and still runs quickly?
Edit:
Maybe it's a problem with indexes? Indexes are created on these three columns.
For such a large performance difference, I would guess that you have one or more indexes. In particular, if you have an index on (ID_DISTRIBUTOR, ID_DATA_TYPE, ID_AGGREGATION_TYPE), then the second query can make use of the index. SQL Server can recognize that the IN is really = and the query is a simple lookup.
In the first case, SQL Server doesn't "know" that the subqueries really have only one row in them. That requires a different set of optimizations. In particular, the above index cannot be used, because the IN generally optimizes differently from =.
As for what to do: first, look at the execution plans so you can see the difference between the two versions. Then test the second version with more than one value in the IN lists.
If you can live with just one value for each comparison, then use = rather than IN.
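As a minimal sketch of that last suggestion (the scalar variable names and the OPTION (RECOMPILE) hint are assumptions, not part of the original query), the single-value case could look like this:
DECLARE @Distributor INT = 19,
        @DataType BIGINT = 30025,
        @AggregationType INT = 10;

SELECT *
FROM dbo.[DATA]
WHERE [ID_DISTRIBUTOR] = @Distributor
  AND [ID_DATA_TYPE] = @DataType
  AND [ID_AGGREGATION_TYPE] = @AggregationType
OPTION (RECOMPILE); -- lets the optimizer use the actual values instead of a generic estimate
With an index on (ID_DISTRIBUTOR, ID_DATA_TYPE, ID_AGGREGATION_TYPE) this becomes a simple seek.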

Calling stored procedure to insert multiple values

In our application we have a multiline grid which has many records. For inserting or updating we call a stored procedure.
As per the current implementation, the stored procedure is called for each line in the grid. For each line it checks for existence in the table: if the data is already there, it updates the table, else it inserts new data into the table.
Instead of calling the procedure for each line, we thought of creating a table-valued parameter and passing all the grid values at the same time.
My questions are:
Is it a good approach?
How do I handle the existence check (for insert or update) if I pass the values as a table-valued parameter? Do I need to loop through the table and check it?
Is it better to have separate stored procedures for insert and update?
Please provide your suggestions. Thanks in advance.
1) A TVP is a good approach, and a single stored-proc call with fewer round trips to the database is more efficient.
2) You haven't made it clear whether each row in the grid has some kind of ID column that determines if the data exists in the table. Assuming there is one, make sure that it is indexed, then use INSERT INTO and UPDATE statements like this:
To add new rows:
INSERT INTO [grid_table]
SELECT * FROM [table_valued_parameter]
WHERE [id_column] NOT IN (SELECT [id_column] FROM [grid_table])
To update existing rows:
UPDATE gt
SET gt.col_A = tvp.col_A,
gt.col_B = tvp.col_B,
gt.col_C = tvp.col_C,
...
gt.col_Z = tvp.col_Z
FROM [grid_table] gt
INNER JOIN [table_valued_parameter] tvp ON gt.id_column = tvp.id_column
NB:
There is no need to do an IF EXISTS() or anything similar, as the WHERE and JOIN clauses run the same checks, so there is no need for a 'pre-check' before running each statement.
This assumes the TVP data has the same structure as the table in the database.
YOU MUST make sure the id_column is indexed.
I've used 'INNER JOIN' instead of just 'JOIN' to make the point that it is an inner join.
3) Using the approach above you need just one stored proc: simple and effective.
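A rough sketch of what that single procedure could look like, assuming a hypothetical dbo.GridRowType table type and reusing the grid_table/id_column/col_A/col_B names from above:
CREATE TYPE dbo.GridRowType AS TABLE
(
    id_column INT PRIMARY KEY, -- the primary key gives the TVP an index for the joins below
    col_A NVARCHAR(100),
    col_B NVARCHAR(100)
);
GO
CREATE PROCEDURE dbo.usp_UpsertGridRows
    @Rows dbo.GridRowType READONLY -- TVP parameters must be declared READONLY
AS
BEGIN
    SET NOCOUNT ON;

    -- Update rows that already exist in the target table
    UPDATE gt
    SET gt.col_A = tvp.col_A,
        gt.col_B = tvp.col_B
    FROM grid_table gt
    INNER JOIN @Rows tvp ON gt.id_column = tvp.id_column;

    -- Insert rows that are not yet in the target table
    INSERT INTO grid_table (id_column, col_A, col_B)
    SELECT tvp.id_column, tvp.col_A, tvp.col_B
    FROM @Rows tvp
    WHERE tvp.id_column NOT IN (SELECT id_column FROM grid_table);
END
GO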
It's a good approach.
Anyway, try to put the iterating and checking logic at the object (application) level and do only the final insert/update in T-SQL. This reduces overhead for the RDBMS, as that kind of work is faster at the object level than as row-by-row operations in the RDBMS.
Don't create too many stored procedures, one for each type of operation; keep a minimal set of procedures that handle multiple operations based on the parameters you send to them.
Hope it helps!
Yes, it is a good approach. Calling the procedure for each row is bad for performance. TVPs make life easier.
Yes, you can do that check in the stored procedure; it should be a simple SELECT on the unique ID in most cases.
With this approach, yes, it is better to have both in the same stored procedure.
1) Using a TVP is a good approach, but send only new or updated rows as the TVP; there is no need to send the entire data grid.
2) For INSERT/UPDATE use MERGE, for example:
MERGE [dbo].[Contact] AS [Target]
USING @Contact AS [Source] ON [Target].[Email] = [Source].[Email]
WHEN MATCHED THEN
    UPDATE SET [FirstName] = [Source].[FirstName],
               [LastName] = [Source].[LastName]
WHEN NOT MATCHED THEN
    INSERT ([Email], [FirstName], [LastName])
    VALUES ([Source].[Email], [Source].[FirstName], [Source].[LastName]);
3) For your case one stored procedure is enough.

Increased Execution Duration of Procedure When Using Variables in WHERE Clause

I have a procedure executed on SQL Server 2008 R2; the script is:
DECLARE @LocalVar SMALLINT = dbo.GetLocalVarFunction();

SELECT
    [TT].[ID],
    [TT].[Title]
FROM [TargetTable] AS [TT]
LEFT JOIN [AcceccTable] AS [AT] ON [AT].[AccessID] = [TT].[ID]
WHERE
    (
        ((@LocalVar = 1) AND [AT].[Access] = 0 OR [AT].[Access] IS NULL) AND
        ([TT].[Level] > 7)
    );
GO
This procedure executes in 16 seconds.
But when I change the WHERE clause to:
WHERE
    (
        ((1=1) AND [AT].[Access] = 0 OR [AT].[Access] IS NULL) AND
        ([TT].[Level] > 7)
    );
the procedure executes in less than 1 second.
As you can see, I just removed the local variable.
So where is the problem? Is there anything I'm missing about using a local variable in a WHERE clause? Any suggestions to improve execution time when I use a local variable in the WHERE clause?
Update:
I also thought about adding an IF statement before the script and splitting the procedure into 2 procedures, but I have 4 or 5 variables like the one above, and using IF statements gets very complex.
Update 2:
I changed the assignment of @LocalVar to:
DECLARE @LocalVar SMALLINT = 1;
There is no change in execution time.
When you use local variables in a WHERE filter, it often causes a full table scan. The value of the local variable is not known to SQL Server at compile time, hence SQL Server creates an execution plan sized for the broadest range of values available for that column.
As you have seen, when you use 1=1, SQL Server knows the value and the performance is not degraded. But the moment you use a local variable, the value is unknown.
One solution may be to use OPTION (RECOMPILE) at the end of your SQL query.
You can also check out OPTIMIZE FOR UNKNOWN.
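Applied to the query from the question, the OPTION (RECOMPILE) suggestion would look something like this (a sketch only; everything except the hint comes from the post):
DECLARE @LocalVar SMALLINT = dbo.GetLocalVarFunction();

SELECT
    [TT].[ID],
    [TT].[Title]
FROM [TargetTable] AS [TT]
LEFT JOIN [AcceccTable] AS [AT] ON [AT].[AccessID] = [TT].[ID]
WHERE
    (
        ((@LocalVar = 1) AND [AT].[Access] = 0 OR [AT].[Access] IS NULL) AND
        ([TT].[Level] > 7)
    )
OPTION (RECOMPILE); -- the plan is compiled with the runtime value of @LocalVar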
When you use a local variable in the WHERE clause, the optimizer doesn't know what to do with it.
You may check this link.
What you could do in your case is run your query with the actual execution plan displayed in both cases and see how SQL Server treats them.
It seems that you are using @LocalVar as a branch condition, as follows:
If @LocalVar is 1 then apply a filter to the query.
If @LocalVar is 0 then return an empty result set.
IMO you would be better off writing this condition explicitly, as then SQL will be in a position to optimize separate plans for the 2 branches, i.e.
DECLARE @LocalVar SMALLINT = dbo.GetLocalVarFunction();

IF (@LocalVar = 1)
    SELECT
        [TT].[ID],
        [TT].[Title]
    FROM [TargetTable] AS [TT]
    LEFT JOIN [AcceccTable] AS [AT] ON [AT].[AccessID] = [TT].[ID]
    WHERE
        (
            ([AT].[Access] = 0 OR [AT].[Access] IS NULL) AND
            ([TT].[Level] > 7)
        )
ELSE
    SELECT
        [TT].[ID],
        [TT].[Title]
    FROM [TargetTable] AS [TT]
    WHERE 1=2 -- Or any always-false filter, to retain the empty result
And then, because there are now 2 branches through your stored procedure with radically different query plans, you should add WITH RECOMPILE to the stored proc.
Edit
Just to clarify the comments:
Note that placing OPTION(RECOMPILE) after a query means that the query plan is never cached - this might not be a good idea if your query is called frequently.
The WITH RECOMPILE at a PROC level prevents caching of branches through the proc. It is not the same as OPTION(RECOMPILE) at query level.
If there are a large number of permutations of filter in your query, then the 'branching' technique above doesn't scale very well - your code quickly becomes unmaintainable.
You might unfortunately then need to consider using parameterized dynamic SQL. SQL will then at least cache a separate plan for each permutation.
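A minimal sketch of that parameterized dynamic SQL approach, reusing the tables from the question (the flag handling and the @MinLevel parameter are assumptions):
DECLARE @LocalVar SMALLINT = dbo.GetLocalVarFunction();
DECLARE @sql NVARCHAR(MAX) = N'
SELECT [TT].[ID], [TT].[Title]
FROM [TargetTable] AS [TT]
LEFT JOIN [AcceccTable] AS [AT] ON [AT].[AccessID] = [TT].[ID]
WHERE [TT].[Level] > @MinLevel';

-- Append each optional predicate only when its flag is set;
-- every distinct combination of appended predicates gets its own cached plan.
IF @LocalVar = 1
    SET @sql += N' AND ([AT].[Access] = 0 OR [AT].[Access] IS NULL)';

EXEC sys.sp_executesql @sql, N'@MinLevel INT', @MinLevel = 7;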

Stored procedure bit parameter activating additional where clause to check for null

I have a stored procedure that looks like:
CREATE PROCEDURE dbo.usp_TestFilter
    @AdditionalFilter BIT = 1
AS
SELECT *
FROM dbo.SomeTable T
WHERE
    T.Column1 IS NOT NULL
    AND CASE WHEN @AdditionalFilter = 1 THEN
        T.Column2 IS NOT NULL
Needless to say, this doesn't work. How can I activate the additional WHERE clause that checks for the @AdditionalFilter parameter? Thanks for any help.
CREATE PROCEDURE dbo.usp_TestFilter
    @AdditionalFilter BIT = 1
AS
SELECT *
FROM dbo.SomeTable T
WHERE
    T.Column1 IS NOT NULL
    AND (@AdditionalFilter = 0 OR
         T.Column2 IS NOT NULL)
If @AdditionalFilter is 0, the column won't be evaluated since it can't affect the outcome of the part between brackets. If it's anything other than 0, the column condition will be evaluated.
This practice tends to confuse the query optimizer. I've seen SQL Server 2000 build the execution plan exactly the opposite way round and use an index on Column1 when the flag was set and vice-versa. SQL Server 2005 seemed to at least get the execution plan right on first compilation, but you then have a new problem. The system caches compiled execution plans and tries to reuse them. If you first use the query one way, it will still execute the query that way even if the extra parameter changes, and different indexes would be more appropriate.
You can force a stored procedure to be recompiled on this execution by using WITH RECOMPILE in the EXEC statement, or every time by specifying WITH RECOMPILE on the CREATE PROCEDURE statement. There will be a penalty as SQL Server re-parses and optimizes the query each time.
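For instance, a minimal sketch of both forms, using the procedure from the question:
-- Recompile for a single call:
EXEC dbo.usp_TestFilter @AdditionalFilter = 1 WITH RECOMPILE;

-- Or recompile on every call by creating the procedure with the option:
-- CREATE PROCEDURE dbo.usp_TestFilter @AdditionalFilter BIT = 1 WITH RECOMPILE AS ...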
In general, if the form of your query is going to change, use dynamic SQL generation with parameters. SQL Server will also cache execution plans for parameterized queries and auto-parameterized queries (where it tries to deduce which arguments are parameters), and even regular queries, but it gives most weight to stored procedure execution plans, then parameterized, auto-parameterized and regular queries in that order. The higher the weight, the longer it can stay in RAM before the plan is discarded, if the server needs the memory for something else.
CREATE PROCEDURE dbo.usp_TestFilter
    @AdditionalFilter BIT = 1
AS
SELECT *
FROM dbo.SomeTable T
WHERE
    T.Column1 IS NOT NULL
    AND (NOT @AdditionalFilter = 1 OR T.Column2 IS NOT NULL)
select *
from SomeTable t
where t.Column1 is not null
and (@AdditionalFilter = 0 or t.Column2 is not null)