Having TRANSACTION In All Queries

Do you think always having a transaction around the SQL statements in a stored procedure is a good practice? I'm just about to optimize this legacy application in my company, and one thing I found is that every stored procedure has BEGIN TRANSACTION. Even a procedure with a single select or update statement has one. I thought it would be nice to have BEGIN TRANSACTION if performing multiple actions, but not just one action. I may be wrong, which is why I need someone else to advise me. Thanks for your time, guys.

It is entirely unnecessary, as each SQL statement executes atomically, i.e. as if it were already running in its own transaction. In fact, opening unnecessary transactions can lead to increased locking, even deadlocks. Forgetting to match COMMITs with BEGINs can leave a transaction open for as long as the connection to the database is open, and interfere with other transactions on the same connection.
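A minimal sketch of that failure mode (dbo.SomeTable is a hypothetical table):
BEGIN TRAN
UPDATE dbo.SomeTable SET SomeCol = 1 WHERE Id = 42 -- write lock taken
-- ... the matching COMMIT is forgotten ...
SELECT @@TRANCOUNT -- returns 1: the transaction, and its locks, stay open until the connection closes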
Such coding almost certainly means that whoever wrote the code was not very experienced in database programming and is a sure smell that there may be other problems as well.

The only possible reason I could see for this is if you have the possibility of needing to roll back the transaction for a reason other than a SQL failure.
However, if the code is literally
begin transaction
statement
commit
Then I see absolutely no reason to use an explicit transaction, and it's probably being done because it's always been done that way.

I don't know of any benefit of not just using autocommit transactions for these statements.
A possible disadvantage of using explicit transactions everywhere is that it adds clutter to the code, making it harder to see when an explicit transaction is genuinely being used to ensure correctness over multiple statements.
It also increases the risk that a transaction is left open holding locks unless care is taken (e.g. with SET XACT_ABORT ON).
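As a sketch of that mitigation (the table and column names are made up), a template that cannot strand an open transaction after a run-time error:
SET XACT_ABORT ON -- any run-time error now rolls back and closes the transaction
BEGIN TRAN
UPDATE dbo.Accounts SET Balance -= 100 WHERE Id = 1
UPDATE dbo.Accounts SET Balance += 100 WHERE Id = 2
COMMIT TRAN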
There is also a minor performance implication, as shown in @8kb's answer. The following illustrates it another way, using the Visual Studio profiler.
Setup
(Testing against an empty table)
CREATE TABLE T (X INT)
Explicit
SET NOCOUNT ON
DECLARE @X INT
WHILE ( 1 = 1 )
BEGIN
    BEGIN TRAN
    SELECT @X = X
    FROM T
    COMMIT
END
Auto Commit
SET NOCOUNT ON
DECLARE @X INT
WHILE ( 1 = 1 )
BEGIN
    SELECT @X = X
    FROM T
END
Both of them end up spending time in CMsqlXactImp::Begin and CMsqlXactImp::Commit, but the explicit-transactions case spends a significantly greater proportion of its execution time in these methods, and hence less time doing useful work.
+--------------------------------+----------+----------+
|                                |   Auto   | Explicit |
+--------------------------------+----------+----------+
| CXStmtQuery::ErsqExecuteQuery  |  35.16%  |  25.06%  |
| CXStmtQuery::XretSchemaChanged |  20.71%  |  14.89%  |
| CMsqlXactImp::Begin            |   5.06%  |  13.00%  |
| CMsqlXactImp::Commit           |  12.41%  |  24.03%  |
+--------------------------------+----------+----------+

When performing multiple inserts/updates/deletes, it is better to have a transaction to ensure atomicity: it guarantees that all the tasks of the operation are executed, or none of them.
For a single insert/update/delete statement, it depends on what kind of operation (from the business-layer perspective) you are performing and how important it is. If you perform some calculation before the single insert/update/delete, then it is better to use a transaction; some data may have changed after you retrieved it for the insert/update/delete.
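A minimal sketch of the multi-statement case (hypothetical order tables): either both rows are written, or the CATCH block undoes the first one:
BEGIN TRY
    BEGIN TRAN
    INSERT dbo.OrderHeader (OrderId, CustomerId) VALUES (1, 42)
    INSERT dbo.OrderLine (OrderId, Product, Qty) VALUES (1, 'Widget', 3)
    COMMIT TRAN
END TRY
BEGIN CATCH
    IF @@TRANCOUNT > 0 ROLLBACK TRAN -- undo the partial work
END CATCH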

One plus point is that you can later add another INSERT (for example) and the procedure is already transactionally safe.
Then again, you also have the problem of nested transactions if a stored procedure calls another one: an inner rollback will cause error 266 (see the sketch below).
If every call is simple CRUD with no nesting then it's pointless; but if you nest, or have multiple writes per transaction, then it's good to have a consistent template.
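A sketch of the error 266 trap (hypothetical procedures): an inner ROLLBACK undoes the outer transaction as well and leaves the transaction counts mismatched.
CREATE PROC dbo.InnerProc AS
BEGIN
    BEGIN TRAN
    -- ... some work fails ...
    ROLLBACK TRAN -- rolls back the OUTER transaction too; @@TRANCOUNT drops to 0
END
GO
CREATE PROC dbo.OuterProc AS
BEGIN
    BEGIN TRAN
    EXEC dbo.InnerProc -- raises error 266 on return: transaction count mismatch
    COMMIT TRAN
END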

You mentioned that you'll be optimizing this legacy app.
One of the first, and easiest, things you can do to improve performance is remove all the BEGIN TRAN and COMMIT TRAN for the stored procedures that only do SELECTs.
Here is a simple test to demonstrate:
/* Compare basic SELECT times with and without a transaction */
DECLARE @date DATETIME2
DECLARE @noTran INT
DECLARE @withTran INT
SET @noTran = 0
SET @withTran = 0
DECLARE @t TABLE (ColA INT)
INSERT @t VALUES (1)
DECLARE
    @count INT,
    @value INT
SET @count = 1
WHILE @count < 1000000
BEGIN
    SET @date = GETDATE()
    SELECT @value = ColA FROM @t WHERE ColA = 1
    SET @noTran = @noTran + DATEDIFF(MICROSECOND, @date, GETDATE())
    SET @date = GETDATE()
    BEGIN TRAN
    SELECT @value = ColA FROM @t WHERE ColA = 1
    COMMIT TRAN
    SET @withTran = @withTran + DATEDIFF(MICROSECOND, @date, GETDATE())
    SET @count = @count + 1
END
SELECT
    @noTran / 1000000. AS Seconds_NoTransaction,
    @withTran / 1000000. AS Seconds_WithTransaction
/** Results **/
Seconds_NoTransaction Seconds_WithTransaction
--------------------------------------- ---------------------------------------
14.23600000 18.08300000
You can see there is a definite overhead associated with the transactions.
Note: this assumes these stored procedures are not using any special isolation levels or locking hints (for something like handling pessimistic concurrency). In that case, obviously, you would want to keep them.
So to answer the question, I would only leave in the transactions where you are actually attempting to preserve the integrity of the data modifications in case of an error in the code, SQL Server, or the hardware.

I can only say that placing a transaction block like this in every stored procedure might be a novice's work.
A transaction should be placed only around a block that has more than one insert/update statement; other than that, there is no need to place a transaction block in the stored procedure.

BEGIN TRANSACTION / COMMIT syntax shouldn't be used in every stored procedure by default unless you are trying to cover the following scenarios:
You include the WITH MARK option because you want to support restoring the database from a backup to a specific point in time (a sketch follows this list).
You intend to port the code from SQL Server to another database platform like Oracle. Oracle does not commit transactions by default.
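A sketch of the first scenario (the names are illustrative): a marked, named transaction that RESTORE ... WITH STOPATMARK can later target.
BEGIN TRANSACTION NightlyPriceUpdate WITH MARK 'Before nightly price update'
UPDATE dbo.Prices SET Amount *= 1.05 -- hypothetical table
COMMIT TRANSACTION NightlyPriceUpdate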

Related

SQL number generation in a concurrent environment (Transaction isolation level)

I am working with an application that generates invoice numbers (sequentially, based on a few parameters), and so far it has been using a trigger with a serialized transaction. Because the trigger is rather "heavy", it manages to time out the execution of the insert query.
I'm now working on a solution to that problem, and so far I have a stored procedure that does the insert, followed by a transaction with isolation level SERIALIZABLE (which, by the way, applies to that transaction only, or should I set it back after the transaction has been committed?) that:
gets the number
if not found do the insert into that table and if found updates the number (increment)
commits the transaction
I'm wondering whether there's a better way to ensure the number is used once and gets incremented with the table locked (only the number table gets locked, right?).
I read about sp_getapplock, would that be somewhat a better way to achieve my goal?
I would optimize the routine for update (and handle "insert if not there" separately), at which point it would be:
declare @number int;
update tbl
set @number = number, number += 1
where year = @year and month = @month and office = @office and type = @type;
You don't need any specific locking hints or isolation levels; SQL Server will ensure no two transactions read the same value before incrementing.
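A hedged sketch of the full routine with the insert handled separately (same hypothetical table as above; a unique key on (year, month, office, type) plus a retry would be needed to guard the rare first-use insert race):
declare @number int;
update tbl
set @number = number, number += 1
where year = @year and month = @month and office = @office and type = @type;
if @@ROWCOUNT = 0
begin
    -- no row yet: hand out 1 and seed the counter with the next value
    insert tbl (year, month, office, type, number)
    values (@year, @month, @office, @type, 2);
    set @number = 1;
end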
If you'd like to avoid handling the insert separately, you can:
merge into tbl
using (values (@year, @month, @office, @type)) as v(year, month, office, type)
on tbl.year = v.year and tbl.month = v.month and tbl.office = v.office and tbl.type = v.type
when not matched by target then
    insert (year, month, office, type, number) values (@year, @month, @office, @type, 1)
when matched then
    update set @number = tbl.number, tbl.number += 1
;
Logically this should provide the same guard against the race condition as the update does, but I don't remember where I saw the proof.
If you first insert and then update, you have a time window where an invalid number is set and can be observed. Further, if the second transaction fails (which can always happen), you have inconsistent data.
Try this:
Take a fresh number in tran 1.
Insert in tran 2 with the number that was already taken.
That way you might burn a number, but there will never be inconsistent data.
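A sketch of that two-step pattern (hypothetical tables; the counter update commits on its own, so a failed insert merely burns the number):
declare @number int;
begin tran -- tran 1: take a fresh number
update dbo.NumberTable set @number = Number, Number += 1 where [Year] = @year;
commit tran
begin tran -- tran 2: use the number that was already taken
insert dbo.Invoice ([Year], SequenceNumber, DateCreated) values (@year, @number, GETUTCDATE());
commit tran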

Conditionally inserting records into a table in multithreaded environment based on a count

I am writing a T-SQL stored procedure that conditionally adds a record to a table only if the number of similar records is below a certain threshold, 10 in the example below. The problem is this will be run from a web application, so it will run on multiple threads, and I need to ensure that the table never has more than 10 similar records.
The basic gist of the procedure is:
BEGIN
    DECLARE @c INT
    SELECT @c = COUNT(*)
    FROM foo
    WHERE bar = @a_param
    IF @c < 10
    BEGIN
        INSERT INTO foo (bar)
        VALUES (@a_param)
    END
END
I think I could solve any potential concurrency problems by replacing the select statement with:
SELECT @c = COUNT(*) FROM foo WITH (TABLOCKX, HOLDLOCK) WHERE bar = @a_param
But I am curious whether there are any methods other than lock hints for managing concurrency problems in T-SQL.
One option would be to use the sp_getapplock system stored procedure. You can place your critical-section logic in a transaction and use the built-in locking of SQL Server to ensure synchronized access.
Example:
CREATE PROC MyCriticalWork(@MyParam INT)
AS
DECLARE @LockRequestResult INT
SET @LockRequestResult = 0
DECLARE @MyTimeoutMilliseconds INT
SET @MyTimeoutMilliseconds = 5000 -- wait at most five seconds, then time out
BEGIN TRAN
EXEC @LockRequestResult = sp_getapplock 'MyCriticalWork', 'Exclusive', 'Transaction', @MyTimeoutMilliseconds
IF (@LockRequestResult >= 0)
BEGIN
    /*
    DO YOUR CRITICAL READS AND WRITES HERE
    */
    COMMIT TRAN -- releases the lock
END
ELSE
    ROLLBACK TRAN
Use SERIALIZABLE. By definition it provides you the illusion that your transaction is the only transaction running. Be aware that this might result in blocking and deadlocking; in fact, this SQL code is a classic candidate for deadlocking: two transactions might first read a set of rows, then both try to modify that set of rows. Locking hints are the classic way of solving that problem. Retry also works.
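A minimal retry sketch (needs SQL Server 2012+ for THROW; the critical section is a placeholder):
DECLARE @retries INT = 3
WHILE @retries > 0
BEGIN
    BEGIN TRY
        BEGIN TRAN
        -- critical reads and writes here
        COMMIT TRAN
        BREAK -- success: stop retrying
    END TRY
    BEGIN CATCH
        IF @@TRANCOUNT > 0 ROLLBACK TRAN;
        SET @retries -= 1;
        IF ERROR_NUMBER() <> 1205 OR @retries = 0 THROW; -- only retry deadlock victims (error 1205)
    END CATCH
END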
As stated in the comment: why are you trying to insert on multiple threads? You cannot write to a table faster on multiple threads.
But you don't need a DECLARE; an INSERT ... SELECT with a HAVING clause can do the count check in one statement:
insert into [Table_1] (ID, fname, lname)
select 3, 'fname', 'lname'
from [Table_1]
where ID = 3
having COUNT(*) <= 10
If you need to take a lock then do so
The data is not 3NF; you should start any design with a proper data model.
Why rule out a table lock? That could very well be the best approach.
Really, what are the chances? Even without a lock you would need two sessions at a count of 9 submitting at exactly the same time, and even then it would stop at 11. Is the 10 an absolute hard limit?

Getting deadlocks on an MS SQL stored procedure performing a read/update (where to put code to handle deadlocks)

I have to admit I'm just learning about properly handling deadlocks but based on suggestions I read, I thought this was the proper way to handle it. Basically I have many processes trying to 'reserve' a row in the database for an update. So I first read for an available row, then write to it. Is this not the right way? If so, how do I need to fix this SP?
CREATE PROCEDURE [dbo].[reserveAccount]
    -- Add the parameters for the stored procedure here
    @machineId varchar(MAX)
AS
BEGIN
    SET TRANSACTION ISOLATION LEVEL READ COMMITTED;
    BEGIN TRANSACTION;
    declare @id BIGINT;
    set @id = (select min(id) from Account_Data where passfail is null and reservedby is null);
    update Account_data set reservedby = @machineId where ID = @id;
    COMMIT TRANSACTION;
END
You can write this as a single statement, which may fix the update problem:
update Account_data
set reservedby = @machineId
where ID = (select min(id) from Account_Data where passfail is null and reservedby is null);
Well, your problem is that you have two statements: a select and an update. If those run concurrently, the select will take a read lock and the update will then demand a write lock; with two machines doing this at the same time, you have a deadlock.
The simple solution is to make the initial select demand an update lock (WITH (ROWLOCK, UPDLOCK) as a hint). That may or may not work (it depends on what else goes on), but it is a good start.
The second step, if that fails, is to use an application-level lock (sp_getapplock) that makes sure a critical section always has only one owner and thus executes its transactions serially.

sql server: Is this nesting in a transaction sufficient for getting a unique number from the database?

I want to generate a unique number from a table.
It has to be thread safe, of course: when I check for the last number and get '3', and then store '4' in the database, I don't want anybody else in between those two actions (get the number and store it one higher in the database) to also get '3' back and then also store '4'.
So I thought, put it in a transaction like this:
begin transaction
declare @maxNum int
select @maxNum = MAX(SequenceNumber) from invoice
where YEAR = @year
if @maxNum is null
begin
    set @maxNum = 0
end
set @maxNum = @maxNum + 1
INSERT INTO [Invoice]
    ([Year]
    ,[SequenceNumber]
    ,[DateCreated])
VALUES
    (@year
    ,@maxNum
    ,GETUTCDATE()
    )
commit transaction
return @maxNum
But I wondered: is it enough to put it in a transaction?
My first thought was: it locks this SP for usage by other people, but is that correct? How can SQL Server know what to lock at the first step?
Will this construction guarantee that nobody else can do the select @maxNum part just when I am updating the @maxNum value, receiving the same @maxNum as I did, leaving me in trouble?
I hope you understand what I want to accomplish, and whether I chose the right solution.
EDIT:
also described as 'How to Single-Thread a stored procedure'
If you want to have the year and a sequence number stored in the database, and create an invoice number from that, I'd use:
an InvoiceYear column (which could totally be computed as YEAR(InvoiceDate))
an InvoiceID INT IDENTITY column, which you could reset every year to 1 (see the reseed sketch at the end of this answer)
a computed column InvoiceNumber, created as:
ALTER TABLE dbo.InvoiceTable
ADD InvoiceNumber AS CAST(InvoiceYear AS VARCHAR(4)) + '-' +
RIGHT('000000' + CAST(InvoiceID AS VARCHAR(6)), 6) PERSISTED
This way, you automagically get invoice numbers:
2010-000001
......
2010-001234
......
2010-123456
Of course, if you need more than 6 digits (= 1 million invoices) - just adjust the RIGHT() and CAST() statements for the InvoiceID column.
Also, since this is a persisted computed column, you can index it for fast retrieval.
This way: you don't have to worry about concurrency, stored procs, transactions and stuff like that - SQL Server will do that for you - for free!
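And for the "reset every year to 1" part, a hedged sketch of the yearly job (assumes InvoiceID is the IDENTITY column on dbo.InvoiceTable):
DBCC CHECKIDENT ('dbo.InvoiceTable', RESEED, 0) -- the next inserted InvoiceID will be 1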
No, it's not enough. The shared lock set by the select will not prevent anyone from reading that same value at the same time.
Change this:
select @maxNum = MAX(SequenceNumber) from invoice where YEAR = @year
To this:
select @maxNum = MAX(SequenceNumber) from invoice with (updlock, holdlock) where YEAR = @year
This way you replace the shared lock with an update lock, and two update locks are not compatible with each other.
The holdlock means that the lock is to be held until the end of the transaction. So you do still need the transaction bit.
Note that this will not help if some other procedure also wants to do the update. If that other procedure reads the value without providing the updlock hint, it will still be able to read the previous value of the counter. This may be a good thing, as it improves concurrency in scenarios where the other readers do not intend to make an update later, but it may also not be what you want, in which case either update all procedures to use updlock, or use xlock instead to place an exclusive lock, which is not compatible with shared locks.
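For completeness, the xlock variant of the same select would be:
select @maxNum = MAX(SequenceNumber) from invoice with (xlock, holdlock) where YEAR = @year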
As it turned out, I didn't want to lock the table; I just wanted to execute the stored procedure one at a time.
In C# code I would place a lock on another object, and that's what was discussed here:
http://www.sqlservercentral.com/Forums/Topic357663-8-1.aspx
So that's what I used:
declare @Result int
EXEC @Result =
    sp_getapplock @Resource = 'holdit1', @LockMode = 'Exclusive', @LockTimeout = 10000 -- time to wait for the lock
IF @Result < 0
BEGIN
    ROLLBACK TRAN
    RAISERROR('Procedure Already Running for holdit1 - Concurrent execution is not supported.',16,9)
    RETURN(-1)
END
where 'holdit1' is just a name for the lock.
@Result returns 0 or 1 if it succeeds in getting the lock (0 when the lock is granted immediately, 1 when it is granted after waiting).

Can we write a sub function or procedure inside another stored procedure

I want to check whether SQL Server (2000/05/08) has the ability to write a nested stored procedure. What I mean is WRITING a sub function/procedure inside another stored procedure, NOT calling another SP.
Why I was thinking about it: one of my SPs has repeated lines of code that are specific to only this SP. If we had this nested SP feature, I could declare a sub/local procedure inside my main SP, put all the repeating lines in it, and call that local SP from my main SP. I remember such a feature being available in Oracle SPs.
If SQL Server also has this feature, can someone please explain some more details about it or provide a link to the documentation?
Thanks in advance
Sai
I don't recommend doing this, as each time it is created a new execution plan must be calculated, but YES, it definitely can be done (everything is possible, but not always recommended).
Here is an example:
CREATE PROC [dbo].[sp_helloworld]
AS
BEGIN
SELECT 'Hello World'
DECLARE @sSQL VARCHAR(1000)
SET @sSQL = 'CREATE PROC [dbo].[sp_helloworld2]
AS
BEGIN
SELECT ''Hello World 2''
END'
EXEC (@sSQL)
EXEC [sp_helloworld2];
DROP PROC [sp_helloworld2];
END
You will get the warning
The module 'sp_helloworld' depends on the missing object 'sp_helloworld2'.
The module will still be created; however, it cannot run successfully until
the object exists.
You can bypass this warning by using EXEC('sp_helloworld2') above.
But if you call EXEC [sp_helloworld] you will get the results
Hello World
Hello World 2
It does not have that feature. It is hard to see what real benefit such a feature would provide, apart from stopping the code in the nested SPROC from being called from elsewhere.
Oracle's PL/SQL is something of a special case, being a language heavily based on Ada, rather than simple DML with some procedural constructs bolted on. Whether or not you think this is a good idea probably depends on your appetite for procedural code in your DBMS and your liking for learning complex new languages.
The idea of a subroutine, to reduce duplication or otherwise, is largely foreign to other database platforms in my experience (Oracle, MS SQL, Sybase, MySQL, SQLite in the main).
While the SQL-building proc would work, I think John's right in suggesting that you don't use his otherwise-correct answer!
You don't say what form your repeated lines take, so I'll offer three potential alternatives, starting with the simplest:
Do nothing. Accept that procedural SQL is a primitive language lacking so many "essential" constructs that you wouldn't use it at all if it wasn't the only option.
Move your procedural operations outside of the DBMS and execute them in code written in a more sophisticated language. Consider ways in which your architecture could be adjusted to extract business logic from your data storage platform (hey, why not redesign the whole thing!)
If the repetition is happening in DML, SELECTs in particular, consider introducing views to slim down the queries.
Write code to generate, as part of your build process, the stored procedures. That way if the repeated lines ever need to change, you can change them in one place and automatically generate the repetition.
That's four. I thought of another one as I was typing; consider it a bonus.
CREATE TABLE #t1 (digit INT, name NVARCHAR(10));
GO
CREATE PROCEDURE #insert_to_t1
(
    @digit INT
    , @name NVARCHAR(10)
)
AS
BEGIN
    merge #t1 AS tgt
    using (SELECT @digit, @name) AS src (digit, name)
    ON (tgt.digit = src.digit)
    WHEN matched THEN
        UPDATE SET name = src.name
    WHEN NOT matched THEN
        INSERT (digit, name) VALUES (src.digit, src.name);
END;
GO
EXEC #insert_to_t1 1,'One';
EXEC #insert_to_t1 2,'Two';
EXEC #insert_to_t1 3,'Three';
EXEC #insert_to_t1 4,'Not Four';
EXEC #insert_to_t1 4,'Four'; --update previous record!
SELECT * FROM #t1;
What we're doing here is creating a procedure that lives for the life of the connection and which is then later used to insert some data into a table.
John's sp_helloworld does work, but here's the reason why you don't see this done more often.
There is a very large performance impact when a stored procedure is compiled. There's a Microsoft article on troubleshooting performance problems caused by a large number of recompiles, because this really slows your system down quite a bit:
http://support.microsoft.com/kb/243586
Instead of creating the stored procedure, you're better off just creating a string variable with the T-SQL you want to call, and then repeatedly executing that string variable.
Don't get me wrong - that's a pretty bad performance idea too, but it's better than creating stored procedures on the fly. If you can persist this code in a permanent stored procedure or function and eliminate the recompile delays, SQL Server can build a single execution plan for the code once and then reuse that plan very quickly.
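A sketch of that string-variable approach (illustrative only; sp_executesql with parameters lets the single ad hoc plan be reused):
DECLARE @sql NVARCHAR(200) = N'SELECT @d AS digit'
EXEC sp_executesql @sql, N'@d INT', @d = 1
EXEC sp_executesql @sql, N'@d INT', @d = 2 -- same text, so the plan is reused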
I just had a similar situation in a SQL trigger (similar to a SQL procedure) where I basically had the same insert statement to be executed a maximum of 13 times, for the 13 possible key values that resulted from one event. I used a counter, looped 13 times in a WHILE loop, used CASE to process each of the key values, and kept a flag to figure out when I needed to insert and when to skip.
It would be very nice if MS developed GOSUB alongside GOTO; it would be an easy thing to do!
Creating procedures or functions for "internal routines" pollutes the object structure.
I "implement" it like this:
BODY1:
goto HEADER
HEADER_RET1:
insert into body ...
goto BODY1_RET

BODY2:
goto HEADER
HEADER_RET2:
INSERT INTO body ....
goto BODY2_RET

HEADER:
insert into header
if @fork = 1 goto HEADER_RET1
if @fork = 2 goto HEADER_RET2
select 1/0 -- flow check!
I too had need of this. I had two functions that brought back case counts to a stored procedure, which was pulling a list of all users, and their case counts.
Along the lines of
select name, userID, dbo.fnCaseCount(userID), dbo.fnRefCount(UserID)
from table1 t1
left join table2 t2
    on t1.userID = t2.UserID
For a relatively tiny set (400 users), it was calling each of the two functions one time. In total, that's 800 calls out from the stored procedure. Not pretty, but one wouldn't expect a sql server to have a problem with that few calls.
This was taking over 4 minutes to complete.
Individually, the function call was nearly instantaneous. Even 800 near instantaneous calls should be nearly instantaneous.
All indexes were in place, and SSMS suggested no new indexes when the execution plan was analyzed for both the stored procedure and the functions.
I copied the code from the function, and put it into the SQL query in the stored procedure. But it appears the transition between sp and function is what ate up the time.
Execution time is still too high at 18 seconds, but allows the query to complete within our 30 second time out window.
If I could have had a sub procedure it would have made the code prettier, but still may have added overhead.
I may next try to move the same functionality into a view I can use in a join.
select t1.UserID, t2.name, v1.CaseCount, v2.RefCount
from table1 t1
left join table2 t2
on t1.userID = t2.UserID
left join vwCaseCount v1
on v1.UserID = t1.UserID
left join vwRefCount v2
on v2.UserID = t1.UserID
Okay, I just created views from the functions, so my execution time went from over 4 minutes, to 18 seconds, to 8 seconds. I'll keep playing with it.
I agree with andynormancx that there doesn't seem to be much point in doing this.
If you really want the shared code to be contained inside the SP then you could probably cobble something together with GOTO or dynamic SQL, but doing it properly with a separate SP or UDF would be better in almost every way.
For whatever it is worth, here is a working example of a GOTO-based internal subroutine. I went that way in order to have a re-useable script without side effects, external dependencies, and duplicated code:
DECLARE @n INT
-- Subroutine input parameters:
DECLARE @m_mi INT -- return selector
-- Subroutine output parameters:
DECLARE @m_use INT -- instance memory usage
DECLARE @m_low BIT -- low memory flag
DECLARE @r_msg NVARCHAR(max) -- low memory description
-- Subroutine internal variables:
DECLARE @v_low BIT, -- low virtual memory
    @p_low BIT -- low physical memory
DECLARE @m_gra INT
/* ---------------------- Main script ----------------------- */
-- 1. First subroutine invocation:
SET @m_mi = 1
GOTO MemInfo
MI1: -- return here after invocation
IF @m_low = 1 PRINT '1:Low memory'
ELSE PRINT '1:Memory OK'
SET @n = 2
WHILE @n > 0
BEGIN
    -- 2. Second subroutine invocation:
    SET @m_mi = 2
    GOTO MemInfo
MI2: -- return here after invocation
    IF @m_low = 1 PRINT '2:Low memory'
    ELSE PRINT '2:Memory OK'
    SET @n = @n - 1
END
GOTO EndOfScript
MemInfo:
/* ------------------- Subroutine MemInfo ------------------- */
-- IN : @m_mi: return point: 1 for label MI1 and 2 for label MI2
-- OUT: @m_use: RAM used by instance,
--      @m_low: low memory condition
--      @r_msg: low memory message
SET @m_low = 1
SELECT @m_use = physical_memory_in_use_kb/1024,
    @p_low = process_physical_memory_low,
    @v_low = process_virtual_memory_low
FROM sys.dm_os_process_memory
IF @p_low = 1 OR @v_low = 1
BEGIN
    SET @r_msg = 'Low memory.'
    GOTO LOWMEM
END
SELECT @m_gra = cntr_value
FROM sys.dm_os_performance_counters
WHERE counter_name = N'Memory Grants Pending'
IF @m_gra > 0
BEGIN
    SET @r_msg = 'Memory grants pending'
    GOTO LOWMEM
END
SET @m_low = 0
LOWMEM:
IF @m_mi = 1 GOTO MI1
IF @m_mi = 2 GOTO MI2
EndOfScript:
Thank you all for your replies!
I'm better off creating one more SP with the repeating code and calling that, which is the best way in terms of performance and looks.