I just discovered the idea of testing a stored proc by calling it from within a BEGIN TRAN t1 ROLLBACK TRAN t1 pair.
I am a bit afraid of this. Is that a common practice? Is it reliable?
My goal here is to quickly test a stored proc that reads and updates 2 databases (same server). The SP does not do any truncate but uses a table variable combined with an INSERT .. OUTPUT statement.
The volume will be low (fewer than 1000 rows affected).
Thanks
There are a few things that can go wrong:
The proc could do its own transaction management
It could execute non-transactable statements like CREATE DATABASE
It could have an error, causing the transaction to automatically roll back. If the proc then continues to run in some way, it might write stuff outside of a transaction
XACT_ABORT might be used inconsistently, causing the previously mentioned effect
In general, this is a good technique, though.
Truncate is transacted, btw.
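As a sketch of the pattern in question (the procedure, parameter, and table names here are hypothetical), the whole test run looks like this; the caveats above about internal transaction management and XACT_ABORT still apply:
BEGIN TRAN t1;

-- Hypothetical proc under test; it reads and updates two databases on the same server
EXEC dbo.usp_SyncOrders @BatchId = 42;

-- Inspect the results while the transaction is still open
SELECT TOP (100) * FROM dbo.Orders WHERE BatchId = 42;

ROLLBACK TRAN t1;

-- Should return 0: nothing is left open and the writes are undone
SELECT @@TRANCOUNT AS OpenTranCount;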
Hear me out! I know this use case sounds suspect, but...
I have a stored procedure which checks a table (effectively a cache) for data for a given requested ID. If it doesn't find any data for that ID, or deems it out of date, it executes a second stored procedure which will pull data from a separate database (using dynamic SQL, source DB name is based on the requested ID) and insert it into the local table. It then selects from this table.
If the data is in the table, everything returns quickly (ms), but if it needs to be brought in from the other database, it takes about 10 seconds. We're seeing race conditions where two concurrent instances check the local cache, see that something is missing, and queue up sequential ingestions of the remote data into the cache. To avoid double-insertion, the cache-populating procedure clears whatever is already there for this ID, but this just means the first instance of the procedure can end up selecting no rows, because the second instance deleted the just-inserted records before re-inserting them itself.
I think I want to put a lock around the entire procedure (checking the cache, potentially populating the cache, selecting from the cache) - although I'm open to other solutions. I think the overall caching approach has to remain on-demand though, the remote databases come and go by the hundreds, and we only want to cache the ones actually requested by reporting as-needed.
BEGIN TRANSACTION;
BEGIN TRY
    -- Take out a lock intended to prevent anyone else modifying the cache while we're reading and potentially modifying it
    EXEC sp_getapplock @Resource = '[private].[cache_entries]', @LockOwner = 'Transaction', @LockMode = 'Exclusive', @LockTimeout = 120000;
    -- Invoke a stored procedure that ingests any required data that is not already cached
    EXEC [private].populate_cache @required_dbs;
    -- CALCULATIONS
    -- ... SELECT FROM [private].cache_entries
    COMMIT TRANSACTION; -- Free the lock
END TRY
BEGIN CATCH -- Ensure we release our lock on failure
    ROLLBACK TRANSACTION;
    THROW;
END CATCH;
The alternative to sp_getapplock is to use locking hints within your transaction. Both are reasonable approaches. Locking hints can be complex, but they protect the target object itself rather than a single code path, so they are sometimes necessary. sp_getapplock is simple (with Transaction as owner) and reliable.
You can do this without sp_getapplock, which tends to inhibit concurrency a lot.
The way to do this is to continue doing your checks within a transaction, but to apply a HOLDLOCK hint as well as an UPDLOCK hint.
HOLDLOCK (equivalent to the SERIALIZABLE isolation level) places a lock not only on the ID you specify, but also on the absence of such data; in other words, it prevents anyone else from inserting rows for that ID.
You must use both of these hints, as well as have an index that matches that SELECT, otherwise you could run into major blocking and deadlocking problems due to full table scans.
Also, you don't need a CATCH and ROLLBACK. Just use SET XACT_ABORT ON, which ensures a rollback in the event of any error.
SET XACT_ABORT ON; -- always have this set
BEGIN TRANSACTION;

DECLARE @SomeData nvarchar(100) = (
    SELECT ce.SomeColumn
    FROM [private].cache_entries ce WITH (HOLDLOCK, UPDLOCK)
    WHERE ce.SomeCondition = 1
);

IF @SomeData IS NULL
BEGIN
    -- Invoke a stored procedure that ingests any required data that is not already cached
    EXEC [private].populate_cache @required_dbs;
END

-- CALCULATIONS
-- ... SELECT FROM [private].cache_entries

COMMIT TRANSACTION; -- Free the lock
We have large, daily flows of data within our data mart. Some of the largest, done with stored procedures managed by SSIS, take several hours. These long-running stored procedures are preventing the transaction log from clearing (which compounds the issue because we have numerous SPs running at once, which are then all writing to the T-log with no truncate). Eventually this breaks our database and we're forced to recover from the morning snapshot.
We have explored doing "sub"-commits within the SP, but as I understand it you can't fully release the transaction log within an active stored procedure, because it is itself a transaction.
Without refactoring our large SPs to run in batches, or something to that effect, is it possible to commit to the transaction log periodically within an active SP, so that we release the lock on the transaction log?
edit / extension:
Perhaps I was wrong above:
Will committing intermittently within the SP allow the transaction-log to truncate?
If the client starts a transaction, it's not recommended to COMMIT that transaction inside a stored procedure. It's not allowed to exit the stored procedure with a different @@TRANCOUNT than it was entered with.
The following pattern is technically allowed, although I have never seen it used in the real world:
use tempdb
if @@trancount > 0 rollback
go

drop table if exists T
create table T(id int identity)
go

create or alter procedure tranTest
as
begin
    insert into T default values
    commit transaction
    begin transaction
end
go

begin transaction
exec tranTest
select * from T
rollback
go 5
It would be deeply confusing for client code to rollback a transaction and not have the stored procedure's work rolled back.
If the client doesn't start a transaction, you can have multiple transactions inside a stored procedure, but the smallest granularity for a transaction is a single DML statement. So each INSERT, UPDATE, DELETE, or MERGE runs in its own transaction.
The practical solutions to this are, in descending order of goodness:
1) Increase the storage available to the log file to accommodate the transactions.
2) Refactor the ETL to use shorter transactions, possibly readying data in staging tables and loading or switching it in within a single, final transaction.
3) Refactor the ETL to run in smaller batches (a sketch of the batching pattern follows below).
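For option 3, here is a minimal sketch of the usual batching pattern; the table names, the 50,000-row chunk size, and the OUTPUT target are all hypothetical. Each chunk commits on its own, so the log space it used can be reused after the next log backup (or checkpoint, in SIMPLE recovery):
DECLARE @BatchSize int = 50000,
        @RowsMoved int = 1;

WHILE @RowsMoved > 0
BEGIN
    BEGIN TRAN;

    -- Move one chunk from the hypothetical staging table into the hypothetical archive table
    DELETE TOP (@BatchSize) src
    OUTPUT deleted.* INTO dbo.Orders_Archive
    FROM dbo.Orders_Staging AS src;

    SET @RowsMoved = @@ROWCOUNT;

    COMMIT TRAN;  -- each committed chunk keeps the active portion of the log small
END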
BEGIN TRAN
SELECT * FROM AnySchema.AnyTable
WHERE AnyColumn = SomeCondition
COMMIT
I know the transaction is not required here because it is just a SELECT, but I just want to know how bad a practice this is and whether it is going to be an overhead on the DB engine.
You may use a transaction on SELECT statements to ensure nobody else can update or delete records of the table while your batch of SELECT queries is executing.
Using WITH (NOLOCK):
Alternatively, you may also use WITH (NOLOCK) in your T-SQL:
SELECT * FROM AnySchema.AnyTable WITH(NOLOCK)
WHERE AnyColumn = SomeCondition
WITH (NOLOCK) is the equivalent of using READ UNCOMMITTED as a transaction isolation level. Here you run the risk of reading an uncommitted row that is subsequently rolled back, i.e. data that never made it into the database.
So, while it can prevent reads being deadlocked by other operations, it comes with a risk.
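For comparison, the same read-uncommitted behavior can be requested for the whole session instead of per table hint; this is just a sketch of the equivalence, not a recommendation:
-- Session-level equivalent of adding WITH (NOLOCK) to every table in the query
SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED;

SELECT * FROM AnySchema.AnyTable
WHERE AnyColumn = SomeCondition;

-- Restore the default isolation level afterwards
SET TRANSACTION ISOLATION LEVEL READ COMMITTED;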
TRANSACTION block:
Using a TRANSACTION block will not cause much extra load on the DB engine, but if you keep up this practice and at some point you (or your developers) forget to close a transaction, other processes can't work on the same table.
Anyway, it depends on what type of application you have. If updates and selects are very frequent, it is advised not to use such transaction blocks. If there is a medium level of updates and selects occurring next to each other, you may use transaction blocks around selects (but make sure to close the transaction).
That's actually a good question. To understand what's going on, you need to know about SET IMPLICIT_TRANSACTIONS.
When ON, the system is in implicit transaction mode. This means that if @@TRANCOUNT = 0, any of a number of Transact-SQL statements (INSERT, UPDATE, DELETE, SELECT, and others) begins a new transaction. It is equivalent to an unseen BEGIN TRANSACTION being executed first.
When OFF, each of those T-SQL statements is bounded by an unseen BEGIN TRANSACTION and an unseen COMMIT TRANSACTION statement. When OFF, we say the transaction mode is autocommit. [This is the default.]
If your T-SQL code visibly issues a BEGIN TRANSACTION, we say the transaction mode is explicit.
https://msdn.microsoft.com/en-us/library/ms187807.aspx
Since SQL Server would have created a transaction for you anyway, doing it manually doesn't actually change anything. The exact same thing would have happened either way.
Summary: What you are doing isn't 'wrong' because it has no effect, but unnecessary and confusing to the reader.
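For anyone who wants to see the three modes side by side, here is a small sketch of the difference in @@TRANCOUNT, using a hypothetical single-column identity table:
-- Autocommit (the default): each statement is its own transaction
INSERT INTO dbo.Demo DEFAULT VALUES;  -- committed immediately
SELECT @@TRANCOUNT;                   -- 0

-- Explicit: you set the transaction boundary yourself
BEGIN TRANSACTION;
INSERT INTO dbo.Demo DEFAULT VALUES;
SELECT @@TRANCOUNT;                   -- 1
COMMIT TRANSACTION;

-- Implicit: the INSERT silently opens a transaction that you must end yourself
SET IMPLICIT_TRANSACTIONS ON;
INSERT INTO dbo.Demo DEFAULT VALUES;
SELECT @@TRANCOUNT;                   -- 1, even though no BEGIN TRANSACTION was issued
COMMIT TRANSACTION;
SET IMPLICIT_TRANSACTIONS OFF;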
I think you would be taking a HOLDLOCK and most likely a TABLOCK.
table hints
That is not always a good thing, as you would block any updates or deletes (maybe even inserts).
It would be better to let SQL Server decide what level of locks to take, most likely page locks. I would stay away from NOLOCK, as bad stuff can happen.
On a select on a single table, just let the optimizer do its thing.
I run a lot of queries that perform INSERTs, INSERT ... SELECTs, UPDATEs and ALTERs on tables, and when developing these queries, the intermediate steps that are run to test that various parts of the query work can potentially change the table or the data within it.
Is it possible to do a dry run of a query and have SQL Management Studio give you what the results would be, without actually modifying the data or the table structure?
At the moment I have to back up the database and run the query. If it works, good; if it doesn't, I have to restore the database (which can take around an hour), and I'm trying to avoid wasting all this time restoring databases.
Use a SQL transaction to make your changes, then back them out.
Before you execute your script:
BEGIN TRANSACTION;
After you execute your script and have done your checking:
ROLLBACK TRANSACTION;
Every change in your script will then be undone.
Note: Make sure you don't have a COMMIT in your script!
Begin the transaction, perform the table operations, and rollback as shown below:
BEGIN TRAN
UPDATE C
SET column1 = 'XXX'
FROM table1 C
SELECT *
FROM table1
WHERE column1 = 'XXX'
ROLLBACK TRAN
This will roll back all the operations performed since the beginning of this transaction (i.e. since the last commit).
I am working on pymssql, a python MSSQL driver. I have encountered an interesting situation that I can't seem to find documentation for. It seems that when a CREATE TABLE statement fails, the transaction it was run in is implicitly rolled back:
-- shows 0
select @@TRANCOUNT

BEGIN TRAN

-- will cause an error
INSERT INTO foobar values ('baz')

-- shows 1 as expected
select @@TRANCOUNT

-- will cause an error
CREATE TABLE badschema.t1 (
    test1 CHAR(5) NOT NULL
)

-- shows 0, this is not expected
select @@TRANCOUNT
I would like to understand why this is happening and know if there are docs that describe the situation. I am going to code around this behavior in the driver, but I want to make sure that I do so for any other error types that implicitly roll back a transaction.
NOTE
I am not concerned here with typical transactional behavior. I specifically want to know why an implicit rollback is given in the case of the failed CREATE statement but not with the INSERT statement.
Here is the definitive guide to error handling in SQL Server:
http://www.sommarskog.se/error-handling-I.html
It's long, but in a good way, and it was written for SQL Server 2000, but most of it is still accurate. The part you're looking for is here:
http://www.sommarskog.se/error-handling-I.html#whathappens
In your case, the article says that SQL Server is performing a batch abortion, and that it will take this measure in the following situations:
Most conversion errors, for instance conversion of non-numeric string to a numeric value.
Superfluous parameter to a parameterless stored procedure.
Exceeding the maximum nesting-level of stored procedures, triggers and functions.
Being selected as a deadlock victim.
Mismatch in number of columns in INSERT-EXEC.
Running out of space for data file or transaction log.
There's a bit more to it than this, so make sure to read the entire section.
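If you do handle these errors with TRY/CATCH, a common sketch is to check XACT_STATE() in the CATCH block, because a batch-aborting error can leave the transaction doomed (XACT_STATE() = -1), in which case it can only be rolled back:
BEGIN TRY
    BEGIN TRANSACTION;
    -- ... statements that might hit one of the batch-aborting errors above ...
    COMMIT TRANSACTION;
END TRY
BEGIN CATCH
    -- XACT_STATE() = -1 means the transaction is doomed and must be rolled back;
    -- XACT_STATE() = 1 means it is still active, but we roll back here as well.
    IF XACT_STATE() <> 0
        ROLLBACK TRANSACTION;
    THROW;
END CATCH;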
It is often, but not always, the point of a transaction to roll back the entire thing if any part of it fails:
http://www.firstsql.com/tutor5.htm
One of the most common reasons to use transactions is when you need the action to be atomic:
An atomic operation in computer science refers to a set of operations that can be combined so that they appear to the rest of the system to be a single operation with only two possible outcomes: success or failure.
en.wikipedia.org/wiki/Atomic_(computer_science)
It's probably not documented, because, if I understand your example correctly, it is assumed you intended that functionality by beginning a transaction with BEGIN TRAN
If you run this as one batch (which I did the first time), the transaction stays open because the INSERT aborts the batch and CREATE TABLE is not run. Only if you run it line-by-line does the transaction get rolled back.
You can also generate an implicit rollback for the INSERT by setting SET XACT_ABORT ON.
My guess (just had a light bulb moment as I typed the sentence above) is that CREATE TABLE uses SET XACT_ABORT ON internally, which means an implicit rollback in practice.
Some more stuff from me on SO about SET XACT_ABORT (we use it in all our code because it releases locks and rolls back TXNs on client CommandTimeout)
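To illustrate that last point, here is a small sketch using a duplicate-key insert as the error (a constraint violation is chosen because it is a statement-level error by default, so the difference XACT_ABORT makes is visible; the demo table is made up):
CREATE TABLE #demo (id int PRIMARY KEY);
INSERT INTO #demo VALUES (1);
GO

SET XACT_ABORT OFF;
BEGIN TRAN;
INSERT INTO #demo VALUES (1);  -- PK violation: the statement fails but the transaction stays open
SELECT @@TRANCOUNT;            -- 1
ROLLBACK;
GO

SET XACT_ABORT ON;
BEGIN TRAN;
INSERT INTO #demo VALUES (1);  -- the same error now aborts the batch and rolls the transaction back
GO
SELECT @@TRANCOUNT;            -- 0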