How to lock and unlock a table exclusively outside of a transaction

In SQL Server, how can a table be locked and unlocked (exclusively) outside of a transaction?
Reason: multiple instances of my app need to interact with one table without stepping on each other's toes, while at the same time executing a statement that cannot run within a transaction.

One option might be to look into sp_getapplock and sp_releaseapplock.
This would not lock the table, but an arbitrary named resource. You could then configure your application to only work with the table in question after the lock has been acquired, e.g., through stored procedures.
An example of this would be something like:
EXEC sp_getapplock @Resource = 'ResourceName', @LockMode = 'Exclusive', @LockOwner = 'Session'
-- UPDATE table, etc.
EXEC sp_releaseapplock @Resource = 'ResourceName', @LockOwner = 'Session'
Specifying @LockOwner = 'Session' means you can use this locking mechanism outside of a transaction.
There's also the option of extracting the locking and releasing statements into their own stored procedures, so the logic is only stated once; these stored procedures could return a value to the calling procedure with a result specifying whether or not the lock has been acquired/released.
At that point it's just a question of ensuring that this mechanism is put into place for every procedure/table/etc. where there may be contention.
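As a sketch of that pattern (the resource name, timeout, and error handling here are illustrative assumptions, not from the original), a wrapper that acquires a session-owned lock and checks the return code might look like:

```sql
-- Sketch only: 'ResourceName' and the 30-second timeout are placeholders.
DECLARE @result int;

EXEC @result = sp_getapplock
    @Resource = 'ResourceName',
    @LockMode = 'Exclusive',
    @LockOwner = 'Session',
    @LockTimeout = 30000;  -- wait up to 30 seconds for the lock

IF @result >= 0  -- 0 = granted, 1 = granted after waiting; negative = failure
BEGIN
    BEGIN TRY
        -- work with the table here, outside of any transaction
        EXEC sp_releaseapplock @Resource = 'ResourceName', @LockOwner = 'Session';
    END TRY
    BEGIN CATCH
        -- Release on error too; a session-owned lock is otherwise held
        -- until the session disconnects.
        IF APPLOCK_MODE('public', 'ResourceName', 'Session') <> 'NoLock'
            EXEC sp_releaseapplock @Resource = 'ResourceName', @LockOwner = 'Session';
        THROW;
    END CATCH
END
```

Because a session-owned lock survives COMMIT and ROLLBACK, releasing it in the CATCH block matters; the APPLOCK_MODE check guards against releasing a lock that is no longer held.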

Related

Should I use sp_getapplock to prevent multiple instances of a stored procedure that conditionally inserts?

Hear me out! I know this use case sounds suspect, but...
I have a stored procedure which checks a table (effectively a cache) for data for a given requested ID. If it doesn't find any data for that ID, or deems it out of date, it executes a second stored procedure which will pull data from a separate database (using dynamic SQL, source DB name is based on the requested ID) and insert it into the local table. It then selects from this table.
If the data is in the table, everything returns quickly (milliseconds), but if it needs to be brought in from the other database, it takes about 10 seconds. We're seeing race conditions where two concurrent instances check the local cache, see something is missing, and queue up sequential ingestions of the remote data into the cache. To avoid double-insertion, the cache-populating procedure clears whatever is already there for this ID, but that just means the first instance of the procedure can end up selecting no rows, because the second instance deleted the just-inserted records before re-inserting them itself.
I think I want to put a lock around the entire procedure (checking the cache, potentially populating the cache, selecting from the cache), although I'm open to other solutions. I think the overall caching approach has to remain on-demand, though: the remote databases come and go by the hundreds, and we only want to cache the ones actually requested by reporting, as needed.
BEGIN TRANSACTION;
BEGIN TRY
    -- Take out a lock intended to prevent anyone else modifying the cache while we're reading and potentially modifying it
    EXEC sp_getapplock @Resource = '[private].[cache_entries]', @LockOwner = 'Transaction', @LockMode = 'Exclusive', @LockTimeout = 120000;
    -- Invoke a stored procedure that ingests any required data that is not already cached
    EXEC [private].populate_cache @required_dbs;
    -- CALCULATIONS
    -- ... SELECT FROM [private].cache_entries
    COMMIT TRANSACTION; -- Free the lock
END TRY
BEGIN CATCH -- Ensure we release our lock on failure
    ROLLBACK TRANSACTION;
    THROW;
END CATCH;
The alternative to sp_getapplock is to use locking hints within your transaction. Both are reasonable approaches. Locking hints can be complex, but they protect the target object itself rather than a single code path, so they are sometimes necessary. sp_getapplock (with Transaction as the owner) is simple and reliable.
You can do this without sp_getapplock, which tends to inhibit concurrency a lot.
The way to do this is to continue doing your checks within a transaction, but to apply a HOLDLOCK hint as well as an UPDLOCK hint.
HOLDLOCK, which is equivalent to the SERIALIZABLE isolation level, will place a lock not only on the rows for the ID you specify, but also on the absence of such data; in other words, it will prevent anyone else from inserting rows for that ID.
You must use both of these hints, and have an index that matches that SELECT; otherwise you could run into major blocking and deadlocking problems due to full table scans.
Also, you don't need a CATCH and ROLLBACK. Just use SET XACT_ABORT ON;, which ensures a rollback on any error.
SET XACT_ABORT ON; -- always have this set

BEGIN TRANSACTION;

DECLARE @SomeData nvarchar(100) = (
    SELECT ce.SomeColumn
    FROM [private].cache_entries ce WITH (HOLDLOCK, UPDLOCK)
    WHERE ce.SomeCondition = 1
);

IF @SomeData IS NULL
BEGIN
    -- Invoke a stored procedure that ingests any required data that is not already cached
    EXEC [private].populate_cache @required_dbs;
END

-- CALCULATIONS
-- ... SELECT FROM [private].cache_entries

COMMIT TRANSACTION; -- Free the lock
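For the index requirement mentioned above, a covering index on the predicate column would be a minimal sketch (the column names are the placeholders from the snippet, not a real schema):

```sql
-- Hypothetical index matching the SELECT above, so the UPDLOCK/HOLDLOCK
-- range lock covers only the rows (and gap) for this condition rather
-- than escalating to a full-table scan's locks.
CREATE NONCLUSTERED INDEX IX_cache_entries_SomeCondition
ON [private].cache_entries (SomeCondition)
INCLUDE (SomeColumn);
```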

How can we avoid Stored Procedures being executed in parallel?

We have the following situation:
A stored procedure is invoked by a middleware component and given an XML file as a parameter. The procedure then parses the XML file and inserts values into temporary tables inside a loop. After looping, the values in the temporary tables are inserted into physical tables.
The problem is that the stored procedure has a relatively long run time (about 5 minutes). In this period, it is likely to be invoked a second time, which would cause both processes to be suspended.
Now my question:
How can we avoid a second execution of a Stored Procedure if it is already running?
Best regards
I would recommend designing your application layer to prevent multiple instances of this process running at once. For example, you could move the logic into a queue that is processed one message at a time. Another option would be locking at the application level to prevent the database call from being executed.
SQL Server does have a locking mechanism to ensure a block of code is not run multiple times: an "app lock". This is similar in concept to the lock statement in C# or other semaphores you might see in other languages.
To acquire an application lock, call sp_getapplock. For example:
begin tran
exec sp_getapplock @Resource = 'MyExpensiveProcess', @LockMode = 'Exclusive', @LockOwner = 'Transaction'
This call will block if another process has acquired the lock. If a second RPC call tries to run this process, and you would rather have the process return a helpful error message, you can pass in a @LockTimeout of 0 and check the return code.
For example, the code below raises an error if it could not acquire the lock. Your code could return something else that the application interprets as "process is already running, try again later":
begin tran
declare @result int

exec @result = sp_getapplock @Resource = 'MyExpensiveProcess', @LockMode = 'Exclusive', @LockOwner = 'Transaction', @LockTimeout = 0

if @result < 0
begin
    rollback
    raiserror (N'Could not acquire application lock', 16, 1)
end
To release the lock, call sp_releaseapplock.
exec sp_releaseapplock @Resource = 'MyExpensiveProcess'
Stored procedures are meant to be run multiple times, including in parallel; the idea is to reuse the code.
If you want to avoid multiple runs for the same input, you need to handle that yourself, either by implementing a condition check on the input or by using some locking mechanism.
If you don't want your procedure to run in parallel at all (regardless of input), the best strategy is to acquire a lock via an entry in a database table, or via global variables, depending on the DBMS you are using.
You can check whether the stored procedure is already running using exec sp_who2. This may be an approach to consider: in your SP, check this first and simply exit if it is already running. It will run again the next time the job executes.
You would need to filter out the current session, and make sure the count of sessions running that SP is 1 (1 is the current process; 2 means it is already running), or have a helper SP that is called first.
Here are other ideas: Check if stored procedure is running
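One way to sketch that check without parsing sp_who2 output is to query the execution DMVs instead (the procedure name dbo.MyProc is a placeholder; this is an assumption-laden sketch, not the answerer's exact method):

```sql
-- Count other sessions currently executing this procedure.
-- sys.dm_exec_sql_text exposes the objectid of the batch being run,
-- which can be compared against the procedure's object id.
DECLARE @running int;

SELECT @running = COUNT(*)
FROM sys.dm_exec_requests AS r
CROSS APPLY sys.dm_exec_sql_text(r.sql_handle) AS t
WHERE t.objectid = OBJECT_ID('dbo.MyProc')
  AND r.session_id <> @@SPID;  -- exclude the current session

IF @running > 0
BEGIN
    -- Already running elsewhere: exit and let the next job run pick it up.
    RETURN;
END
```

Note this check-then-act sequence is itself racy (two sessions can both see a count of 0), which is why the app-lock answers above are generally the safer option.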

Preventing deadlocks in SQL Server

I have an application connected to a SQL Server 2014 database that combines several rows into one. There are no other connections to this database while the application is running.
First, we select a chunk of rows within a specific time span. This query uses a non-clustered index seek (on the TIME column) merged with a clustered index lookup.
select ...
from FOO
where TIME >= @from and TIME < @to and ...
Then we process these rows in C# and write the changes back as a single update and multiple deletes; this happens many times per chunk. These statements also use non-clustered index seeks.
begin tran
update FOO set ...
where NON_CLUSTERED_ID = @id
delete FOO where NON_CLUSTERED_ID in (@id1, @id2, @id3, ...)
commit
I am getting deadlocks when running this with multiple parallel chunks. I tried using ROWLOCK for the update and delete but that caused even more deadlocks than before for some reason, even though there are no overlaps between chunks.
Then I tried TABLOCKX, HOLDLOCK on the update, but that means I can't perform my select in parallel so I'm losing the advantages of parallelism.
Any idea how I can avoid deadlocks but still process multiple parallel chunks?
Would it be safe to use NOLOCK on my select in this case, given there is no row overlap between chunks? Then TABLOCKX, HOLDLOCK would only block the update and delete, correct?
Or should I just accept that deadlocks will happen and retry the query in my application?
UPDATE (additional information): All deadlocks so far have happened in the update and delete phase, none in the select. I'll try to get some deadlock logs up if I can't get this solved today (the correct trace flags weren't enabled before).
UPDATE: These are the two arrangements of deadlocks that occur with ROWLOCK; they both refer only to the delete statement and the non-clustered index it uses. I'm not sure if these are the same as the deadlocks that occur without any table hints, as I wasn't able to reproduce any of those.
Ask if there's anything else needed from the .xdl; I'm a bit wary of attaching the whole thing.
The general advice regarding deadlocks: make sure that different processes do everything in the same order, i.e. acquire locks in the same order.
You can find the same advice in this technical article on microsoft.com regarding Minimizing Deadlocks. There's a good reason it is listed first.
Access objects in the same order.
Avoid user interaction in transactions.
Keep transactions short and in one batch.
Use a lower isolation level.
Use a row versioning-based isolation level.
Set READ_COMMITTED_SNAPSHOT database option ON to enable read-committed transactions to use row versioning.
Use snapshot isolation.
Use bound connections.
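For the row-versioning items in that list, enabling read-committed snapshot is a one-time database setting (the database name below is a placeholder):

```sql
-- Readers then see the last committed version of a row instead of
-- blocking on writers' locks; writers still block writers.
ALTER DATABASE [YourDatabase]
SET READ_COMMITTED_SNAPSHOT ON
WITH ROLLBACK IMMEDIATE;  -- terminates open transactions so the change can apply
```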
Update after question from Cato:
How would acquiring locks in the same order apply here? Have you got any advice on how he would change his SQL to do that?
Deadlocks are always the same, no matter what environment: two processes (say A & B) acquire multiple locks (say X & Y) in a different order so that A is waiting for Y and B is waiting for X while A is holding X and B is holding Y.
It applies here because DELETE and UPDATE statements implicitly acquire locks on the rows, index ranges, or table (depending on what the engine deems appropriate).
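As a sketch of "same order" for the statements in the question: if every process sorts the keys it touches before deleting, two chunks are much less likely to hold each other's rows crosswise (the key values and table variable here are illustrative, not from the original):

```sql
-- Hypothetical: stage this chunk's IDs, then delete via a join.
-- The PRIMARY KEY gives the table variable a clustered index in key order.
DECLARE @ids TABLE (id int PRIMARY KEY);
INSERT INTO @ids (id) VALUES (3), (1), (2);

DELETE f
FROM FOO AS f
INNER JOIN @ids AS i
    ON i.id = f.NON_CLUSTERED_ID;
```

Whether the engine actually acquires row locks in key order depends on the chosen plan, so this is a mitigation rather than a guarantee.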
You should analyze your process and see if there are scenarios where locks could be acquired in a different order. If that doesn't reveal anything, you can analyze deadlocks using the SQL Server Profiler:
To trace deadlock events, add the Deadlock graph event class to a trace. This event class populates the TextData data column in the trace with XML data about the process and objects that are involved in the deadlock. SQL Server Profiler can extract the XML document to a deadlock XML (.xdl) file which you can view later in SQL Server Management Studio. You can configure SQL Server Profiler to extract Deadlock graph events to a single file that contains all Deadlock graph events, or to separate files.
I'd use sp_getapplock in the updating transaction to prevent multiple instances of this code running in parallel. This will not block the selecting statement as table locking hints do.
You still should program the retrying logic, because it may take a while to acquire the lock, longer than the timeout parameter.
This is how the updating transaction can be wrapped into sp_getapplock.
BEGIN TRANSACTION;
BEGIN TRY
    DECLARE @VarLockResult int;
    EXEC @VarLockResult = sp_getapplock
        @Resource = 'some_unique_name_app_lock',
        @LockMode = 'Exclusive',
        @LockOwner = 'Transaction',
        @LockTimeout = 60000,
        @DbPrincipal = 'public';

    IF @VarLockResult >= 0
    BEGIN
        -- Acquired the lock
        update FOO set ...
        where NON_CLUSTERED_ID = @id

        delete FOO where NON_CLUSTERED_ID in (@id1, @id2, @id3, ...)
    END ELSE BEGIN
        -- return some error code, so that the caller could retry
    END;

    COMMIT TRANSACTION;
END TRY
BEGIN CATCH
    ROLLBACK TRANSACTION;
    -- handle the error
END CATCH;
The selecting statement doesn't need any changes.
I would recommend against NOLOCK, even though you say that IDs in chunks do not overlap. With this hint, the SELECT query can skip some pages that are being changed and can read some pages twice. It is unlikely that such behavior can be tolerated.
Use sp_getapplock in the following format in your code. The stored procedure sp_getapplock puts a lock on an application resource.
EXEC sp_getapplock
    @Resource = 'storeprocedurename',
    @LockMode = 'Exclusive',
    @LockOwner = 'Transaction',
    @LockTimeout = 25000
It is very helpful. Increasing @LockTimeout can help reduce deadlocks.

How to check if a proc is already running when called?

As a follow up to my previous question where I ask about storedproc_Task1 calling storedproc_Task2, I want to know if SQL (SQL Server 2012) has a way to check if a proc is currently running, before calling it.
For example, if storedproc_Task2 can be called by both storedproc_Task1 and storedproc_Task3, I don't want storedproc_Task1 to call storedproc_Task2 only 20 seconds after storedproc_Task3. I want the code to look something like the following:
declare @MyRetCode_Recd_In_Task1 int
if storedproc_Task2 is running then
    -- wait for storedproc_Task2 to finish
else
    execute @MyRetCode_Recd_In_Task1 = storedproc_Task2 (with calling parameters if any)
end
The question is how do I handle the if storedproc_Task2 is running boolean check?
UPDATE: I initially posed the question using general names for my stored procedures, (i.e. sp_Task1) but have updated the question to use names like storedproc_Task1 instead. Per srutzky's reminder, the prefix sp_ is reserved for system procs in the [master] database.
Given that the desire is to have any process calling sp_Task2 wait until sp_Task2 completes if it is already running, that is essentially making sp_Task2 single-threaded.
This can be accomplished through the use of Application Locks (see sp_getapplock and sp_releaseapplock). Application Locks let you create locks around arbitrary concepts. Meaning, you can define the @Resource as "Task2", which will force each caller to wait their turn. It would follow this structure:
BEGIN TRANSACTION;
EXEC sp_getapplock @Resource = 'Task2', @LockMode = 'Exclusive';
...single-threaded code...
EXEC sp_releaseapplock @Resource = 'Task2';
COMMIT TRANSACTION;
You need to manage errors / ROLLBACK yourself (as stated in the linked MSDN documentation) so put in the usual TRY / CATCH. But, this does allow you to manage the situation.
This code can be placed either in sp_Task2 at the beginning and end, as follows:
CREATE PROCEDURE dbo.Task2
AS
SET NOCOUNT ON;

BEGIN TRANSACTION;
EXEC sp_getapplock @Resource = 'Task2', @LockMode = 'Exclusive';

{current logic for Task2 proc}

EXEC sp_releaseapplock @Resource = 'Task2';
COMMIT TRANSACTION;
Or it can be placed in all of the locations that calls sp_Task2, as follows:
CREATE PROCEDURE dbo.Task1
AS
SET NOCOUNT ON;

BEGIN TRANSACTION;
EXEC sp_getapplock @Resource = 'Task2', @LockMode = 'Exclusive';

EXEC dbo.Task2 (with calling parameters if any);

EXEC sp_releaseapplock @Resource = 'Task2';
COMMIT TRANSACTION;
I would think that the first choice -- placing the logic in sp_Task2 -- would be the cleanest since a) it is in a single location and b) cannot be avoided by someone else calling sp_Task2 outside of the currently defined paths (ad hoc query or a new proc that doesn't take this precaution).
Please see my answer to your initial question regarding not using the sp_ prefix for stored procedure names and not needing the return value.
Please note: sp_getapplock / sp_releaseapplock should be used sparingly; Application Locks can definitely be very handy (such as in cases like this one) but they should only be used when absolutely necessary.
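Combining the structure above with the TRY / CATCH error handling mentioned earlier might look like this sketch (the @@TRANCOUNT guard is an assumption about how you would want failures handled):

```sql
CREATE PROCEDURE dbo.Task2
AS
SET NOCOUNT ON;

BEGIN TRY
    BEGIN TRANSACTION;
    -- @LockOwner defaults to 'Transaction' when a transaction is open
    EXEC sp_getapplock @Resource = 'Task2', @LockMode = 'Exclusive';

    -- {current logic for Task2 proc}

    EXEC sp_releaseapplock @Resource = 'Task2';
    COMMIT TRANSACTION;
END TRY
BEGIN CATCH
    IF @@TRANCOUNT > 0
        ROLLBACK TRANSACTION;  -- rolling back also releases a Transaction-owned app lock
    THROW;
END CATCH;
```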
If you are using a global table as stated in the answer to your previous question then just drop the global table at the end of the procedure and then to check if the procedure is still running just check for the existence of the table:
IF OBJECT_ID('tempdb..##temptable') IS NULL -- procedure is not running
BEGIN
    -- do something
END
ELSE
BEGIN
    -- do something else
END

sp_getApplock lockOwner values transaction vs session

I have a stored procedure that needs to run one at a time (it should not run concurrently).
For this I am using the call below:
exec @result = sys.sp_getapplock
    @Resource = 'Employee', @LockMode = 'Exclusive', @LockOwner = 'session', @LockTimeout = 0
In that call, @LockOwner is set to 'session'; there is also another possible value, 'transaction'.
I do know that if I choose 'transaction' I need to wrap the stored procedure body in a transaction.
I searched for the differences between the two but did not have any luck.
Can anybody please help me out with what the difference is between them, apart from having to call sp_getapplock inside a transaction?
I would also appreciate any best practices for solving my problem (preventing the stored procedure from running concurrently).
I think the SEQUENCE object might work better for you than the applock solution. It acts a little like an identity column without a table and will allow you to generate the sequential part of your random number in a thread-safe way.
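A minimal SEQUENCE sketch (the object name is illustrative; SEQUENCE requires SQL Server 2012 or later):

```sql
-- NEXT VALUE FOR is atomic, so concurrent sessions receive distinct
-- values without any explicit locking.
CREATE SEQUENCE dbo.EmployeeNumber
    AS int
    START WITH 1
    INCREMENT BY 1;

DECLARE @n int = NEXT VALUE FOR dbo.EmployeeNumber;
```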