Can an UPDATE in transaction T2, running at the same time as a SELECT in transaction T1, get to those rows before the SELECT does? - sql

Let's say we have a table with many rows and a primary key (index).
T1 will do a SELECT that searches for some rows using a WHERE clause, locking them with shared locks. At the same time, T2 will do an UPDATE on a row that falls into the range of T1's requested rows.
The question is: can the UPDATE get to those rows before the SELECT does?
How does the engine lock rows when selecting? One by one, as in: read this row, lock it, now move to the next, etc.? In that case, might the UPDATE reach some of the rows before the SELECT does? And what if no index is used but a table scan instead?
The UPDATE statement has a SELECT component too. How does the UPDATE actually lock a row?
One by one: first read it, then lock it with an X lock, then the next one, and so on? In that scenario, could the SELECT get to some rows before the UPDATE does?
And is the SELECT part of the UPDATE affected by the isolation level?
The question is targeted on traditional ANSI isolation systems and not Oracle/MVCC

There are quite a few questions here, but I'll try to address some of them.
Both SELECT and UPDATE will lock as they go through the index or the records in the table. Records already locked by the SELECT will not be available to the UPDATE, and vice versa. This may even cause a deadlock, depending on the order of those operations (which is beyond your control).
If you need the update to run before the select, you need to control that at the application level. If you start both at once, SQL Server will just start executing them and locking as it goes.
SELECT is affected by the isolation level; e.g., when your isolation level is READ UNCOMMITTED, the SELECT will read the data without taking any shared locks.
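A minimal sketch of that difference (table and column names are made up for illustration):
```sql
-- Under the default READ COMMITTED level, a SELECT takes shared locks
-- row by row and blocks behind any exclusive lock it meets.
SET TRANSACTION ISOLATION LEVEL READ COMMITTED;
SELECT OrderId, Qty FROM dbo.Orders WHERE CustomerId = 42;

-- Under READ UNCOMMITTED the same SELECT takes no shared locks and does
-- not block, but it can see uncommitted ("dirty") data.
SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED;
SELECT OrderId, Qty FROM dbo.Orders WHERE CustomerId = 42;
```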

Related

Is it possible to lock on a value of a column in SQL Server?

I have a table that looks like that:
Id GroupId
1 G1
2 G1
3 G2
4 G2
5 G2
It should at any time be possible to read all of the rows (committed only). When there will be an update I want to have a transaction that will lock on group id, i.e. there should at any given time be only one transaction that attempts to update per GroupId.
It should ideally be still possible to read all committed rows (i.e. other transaction/ordinary reads that will not try to acquire the "update per group lock" should be still able to read).
The reason I want to do this is that an update cannot rely on "outdated" data. That is, I do some calculations in a transaction, and another transaction must not be able to edit the row with id 1, or add a new row with the same GroupId, after those rows were read by the first transaction (even though the first transaction never modifies the rows itself, it depends on their values).
Another "nice to have" requirement is that sometimes I would need the same requirement "cross group", i.e. the update transaction would have to lock 2 groups at the same time. (This is not a dynamic number of groups, but rather just 2)
Here are some ideas. I don't think any of them is perfect - I think you will need to give yourself a set of use cases and try them. Some of the situations I tried after applying the locks:
SELECTs with the WHERE filter as another group
SELECTs with the WHERE filter as the locked group
UPDATES on the table with the WHERE clause as another group
UPDATEs on the table where ID (not GrpID!) was not locked
UPDATEs on the table where the row was locked (e.g., IDs 1 and 2)
INSERTs into the table with that GrpId
I have the funny feeling that none of these will be 100%, but the most likely answer is the second one (setting the transaction isolation level). It will probably lock more than desired, but will give you the isolation you need.
Also, one thing to remember: if you lock many rows (e.g., there are thousands of rows with the GrpId you want), SQL Server can escalate the lock to a full-table lock. (I believe the tipping point is around 5,000 locks, but I'm not sure.)
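For what it's worth, since SQL Server 2008 you can turn escalation off per table if it gets in the way (a sketch; YourTable is a placeholder, and note that holding thousands of individual row locks costs memory):
```sql
-- LOCK_ESCALATION accepts AUTO, TABLE, or DISABLE; DISABLE stops
-- escalation to a full-table lock in most cases, at the price of
-- keeping many row/page locks alive for the transaction's duration.
ALTER TABLE dbo.YourTable SET (LOCK_ESCALATION = DISABLE);
```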
Old-school hackjob
At the start of your transaction, update all the relevant rows somehow e.g.,
BEGIN TRAN
UPDATE YourTable
SET GrpId = GrpId
WHERE GrpId = N'G1';
-- Do other stuff
COMMIT TRAN;
Nothing else can use them because (bravo!) they now carry exclusive locks from a write inside an open transaction.
Convenient - set isolation level
See https://learn.microsoft.com/en-us/sql/relational-databases/sql-server-transaction-locking-and-row-versioning-guide?view=sql-server-ver15#isolation-levels-in-the-
Before your transaction, set the isolation level high e.g., SERIALIZABLE.
You may want to read all the relevant rows at the start of your transaction (e.g., SELECT GrpId FROM YourTable WHERE GrpId = N'G1') to lock them against updates.
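Putting those two steps together, a sketch of the SERIALIZABLE approach might look like this (table and column names follow the question; SomeCol is made up):
```sql
SET TRANSACTION ISOLATION LEVEL SERIALIZABLE;
BEGIN TRAN;
    -- Under SERIALIZABLE this SELECT takes key-range locks, so other
    -- transactions can neither update these rows nor insert new rows
    -- with GrpId = 'G1' until we commit.
    SELECT Id FROM YourTable WHERE GrpId = N'G1';
    -- ... calculations that depend on those rows ...
    UPDATE YourTable SET SomeCol = 1 WHERE Id = 1;
COMMIT TRAN;
```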
Flexible but requires a lot of coding
Use resource locking with sp_getapplock and sp_releaseapplock.
These are used to lock resources, not tables or rows.
What is a resource? Well, anything you want it to be. In this case, I'd suggest 'Grp1', 'Grp2' etc. It doesn't actually lock rows. Instead, you ask (via sp_getapplock, or APPLOCK_TEST) whether you can get the resource lock. If so, continue. If not, then stop.
Any code referring to these tables needs to be reviewed and potentially modified to ask whether it's allowed to run. If something doesn't ask for permission and just goes ahead, there are no actual locks stopping it (except via any transactions you've explicitly specified).
You also need to ensure that errors are handled appropriately (e.g., still releasing the app_lock) and that processes that are blocked are re-tried.
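A sketch of the app-lock pattern inside a stored procedure ('Grp1' is just the agreed-upon resource name; every updater for that group must request the same name):
```sql
BEGIN TRAN;
DECLARE @rc int;
-- sp_getapplock returns 0 (granted) or 1 (granted after waiting) on
-- success, and a negative value (timeout, deadlock, ...) on failure.
EXEC @rc = sp_getapplock @Resource    = 'Grp1',
                         @LockMode    = 'Exclusive',
                         @LockOwner   = 'Transaction',
                         @LockTimeout = 5000;  -- wait up to 5 seconds
IF @rc < 0
BEGIN
    ROLLBACK TRAN;
    RAISERROR('Could not acquire lock on Grp1', 16, 1);
    RETURN;
END;
-- ... do the group-level calculations and updates here ...
COMMIT TRAN;  -- a 'Transaction'-owned app lock is released automatically
```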

Avoiding deadlock when updating table

I've got a 3-tier app and have data cached on a client side, so I need to know when data changed on the server to keep this cache in sync.
So I added a "lastmodification" field to the tables, and update this field whenever the data changes. But some 'parent' rows' lastmodification must also be updated when child rows (linked by FK) are modified.
Fetching the MAX(lastmodification) from the main table, the MAX from each related table, and then the MAX of those several values worked, but it was a bit slow.
I mean:
MAX(MAX(MAIN_TABLE), MAX(CHILD1_TABLE), MAX(CHILD2_TABLE))
So I switched to a trigger on these tables that updates a row in a TABLE_METADATA table:
CREATE TABLE [TABLE_METADATA](
[TABLE_NAME] [nvarchar](250) NOT NULL,
[TABLE_LAST_MODIFICATION] [datetime] NOT NULL
);
Now a related table can update the 'main' table's last modification time by simply also updating the main table's row in the metadata table.
Fetching the lastmodification is now fast
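The question doesn't show the actual trigger, but a hypothetical reconstruction of the pattern it describes would look something like this; the key point is that every write to a child table now touches a shared TABLE_METADATA row, which is what turns otherwise independent transactions into deadlock candidates:
```sql
CREATE TRIGGER trg_Child1_Touch ON CHILD1_TABLE
AFTER INSERT, UPDATE, DELETE
AS
BEGIN
    SET NOCOUNT ON;
    -- Touch both the child's own metadata row and the parent's.
    UPDATE TABLE_METADATA
    SET TABLE_LAST_MODIFICATION = GETUTCDATE()
    WHERE TABLE_NAME IN (N'CHILD1_TABLE', N'MAIN_TABLE');
END;
```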
But ... now I get random deadlocks related to updating this table.
This is due to two transactions modifying TABLE_METADATA at different steps and then locking each other out.
My question: Do you see a way to keep this lastmodification update without locking the row?
In my case I really don't care if:
The lastmodification stays updated even if the transaction is rolled back
A 'dirty' lastmodification (updated but not yet committed) is overwritten by a new value
In fact, I really don't need these updates to be in the transaction, but as they are executed by the trigger they automatically join the current transaction.
Thank you for any help
As far as I know, you cannot prevent a U-lock. However, you could try reducing the number of locks to a minimum by using with (rowlock).
This will tell the query optimiser to lock rows one by one as they are updated, rather than to use a page or table lock.
You can also use with (nolock) on tables which are joined to the table which is being updated. An alternative to this would be to use set transaction isolation level read uncommitted.
Be careful using this method though, as you can potentially create corrupted data.
For example:
update mt with (rowlock)
set SomeColumn = Something
from MyTable mt
inner join AnotherTable at with (nolock)
on mt.mtId = at.atId
You can also add with (rowlock) and with (nolock)/set transaction isolation level read uncommitted to other database objects which often read and write the same table, to further reduce the likelihood of a deadlock occurring.
If deadlocks are still occurring, you can reduce read locking on the target table by self joining like this:
update mt with (rowlock)
set SomeColumn = Something
from MyTable mt
where mt.Id in (select Id from MyTable mt2 where Column = Condition)
More documentation about table hints can be found here.

Atomic update in SQL with two queries

Assume that I want to do atomic update for a column with two queries:
SELECT num FROM table WHERE id = ?
Put num inside variable x
UPDATE table SET num = x + 1 WHERE id = ?
And I make these two queries inside a transaction with isolation level: repeatable read.
Is it possible to have a deadlock situation?
Let's say there are two transactions running.
T1 reads the value num and acquires the share lock on this row
T2 reads the value num and acquires the share lock on this row
T1 now wants to get the exclusive lock to write but cannot because T2 holds a share lock
T2 now wants to get the exclusive lock to write but cannot because T1 holds a share lock
We have deadlock here.
But when I tried it in code with Spring Data JPA, there was no deadlock. Instead I had a race condition, and the final incremented value was less than expected. If I move to the SERIALIZABLE isolation level, I do get the deadlock, and there is no race condition; the final incremented value is correct.
I thought SERIALIZABLE only adds range locks. Why does the deadlock happen at the SERIALIZABLE isolation level and not at REPEATABLE READ, with the two queries above?
PS: Don't tell me to just use one single UPDATE statement; I am trying to learn how to manually do an atomic update with two statements, and to learn more about transactions here.
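For reference, in a lock-based engine such as SQL Server a common way to make the two-statement pattern safe is to take an update (U) lock on the initial read, so concurrent transactions serialize instead of deadlocking (a sketch; names follow the question):
```sql
BEGIN TRAN;
    -- UPDLOCK makes the read take a U lock instead of an S lock.
    -- Two transactions can both hold S locks on the same row (the
    -- deadlock setup above), but only one at a time can hold the U
    -- lock, so the second transaction simply waits instead.
    DECLARE @x int;
    SELECT @x = num FROM [table] WITH (UPDLOCK) WHERE id = @id;
    UPDATE [table] SET num = @x + 1 WHERE id = @id;
COMMIT TRAN;
```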

MS SQL table hints and locking, parallelism

Here's the situation:
MS SQL 2008 database with table that is updated approximately once a minute.
The table structure is similar to following:
[docID], [warehouseID], [docDate], [docNum], [partID], [partQty]
Typical working cycle:
User starts data exchange from in-house developed system:
BEGIN TRANSACTION
SELECT * FROM t1
WHERE [docDate] BETWEEN &DateStart AND &DateEnd
AND [warehouseID] IN ('w1','w2','w3')
...then the system performs a rather long processing pass over the selected data, generates the list of [docID]s to delete from t1, and then runs
DELETE FROM t1 WHERE [docID] IN ('d1','d2','d3',...,'dN')
COMMIT TRANSACTION
Here, the problem is that while the first transaction is processing the selected data, another one reads it too, and then both load the same data into the in-house system.
At first, I inserted a (TABLOCKX) table hint into the SELECT query. It worked pretty well until users started to complain about the system's performance.
Then I changed hints to (ROWLOCK, XLOCK, HOLDLOCK), assuming that it would:
exclusively lock...
selected rows (instead of whole table)...
until the end of transaction
But this seems to take a whole-table lock anyway. I have no access to the database itself, so I can't just analyze these locks (actually, I have no idea yet how to do that, even if I had access).
What I would like to have as a result:
users are able to process data related with different warehouses and dates in parallel
as a result of 1., avoid duplication of downloaded data
Except locks, other solutions I have are (although they both seem clumsy):
Implement a flag in t1, showing that the data is under processing (and then do 'SELECT ... WHERE NOT [flag]')
Divide t1 into two parts: header and details, and apply locks separately.
I believe that I might have misunderstood some concepts regarding transaction isolation levels and/or table hints, and that there is another (better) way.
Please, advise!
You may change the workflow concept.
Instead of deleting records, update them by setting an extra field, Deprecated, from 0 to 1.
And read the data not from the table but from a view where Deprecated = 0.
BEGIN TRANSACTION
SELECT * FROM vT1
WHERE [docDate] BETWEEN &DateStart AND &DateEnd
AND [warehouseID] IN ('w1','w2','w3')
where the vT1 view looks like this:
CREATE VIEW vT1 AS
select *
from t1
where Deprecated = 0;
And the deletion will look like this:
UPDATE t1 SET Deprecated = 1 WHERE [docID] IN ('d1','d2','d3',...,'dN')
COMMIT TRANSACTION
Using such a concept you will achieve two goals:
decrease the probability of lock contention
keep a history of movements in the warehouses

SQL Table locking for concurrency [duplicate]

This question already has answers here:
Only inserting a row if it's not already there
(7 answers)
Closed 9 years ago.
I'm trying to make sure that one and only one row gets inserted into a table, but I'm running into issues where multiple processes are bumping into each other and I get more than one row. Here's the details (probably more detail than is needed, sorry):
There's a table called Areas that holds a hierarchy of "areas". Each "area" may have pending "orders" in the Orders table. Since it's a hierarchy, multiple "areas" can be grouped under a parent "area".
I have a stored procedure called FindNextOrder that, given an area, finds the next pending order (which could be in a child area) and "activates" it. "Activating" it means inserting the OrderID into the QueueActive table. The business rule is that an area can only have one active order at a time.
So my stored procedure has a statement like this:
IF EXISTS (SELECT 1 FROM QueueActive WHERE <Order exists for the given area>) RETURN
...
INSERT INTO QueueActive <Order data for the given area>
My problem is that every once in a while, two different processes call this stored procedure at almost the same time. When each one checks for an existing row, each comes back with zero rows. Because of that, both processes run the INSERT statement and I end up with TWO active orders instead of just one.
How do I prevent this? Oh, and I happen to be using SQL Server 2012 Express but I need a solution that works in SQL Server 2000, 2005, and 2008 as well.
I already did a search for exclusively locking a table and found this answer but my attempt to implement this failed.
I would use some query hints on your SELECT statement. The trouble arises because your procedure only takes out shared locks, so the other callers can join in.
Tag on a WITH (ROWLOCK, XLOCK, READPAST) to your SELECT
ROWLOCK ensures that you are only locking the row.
XLOCK takes out an exclusive lock on the row, that way no one else can read it.
READPAST allows the query to skip over any locked rows and keep working instead of waiting.
The last one is optional and depends upon your concurrency requirements.
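A sketch of how these hints slot into the procedure's check (column names are placeholders for the real QueueActive schema; note that when no matching row exists yet there is nothing for XLOCK to lock, so it is a HOLDLOCK key-range lock that actually keeps a second caller out of the gap):
```sql
IF NOT EXISTS (SELECT 1
               FROM QueueActive WITH (ROWLOCK, XLOCK, HOLDLOCK)
               WHERE AreaId = @AreaId)
BEGIN
    INSERT INTO QueueActive (AreaId, OrderId)
    VALUES (@AreaId, @OrderId);
END;
```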
Further reading:
SQL Server ROWLOCK over a SELECT if not exists INSERT transaction
http://technet.microsoft.com/en-us/library/ms187373.aspx
Have you tried creating a trigger that rolls back the second transaction if there is already one active order in the table?