Protect against parallel transaction updating row - sql

I'm building a simple set of queries for an SQL database. I've come across a situation I want to protect against, but I don't know the database theory terminology to explain what I'm asking.
In this example I have two simultaneous transactions occurring on a database. Transaction #1 begins and Transaction #2 begins after T1 but T2 ends before T1 does its commit.
The table USERS has columns id, name, passwordHash
--Transaction #1
BEGIN TRANSACTION;
SELECT id from USERS where name = someName;
--do some work, transaction #2 starts and completes quickly while this work is being performed
UPDATE USERS SET name = newName where id = $id;
COMMIT;
--Transaction #2
BEGIN TRANSACTION;
SELECT id from USERS where name = someName;
UPDATE USERS SET passwordHash = newPasswordHash where id = $id;
COMMIT;
I would like to have some kind of safety check performed whereby, if I am updating a row, I am only updating the same version of that row that existed at the time the transaction started.
In this case, I would like the Transaction 1 COMMIT to fail because Transaction 2 has already updated the row belonging to user with name someName.

You can use SELECT FOR UPDATE with NOWAIT to lock the rows against concurrent modifications by other transactions. That guarantees that your later updates will run against the same version of those rows; other transactions cannot change these rows until your transaction commits.
Example (using PostgreSQL):
Transaction 1:
begin transaction;
select * from users where username = 'Fabian' for update nowait;
update users set passwordHash = '123' where username = 'Fabian';
commit;
Transaction 2, somewhere after transaction 1 has selected for update, but not committed:
> select * from users where username = 'Fabian' for update nowait;
ERROR: could not obtain lock on row in relation "users"
Edit
This is usually called pessimistic locking. The transaction that selects the row first will "win"; any later SELECT ... FOR UPDATE will fail. If you instead want the transaction that first writes a change to win, you might want to go for an optimistic locking approach, as proposed by @Laurence.

The standard way to do this is to add a rowversion column to the table. You read this column along with the rest of the data. When you submit an update, you include it in the where clause. You can then check the number of rows affected to see if another transaction got in first.
Some databases have native support for such columns. E.g. SQL Server has the timestamp/rowversion datatype. Oracle has rowdependencies. DB2 has rowversion.
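For illustration, a minimal sketch of the optimistic-locking pattern using a plain integer version column (the column name row_version and the literal values are made up for this example):
-- Add an explicit version column (SQL Server's rowversion type can play
-- the same role automatically).
ALTER TABLE USERS ADD row_version int NOT NULL DEFAULT 0;
-- Read the row together with its current version.
SELECT id, name, row_version FROM USERS WHERE name = 'someName';
-- Later, update only if the version is still the one we read, and bump it.
UPDATE USERS
SET name = 'newName',
    row_version = row_version + 1
WHERE id = 42            -- the id read above
  AND row_version = 7;   -- the version read above
-- If another transaction changed the row in the meantime, this UPDATE
-- affects 0 rows; check the affected-row count and retry or abort.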

Locking a specific row in postgres

I'm new enough to Postgres, and I'm trying to figure out how to lock a specific row of a table.
As an example, I have a table of users:
Name: John, Money: 1
Name: Jack, Money: 2
In my backend, I want to select John and make sure that no other calls can update (or even select possibly) John's row until my transaction is complete.
I think I need an exclusive lock from what I've read up online? I can't seem to find a good example of locking just 1 row from a table online, any idea?
Edit - Should I be doing it at method level like @SqlUpdate (or some form of that - using org.skife.jdbi.v2) or in the query itself?
If you want to lock a specific selected row in a table, you need to LOCK the table FIRST and then use the FOR UPDATE / FOR SHARE clause.
For example, in your case if you need to lock the first row you do this:
BEGIN;
LOCK TABLE person IN ROW EXCLUSIVE MODE;
-- BLOCK 1
SELECT * FROM person WHERE name = 'John' and money = 1 FOR UPDATE;
-- BLOCK 2
UPDATE person set name = 'John 2' WHERE name = 'John' and money = 1;
END;
In BLOCK 1, before the SELECT statement, you are not doing anything yet; you are only telling the database "Hey, I will do something in this table, so when I do, lock it in this mode". You can still select / update / delete any row.
But in BLOCK 2, when you use FOR UPDATE, you lock that row against other transactions in specific modes (read the docs for more details). It stays locked until the transaction ends.
If you want an example, run a test: do another SELECT ... FOR UPDATE on the same row from a second session before the first transaction ends. It will wait for the first transaction to finish and will select right after it (a two-session sketch follows at the end of this answer).
Only an ACCESS EXCLUSIVE lock blocks a SELECT (without FOR UPDATE/SHARE) statement.
I am using it in a function to control subsequences and it is great. Hope you enjoy.
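For instance, a second session that runs the same SELECT ... FOR UPDATE while the first transaction above is still open simply waits (a sketch; the timing comments describe the assumed sequence of events):
-- Session 2, while session 1's transaction above is still open:
BEGIN;
SELECT * FROM person WHERE name = 'John' and money = 1 FOR UPDATE;
-- ...blocks here until session 1 runs END/COMMIT. Because session 1 renamed
-- the row to 'John 2', the re-checked WHERE clause no longer matches and
-- this SELECT then returns zero rows.
END;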
As soon as you update (and not commit) the row, no other transaction will be able to update that row.
If you want to lock the row before doing the update (which seems useless), you can do so using select ... for update.
You cannot prevent other sessions from reading that row, and frankly that doesn't make sense either.
Even if your transaction hasn't finished (=committed) other sessions will not see any intermediate (inconsistent) values - they will see the state of the database as it was before your transaction started. That's the whole point of having a relational database that supports transactions.
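A quick two-session illustration of both points, assuming a users table with name and money columns as in the question:
-- Session 1
BEGIN;
UPDATE users SET money = money + 100 WHERE name = 'John';
-- the row is now locked against concurrent updates, but not against reads

-- Session 2, while session 1 is still open
SELECT name, money FROM users WHERE name = 'John';
-- returns immediately, showing the old value of money
UPDATE users SET money = money + 50 WHERE name = 'John';
-- blocks here until session 1 commits or rolls back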
You can use
LOCK TABLE table IN ACCESS EXCLUSIVE MODE;
when you are ready to read from your table. "SELECT" and all other operations will be queued until the end of the transaction (commit changes or rollback).
Note that this will lock the entire table; according to the PostgreSQL documentation there is no table-level lock mode that locks only a specific row.
So you can use the FOR UPDATE row-level lock in every SELECT that is going to update a row; this prevents other SELECT ... FOR UPDATE statements on that row from reading it until your transaction ends!
PostgreSQL Documentation:
FOR UPDATE causes the rows retrieved by the SELECT statement to be locked as though for update. This prevents them from being locked, modified or deleted by other transactions until the current transaction ends. That is, other transactions that attempt UPDATE, DELETE, SELECT FOR UPDATE, SELECT FOR NO KEY UPDATE, SELECT FOR SHARE or SELECT FOR KEY SHARE of these rows will be blocked until the current transaction ends; conversely, SELECT FOR UPDATE will wait for a concurrent transaction that has run any of those commands on the same row, and will then lock and return the updated row (or no row, if the row was deleted). Within a REPEATABLE READ or SERIALIZABLE transaction, however, an error will be thrown if a row to be locked has changed since the transaction started. For further discussion see Section 13.4.
The FOR UPDATE lock mode is also acquired by any DELETE on a row, and also by an UPDATE that modifies the values on certain columns. Currently, the set of columns considered for the UPDATE case are those that have a unique index on them that can be used in a foreign key (so partial indexes and expressional indexes are not considered), but this may change in the future.
For this example I'm using my own table; it is named paid_properties and has two columns, user_id and counter.
Since you want one transaction at a time, you can use one of the following lock modes:
FOR UPDATE mode assumes a total change (or delete) of a row.
FOR NO KEY UPDATE mode assumes a change only to the fields that are not involved in unique indexes (in other words, this change does not affect foreign keys).
The UPDATE command itself selects the minimum appropriate locking mode; rows are usually locked in the FOR NO KEY UPDATE mode.
To test it run following query in one tab (I'm using pgadmin4):
BEGIN;
SELECT * FROM paid_properties WHERE user_id = 37 LIMIT 1 FOR NO KEY UPDATE;
SELECT pg_sleep(60);
UPDATE paid_properties set counter = 4 where user_id = 37;
-- ROLLBACK; -- If you want to discard the operations you did above
END;
And the following query in another tab:
UPDATE paid_properties set counter = counter + 90 where user_id = 37;
You'll see that the second query is not executed until the first one finishes, and you'll get an answer of 94, which is correct in my case.
For more information:
https://postgrespro.com/blog/pgsql/5968005
https://www.postgresql.org/docs/current/explicit-locking.html
Hope this is helpful

Avoiding concurrency problems with MAX+1 integer in SQL Server 2008... making own IDENTITY value

I need to increment an integer in a SQL Server 2008 column.
Sounds like I should use an IDENTITY column, but I need to increment separate counters for each of my customers. Think of an e-commerce site where each customer gets their own incrementing order number, starting with 1. The values must be unique (per customer).
For example,
Customer1 (Order #s 1,2,3,4,5...)
Customer2 (Order #s 1,2,3,4,5...)
Essentially, I will need to manually do the work of SQL's identity function since the number of customers is unlimited and I need order # counters for each of them.
I am quite comfortable doing:
BEGIN TRANSACTION
SELECT @NewOrderNumber = MAX(OrderNumber)+1 From Orders where CustomerID=@ID
INSERT INTO ORDERS VALUES (@NewOrderNumber, other order columns here)
COMMIT TRANSACTION
My problem is locking and concurrency concerns and assuring a unique value. It seems we need to lock with TABLOCKX. But this is a high volume database and I can't just lock the whole Orders table every time I need to do a SELECT MAX+1 process and insert a new order record.
But, if I don't lock the whole table, then I might not get a unique value for that customer. Because some of our order entry is done after-the-fact in batches by a multi-threaded Windows process, it is possible that 2 operations will be simultaneously wanting to insert a new order for the same customer.
So what locking methodology or technique will avoid deadlocks and still let me maintain unique incrementing order numbers PER customer?
In SQL Server 2005 and later, this is best done atomically, without using any transactions or locking:
update ORDERS
set OrderNumber=OrderNumber+1
output inserted.OrderNumber where CustomerID=@ID
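For example, combining this single-statement OUTPUT technique with a dedicated per-customer counter table (the approach suggested in the next answer) might look like the following sketch; the table and column names are made up:
-- One row per customer holding the last issued order number.
CREATE TABLE CustomerOrderCounters
(
    CustomerID int PRIMARY KEY,
    LastOrderNumber int NOT NULL DEFAULT 0
);

-- Atomically bump the counter and capture the new value in one statement;
-- no explicit transaction or lock hints are needed.
DECLARE @ID int = 1;
DECLARE @NewNumbers TABLE (OrderNumber int);

UPDATE CustomerOrderCounters
SET LastOrderNumber = LastOrderNumber + 1
OUTPUT inserted.LastOrderNumber INTO @NewNumbers (OrderNumber)
WHERE CustomerID = @ID;

SELECT OrderNumber FROM @NewNumbers;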
I would introduce a table to keep the last number per customer, and query and update it in the same transaction that generates the order.
CREATE TABLE CustomerNextOrderNumber
(
CustomerID int PRIMARY KEY,
NextOrderNumber int
)
Update lock on select will help to avoid race condition when two orders are placed concurrently by the same customer.
BEGIN TRANSACTION
DECLARE @NextOrderNumber INT
SELECT @NextOrderNumber = NextOrderNumber
FROM CustomerNextOrderNumber WITH (UPDLOCK)
WHERE CustomerID = @CustomerID
UPDATE CustomerNextOrderNumber
SET NextOrderNumber = NextOrderNumber + 1
WHERE CustomerID = @CustomerID
... use number here
COMMIT
A similar but more straightforward approach (inspired by Joachim Isaksson):
the update lock here is taken by the first UPDATE itself.
BEGIN TRANSACTION
DECLARE @NextOrderNumber INT
UPDATE CustomerNextOrderNumber
SET NextOrderNumber = NextOrderNumber + 1
WHERE CustomerID = @CustomerID
SELECT @NextOrderNumber = NextOrderNumber
FROM CustomerNextOrderNumber
where CustomerID = @CustomerID
...
COMMIT
The default transaction level, read committed, does not protect you against phantom reads. A phantom read is when another process inserts a row in between your select and insert:
BEGIN TRANSACTION
SELECT @NewOrderNumber = MAX(OrderNumber)+1 From Orders where CustomerID=@ID
INSERT INTO ORDERS VALUES (@NewOrderNumber, other order columns here)
COMMIT TRANSACTION
Even one level higher, repeatable read, doesn't protect you. Only the highest isolation level, serializable, protects against phantom reads.
So one solution is the highest isolation level:
SET TRANSACTION ISOLATION LEVEL SERIALIZABLE
BEGIN TRANSACTION
...
Another solution is to use the tablockx, holdlock and updlock table hints to make sure only your transaction can modify the table. The first locks the table, the second keeps the lock until the end of the transaction, and the third grabs an update lock for the select, so it doesn't have to be upgraded later.
SELECT @NewOrderNumber = MAX(OrderNumber)+1
From Orders with (tablockx, holdlock, updlock)
where CustomerID=@ID
These queries will be quick if you have an index on CustomerID, so I wouldn't worry too much about concurrency, certainly not if you have less than 10 orders per minute.
You could do this:
BEGIN TRANSACTION
SELECT ID
FROM Customer WITH(ROWLOCK)
WHERE Customer.ID = @ID
SELECT @NewOrderNumber = MAX(OrderNumber)+1 From Orders where CustomerID=@ID
INSERT INTO ORDERS VALUES (@NewOrderNumber, other order columns here)
COMMIT TRANSACTION
We are now only locking one customer row in the Customers table, not all customers. Whenever two people try to add an order for the same customer at the same time, whoever gets the lock on the customer first wins, and the other person has to wait.
If people are inserting orders for different customers, they won't get in each others way!
Here is how this would work:
User 1 starts to insert an order for the customer with ID 1000.
User 2 tries to insert an order for the same customer (ID 1000).
User 2 has to wait until User 1 finishes inserting the order.
User 1 inserts the order and the transaction is committed.
User 2 can now insert the order and is guaranteed to get the true max order number for customer 1000.
Would it be possible to create a table with an IDENTITY column for each customer? Then you could insert a new record into that customer's table and pull the value from it.
You are trying to relate two completely different requirements.
Even if you got this working, what happens if Customer A has an earlier order deleted? Are you going to renumber all their existing records to keep them consecutive and starting from 1? Now that would be a locking problem....
Give the record an identity (or possibly a GUID). When you want a count, query for it; if you want a row number (I've never seen the point of that myself), use ROW_NUMBER().
You do not need an auto-incrementing order number per customer, you don't want one, and without a massive amount of locking you can't have one.
Lateral thinking time.
If you present
Order  Description  Date Due
1      Staples      26/1/2012
2      Stapler      1/3/2012
3      Paper Clips  19/1/2012
it doesn't mean (and in fact shouldn't mean) that the order keys are 1, 2 and 3; they can be anything as long as they fulfill a uniqueness requirement.
create table TestIds
(customerId int,
nextId int)
insert into TestIds
values(1,1)
insert into TestIds
values(2,1)
insert into TestIds
values(3,1)
go
create proc getNextId (@CustomerId int)
as
declare @NextId int
-- Optimistic retry loop: read the current value, then do a conditional
-- update; if another session bumped the counter in between, the update
-- affects 0 rows and we try again.
while (@NextId is null)
begin
select @NextId = nextId
from TestIds
where customerId = @CustomerId
update TestIds
set nextId = nextId + 1
where customerId = @CustomerId
and nextId = @NextId
if @@ROWCOUNT = 0
set @NextId = null -- lost the race, retry
end
select @NextId
go
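To use it, call the procedure for the customer in question, e.g.:
exec getNextId @CustomerId = 1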

How can I fix "Snapshot isolation transaction aborted due to update conflict"?

I see an error message related to transaction isolation levels. There are two tables involved: the first one is updated frequently with the transaction isolation level set to SERIALIZABLE, and the second one has a foreign key referencing the first.
The problem occurs when doing an insert or update on the second table. Once every few hours I get the following error message:
Snapshot isolation transaction aborted due to update conflict. You cannot use snapshot isolation to access table 'dbo.first' directly or indirectly in database 'DB' to update, delete, or insert the row that has been modified or deleted by another transaction. Retry the transaction or change the isolation level for the update/delete statement.
I don't set the transaction isolation level when inserting or updating the second table; I also ran the command DBCC USEROPTIONS and it returns read_committed.
First:
It seems you're not using SERIALIZABLE but snapshot isolation, which was introduced with MSSQL 2005. Here is an article to understand the difference:
http://blogs.msdn.com/b/craigfr/archive/2007/05/16/serializable-vs-snapshot-isolation-level.aspx
=> This was based on the error message, but as you have explained in the comments, the error occurs when editing the second table.
Second:
For modifications MSSQL Server always tries to acquire locks, and since there are locks (held by a transaction) on the first table that extend to the second table because of the foreign key, the operation fails. So every modification is in fact a mini transaction.
The default transaction level on MSSQL is READ COMMITTED, but if you turn on the option READ_COMMITTED_SNAPSHOT it will convert READ COMMITTED to a SNAPSHOT-like transaction every time you use READ COMMITTED, which then leads to the error message you get.
To be precise, as VladV pointed out, it's not really using the SNAPSHOT isolation level, but READ COMMITTED with row versioning rather than locking, and only on a statement basis, whereas SNAPSHOT uses row versioning on a transaction basis.
To understand the difference check out this:
http://msdn.microsoft.com/en-us/library/ms345124(SQL.90).aspx
To find out more about READ_COMMITTED_SNAPSHOT, it's explained in detail here:
http://msdn.microsoft.com/en-us/library/tcbchxcb(VS.80).aspx
and here:
Default SQL Server IsolationLevel Changes
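If you want to check which of these settings is actually in effect on your database, sys.databases shows both flags (the database name 'DB' is taken from your error message; adjust as needed):
-- Inspect the row-versioning settings of the database.
SELECT name,
       is_read_committed_snapshot_on,
       snapshot_isolation_state_desc
FROM sys.databases
WHERE name = 'DB';
-- Switching READ_COMMITTED_SNAPSHOT off again requires exclusive access:
-- ALTER DATABASE DB SET READ_COMMITTED_SNAPSHOT OFF;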
Another reason for you to see SNAPSHOT isolation without having specified it is the use of implicit transactions. After turning this option on, if you don't actually specify the isolation level on a modifying statement (which you don't), MS SQL Server will choose whatever it believes is the right isolation level. Here are the details:
http://msdn.microsoft.com/en-us/library/ms188317(SQL.90).aspx
For all these scenarios the solution is the same, though.
Solution:
You need to execute the operations in sequence, and you can do this by specifically using a transaction with SERIALIZABLE isolation level on both operations: when inserting/updating the first and when inserting/updating the second.
This way each operation blocks the other until it has completed.
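A minimal sketch of that, using the Customers (first/parent) and Orders (second/child) tables from the repro script further down:
SET TRANSACTION ISOLATION LEVEL SERIALIZABLE;
BEGIN TRANSACTION;
-- modification of the first (parent) table
UPDATE Customers SET CustName = 'Updated customer' WHERE CustID = 1;
COMMIT TRANSACTION;

SET TRANSACTION ISOLATION LEVEL SERIALIZABLE;
BEGIN TRANSACTION;
-- modification of the second (child) table
INSERT INTO Orders (OrderID, OrderType, CustID) VALUES ('Order02', 'A', 1);
COMMIT TRANSACTION;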
We had a similar issue - and you'd be glad to know that you should be able to solve the problem without removing the FK constraint.
Specifically, in our scenario, we had frequent updates to the parent table in a READ COMMITTED transaction. We also had frequent concurrent (long-running) snapshot transactions that needed to insert rows into a child table with an FK to the parent table - so essentially it's the same scenario as yours, except we used a READ COMMITTED instead of a SERIALIZABLE transaction.
To solve the problem, create a new UNIQUE NONCLUSTERED constraint on the primary table over the FK column. In addition you must also re-create the FK after you've created the unique constraint as this will ensure that the FK now references the constraint (not the clustered key).
Note: the disadvantage is that you now have a seemingly redundant constraint on the table that needs to be maintained by SQL server when updates are made to the parent table. That said, it may be a good opportunity for you to consider a different/alternate clustered key...and if you're lucky, it could even replace the need for another index on this table...
Unfortunately I can't find a good explanation on the web of why creating a unique constraint solves the problem. The easiest way I can explain it: the FK now references only the unique constraint, and a modification to the parent table (i.e. to the non-FK-referenced columns) no longer causes an update conflict in the snapshot transaction, because the FK points at an unchanged unique-constraint entry. Contrast this with the clustered key, where a change to any column in the parent table would affect the row version in that table - and since the FK sees an updated version number, the snapshot transaction needs to abort.
Furthermore, if the parent row is deleted in the non-snapshot transaction, then both the clustered and unique constraints would be affected and, as expected, the snapshot transaction will roll back (so FK integrity is maintained).
I've been able to reproduce this problem using the sample code below, which I adapted from this blog entry
---------------------- SETUP Test database
-- Creating Customers table without unique constraint
USE master;
go
IF EXISTS (SELECT * FROM sys.databases WHERE name = 'SnapshotTest')
BEGIN;
DROP DATABASE SnapshotTest;
END;
go
CREATE DATABASE SnapshotTest;
go
ALTER DATABASE SnapshotTest
SET ALLOW_SNAPSHOT_ISOLATION ON;
go
USE SnapshotTest;
go
CREATE TABLE Customers
(CustID int NOT NULL PRIMARY KEY,CustName varchar(40) NOT NULL);
CREATE TABLE Orders
(OrderID char(7) NOT NULL PRIMARY KEY,
OrderType char(1) CHECK (OrderType IN ('A', 'B')),
CustID int NOT NULL REFERENCES Customers (CustID)
);
INSERT INTO Customers (CustID, CustName) VALUES (1, 'First test customer');
INSERT INTO Customers (CustID, CustName) VALUES (2, 'Second test customer');
GO
---------------------- TEST 1: Run this test before test 2
USE SnapshotTest;
SET TRANSACTION ISOLATION LEVEL READ COMMITTED;
BEGIN TRANSACTION;
-- Check to see that the customer has no orders
SELECT * FROM Orders WHERE CustID = 1;
-- Update the customer
UPDATE Customers SET CustName='Updated customer' WHERE CustID = 1;
-- Twiddle thumbs for 10 seconds before committing
WAITFOR DELAY '0:00:10';
COMMIT TRANSACTION;
go
-- Check results
SELECT * FROM Customers (NOLOCK);
SELECT * FROM Orders (NOLOCK);
GO
---------------------- TEST 2: Run this test in a new session shortly after test 1
USE SnapshotTest;
SET TRANSACTION ISOLATION LEVEL SNAPSHOT;
BEGIN TRANSACTION;
SELECT * FROM Customers WHERE CustID = 1;
INSERT INTO Orders (OrderID, OrderType, CustID) VALUES ('Order01', 'A', 1);
-- Twiddle thumbs for 10 seconds before committing
WAITFOR DELAY '0:00:10';
COMMIT TRANSACTION;
go
-- Check results
SELECT * FROM Customers (NOLOCK);
SELECT * FROM Orders (NOLOCK);
go
And to fix the above scenario, set up the test database again, then run the following script before running Tests 1 and 2.
ALTER TABLE Customers
ADD CONSTRAINT UX_CustID_ForSnapshotFkUpdates UNIQUE NONCLUSTERED (CustID)
-- re-create the existing FK so it now references the constraint instead of clustered index (the existing FK probably has a different name in your DB)
ALTER TABLE [dbo].[Orders] DROP CONSTRAINT [FK__Orders__CustID__1367E606]
ALTER TABLE [dbo].[Orders] WITH CHECK ADD FOREIGN KEY([CustID])
REFERENCES [dbo].[Customers] ([CustID])
GO
According to my three experiments with the SNAPSHOT, SERIALIZABLE and READ COMMITTED SNAPSHOT isolation levels, I got the error below only with the SNAPSHOT isolation level, when updating a row that had already been updated by another transaction; I did not get the error with the SERIALIZABLE or READ COMMITTED SNAPSHOT isolation levels:
Snapshot isolation transaction aborted due to update conflict. You cannot use snapshot isolation to access table 'dbo.person' directly or indirectly in database 'test' to update, delete, or insert the row that has been modified or deleted by another transaction. Retry the transaction or change the isolation level for the update/delete statement.
And, as the documentation quoted below says, with the SNAPSHOT isolation level we get this error when updating a row that has already been updated by another transaction. I do not think we can prevent the error; what we can do is retry the transaction, as the error message says. So if our application gets this error, it should handle it as an exception and retry the transaction (see the retry sketch after the quote below):
A snapshot transaction always uses optimistic concurrency control, withholding any locks that would prevent other transactions from updating rows. If a snapshot transaction attempts to commit an update to a row that was changed after the transaction began, the transaction is rolled back, and an error is raised.
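A sketch of such retry handling in T-SQL (error number 3960 is the update-conflict error quoted above; the table name and retry count are illustrative, and THROW requires SQL Server 2012 or later):
DECLARE @retries int = 3;
WHILE @retries > 0
BEGIN
    BEGIN TRY
        SET TRANSACTION ISOLATION LEVEL SNAPSHOT;
        BEGIN TRANSACTION;
        UPDATE person SET name = 'Lisa' WHERE id = 2;
        COMMIT TRANSACTION;
        BREAK;  -- success, leave the retry loop
    END TRY
    BEGIN CATCH
        IF XACT_STATE() <> 0 ROLLBACK TRANSACTION;
        IF ERROR_NUMBER() = 3960 AND @retries > 1
            SET @retries = @retries - 1;  -- update conflict: try again
        ELSE
            THROW;  -- anything else, or out of retries: re-raise
    END CATCH
END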
For my experiment with the SNAPSHOT isolation level, I created a person table with id and name columns in the test database, as shown below:
id  name
1   John
2   David
Now, I did these steps with SQL queries as shown below:
Flow (Transaction 1 = T1, Transaction 2 = T2):
Step 1 - T1: BEGIN; - T1 starts.
Step 2 - T2: BEGIN; - T2 starts.
Step 3 - T1: UPDATE person SET name = 'Tom' WHERE id = 2; - T1 updates "David" to "Tom", so this row is locked by T1 until T1 commits.
Step 4 - T2: UPDATE person SET name = 'Lisa' WHERE id = 2; - T2 cannot update "Tom" to "Lisa" because the row is locked by T1, so T2 waits for T1 to unlock the row by committing.
Step 5 - T1: COMMIT; (T2 is still waiting) - T1 commits.
Step 6 - T2: ROLLBACK; with the error "Snapshot isolation transaction aborted due to update conflict. You cannot use snapshot isolation to access table 'dbo.person' directly or indirectly in database 'test' to update, delete, or insert the row that has been modified or deleted by another transaction. Retry the transaction or change the isolation level for the update/delete statement." - T2 is automatically rolled back and gets the error.

trigger and transactions on temporary tables

Can we create triggers and transactions on temporary tables?
When a user inserts data, and it is committed, the trigger would fire and that data would go from the temporary table into the actual tables.
And when the SQL service stops or the server shuts down, the temporary tables would be deleted automatically.
Or shall I use another actual table, into which the data is inserted first; then, if it is committed, the trigger fires and the data is sent to the main tables, and then I would execute a TRUNCATE query to remove the data from the interface table, hence removing the duplicate data.
I don't think you understand triggers - trigger firing is associated with the statement they're attached to, rather than with the transaction commit. Two scripts:
Script 1:
create table T1 (
ID int not null,
Val1 varchar(10) not null
)
go
create table T2 (
ID int not null,
Val2 varchar(10) not null
)
go
create trigger T_T1_I
on T1
after insert
as
insert into T2 (ID,Val2) select ID,Val1 from inserted
go
begin transaction
insert into T1 (ID,Val1)
select 10,'abc'
go
RAISERROR('Run script 2 now',10,1) WITH NOWAIT
WAITFOR DELAY '00:01:00'
go
commit
Script 2:
select * from T2 with (nolock)
Open two connections to the same DB, put one script in each connection. Run script 1. When it displays the message "Run script 2 now", switch to the other connection. You'll see that you're able to select uncommitted data from T2, even though that data was inserted by the trigger. (This also implies that appropriate locks are being held on T2 by script 1 until the transaction commits.)
Since this implies that the equivalent of what you're asking for is to just insert into the base table and hold your transaction open, you can do that.
If you want to hide the actual shape of the table from users, create a view and write triggers on that to update the base tables. As stated above though, as soon as you've performed a DML operation against the view, the triggers will have fired, and you'll be holding locks against the base table. Depending on the transaction isolation level of other connections, they may see your changes, or be blocked until the transaction commits.
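A sketch of that, reusing the T1 table from the scripts above (the view and trigger names are made up):
create view V_T1
as
select ID, Val1 from T1
go
create trigger T_V_T1_I
on V_T1
instead of insert
as
insert into T1 (ID, Val1)
select ID, Val1 from inserted
go
-- Inserting through the view fires T_V_T1_I immediately, which in turn
-- fires T1's own AFTER INSERT trigger (T_T1_I above) as part of the same
-- statement, long before any surrounding transaction commits.
insert into V_T1 (ID, Val1) values (20, 'def')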
Triggers cannot be created on temp tables. But it is an unusual requirement to do so.
Temp tables can be part of a transaction, BUT table variables cannot.
As @Damien points out, triggers do NOT fire when a transaction is committed; rather, they fire when an action (INSERT, UPDATE, DELETE) with a corresponding trigger occurs on the table.
Or create a view that you can insert data into. It will write back to the table and then the triggers will fire.

Rolling back all the records in SQL Server

My interviewer asked me a question: if I am inserting 10 rows into a database table and at the 5th row I run into some trouble, how can I roll back all the records?
Please let me know how I can do this.
Assuming all occur within the same transaction, use the ROLLBACK command.
Before you insert the rows
BEGIN TRANSACTION TransactionName
[Insert Rows]
Then either
COMMIT TRANSACTION TransactionName
OR
ROLLBACK TRANSACTION TransactionName
if any problems occur during the insert.
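A minimal T-SQL sketch of the whole pattern (the table name, columns and the failing row are made up; TRY/CATCH assumes SQL Server 2005 or later):
BEGIN TRANSACTION TransactionName
BEGIN TRY
    INSERT INTO MyTable (id, val) VALUES (1, 'a');
    INSERT INTO MyTable (id, val) VALUES (2, 'b');
    -- ... rows 3 and 4 ...
    INSERT INTO MyTable (id, val) VALUES (5, NULL);  -- suppose this row fails (val is NOT NULL)
    -- ... rows 6 to 10 ...
    COMMIT TRANSACTION TransactionName
END TRY
BEGIN CATCH
    -- Any error lands here; undo every insert made so far.
    IF XACT_STATE() <> 0
        ROLLBACK TRANSACTION TransactionName
END CATCH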