When does SQL exclusively lock a row in an UPDATE statement? - sql

Can a race condition occur in sql under these conditions?
If I have this SQL update running in one thread (call it statement 1):
Update Items
Set Flag = B
where Flag = A;
And this SQL update running in another thread (call it statement 2):
Update Items
Set Flag = C
where Flag = A;
Is it possible for each thread to read the same record where Flag is equal to A and write the record with its own value, such that statement 1 writes it first and then statement 2 overwrites it, or vice versa?
The answer to this question depends on when the database exclusively locks the rows for the update. Does it happen before it finds the records, or after it finds the records and evaluates the WHERE clause?

First, there are three lock contexts:
Database level lock
Table level lock
Row level lock
Then you have four lock modes:
IX
IS
X
S
IX and IS locks are "intention" locks. They are taken at a coarser level (for example, the table) before acquiring X or S locks at a finer level (for example, rows). X locks are exclusive (write) locks and S locks are shared (read) locks.
Locks (IX, IS, X or S) can be taken at any context level. An X lock at the database level will block all other operations in the database, for example. This is the type of lock that SQLite takes: an S lock is taken for the entire database during reads, and an X lock is taken for the entire database during writes. Writes will wait for any S locks to complete and will block new S and X locks until the write lock is released. This provides a serializable transaction isolation level.
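As a hedged illustration, a writer can claim that database-wide write intent up front in SQLite (table and column names taken from the question):
BEGIN IMMEDIATE;                                 -- takes the write-intent (RESERVED) lock immediately
UPDATE Items SET Flag = 'B' WHERE Flag = 'A';    -- still only one writer at a time
COMMIT;                                          -- briefly escalates to an EXCLUSIVE lock to write, then releases it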
For MySQL, the locking depends on the storage engine. MyISAM will take X and S locks on entire (sets of) tables. X locks will wait on existing S or X locks and block new locks. New X locks will be given higher priority in the queue, moved ahead of new S locks. This behavior can be changed by setting LOW_PRIORITY_UPDATES, which could result in write starvation because writes will be de-prioritized in favor of reads.
It is possible in MySQL to lock the entire server against writes using FLUSH TABLES WITH READ LOCK, which takes a global read lock.
InnoDB locks rows as they are encountered during an index read: it locks the index records it traverses. InnoDB also uses special 'gap' locks to ensure the REPEATABLE READ transaction isolation level. Because locks are held on index entries, if a table is not well indexed for an UPDATE query, many rows will be locked. Note that InnoDB does not take S locks for ordinary SELECT queries; it uses row versioning, not row-level locking, for consistent snapshots.
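As a hedged illustration of why indexing matters here (using the question's table, with string literals and an index name assumed):
-- Without an index on Flag, InnoDB scans the clustered index and locks
-- every row it examines, not just the rows that match:
UPDATE Items SET Flag = 'B' WHERE Flag = 'A';
-- With an index on Flag, only the matching index entries (plus gap locks
-- under REPEATABLE READ) need to be locked:
CREATE INDEX idx_items_flag ON Items (Flag);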
When acquiring X locks, the database needs to detect deadlocks. Consider the following:
>connection 1
start transaction;
update T set c = c + 1 order by id asc;
>connection 2
start transaction;
update T set c = c - 1 order by id desc;
In a row-locking model, these two statements cannot both complete successfully: the first would wait forever to acquire locks the second holds, and vice versa. The database will pick one of the connections to roll back. InnoDB will pick the connection that has made the fewest changes. MyISAM will lock the whole table for whichever connection acquires the lock first, and the second will run after the first completes.
The simple example given by you will be resolved by X locks at any context (database, table or row). If two connections begin at exactly the same time, each running an update that tries to modify the same row, both will attempt to acquire an X lock. Only one connection can acquire it; it is not possible to determine exactly which one will. The other connection will have to wait until the lock is released before it can acquire the X lock. Keep in mind that if the row was locked by a DELETE or UPDATE, the waiter might end up not acquiring a lock after waiting, because there is nothing left in the database to lock.
In your example, the first UPDATE will acquire the X lock; the second UPDATE will then wait on the X lock and will eventually execute, but it will not match any rows.
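A minimal two-connection timeline of that outcome (MySQL/InnoDB syntax, with string literals assumed for the flag values):
-- connection 1:
START TRANSACTION;
UPDATE Items SET Flag = 'B' WHERE Flag = 'A';   -- takes an X lock on the matching row
-- connection 2 (blocks here until connection 1 releases the lock):
UPDATE Items SET Flag = 'C' WHERE Flag = 'A';
-- connection 1:
COMMIT;   -- connection 2 now proceeds but matches 0 rows: Flag is already 'B'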

An exclusive lock, used for data-modification operations such as INSERT, UPDATE, or DELETE, will be used in this scenario.
An exclusive lock ensures that multiple updates cannot be made to the same resource at the same time.
You will not get a race condition in this scenario.
If you have a more complex scenario involving multiple tables, then you may get race conditions or deadlocks. There are many ways to avoid this: simplifying and separating queries, etc.
You can also apply hints to queries that tell SQL Server what type of lock to use.
http://msdn.microsoft.com/en-us/library/aa213026(v=sql.80).aspx
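For instance, a hedged sketch of two common table hints against the question's table (the string literals are assumptions):
UPDATE Items WITH (ROWLOCK)  SET Flag = 'B' WHERE Flag = 'A';   -- ask for row-level locks
UPDATE Items WITH (TABLOCKX) SET Flag = 'B' WHERE Flag = 'A';   -- take an exclusive table lock instead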

Sounds like you should read about locking. SQL Server has a complex set of logic and will perform either table- or row-level locks based on the number of rows it estimates will require updates. Unless you specifically tell it which you want, it can even vary from query to query. Usually, if you are modifying a small subset of the table, it will choose a row-level lock.
SQL Server is designed with ACID in mind, so it writes changes to its logs before performing any actual updates to the data. This allows any failed updates to be rolled back and ensures consistency between queries (like the one you're asking about). You can perform dirty reads to get around locking issues; however, you cannot prevent SQL Server from locking inserted, updated and/or deleted records.
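A hedged sketch of such a dirty read, via the NOLOCK hint or the READ UNCOMMITTED isolation level (the question's table is reused for illustration):
SELECT * FROM Items WITH (NOLOCK) WHERE Flag = 'A';   -- may return uncommitted data
-- or, for the whole session:
SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED;
SELECT * FROM Items WHERE Flag = 'A';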
SQL Server Locking
EDIT: Here is an article about ACID.
ACID - Wikipedia

All SQL databases pretty much guarantee that such a collision will not occur. "When" locking occurs depends on whether locking is at the table, partition, page, or row level. Or, whether you have turned off such locking in your database.
What can happen, if you have concurrent update statements and multiple rows being updated, is that some rows are updated by the first statement and some by the second.
In general, I think of the WHERE clause as being evaluated to select the row set; the rows are then locked one at a time, updated, and unlocked. However, this depends on the type of locking. Under that model, the scenario above could proceed with the two statements interleaving, some rows getting each value.
If you are concerned about this situation, use table level locking to force serialization when concurrent update requests are being processed.
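For example, a hedged sketch of that serialization using explicit table locks (MySQL syntax shown; most databases have a LOCK TABLE equivalent, and the table is the one from the question):
LOCK TABLES Items WRITE;                        -- only this session may read or write Items now
UPDATE Items SET Flag = 'B' WHERE Flag = 'A';
UNLOCK TABLES;                                  -- concurrent updates run strictly one after another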

Related

Update table statement

Have a great day, everyone.
I have some confusion about the UPDATE statement in Oracle DB 12cR2.
Let's assume we have 3 users:
U1;
U2;
U3;
U1 has a table called TEST_1, and U2 and U3 both have UPDATE privilege on that table.
My question is: if U2 and U3 try to update the same rows in that table at the same time, what will happen? How will Oracle control such processes?
Thanks in advance
Although the first answer already explains the locking mechanism very well, let me add a bit more information.
In your case, we are talking about Row Locks (TX). Row-level locks are primarily used to prevent two transactions from modifying the same row. When a transaction needs to modify a row, a row lock is acquired.
There is no limit to the number of row locks held by a statement or transaction, and Oracle does not escalate locks from the row level to a coarser granularity. Row locking provides the finest grain locking possible and so provides the best possible concurrency and throughput.
When two transactions (updates, in your case) compete for the same row, the first will acquire the lock, and it won't release it until it either commits or rolls back. A system change number (SCN), a logical, internal timestamp used by Oracle Database, is assigned to each transaction. System change numbers order events that occur within the database, which is necessary to satisfy the ACID properties of a transaction.
SCNs occur in a monotonically increasing sequence. Oracle Database can use an SCN like a clock because an observed SCN indicates a logical point in time and repeated observations return equal or greater values. If one event has a lower SCN than another event, then it occurred at an earlier time with respect to the database. Several events may share the same SCN, which means that they occurred at the same time with respect to the database.
Every transaction has an SCN. For example, if a transaction updates a row, then the database records the SCN at which this update occurred. Other modifications in this transaction have the same SCN. When a transaction commits, the database records an SCN for this commit.
Oracle Database increments SCNs in the system global area (SGA). When a transaction modifies data, the database writes a new SCN to the undo data segment assigned to the transaction. The log writer process then writes the commit record of the transaction immediately to the online redo log. The commit record has the unique SCN of the transaction. Oracle Database also uses SCNs as part of its instance recovery and media recovery mechanisms.
When two transactions arrive at what appears to be the very same time, for example within the same second, the one whose timestamp is earlier acquires the lock. Keep in mind that TIMESTAMP stores fractional_seconds_precision, which specifies the number of digits in the fractional part of SECOND; this can be a number in the range 0 to 9.
In short: one of the two updating users will acquire a row lock while updating (whichever executes the query first, even by microseconds), and the other has to wait for the first to commit and release the lock. The waiting user then sees the committed, updated data to work on.
Snippet from another article:
COMMIT
When a COMMIT statement is issued to the database, the transaction has ended, and the following results are true:
All work done by the transaction becomes permanent.
Other users can see changes in data made by the transaction.
Any locks acquired by the transaction are released.
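Tying this back to the question, a hedged sketch of the two-session timeline (the ID and column on TEST_1 are assumptions for illustration):
-- session U2:
UPDATE u1.test_1 SET val = 'X' WHERE id = 1;   -- acquires the TX row lock
-- session U3, at the "same" time:
UPDATE u1.test_1 SET val = 'Y' WHERE id = 1;   -- blocks, waiting on U2's row lock
-- session U2:
COMMIT;                                        -- lock released; an SCN is recorded for the commit
-- session U3's UPDATE now proceeds against the committed row, and its own COMMIT gets a later SCN.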

Postgres - Deadlock while updating a column that is also part of where clause

My colleague at work and I were wondering whether, during an update in which a column is being updated while that same column is used in the WHERE clause, there is a chance of deadlock.
For ex:
UPDATE EMPLOYEES
SET DEPT_ID = NULL
WHERE DEPT_ID = 13;
So if the table EMPLOYEES contains about a million records, is there a chance of deadlock?
There is no chance of a deadlock at all. Not only will a single query never deadlock itself in Postgres (see comments), there is also no chance of a deadlock in combination with the same query in concurrent transactions.
The minimum "requirements" for a deadlock:
At least two competing concurrent transactions.
Each of them must lock a resource that one of the others will try to access later.
Each of them must later try to access a resource locked by another transaction, so that at least two end up waiting for each other to finish.
In theory, two concurrent, identical calls like the one you display have the potential for a deadlock if there are multiple rows with the same DEPT_ID. Since there is no ORDER BY for an UPDATE, it can take exclusive row locks on the rows to update in arbitrary order. Two identical commands might start with different rows and end up deadlocking each other.
In practice, this is not going to happen, because both concurrent updates will take locks in the same order, thereby voiding any potential for deadlocks. We would need additional concurrent transactions, or more commands in the same transaction, trying to lock resources out of order.
But all of this is completely unrelated to the fact that a column to be updated is also in the WHERE clause. (Even if indexes on the column are involved.) Due to the MVCC model of Postgres, it writes a new row version anyway, no matter which columns are actually updated.
If you should run into deadlocks involving out-of-order row locks, you can solve it using SELECT .. FOR UPDATE with a deterministic ORDER BY in a subquery:
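A minimal sketch of that pattern, assuming EMPLOYEES has a primary key EMP_ID:
UPDATE employees e
SET    dept_id = NULL
FROM  (
   SELECT emp_id
   FROM   employees
   WHERE  dept_id = 13
   ORDER  BY emp_id      -- deterministic lock order (in typical plans)
   FOR    UPDATE         -- take the row locks in the subquery
) sub
WHERE  e.emp_id = sub.emp_id;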
Avoiding PostgreSQL deadlocks when performing bulk update and delete operations
Postgres, update and lock ordering

Is UPDATE command thread safe (tracking revisions) in MS SQL

Suppose we have a blog tool where each time a user performs a modification on an Article(Id, Body, Revisions), the revision counter is incremented by 1. If we executed the following query (in MS SQL), assuming many people are trying to update the article, would we then get the 'right' Revisions?
Since I'm using EF, I have expressed the query in the following way:
context.Database.ExecuteSqlCommand("UPDATE dbo.Articles SET Revisions = Revisions + 1 WHERE Id=@p0;", articleId);
NB: What I mean by 'right' Revisions is that if we had 100 people updating the article simultaneously, once they are all finished, Revisions would be set to 100.
Yes, this is thread-safe. The database engine will lock the record during the update, which means any other threads will have to wait for it to finish its update.
During that time the field will indeed be incremented by one, without any interference from other threads. Once done, the resource is unlocked, and the next waiting thread will lock it in turn and do the same.
As explained in the docs, the lock is an exclusive one:
Exclusive (X) Used for data-modification operations, such as INSERT, UPDATE, or DELETE. Ensures that multiple updates cannot be made to the same resource at the same time.
and:
Exclusive Locks
Exclusive (X) locks prevent access to a resource by concurrent transactions. No other transactions can read or modify data locked with an exclusive (X) lock.
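To see why the single-statement form is safe, it helps to contrast it with a read-then-write split, which is not safe (the DECLARE form below is a hypothetical illustration, not your code):
-- Safe: the read and the write happen under one X lock, so 100
-- concurrent callers produce exactly 100 increments:
UPDATE dbo.Articles SET Revisions = Revisions + 1 WHERE Id = @p0;
-- NOT safe: two threads can both read the same value and write back
-- the same incremented number, losing an update:
DECLARE @rev int;
SELECT @rev = Revisions FROM dbo.Articles WHERE Id = @p0;
UPDATE dbo.Articles SET Revisions = @rev + 1 WHERE Id = @p0;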

Update via a temp table

So I have a rather large table (150 million rows) that data-scrub queries get run on nightly. These queries don't update a lot of records, but to get the records needed, they have to query that single table multiple times in subqueries, which takes some time.
So, would it be better for me to do a normal update statement, or to put the few results I need in a temp table and then just update those few rows? That would greatly reduce the locks held during the update.
I'm unsure how update-statement locks work when most of the time is spent querying. If it is only going to update 5 records but runs for half an hour, will it release a record that it updated in the first minute, or does it hold the lock until the end of the query?
Thanks
You need to look into (and use) the ROWLOCK table hint. You can use it with the update statement while updating in batches of 5,000 rows or less. This will attempt to place row locks in the target table (or on index keys, if a covering index is present). If for some reason that fails, the lock will be escalated to a table lock.
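A hedged sketch of that batching pattern (the table, column, and batch size are illustrative):
WHILE 1 = 1
BEGIN
    UPDATE TOP (5000) dbo.BigTable WITH (ROWLOCK)
    SET    Scrubbed = 1
    WHERE  Scrubbed = 0;

    IF @@ROWCOUNT = 0 BREAK;    -- stop once no rows are left to update
END
This also pairs well with the temp-table idea from your question: stage the keys first, then update in small, ROWLOCK-hinted batches.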
From MSDN (as for reasons why lock escalation might occur):
The Database Engine checks for possible escalations at every 1,250 newly acquired locks; a lock escalation will occur if and only if a Transact-SQL statement has acquired at least 5,000 locks on a single reference of a table. For example, lock escalation is not triggered if a statement acquires 3,000 locks in one index and 3,000 locks in another index of the same table. Similarly, lock escalation is not triggered if a statement has a self join on a table, and each reference to the table only acquires 3,000 locks in the table.
Actually, there's more to read in this last article; you should have a look at the mixed lock type escalation section.

Minimizing deadlocks with purposely contrived + highly concurrent transactions?

I'm currently working on benchmarking different isolation levels in SQL Server 2008, but right now I'm stuck on what seems to be a trivial deadlocking problem that I can't figure out. Hopefully someone here can offer advice (I'm a novice at SQL).
I currently have two types of transactions (to demonstrate dirty reads, but that's irrelevant):
Transaction Type A: Select all rows from Table A.
Transaction Type B: Set value 'cost' = 0 in all rows in Table A, then roll back immediately.
I currently run a threadpool of 1000 threads and 10,000 transactions, where each thread randomly chooses between executing Transaction Type A and Transaction Type B. However, I'm getting a ton of deadlocks even with forced row locking.
I assume that the deadlocks are occurring because of the row ordering of locks being acquired -- that is, if both Type A and Type B 'scan' table A in the same ordering, e.g. from top to bottom, such deadlocks cannot occur. However, I'm having trouble figuring out how to get SQL Server to maintain row ordering during SELECT and UPDATE statements.
Any tips? First time poster to stackoverflow, so please be gentle :-)
EDIT: The isolation level is purposely set to READ_COMMITTED to show that it eliminates dirty reads (and it does). Deadlocks only occur on any level equal to or higher than READ_COMMITTED; obviously no deadlocks occur on READ_UNCOMMITTED.
EDIT 2: These transactions are being run on a fresh instance of AdventureWorks LT on SQL Server 2008R2.
If you start a transaction to update all the rows (Type B) and then roll back the transaction, locks need to be held on all rows for the entire transaction. Even though you have row-level locks, each lock must be held until the transaction ends.
You may see fewer deadlocks with page-level or table-level locking, because these are easier for SQL Server to manage, but those locks still have to be held while the transaction is ongoing.
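For illustration, here is the shape of the Type B transaction (the AdventureWorks LT table and column are assumptions based on your description); every lock it takes is held until the rollback:
BEGIN TRANSACTION;
UPDATE SalesLT.Product SET StandardCost = 0;   -- X locks taken on every row (or escalated higher)
-- every lock acquired above is still held here...
ROLLBACK TRANSACTION;                          -- ...and released only now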
When you are designing a highly concurrent system, you should avoid queries that lock the whole table. I recommend the following Microsoft guide for understanding locks and reducing their impact:
http://technet.microsoft.com/en-us/library/cc966413.aspx