So given this transaction:
select * from table_a where field_a = 'A' for update;
Assuming this gives out multiple rows/results, will the database lock all the results right off the bat? Or will it lock it one row at a time.
If the later is true, does that mean running this query concurrently, can result in a deadlock?
Thus, adding an order by to maintain consistency on the order is needed to solve this problem?
The documentation explains what happens as follows:
FOR UPDATE
FOR UPDATE causes the rows retrieved by the SELECT statement to be
locked as though for update. This prevents them from being locked,
modified or deleted by other transactions until the current
transaction ends. That is, other transactions that attempt UPDATE,
DELETE, SELECT FOR UPDATE, SELECT FOR NO KEY UPDATE, SELECT FOR SHARE
or SELECT FOR KEY SHARE of these rows will be blocked until
the current transaction ends; conversely, SELECT FOR UPDATE will
wait for a concurrent transaction that has run any of those commands
on the same row, and will then lock and return the updated row (or no
row, if the row was deleted). Within a REPEATABLE READ or
SERIALIZABLE transaction, however, an error will be thrown if a row
to be locked has changed since the transaction started. For further
discussion see Section 13.4.
The direct answer to your question is that Postgres cannot lock all the rows "right off the bat"; it has to find them first. Remember, this is row-level locking rather than table-level locking.
The documentation includes this note:
SELECT FOR UPDATE modifies selected rows to mark them locked, and so
will result in disk writes.
I interpret this as saying that Postgres executes the SELECT query and as it finds the rows, it marks them as locked. The lock (for a given row) starts when Postgres identifies the row. It continues until the end of the transaction.
Based on this, I think it is possible for a deadlock situation to arise using SELECT FOR UPDATE.
In the past I always thought that select query would not blocks other insert sql. However, recently I wrote a query that takes a long time (more than 2 min) to select data from a table. During the select, a number of insert sql statements were timing out.
If select blocks insert, what would be the solution way to prevent the timeout without causing dirty read?
I have investigate option of using isolation snapshot, but currently I have no access to change the client's database to enable the “ALLOW_SNAPSHOT_ISOLATION”.
Thanks
When does a Select query block Inserts or Updates to the same or
other table(s)?
When it holds a lock on a resource that is mutually exclusive with one that the insert or update statement needs.
Under readcommitted isolation level with no additional locking hints then the S locks taken out are typically released as soon as the data is read. For repeatable read or serializable however they will be held until the end of the transaction (statement for a single select not running in an explicit transaction).
serializable will often take out range locks which will cause additional blocking over and above that caused by the holding of locks on the rows and pages actually read.
READPAST might be what you're looking for - check out this article.
Using Sql Server 2005. I have a long running update that may take about 60 seconds in our production environment. The update is not part of any explicit transactions nor has any sql hints. While the update is running, what's to be expected from other requests that occur on those rows that will be updated? There's about 6 million total rows in the table that will be updated of which about 500,000 rows will be updated.
Some concurrency concerns/questions:
1) What if another select query (with nolock hint) is performed on this table among some of the rows that are being updated. Will the query wait until the update is finished?
2) What the other select query does not have a nolock hint? Will this query have to wait until the update is finished?
3) What if another update query is performing an update on one of these rows? Will this query have to wait until it's finished?
4) What about deletes?
5) What about inserts?
Thanks!
Dave
Every statement in sql server runs in a transaction. If you don't explicitly start one, the server starts one for every statement and commits it if the statement is successful, and rolls it back if it is not.
The exact locking you'll see with your update, unfortunately, depends. It will start off as row locks, but it is likely that it will be elevated to at least a few page locks based on the number of rows you're updating. Full elevation to a table lock is unlikely, but this depends in some amount on your server - SQL Server will elevate it if the page locks are using too much memory.
If your select is ran with nolock then you will get dirty reads if you happen to select any rows which are involved in the update. This means you will read the uncommitted data, and it may not be consistent with other rows (since those may not have been updated yet).
For all other cases if your statement encounters a row involved in the update, or a row on a locked page (assuming the lock has been elevated) then the statement will have to wait for the update to finish.
I'm using a Microsoft SQL Server 2005 database with isolation level READ_COMMITTED and READ_COMMITTED_SNAPSHOT=ON.
Now I want to use:
SELECT * FROM <tablename> FOR UPDATE
...so that other database connections block when trying to access the same row "FOR UPDATE".
I tried:
SELECT * FROM <tablename> WITH (updlock) WHERE id=1
...but this blocks all other connections even for selecting an id other than "1".
Which is the correct hint to do a SELECT FOR UPDATE as known for Oracle, DB2, MySql?
EDIT 2009-10-03:
These are the statements to create the table and the index:
CREATE TABLE example ( Id BIGINT NOT NULL, TransactionId BIGINT,
Terminal BIGINT, Status SMALLINT );
ALTER TABLE example ADD CONSTRAINT index108 PRIMARY KEY ( Id )
CREATE INDEX I108_FkTerminal ON example ( Terminal )
CREATE INDEX I108_Key ON example ( TransactionId )
A lot of parallel processes do this SELECT:
SELECT * FROM example o WITH (updlock) WHERE o.TransactionId = ?
EDIT 2009-10-05:
For a better overview I've written down all tried solutions in the following table:
mechanism | SELECT on different row blocks | SELECT on same row blocks
-----------------------+--------------------------------+--------------------------
ROWLOCK | no | no
updlock, rowlock | yes | yes
xlock,rowlock | yes | yes
repeatableread | no | no
DBCC TRACEON (1211,-1) | yes | yes
rowlock,xlock,holdlock | yes | yes
updlock,holdlock | yes | yes
UPDLOCK,READPAST | no | no
I'm looking for | no | yes
Recently I had a deadlock problem because Sql Server locks more then necessary (page). You can't really do anything against it. Now we are catching deadlock exceptions... and I wish I had Oracle instead.
Edit:
We are using snapshot isolation meanwhile, which solves many, but not all of the problems. Unfortunately, to be able to use snapshot isolation it must be allowed by the database server, which may cause unnecessary problems at customers site. Now we are not only catching deadlock exceptions (which still can occur, of course) but also snapshot concurrency problems to repeat transactions from background processes (which cannot be repeated by the user). But this still performs much better than before.
I have a similar problem, I want to lock only 1 row.
As far as I know, with UPDLOCK option, SQLSERVER locks all the rows that it needs to read in order to get the row. So, if you don't define a index to direct access to the row, all the preceded rows will be locked.
In your example:
Asume that you have a table named TBL with an id field.
You want to lock the row with id=10.
You need to define a index for the field id (or any other fields that are involved in you select):
CREATE INDEX TBLINDEX ON TBL ( id )
And then, your query to lock ONLY the rows that you read is:
SELECT * FROM TBL WITH (UPDLOCK, INDEX(TBLINDEX)) WHERE id=10.
If you don't use the INDEX(TBLINDEX) option, SQLSERVER needs to read all rows from the beginning of the table to find your row with id=10, so those rows will be locked.
You cannot have snapshot isolation and blocking reads at the same time. The purpose of snapshot isolation is to prevent blocking reads.
perhaps making mvcc permanent could solve it (as opposed to specific batch only: SET TRANSACTION ISOLATION LEVEL SNAPSHOT):
ALTER DATABASE yourDbNameHere SET READ_COMMITTED_SNAPSHOT ON;
[EDIT: October 14]
After reading this: Better concurrency in Oracle than SQL Server? and this: http://msdn.microsoft.com/en-us/library/ms175095.aspx
When the READ_COMMITTED_SNAPSHOT
database option is set ON, the
mechanisms used to support the option
are activated immediately. When
setting the READ_COMMITTED_SNAPSHOT
option, only the connection executing
the ALTER DATABASE command is allowed
in the database. There must be no
other open connection in the database
until ALTER DATABASE is complete. The
database does not have to be in
single-user mode.
i've come to conclusion that you need to set two flags in order to activate mssql's MVCC permanently on a given database:
ALTER DATABASE yourDbNameHere SET ALLOW_SNAPSHOT_ISOLATION ON;
ALTER DATABASE yourDbNameHere SET READ_COMMITTED_SNAPSHOT ON;
Try (updlock, rowlock)
The full answer could delve into the internals of the DBMS. It depends on how the query engine (which executes the query plan generated by the SQL optimizer) operates.
However, one possible explanation (applicable to at least some versions of some DBMS - not necessarily to MS SQL Server) is that there is no index on the ID column, so any process trying to work a query with 'WHERE id = ?' in it ends up doing a sequential scan of the table, and that sequential scan hits the lock which your process applied. You can also run into problems if the DBMS applies page-level locking by default; locking one row locks the entire page and all the rows on that page.
There are some ways you could debunk this as the source of trouble. Look at the query plan; study the indexes; try your SELECT with ID of 1000000 instead of 1 and see whether other processes are still blocked.
OK, a single select wil by default use "Read Committed" transaction isolation which locks and therefore stops writes to that set. You can change the transaction isolation level with
Set Transaction Isolation Level { Read Uncommitted | Read Committed | Repeatable Read | Serializable }
Begin Tran
Select ...
Commit Tran
These are explained in detail in SQL Server BOL
Your next problem is that by default SQL Server 2K5 will escalate the locks if you have more than ~2500 locks or use more than 40% of 'normal' memory in the lock transaction. The escalation goes to page, then table lock
You can switch this escalation off by setting "trace flag" 1211t, see BOL for more information
Create a fake update to enforce the rowlock.
UPDATE <tablename> (ROWLOCK) SET <somecolumn> = <somecolumn> WHERE id=1
If that's not locking your row, god knows what will.
After this "UPDATE" you can do your SELECT (ROWLOCK) and subsequent updates.
I'm assuming you don't want any other session to be able to read the row while this specific query is running...
Wrapping your SELECT in a transaction while using WITH (XLOCK,READPAST) locking hint will get the results you want. Just make sure those other concurrent reads are NOT using WITH (NOLOCK). READPAST allows other sessions to perform the same SELECT but on other rows.
BEGIN TRAN
SELECT *
FROM <tablename> WITH (XLOCK,READPAST)
WHERE RowId = #SomeId
-- Do SOMETHING
UPDATE <tablename>
SET <column>=#somevalue
WHERE RowId=#SomeId
COMMIT
Question - is this case proven to be the result of lock escalation (i.e. if you trace with profiler for lock escalation events, is that definitely what is happening to cause the blocking)? If so, there is a full explanation and a (rather extreme) workaround by enabling a trace flag at the instance level to prevent lock escalation. See http://support.microsoft.com/kb/323630 trace flag 1211
But, that will likely have unintended side effects.
If you are deliberately locking a row and keeping it locked for an extended period, then using the internal locking mechanism for transactions isn't the best method (in SQL Server at least). All the optimization in SQL Server is geared toward short transactions - get in, make an update, get out. That's the reason for lock escalation in the first place.
So if the intent is to "check out" a row for a prolonged period, instead of transactional locking it's best to use a column with values and a plain ol' update statement to flag the rows as locked or not.
Application locks are one way to roll your own locking with custom granularity while avoiding "helpful" lock escalation. See sp_getapplock.
Try using:
SELECT * FROM <tablename> WITH ROWLOCK XLOCK HOLDLOCK
This should make the lock exclusive and hold it for the duration of the transaction.
According to this article, the solution is to use the WITH(REPEATABLEREAD) hint.
Revisit all your queries, maybe you have some query that select without ROWLOCK/FOR UPDATE hint from the same table you have SELECT FOR UPDATE.
MSSQL often escalates those row locks to page-level locks (even table-level locks, if you don't have index on field you are querying), see this explanation. Since you ask for FOR UPDATE, i could assume that you need transacion-level(e.g. financial, inventory, etc) robustness. So the advice on that site is not applicable to your problem. It's just an insight why MSSQL escalates locks.
If you are already using MSSQL 2005(and up), they are MVCC-based, i think you should have no problem with row-level lock using ROWLOCK/UPDLOCK hint. But if you are already using MSSQL 2005 and up, try to check some of your queries which query the same table you want to FOR UPDATE if they escalate locks by checking the fields on their WHERE clause if they have index.
P.S.
I'm using PostgreSQL, it also uses MVCC have FOR UPDATE, i don't encounter same problem. Lock escalations is what MVCC solves, so i would be surprised if MSSQL 2005 still escalate locks on table with WHERE clauses that doesn't have index on its fields. If that(lock escalation) is still the case for MSSQL 2005, try to check the fields on WHERE clauses if they have index.
Disclaimer: my last use of MSSQL is version 2000 only.
You have to deal with the exception at commit time and repeat the transaction.
I solved the rowlock problem in a completely different way. I realized that sql server was not able to manage such a lock in a satisfying way. I choosed to solve this from a programatically point of view by the use of a mutex... waitForLock... releaseLock...
Have you tried READPAST?
I've used UPDLOCK and READPAST together when treating a table like a queue.
How about trying to do a simple update on this row first (without really changing any data)? After that you can proceed with the row like in was selected for update.
UPDATE dbo.Customer SET FieldForLock = FieldForLock WHERE CustomerID = #CustomerID
/* do whatever you want */
Edit: you should wrap it in a transaction of course
Edit 2: another solution is to use SERIALIZABLE isolation level
We are using ADO.NET to connect to a SQL 2005 server, and doing a number of inserts/updates and selects in it. We changed one of the updates to be inside a transaction however it appears to (b)lock the entire table when we do it, regardless of the IsolationLevel we set on the transaction.
The behavior that I seem to see is that:
If you have no transactions then it's an all out fight (losers getting dead locked)
If you have a few transactions then they win all the time and block all others out unless
If you have a few transactions and you set something like nolock on the rest then you get transactions and nothing blocked. This is because every statement (select/insert/delete/update) has an isolationlevel regardless of transactions.
Is this correct?
The answer to your question is: It depends.
If you are updating a table, SQL Server uses several strategies to decide how many rows to lock, row level locks, page locks or full table locks.
If you are updating more than a certain percentage of the table (configurable as I remember), then SQL Server gives you a table level lock, which may block selects.
The best reference is:
Understanding Locking in SQL Server:
http://msdn.microsoft.com/en-us/library/aa213039(SQL.80).aspx
(for SQL Server 2000)
Introduction to Locking in SQL Server: http://www.sqlteam.com/article/introduction-to-locking-in-sql-server
Isolation Levels in the Database Engine: http://msdn.microsoft.com/en-us/library/ms189122.aspx (for SQL server 2008, but 2005 version is available).
Good luck.
Your update statement (i.e one that changes data) will hold locks regardless of the isolation level and whether you have explicitly defined a transaction of not.
What you can control is the granularity of the locks by using query hints. So if the update is locking the entire table, then you can specify a query hint to only lock the affected rows (ROWLOCK hint). That is unless your query is updating the whole table of course.
So to answer your question, the first connection to request locks on a resource will hold those locks for the duration of the transaction. You can specify that a select does not hold locks by using the read uncommitted isolation level, statements that change data insert/update/delete always hold locks regardless. The next connection to request locks on the same resource will wait until the first has finished and will then hold its locks. Dead locking is a specific scenario where two connections are holding locks and each is waiting for the other connection's resource, to avoid the engine waiting forever, one connection is chosen as the deadlock victim.