I am running a query with SKIP LOCKED:
select * from booker.review_tasks where review_task_id in (
2140285001,
2140285031,
2140304551
) for update skip locked;
And then updating some of the same rows in another, concurrent transaction:
update booker.review_tasks set priority = 190 where review_task_id in (2140285001,
2140304551);
The first transaction then fails with this error:
was aborted: ERROR: could not serialize access due to concurrent update Call getNextException to see other errors in the batch.
at com.amazon.contentreviewplatformservice.db.RWNamedJdbcTemplate.batchUpdate(RWNamedJdbcTemplate.java:126) ~[ContentReviewPlatformService-1.0.jar:?]
at com.amazon.contentreviewplatformservice.db.dao.DAOUtils.changeUserTaskAssignmentStatus(DAOUtils.java:126) ~[ContentReviewPlatformService-1.0.jar:?]
at com.amazon.contentreviewplatformservice.workflow.dao.WorkflowManagerDAO.changeReviewState(WorkflowManagerDAO.java:784) ~[ContentReviewPlatformService-1.0.jar:?]
How can a row be updated while it is locked? And if locked rows are skipped, why is there an error about a concurrent update?
The SELECT ... FOR UPDATE first selects the rows and then locks them. It sees the database in the state of its transaction snapshot, which was taken by the first statement in the REPEATABLE READ transaction, possibly a while back.
After the SELECT ... FOR UPDATE has found a row, it will try to lock it. Now if the UPDATE has run at any time between the time the snapshot was taken and the attempt to lock the row, PostgreSQL determines that the row version it would have to lock is different from the row version the transaction sees and throws a serialization error.
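A minimal two-session sketch of this race, using the table and one of the IDs from the question (the timing is what matters here, not the exact statements):
-- Session 1 (REPEATABLE READ): the snapshot is taken by the first statement
begin isolation level repeatable read;
select count(*) from booker.review_tasks;  -- snapshot taken here
-- Session 2: updates and commits in the meantime
update booker.review_tasks set priority = 190 where review_task_id = 2140285001;
-- on commit, the lock is released and a newer row version exists
-- Session 1: now tries to lock the row version it sees in its snapshot
select * from booker.review_tasks where review_task_id = 2140285001 for update skip locked;
-- ERROR: could not serialize access due to concurrent update
There is no lock left to skip at this point, but the committed row version no longer matches session 1's snapshot, so the lock attempt fails instead of skipping.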
Related
Let's assume I have the following query in two separate SSMS query windows:
SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED
BEGIN TRANSACTION
UPDATE dbo.Jobs
SET [status] = 'Running'
OUTPUT Inserted.*
WHERE [status] = 'Waiting'
--I'm NOT committing yet
--Commit Transaction
I run query window 1 (but do not commit), and then I run query window 2.
I want query window 2 to immediately update only rows that were inserted after I started query 1 (all new records come in with a status of 'Waiting'). However, SQL Server waits for the first query to finish, because an UPDATE statement does not read dirty values (even when the session is set to READ UNCOMMITTED).
Is there a way to overcome this?
In my application I will have 2 (or more) processes running it. I want process 2 to be able to pick up the rows that process 1 has not picked up; process 2 should not have to wait until process 1 is finished.
What you are asking for is simply impossible.
Even at the lowest isolation level of READ UNCOMMITTED (aka NOLOCK), an X-Lock (exclusive) must be taken in order to make modifications. In other words, writes are always locked, even if the reads that fetched those rows were not locked.
So even though session 2 is running under READ UNCOMMITTED also, if it wants to do a modification it must also take an X-Lock, which is incompatible with the first X-Lock.
The solution here is to either do this in one session, or commit immediately. In any case, do not hold locks for any length of time, as it can cause massive blocking chains and even deadlocks.
If you want to just ignore all those rows which have been inserted, you could use the WITH (READPAST) hint.
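For example, a sketch against the Jobs table from the question (READPAST skips rows that are currently locked instead of blocking on them; note the hint requires READ COMMITTED or REPEATABLE READ isolation):
UPDATE dbo.Jobs WITH (READPAST)
SET [status] = 'Running'
OUTPUT Inserted.*
WHERE [status] = 'Waiting';
-- rows locked by session 1 are skipped, so only unclaimed 'Waiting' rows are updated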
READ UNCOMMITTED as an isolation level or as a hint has huge issues.
It can cause anything from deadlocks to completely incorrect results. For example, you could read a row twice, or not at all, when by the logical definition of the schema there should have been exactly one row. You could read entire pages twice or not at all.
You can get deadlocks due to U-Locks not being taken in UPDATE and DELETE statements.
And you still take schema locks, so you can still get stuck behind a synchronous statistics update or an index rebuild.
I have a series of data modification operations to perform, such as:
1. update table_a set value=1 where id=1
2. update table_b set value=2 where id=1
3. update table_c set value=3 where id=1
and I want to ensure that all three operations complete. I know that a transaction can guarantee that either all are performed or none are performed, but my point is that all three must be performed: after the first SQL statement runs, the app instance may crash and the other two would be missed.
Note that this is a distributed environment; perhaps another app instance could take over the unfinished SQL, but how can I do that?
Can I use a stored procedure, so the app instance only fires the stored procedure and the database finishes all the SQL?
If the app instance suddenly crashes while performing the transaction, will that lead to a deadlock?
A deadlock is not the same thing as a crashed request that fails before the end of its execution. If your request crashes in the middle of a transaction, it won't lead to a deadlock.
It is generally better to use stored procedures, but that won't help you in this specific case.
What I would suggest is indeed the use of a transaction, with a try/catch to roll back the transaction in case of failure.
Something like this:
BEGIN TRY -- start of try
BEGIN TRANSACTION; -- start of transaction
update table_a set value=1 where id=1
update table_b set value=2 where id=1
update table_c set value=3 where id=1
COMMIT TRANSACTION; -- everything went OK, we commit
END TRY
BEGIN CATCH -- an error happened, we roll back
PRINT N'Unexpected error';
ROLLBACK TRANSACTION;
END CATCH
You can check more complete examples here
If an app is performing a transaction on a database server, and the app crashes (abruptly disconnects from the database) before committing the transaction, the database server rolls back the transaction. The disconnection does not leave the database in an unusable (potentially deadlocked) state.
So your database contents won't reflect any of your three UPDATE operations when your app crashes during your transaction. It will just lose the transaction in progress.
How to handle this potential failure mode?
Reduce the probability of a crash during a transaction. Try to avoid doing things in your app that could make it crash while your transaction is in progress. For example, if you get data from some other server or device, get it all before you begin your transaction. This solution is usually good enough for production apps.
Rig up some sort of way for your app, upon restarting, to find out the most recent successful transaction. One good way? Add a column like this to one of your tables: (this is a MySQL thing.)
last_update_timestamp TIMESTAMP DEFAULT NULL ON UPDATE CURRENT_TIMESTAMP,
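For illustration, here is how that column might sit in a hypothetical table (the table name and the other columns are made up):
CREATE TABLE jobs (
    job_id INT PRIMARY KEY,
    payload VARCHAR(255),
    last_update_timestamp TIMESTAMP NULL DEFAULT NULL ON UPDATE CURRENT_TIMESTAMP
);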
This causes every UPDATE operation on a row to automatically put NOW() into that row's last_update_timestamp column. Then, when your crashed app restarts, you can do
SELECT MAX(last_update_timestamp) FROM table
and you'll know when the most recent successful update occurred. This automatic update also gets rolled back if a transaction is rolled back. If you know when the last successful update occurred, your app may be able to redo the one that was rolled back by the crash.
If you choose to build a redo-transaction capability, be sure to build it so you can test it! if (testingAppCrash) crashNow = 1 / 0; might do the trick in your app.
I have the following statement in my PostgreSQL 10.5 database, which I execute in a repeatable read transaction:
delete from task
where task.task_id = (
select task.task_id
from task
order by task.created_at asc
limit 1
for update skip locked
)
returning
task.task_id,
task.created_at
Unfortunately, when I run it, I sometimes get:
[67] ERROR: could not serialize access due to concurrent update
[67] STATEMENT: delete from task
where task.task_id = (
select task.task_id
from task
order by task.created_at asc
limit $1
for update skip locked
)
returning
task.task_id,
task.created_at
which means the transaction rolled back because some other transaction modified the record in the meantime. (I think?)
I don't quite understand this. How could a different transaction modify a record that was selected with for update skip locked, and deleted?
This quote from the manual discusses your case exactly:
UPDATE, DELETE, SELECT FOR UPDATE, and SELECT FOR SHARE commands
behave the same as SELECT in terms of searching for target rows: they
will only find target rows that were committed as of the transaction
start time. However, such a target row might have already been updated
(or deleted or locked) by another concurrent transaction by the time
it is found. In this case, the repeatable read transaction will wait
for the first updating transaction to commit or roll back (if it is
still in progress). If the first updater rolls back, then its effects
are negated and the repeatable read transaction can proceed with
updating the originally found row. But if the first updater commits
(and actually updated or deleted the row, not just locked it) then the
repeatable read transaction will be rolled back with the message
ERROR: could not serialize access due to concurrent update
Meaning, your transaction was unable to lock the row in the first place, due to concurrent write access that got there first. SKIP LOCKED cannot save you from this completely: there may no longer be any lock to skip, yet you still run into a serialization failure if the row has been changed (and the change committed, hence the lock released) since the transaction started.
The same statement should work just fine with default READ COMMITTED transaction isolation. Related:
Postgres UPDATE … LIMIT 1
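For illustration, the same queue-pop under the default isolation level; the statement itself is unchanged, only the enclosing transaction differs:
begin;  -- READ COMMITTED is the default, so no SET TRANSACTION is needed
delete from task
where task.task_id = (
    select task.task_id
    from task
    order by task.created_at asc
    limit 1
    for update skip locked
)
returning
    task.task_id,
    task.created_at;
commit;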
So given this transaction:
select * from table_a where field_a = 'A' for update;
Assuming this returns multiple rows, will the database lock all the results right off the bat, or will it lock them one row at a time?
If the latter is true, does that mean running this query concurrently can result in a deadlock?
And if so, is adding an ORDER BY, to make the locking order consistent, the way to solve this problem?
The documentation explains what happens as follows:
FOR UPDATE
FOR UPDATE causes the rows retrieved by the SELECT statement to be
locked as though for update. This prevents them from being locked,
modified or deleted by other transactions until the current
transaction ends. That is, other transactions that attempt UPDATE,
DELETE, SELECT FOR UPDATE, SELECT FOR NO KEY UPDATE, SELECT FOR SHARE
or SELECT FOR KEY SHARE of these rows will be blocked until
the current transaction ends; conversely, SELECT FOR UPDATE will
wait for a concurrent transaction that has run any of those commands
on the same row, and will then lock and return the updated row (or no
row, if the row was deleted). Within a REPEATABLE READ or
SERIALIZABLE transaction, however, an error will be thrown if a row
to be locked has changed since the transaction started. For further
discussion see Section 13.4.
The direct answer to your question is that Postgres cannot lock all the rows "right off the bat"; it has to find them first. Remember, this is row-level locking rather than table-level locking.
The documentation includes this note:
SELECT FOR UPDATE modifies selected rows to mark them locked, and so
will result in disk writes.
I interpret this as saying that Postgres executes the SELECT query and as it finds the rows, it marks them as locked. The lock (for a given row) starts when Postgres identifies the row. It continues until the end of the transaction.
Based on this, I think it is possible for a deadlock situation to arise using SELECT FOR UPDATE.
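A sketch of how such a deadlock could arise (assuming an id column, with each session reaching the rows in the opposite order; the single-statement case behaves the same way if the two scans visit rows in different orders):
-- Session 1:
begin;
select * from table_a where id = 1 for update;  -- locks row 1
-- Session 2:
begin;
select * from table_a where id = 2 for update;  -- locks row 2
-- Session 1:
select * from table_a where id = 2 for update;  -- blocks, waiting for session 2
-- Session 2:
select * from table_a where id = 1 for update;  -- deadlock: each waits on the other
With an ORDER BY (and the same plan in both sessions), every session locks rows in the same order, so the second session simply waits instead of deadlocking.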
I am not able to understand how SELECT behaves when it is part of an exclusive transaction. Please consider the following scenarios:
Scenario 1
Step 1.1
create table Tmp(x int)
insert into Tmp values(1)
Step 1.2 – session 1
begin tran
set transaction isolation level serializable
select * from Tmp
Step 1.3 – session 2
select * from Tmp
Even though the first session hasn't finished, session 2 is able to read the Tmp table. I thought Tmp would hold an exclusive lock and that a shared lock should not be granted to the SELECT query in session 2, but that's not what happens. I have made sure that the default isolation level is READ COMMITTED.
Thanks in advance for helping me understand this behavior.
EDIT: Why do I need the SELECT to take an exclusive lock?
I have a stored procedure that generates sequential values. The flow is:
read the max value from the table and store it in a variable
update the table, setting value = value + 1
This SP is executed in parallel by several thousand instances. If two instances execute the SP at the same time, they will both read the same value and both update it to value + 1, whereas I want every execution to produce a sequential value. I think that's possible only if the SELECT is also part of the exclusive lock.
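For concreteness, a sketch of the race in that flow (the table and column names here are assumptions for illustration):
-- Both sessions run this concurrently under the default READ COMMITTED level:
begin tran
declare @v int;
select @v = max(value) from SeqTable;  -- shared lock released immediately; both sessions read 100
update SeqTable set value = @v + 1;    -- both sessions write 101: a duplicate
commit tran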
If you want a transaction to be serializable, you have to change that option before you start the outermost transaction. So your first session is incorrect and is still actually running under read committed (or whatever other level was in effect for that session).
But even if you correct the order of statements, it still will not acquire an exclusive lock for a plain SELECT statement.
If you want the plain SELECT to acquire an exclusive lock, you need to ask for it:
select * from Tmp with (XLOCK)
or you need to execute a statement that actually requires an exclusive lock:
update Tmp set x = x
Your first session doesn't need an exclusive lock because it isn't changing the data. If your first (serializable) session had run to completion, and either rolled back or committed, before your second session started, that session's results would still be the same, because your first session didn't change any data; so the "serializable" nature of the transaction was preserved.
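Applied to the sequential-value procedure from the EDIT, a minimal sketch (assuming the counter lives in the Tmp(x) table from the scenario) might look like:
begin tran
declare @v int;
select @v = x from Tmp with (XLOCK);  -- exclusive row lock taken at read time
update Tmp set x = @v + 1;            -- no other session can have read the same x
commit tran
-- @v + 1 is this execution's sequential value
A second concurrent execution blocks on the SELECT until the first transaction commits, so each execution reads a distinct value.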