Is `lock_timeout` necessary for a Postgres CREATE INDEX CONCURRENTLY?

When running CREATE INDEX CONCURRENTLY, can you lock up a table while the SHARE UPDATE EXCLUSIVE lock is being acquired? If it took 10 minutes to acquire the lock, would anyone be blocked from using the table during that time?
The point of running CONCURRENTLY is to safely add an index to an active table. But what I can't find a clear answer on is whether the initial lock being acquired can cause queries to queue up.
The documentation for a particular safe postgres migrations library mentions (https://github.com/doctolib/safe-pg-migrations#user-content-safe_add_remove_index):
If you still get lock timeout while adding / removing indexes, it might be for one of those reasons:
Long-running queries are active on the table. To create / remove an index, PG needs to wait for the queries that are actually running to finish before starting the index creation / removal. The blocking activity logger might help you to pinpoint the culprit queries.
A vacuum / autovacuum is running on the table, holding a ShareUpdateExclusiveLock, you are most likely out of luck for the current migration, but you may try to optimize your autovacuums settings.
But even that doesn't clearly tell me: if I don't set a timeout (so `lock_timeout` is 0) and the lock takes a long time to acquire, will other queries wait until the lock is acquired, or will the CREATE INDEX be the only thing blocked during that time?

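While it waits, it blocks only the things that a SHARE UPDATE EXCLUSIVE lock conflicts with, such as VACUUM, ANALYZE, and another CREATE INDEX CONCURRENTLY on the same table. It does not block ordinary SELECT, INSERT, UPDATE, or DELETE.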
The most troubling thing it blocks might be ANALYZE. If the stats get too out of date and can't be repaired, your SELECTs might start choosing ridiculous plans which never finish, despite not being formally blocked.
It does have to transiently acquire stronger locks than ShareUpdateExclusive, but if it has to wait for them it does so in an indirect way which doesn't block others.
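To make "the things that ShareUpdateExclusive blocks" concrete, here is a small sketch of the relevant slice of PostgreSQL's documented table-lock conflict rules. The names `SUE_CONFLICTS`, `LOCK_TAKEN_BY`, and `queues_behind_cic` are made up for illustration, and the command list is deliberately partial.

```python
# Lock modes that conflict with SHARE UPDATE EXCLUSIVE, per the PostgreSQL
# "Explicit Locking" conflict table: itself and every stronger mode.
SUE_CONFLICTS = {
    "SHARE UPDATE EXCLUSIVE",
    "SHARE",
    "SHARE ROW EXCLUSIVE",
    "EXCLUSIVE",
    "ACCESS EXCLUSIVE",
}

# Table-level lock taken by a few common commands (partial list).
LOCK_TAKEN_BY = {
    "SELECT": "ACCESS SHARE",
    "INSERT": "ROW EXCLUSIVE",
    "UPDATE": "ROW EXCLUSIVE",
    "DELETE": "ROW EXCLUSIVE",
    "ANALYZE": "SHARE UPDATE EXCLUSIVE",
    "VACUUM": "SHARE UPDATE EXCLUSIVE",
    "CREATE INDEX CONCURRENTLY": "SHARE UPDATE EXCLUSIVE",
    "CREATE INDEX": "SHARE",
    "TRUNCATE": "ACCESS EXCLUSIVE",
}


def queues_behind_cic(command):
    """True if `command` would wait behind a CREATE INDEX CONCURRENTLY
    that holds, or is queued for, SHARE UPDATE EXCLUSIVE on the table."""
    return LOCK_TAKEN_BY[command] in SUE_CONFLICTS


for cmd in LOCK_TAKEN_BY:
    print(f"{cmd:28s} blocked: {queues_behind_cic(cmd)}")
```

Ordinary reads and writes come out as not blocked, while maintenance commands and schema changes do.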

Related

Azure SQL Managed Instance - Blocked by Negative SPID

I have a nightly "archive and delete" process that archives data outside of a 150-day sliding window to Azure Blob Storage, and then deletes the data. In the past, the deletion process ran for about two hours and of course blocked all sorts of other processes (this is a big, busy table). So, we modified the process to delete in chunks, and that helped with the blocking of other processes.
However, recently the deletion process has been taking 12+ hours to run. When checking for blocking, it's constantly blocked by SPID -5 ... and I understand, this is supposedly an orphaned DT. However, none of the queries I run to get the GUID return any rows, for example:
SELECT DISTINCT request_owner_guid AS UoW_Guid
FROM sys.dm_tran_locks
WHERE request_session_id = -5;
Any suggestions on what I need to do here? This is becoming a real problem. Thanks.

How to prevent locks in Redshift (shared lock stopping a write job)

I have a data warehouse which is used by multiple downstream users. They read the data from a Redshift table. While they read, a shared lock is held on the table. During that time, my daily job, which is supposed to write to the table, cannot write because it cannot take an exclusive lock until the shared lock is released.
Ideally my write job should take priority over any read job. Can I enforce this in some way?
Usually this is done by your update process not requiring an exclusive lock or managing the need for locks so that the update process isn't blocked.
Can you describe your update process and which steps are requiring the exclusive locks?
Look at the locks and statements causing them when things are making forward progress. Reworking these parts should allow you to keep your updates moving while these read sessions are acting on the versions of data they started with.
It is also important not to have user transactions that hang around for days on end. This can happen when interactive sessions are left open mid-transaction. Avoiding this also prevents errors due to some sessions seeing very old versions of data.

Updating on commit to avoid deadlocks

I have a table that tracks the last update time of another table's partitions so our reconciler need only check the partitions that have been updated since the last reconcile. There are multiple threads updating the partitioned table and therefore updating the same row of the latest update time table several times each. This is obviously causing deadlocks. Is there a way to prevent these deadlocks by only updating once on commit?
I was thinking of maybe using a session local temporary table, but not sure how to transfer the values to the global table on commit.
There is no way to trigger a process on commit so that approach probably won't work.
Potentially, you could have each of the writer processes write to an Oracle Advanced Queue (AQ) and then have another process that de-queues the messages and actually applies them to the current table. That would mean that there would be some lag between the writer session committing and the AQ processor picking up and processing the message but that lag shouldn't be too long. You could do the same thing by having each writer thread insert into a queue-like table and having a separate thread process that table if you don't want to use AQ.
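The queue-like approach can be sketched without a database. Below, Python's `queue.Queue` stands in for AQ or the queue-like table, and a single applier thread is the only code that touches the "last update" map, so writers never contend for the same row. All names (`run_writers_and_applier`, `pending`, `last_update`) and the integer timestamps are hypothetical.

```python
import queue
import threading


def run_writers_and_applier(n_writers=4, updates_per_writer=100):
    pending = queue.Queue()   # stand-in for AQ or a queue-like table
    last_update = {}          # partition -> latest timestamp seen

    def writer(writer_id):
        # Writers only enqueue; they never touch last_update directly.
        for i in range(updates_per_writer):
            partition = i % 3
            timestamp = writer_id * updates_per_writer + i
            pending.put((partition, timestamp))

    def applier():
        # The single consumer applies updates one at a time.
        while True:
            item = pending.get()
            if item is None:  # sentinel: all writers are done
                break
            partition, timestamp = item
            if timestamp > last_update.get(partition, -1):
                last_update[partition] = timestamp

    applier_thread = threading.Thread(target=applier)
    applier_thread.start()
    writers = [threading.Thread(target=writer, args=(w,))
               for w in range(n_writers)]
    for t in writers:
        t.start()
    for t in writers:
        t.join()
    pending.put(None)
    applier_thread.join()
    return last_update


print(run_writers_and_applier())
```

The trade-off is the one described above: the tracking table lags slightly behind the writers' commits.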
I'm confused, though, by how the process you are describing could cause a deadlock. Are you really talking about a deadlock (i.e. an ORA-00060 error is thrown and a deadlock trace file is generated)? What you are describing should lead to blocking locks, not deadlocks, unless there is more going on than you have told us.

Handle Lock Manually in SQL Server?

I am new to SQL Server, but I have a fair knowledge of simple things like select/update/delete and other transactions. I am facing a deadlock scenario in my application. As I understand it, many threads are trying to run a set of update operations in parallel. It is not a single update but a set of update operations.
I understand that this cannot be avoided in my application, as many people want to do an update simultaneously. So I want to have a manual lock system. First, thread 1 should check whether the manual lock is available and then start the transaction. Meanwhile, if a second thread requests the lock, it should find it busy and wait. Once the first thread has completed, the second should acquire the lock and start its transaction.
This is just a logic I have thought about, but I do not have any idea how to do this in SQL Server. Are there any examples which can help me? Please let me know if you can give me some sample SQL scripts or links that would be helpful. Thank you for your time and help.
You probably mean "semaphore". That is, something to serialise execution of the DML so that only one process can run it at a time.
This is native in SQL Server, using sp_getapplock.
You can configure the second process to wait or fail when it calls sp_getapplock, and the lock can also be self-cancelling in "transaction" mode.
You will still most likely end up in the same scenario: a deadlock based around your tailor-made locks. SQL Server internally implements a very robust locking mechanism. You should use it.
The problem you're having is that resources (tables, indexes, etc.) are accessed (or modified) in a conflicting order by different transactions/threads.
If you create your own locking mechanism, you may end up with a dead lock just the same. Example:
1) Thread 1 creates a lock on Customer record
2) Thread 2 creates a lock on Order record
3) Thread 1 attempts to create a lock on Order record (but cannot proceed due to step 2)
4) Thread 2 attempts to create a lock on Customer record (but cannot proceed due to step 1)
Voila ... deadlock
The solution is to refactor the way resources are accessed, so records are always accessed in the same order and the problem will go away.
1) Thread 1 creates a lock on Customer record
2) Thread 2 attempts to create a lock on Customer record (but cannot proceed due to step 1)
3) Thread 1 creates a lock on Order record
4) Thread 1 completes transaction and unlocks both Order and Customer records
5) Thread 2 creates a lock on Customer record
6) Thread 2 creates a lock on Order record
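The reordered sequence can be illustrated in application code. This is only a sketch: Python `threading.Lock` objects stand in for the database's record locks, and the names `locks`, `with_ordered_locks`, and `run_demo` are hypothetical. Each thread asks for the two locks in a different order, but the helper always acquires them in sorted order, so the circular wait from the first example cannot form.

```python
import threading

# Stand-ins for the Customer and Order record locks.
locks = {"customer": threading.Lock(), "order": threading.Lock()}


def with_ordered_locks(names, action):
    """Acquire the named locks in one global (sorted) order, run action,
    then release in reverse order."""
    ordered = sorted(set(names))
    for name in ordered:
        locks[name].acquire()
    try:
        return action()
    finally:
        for name in reversed(ordered):
            locks[name].release()


def run_demo(rounds=1000):
    counter = {"n": 0}

    def bump():
        counter["n"] += 1  # protected by both locks

    def worker(names):
        for _ in range(rounds):
            with_ordered_locks(names, bump)

    # The two threads request the locks in opposite orders.
    t1 = threading.Thread(target=worker, args=(["customer", "order"],))
    t2 = threading.Thread(target=worker, args=(["order", "customer"],))
    t1.start()
    t2.start()
    t1.join()
    t2.join()
    return counter["n"]


print(run_demo())
```

In SQL the same idea means touching tables and rows in the same order in every transaction, rather than adding a separate locking layer.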
Also, have a look here to read how locking can happen on a single table.
Your manual lock system sounds interesting, but you need to be aware that it will sacrifice concurrency, which is quite important for many OLTP applications.
Advanced databases like Oracle and SQL Server are quite good at avoiding deadlocks and give you tools to resolve them, which let you kill the session that causes the deadlock so the other query can finish its job first.
Microsoft has documentation, which can be found here:
http://support.microsoft.com/kb/832524
Besides, there are many other reasons that could lead to a deadlock. You can find some examples in the question "how to solve deadlock problem?".

Deadlock error in INSERT statement

We've got a web-based application. It has time-consuming database operations (INSERTs and UPDATEs), so this particular flow has been moved into a Java Thread so that the request does not wait (block) for the database operation to complete.
My problem is that if more than one user hits this particular flow at the same time, I get the following error from PostgreSQL:
org.postgresql.util.PSQLException: ERROR: deadlock detected
Detail: Process 13560 waits for ShareLock on transaction 3147316424; blocked by process 13566.
Process 13566 waits for ShareLock on transaction 3147316408; blocked by process 13560.
The above error is consistently thrown in INSERT statements.
Additional Information:
1) I have PRIMARY KEY defined in this table.
2) There are FOREIGN KEY references in this table.
3) Separate database connection is passed to each Java Thread.
Technologies
Web Server: Tomcat v6.0.10
Java v1.6.0
Servlet
Database: PostgreSQL v8.2.3
Connection Management: pgpool II
One way to cope with deadlocks is to have a retry mechanism that waits for a random interval and then runs the transaction again. The random interval is necessary so that the colliding transactions don't keep bumping into each other, causing what is called a livelock, which is even nastier to debug. Most complex applications will need such a retry mechanism sooner or later anyway, to handle transaction serialization failures.
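A minimal sketch of such a retry loop follows. `DeadlockDetected` is a hypothetical stand-in for whatever your driver raises (PostgreSQL reports deadlocks with SQLSTATE 40P01), and `run_with_retry`, `max_attempts`, and `base_delay` are illustrative names and values, not recommendations.

```python
import random
import time


class DeadlockDetected(Exception):
    """Hypothetical stand-in for the driver's deadlock error."""


def run_with_retry(transaction, max_attempts=5, base_delay=0.01,
                   sleep=time.sleep):
    """Run transaction(); on deadlock, wait a random interval and retry."""
    for attempt in range(1, max_attempts + 1):
        try:
            return transaction()
        except DeadlockDetected:
            if attempt == max_attempts:
                raise  # give up and surface the error to the caller
            # Random wait so colliding transactions do not retry in lockstep;
            # the upper bound grows with each failed attempt.
            sleep(random.uniform(0, base_delay * 2 ** attempt))
```

The callable passed in should run the whole transaction from BEGIN to COMMIT, because after a deadlock error the database has rolled the transaction back.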
Of course if you are able to determine the cause of the deadlock it's usually much better to eliminate it or it will come back to bite you. For almost all cases, even when the deadlock condition is rare, the little bit of throughput and coding overhead to get the locks in deterministic order or get more coarse-grained locks is worth it to avoid the occasional large latency hit and the sudden performance cliff when scaling concurrency.
When you are consistently getting two INSERT statements deadlocking, it's most likely a unique index insert order issue. Try, for example, the following in two psql command windows:
Thread A          | Thread B
BEGIN;            | BEGIN;
                  | INSERT uniq=1;
INSERT uniq=2;    |
                  | INSERT uniq=2;
                  |   blocks waiting for thread A to commit or
                  |   roll back, to see if this is a unique key error.
INSERT uniq=1;    |
  blocks waiting  |
  for thread B,   |
  DEADLOCK        |
      V           |
Usually the best course of action to resolve this is to figure out the parent objects that guard all such transactions. Most applications have one or two primary entities, such as users or accounts, that are good candidates for this. Then all you need is for every transaction to lock the primary entity it touches via SELECT ... FOR UPDATE. Or, if it touches several, to lock all of them, but in the same order every time (ordering by primary key is a good choice).
What PostgreSQL does here is covered in the documentation on Explicit Locking. The example in the "Deadlocks" section shows what you're probably doing. The part you may not have expected is that when you UPDATE something, that acquires a lock on that row that continues until the transaction involved ends. If you have multiple clients all doing updates of more than one thing at once, you'll inevitably end up with deadlocks unless you go out of your way to prevent them.
If you have multiple statements that take out implicit locks, like UPDATE, you should wrap the whole sequence in a BEGIN/COMMIT transaction block and make sure they acquire locks (even the implicit ones that UPDATE grabs) in a consistent order everywhere. If you need to update something in table A then table B, and one part of the app does A then B while the other does B then A, you're going to deadlock one day. Two UPDATEs against the same table are similarly destined to fail unless you can enforce some ordering of the two that's repeatable among clients. Sorting by primary key once you have the set of records to update and always grabbing the "lower" one first is a common strategy.
It's less likely your INSERTs are to blame here. Those are much harder to get into a deadlocked situation unless you violate a primary key, as Ants already described.
What you don't want to do is try and duplicate locking in your app, which is going to turn into a giant scalability and reliability mess (and will likely still result in database deadlocks). If you can't work around this within the confines of the standard database locking methods, consider using either the advisory lock facility or explicit LOCK TABLE to enforce what you need instead. That will save you a world of painful coding over trying to push all the locks onto the client side. If you have multiple updates against a table and can't enforce the order they happen in, you have no choice but to lock the whole table while you execute them; that's the only route that doesn't introduce a potential for deadlock.
Deadlock explained:
In a nutshell, a particular SQL statement (INSERT or other), call it "statement A", is waiting on another statement to release a lock on a particular part of the database before it can proceed. Until that lock is released, statement A cannot access this part of the database to do its job (a regular lock situation). But statement A has itself put a lock on another part of the database, to ensure that no other users access it (for reading, or for modifying/deleting, depending on the type of lock). Now the second SQL statement itself needs to access the data locked by statement A. That is a deadlock: both statements will wait on one another indefinitely.
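The circular wait can be pictured as a cycle in a "waits-for" graph. The sketch below is a simplified illustration, not PostgreSQL's actual detector: `find_deadlock` is a hypothetical helper that assumes each process waits on at most one other process, and the process IDs are taken from the error message in the question.

```python
def find_deadlock(waits_for):
    """Return a cycle such as [p, q, p] if one exists, else None.

    waits_for maps a process ID to the process ID it is blocked by.
    """
    for start in waits_for:
        seen = []
        current = start
        # Follow the chain of "blocked by" edges until it ends or repeats.
        while current in waits_for and current not in seen:
            seen.append(current)
            current = waits_for[current]
        if current in seen:
            # The chain came back to a process already on the path.
            return seen[seen.index(current):] + [current]
    return None


# Process 13560 is blocked by 13566, and 13566 is blocked by 13560.
print(find_deadlock({13560: 13566, 13566: 13560}))
```

A plain blocking lock is a chain that ends somewhere, so the waiting statement eventually proceeds. A deadlock is a chain that loops back on itself.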
The remedy...
This requires knowing the specific SQL statements these various threads are running, and looking there for a way to either:
a) remove some of the locks, or change their types. For example, maybe the whole table is locked where only a given row, or a page thereof, would be necessary.
b) prevent several of these queries from being submitted at the same time. This would be done by way of semaphores/locks (aka mutexes) at the level of the multi-threading logic.
Beware that the "b)" approach, if not correctly implemented may just move the deadlock situation from within SQL to within the program/threads logic. The key would be to only create one mutex to be obtained first by any thread which is about to run one of these deadlock-prone queries.
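A minimal sketch of the single-mutex idea, assuming the deadlock-prone queries can all be funnelled through one helper. The names `deadlock_prone_mutex`, `run_serialized`, and `demo` are hypothetical, and the "work" here is a placeholder for the real database calls.

```python
import threading

# The one and only mutex, obtained first by any thread about to run
# one of the deadlock-prone queries.
deadlock_prone_mutex = threading.Lock()


def run_serialized(work):
    """Run work() while holding the single application-level mutex."""
    with deadlock_prone_mutex:
        return work()


def demo(n_threads=8):
    active = {"now": 0, "max": 0}

    def work():
        # Placeholder for the deadlock-prone queries; records how many
        # threads are inside the critical section at once.
        active["now"] += 1
        active["max"] = max(active["max"], active["now"])
        active["now"] -= 1

    threads = [threading.Thread(target=run_serialized, args=(work,))
               for _ in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return active["max"]


print(demo())
```

Because there is only one mutex, there is no second lock to acquire in a conflicting order, at the cost of running these queries strictly one at a time. A mutex like this only covers threads in the same process.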
Your problem is probably that the INSERT command is trying to lock one or both indexes while those indexes are locked by the other thread.
One common mistake is to lock resources in a different order in each thread. Check the order and try to lock the resources in the same order in all threads.