I have a nightly "archive and delete" process that archives data outside of a 150-day sliding window to Azure Blob Storage, and then deletes the data. In the past, the deletion process ran for about two hours and of course blocked all sorts of other processes (this is a big, busy table). So, we modified the process to delete in chunks, and that helped with the blocking of other processes.
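For reference, the chunked delete is essentially a loop like the following sketch (the table name, date column, and batch size are placeholders, not our actual schema):

DECLARE @rows INT = 1;
WHILE @rows > 0
BEGIN
    -- Delete one batch of rows that have fallen outside the 150-day window, then loop until none remain.
    DELETE TOP (50000) FROM dbo.BigTable
    WHERE EventDate < DATEADD(DAY, -150, GETDATE());
    SET @rows = @@ROWCOUNT;
END;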
However, recently the deletion process has been taking 12+ hours to run. When checking for blocking, it's constantly blocked by SPID -5 ... and as I understand it, that supposedly indicates an orphaned distributed transaction. However, none of the queries I run to get the GUID return any rows, for example:
SELECT DISTINCT request_owner_guid AS UoW_Guid
FROM sys.dm_tran_locks
WHERE request_session_id = -5;
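For comparison, a related query that is sometimes suggested for this situation (just a sketch; it assumes the orphaned transaction is still visible in sys.dm_tran_active_transactions):

-- Distributed transactions carry a unit-of-work GUID in transaction_uow.
SELECT transaction_id, name, transaction_begin_time, transaction_uow
FROM sys.dm_tran_active_transactions
WHERE transaction_uow IS NOT NULL;

If a GUID does come back, it can be passed, quoted, to KILL (i.e. KILL '<UoW GUID>').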
Any suggestions on what I need to do here? This is becoming a real problem. Thanks.
When running CREATE INDEX CONCURRENTLY, can you lock up a table while the SHARE UPDATE EXCLUSIVE is being acquired? If it took 10 minutes to acquire the lock, would anyone be blocked from using the table during that time?
The point of running CONCURRENTLY is to safely add an index to an active table. But what I can't find a clear answer on is whether the initial lock being acquired can cause queries to queue up.
The documentation for a particular safe postgres migrations library mentions (https://github.com/doctolib/safe-pg-migrations#user-content-safe_add_remove_index):
If you still get lock timeout while adding / removing indexes, it might be for one of those reasons:
Long-running queries are active on the table. To create / remove an index, PG needs to wait for the queries that are actually running to finish before starting the index creation / removal. The blocking activity logger might help you to pinpoint the culprit queries.
A vacuum / autovacuum is running on the table, holding a ShareUpdateExclusiveLock, you are most likely out of luck for the current migration, but you may try to optimize your autovacuums settings.
But even that doesn't clearly tell me - if I don't set a timeout (so the timeout is 0), and I wait a long time for the lock to get acquired - will that cause other queries to wait until the lock is acquired or will the ADD INDEX be the only thing blocked during this time?
It will block the things that ShareUpdateExclusive blocks. But not ordinary SELECT, INSERT, UPDATE, DELETE.
The most troubling thing it blocks might be ANALYZE. If the stats get too out of date and can't be repaired, your SELECTs might start choosing ridiculous plans which never finish, despite not being formally blocked.
It does have to transiently acquire stronger locks than ShareUpdateExclusive, but if it has to wait for them it does so in an indirect way which doesn't block others.
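A rough way to watch this live from another session, while the CREATE INDEX CONCURRENTLY is waiting, might look like the sketch below (my_table is a placeholder; the columns are the standard pg_locks / pg_stat_activity ones):

-- Which sessions hold or wait for locks on the table, and in which mode?
SELECT a.pid, l.mode, l.granted, a.wait_event_type, left(a.query, 60) AS query
FROM pg_locks l
JOIN pg_stat_activity a ON a.pid = l.pid
WHERE l.relation = 'my_table'::regclass
ORDER BY l.granted DESC, a.pid;

Ordinary SELECT/INSERT/UPDATE/DELETE sessions should still show granted = true even while the index build itself is queued behind a long-running query.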
Looks like today is going to be another rubbish one. We have recently updated our SQL box to a complete monster, with loads of cores and RAM; however, we are stuck with our old DB schema, which is crapola. Our old SQL box had problems, but nothing like what we are experiencing with the new one. On the day of rolling out it was running super fast, but within a week it's a complete mess...
Our .NET app, used by a couple of hundred people or so, is generating a huge amount of deadlocks and timeouts on the SQL box, and we are struggling to work out why. We have checked all the indexes and they are as good as they can be right now. Some of the major tables are way too wide and have a stupid amount of triggers on them, but there is nothing we can do about this now.
A lot of the PIDs seem to be the same for the same users, who are trying multiple times. So, for instance:
User: user1 Time: 09:21 Error Message: Transaction (Process ID 76) was deadlocked on lock resources with another process and has been chosen as the deadlock victim. Rerun the transaction.
User: user1 Time: 09:22 Error Message: Transaction (Process ID 76) was deadlocked on lock resources with another process and has been chosen as the deadlock victim. Rerun the transaction.
etc. When we moved the DB to the new box, it was backed up from the old one and restored to the new...
If anyone has any suggestions as to something we can do, I will buy them multiple pints.
thanks
nat
Deadlocks don't necessarily need high load to occur. They tend to be the byproduct of design issues in terms of which processes are locking data in which order, for how long, etc.
There are some useful features of the SQL profiler (2008 article here) to help you track & analyse deadlocks. I'd recommend this as the best starting point. If you're lucky, you'll find that there's just one or two culprits where you can readily, for instance, remove transactions, or reduce their longevity, to alleviate the situation.
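If you'd rather not keep Profiler attached all day, one common complement (a sketch, not something from the linked article) is to have SQL Server write each deadlock graph to its error log via trace flag 1222:

-- Enable deadlock graph output globally; details of every deadlock then appear in the SQL Server error log.
DBCC TRACEON (1222, -1);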
I'm running Firebird 2.5 (and have also tried earlier versions) on Windows. Every day after 12:00 PM, insert/update queries on one specific table hang, but they complete successfully by 12:35 or so, no matter when they were started. It does seem that Firebird is doing some kind of maintenance on the table which takes half an hour to complete, during which time the table cannot be written to (but reads are fast). The table itself is really small, some 10,000 rows, compared to the millions of rows we have in other tables - and the other tables do not get stuck.
I haven't been able to find any reason or solution. I tried dumping the table and restoring it, which didn't help; I tried switching between SuperServer and Classic, and changed versions, all with no success.
Has anyone experienced a problem like this?
No. Firebird doesn't have any internal maintenance procedures bound to some specified time of day. It seems there is some task on your server scheduled to run at 12:00 PM, or there are network users of the server who start doing some heavy access at 12:00 PM.
The only maintenance FB does is "garbage collection" (getting rid of old record versions), and this is done on a "when needed" basis (usually when records are selected; see the GCPolicy setting in firebird.conf), not at some predefined time.
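For reference, the relevant firebird.conf entry looks roughly like this (a sketch; the three values are the documented options, and combined is typically the SuperServer default):

# firebird.conf
# GCPolicy = cooperative | background | combined
GCPolicy = combined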
Do you experience this hang only during these certain hours, or is it always slow to insert into that table? Have you checked the server load during the slowdown (i.e. in the task manager, is the CPU maxed out)? Anyway, here are some ideas to check:
What constraints / triggers do you have on the table? If they involve some extensive checks (i.e. against the other tables which contain millions of rows), this could be the reason inserts take so long.
Perhaps there is some other service which is triggered at that time? I.e. do you have a cron job to back up the DB at that time? Or perhaps some other system service which runs at that time with higher priority and slows down the server?
Do you have the trace service active for the table? See fbtrace.conf in the Firebird root directory. If it is active, extensive logging might be the cause of the slowdown; if it isn't active, using it might help you find the cause.
What are the settings for ForcedWrites / UnflushedWrites (see firebird.conf)? Does changing them make a difference?
Is there something logged for this troublesome timeframe in firebird.log?
To me it looks like you have a process which starts at 12:00 and does something which locks the entire table. Use the monitoring tables or the trace manager to see if there is any connection or active transaction which looks suspicious.
I also suspect your own transactions are started with the WAIT clause without a LOCK TIMEOUT; you might want to change this to NO WAIT or WAIT with a LOCK TIMEOUT, so that your transactions fail either immediately or after the timeout.
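A sketch of the kind of monitoring-table query meant above (Firebird 2.1+; the MON$ tables and columns are the standard ones, but treat the exact join as illustrative):

-- List active transactions together with who owns them and what they are currently running.
SELECT a.MON$ATTACHMENT_ID, a.MON$USER, a.MON$REMOTE_ADDRESS,
       t.MON$TRANSACTION_ID, t.MON$TIMESTAMP, s.MON$SQL_TEXT
FROM MON$TRANSACTIONS t
JOIN MON$ATTACHMENTS a ON a.MON$ATTACHMENT_ID = t.MON$ATTACHMENT_ID
LEFT JOIN MON$STATEMENTS s ON s.MON$TRANSACTION_ID = t.MON$TRANSACTION_ID
WHERE t.MON$STATE = 1;  -- 1 = active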
My suggestion is to use the TRACE API in 2.5 to track down what is happening near or around that time. That should help get you more information as to what is happening.
I use this for debugging: http://upscene.com/products.misc.fbtm.php (kinda buggy itself, but when it is working it is a godsend).
Are some client connections going down at 12:00 PM? I had a similar problem on a table of about 70,000 records:
Client "A" has a permanently open DB connection running something like "select * from TABLE". This is a "read only" transaction, but it is reason enough for the server to have to keep record versions around. Why?
Client "B" made massive updates to this table, and the server tries to preserve the world as it was when "A" started her "select". This is normal for transaction-capable DB servers, and it's implemented by creating copies of the record data before it is updated.
So in my case, 170,000 record versions existed for this TABLE. You can measure this with:
gstat -r -t TABLE db.fdb | grep versions
If Client "B" goes down, the count of Record-Versions is NOT growing any more. Client "A" is the guilty one, freezing all this versions, forces the server to hold it. Finally if Client "A" goes down (or for example a firewall rule cuts all pending connections) Firebird is happy to start the process of getting rid of the now useless Record-Versions.
This "sweep"?! is bad programmed (even 2.5.2) cpu is 3% it do only <10.000 Versions / Minute so this TABLE has a performance of about 2%.
So as I understand it, SQL deadlocks happen when a SPID is busy processing another query and it can't be bothered to run another one because it's so busy right now. The SQL Server "randomly" picks one of the queries to deadlock out of the resources asked for and fails it out, throwing an exception.
I have an app running ~40 instances and a back-end Windows Service, all of which are hitting the same database. I'm looking to reduce deadlocks so I can increase the number of threads I can run simultaneously.
Why can't SQL Server just enqueue the new query and run it when it has time and the resources are available? Most of what I'm doing can wait a few seconds on occasion.
Is there a way to set Transaction Isolation Level globally without having to specify it at the onset of each new connection/session?
Your understanding of deadlocks is not correct. What you've described is blocking. It's a common mistake to equate the two.
A deadlock occurs when two separate transactions each hold a resource that the other needs, and neither will release the one it already has so that the other can run. It's probably easier to illustrate:
SPID #1 gets a lock on resource A
SPID #2 gets a lock on resource B
SPID #1 now needs a lock on resource B in order to complete
SPID #2 now needs a lock on resource A in order to complete
SPID #1 can't complete (and therefore release resource A) because SPID #2 is holding resource B
SPID #2 can't complete (and therefore release resource B) because SPID #1 is holding resource A
Since neither SPID can complete, one has to give up (i.e. be chosen by the server as the deadlock victim) and will fail.
The best way to avoid them is to keep your transactions small (in number of resources needed) and quick.
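A minimal two-session sketch of that sequence (the Accounts table and values are invented purely for illustration):

-- Session 1
BEGIN TRAN;
UPDATE Accounts SET Balance = Balance - 10 WHERE Id = 1;   -- takes a lock on "resource A"

-- Session 2
BEGIN TRAN;
UPDATE Accounts SET Balance = Balance - 10 WHERE Id = 2;   -- takes a lock on "resource B"

-- Session 1: blocks, waiting for resource B
UPDATE Accounts SET Balance = Balance + 10 WHERE Id = 2;

-- Session 2: now needs resource A while holding B; deadlock, and one session is chosen as the victim
UPDATE Accounts SET Balance = Balance + 10 WHERE Id = 1;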
A deadlock is where two threads of processing are each being held up by the other (it can be more, but two is sufficiently complex). So one thread locks a table, then requests a lock on another table. That other table is locked by the second thread, which cannot progress because it is waiting for a lock on the first table.
The reason that one of these has to be thrown out is that in a deadlock, they will never end - neither thread can progress at all. The only answer is for one to be stopped to allow the other to complete.
The solution to reducing deadlocks in the sort of situation you are talking about may be to redesign the solution. If you can make sure that less locking occurs, you will have fewer deadlocks.
Deadlocks occur because two concurrent transactions may overlap and lock different resources, each of which is required by the other transaction to finish.
Let's imagine:
1 - Transaction A locks row1
2 - Transaction B locks row2
3 - Transaction A tries to lock row2 and, because of the previous lock, SQL Server waits
4 - Transaction B tries to lock row1 and, because of the previous lock, SQL Server waits
So, SQL Server must choose one transaction, kill it, and allow the other to continue.
This image illustrates this situation very well: http://www.eupodiatamatando.com/wp-content/uploads/2008/01/deadlocknajkcomafarialibh3.jpg
I have a table with ~800k rows. I ran an update users set hash = SHA1(CONCAT({about eight fields})) where 1;
Now I have a hung Sequel Pro process and I'm not sure about the mysqld process.
This is two questions:
What harm can possibly come from killing these programs? I'm working on a separate database, so no damage should come to other databases on the system, right?
Assume you had to update a table like this. What would be a quicker / more reliable method of updating without writing a separate script?
I just checked with phpMyAdmin and it appears as though the query is complete. I still have Sequel Pro using 100% of both my cores though...
If you're using InnoDB, which is backed by a transaction log for recovery and rollback purposes, then you can get away with a lot, especially in a non-production environment.
The easiest way to terminate a renegade query is to use the MySQL shell as the root user:
SHOW PROCESSLIST;
This will give you a list of the current connections and a process ID for each one. To terminate any given query, such as number 19, use:
KILL 19;
Usually this will undo and roll back the query. In some cases this is not sufficient and you may have to force-quit the MySQL server process with kill -9. Under most circumstances you should be able to restart the server right away, and the DB will be in the last fully committed state.
To get the thread IDs (it'll show the query alongside):
mysqladmin proc
To safely kill the query thread:
mysqladmin kill [id]
You'll end up with a partially updated table unless you use InnoDB, but you should be fine. Details:
During UPDATE or DELETE operations, the kill flag is checked after each block read and after each updated or deleted row. If the kill flag is set, the statement is aborted. Note that if you are not using transactions, the changes are not rolled back.
As for your second question, there is no better way to update a table if one is not allowed to write a separate script (to, say, throttle the updates).
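For illustration, a throttled variant of the original statement might look like this (a sketch; it assumes hash starts out NULL so already-processed rows can be skipped, and col1, col2 merely stand in for the eight fields from the question):

-- Run repeatedly (by hand or from a small script) until it reports 0 rows affected.
UPDATE users
SET hash = SHA1(CONCAT(col1, col2 /* ...the remaining fields... */))
WHERE hash IS NULL
LIMIT 10000;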