I have a query that takes a long, long time to run (relatively speaking). It retrieves many rows with multiple varbinary(max) columns. This query needs optimising, no doubt about it - but my question is very specific to the ever-changing 'task state' I'm witnessing in SQL activity monitor.
Every 5 seconds or so the task state changes from suspended to running, then back again. What does this imply?
Note: I may raise a separate question regarding optimisation of such a query - but I'm not asking that for now, I'm asking very specifically about the quick change in state.
NOT A DUPLICATE BECAUSE:
I'm asking about the task state changing in quick succession, not what suspended means. I'm asking, if suspended means a wait on I/O, why the query would wait on I/O, then not, then wait again, many times over its lifetime.
This is normal. SUSPENDED simply means that the session is waiting for an event, such as I/O, to complete. You will find that sessions flick in and out of this state rather frequently.
You can see the explanations of the different statuses in this document here:
sp_who (Transact-SQL)
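If you want to see exactly what the session is waiting on each time it shows as suspended, here is a minimal sketch against sys.dm_exec_requests (the session_id of 52 is just a placeholder; substitute the one you see in Activity Monitor):

-- Poll the request's state and its current wait
SELECT r.session_id,
       r.status,           -- running / runnable / suspended
       r.wait_type,        -- e.g. PAGEIOLATCH_SH or ASYNC_NETWORK_IO while suspended
       r.wait_time,        -- milliseconds spent in the current wait
       r.last_wait_type
FROM sys.dm_exec_requests AS r
WHERE r.session_id = 52;

Run it a few times in a row; you will typically see wait_type populated while the request is suspended and NULL while it is actually running.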
I am doing a concurrency test in SQL Server 2019. I have the SQLTest tool running concurrent queries; in my test I am using a single SELECT query (star schema), and in SSMS I have a while loop that updates fact table records. While running both processes I am seeing some of the threads/queries cancelled because of deadlocks, which is expected, but what I am looking for is whether there is a possibility to add a wait time on my select before the deadlock is raised; in other words, how much time does SQL Server wait before it raises the deadlock error?
In this case I know constant updates are happening, but we know the updates only last a few seconds, so it would help if SQL Server could wait a few seconds before raising the deadlock.
Any suggestions or thoughts?
I would suggest changing up your testing strategy a little.
Within your test harness, I would SET DEADLOCK_PRIORITY LOW;, so that when a deadlock is detected, your testing process voluntarily takes one for the team, allows itself to become the deadlock victim, and allows the conflicting process to continue.
Then, wrap the testing script in a TRY...CATCH. In the CATCH block, check whether the cause of the error is a deadlock (error number 1205), and if it is, retry your test. It's probably a good idea to also build an incremental retry counter into that so that you don't end up in an infinite deadlock loop.
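A minimal sketch of that pattern, with dbo.MyTestQuery as a hypothetical stand-in for whatever your harness actually runs:

SET DEADLOCK_PRIORITY LOW;        -- volunteer to be the victim if a deadlock occurs

DECLARE @retries int = 0;

WHILE @retries < 5                -- cap the retries so a persistent deadlock can't loop forever
BEGIN
    BEGIN TRY
        EXEC dbo.MyTestQuery;     -- hypothetical procedure containing the SELECT under test
        BREAK;                    -- success, stop retrying
    END TRY
    BEGIN CATCH
        IF ERROR_NUMBER() = 1205  -- we were chosen as the deadlock victim, so try again
            SET @retries += 1;
        ELSE
            THROW;                -- anything else is a real failure
    END CATCH;
END;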
is there a possibility to add a wait time on my select before deadlock?
No. It would make no sense.
A deadlock is, by definition, a dead end of locking that cannot be fixed by simply waiting, under any circumstances. One of the sides has to be cancelled.
I.e.
Tx1 has lock on table a, waits for lock on table b
Tx2 has lock on table b, waits for lock on table a
Normally SQL Server just waits (until a lock timeout) and cancels. In this case the deadlock detection steps in and realizes that, unless one side is thrown out, there is no way this gets resolved, so it cancels one side. There is no point in waiting, because this is actually a programming bug. No joke.
In the example above, Tx2 should ask for a lock on table a FIRST. It is good practice to acquire locks in a transaction in a defined order so this does not happen.
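A minimal sketch of that ordering rule, using hypothetical tables a and b; because both transactions touch the tables in the same order, the second one simply blocks until the first commits instead of deadlocking:

-- Every transaction updates a first, then b, in that fixed order
BEGIN TRANSACTION;
    UPDATE a SET val = val + 1 WHERE id = 1;   -- lock on table a acquired first
    UPDATE b SET val = val + 1 WHERE id = 1;   -- lock on table b acquired second
COMMIT TRANSACTION;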
How can we identify idle events in Google BigQuery?
By idle events I mean events that have been pushed by a user but are not being queried.
I need to know if there is any existing procedure that I can follow.
I'm not sure I understand your question. Are you asking about query jobs that have been created but have not yet started running? If so, you can call jobs.list() and specify the PENDING state as a filter. Query jobs are generally only in the pending state for a very brief time however, unless you have specified batch priority.
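If you prefer to do it in SQL rather than through the API, here is a sketch of the same idea using BigQuery's INFORMATION_SCHEMA.JOBS_BY_PROJECT view (this assumes the view is available to you, and the region qualifier is just an example):

-- Query jobs still waiting to start in the current project
SELECT job_id, user_email, creation_time
FROM `region-us`.INFORMATION_SCHEMA.JOBS_BY_PROJECT
WHERE state = 'PENDING'
  AND job_type = 'QUERY';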
So as I understand it, SQL deadlocks happen when a SPID is busy processing another query and it can't be bothered to run another one because it's so busy right now. The SQL Server "randomly" picks one of the queries to deadlock out of the resources asked for and fails it out, throwing an exception.
I have an app running ~ 40 instances and a back-end Windows Service, all of which are hitting the same database. I'm looking to reduce deadlocks so I can increase the number of threads I can run simultaneously.
Why can't SQL Server just enqueue the new query and run it when it has time and the resources are available? Most of what I'm doing can wait a few seconds on occasion.
Is there a way to set Transaction Isolation Level globally without having to specify it at the onset of each new connection/session?
Your understanding of deadlocks is not correct. What you've described is blocking. It's a common mistake to equate the two.
A deadlock occurs when two separate transactions each hold a resource the other needs, and neither will release the one it has so that the other can run. It's probably easier to illustrate:
SPID #1 gets a lock on resource A
SPID #2 gets a lock on resource B
SPID #1 now needs a lock on resource B in order to complete
SPID #2 now needs a lock on resource A in order to complete
SPID #1 can't complete (and therefore release resource A) because SPID #2 holds resource B
SPID #2 can't complete (and therefore release resource B) because SPID #1 holds resource A
Since neither SPID can complete, one has to give up (i.e. be chosen by the server as the deadlock victim) and will fail.
The best way to avoid them is to keep your transactions small (in number of resources needed) and quick.
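A minimal sketch of that sequence that you can reproduce in two SSMS windows, using hypothetical tables tableA and tableB; whichever session asks for its second lock last is chosen as the victim and gets error 1205:

-- Session 1 (run first)
BEGIN TRAN;
UPDATE tableA SET col = 1 WHERE id = 1;   -- lock on resource A

-- Session 2 (run in a second window)
BEGIN TRAN;
UPDATE tableB SET col = 1 WHERE id = 1;   -- lock on resource B

-- Session 1 (run next): needs resource B, blocks behind Session 2
UPDATE tableB SET col = 1 WHERE id = 1;

-- Session 2 (run last): needs resource A, blocks behind Session 1;
-- the deadlock monitor picks one session as the victim and raises error 1205 in it
UPDATE tableA SET col = 1 WHERE id = 1;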
A deadlock is where two threads of processing are each being held up by the other (it can be more, but two is sufficiently complex). So one thread locks a table, then requests a lock on another table. That other table is locked by the second thread, which cannot progress because it is waiting for a lock on the first table.
The reason that one of these has to be thrown out is that in a deadlock, they will never end - neither thread can progress at all. The only answer is for one to be stopped to allow the other to complete.
The solution to reducing deadlocks in the sort of situation you are talking about may be to redesign the solution. If you can make sure that less locking occurs, you will have fewer deadlocks.
Deadlocks occur because two concurrent transactions may overlap and lock different resources, each of which is required by the other transaction to finish.
Let's imagine:
1 - Transaction A locks row1
2 - Transaction B locks row2
3 - Transaction A tries to lock row2, and, because of Transaction B's lock, SQL Server waits
4 - Transaction B tries to lock row1, and, because of Transaction A's lock, SQL Server waits
So, SQL Server must choose one transaction, kill it, and allow the other to continue.
This image illustrates the situation very well: http://www.eupodiatamatando.com/wp-content/uploads/2008/01/deadlocknajkcomafarialibh3.jpg
We have a huge Oracle database and I frequently fetch data using SQL Navigator (v5.5). From time to time, I need to stop code execution by clicking on the Stop button because I realize that there are missing parts in my code. The problem is, after clicking on the Stop button, it takes a very long time to complete the stopping process (sometimes it takes hours!). The program says Stopping... at the bottom bar and I lose a lot of time till it finishes.
What is the rationale behind this? How can I speed up the stopping process? Just in case, I'm not an admin; I'm a limited user who uses some views to access the database.
Two things need to happen to stop a query:
The actual Oracle process has to be notified that you want to cancel the query
If the query has made any modification to the DB (DDL, DML), the work needs to be rolled back.
For the first point, the Oracle process that is executing the query should check from time to time whether it should cancel the query or not. Even when it is doing a long task (a big HASH JOIN, for example), I think it checks every 3 seconds or so (I'm looking for the source of this info, I'll update the answer if I find it). Now, is your software able to communicate correctly with Oracle? I'm not familiar with SQL Navigator, but I suppose the cancel mechanism should work like it does with any other tool, so I'm guessing you're waiting for the second point:
Once the process has been notified to stop working, it has to undo everything it has already accomplished in this query (all statements are atomic in Oracle, they can't be stopped in the middle without rolling back). Most of the time in a DML statement the rollback will take longer than the work already accomplished (I see it like this: Oracle is optimized to work forward, not backward). If you are in this case (big DML), you will have to be patient during rollback, there is not much you can do to speed up the process.
If your query is a simple SELECT and your tool won't let you cancel, you could kill your session (needs admin rights from another session) -- this should be instantaneous.
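A minimal sketch of that, assuming a DBA or another privileged session does the killing (the username filter is just a placeholder):

-- Find the SID and SERIAL# of the session running the query (needs access to V$SESSION)
SELECT sid, serial#, username, status
FROM v$session
WHERE username = 'REPORT_USER';   -- placeholder username

-- Terminate it (needs the ALTER SYSTEM privilege); substitute the values found above
ALTER SYSTEM KILL SESSION '123,45678' IMMEDIATE;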
When you cancel a query, the Oracle client should send OCIBreak(), but this isn't implemented on a Windows server; that could be the cause.
Also, have your DBA check the value of SQLNET.EXPIRE_TIME.
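For reference, that parameter lives in sqlnet.ora on the server; the value below is purely illustrative:

# sqlnet.ora -- dead connection detection probe interval, in minutes (illustrative value)
SQLNET.EXPIRE_TIME = 10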
A colleague of mine (I promise it was a colleague!) has left an update running on our main SQL Server since last Thursday (yes that's right folks, we're pushing 100 hours now!). The SQL in question (in one transaction, I might add) is:
update daily_prices set min_date = (select min(a.date)
from daily_prices a
where a.key = daily_prices.key and
a.iid = daily_prices.iid)
(Yeah I know, heinous...)
The total cost in the query plan is coming out as 22186.7, and the estimated number of rows to update is around 151 million.
We obviously need to resolve this query one way or another. We realise that if we are to kill the query we're going to generate some brutal rollback, but we've got no way of knowing how far it has gotten. The only thing we do know is this entry from sys.dm_exec_requests:
session_id:         52
status:             suspended
query_text:         update daily_prices...
cpu_time:           2328469
total_elapsed_time: 408947075
reads:              13831137
writes:             42458588
logical_reads:      151809497
So my question is, what would be our best course of action?
wait it out
kill it and roll back, and hope that it rolls back before the next ice age
something else?
I personally would want to wait it out unless I thought it had no chance of finishing this week; the rollback at this stage could take far longer than the query has taken to date. If it's a production server, I really wouldn't take option 2 and kill it unless I absolutely had to.
In terms of regaining some control / a working system: if you have suitable backups, bring another database online and restore the backup / tlog backups, but you will not want to restore to a point beyond when the transaction was started (or it will still have to roll it back). This at least gives you a system you could continue dev work against, but it is unlikely to be the ideal situation for a prod system.
If it's a production server, have some kind words with the individual as to the suitability of testing queries and query plans prior to execution. I am sure many DBAs can suggest the less polite methods of instruction :)
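If you do decide to kill it, here is a rough sketch for getting some idea of progress (assuming session 52, as shown in the sys.dm_exec_requests output above): before killing, you can see how much transaction log the open transaction has generated, and after issuing the KILL you can poll the estimated rollback completion:

-- How much work has the open transaction done so far? (log records/bytes it has generated)
SELECT dt.transaction_id,
       dt.database_transaction_begin_time,
       dt.database_transaction_log_record_count,
       dt.database_transaction_log_bytes_used
FROM sys.dm_tran_database_transactions AS dt;

-- After issuing KILL 52, this reports an estimated rollback completion percentage
KILL 52 WITH STATUSONLY;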
So we got fed up with waiting for our transaction to complete (after a full week on one piece of SQL, who wouldn't?), and as it was interfering with our backup process, we thought killing it was a necessary evil.
The database started to rollback the transaction.
5 days passed.
We noted from some posts elsewhere on the internet that sometimes some magic happened when the database was restarted and the transaction would "go away". Although these claims are generally debunked* and it makes no sense, we thought we had nothing left to lose, so we gave it a go. We knew the database would go into recovery mode, but it was becoming increasingly sick and unable to run anything but its current rollback work anyway, and we've seen SQL Server misbehave by hogging system resources and not diverting them to where it needs to do the work.
(* we also know enough database theory to know that the DB wouldn't just "forget"
about a transaction in progress, but we were also seeing stack dumps in the
SQL Server error logs which kind of told us that the SQL Server was getting
increasingly grumpy at the amount of rollback it was having to undertake)
So we restarted the database.
Sure enough the database went into recovery mode. However, the SQL Server event log was now giving us an update every 20 seconds or so as to how long it was going to take (in all, it reckoned about 25 hours from the log messages, but it ended up being just an hour and a half (!)).
Whether this method of recovery/rollback is faster, I would strongly doubt (as I expect SQL Server had to do the same level of work to unwind the transaction as before), but it did finish within an hour and a half. Either way, I don't want to make a habit of restarting my production database when it is halfway through a rollback. The update messages in the event log were an absolute godsend, as anyone who has written a batch program will tell you; however inaccurate they turned out to be, at least they were a worst case.
As we had the luxury of being the only two people using this production box, choosing to
send the database into recovery mode worked for us, and gave us informational messages we
didn't have access to with just our previous rollback state (or at least nothing we could
interpret given our lacking DBA skills). Would I recommend doing this in future?
....Absolutely not. However, hopefully the concerned parties have learnt their lesson, and we can ask the board for some money for a proper development server! (Epic Joel-Test fail!)