i ran a t-sql query on SQL Server 2005 DB it was suppose to run for 4 days , but when i checked after 4 days i found the machine was abruptly closed. So i want to trace what happened with the query or in other words whats was the status of the query when machine went down
Is there a mechanism to check this
It may be very difficult to do such a post-mortem. But while it may not be possible to determine what portion of the long running query was active when the shutdown occured, it is important to try and find the root cause of the problem!
Depending on the way the query is written, SQL will attempt to roll-back any SQL that wasn't completed (or any uncommitted transaction if transactions were used). This will happen the first time the server is restarted; If you have any desire to analyze the SQL transaction log (yuck!) make a copy.
Before getting to the part of the query which may have been running at the time of the shutdown, it may be preferable to look into SQL logs (I mean application logs, with information messages, timestamps and such, not SQL transaction logs for a given database), in search of basic facts, and possibly of an error message, happening some time prior to the shutdown and that could plausibly be the origin of the problem. Do also check the Windows event logs, as they may indicate abnormal system-wide conditions that could be a cause or consequence of the SQL shutdown.
The SQL transaction log can be "decompiled" and used to understand what was readily updated/inserted/deleted in the database, however this type of info for a 4 days run may be burried quite deep and long and possibly intersped with unrelated queries. In fact an out of disk space for the SQL log could possibly cause the server to become erratic (I would expect a more elegant handling of such situation, but if somehow the drive in question was also the system drive, bad things could happen...)
Finaly, by analyzing the "4 days long" SQL script it may be possible to infer which parts were completed, thanks to various side-effects of the script. In nothing else, this review may allow putting back the database in a coherent state if necessary, or to truncate the script to exclude the parts readily completed, in preparation for a new run for completign the work.
Related
I have a SQL Server database in production and it has been live for 2 months. A while ago the web application associated with it loading takes too long time. And sometimes it says timeout occurred.
Found a quick fix by running a command 'exec sp_updatestats' will fixed the problem. But I need to be run that one consistently (for every 5 minutes).
So I created a Windows service with timer and started on server. My question is what are the root causes and possible permanent solutions? Anyone?
Here is a Most expensive query from Activity Monitor
WAITFOR(RECEIVE TOP (1) message_type_name, conversation_handle, cast(message_body AS XML) as message_body from [SqlQueryNotificationService-2eea594b-f994-43be-a5ed-d9a47837a391]), TIMEOUT #p2;
To diagnose a poorly performing queries you need to:
Identify the poorly performing query, e.g. via application logging, a SQL Profiler trace filtered to show only queries with a longer duration than a certain threshold etc...
Get an execution plan for the query
At that point you can start to try to figure out what the performance issue is.
Given that exec sp_updatestats fixes your issue it could be that statistics on certain tables are out of date (this is a pretty common cause of performance issues). If thats the case then you might be able to tweak your statistics or at least rebuild only those statistics that are causing issues.
Its worth noting that updating statistics will also cause cached execution plans to become invalid, and so its plausible that your issue is unrelated to statistics - you need to collect more information about the poorly performing queries before looking at solutions.
Your most expensive query looks like its waiting for a message, i.e. its in your list of long running queries because its designed to run for a long time, not because its expensive.
Thanks for everyone i found a solution for my issue . Its quite different I've enabled sql dependency module on my sql server by setting up enable broker on , thats the one causing timeout query so by setting it to false everything is fine working now.
I am not sure this is the appropriate place to post this question, although I arrive to it in the context of writing a relatively complex SQL query. If it is not, please accept my apologies and, if I can abuse your kindness, point me to the appropriate Stack Exchange community where I might find a solution to my problem.
I had to implement a "simple" search; simple for the user to use as it only had one field but rather complicated for me to implement because any expression entered in the search field could be interpreted in quite a few different ways, involving multiple tables and relationships. So I decided to break up the search into simpler queries, each dumping a group of ids based on a specific set of criteria into a table variable. Finally, I would consolidate all the temporary results into a single set, joining in any missing fields and sorting the set according to the user preferences. The results seemed accurate, but at times the performance was surprisingly and unacceptably slow.
After some investigation, I learned that if a table variable is joined with other tables in SQL Server, it may result in slow performance due to inefficient query plan selection because SQL Server does not support statistics or track number of rows in a table variable while compiling a query plan. Apparently, however, I could improve the situation by enabling trace flag 2453, which allows a table variable to trigger recompile when enough number of rows are changed. Indeed, times dropped from approximately 30 seconds to well under 1 second, so I decided to test the trace flag in other queries that use table variables containing potentially many rows.
What I did not expect was that every enabling and disabling of trace flag 2453 would yield an informational event (similar to "DBCC TRACEON 2453, server process ID (SPID) 56. This is an informational message only; no user action is required") both in the SQL Server log and in the operating system event log. But some of the queries in which I have enabled the trace flag are run many times, which is causing those informational messages that require no user action to flood both logs, hiding other potentially useful messages that might require user action.
Through some additional investigation, I have learned that trace flag 2505 can prevent some of these events from being logged, but in my test environment (running SQL Server Express 2012 on Windows 7 SP1) this did not solve my issue.
Does anybody know how to prevent these messages from flooding the logs? Thanks!
I have two databases in different servers - center_db on siglv01\sql2008 and center_db on sig\sql2008.
Can I restart replication without needing to reinitialize it? The connection dropped more than 3 days ago and is now too slow: so I want to start replication without a reinitialize.
Based on the brief conversation above, I don't think you can do this without a re-init. Specifically, the distribution database only keeps so many commands before it starts trimming. The default is 72 hours. If the last command delivered to all of your subscribers is older than that, the distribution database doesn't have what it needs to play forward all of the activity that has happened since then.
Your only hope would be if the distribution agent is still running (it knows when the above situation happens and will give you an error saying as much). If so, try to figure out why delivery is slow (troubleshoot this like any other "slow application"; replication isn't magic) and see if it can get caught up that way. Depending on how many commands are remain undelivered, it may be faster to just re-init.
Error logs for our SQL Server instance are gathering a large amount of data (250k records in a month) all day, then all of a sudden stop at roughly the same time of day (9:15pm), though on different days of the week and at seemingly random intervals of days.
This corresponds to other issues on the server: 1) jobs that move files to shares on the database server fail 2) I am not able to access the server via any method (tried RDP and SSMS). Once the servers are rebooted, SQL Server comes up and SQL Server error logging resumes.
Windows Event Viewer doesn't show any notable error messages for System (the other event logs have wrapped already).
The error logs are being written to the D:\ drive, which has over 100GB free currently. The error log files are in the range of tens of megabytes.
Appreciate any ideas on what might have caused this or how troubleshoot it. Thanks!
The cause appears to have been a corrupted maintenance plan. I discovered this by correlating the timing of the lock-up to the times the maintenance plan was running. The lack of logging made this difficult to confirm. Guessing that at least some parts of it ran normally, but got rolled back on restart.
The current fix was to disable the maintenance plan and replace it with a collection of jobs that do the same tasks. I will likely recreate the original maintenance plan if the server remains stable for another week or two. If we stay stable past that point, it should solidly confirm the maintenance plan as the source of the problem.
A colleague of mine (I promise it was a colleague!) has left an update running on our main SQL Server since last Thursday (yes that's right folks, we're pushing 100 hours now!). The SQL in question (in one transaction, I might add) is:
update daily_prices set min_date = (select min(a.date)
from daily_prices a
where a.key = daily_prices.key and
a.iid = daily_prices.iid)
(Yeah I know, heinous...)
The total cost in the query plan is coming out as 22186.7, the estimated number of rows to update is around 151 million.
We obviously need to resolve this query one way or another, we realise that if we are to kill the query we're going to generate some brutal rollback, but we've got no way of knowing how far it has gotten. The only thing we do know is this entry from sys.dm_exec_requests:
session_id status query_text cpu_time total_elapsed_time reads writes logical_reads
52 suspended update daily_prices... 2328469 408947075 13831137 42458588 151809497
So my question is, what would be our best course of action?
wait it out
kill it and roll back, and hope that it rolls back before the next ice age
something else?
I personally would want to wait it out unless I though it had no chance of finishing this week, the roll back at this stage could take far longer than the query has to date. If it's a production server, I really wouldn't take option 2 and kill it unless I absolutely had to.
In terms of regaining some control / working system if you have suitable backups, bring online another database restore the backup / tlog backups, but you will not want to restore to beyond when the transaction was started (or it will still have to roll it back.) This at least gives you a system you could continue dev work against, but unlikely to be the ideal situation for a prod system.
If it's a production server, have some kind words with the individual as to the suitability of testing queries and query plans prior to it being executed. I am sure many DBA's can suggest the less polite methods of instruction :)
So we got fed up with waiting for our transaction to complete, (after a full week on
one piece of SQL, who wouldn't?), and as it was interfering with our backup
process, we thought killing it was a necessary evil.
The database started to rollback the transaction.
5 days passed.
We noted with some posts elsewhere on the internet that sometimes some magic
happened when the database was restarted and the transaction would "go away",
although these are generally debunked*, and it makes no sense, we thought we
had nothing left to lose so we gave it a go. We knew the database would go into
recovery mode, but the database was becoming increasingly sick anyway and unable
to run anything but its current rollback work anyway, and we've seen SQL Server misbehave with hogging system resources and not diverting them to where it needs to do the work.
(* we also know enough database theory to know that the DB wouldn't just "forget"
about a transaction in progress, but we were also seeing stack dumps in the
SQL Server error logs which kind of told us that the SQL Server was getting
increasingly grumpy at the amount of rollback it was having to undertake)
So we restarted the database.
Sure enough the database went into recovery mode. However, the SQL Server event Log
was now giving us an update every 20 seconds or so as to how long it was going to
take (in all, it reckoned about 25 hours from the log messages, but it ended up being
just an hour and a half (!)).
Whether this method of recovery/rollback is faster, I would strongly doubt (as I expect
SQL Server had to do the same level of work to unwind the transaction as before), however it did finish within an hour and a half, either way, I don't want to make a habit of restarting my production database when it is halfway through a rollback). The update messages in the event log were an absolute godsend, as anyone who has written a batch program
will tell you; however inaccurate they turned out to be - at least they were a worst case.
As we had the luxury of being the only two people using this production box, choosing to
send the database into recovery mode worked for us, and gave us informational messages we
didn't have access to with just our previous rollback state (or at least nothing we could
interpret given our lacking DBA skills). Would I recommend doing this in future?
....Absolutely not, however, hopefully the concerned parties have learnt their lesson, and
we can ask the board for some money for a proper development server! (epic Joel-Test fail!)