SQL Server 2005 Database 'In Recovery' - sql

I restored a 35Gb database on my dev machine yesterday and it was all going fine until this morning when my client app couldn't connect. So I opened SQL Management Studio to find the database 'In Recovery'.
I don't know a huge amount about this other than it is usually something to do with uncommitted transactions. Now since i know there aren't any uncommitted transactions it must be something else. So first off, I'd like to know under what conditions this can happen. Secondly, while this is going on I can't work so if there are any ways of either stopping the recovery, speeding it up or at least finding roughly how long it's gonna be that would help.

Do not shut down SQL while recovery is in progress. Let it finish. Check the error logs. If it doesn't finish, restore from backup.

You can find out how long it's going to take by looking in the event viewer. In the Application section on the Windows Logs you should get information messages from MSSQLSERVER with EventID 3450 telling you what it's up to. Something like:
Recovery of database 'XYZ' is 10% complete (approximately 123456 seconds remain) etc etc
I'm afraid I don't know how to stop it (yet).

Related

Transaction log shipping backup job remains after deleted database

I have been a lurker for several years and I think I have a question that hasn't been answered here.
We were in the middle of some pretty intense maintenance on or SQL server last night. Or primary database mdb file was very badly fragmented. We maintain a test copy of this database for testing and proof of concept purposes.
I had setup log shipping on the test database and without thinking I deleted the test database without removing the log shipping first. I am getting error 14421 - The log shipping secondary database SERVER.database has restore threshold of 45 minutes and is out of sync. No restore was performed for 10310 minutes. Restored latency is 0 minutes. Check agent log and logshipping monitor information.
I have removed everything I could with tsql. My research leads me to believe that this error is due to the backup job still trying to operate but I cannot find this job to remove it. It's really not a big deal but the error shows up every couple of minutes in the log.
Is there anything I can do?
Thanks in advance!
Log Shipping Information is stored in MSDB, not in the database itself. All you need to do is create a new database with the same name as the deleted database. Right click for properties, log ship tab and then uncheck the box to log ship the database. When you click okay the jobs(on primary and secondary) will be removed.

Run a SQL command in the event my connection is broken? (SQL Server)

Here's the sequence of events my hypothetical program makes...
Open a connection to server.
Run an UPDATE command.
Go off and do something that might take a significant amount of time.
Run another UPDATE that reverses the change in step 2.
Close connection.
But oh-no! During step 3, the machine running this program literally exploded. Other machines querying the same database will now think that the exploded machine is still working and doing something.
What I'd like to do, is just as the connection is opened, but before any changes have been made, tell the server that should this connection close for whatever reason, to run some SQL. That way, I can be sure that if something goes wrong, the closing update will run.
(To pre-empt the answer, I'm not looking for table/record locks or transactions. I'm not doing resource claims here.)
Many thanks, billpg.
I'm not sure there's anything built in, so I think you'll have to do some bespoke stuff...
This is totally hypothetical and straight off the top of my head, but:
Take the SPID of the connection you
opened and store it in some temp
table, with the text of the reversal
update.
Use an a background process (either
SSIS or something else) to monitor
the temp table and check that the
SPID is still present as an open connection.
If the connection dies then the background process can execute the stored revert command
If the connection completes properly then the SPID can be removed from the temp table so that the background process no longer reverts it when the connection closes.
Comments or improvements welcome!
I'll expand on my comment. In general, I think you should reconsider your approach. All database access code should open a connection, execute a query then close the connection where you rely on connection pooling to mitigate the expense of opening lots of database connections.
If it is the case that we are talking about a single SQL command whose rows on which it operates should not change, that is a problem that should be handled by the transaction isolation level. For that you might investigate the Snapshot isolation level in SQL Server 2005+.
If we are talking about a series of queries that are part of a long running transaction, that is more complicated and can be handled via storage of a transaction state which other connections read in order to determine whether they can proceed. Going down this road, you need to provide users with tools where they can cancel a long running transaction that might no longer be applicable.
Assuming it's even possible... this will only help you if the client machine explodes during the transaction. Also, there's a risk of false positives - the connection might get dropped for a few seconds due to network noise.
The approach that I'd take is to start a process on another machine that periodically pings the first one to check if it's still on-line, then takes action if it becomes unreachable.

How to kill/resolve a reeeeally long-running update in SQL Server

A colleague of mine (I promise it was a colleague!) has left an update running on our main SQL Server since last Thursday (yes that's right folks, we're pushing 100 hours now!). The SQL in question (in one transaction, I might add) is:
update daily_prices set min_date = (select min(a.date)
from daily_prices a
where a.key = daily_prices.key and
a.iid = daily_prices.iid)
(Yeah I know, heinous...)
The total cost in the query plan is coming out as 22186.7, the estimated number of rows to update is around 151 million.
We obviously need to resolve this query one way or another, we realise that if we are to kill the query we're going to generate some brutal rollback, but we've got no way of knowing how far it has gotten. The only thing we do know is this entry from sys.dm_exec_requests:
session_id status query_text cpu_time total_elapsed_time reads writes logical_reads
52 suspended update daily_prices... 2328469 408947075 13831137 42458588 151809497
So my question is, what would be our best course of action?
wait it out
kill it and roll back, and hope that it rolls back before the next ice age
something else?
I personally would want to wait it out unless I though it had no chance of finishing this week, the roll back at this stage could take far longer than the query has to date. If it's a production server, I really wouldn't take option 2 and kill it unless I absolutely had to.
In terms of regaining some control / working system if you have suitable backups, bring online another database restore the backup / tlog backups, but you will not want to restore to beyond when the transaction was started (or it will still have to roll it back.) This at least gives you a system you could continue dev work against, but unlikely to be the ideal situation for a prod system.
If it's a production server, have some kind words with the individual as to the suitability of testing queries and query plans prior to it being executed. I am sure many DBA's can suggest the less polite methods of instruction :)
So we got fed up with waiting for our transaction to complete, (after a full week on
one piece of SQL, who wouldn't?), and as it was interfering with our backup
process, we thought killing it was a necessary evil.
The database started to rollback the transaction.
5 days passed.
We noted with some posts elsewhere on the internet that sometimes some magic
happened when the database was restarted and the transaction would "go away",
although these are generally debunked*, and it makes no sense, we thought we
had nothing left to lose so we gave it a go. We knew the database would go into
recovery mode, but the database was becoming increasingly sick anyway and unable
to run anything but its current rollback work anyway, and we've seen SQL Server misbehave with hogging system resources and not diverting them to where it needs to do the work.
(* we also know enough database theory to know that the DB wouldn't just "forget"
about a transaction in progress, but we were also seeing stack dumps in the
SQL Server error logs which kind of told us that the SQL Server was getting
increasingly grumpy at the amount of rollback it was having to undertake)
So we restarted the database.
Sure enough the database went into recovery mode. However, the SQL Server event Log
was now giving us an update every 20 seconds or so as to how long it was going to
take (in all, it reckoned about 25 hours from the log messages, but it ended up being
just an hour and a half (!)).
Whether this method of recovery/rollback is faster, I would strongly doubt (as I expect
SQL Server had to do the same level of work to unwind the transaction as before), however it did finish within an hour and a half, either way, I don't want to make a habit of restarting my production database when it is halfway through a rollback). The update messages in the event log were an absolute godsend, as anyone who has written a batch program
will tell you; however inaccurate they turned out to be - at least they were a worst case.
As we had the luxury of being the only two people using this production box, choosing to
send the database into recovery mode worked for us, and gave us informational messages we
didn't have access to with just our previous rollback state (or at least nothing we could
interpret given our lacking DBA skills). Would I recommend doing this in future?
....Absolutely not, however, hopefully the concerned parties have learnt their lesson, and
we can ask the board for some money for a proper development server! (epic Joel-Test fail!)

Using Sql Server Replication

We are using Replication and seem to be having endless problems with it. It seems to shut down for unknown reasons. It needs to be shut down to remove a column and only starts back up half the time. Does anyone have any advice on how to properly use replication or some alternatives to it.
Edit:
We are using Sql Server 2005, We cannot use database mirroring as we used the other database for reporting. As far as I am aware you cannot query from a mirrored database.
If you need just couple of tables from your DB for reports, replication is more useful, but you also can set up log shipping with secondary server in STAND BY mode (especially if you need significant part of your data for reports), then you can run reports on secondary server. You just have to remember that log shipping will interfere with transaction log backups, so you have to use the same folder with log backup files for both processes.
I would think the combination of database mirroring and database snapshots will solve your issues.
First, database mirroring is very easy to setup and I have never had any problems with it (using it for the past 4+ years).
Second, creating a database snapshot on your failover server will allow you to run reports. You can setup a sql agent job to drop and re-create the snapshot on whatever acceptable interval you like.
Of course this is all dependent on if you need your reports to run on real-time data or if they can be delayed somewhat.
Here are a list of the problems that I have had to resolve to get replication working:
1) The replication sometimes lies to me and tells me this, even when its working fine.
"The server 'Bob' is not a Subscriber. (.Net SqlClient Data Provider)" I have tried to re-initialise it thinking that it was broken and it never was...
2) It can take a little while to restart itself, especially if your remote DB is on the other side of the planet, which it is in my case. If you are on a slow network connection, or it is not 100% reliable, then you can have problems. Also, the jobs which restart the process can sometimes take a while to run, which also delays things further.
3) Some changes require full re-initalisation which involves sending a new snapshot out. If you don't have your permissions quite right, and you can re-initialise manually, but it doesn't happen automatically, then this can be a another reason for problems.
We have a SQL transactional replication which runs perfectly happily. You seem to say that it is when you are making schema changes to the publisher that you get problems. Each time we do a schema change we drop the publication, subscription and the subscription database. Do the change, then re-build it all. We can do this becuase we can tolerate the time it takes to re-apply the snapshot. There are ways to apply schema changes to the publication and have them propogate to the subscriber. Take a look at sp_register_custom_scripting. We have made this work once, so I can give some more information about it if you need.
As #Jason says, you can report from a mirrored database by using a snapshot. Beware that the snapshot will take up space, and cause more work for the mirror server. Although how much space will depend on how much data is changing and how big your original database is. We do use a snapshot on a mirrored database for occasional reports because our entire database is not replicated.
log shipping http://msdn.microsoft.com/en-us/library/ms187103.aspx
What version of SQL Server are you using?
We're using replication now for a particular solution, and it seems to just work, day in, day out.
I would examine your event log's, and SQL Server logs to see if you can determine why it is shutting down, and why it doesn't start up.
Are you possibly patching the servers, or are you having network errors?
The alternatives to replication are log shipping, or database mirroring.
I personally prefer Database Mirroring, but it really depends what you're trying to do, as some of these aren't appropriate for certain situations.
We also have used SQL transactional replication. We had the same pains with updating schema, which requires dropping the publication on all servers, performing the updates, and then reinitializing replication, and hoping for the best. Sometimes it would not initialize, or a node would fall behind and we'd get little warning for it. A few times we even lost all the stored procedure execute permissions causing pretty much total failure on the websites.
We have a rather large database so reinitialization could take quite some time, meaning all updates had to be done at 2am on Sunday - not exactly when we're awake and alert and able to use all our faculties to deal with a problem that might arise.
We are ditching replication in favor of failover clustering on SQL 2008, but it can still be done all the way back to SQL 2000.
http://technet.microsoft.com/en-us/library/cc917693.aspx

What does a "WARNING: did not see LOP_CKPT_END" message mean on SQL Server 2005?

The above error message comes up just before SQL Server marks the database as "Suspect" and refuses to open it. Does anyone know what the message means and how to fix it? I think it's a matter of grabbing the backup, but would be nice if it was possible to recover the data.
I've had a look at the kb article but there are no transactions to resolve.
It appears that it means your distributed transaction coordinator failed to start correctly when bringing the SQL Server online.
please refer to this ASP.NET forum post and knowledge base article
Depending on the level of logging, you should be able to take the last known backup and slowly recover the logs using point in time recovery techniques to slowly bring the database up to the point right before failure began.
run checkdb to find out why it's been marked as suspect and see if it can be recovered without any data loss (win)