How intelligent is SQL Server Mirroring? - sql

While working on my current development product I have setup SQL server mirroring between the primary data center and the secondary data center. In the primary data center the SQL .mdf and .ldf files are stored on the SAN.
Now admittedly it should be very unlikely for us to lose the SAN but if for example the connection to the SAN was lost and the database integrity was lost. Would the mirroring still happen? I.e. would SQL now mirror the broken database and now both are equally broken?
From googling its not clear when mirroring will and will not happen so I was hoping that the community may be able to share some of there experiences.
I also have backup schedules setup which would be a final fail safe but realistically I would hope that the mirrored database would be our quickest way to bring everything back online.
In this scenario at present there is no witness server in the mirroring process although with the benefits of automatic failover I am thinking of adding one.
Thanks

As far as mirroring corruption between PRIMARY and SECONDARY goes: unfortunately, it depends. If the corruption is immediate and physical, then not normally -- the corruption is typically picked up by checks done at the end of the transaction and rolled back.
However, a database can exist in a corrupted state for some time before anything realises it is corrupted. If the underlying data pages are not touched, the engine never has cause to check them. So it is possible that underlying storage issues may mean that either database can become corrupted and you won't know until you attmept to access the affected pages. Traditionally, this would be a write operation, since your client connection will only read from the current active database (and not the partner).
This is why it is important to perform regular maintenance checks on your databases (e.g. DBCC CHECKDB). This becomes harder in a mirrored environment because only PRIMARY can typically be checked, so you really have to induce a manual failover to test your SECONDARY (unless you are running Enterprise, where you might be able to snapshot the mirror and check that -- I've not tried).
Starting with SQL Server 2008, the engine will attempt something called Automatic Page Repair, where it tries to automatically recover corrupted pages it encounters during the mirroring process. You should probably keep an eye on sys.dm_db_mirroring_auto_page_repair if this is something you are worried about.
If it is logical corruption, where the wrong data is entered, this will push across to SECONDARY without any means of stopping it.
However, I should point out that your approach might leave you with other issues. Mirroring isn't backup. And mirroring isn't great over WAN links.
In synchronous mode, it receives the client request, then writes to PRIMARY, then writes to SECONDARY, gets the OK back from SECONDARY and then sends an OK back to the client. If it can't write to SECONDARY, or doesn't get the response from SECONDARY, it rolls back the operation on PRIMARY (even though it was successful) and sends a failure back to the client.
A failing WAN link (even temporarily) can cause PRIMARY to choose not to accept connections (because it can't see SECONDARY). A failover mid-connection can leave you in an invalid logical data state, so make sure your transactions are sound.
With a WITNESS server, this can be a little more robust -- placing the witness server alongside PRIMARY in the same LAN allows WITNESS and PRIMARY to form quorum and agree that PRIMARY is still working, even though it can't see SECONDARY (thus not locking you out of a perfectly functioning database).
Instead, over my slower site-to-site links, I prefer to use log shipping between PRIMARY and SECONDARY. With a bit of effort I can control the transport between sites so as to rate-limit over the WAN link and it is possible keep the log-shipped SECONDARY in a single-user standby mode. This allows me to run the standard DBCC CHECKDB commands against SECONDARY, as well as also querying the SECONDARY for data reconcilliation purposes, too. I can also put a delay on the restoration, too, so I have some leeway to failover before a major logical data error reaches the SECONDARY (although that really depends on the RDO).
If I have a high-availability requirement, I might put in mirroring at the main site only -- i.e. two servers + witness. The relatively-quick few-second automatic failover time provided by the witnessed environment has saved me a few late-night calls, in the past.
Hope this helps.
J.

Related

Multiple application on network with same SQL database

I will have multiple computers on the same network with the same C# application running, connecting to a SQL database.
I am wondering if I need to use the service broker to ensure that if I update record A in table B on Machine 1, the change is pushed to Machine 2. I have seen applications that need to use messaging servers to accomplish this before but I was wondering why this is necessary, surely if they connect to the same database, any changes from one machine will be reflected on the other?
Thanks :)
This is mostly about consistency and latency.
If your applications always perform atomic operations on the database, and they always read whatever they need with no caching, everything will be consistent.
In practice, this is seldom the case. There's plenty of hidden opportunities for caching, like when you have an edit form - it has the values the entity had before you started the edit process, but what if someone modified those in the mean time? You'd just rewrite their changes with your data.
Solving this is a bunch of architectural decisions. Different scenarios require different approaches.
Once data is committed in the database, everyone reading it will see the same thing - but only if they actually get around to reading it, and the two reads aren't separated by another commit.
Update notifications are mostly concerned with invalidating caches, and perhaps some push-style processing (e.g. IM client might show you a popup saying you got a new message). However, SQL Server notifications are not reliable - there is no guarantee that you'll get the notification, and even less so that you'll get it in time. This means that to ensure consistency, you must not depend on the cached data, and you have to force an invalidation once in a while anyway, even if you didn't get a change notification.
Remember, even if you're actually using a database that's close enough to ACID, it's usually not the default setting (for performance and availability, mostly). You need to understand what kind of guarantees you're getting, and how to write code to handle this. Even the most perfect ACID database isn't going to help your consistency if your application introduces those inconsistencies :)

HA Database configuration that avoids split-brain issues?

I am looking for a (SQL/RDB) database setup that works something like this:
I will have 3+ databases in an active/active/active configuration
prior to doing any insert, the database will communicate with atleast a majority of the others, such that they all either insert at the same time or rollback (transaction)
this way I can write and read from any of the databases, and always get the same results (as long as the field wasn't updated very recently)
note: this is for a use case that will be very read-heavy and have few writes (and delay on the writes is an OK situation)
does anything like this exist? I see all sorts of solutions with database HA configurations, but most of them suggest writing to a primary node or having a passive backup
alternatively I could setup a custom application, and have each application talk to exactly 1 database, and achieve a similar result, but I was hoping something similar would already exist
So my questions is: does something like this exist? if not, are there any technical/architectural reasons why not?
P.S. - I will NOT be using a SAN where all databases can store/access the same data
edit: more clarifications as far as what I am looking for:
1. I have no database picked out yet, but I am more familiar with MySQL / SQL Server / Oracle, so I would have a minor inclination towards on of those
2. If a majority of the nodes are down (or a single node can't communicate with the collective), then I expect all writes from that node to fail, and accept that it may provide old/outdated information
failure / recover scenario expectations:
1. A node goes down: it will query and get updates from the other nodes when it comes back up
2. A node loses connection with the collective: it will provide potentially old data to read request, and refuse any writes
3. A node is in disagreement with the data stores in others: majority rule
4. 4. majority rule does not work: go with whomever has the latest data (although this really shouldn't happen)
5. The entries are insert/update/read only, i.e. there will be no deletes (except manually ofc), so I shouldn't need to worry about an update after a delete, however in that case I would choose to delete the record and ignore the update
6. Any other scenarios I missed?
update: I the closest I see to what I am looking for seems to be using a quorum + 2 DBs, however I would prefer if I could have 3 DBs instead, such that I can query any of them at any time (to further distribute the reads, and also to keep another copy of the data)
You need to define "very recently". In most environments with replication for inserts, all the databases will have the same data within a few seconds of an insert (and a few seconds seems pessimistic).
An alternative approach is a "read-from-one/write-to-all" approach. In this case, reads are spread through the system. Writes are then sent to all nodes by the application (or a common layer that the application uses).
Remember, though, that the issue with database replication is not how it works when it works. The issue is how it recovers when it fails and even how failures are identified. You need to decide what happens when nodes go down, how they recover lost transactions, how you decide that nodes are really synchronized. I would suggest that you peruse the documentation of the database that you are actually using and understand the replication mechanisms provided by that platform.

sql server 2005 mirrored database transaction log file maintenance

Ok so for standard, non-mirrored databases, the transaction log is kept in check either simply by having the database in simple mode or by doing regular backups. We keep ours in simple as we have SAN snapshot backups taking place and there is no need for SQL backups.
We're now going to mirroring. I obviously no longer have the choice of simple mode and must use full. this obviously leads to large log files and the need for log backups. That's fine I can deal with that; a maintenance plan that takes a log backup and discards any previous ones. I realise that this backup is essentially useless without its predecessors but the SAN snapshots are doing the backups.
My question is...
a) Is there a way to truncate the log file of all processed rows without creating a backup? (as I can't use them anyway...)
b) A maintenance plan is local to a server and is not replicated across a mirrored pair. How should it be done on a mirrored setup? such that when the database fails over, the plan starts running on the new principal, but doesn't get upset when its a mirror?
Thanks
A. If your server is important enough to mirror it, why isn't it important enough to take transaction log backups? SAN snapshots are point-in-time images of just one point in time, but they don't give you the ability to stop at different points of time along the way. When your developers truncate a table, you want to replay all of the logs right up until that statement, and stop there. That's what transaction log backups are good for.
B. Set up a maintenance plan (or even better, T-SQL scripts like Ola Hallengren's at http://ola.hallengren.com) to back up all of the databases, but check the boxes to only back up the online ones. (Off the top of my head, not sure if that's an option in 2005 - might be 2008 only.) That way, you'll always get whatever ones happen to fail over.
Of course, keep in mind that you need to be careful with things like cleanup scripts and copying those backup files. If you have half of your t-log backups on one share and half on the other, it's tougher to restore.
a) no, you cannot truncate a log that is part of a mirrored database. backing the logs up is your best option. I have several databases that are setup with mirroring simply based on teh HA needs but DR is not required for various reasons. That seems to be your situation? I would really still recommend keeping the log backups for a period of time. No reason to kill a perfectly good recovery plan that is added by your HA strategy. :)
b) My own solutions for this are to have a secondary agent job that monitors based on the status of the mirror. If the mirror is found to change, the secondary job on teh mirror instance is enabled and if possible, the old principal is disabled. if the principal was down and it comes back up, the job is still disabled. the only way the jobs themselves would be switched back is the event of again, another forced failover.

Database Replication or Mirroring?

What is the difference between Replication and Mirroring in SQL server 2005?
In short, mirroring allows you to have a second server be a "hot" stand-by copy of the main server, ready to take over any moment the main server fails. So mirroring offers fail-over and reliability.
Replication, on the other hand, allows two or more servers to stay "in sync" - that means the secondary servers can answer queries and (depending on setup) actually change data (it will be merged in the sync). You can also use it for local caching, load balancing, etc.
Mirroring is a feature that creates a copy of your database at bit level. Basically you have the same, identical, database in two places. You cannot optionally leave out parts of the database. You can have only one mirror, and the 'mirror' is always offline (it cannot be modified). Mirroring works by shipping the database log as is being created to the mirror and apply (redo-ing) the log on the mirror. Mirroring is a technology for high availability and disaster recoverability.
Replication is a feature that allow 'slices' of a database to be replicated between several sites. The 'slice' can be a set of database objects (ie. tables) but it can also contain parts of a table, like only certain rows (horizontal slicing) or only certain columns to be replicated. You can have multiple replicas and the 'replicas' are available to query and even can be updated. Replication works by tracking/detecting changes (either by triggers or by scanning the log) and shipping the changes, as T-SQL statements, to the subscribers (replicas). Replication is a technology for making data available at off sites and to consolidate data to central sites. Although it is sometimes used for high availability or for disaster recoverability, it is an artificial use for a problem that mirroring and log shipping address better.
There are several types and flavours of replication (merge, transactional, peer-to-peer etc.) and they differ in how they implement change tracking or update propagation, if you want to know more details you should read the MSDN spec on the subject.
Database mirroring is used to increase database uptime and reliability.
Replication is used primarily to distribute portions of your primary database -- the publisher -- to one or more subscriber databases. This is often done to make data available (typically for read only) on remote servers so that remote clients can access the data locally (to them) rather than directly from the publisher across a slower WAN connection. Although, as the previous posts indicate, there are more complex scenarios where updates are permitted on the subscribers. It also can have the benefit of reducing the I/O load on the publisher.

Using Sql Server Replication

We are using Replication and seem to be having endless problems with it. It seems to shut down for unknown reasons. It needs to be shut down to remove a column and only starts back up half the time. Does anyone have any advice on how to properly use replication or some alternatives to it.
Edit:
We are using Sql Server 2005, We cannot use database mirroring as we used the other database for reporting. As far as I am aware you cannot query from a mirrored database.
If you need just couple of tables from your DB for reports, replication is more useful, but you also can set up log shipping with secondary server in STAND BY mode (especially if you need significant part of your data for reports), then you can run reports on secondary server. You just have to remember that log shipping will interfere with transaction log backups, so you have to use the same folder with log backup files for both processes.
I would think the combination of database mirroring and database snapshots will solve your issues.
First, database mirroring is very easy to setup and I have never had any problems with it (using it for the past 4+ years).
Second, creating a database snapshot on your failover server will allow you to run reports. You can setup a sql agent job to drop and re-create the snapshot on whatever acceptable interval you like.
Of course this is all dependent on if you need your reports to run on real-time data or if they can be delayed somewhat.
Here are a list of the problems that I have had to resolve to get replication working:
1) The replication sometimes lies to me and tells me this, even when its working fine.
"The server 'Bob' is not a Subscriber. (.Net SqlClient Data Provider)" I have tried to re-initialise it thinking that it was broken and it never was...
2) It can take a little while to restart itself, especially if your remote DB is on the other side of the planet, which it is in my case. If you are on a slow network connection, or it is not 100% reliable, then you can have problems. Also, the jobs which restart the process can sometimes take a while to run, which also delays things further.
3) Some changes require full re-initalisation which involves sending a new snapshot out. If you don't have your permissions quite right, and you can re-initialise manually, but it doesn't happen automatically, then this can be a another reason for problems.
We have a SQL transactional replication which runs perfectly happily. You seem to say that it is when you are making schema changes to the publisher that you get problems. Each time we do a schema change we drop the publication, subscription and the subscription database. Do the change, then re-build it all. We can do this becuase we can tolerate the time it takes to re-apply the snapshot. There are ways to apply schema changes to the publication and have them propogate to the subscriber. Take a look at sp_register_custom_scripting. We have made this work once, so I can give some more information about it if you need.
As #Jason says, you can report from a mirrored database by using a snapshot. Beware that the snapshot will take up space, and cause more work for the mirror server. Although how much space will depend on how much data is changing and how big your original database is. We do use a snapshot on a mirrored database for occasional reports because our entire database is not replicated.
log shipping http://msdn.microsoft.com/en-us/library/ms187103.aspx
What version of SQL Server are you using?
We're using replication now for a particular solution, and it seems to just work, day in, day out.
I would examine your event log's, and SQL Server logs to see if you can determine why it is shutting down, and why it doesn't start up.
Are you possibly patching the servers, or are you having network errors?
The alternatives to replication are log shipping, or database mirroring.
I personally prefer Database Mirroring, but it really depends what you're trying to do, as some of these aren't appropriate for certain situations.
We also have used SQL transactional replication. We had the same pains with updating schema, which requires dropping the publication on all servers, performing the updates, and then reinitializing replication, and hoping for the best. Sometimes it would not initialize, or a node would fall behind and we'd get little warning for it. A few times we even lost all the stored procedure execute permissions causing pretty much total failure on the websites.
We have a rather large database so reinitialization could take quite some time, meaning all updates had to be done at 2am on Sunday - not exactly when we're awake and alert and able to use all our faculties to deal with a problem that might arise.
We are ditching replication in favor of failover clustering on SQL 2008, but it can still be done all the way back to SQL 2000.
http://technet.microsoft.com/en-us/library/cc917693.aspx