SQL Server 2008 replication without reinitializing

I have two databases on different servers: center_db on siglv01\sql2008 and center_db on sig\sql2008.
Can I restart replication without having to reinitialize it? The connection dropped more than 3 days ago and is now too slow, so I want to start replication again without a reinitialize.

Based on the brief conversation above, I don't think you can do this without a re-init. Specifically, the distribution database only keeps commands for a limited window before it starts trimming them; the default is 72 hours. If the last command delivered to all of your subscribers is older than that, the distribution database no longer has what it needs to play forward all of the activity that has happened since then.
Your only hope would be if the distribution agent is still running (it knows when the above situation happens and will give you an error saying as much). If so, try to figure out why delivery is slow (troubleshoot this like any other "slow application"; replication isn't magic) and see if it can get caught up that way. Depending on how many commands remain undelivered, it may be faster to just re-init.
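If you want to check how close you are to the retention window, or how many commands are still queued, something like the following should work at the distributor. The server and database names come from the question; the publication name is a placeholder you'd fill in:

    -- Run in the distribution database at the distributor.
    -- max_distretention (in hours) is the window after which commands are trimmed:
    EXEC sp_helpdistributiondb;

    -- Roughly how many commands are still undelivered to the subscriber:
    EXEC sp_replmonitorsubscriptionpendingcmds
        @publisher         = N'siglv01\sql2008',
        @publisher_db      = N'center_db',
        @publication       = N'<publication_name>',   -- placeholder
        @subscriber        = N'sig\sql2008',
        @subscriber_db     = N'center_db',
        @subscription_type = 0;                        -- 0 = push, 1 = pull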

Related

What would cause SQL Server to stop writing to the error log?

The error logs for our SQL Server instance accumulate a large amount of data all day (250k records in a month), then suddenly stop at roughly the same time of day (9:15 PM), though on different days of the week and at seemingly random intervals of days.
This coincides with other issues on the server: 1) jobs that move files to shares on the database server fail, and 2) I cannot access the server by any method (I tried RDP and SSMS). Once the server is rebooted, SQL Server comes up and error logging resumes.
Windows Event Viewer doesn't show any notable error messages for System (the other event logs have wrapped already).
The error logs are being written to the D:\ drive, which has over 100GB free currently. The error log files are in the range of tens of megabytes.
I'd appreciate any ideas on what might have caused this or how to troubleshoot it. Thanks!
The cause appears to have been a corrupted maintenance plan. I discovered this by correlating the timing of the lock-up with the times the maintenance plan was running. The lack of logging made this difficult to confirm. I'm guessing that at least some parts of it ran normally but were rolled back on restart.
The current fix was to disable the maintenance plan and replace it with a collection of jobs that do the same tasks. I will likely recreate the original maintenance plan if the server remains stable for another week or two. If we stay stable past that point, it should solidly confirm the maintenance plan as the source of the problem.
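For anyone trying to do the same correlation, a query along these lines against msdb lines up job runs with the time the logging stopped (a generic sketch; your job names and times will differ):

    -- Job outcome rows (step_id = 0) with start times, newest first;
    -- look for runs that start near 9:15 PM or that never report an outcome.
    SELECT j.name,
           h.run_date,      -- yyyymmdd as an integer
           h.run_time,      -- hhmmss as an integer
           h.run_duration,  -- hhmmss as an integer
           h.run_status     -- 0 = failed, 1 = succeeded
    FROM msdb.dbo.sysjobhistory AS h
    JOIN msdb.dbo.sysjobs AS j
        ON j.job_id = h.job_id
    WHERE h.step_id = 0
    ORDER BY h.run_date DESC, h.run_time DESC;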

SyncFramework 2.1 updates & deletes do not seem to apply properly

I'm synchronizing SQL Server 2008 with ~6 SQL Server 2008 Express clients (everything R2, I believe), using the SyncOrchestrator, or more specifically using http://code.msdn.microsoft.com/windowsdesktop/Database-SyncSQL-Server-e97d1208 as a base with slight modifications. To my knowledge this means all connections are peers or nodes.
I have 2 scopes. One is download-only and the other is upload-only. The download-only scope is riddled with identity columns, primarily because I didn't know any better and couldn't wrap my head around introducing Guids as the PK on the client side. It doesn't totally matter, as all clients should have exact replicas of about 8 or so tables, and these machines don't touch this data in any way, only read it.
The upload-only scope uses Guids, as fortunately I can control that portion of the database, and there is no way 10 clients all using the same identity seed could sync back to the server properly. Both scopes use the default provisioning with bulk inserts and the whole 9 yards, so there shouldn't be anything I'm doing on the provisioning end to screw this up.
I initially set everything up without using PerformPostRestoreFixup, and the initial database was manually synchronized with insert statements from the host. This seemed fine, but no updates or deletes ever seemed to be applied. You can safely ignore this (it's only here for historical accuracy and to prove my ineptness), as I then used VS2010 Database Projects to rebuild the database down to schema only and synchronized. I then used the steps outlined here (http://social.microsoft.com/Forums/br/syncdevdiscussions/thread/9ac6d1a1-1565-4b82-a8d8-3d4a9ff5d07b) (sync, backup, restore, call PerformPostRestoreFixup, sync on x clients), and on my dev box where I'm setting all this up I could see updates and deletes just fine. It's when I deploy this to the x clients that I'm not seeing a mirror of the database as I think I should.
The initial sync will complain and try to synchronize all records again. I believe this is expected. During the ApplyChangeFailed event on the client I set everything other than DbConflictType.ErrorsOccurred to ApplyAction.RetryWithForceWrite. This may be a source of problems, as I initially thought this should be done to force the change down to the client. I want the server to always win in this scenario, but during tracing I always see the phrase "Local wins" during the bulk insert/update calls. It's possible I'm seeing the error before the re-apply happens, but it's awkward to look at.
The only problem I seem to be having is with the download-only scope. The initial client database is about a week old now, and if I use the PerformPostRestoreFixup steps I don't see any of the updates that have been applied between then and now, as I think I should. It's as if SyncFx almost prefers a blank database on the client side to kick off the initial sync; then all the updates seem to apply just fine, with no ApplyChangeFailed events firing.
If anyone has seen this before or has a clue where to go, I would greatly appreciate it. My brain is fried from trying to determine what's going on. My last-ditch effort will be to deploy blank databases to all the clients and have them start the sync. I've had no issues with this on the dev side, but I can only test one other client to know if that will do anything different. Aside from that, I don't know what to do other than to keep doing manual syncs, which would defeat the purpose entirely. I thought PerformPostRestoreFixup would alleviate the issue entirely, but I seem to be having the same problems with or without it, or perhaps I'm not looking at what I need to be.
Thanks
I wanted to report and close the entry with my findings.
When I would deploy a previously configured client database, I'd often get ApplyChangeFailed events in the form of this log:
"[05:30:41 PM] - ApplyChange Failed: TableName: , Stage: ApplyingInserts, ConflictType: LocalInsertRemoteInsert, Action: RetryWithForceWrite"
This is what I thought would be expected, as it tried to reinsert data that was already there. What should have happened is for RetryWithForceWrite to turn this into an update, but I found the data was not being updated with what was sent down.
Once I started each client with a completely blank database and provisioned locally, all of these errors went away. It's as if every client expects some unique ID that only it sets. I'm also using x64 builds versus x86, which may or may not have a bearing on the results. I wish I could determine what exactly happened, but it seems that when in doubt, and whenever possible, starting from absolute zero and letting sync fill in the data is your safest option.

Log of Queries executed on SQL Server

I ran a T-SQL query on a SQL Server 2005 DB. It was supposed to run for 4 days, but when I checked after 4 days I found that the machine had shut down abruptly. So I want to trace what happened with the query, or in other words, what the status of the query was when the machine went down.
Is there a mechanism to check this?
It may be very difficult to do such a post-mortem. But while it may not be possible to determine what portion of the long-running query was active when the shutdown occurred, it is important to try and find the root cause of the problem!
Depending on the way the query is written, SQL will attempt to roll back any SQL that wasn't completed (or any uncommitted transaction, if transactions were used). This will happen the first time the server is restarted; if you have any desire to analyze the SQL transaction log (yuck!), make a copy first.
Before getting to the part of the query that may have been running at the time of the shutdown, it may be preferable to look into the SQL Server logs (I mean the application logs, with informational messages, timestamps and such, not the transaction log of a given database) in search of basic facts, and possibly of an error message occurring some time prior to the shutdown that could plausibly be the origin of the problem. Also check the Windows event logs, as they may indicate abnormal system-wide conditions that could be a cause or a consequence of the SQL shutdown.
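You don't have to leave Management Studio to read the SQL Server error log; for example (the search string is just an illustration):

    -- Log 0 is the current error log; the second argument selects
    -- the SQL Server log (1) or the SQL Agent log (2):
    EXEC sp_readerrorlog 0, 1;

    -- Or filter for entries mentioning a particular word:
    EXEC sp_readerrorlog 0, 1, N'shutdown';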
The SQL transaction log can be "decompiled" and used to understand what was actually updated/inserted/deleted in the database; however, this type of info for a 4-day run may be buried quite deep, and possibly interspersed with unrelated queries. In fact, running out of disk space for the SQL log could possibly cause the server to become erratic (I would expect a more elegant handling of such a situation, but if somehow the drive in question was also the system drive, bad things could happen...).
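If you do go digging in the transaction log, the undocumented fn_dblog function gives a raw view of the active log without third-party tools (undocumented and unsupported, so treat the output with care):

    -- Read the active portion of the current database's log;
    -- NULL, NULL means no start/end LSN filter.
    SELECT [Current LSN],
           Operation,         -- e.g. LOP_INSERT_ROWS, LOP_MODIFY_ROW
           [Transaction ID],
           [Begin Time]
    FROM fn_dblog(NULL, NULL);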
Finally, by analyzing the "4 days long" SQL script it may be possible to infer which parts were completed, thanks to various side effects of the script. If nothing else, this review may allow putting the database back into a coherent state if necessary, or truncating the script to exclude the parts already completed, in preparation for a new run to finish the work.

Is it possible to get sub-1-second latency with transactional replication?

Our database architecture consists of two SQL Server 2005 servers, each with an instance of the same database structure: one for all reads, and one for all writes. We use transactional replication to keep the read database up to date.
The two servers are very high-spec indeed (the write server has 32GB of RAM), and are connected via a fibre network.
When deciding upon this architecture we were led to believe that the latency for data to be replicated to the read server would be on the order of a few milliseconds (depending on load, obviously). In practice we are seeing latency of around 2-5 seconds in even the simplest of cases, which is unsatisfactory. By the simplest case, I mean updating a single value in a single row in a single table on the write DB and seeing how long it takes to observe the new value in the read database.
What factors should we be looking at to achieve latency below 1 second? Is this even achievable?
Alternatively, is there a different mode of replication we should consider? What is the best practice for the locations of the data and log files?
Edit
Thanks to all for the advice and insight - I believe that the latency periods we are experiencing are normal; we were misled by our DB hosting company as to what latency times to expect!
We're using the technique described near the bottom of this MSDN article (under the heading "scaling databases"), and we'd failed to deal properly with this warning:
The consequence of creating such specialized databases is latency: a write is now going to take time to be distributed to the reader databases. But if you can deal with the latency, the scaling potential is huge.
We're now looking at implementing a change to our caching mechanism that enforces reads from the write database when an item of data is considered to be "volatile".
No. It's highly unlikely you could achieve sub-1s latency times with SQL Server transactional replication even with fast hardware.
If you can get 1-5 seconds of latency, then you are doing well.
From here:
Using transactional replication, it is possible for a Subscriber to be a few seconds behind the Publisher. With a latency of only a few seconds, the Subscriber can easily be used as a reporting server, offloading expensive user queries and reporting from the Publisher to the Subscriber.
In the following scenario (using the Customer table shown later in this section) the Subscriber was only four seconds behind the Publisher. Even more impressive, 60 percent of the time it had a latency of two seconds or less. The time is measured from when the record was inserted or updated at the Publisher until it was actually written to the subscribing database.
I would say it's definitely possible.
I would look at:
Your network
Run ping commands between the two servers and see if there are any issues
If the servers are next to each other you should have < 1 ms.
Bottlenecks on the server
This could be network traffic (volume)
Like network cards not being configured for 1GB/sec
Anti-virus or other things
Do some analysis on some queries and see if you can identify indexes or locking which might be a problem
See if any of the selects on the read database might be blocking the writes.
Add WITH (NOLOCK) and see if this makes a difference on one or two queries you're analyzing.
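As a concrete illustration of the NOLOCK suggestion (the table and column names here are made up):

    -- Dirty read: takes no shared locks, so it won't block the
    -- distribution agent's writes, at the cost of possibly reading
    -- uncommitted or inconsistent data.
    SELECT OrderID, Status
    FROM dbo.Orders WITH (NOLOCK)
    WHERE CustomerID = 42;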
Essentially you have a complicated system that you have a problem with; you need to determine which component is the problem and fix it.
Transactional replication is probably best if the reports/selects you need to run have to be up to date. If they don't, you could look at log shipping, although that would add some downtime with each restore.
For data/log files, make sure they're on separate drives so that performance is maximized.
Something to remember about transactional replication is that a single update now requires several operations to happen for that change to reach the subscriber.
First you update the source table.
Next the log reader sees the change and writes it to the distribution database.
Next the distribution agent sees the new entry in the distribution database and reads that change, then runs the correct stored procedure on the subscriber to update the row.
If you monitor the statement run times on the two servers, you'll probably see that they run in just a few milliseconds. However, it is the lag time while waiting for the log reader and distribution agent to see that they need to do something that is going to kill you.
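One way to see exactly where that lag accumulates is a tracer token, which reports publisher-to-distributor and distributor-to-subscriber latency separately (available since SQL Server 2005; the publication name below is a placeholder):

    -- Run in the published database at the publisher:
    DECLARE @token_id INT;
    EXEC sp_posttracertoken
        @publication = N'<publication_name>',
        @tracer_token_id = @token_id OUTPUT;

    -- Later, see how long the token took at each hop:
    EXEC sp_helptracertokenhistory
        @publication = N'<publication_name>',
        @tracer_id = @token_id;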
If you truly need sub-second processing time, then you will want to look into writing your own processing engine to handle moving data from one server to another. I would recommend using SQL Service Broker for this, as that way everything is native to SQL Server and no third-party code has to be written.
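To give a feel for what that would involve, here is a minimal Service Broker skeleton. All the names here are invented for illustration, and a real engine would also need an activation procedure on the receiving side to apply the changes:

    -- One-time setup (Service Broker must be enabled on the database):
    CREATE MESSAGE TYPE [//demo/RowChange] VALIDATION = WELL_FORMED_XML;
    CREATE CONTRACT [//demo/RowChangeContract]
        ([//demo/RowChange] SENT BY INITIATOR);
    CREATE QUEUE dbo.RowChangeQueue;
    CREATE SERVICE [//demo/RowChangeService]
        ON QUEUE dbo.RowChangeQueue ([//demo/RowChangeContract]);

    -- Sending a change notification:
    DECLARE @dialog UNIQUEIDENTIFIER;
    BEGIN DIALOG CONVERSATION @dialog
        FROM SERVICE [//demo/RowChangeService]
        TO SERVICE '//demo/RowChangeService'
        ON CONTRACT [//demo/RowChangeContract]
        WITH ENCRYPTION = OFF;
    SEND ON CONVERSATION @dialog
        MESSAGE TYPE [//demo/RowChange]
        (N'<row table="Orders" id="42" col="Status" value="Shipped"/>');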

Using SQL Server Replication

We are using replication and seem to be having endless problems with it. It seems to shut down for unknown reasons. It needs to be shut down to remove a column, and it only starts back up half the time. Does anyone have advice on how to use replication properly, or some alternatives to it?
Edit:
We are using SQL Server 2005. We cannot use database mirroring, as we use the other database for reporting. As far as I am aware, you cannot query a mirrored database.
If you need just a couple of tables from your DB for reports, replication is more useful, but you can also set up log shipping with the secondary server in STANDBY mode (especially if you need a significant part of your data for reports); then you can run reports on the secondary server. Just remember that log shipping will interfere with transaction log backups, so you have to use the same folder of log backup files for both processes.
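The STANDBY part is what keeps the secondary readable between restores; each log restore looks something like this (the database name and paths are illustrative):

    -- Restore a log backup on the secondary, leaving the database
    -- read-only between restores; the undo file holds uncommitted work.
    -- Readers must be disconnected while the restore runs.
    RESTORE LOG ReportingDB
    FROM DISK = N'D:\LogShip\ReportingDB.trn'
    WITH STANDBY = N'D:\LogShip\ReportingDB_undo.dat';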
I would think the combination of database mirroring and database snapshots will solve your issues.
First, database mirroring is very easy to set up and I have never had any problems with it (I have been using it for the past 4+ years).
Second, creating a database snapshot on your failover server will allow you to run reports. You can set up a SQL Agent job to drop and re-create the snapshot on whatever interval is acceptable to you.
Of course this is all dependent on if you need your reports to run on real-time data or if they can be delayed somewhat.
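The drop-and-recreate job boils down to a couple of statements (the database, logical file, and path names here are invented):

    -- Drop yesterday's snapshot if it exists, then take a fresh one
    -- against the mirrored database for reporting:
    IF DB_ID(N'SalesDB_Report') IS NOT NULL
        DROP DATABASE SalesDB_Report;

    CREATE DATABASE SalesDB_Report
    ON (NAME = SalesDB_Data,          -- logical name of the source data file
        FILENAME = N'D:\Snapshots\SalesDB_Report.ss')
    AS SNAPSHOT OF SalesDB;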
Here is a list of the problems that I have had to resolve to get replication working:
1) Replication sometimes lies to me and tells me this, even when it's working fine:
"The server 'Bob' is not a Subscriber. (.Net SqlClient Data Provider)". I have tried to re-initialise it, thinking that it was broken, when it never was...
2) It can take a little while to restart itself, especially if your remote DB is on the other side of the planet, which it is in my case. If you are on a slow network connection, or it is not 100% reliable, then you can have problems. Also, the jobs which restart the process can sometimes take a while to run, which also delays things further.
3) Some changes require a full re-initialisation, which involves sending a new snapshot out. If you don't have your permissions quite right, you may be able to re-initialise manually but not have it happen automatically, and this can be another source of problems.
We have a SQL transactional replication setup which runs perfectly happily. You seem to be saying that it is when you make schema changes to the publisher that you get problems. Each time we do a schema change we drop the publication, the subscription, and the subscription database; make the change; then rebuild it all. We can do this because we can tolerate the time it takes to re-apply the snapshot. There are ways to apply schema changes to the publication and have them propagate to the subscriber; take a look at sp_register_custom_scripting. We have made this work once, so I can give some more information about it if you need.
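For reference, the teardown before a schema change is roughly this (the publication name is a placeholder; the real scripts also recreate the articles and push a new snapshot afterwards):

    -- At the publisher, in the published database:
    EXEC sp_dropsubscription
        @publication = N'<publication_name>',
        @article = N'all',
        @subscriber = N'all';
    EXEC sp_droppublication @publication = N'<publication_name>';
    -- ...apply the schema changes, then recreate the publication,
    -- articles, and subscriptions, and generate a new snapshot.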
As @Jason says, you can report from a mirrored database by using a snapshot. Beware that the snapshot will take up space and cause more work for the mirror server, although how much space will depend on how much data is changing and how big your original database is. We use a snapshot on a mirrored database for occasional reports because our entire database is not replicated.
Log shipping: http://msdn.microsoft.com/en-us/library/ms187103.aspx
What version of SQL Server are you using?
We're using replication now for a particular solution, and it seems to just work, day in, day out.
I would examine your event logs and SQL Server logs to see if you can determine why it is shutting down, and why it doesn't start back up.
Are you possibly patching the servers, or are you having network errors?
The alternatives to replication are log shipping, or database mirroring.
I personally prefer Database Mirroring, but it really depends what you're trying to do, as some of these aren't appropriate for certain situations.
We have also used SQL transactional replication. We had the same pains with updating schema, which requires dropping the publication on all servers, performing the updates, and then reinitializing replication and hoping for the best. Sometimes it would not initialize, or a node would fall behind and we'd get little warning of it. A few times we even lost all the stored procedure execute permissions, causing pretty much total failure on the websites.
We have a rather large database so reinitialization could take quite some time, meaning all updates had to be done at 2am on Sunday - not exactly when we're awake and alert and able to use all our faculties to deal with a problem that might arise.
We are ditching replication in favor of failover clustering on SQL 2008, but it can still be done all the way back to SQL 2000.
http://technet.microsoft.com/en-us/library/cc917693.aspx