I hope you can help me with this:
I have two database tables in separate servers, and I want them to be synchronized, I mean that when one of them is modified (Insert, delete, update), the other one is modified too. I´ve been searching for a while now and I´ve found that this can be acomplished with ActiveMQ, but, I haven´t found the way of doing It, can anybody give me a clue or a tutorial or something?.
I really appreciate your help.
Thanks in advance.
Is there any particular reason you want to mix in ActiveMQ for the task?
ActiveMQ is a message broker to send event messages around. There is no out-of-the-box database synchronization of DB events with ActiveMQ. You probably have to use Apache Camel (or custom code) to read and write the databases. That would be a non trivial task, non the less, since there are things as transactions, table locks you need to take into account.
If replication is all that is needed for HA or backup, you should really look at the built in mechanisms of SQL server.
Related
I'm reading about OpenLDAP replication and I don't understand why you would use refreshOnly mode vs. refreshAndPersist mode.
I've tried to do some searching online, but wasn't able to find any discussion on the benefits (if any) of refreshOnly.
This page says "In this configuration, assuming that a refreshAndPersist type of synchronization is used (it is not clear why you would even want to think about using refreshOnly but it is possible), then a write (modify) to any master will be immediately propagated to all the other masters (providers) acting in their slave (consumer) role."
But's it's referring to Multi-Master replication and doesn't say anything similar for Provider/Consumer replication.
Is there any reason why I should ever consider refreshOnly for OpenLDAP replication?
Thanks!
You would use refreshOnly if you wanted to completely control when replication took place, e.g. every midnight. You would use refreshAndPersist when you want replication to be continuous and as instantaneous as possible.
I wish to create a queue where a lot of computers would be writing in but each computer will write only once in his entire life. What you think would be the best way to achieve that?
I have read about SQL Server queues, SQL Server tables used as queue or service broker infrastructure.
SQL Server table : pretty easy to create but I am afraid of the performance
Service broker : more complex infrastructure. It seems that you have to run a service on the sender and have a send queue which is useless in my case because because all of them only send one message in their entire life.
What solution would be the best in my case?
you don't have to create a service on each computer. Service Broker objects can be confined to one DB server. For example, if you have 100 computers that need to drop of a message, they will need a connection string to the database server and execute a stored procedure that would enqueue the said message.
that said, it seems like a Service Broker queue would be an overkill for this. A simple table would probably suffice, or even better an MSMSQ (which would eliminate the need to connect to a DB).
Our production code uses tables as queues. We don't really need the robustness of Service Broker, and all our code already connects to databases for other stuff anyway.
Our code doesn't need more than a few hundred transactions per second, and I've shown that our queue can achieve over 10k transactions per second, so I'm fairly happy with the performance.
Here's a great article describing how to design tables for use as queues: http://rusanu.com/2010/03/26/using-tables-as-queues/
I would not design your table without first giving it a read.
Our company is also contemplating an alternative queue strategy involving Redis that doesn't require disk access since we are considering a design that would require tens or hundreds of thousands of inserts a second, but don't necessarily care about losing the data in the event of a failure. I would also give those methods a consideration if you need the throughput.
Maybe the better way transform your whole system from "several writers and one reader" to "one writer and one reader"? I mean you may make some service (web or any other) who will receive requests to write and will be the only writer into your database. This is ordinary situation and has many standard solutions.
I'm using an SQL Server 2008 R2 as a queuing mechanism. I add items to the table, and an external service reads and processes these items. This works great, but is missing one thing - I need mechanism whereby I can attempt to select a single row from the table and, if there isn't one, block until there is (preferably for a specific period of time).
Can anyone advise on how I might achieve this?
The only way to achieve a non-pooling blocking dequeue is WAITFOR (RECEIVE). Which implies Service Broker queues, with all the added overhead.
If you're using ordinary tables as queues you will not be able to achieve non-polling blocking. You must poll the queue by asking for a dequeue operation, and if it returns nothing, sleep and try again later.
I'm afraid I'm going to disagree with Andomar here: while his answer works as a generic question 'are there any rows in the table?' when it comes to queueing, due to the busy nature of overlapping enqueue/dequeue, checking for rows like this is a (almost) guaranteed deadlock under load. When it comes to using tables as queue, one must always stick to the basic enqueue/dequeue operations and don't try fancy stuff.
"since SQL Server 2005 introduced the OUTPUT clause, using tables as queues is no longer a hard problem". A great post on how to do this.
http://rusanu.com/2010/03/26/using-tables-as-queues/
I need mechanism whereby I can attempt
to select a single row from the table
and, if there isn't one, block until
there is (preferably for a specific
period of time).
You can loop and check for new rows every second:
while not exists (select * from QueueTable)
begin
wait for delay '00:01'
end
Disclaimer: this is not code I would use for a production system, but it does what you ask.
The previous commenter that suggested using Service Broker likely had the best answer. Service Broker allows you to essentially block while waiting for more input.
If Service Broker is overkill, you should consider a different approach to your problem. Can you provide more details of what you're trying to do?
Let me share my experiences with you in this area, you may find it helpful.
My team first used MSMQ transactional queues that would feed our asynchronous services (be they IIS hosted or WAS). The biggest problem we encountered was MS DTC issues under heavy load, like 100+ messages/second load; all it took was one slow database operation somewhere to start causing timeout exceptions and MS DTC would bring the house down so to speak (transactions would actually become lost if things got bad enough), and although we're not 100% certain of the root cause to this day, we do suspect MS DTC in a clustered environment has some serious issues.
Because of this, we started looking into different solutions. Service Bus for Windows Server (the on-premise version of Azure Service Bus) looked promising, but it was non-transactional so didn't suit our requirements.
We finally decided on the roll-your-own approach, an approach suggested to us by the guys who built the Azure Service Bus, because of our transactional requirements. Essentially, we followed the Azure Worker Role model for a worker role that would be fed via some queue; a polling-blocking model.
Honestly, this has been far better for us than anything else we've used. The pseudocode for such a service is:
hasMsg = true
while(true)
if(!hasMsg)
sleep
msg = GetNextMessage
if(msg == null)
hasMsg = false
else
hasMsg = true
Process(msg);
We've found that CPU usage is significantly lower this way (lower than traditional WCF services).
The tricky part of course is handling transactions. If you'd like to have multiple instances of your service read from the queue, you'll need to employ read-past/updlock in your sql, and also have your .net service enlist in the transactions in a way that will roll-back should the service fail. in this case, you'll want to go with retry/poison queues as tables in addition to your regular queues.
I am researching the possibility of using Firebird for a project.
However, one potential problem is replication and failover, or rather, lack of a (subjective) "good" solution. There are several potential solutions listed in the Firebird FAQ but they are either 1) Windows-centric; 2) horribly outdated; 3) commerical; or 4) not full-featured.
The only potential option I see is FIBRE and that looks 1) immature; 2) potentially dead; and 3) not full-featured.
I've learned about DRBD and Heartbeat and these solutions look promising. I am looking for your feedback should you already have 1) setup a replicated Firebird configuration; and/or 2) used DRBD with Firebird.
Any "gotchas", recommendations, tips, etc.?
Thanks!
There is one session about replication in Firebird Conference 2009
Holger Klemt
* Firebird Replicated Part 1
* Firebird Replicated Part 2
o In this two sessions you will see how easy it is to implement
your own replication system in a
Firebird database. Based on triggers
and simple scripts, your can create a
live backup system. The architecture
allows master-master, master-slave,
multi-master, online and offline
replication. The replicated Firebird
cluster can be used by any client
without interuption, also in the case
of partial hardware failures, planned
hardware and software maintenance
operations, for example the switch to
a new Firebird version.
We have been using DRBD/Heartbeat/Pacemaker Solution for the last 2 years for exactly the same problem. To keep Firebird databases up and running and failover. The setup is actually quite easy and I have a few suggestions that I will give you to get a head start. So these are just suggestions ...
create a drbd partition, format it and mount it to /data (with pacemaker of course)
put your aliases.conf to the drbd partion, so you won't have to change the aliases.conf twice everytime you make a change to it. Copy the aliases.conf file to /data and link it to /etc/firebird/2.1/aliases.conf on both nodes
The downside of using Drbd/Pacemaker in a Primary/Secondary setup is that the clients will loose the connection as soon as the primary node dies and until the secondary node is up. The will have to reconnect. I haven't really found another way arround that, although the firebird client should allow a connection timeout it never really worked with our applications (maybe the applications or the libraries we use don't really use the firebird connection timeout).
As for database replication, I am afraid you have to go the way as Hugues Van Landeghem decribed or quoted it. We developed such an application, that works with triggers. So a new line is added to a table, a trigger copies the key of the entry to another table which is constantly read by an application which grabs that entry and inserts it to another database. Quite ugly but it works just fine! I personally think Firebird should invest some time in having their own datbase replications system...they are really far behind...
Hope my information helped you a little bit. I have further questions feel free to contact me or visit my site # gefoo.org
An existing process changes the status field of a booking record in a table, in response to user input.
I have another process to write, that will run asynchronously for records with a particular status. It will read the table record, perform some operations (including calls to third party web services), and update the record's status field to indicate that processing is completed (or In Error, with an error count).
This operation sounds very similar to a queue. What are the benefits and tradeoffs of using MSMQ over a SQL Table in this situation, and why should I choose one over the other?
It is our software that is adding and updating records in the table.
It is a new piece of work (a Windows Service) that will be performing the asynchronous processing. This needs to be "always up".
There are several reasons, which were discussed on the Fog Creek forum here: http://discuss.fogcreek.com/joelonsoftware5/default.asp?cmd=show&ixPost=173704&ixReplies=5
The main benefit is that MSMQ can still be used when there is intermittant connectivity between computers (using a store and forward mechanism on the local machine). As far as the application is concerned it delivered the message to MSMQ, even though MSMQ will possibly deliver the message later.
You can only insert a record to a table when you can connect to the database.
A table approach is better when a workflow approach is required, and the process will move through various stages, and these stages need persisting in the DB.
If the rate at which booking records is created is low I would have the second process periodically check the table for new bookings.
Unless you are already using MSMQ, introducing it just gives you an extra platform component to support.
If the database is heavily loaded, or you get a lot of lock contention with two process reading and writing to the same region of the bookings table, then consider introducing MSMQ.
I also like this answer from le dorfier in the previous discussion:
I've used tables first, then refactor
to a full-fledged msg queue when (and
if) there's reason - which is trivial
if your design is reasonable.
Thanks, folks, for all the answers. Most helpful.
With MSMQ you can also offload the work to another server very easy by changing the location of the queue to another machine rather then the db server.
By the way, as of SQL Server 2005 there is built in queue in the DB. Its called SQL server Service Broker.
See : http://msdn.microsoft.com/en-us/library/ms345108.aspx
Also see previous discussion.
If you have MSMQ expertise, it's a good option. If you know databases but not MSMQ, ask yourself if you want to become expert in another technology; whether your application is a critical one; and which you'd rather debug when there's a problem.
I have recently been investigating this myself so wanted to mention my findings. The location of the Database in comparison to your application is a big factor on deciding which option is faster.
I tested inserting the time it took to insert 100 database entries versus logging the exact same data into a local MSMQ message. I then took the average of the results of performing this test several times.
What I found was that when the database is on the local network, inserting a row was up to 4 times faster than logging to an MSMQ.
When the database was being accessed over a decent internet connection, inserting a row into the database was up to 6 times slower than logging to an MSMQ.
So:
Local database - DB is faster, otherwise MSMQ is.
Instead of making raw MSMQ calls, it might be easier if you implement your sevice as a queued COM+ component and make queued function calls from your client application. In the end, the asynchronous service still uses MSMQ in the background, but your code will be much clearer and easier to use.
I would probably go with MSMQ, or ActiveMQ myself. I would suggest (presuming that you are considering MSMQ you are using windows, with MS technology) looking into WCF, or if you are using MS-SQL 2005+ having a trigger that calls into .net code to run your processing.
Service Broker was introduced in SQL 2005 and it is designed to be very quick at handling messages as the process is relatively simple (I believe its roots were in triggers). If you are concerned about scalability, in SQL 2008 they have released an independant processing executable to separate the processing from SQL Server (in standard Service Broker, everything is controlled by the SQL Server instances).
I would definitely consider using Service Broker over MSMQ but this is dependant on your SQL Development/DBA resources and their knowledge.
Besides of Mitch's answer, some other scenarios:
1. each of your message have its own due date to trigger the action, this can be done through MQ as well, but in this case I prefer to store it into db as it is more controllable;
2. subscriber needs to filter message and then process a portion of it, this can be done by LINQ too, depends on how complex the filter is, the db approach is better because I can use linq to EF do complex query easily;
3. For deployment, i want fully automated deployment process so that DB is a better choice for me. I am not a big fan of manual configurations.