ActiveMQ replicated levelDB with zookeeper - activemq

I want to understand ZooKeeper's role in replicated LevelDB for an ActiveMQ broker.
About ZooKeeper election: how does ZooKeeper know which of the clients connected to it are ActiveMQ brokers competing to become master? Is there a particular key or configuration passed by all the brokers connecting to ZooKeeper which says that these (let's say 3) ActiveMQ brokers belong to the same environment and are competing to become master?
At what interval does a slave broker copy data from the master broker? Are there any corner cases where data might be lost?
Does ActiveMQ guarantee message ordering when using replicated LevelDB? I am talking about the case where re-election of the master happens while a producer is sending messages in sequence to the broker.
Thanks,
Anuj

By zkPath in the Zookeeper configuration and by broker name.
Each message is synced to a quorum of (nodes/2+1) brokers before the transaction completes. So there is no sync interval; it's synced in real time. The cluster will not function unless you have a quorum of brokers online, so there should be no data loss.
The messages are synced to a majority of the nodes in a synchronous fashion. At re-election, a node with the latest updates will be elected. Ordered messages should be no problem. However, it's generally problematic to rely critically on ordered messages in a message queueing system. As a rule of thumb, message order will only be complete under "happy days". Dead letters, multiple consumers and so forth may well mess up message order.
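To make the quorum arithmetic (nodes/2+1) concrete, here is a minimal Python sketch. The function names are made up for illustration and are not part of ActiveMQ.

```python
def quorum_size(nodes: int) -> int:
    """Smallest majority of a cluster: nodes/2 + 1 (integer division)."""
    return nodes // 2 + 1


def tolerated_failures(nodes: int) -> int:
    """How many brokers can be offline while a quorum is still reachable."""
    return nodes - quorum_size(nodes)


for n in (3, 4, 5):
    print(n, "brokers: quorum", quorum_size(n), "tolerates", tolerated_failures(n), "down")
```

With 3 brokers the quorum is 2, so one broker may be down. Going from 3 to 4 brokers raises the quorum to 3 but still tolerates only one failure, which is why odd cluster sizes are the usual choice.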

Related

Why would you run a messaging queue (eg RabbitMQ) cluster?

Overview
A RabbitMQ broker is a logical grouping of one or several Erlang
nodes, each running the RabbitMQ application and sharing users,
virtual hosts, queues, exchanges, bindings, and runtime parameters.
Sometimes we refer to the collection of nodes as a cluster.
Why would you do this? I understand it increases the durability of messages (if a node goes down, other queues still get the messages). But what about performance? How does a cluster improve performance? Won't all consumers/producers connect to the master node's queue anyway? If so, aren't we still getting traffic on a single node regardless? Do we put a load balancer in front so that traffic is directed at different nodes each time?
How does RabbitMQ cluster increase performance?
Well, right after that paragraph, the documentation states the following:
What is Replicated?
All data/state required for the operation of a RabbitMQ broker is
replicated across all nodes. An exception to this are message queues,
which by default reside on one node, though they are visible and
reachable from all nodes. To replicate queues across nodes in a
cluster, see the documentation on high availability (note that you
will need a working cluster first).
So, you would cluster to provide higher capacity in your RabbitMQ broker than a single node can provide alone. Note that clustering by itself is not a high-availability strategy.
Your assertion that message durability is increased is false, as message queues continue to reside on one broker (unless mirroring is used).
By default, contents of a queue within a RabbitMQ cluster are located on a single node (the node on which the queue was declared) [1]
Without mirroring, when that node goes down, messages on it will be lost. The cluster will put the queue onto a different node. RabbitMQ does not handle network partitions well, so this can be a bit of a problem.
"Aren't we still getting traffic on a single node regardless?" - if you only have one queue, then yes. However, a bigger question is "why would you run a message broker with only one queue?" Similarly, if you only create queues on one node, then you will still have one point of failure in the system.

Does Kafka handle network failure better than RabbitMQ?

We have been having the issues below with RabbitMQ and have been manually restarting the servers every weekend as a workaround.
Network partition detected
Mnesia reports that this RabbitMQ cluster has experienced a network partition. This is a dangerous situation. RabbitMQ clusters should not be installed on networks which can experience partitions.
We have gone through other popular posts on the topic e.g. here and here
Our network is not highly reliable and occasional blips are expected, but when it does come back up I would have expected the one node of the 4-node RabbitMQ cluster to rejoin the rest of the cluster, as is the case with the 4 Tomcat nodes installed on the same servers.
Although the nodes in a single partition continue to run independently, that doesn't seem like a graceful recovery from a failure in one node.
We didn't have much luck with rabbitmqctl commands such as rabbitmqctl cluster_status. It used to sporadically cause the RabbitMQ process to hang, which then needed a sudo kill.
We are at the point of evaluating a move to Kafka or any other message broker that handles network partitions well.
Any thoughts on how to avoid manual RabbitMQ restarts, or on Kafka's ability to handle such situations, are highly appreciated.
I think Kafka with replication should be able to handle network partitions quite easily, as long as the number of brokers partitioned away is lower than the replication factor of your topic (i.e. the consumers and producers can always reach at least 1 broker for the topics they're operating with).
To avoid backpressure in the clients while ZooKeeper discovers the partition and propagates the information to the producers and consumers, you may want to set a short ZK heartbeat interval (yes, you'll need ZK, and a cluster of it too, since you absolutely don't want your whole ZK cluster partitioned).
Fair warning though: using a cluster of Kafka brokers will drop the FIFO aspect of your message queue, which can be pretty disturbing if you're expecting the consumers to read messages in the same order the producers produced them, which you could expect with RabbitMQ.
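On the ordering point: as far as I know, Kafka keeps order within a single partition of a topic, not across partitions. The toy Python model below is not Kafka client code, and the modulo partitioner is an assumption for illustration only (real clients hash the key bytes). It shows what that means for messages that share a key.

```python
from collections import defaultdict


def partition_for(key: int, num_partitions: int) -> int:
    # Toy partitioner (assumption): same key always maps to the same partition.
    return key % num_partitions


def produce(messages, num_partitions):
    """Append each (key, value) message to the log of its partition."""
    logs = defaultdict(list)
    for key, value in messages:
        logs[partition_for(key, num_partitions)].append((key, value))
    return logs


# Two interleaved message streams, keyed 1 and 2.
messages = [(1, "a1"), (2, "b1"), (1, "a2"), (2, "b2"), (1, "a3")]
logs = produce(messages, num_partitions=2)

# Within one partition the per-key order is intact.
values_for_key_1 = [value for key, value in logs[partition_for(1, 2)]]
values_for_key_2 = [value for key, value in logs[partition_for(2, 2)]]
print(values_for_key_1, values_for_key_2)
```

A consumer reading both partitions sees each key's messages in order, but the interleaving between the two partitions is not fixed. If global FIFO matters, a single partition is the conservative choice in this model.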

ActiveMQ consumer connection differs from producer

The following is my ActiveMQ setup:
I have two AMQ brokers which are configured with failover.
I have 40 producers but only one consumer.
Now the problem:
From time to time, one of the producers loses its connection to the master broker. The failover reacts and the producer gets a new connection to the slave, which then receives the messages. So far so good. But the consumer does not have the problem; it still consumes messages from the master. It does not know that the slave also has some messages.
How can I solve the problem of losing the messages that are sent to the slave?
Thanks in advance
I would recommend you configure a network of brokers. That way, your brokers will be connected as well, and it no longer matters which broker your producers and consumers connect to - the messages will get propagated across the network.

rabbitMQ federation VS ActiveMQ Master/Slave

I am trying to set up a cluster of brokers that has the same features as a RabbitMQ cluster, but over a WAN (my machines are in different locations), so RabbitMQ clustering does not work.
I am looking at alternatives. RabbitMQ federation just backs up the messages to the downstream and cannot make sure both sides have exactly the same messages available at any time (the downstream still keeps old messages that were already consumed in the upstream).
How about ActiveMQ Master/Slave? I have found:
http://activemq.apache.org/how-do-distributed-queues-work.html
"queues and topics are all replicated between each broker in the cluster (so often to a master and maybe a single slave). So each broker in the cluster has exactly the same messages available at any time so if a master fails, clients failover to a slave and you don't loose a message."
My concern is whether it updates automatically so that Master and Slave always have the same messages, which would mean that messages consumed on the Master also disappear from the Slaves.
Thanks :)
ActiveMQ has various clustering features.
First there is High Availability, "Master/Slave". The idea is that several physical servers act as a single logical ActiveMQ broker. If one goes down, another takes its place without losing data. You can do that by sharing the message store (shared file system or shared JDBC), or you could set up a replicated cluster, which replicates reads/writes on the master down to all slaves (you need three or more servers). ActiveMQ uses LevelDB and Apache ZooKeeper to achieve this.
The other form of clustering available in ActiveMQ lets you distribute load and separate security over several logical brokers. Brokers are then connected in a network of brokers. By default, messages are passed on to a broker that has available consumers for them. However, there is a rich toolbox of features in ActiveMQ to tweak a network of brokers to do things such as always sending a copy of a message to a specific broker. It takes some messing with the more advanced features though (static network connectors and queue mirroring, maybe more).
Maybe there is a better way to meet your requirements, which are not really specified in the question?

ActiveMQ network of brokers with durable subscription topics

I have a little problem here with my sample JMS layout.
I have two brokers (A, B) on two machines, which are linked via network connector. The idea is that the producer can send to any broker and the consumer can listen to any broker and the topic to send to/receive from is available globally.
The topic has two durable subscriber clients (one on each machine) that both will process all the messages in the topic. I want it to be a durable subscription so that the processes won't lose any workload if a process has to be restarted. Both subscriber clients are configured with a failover broker URL, so that they first try to connect to their localhost broker and, if it is not available, to the other one. Failover of the clients seems to work, but I found a problem in the following situation:
Brokers 'A' and 'B' each have a subscriber client connected. The producer is sending to 'A'. Broker 'B' gets restarted. The client of 'B' registers the connection loss and switches to 'A'. 'B' comes up again and, because it had registered itself as a durable subscriber on 'A', it gets the message feed. It has no active durable subscriber now ('A' now has three, including 'B'), and messages pile up until it reaches its connection limits.
Is my configuration wrong? Is it possible what I've intended?
Cheers,
Kai
Are you running master-slave configuration?
Why do you want both brokers to have connected clients at the same time?
If you use a failover connection string (identifying both brokers in it), your consumers/producers will use the ActiveMQ failover implementation and will connect/reconnect to the active node when needed. I don't think having two active instances with active clients is a good idea, unless you are trying to duplicate your processes (in which case there will be no synchronization).
To make both nodes (master and slave) always have the same durable data, you need to persist your messages to the same place, accessible to both nodes. It can be a JDBC adapter connected to a single database instance (probably behind the cluster), or it can be a NAS with a shared network folder for KahaDB.
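For reference, a failover connection string of the kind mentioned above can be put together like this. This is a minimal Python sketch: the host names and the helper function are hypothetical, while the `failover:(...)` URI form and the `randomize` option come from the ActiveMQ failover transport.

```python
def failover_uri(brokers, randomize=False):
    """Build an ActiveMQ failover transport URI from (host, port) pairs.

    With randomize=false the client tries the brokers in the listed order,
    so the preferred (e.g. local) broker should come first.
    """
    uris = ",".join(f"tcp://{host}:{port}" for host, port in brokers)
    flag = "true" if randomize else "false"
    return f"failover:({uris})?randomize={flag}"


# Hypothetical hosts: prefer the local broker, fall back to the other machine.
url = failover_uri([("localhost", 61616), ("brokerB", 61616)])
print(url)
```

The resulting string is what you would hand to the client's connection factory as the broker URL.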