RabbitMQ queue sharding and HA


I wonder whether it is possible to have a cluster with HA and sharded queues on the same machines.
I'll explain by example:
Machine1:
    Queues:
        Queue_001
        Queue_002
    HA:
        Copy of the queues of Machine2 (003 and 004)
Machine2:
    Queues:
        Queue_003
        Queue_004
    HA:
        Copy of the queues of Machine1 (001 and 002)
As you can see, if you take down one of those servers, the other server has all the information it needs to answer requests.
The advantages of an architecture like this are:
HA
Performance, since both machines serve data, in contrast to a setup where the replica is only a backup.
Is this architecture possible with RabbitMQ?
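To make the intended layout concrete, here is a small Python sketch of it. It is only a model of the placement described above, not RabbitMQ configuration; the function name `queues_served` and the "primary"/"mirror" labels are made up for illustration.

```python
# Toy model of the layout in the question: every queue has one primary
# machine and one mirror machine. Queue and machine names come from the
# question; everything else is hypothetical.
LAYOUT = {
    "Queue_001": {"primary": "Machine1", "mirror": "Machine2"},
    "Queue_002": {"primary": "Machine1", "mirror": "Machine2"},
    "Queue_003": {"primary": "Machine2", "mirror": "Machine1"},
    "Queue_004": {"primary": "Machine2", "mirror": "Machine1"},
}


def queues_served(layout, alive):
    """Map each machine in `alive` to the queues it serves.

    A queue is served by its primary if that machine is alive, otherwise
    by its mirror. Queues with no live copy are left out.
    """
    served = {machine: [] for machine in sorted(alive)}
    for queue, placement in sorted(layout.items()):
        for role in ("primary", "mirror"):
            machine = placement[role]
            if machine in alive:
                served[machine].append(queue)
                break
    return served


# Both machines up: the load is split between them.
print(queues_served(LAYOUT, {"Machine1", "Machine2"}))
# Machine1 down: Machine2 serves all four queues from its copies.
print(queues_served(LAYOUT, {"Machine2"}))
```

With both machines alive each one serves its own two queues; with one machine gone the survivor serves all four, which is the behaviour I am hoping to get.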

Related

Redis advantages of Sentinel and Cluster

I'm planning to create a highly available Redis cluster. After reading many articles about building a Redis cluster, I'm confused. So what exactly are:
the advantages of a Redis Sentinel Master1 Slave1 Slave2 cluster? Is it more reliable than a Redis multinode sharded cluster?
the advantages of a Redis multinode sharded cluster? Is it more reliable than a Redis Sentinel Master1 Slave1 Slave2 cluster?
Further questions about the Redis Sentinel Master1 Slave1 Slave2 cluster:
When I have 1 master and two slaves, and traffic keeps growing until this cluster is too small, how can I make the cluster bigger?
Further questions about the Redis multinode sharded cluster:
Why are there so many demos that run a cluster on a single instance but on different ports? That makes no sense to me.
When I have a cluster with 4 masters and 4 replicas, how can an application or a client be sure to write to the cluster? If Master1 and Slave1 die but my application always writes to the IP of Master1, it will not work anymore. Which solutions are out there to implement a sharded cluster so that applications can reach it through a single IP and port? Keepalived? HAProxy?
If I use e.g. Keepalived for a 4-master setup, doesn't that cancel out the different masters?
Furthermore, I need to understand why the multinode cluster is only for cases where more data needs to be written than memory is available. Why? To me a multi-master setup sounds good for scalability.
Is it right that the sharded cluster setup does not support multikey operations when the cluster is not in caching mode?
I'm unsure whether these two solutions are the only ones. Hopefully you can help me understand the architectures of Redis. Sorry for so many questions.

I will try to answer some of your questions, but first let me describe the different deployment options of Redis.
Redis has three basic deployments: single node, Sentinel and Cluster.
Single node - The basic solution, where you run a single Redis process.
It is neither scalable nor highly available.
Redis Sentinel - A deployment that consists of multiple nodes, where one is elected as master and the rest are slaves.
It adds high availability, since in case of master failure one of the slaves is automatically promoted to master.
It is not scalable, since the master node is the only node that accepts writes.
You can configure the clients to direct read requests to the slaves, which takes some of the load off the master. However, in this case slaves might return stale data, since they replicate the master asynchronously.
Redis Cluster - A deployment that consists of at least 6 nodes (3 masters and 3 slaves), where data is sharded between the masters. It is highly available, since in case of master failure one of its slaves is automatically promoted to master. It is scalable, since you can add more nodes and reshard the data so that the new nodes take some of the load.
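To illustrate the sharding and resharding idea, here is a rough Python sketch. Redis Cluster divides the key space into 16384 hash slots and each master owns a subset of them. The even, contiguous split below and the names `split_slots`, `owner_of` and `master1`..`master4` are assumptions made for the example, not what Redis tooling necessarily produces.

```python
TOTAL_SLOTS = 16384  # fixed number of hash slots in Redis Cluster


def split_slots(masters):
    """Assign contiguous, near-equal slot ranges to the given masters."""
    base, extra = divmod(TOTAL_SLOTS, len(masters))
    ranges, start = {}, 0
    for i, name in enumerate(masters):
        size = base + (1 if i < extra else 0)
        ranges[name] = range(start, start + size)
        start += size
    return ranges


def owner_of(slot, ranges):
    """Return the master whose range contains `slot`."""
    for name, slots in ranges.items():
        if slot in slots:
            return name
    raise ValueError(f"slot {slot} is not assigned")


before = split_slots(["master1", "master2", "master3"])
after = split_slots(["master1", "master2", "master3", "master4"])

# Slot range and slot count per master after adding a fourth master.
for name, slots in after.items():
    print(name, slots.start, slots[-1], len(slots))

# Number of slots whose owner differs between the two layouts.
moved = sum(owner_of(s, before) != owner_of(s, after) for s in range(TOTAL_SLOTS))
print("slots that change owner:", moved)
```

Recomputing the whole split like this moves more slots than strictly necessary. An actual reshard only migrates the slots you choose to hand to the new node, but the principle is the same: the new master takes over part of the slot space and therefore part of the load.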
So to answer your questions:
The advantages of Sentinel over Redis Cluster are:
Hardware - You can set up a fully working Sentinel deployment with three nodes. Redis Cluster requires at least six nodes.
Simplicity - It is usually easier to configure and maintain.
The advantage of Redis Cluster over Sentinel is that it is scalable.
The decision between the two deployments should be based on your expected load.
If your write load can be handled by a single Redis master node, you can go with a Sentinel deployment.
If one node cannot handle your expected load, you must go with a Cluster deployment.
A Redis Sentinel deployment is not scalable, so making the cluster bigger will not improve your write performance. The only exception is that adding slaves can improve your read performance (if you direct read requests to the slaves).
Running Redis Cluster on a single node with multiple ports is only for development and demo purposes. In production it gives you neither high availability nor extra capacity.
In a Redis Cluster deployment, clients should have network access to all nodes (and not only Master1). This is because data is sharded between the masters.
If a client tries to write data to Master1 but Master2 is the owner of the data, Master1 will return a MOVED redirection to the client, guiding it to send the request to Master2.
You cannot have a single HAProxy in front of all Redis nodes.
Same answer as for the previous question: in a Cluster deployment, clients should connect directly to all masters and slaves, not through a load balancer or Keepalived.
I am not sure I fully understood your question, but Redis Cluster is the only Redis deployment that is scalable.
A Redis Cluster deployment supports multikey operations only when all keys belong to the same hash slot (and therefore the same node). You can use "hash tags" to force multiple keys to be handled by the same master.
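To show how keys map to masters and what a redirection looks like, here is a minimal Python sketch. The slot of a key is CRC16 (XMODEM variant) of the key modulo 16384, and `binascii.crc_hqx` with an initial value of 0 computes that CRC. The function names `key_hash_slot` and `parse_moved`, the example keys and the address are made up for the illustration; a real cluster-aware client library does this for you.

```python
import binascii

TOTAL_SLOTS = 16384


def key_hash_slot(key: bytes) -> int:
    """Hash slot of a key: CRC16 (XMODEM variant) modulo 16384.

    If the key contains a non-empty "{...}" section, only the text between
    the first "{" and the next "}" is hashed (the hash tag).
    """
    start = key.find(b"{")
    if start != -1:
        end = key.find(b"}", start + 1)
        if end != -1 and end != start + 1:
            key = key[start + 1:end]
    return binascii.crc_hqx(key, 0) % TOTAL_SLOTS


def parse_moved(error: str):
    """Parse a redirection such as 'MOVED 3999 127.0.0.1:6381'."""
    kind, slot, address = error.split()
    if kind != "MOVED":
        raise ValueError(f"not a MOVED redirection: {error!r}")
    host, _, port = address.rpartition(":")
    return int(slot), host, int(port)


# Keys sharing the hash tag "user:1000" always land in the same slot,
# so multikey operations on them are allowed.
print(key_hash_slot(b"{user:1000}.following"))
print(key_hash_slot(b"{user:1000}.followers"))

# Without a hash tag the two keys are hashed independently.
print(key_hash_slot(b"user:1000.following"))
print(key_hash_slot(b"user:1000.followers"))

# What a client extracts from a redirection before retrying elsewhere.
print(parse_moved("MOVED 3999 127.0.0.1:6381"))
```

A client that receives MOVED is expected to retry the command on the node named in the reply and to refresh its slot-to-node map, which is why it needs network access to every node.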
Some good links that can help you understand it better:
Description of the different Redis deployment options: https://blog.octo.com/en/what-redis-deployment-do-you-need
Detailed explanation on the architecture of Redis Cluster: https://blog.usejournal.com/first-step-to-redis-cluster-7712e1c31847

Does Kafka handle network failure better than RabbitMQ?

We have been having the issues below with RabbitMQ and have been manually restarting the servers every weekend as a workaround.
Network partition detected
Mnesia reports that this RabbitMQ cluster has experienced a network partition. This is a dangerous situation. RabbitMQ clusters should not be installed on networks which can experience partitions.
We have gone through other popular posts on the topic, e.g. here and here.
Our network is not highly reliable and occasional blips are expected, but when it comes back up I would have expected the one partitioned node of the 4-node RabbitMQ cluster to rejoin the rest of the cluster, as is the case with the 4 Tomcat nodes installed on the same servers.
Although the nodes on each side of the partition continue to run independently, that does not seem like a graceful recovery from the failure of one node.
We didn't have much luck with rabbitmqctl commands such as rabbitmqctl cluster_status: they sporadically caused the RabbitMQ process to hang, which then needed a sudo kill.
We are at the point of evaluating a move to Kafka or any other message broker that handles network partitions well.
Any thoughts on how to avoid manual RabbitMQ restarts, or on Kafka's ability to handle such a situation, are highly appreciated.

I think Kafka with replication should be able to handle network partitions quite easily, as long as the number of brokers partitioned away is lower than the replication factor of your topic (i.e., the consumers and producers can always reach at least 1 broker for the topics they're operating on).
To avoid backpressure in the clients while ZooKeeper discovers the partition and propagates the information to the producers and consumers, you may want to set a short ZK heartbeat interval (yes, you'll need ZK, and a cluster of it too, since you absolutely don't want your whole ZK cluster partitioned).
Fair warning though: a topic spread over several partitions across a cluster of Kafka brokers loses the global FIFO property of your message queue (ordering is only guaranteed within a partition), which can be pretty disturbing if you expect consumers to read messages in the same order the producers sent them, as you could with RabbitMQ.
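The ordering caveat can be illustrated with a small Python model. It is not Kafka client code: `NUM_PARTITIONS`, `partition_for` and `produce` are hypothetical, and `zlib.crc32` is only a stand-in for the hash Kafka's default partitioner uses. The point is that messages with the same key go to the same partition and keep their relative order, while nothing orders messages across partitions.

```python
import zlib
from collections import defaultdict

NUM_PARTITIONS = 3  # hypothetical topic with three partitions


def partition_for(key: str) -> int:
    """Pick a partition from the message key.

    zlib.crc32 is a stand-in hash for this sketch; Kafka's own default
    partitioner uses a different hash function.
    """
    return zlib.crc32(key.encode("utf-8")) % NUM_PARTITIONS


def produce(messages):
    """Append (key, value) pairs to per-partition logs, in send order."""
    logs = defaultdict(list)
    for key, value in messages:
        logs[partition_for(key)].append((key, value))
    return logs


sent = [
    ("order-1", "created"),
    ("order-2", "created"),
    ("order-1", "paid"),
    ("order-2", "cancelled"),
    ("order-1", "shipped"),
]
logs = produce(sent)

# Each partition log preserves the send order of the messages it holds.
for partition, log in sorted(logs.items()):
    print(partition, log)
```

So if per-entity ordering is enough for you (all events of one order, one user, and so on), choosing a suitable message key gets most of the FIFO behaviour back. If you need a single global order, you would be limited to one partition.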

RabbitMQ clustering use case w/o HA

This is a use case question on RabbitMQ clustering. In the past, I have clustered RabbitMQ to make queues highly available (HA). I understand you can cluster RabbitMQ nodes without making the queues HA, but why would you do that? From a message consumer's point of view, clustering in itself buys you nothing unless the queues are made HA (or so I feel). What use cases can you cite for a non-HA RabbitMQ cluster?

With more servers you can get more throughput, accept more connected clients and so on. A client connected to any node of a non-HA cluster can see the resources on all nodes in the cluster, regardless of where they were declared.

Multiple broker machine rabbitmq configuration, how does HA work?

I'm trying to figure out how HA (high availability queues) works.
My current configuration is: every machine runs multiple Celery workers and points to itself as the broker. Each machine can do this, rather than pointing at one broker machine, because of HA; this way there is less load on any one machine, as all of them are brokers and have copies of the same queue.
My question is: is the logic above correct? Or do all workers need to point to one broker machine regardless of HA?

If you have looked at HA and clustering and have made sure that the queues are mirrored across the nodes, then what you are doing should be fine. It may be a tad inefficient, though, to run a broker on every server where you run your workers.
The other option is to run your queues on a few servers for HA and have the servers that run the workers point to them. Since a Celery worker is typically configured with a single broker URL, you may need to work around that, for example with a load balancer that all workers point to. This is the best of what I've come to understand over the past few years about RabbitMQ HA for Celery.
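As a possible alternative to the load balancer: as far as I know, newer Celery versions accept a list of broker URLs and fail over between them when a connection is lost. A configuration sketch could look like this; the hostnames and credentials are placeholders and not values from the question.

```python
# celeryconfig.py: a configuration sketch. Hostnames and credentials are
# placeholders; replace them with the nodes of your RabbitMQ cluster.
broker_url = [
    "amqp://user:password@rabbit1:5672//",
    "amqp://user:password@rabbit2:5672//",
    "amqp://user:password@rabbit3:5672//",
]

# How the next broker is chosen after a connection failure.
broker_failover_strategy = "round-robin"
```

Note that this only gives connection failover: a worker talks to one broker at a time and moves to the next one if that connection fails. It does not spread a single worker's traffic over all brokers the way a load balancer might.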

rabbitMQ federation VS ActiveMQ Master/Slave

I am trying to set up a cluster of brokers that should have the same features as a RabbitMQ cluster, but over a WAN (my machines are in different locations), so a RabbitMQ cluster does not work.
I am looking at alternatives. RabbitMQ federation just backs up the messages in the downstream and cannot guarantee that both sides have exactly the same messages available at any time (the downstream still keeps old messages that were already consumed in the upstream).
How about ActiveMQ Master/Slave? I have found:
http://activemq.apache.org/how-do-distributed-queues-work.html
"queues and topics are all replicated between each broker in the cluster (so often to a master and maybe a single slave). So each broker in the cluster has exactly the same messages available at any time so if a master fails, clients failover to a slave and you don't loose a message."
My question is whether it automatically keeps Master and Slave in sync so that they always have the same messages, meaning that messages consumed on the Master also disappear from the Slaves.
Thanks :)

ActiveMQ has various clustering features.
First there is High Availability, "Master/Slave". The idea is that several physical servers act as a single logical ActiveMQ broker. If one goes down, another takes its place without losing data. You can do that by sharing the message store (shared file system or shared JDBC), or you can set up a replicated cluster, which replicates reads/writes on the master down to all slaves (you need three or more servers). ActiveMQ uses LevelDB and Apache ZooKeeper to achieve this.
The other form of clustering available in ActiveMQ lets you distribute load and separate security over several logical brokers. The brokers are then connected in a network of brokers. By default, messages are passed to the broker that has available consumers for them. However, ActiveMQ has a rich toolbox of features for tweaking a network of brokers to do things such as always sending a copy of a message to a specific broker. It takes some work with the more advanced features, though (static network connectors and queue mirroring, maybe more).
Maybe there is a better way to meet your requirements, which are not really specified in the question?
