How to implement RabbitMQ high availability without using DRBD? - rabbitmq

I want to implement high availability for a RabbitMQ server. I have read the documentation RabbitMQ provides, and it uses DRBD. I don't want to use DRBD for shared storage. On my side, I have set up a two-node cluster and prepared a mirrored queue.
What else needs to be implemented for high availability? Please help.
Thanks

The HA documentation can be found on the RabbitMQ site at
http://www.rabbitmq.com/ha.html
with the main set up being described in
http://www.rabbitmq.com/clustering.html
This doesn't involve DRBD and is simply a guide on how to mirror queues across multiple servers.
I have implemented an HA RabbitMQ cluster based on the instructions in the above, so I can attest to their clarity.
If you have any specific questions regarding HA setup that aren't answered by the above, then I'm happy to answer them.

Related

Failover with Spring AMQP and RabbitMQ HA

There are multiple articles suggesting that a load balancer should be used in front of a RabbitMQ cluster.
However, there are also multiple references saying that Spring AMQP uses some
failover implementation, such as resetting the connection when the broker comes back to life.
I have several questions regarding this topic (given that those articles are more or less old and it's 2018 today)
When using Spring AMQP, is load balancing still required?
If load balancing is still suggested, how would I solve the affinity of a primary queue to its node? There would be a lot of traffic between cluster nodes, because a round-robin load balancer would hit the correct cluster node only 1/n of the time (a miss rate of 1-(1/n)).
Does Spring AMQP support some kind of topology awareness, which would allow it to consume from correct node?
There were some articles suggesting that clients should publish/consume to nodes respecting locality of queues. Does this still apply? How does this all fit together, given load balancing, Spring AMQP failover and CachingConnectionFactory?
Can anybody please provide answers to those topics and also provide relevant references, which would provide additional information for verification?
Thanks a lot
For each of your bullets:
A load balancer makes little sense with the default configuration of Spring AMQP, since it opens a single, long-lived connection that is shared across all consumers. In 2.0, you can configure the RabbitTemplate to use a separate connection; this is because using different connections for publishers and consumers is a recommended configuration. This will be the default in 2.1.
It might make sense to use a load balancer if you configure the connection factory to cache connections (instead of just channels) since, then, each component gets its own connection.
See next bullet.
See Queue Affinity and the LocalizedQueueConnectionFactory. It uses the management plugin to determine which node currently hosts the queue and connects to that. It will not work with a load balancer since it needs to connect to the actual node.
It is my understanding from several discussions that queue affinity is only needed in the most extreme environments and that, in most environments, the difference is immeasurable. However, environments/networks differ so much, YMMV so you may want to test. My general rule of thumb is to avoid premature optimization since the added complexity of the configuration may simply not be worth the benefit (and you may not have a problem in the first place).
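To put a number on the round-robin concern from the question, here is a minimal sketch in plain Python (it is not Spring AMQP or RabbitMQ code). It assumes the queue lives on exactly one of n nodes and that the load balancer cycles through the nodes strictly in order; the function names are made up for illustration.

```python
from fractions import Fraction


def hit_rate(n_nodes: int) -> Fraction:
    """Fraction of round-robin connections that land on the queue's own node."""
    if n_nodes < 1:
        raise ValueError("n_nodes must be at least 1")
    return Fraction(1, n_nodes)


def simulate_round_robin(n_nodes: int, queue_node: int, connections: int) -> float:
    """Hand out connections to nodes 0..n-1 in turn and measure the hit ratio."""
    hits = sum(1 for i in range(connections) if i % n_nodes == queue_node)
    return hits / connections


if __name__ == "__main__":
    for n in (2, 3, 5):
        # The miss rate is the share of connections that need inter-node forwarding.
        print(n, "nodes: hit", hit_rate(n), "miss", 1 - hit_rate(n))
```

So with 3 nodes, two out of three connections land on a node that has to forward to the queue's node. Whether that forwarding is measurable in your environment is exactly what the last bullet suggests testing before adding complexity.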

Is it a good way to run Kafka on Kubernetes?

I am running a large online application on k8s. The scale is roughly 500,000 daily active users.
The application inside k8s needs a messaging feature (Pub/Sub). There are these options:
Kafka
RabbitMQ
Redis
Kafka
It needs ZooKeeper and is usually run directly on the OS, since it depends heavily on disk I/O. So if I install it into the k8s cluster, how should I do that? Will the performance be worse?
And if I keep Kafka outside of the k8s cluster and connect to it from the application inside the cluster, how will that perform? They are in different layers; won't that be slow?
RabbitMQ
It's slower than Kafka, but for an application with 500,000 daily active users, is it good enough? If so, maybe it's a good choice.
Redis
It's another option, maybe the simplest one. But from what I have read online, it sometimes loses messages. If true, that's terrible.
So the most important question is: is running Kafka (along with ZooKeeper) on k8s a good idea for this use case?
Yes, running Kafka on Kubernetes is great. Check out this example: https://github.com/Yolean/kubernetes-kafka. It includes ZooKeeper and Kafka as StatefulSets.
PS. Running any of the services in your question on Kubernetes will be pleasant. You can Google the name of the service and "kubernetes" and find example manifests. Many examples here: https://github.com/kubernetes/charts.
For Kafka, you can find some suggestions here. Kubernetes 1.7+ supports local persistent volumes, which may be a good fit for a Kafka deployment.
You can also take a look at the following project:
https://github.com/EnMasseProject/barnabas
It's about running Kafka on Kubernetes and OpenShift. It supports deploying with StatefulSets backed by persistent volumes, or purely in memory (for development or testing purposes). It also supports deploying Kafka Connect and exposing Prometheus metrics.
Another simple configuration of Kafka/Zookeeper on Kubernetes in DigitalOcean with external access:
https://github.com/StanislavKo/k8s_digitalocean_kafka
You can connect to Kafka from outside of AWS/DO/GCE using the regular binary protocol. The connection is PLAINTEXT or SASL_PLAINTEXT (user/password).
The Kafka cluster is a StatefulSet, so you can scale it easily.

Real world example of Apache Helix, Zookeeper, Mesos and Erlang?

I am new to:
Apache ZooKeeper : ZooKeeper is a centralized service for maintaining configuration information, naming, providing distributed synchronization, and providing group services.
Apache Mesos : Apache Mesos is a cluster manager that simplifies the complexity of running applications on a shared pool of servers.
Apache Helix : Apache Helix is a generic cluster management framework used for the automatic management of partitioned, replicated and distributed resources hosted on a cluster of nodes.
Erlang Language : Erlang is a programming language used to build massively scalable soft real-time systems with requirements on high availability.
It sounds to me like Helix and Mesos are both useful as cluster management systems. How are they related to ZooKeeper? It would be better if someone could give me a real-world example of their usage.
I am curious to know how BOINC distributes tasks to its clients. Does it use any of the above technologies? (Forget about Erlang.)
I just need a brief view on it :)
Erlang was built by Ericsson, designed for use in phone systems. By design, it runs hundreds, thousands, or even 10s of thousands of small processes to handle tasks by sending information between them instead of sharing memory or state. This enables all sorts of interesting features that are great for high availability distributed systems such as:
Hot code reloading. Each process is paused, its relevant module code is swapped out, and it is resumed where it left off, so deploys can happen without restarting or causing significant interruption.
Easy distributed messaging and clustering. Sending a message to a local process or a remote one is fairly seamless in most instances.
Process-local GC. Garbage collection happens in each process independently instead of in a global stop-the-world event like Java's, which helps keep latency low.
Supervision trees and complex process hierarchy and monitoring/managing.
A few concrete real-world examples that make great use of Erlang would be:
MongooseIM: a highly performant and incredibly scalable distributed XMPP / chat server.
Riak: a distributed key/value store.
Mesos, on the other hand, you can think of as a platform for turning a datacenter of servers into a shared resource for teams and developers. Say that I, as a company, own a datacenter with 10,000 physical servers, and I have 1,000 engineers developing hundreds of services. Mesos is a good way to allow the engineers to deploy and manage services across that hardware without needing to worry about the servers directly. It's an abstraction layer on top of the physical servers that allows you to share and intelligently allocate resources.
As a user of Mesos, I might say that I have Service X. It's an executable bundle that lives in location Y. Each instance of Service X needs 4 GB of RAM and 2 cores. And I need 8 instances, which will be attached to a load balancer. You can specify this in configuration and deploy based on that config. Mesos will find hardware that has enough RAM and CPU capacity available to handle each instance of that service and start it running in each of those locations.
It can handle a lot of other more complex topics about the orchestration of them as well, but that's probably a bit in-depth for this :)
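The Service X example can be sketched as a toy placement loop. To be clear, this is not how Mesos schedules internally (Mesos works through resource offers made to frameworks); it is only a first-fit illustration in Python, and the Host class, host names and capacities are invented for the example.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Host:
    name: str
    free_ram_gb: int
    free_cores: int


def place_instances(hosts: List[Host], ram_gb: int, cores: int, count: int) -> List[str]:
    """First-fit: put each instance on the first host with enough free RAM and cores."""
    placements = []
    for _ in range(count):
        for host in hosts:
            if host.free_ram_gb >= ram_gb and host.free_cores >= cores:
                # Reserve the resources so the next instance sees the reduced capacity.
                host.free_ram_gb -= ram_gb
                host.free_cores -= cores
                placements.append(host.name)
                break
        else:
            # No host could take this instance (earlier reservations are not rolled back).
            raise RuntimeError("not enough capacity for all instances")
    return placements


# Hypothetical hardware pool; each Service X instance needs 4 GB of RAM and 2 cores.
pool = [Host("host-a", 8, 4), Host("host-b", 16, 8), Host("host-c", 16, 8)]
print(place_instances(pool, ram_gb=4, cores=2, count=8))
```

A real cluster manager also has to deal with failures, rescheduling, constraints and fairness between teams, which is the "more complex" part mentioned above.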
ZooKeeper's most common use cases are service discovery and configuration management. You can think of it, fundamentally, as a bit like a nested key-value store, where services can look at pre-defined paths to see where other services currently live.
A simple example is that I have a web service using a shared database cluster. I know a simple name for that database cluster and where the configuration for it lives in zookeeper. I can look up (or repeatedly poll) that path in zookeeper to check what the addresses of the active database hosts are. And on the other side, if I take a database node out of rotation and replace it with a new one, the config in zookeeper gets updated with the new address, and anything continually looking at it will detect this change and change where it's connected to.
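Here is a rough sketch of that lookup-and-react pattern. It uses a small in-memory stand-in written in Python, not a real ZooKeeper client, so everything in it (the ToyRegistry class, the path, the addresses) is an assumption made for illustration. One simplification worth knowing about: ZooKeeper's own watches fire once and have to be re-registered, while this toy keeps them registered.

```python
from typing import Callable, Dict, List


class ToyRegistry:
    """In-memory stand-in for a path -> value store with change notifications."""

    def __init__(self) -> None:
        self._data: Dict[str, List[str]] = {}
        self._watchers: Dict[str, List[Callable[[List[str]], None]]] = {}

    def set(self, path: str, value: List[str]) -> None:
        self._data[path] = list(value)
        # Notify everyone who is watching this path.
        for callback in self._watchers.get(path, []):
            callback(list(value))

    def get(self, path: str) -> List[str]:
        return list(self._data[path])

    def watch(self, path: str, callback: Callable[[List[str]], None]) -> None:
        self._watchers.setdefault(path, []).append(callback)


class WebService:
    """Reads the database hosts once, then follows changes through a watch."""

    def __init__(self, registry: ToyRegistry, path: str) -> None:
        self.db_hosts = registry.get(path)
        registry.watch(path, self._on_change)

    def _on_change(self, hosts: List[str]) -> None:
        self.db_hosts = hosts  # a real service would reconnect here


DB_PATH = "/services/db/hosts"  # hypothetical path
registry = ToyRegistry()
registry.set(DB_PATH, ["10.0.0.1:5432", "10.0.0.2:5432"])
service = WebService(registry, DB_PATH)

# A database node is taken out of rotation and replaced with a new one.
registry.set(DB_PATH, ["10.0.0.1:5432", "10.0.0.3:5432"])
print(service.db_hosts)
```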
A more complex use case for ZooKeeper is how Kafka uses it (or did at the time that I last used Kafka). Kafka has streams (topics, in Kafka's own terminology), and each stream has many shards (partitions). Each consumer of a stream uses ZooKeeper to save a checkpoint in each shard after it has read and processed up to a certain point in the stream. That way, if the consumer crashes or is restarted, it knows where to pick up in the stream.
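The checkpoint idea can be sketched without Kafka or ZooKeeper at all. In this toy Python version a plain dict plays the role of the checkpoint store and a list plays the role of one shard (a partition, in Kafka's terms). The function and parameter names are made up; it only shows the resume-from-checkpoint behaviour, not any real client API.

```python
from typing import Callable, Dict, List, Optional, Tuple

Checkpoints = Dict[Tuple[str, int], int]  # (consumer id, shard number) -> next offset


def consume(
    records: List[str],
    checkpoints: Checkpoints,
    consumer_id: str,
    shard: int,
    handler: Callable[[str], None],
    max_records: Optional[int] = None,
) -> int:
    """Process records from the saved checkpoint onward; return how many were handled."""
    start = checkpoints.get((consumer_id, shard), 0)
    processed = 0
    for offset in range(start, len(records)):
        if max_records is not None and processed >= max_records:
            break
        handler(records[offset])
        # Save the checkpoint only after the record was handled, so a crash
        # in between means the record is processed again, never skipped.
        checkpoints[(consumer_id, shard)] = offset + 1
        processed += 1
    return processed


shard_0 = ["a", "b", "c", "d"]
store: Checkpoints = {}
seen: List[str] = []

consume(shard_0, store, "consumer-1", 0, seen.append, max_records=2)  # then "crashes"
consume(shard_0, store, "consumer-1", 0, seen.append)  # restart resumes at offset 2
print(seen, store)
```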
I don't know about Mesos or the Erlang language, but this article might help you with Helix and ZooKeeper.
This article tells us:
Zookeeper is responsible for gluing all parts together where Helix is cluster management component that registers all cluster details (cluster itself, nodes, resources).
The article is about clustering in jBPM using Helix and ZooKeeper, but it will give you a basic idea of what Helix and ZooKeeper are used for.
And from most of the articles I have read online, it seems that ZooKeeper and Helix are used together.
Apache Zookeeper can be installed on a single machine or on a cluster.
It can be used to keep track of logs. It can provide various services on a distributed platform.
Storm and Kafka rely on Zookeeper.
Storm uses Zookeeper to store all state so that it can recover from an outage in any of its (distributed) component services.
Kafka queue consumers can use Zookeeper to store information on what has been consumed from the queue.

Are Activemq, Redis and Apache camel a right combination?

Are Activemq, Redis and Apache camel a right combination?
I am planning a high-performance, enterprise-level integration solution across multiple applications.
My objective is to make the solution
a. independent of the consumers' performance
b. easy to troubleshoot in case of any issue
c. highly available with failover support
d. able to handle 10k messages per second
Here I'm planning to have
a. a network of ActiveMQ brokers running on all app servers and storing the consumed messages in a Redis data store
b. from the Redis data store, the application can retrieve the messages through Camel endpoints
(a Camel endpoint is chosen so that the messages can be processed before reaching the app).
Also, can ActiveMQ be removed, leaving only Redis + Apache Camel? From discussion forums I see that Redis does most of what ActiveMQ does.
Could anyone advise on this technology stack?
ActiveMQ and Camel work great together and scale very well. It should be no problem to handle the load, given proper hardware.
Are you thinking about something like this?
Message producer App -> ActiveMQ -> Camel -> Redis
Message Consumer App <- Camel [some endpoint] <- Redis
Putting ActiveMQ in between is usually a very good way to achieve HA and load balancing and to make the solution elastic. Depending on your specific setup with machines etc., ActiveMQ can help in many ways to solve HA issues.
Removing ActiveMQ can be a good option if your apps use some protocol other than JMS/ActiveMQ messaging, e.g. HTTP, raw TCP or similar. Can you elaborate on how the apps will communicate with Camel? ActiveMQ, by default, supports transactions and guaranteed delivery, and you can live with a limited number of threads on the server, even for your heavy traffic. For other protocols, this might be a bit trickier to achieve. Without an HA layer (cluster) in ActiveMQ you need to set up Redis to handle HA in all aspects, which might be just as easy, but Redis is a bit memory hungry, so be aware of that.

WebLogic load balancing

I'm currently developing a project supported on a WebLogic clustered environment. I've successfully set up the cluster, but now I want a load-balancing solution (currently, only for testing purposes, I'm using WebLogic's HttpClusterServlet with round-robin load-balancing).
Is there any documentation that gives a clear comparison (with pros and cons) of the various ways of providing load-balancing for WebLogic?
These are the main topics I want to cover:
Performance (normal and on failover);
What failures can be detected and how fast is the failover recovery;
Transparency to failure (e.g., ability to automatically retry an idempotent request);
How well is each load-balancing solution adapted to various topologies (N-tier, clustering)
Thanks in advance for your help.
Is there any documentation that gives a clear comparison (with pros and cons) of the various ways of providing load-balancing for WebLogic?
It's not clear what kind of application you are building and what kind of technologies are involved. But...
You will find useful information in Failover and Replication in a Cluster and Load Balancing in a Cluster (also look at Cluster Implementation Procedures), but no real comparison between the different options, at least not to my knowledge. But the choice isn't that complex: 1. hardware load balancers will perform better than software load balancers, and 2. if you go for a software load balancer, then the WebLogic plug-in for Apache is the choice recommended (by BEA) for production. Actually, for web apps, it's pretty usual to put the static files on a web server and thus to use the Apache mod_wl plug-in. See the Installing and Configuring the Apache HTTP Server Plug-In chapter.
These are the main topics I want to cover:
Performance (normal and on failover): If this question is about persistent session, WebLogic uses in memory replication by default and this works pretty well with a relatively low overhead.
What failures can be detected and how fast is the failover recovery: It is unclear which protocols you're using. But see Connection Errors and Clustering Failover.
Transparency to failure (e.g., ability to automatically retry an idempotent request): Clarifying the protocols you are using would make answering easier. If this question is about HTTP requests, then see Figure 3-1 Connection Failover.
How well is each load-balancing solution adapted to various topologies (N-tier, clustering): The question is unclear and too vague (for me). But maybe have a look at Cluster Architectures.
Oh, by the way, another nice chapter that you must read: Clustering Best Practices.