How to use Kubernetes replication controllers to replicate message-based services - RabbitMQ

We usually use message passing to communicate with decoupled services. This makes service discovery a non-issue, because (with AMQP in RabbitMQ, for instance) you can use the broker's routing capability to dispatch messages to the queues that feed the correct services. Load balancing is also handled by the message broker.
Enter Kubernetes.
The use case that is usually laid out when talking about service replication and re-spawning failing services is one where clients use an active protocol like HTTP to contact a service, even if that service handles requests asynchronously. In this context, it is a natural fit to have replication controllers that manage a group of service instances, with a single entry point to load-balance between them.
I like Kubernetes' intuitive concepts, like rolling deployments, but how do you control these beasts when they don't have an HTTP interface?
UPDATE:
I am not trying to set up a cluster of message brokers. I am looking at message consumers as services. Service clients don't connect directly to the services; they send messages to the message broker. The message broker acts as a load balancer of sorts and dispatches the messages to the subscribed queue consumers. These consumers implement the service.
My question gravitates around the fact that most usage patterns in demos handle services that are called via HTTP, and Kubernetes does a good job here of creating a service proxy and a replication controller for them. Is it possible to create replication controllers for my kind of service, which has no HTTP interface per se, and still get all the benefits of rolling updates and a minimum number of instances?
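To make the pattern concrete, here is a minimal sketch of such a consumer service in TypeScript with amqplib (the queue name, URL, and handler are hypothetical); note that it dials out to the broker and exposes no inbound port at all:

```typescript
// A message-driven "service": each replica runs this same process and the
// broker load-balances messages across all consumers of the shared queue.
// Sketch only - queue name, URL, and handler are hypothetical.
import amqp from "amqplib";

async function main(): Promise<void> {
  const conn = await amqp.connect(process.env.AMQP_URL ?? "amqp://localhost");
  const ch = await conn.createChannel();

  // Durable shared queue: competing consumers give you broker-side
  // load balancing with no service proxy in front.
  await ch.assertQueue("orders.process", { durable: true });
  await ch.prefetch(1); // at most one unacked message per replica

  await ch.consume("orders.process", (msg) => {
    if (msg === null) return; // consumer was cancelled
    console.log("handling", msg.content.toString());
    // ...the actual work of the service goes here...
    ch.ack(msg);
  });
}

main().catch((err) => {
  console.error(err);
  process.exit(1); // crash loudly so the replication controller restarts us
});
```

Since the process initiates its own connection to the broker, the only Kubernetes features it really needs are restart and replica scaling, not a service proxy.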

I'm not sure I entirely understand the question. Are you asking how to use RabbitMQ with Kubernetes? Or how to set up a RabbitMQ cluster (https://www.rabbitmq.com/clustering.html)? Or how rolling updates interact with RabbitMQ? Or something else?
I think you should be able to create one service and one replication controller per server, and then use the service DNS names in the cluster configuration file. This is also the current approach used to run ZooKeeper. We have a long-standing TODO to make this less verbose (https://github.com/GoogleCloudPlatform/kubernetes/issues/260), but the current approach should be straightforward. You do lose the ability to use a single kubectl rolling-update command to update the cluster, but it's also straightforward to update the instances individually.

Related

Load balancing with multi-broker ActiveMQ Artemis instances

I need your help suggesting how best I can achieve load balancing using the below diagram. Here I am trying to create two machines with master brokers, expecting that the consumer/publisher application will use one common (load-balanced) URL, so that I don't expose the individual VM host info and port; the load balancer should take care of the routing.
This is typically what we do with the help of an F5 or HTTP load balancer. I'm just wondering whether this can be achieved with ActiveMQ, and whether it's advisable?
On the other side, I also tried configuring WebLogic this way to consume data from an ActiveMQ queue:
failover://(tcp://localhost:61616,tcp://localhost:61617)?randomize=true
but this does not help; WebLogic does not seem to understand this format.
Messaging connections are stateful. They are not stateless like HTTP connections, and therefore cannot be load-balanced in the same way as HTTP connections. It may be possible to configure an F5 to deal with stateful messaging connections, but I can't say for sure. I'm not an expert on F5.
Both the ActiveMQ Artemis broker itself as well as the JMS client shipped with the broker have load-balancing functionality built in. There's too much to cover here so I recommend you review the clustering documentation for the relevant details.
You might also try using the broker balancer feature. It's currently experimental, but it should be ready to use in the 2.21.0 release coming in the March/April time-frame. It can act like an F5 for your messaging connections, but it can do some more intelligent things like always sending certain clients to the same node which can facilitate certain use-cases which are not possible in a traditional cluster.
The URL failover://(tcp://localhost:61616,tcp://localhost:61617)?randomize=true which you are using is for the OpenWire JMS client shipped with ActiveMQ 5.x. If you're using the core JMS client shipped with ActiveMQ Artemis, then you should use a URL like this instead:
(tcp://localhost:61616,tcp://localhost:61617)?ha=true

Load balancing WebSocket with Redis and RabbitMQ

Consider a small chat server. In this server, the actual processing of messages is done by nodes of a service called "chat". Communications of this service, along with a "user" service, are aggregated by a "gateway" service in front, which is the only service that actually communicates with users and is in charge of passing received requests to the other services via the RabbitMQ channel they share.
In a system designed like this, each user is connected to one of the instances of the "gateway" service, and when sending and receiving messages they indirectly communicate with the private "chat" or "user" services behind it. To load-balance this, we have an Nginx reverse proxy on the edge that distributes requests to the different "gateway" instances. But since the WebSocket connection is real-time, "chat" instances should also be able to send messages to the right "gateway" instance in charge of a specific user (for user-specific messages) and to all "gateway" instances (for site-wide messages). This is a problem, since with RabbitMQ I don't believe we can target a specific subscriber, and even if we could, we don't know which instance a specific user is connected to right now.
Therefore, since we are using Socket.io for the WebSocket connection, I am thinking of adding a new Redis node to the stack to allow this communication between different instances of the "gateway" service. This is directly supported by Socket.io, works all right, and removes all sorts of limitations imposed by RabbitMQ. However, we are still using RabbitMQ to route a message from a "chat" instance to a "gateway" instance; the message then propagates through the Redis service and, once the right "gateway" instance with access to the user is found, is delivered to them.
This adds unnecessary lag to user-specific outbound messages. So here I am asking if anyone has a better idea of how this problem should be approached and how to decrease this lag.
Personally, I have the idea of adding Socket.io to the "chat" services (with no client access) and using its backend to send the message directly to the Redis store, so that the "gateway" instance connected to the user can route it to them directly, skipping the whole RabbitMQ hop for this type of message.
It might be important to mention that none of these services are here just to do this specific thing, RabbitMQ is heavily used for communication between different services acting as the message broker and the "gateway" service works with multiple other services for data aggregation, authentication and data validation and transformation. The above example was a simplified version of the problem at hand with the minimum number of moving parts that I could easily describe here.
Edit: To send messages directly to the Socket.io Redis store without loading the whole Socket.io library, the following library can apparently be used:
https://github.com/socketio/socket.io-redis-emitter
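For illustration, a minimal sketch of that direct path in TypeScript (room and event names are hypothetical): a "chat" instance emits through the Redis store, and whichever "gateway" instance holds the user's socket delivers the message, skipping RabbitMQ entirely:

```typescript
// Emit to WebSocket clients from a "chat" instance without running a full
// Socket.io server. Sketch only - room and event names are hypothetical.
import { createClient } from "redis";
import { Emitter } from "@socket.io/redis-emitter";

async function main(): Promise<void> {
  const redisClient = createClient({ url: "redis://localhost:6379" });
  await redisClient.connect();

  const emitter = new Emitter(redisClient);

  // Assumes each gateway joins a user's socket to a per-user room on
  // connect, so any backend instance can target that user directly:
  emitter.to("user:42").emit("private-message", { from: "alice", text: "hi" });

  // Site-wide messages simply go to every connected client:
  emitter.emit("announcement", { text: "maintenance at midnight" });
}

main().catch(console.error);
```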

Health check using a message broker

I'm currently running microservices in my company. They are not API servers, just processes that communicate with each other, and that communication is implemented with RabbitMQ.
Now I'm trying to implement a health checker to know if a server has restarted or crashed.
But I'm only familiar with health checks that work by calling a specific API on the server. Our services aren't API servers, so they don't expose any ports to query. And I also don't want to add an API server just to implement a health checker.
So I'm searching for use cases where health checks are implemented by sending messages (health-check signals) to the health checker via a message broker such as RabbitMQ, instead of using APIs.
Does anyone have some ideas?
Sounds like an obvious and easy mechanism for a system like yours that already relies on message queuing. Implement any architecture you want, from publishing specific messages to each service - either on a single exchange where every service (as client) looks for itself as the topic, or on an exchange per service - or you could simply have an exchange that's read by the health-check service, with all services emitting messages periodically (dead-man's-switch style) to that exchange, so that service just makes sure it hears from everyone once in a while.
Consider also using the RabbitMQ event exchange at your health-check service, so it can keep track of service connects/disconnects on the channel each service uses to talk to the exchange. Channels are supposed to stay up all the time, so a disconnect indicates trouble of some kind - especially if it wasn't preceded by a message from the service (as client) indicating a normal going-down event. In other words, as a health "protocol", instead of being polled by a health service, each microservice would be proactive about sending "coming up", "ready", "healthy" (periodically), and "going down" messages to the health service.
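As an illustration of the proactive half of that protocol, here is a minimal sketch using amqplib (the exchange name, routing keys, and interval are hypothetical):

```typescript
// Each microservice periodically publishes a heartbeat to a shared health
// exchange. Sketch only - exchange name, routing keys, and the 10s interval
// are hypothetical choices, not a fixed convention.
import amqp from "amqplib";

const SERVICE = process.env.SERVICE_NAME ?? "billing";

async function main(): Promise<void> {
  const conn = await amqp.connect(process.env.AMQP_URL ?? "amqp://localhost");
  const ch = await conn.createChannel();
  await ch.assertExchange("health", "topic", { durable: false });

  const send = (status: string) =>
    ch.publish(
      "health",
      `${SERVICE}.${status}`,
      Buffer.from(JSON.stringify({ service: SERVICE, status, at: Date.now() }))
    );

  send("ready");
  const timer = setInterval(() => send("healthy"), 10_000);

  // Announce a clean shutdown, so the health-check service can tell a
  // normal going-down event apart from a crash (heartbeats just stopping).
  process.on("SIGTERM", async () => {
    clearInterval(timer);
    send("going-down");
    await ch.close();
    await conn.close();
    process.exit(0);
  });
}

main().catch(console.error);
```

The health-check service would bind a queue to the same exchange with a wildcard routing key (e.g. #) and alert when a service's periodic "healthy" message stops arriving.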
As a general comment: In my opinion message queues are very much underutilized. There are many use cases they're more appropriate for than other techniques (e.g., more popular techniques like REST over HTTP). They provide distinct benefits which are built-in to the message queuing/message broker concept which you might very well otherwise need to provide for yourself for your use case (or use a "framework" which has provided it). I'd always consider the role - all the roles! - of a message broker in a system architecture and use it where it fits.

How to combine exchange-to-exchange bindings and federation in RabbitMQ?

I have a design for a RabbitMQ topology, but recently learned that RabbitMQ federation ignores messages that aren't "directly published" to the upstream exchange. This is a problem, because I am using a combination of exchange-to-exchange bindings and federation, so my setup isn't working.
Essentially, our setup is to have messages flowing into one exchange on an "inbound" server, federated to an exchange on a "routing" server, which is bound to another exchange on a routing server, which is federated to an "outgoing" server (which is where clients create queues and bind them). The reasoning behind the exchange-to-exchange binding is to force the routing to happen there, instead of allowing it to happen all the way upstream as would occur without that link. For load reasons, we can't afford for the routing to happen upstream in the "inbound" servers.
Is there a way to re-publish messages in the routing server so federation picks them up, or something to that effect? Is there something other than federation I should use in this topology?
Yes, the shovel plugin allows you to do just that. It consumes messages (from a queue, typically one bound to the source exchange) and re-publishes them to another exchange, and the source and destination can be on the same node or on different nodes.
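For illustration, on RabbitMQ versions with dynamic shovels (3.3 and later, shovel plugin enabled), such a re-publishing shovel can be declared at runtime through the management HTTP API, with no restart; every host, name, and credential in this sketch is hypothetical:

```typescript
// Declare a dynamic shovel on the routing server that consumes what arrives
// on one exchange and re-publishes it to the federated exchange, so the
// messages count as "directly published" there. Sketch only - every host,
// vhost, exchange name, and credential below is hypothetical.
const shovel = {
  value: {
    "src-uri": "amqp://routing-server",
    "src-exchange": "routed",         // shovel binds a temporary queue here
    "src-exchange-key": "#",
    "dest-uri": "amqp://routing-server",
    "dest-exchange": "outgoing-feed", // federation picks up these publishes
  },
};

// PUT /api/parameters/shovel/{vhost}/{name} on the management API ("%2f"
// is the default vhost "/").
fetch("http://routing-server:15672/api/parameters/shovel/%2f/republish", {
  method: "PUT",
  headers: {
    Authorization: "Basic " + Buffer.from("guest:guest").toString("base64"),
    "Content-Type": "application/json",
  },
  body: JSON.stringify(shovel),
}).then((res) => console.log("shovel declared:", res.status));
```

The same definition can also be supplied with rabbitmqctl set_parameter shovel, or as a static shovel in the configuration file on older versions.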

When to use RabbitMQ shovels and when Federation plugin?

For the company I work for we would like to use RabbitMQ as our main message bus. The idea we have is that every application uses its own vhost for internal communication, and that via the shovel or federation plugin we would make it possible to share certain types of events across multiple vhosts (maybe even multiple non-clustered machines).
We chose a vhost per application to separate internal communication from public events and to keep the security adjustable per application.
Based on the information published on the RabbitMQ website, I can't figure out when to choose shovels and when to choose the federation plugin.
RabbitMQ has the following explanation of when to use what:
Typically you would use the shovel to link brokers across the internet when you need more control than federation provides.
What is the fine-grained control in shovels that I am missing when I choose federation?
At this moment I think I would prefer the federation plugin, because I could automate the inter-vhost communication via the REST API it provides.
In the case of shovels, I would need to change the shovel configuration and restart the RabbitMQ instance every time we want to share an event between vhosts. Are my thoughts correct about this?
We are currently running RMQ on Windows with clients connecting from .NET. In the near future Java/Perl/PHP clients will join.
To summarize my questions:
1. What is the fine-grained control in shovels that I am missing when I choose federation?
2. Is it correct that the only way to change the inter-vhost communication when I use shovels is by changing the config file and rebooting the instance?
3. Does the setup (vhost per application) make sense, or am I missing the point completely?
Shovels and federation provide different means to forward messages from one RabbitMQ node to another.
Federated Exchange
With a federated exchange, queues can be bound to the exchange on the upstream (source) node as usual. In addition, an exchange on the downstream (destination) node will receive a copy of the messages that are published to the upstream exchange.
Federated exchanges are similar to exchange-to-exchange bindings, in that they can (optionally) subscribe to a limited set of messages from an upstream exchange.
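For example, here is a sketch of how a federated exchange is typically wired up through the management HTTP API (federation plugin enabled; hosts, credentials, and the naming pattern are hypothetical): an upstream definition plus a policy that applies it:

```typescript
// Wire up a federated exchange: define an upstream, then apply a policy
// that federates matching exchanges. Sketch only - hosts, credentials,
// vhost, and the "fed." naming pattern are hypothetical.
const api = "http://downstream-host:15672/api";
const headers = {
  Authorization: "Basic " + Buffer.from("guest:guest").toString("base64"),
  "Content-Type": "application/json",
};

async function main(): Promise<void> {
  // 1. Tell the downstream node where its upstream (source) node lives.
  await fetch(`${api}/parameters/federation-upstream/%2f/origin`, {
    method: "PUT",
    headers,
    body: JSON.stringify({ value: { uri: "amqp://upstream-host" } }),
  });

  // 2. Federate every exchange whose name starts with "fed." from there.
  await fetch(`${api}/policies/%2f/federate-fed`, {
    method: "PUT",
    headers,
    body: JSON.stringify({
      pattern: "^fed\\.",
      "apply-to": "exchanges",
      definition: { "federation-upstream-set": "all" },
    }),
  });
}

main().catch(console.error);
```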
Federated Queue
(NOTE: These are new in RabbitMQ 3.2.x)
With a federated queue, consumers can be connected to the queue on both the upstream (source) and downstream (destination) nodes.
In essence the downstream queue is a consumer on the upstream queue, with the expectation that there will be additional downstream consumers that process the messages in the same manner as a consumer attached to the upstream queue.
Any messages consumed by the downstream (federated) queue will not be available for consumers on the upstream queue.
Use Case:
If consumers are being migrated from one node to another, a federated queue will allow this to happen without messages being missed, or processed twice.
Use Case (from the RabbitMQ docs):
The typical use would be to have the same "logical" queue distributed over many brokers. Each broker would declare a federated queue with all the other federated queues upstream. (The links would form a complete bi-directional graph on n queues.)
Shovel
Shovels, on the other hand, attach an "upstream" queue to a "downstream" exchange. (I place the terms in quotes because the shovel documentation does not describe the nodes with the same semantics as the federation documentation.)
The shovel consumes the messages from the queue and sends them to the exchange on the destination node. (NOTE: While not normally discussed as part of this pattern, there is nothing stopping a consumer from connecting to the queue on the origin node.)
To answer the specific questions:
What is the fine-grained control in shovels that I am missing when I choose federation?
A shovel does not have to reside on an "upstream" or "downstream" node. It can be configured and operate from an independent node.
A shovel can create all of the elements of the linkage by itself: the source queue, the bindings of the queue, and the destination exchange. Thus, it is non-invasive to either the source or destination node.
Is it correct that the only way to change the inter-vhost communication when I use shovels is by changing the config file and rebooting the instance?
This has generally been the accepted downside of the shovel.
With the following command (caveat: only tested on RabbitMQ 3.1.x, and with a very specific rabbitmq.config file that only contains the rabbit and rabbitmq_shovel sections) you can reload the shovel configuration from the specified file (in this case /etc/rabbitmq/rabbitmq.config):
rabbitmqctl eval 'application:stop(rabbitmq_shovel), {ok, [[{rabbit, _}|[{rabbitmq_shovel, [{shovels, Shovels}] }]]]} = file:consult("/etc/rabbitmq/rabbitmq.config"), application:set_env(rabbitmq_shovel, shovels, Shovels), application:start(rabbitmq_shovel).'
Does the setup (vhost per application) make sense or am I missing the point completely?
This decision is going to depend on your use case. vhosts primarily provide logical (and access) separation between queues/exchanges and authorized users.
Shovel acts like a well-designed built-in consumer. It can consume messages from a source broker and queue, and publish them to a destination broker and exchange. You could write an application to do that, but shovel already gets it right - if all you need is to move messages from a queue to an exchange in the same or another broker, shovel can do it for you. Just like a well-behaved app, it can declare exchanges/queues/bindings, reconnect, change the routing key, etc. You can set it up on the source or the destination broker, or even on a third broker. It's basically an AMQP client.
Federation, on the other hand, is used to connect your broker to one or multiple upstream brokers, or you can even create chains of brokers, bending the topology any way you like. You can federate exchanges or queues, and e.g. distribute messages to multiple brokers without needing to bind additional queues to a topic exchange, or to use a fanout exchange and shovel messages from each queue to a downstream broker.
To recap, federation operates at a higher level, while shovel is mostly "just" a well-written client.
To reconfigure shovel, you have to restart the broker, unfortunately.
I don't think you really need a per-app vhost. You can add a per-app user to the broker without separate vhosts. I'm not sure what you mean by "share an event between vhosts", though.