Does the RabbitMQ shovel plugin preserve message ordering? - rabbitmq

I use the RabbitMQ shovel plugin (dynamic shovel, see below) to provide unidirectional messaging between two RabbitMQ brokers over an unreliable WAN link. I see regular connection losses in the RabbitMQ server log.
The relevant portions of the AMQP setup are identical for both brokers: one exchange (fanout, durable) and one queue (durable). The consuming application requires that messages are received in the same order they are produced at the sending side.
The observed behaviour seems to indicate that this is not the case, perhaps due to retransmissions, etc. Does the RabbitMQ shovel pluging preserve message ordering without message loss? What are the required configuration options?

For ensuring message ordering the following parameters should be configured:
"prefetch-count" : 1
"ack-mode" : "on-confirm"
prefetch-count The maximum number of unacknowledged messages copied over a shovel at any one time. Default is 1000.
ack-mode Determines how the shovel should acknowledge messages. If set to on-confirm (the default), messages are acknowledged to the source broker after they have been confirmed by the destination. This handles network errors and broker failures without losing messages, and is the slowest option. If set to on-publish, messages are acknowledged to the source broker after they have been published at the destination. This handles network errors without losing messages, but may lose messages in the event of broker failures. If set to no-ack, message acknowledgements are not used. This is the fastest option, but may lose messages in the event of network or broker failures.
For more information related to RabbitMQ's shovel mechanism please refer to the official documentation: https://www.rabbitmq.com/shovel-dynamic.html

Related

What's the difference among the DeliveryStates for the Vert.x AMQP Client

As a AmqpReceiver, what's the difference among the different DeliveryStates for a received message.
Runing a ReceiverTest for test, see https://github.com/vert-x3/vertx-amqp-client/blob/master/src/test/java/io/vertx/amqp/ReceiverTest.java.
It always got the same result when running testReceptionWithAcceptedMessages, testReceptionWithRejectedMessages: All the messages in the test queue are deleted.
Is it right that the message is still deleted from the MQ server when it is marked as rejected or released? Where can I find more docs about this?
Can Vert.x AMQP Client do the same things as the RabbitMQ client when consuming a queue? For example, positive or negative acknowledgements, multi-acks and requeueing etc. See https://www.rabbitmq.com/confirms.html#basics.
Thanks.
In those tests the client is Accepting and Rejecting the messages from the ActiveMQ Artemis broker. The broker will either discard the message when accepted, or DLQ the message when rejected under the configuration in the tests. You can configure the broker differently in your own case but for the sake of the tests it isn't relevant. What the broker you are talking to does when you Accept, Release, Rejected or Modify a delivery via a set disposition is going to vary depending on the broker you use and its configuration.
You can refer to section 3.3 and section 3.4 of the AMQP 1.0 specification for a definition of how delivery state affect deliveries being available, acquired or archived.

Is message lost when there's n/w issue while publishing the message to RabbitMQ?

I am using Spring AMQP to publish messages to RabbitMQ. Consider a scenario:
1. Java client sends a message to MQ usin amqpTemplate.convertAndSend()
2. But RabbitMQ is down OR there's some n/w issue
In this case the message will be lost? OR
Is there any way it'll be persisted and will be retried?
I checked the publish-confirm model as well but as I understood, ultimately we've to handle the nack messages through coding on our own.
The RabbitTemplate supports adding a RetryTemplate which can be configured whit whatever retry semantics you want. It will handle situations when the broker is down.
See Adding Retry Capabilities.
You can use a transaction or publisher confirms to ensure rabbit secured the message.

Logstash with rabbitmq cluster

I have a 3 node cluster of Rabbitmq behind a HAproxy Load Balancer. When I shut down a node, Rabbitmq successfully switches the queue to the other nodes. However, I notice that Logstash stops pulling messages from the queue unless I restart it. Is this a problem with the way rabbitmq operates? i.e. it deactivates all active consumers. I am not sure if log stash has any retry capability. Anyone run into this issue?
Quoting rabbit mq documentation, page for clustering first
What is Replicated? All data/state required for the operation of a
RabbitMQ broker is replicated across all nodes. An exception to this
are message queues, which by default reside on one node, though they
are visible and reachable from all nodes.
and high availability
Clients that are consuming from a mirrored queue may wish to know that
the queue from which they have been consuming has failed over. When a
mirrored queue fails over, knowledge of which messages have been sent
to which consumer is lost, and therefore all unacknowledged messages
are redelivered with the redelivered flag set. Consumers may wish to
know this is going to happen.
If so, they can consume with the argument x-cancel-on-ha-failover set
to true. Their consuming will then be cancelled on failover and a
consumer cancellation notification sent. It is then the consumer's
responsibility to reissue basic.consume to start consuming again.
So, what does all this mean:
You have to mirror queues
The consumers should use manual ACK
The consumers should reconnect on their own
So the answer to your question is no, it's not a problem with rabbitmq, that's simply how it works. It's up to clients to reconnect.

When to use RabbitMQ shovels and when Federation plugin?

For the company I work for we would like to use RabbitMQ as our main message bus. The idea we have is that every single application uses their own vhost for internal communication and that via the shovel or federation plugin we would make it possible to share certain type of the events across multiple vhosts (maybe even multiple machines (non-clustered)).
We chose for application per vhost to separate internal communication from public events and to keep the security adjustable per application.
Based on the information published on the RabbitMQ website I don't get it when I have to choose for shovels or when I have to choose for the federation plugin.
RabbitMQ has the following explanation when to use what:
Typically you would use the shovel to link brokers across the internet when you need more control than federation provides.
What is the fine grain control in shovels which I am missing when I choose for federation?
At this moment I think I would prefer the federation plugin because I could automate the inter-vhost-communication via the REST API provided by the federation plugin.
In case of shovels I would need to change the shovel configuration and reboot the RabbitMQ instance every time we would like to share an event between vhosts. Are my thoughts correct about this?
We are currently running RMQ on Windows with clients connecting from .NET. In the near future Java/Perl/PHP clients will join.
To summarize my questions:
What is the fine grain control in shovels which I am missing when I
choose for federation?
Is it correct that the only way to change the
inter-vhost-communication when I use shovels is by changing theconfig file and rebooting the instance?
Does the setup (vhost per application) make sense or am I missing the point completely?
Shovels and queue provide different means to be forward messages from one RabbitMQ node to another.
Federated Exchange
With a federated exchange, queues can be connected to the queue on the upstream(source) node. In addition, an exchange on the downstream(destination) node will receive a copy of messages that are published to the upstream node.
Federated exchanges are a similar to exchange-to-exchange bindings, in that, they can (optionally) subscribe to a limited set of messages from an upstream exchange.
Federated Queue
(NOTE: These are new in RabbitMQ 3.2.x)
With a federated queue, consumers can be connected to the queue on both the upstream(source) and downstream(destination) nodes.
In essence the downstream queue is a consumer on the upstream queue, with the expectation that there will be additional downstream consumers that process the messages in the same manner as a consumer attached to the upstream queue.
Any messages consumed by the downstream (federated) queue will not be available for consumers on the upstream queue.
Use Case:
If consumers are being migrated from one node to another, a federated queue will allow this to happen without messages being missed, or processed twice.
Use Case: from the RabbitMQ docs
The typical use would be to have the same "logical" queue distributed
over many brokers. Each broker would declare a federated queue with
all the other federated queues upstream. (The links would form a
complete bi-directional graph on n queues.)
Shovel
Shovels on the other hand, attach an "upstream" queue to a "downstream" exchange. (I place the terms in quotes because the shovel documentation does not describe the nodes with the same semantics as the federation documentation.)
The shovel consumes the messages from the queue and sends them to the exchange on the destination node. (NOTE: While not normally discussed as part of this the pattern, there is nothing stopping a consumer from connecting to the queue on the origin node.)
To answer the specific questions:
What is the fine grain control in shovels which I am missing when I
choose for federation?
A shovel does not have to reside on an "upstream" or "downstream" node. It can be configured and operate from an independent node.
A shovel can create all of the elements of the linkage by itself: the source queue, the bindings of the queue, and the destination exchange. Thus, it is non-invasive to either the source or destination node.
Is it correct that the only way to change the
inter-vhost-communication when I use shovels is by changing theconfig
file and rebooting the instance?
This has generally been the accepted downside of the shovel.
With the following command (caveat: only tested on RabbitMQ 3.1.x, and with a very specific rabbitmq.config file that only contain ) you can reload a shovel configuration from the specified file. (in this case /etc/rabbitmq/rabbitmq.config)
rabbitmqctl eval 'application:stop(rabbitmq_shovel), {ok, [[{rabbit, _}|[{rabbitmq_shovel, [{shovels, Shovels}] }]]]} = file:consult("/etc/rabbitmq/rabbitmq.config"), application:set_env(rabbitmq_shovel, shovels, Shovels), application:start(rabbitmq_shovel).'
.
Does the setup (vhost per application) make sense or am I missing the
point completely?
This decision is going to depend on your use case. vhosts primarily provide logical (and access) separation between queues/exchanges and authorized users.
Shovel acts like a well-designed built-in consumer. It can consume messages from a source broker and queue, and publish them into a destination broker and exchange. You could write an application to do that, but shovel already got it right - if all you need is to move messages from a queue to an exchange in the same or another broker, shovel can do it for you. Just as a well-behaving app, it can declare exchanges/queues/bindings, reconnect, change the routing key etc. You can set it up on the source or on the destination broker, or even use a third broker. It's basically an AMQP client.
Federation, on the other hand is used to connect your broker to one or multiple upstream brokers, or you can even create chains of brokers, bending the topology any way you like. You can federate exchanges or queues, and e.g. distribute messages to multiple brokers without the need to bind additional queues to a topic exchange or using a fanout exchange, and shoveling messages from each queue to a downstream broker.
To recap, federation operates at a higher level, while shovel is mostly "just" a well-written client.
To reconfigure shovel, you have to restart the broker, unfortunately.
I don't think you really need a per app vhost. You can add a per-app user to the broker without separate vhosts. Not sure what you mean on "share an event between vhosts", though.

ActiveMQ redelivery at application level

I use ActiveMQ as a job dispatcher. Which means one master sends job messages to ActiveMQ, and multiple slaves grab job messages from ActiveMQ and process them. When slaves finish one job, they send a message with job_id back to ActiveMQ.
However, slaves are unreliable. If one slave doesn't respond before a period of time, we can assume the slave is down, and try redeliver the sent job message.
Are there any good ideas to realize this re-delivery?
Typically a consumer handles redelivery so that it can maintain message order while a message appears as inflight on the broker. This means that redelivery is limited to a single consumer unless that consumer terminates. In this way the broker is unaware of redelivery.
In ActiveMQ v5.7+ you have the option of using broker side redelivery, it is possible to have the broker redeliver a message after a delay using a resend. This is implemented by a broker plugin that handles dead letter processing by redelivery via the scheduler. This is useful when total message order is not important and where through put and load distribution among consumers is. With broker redelivery, messages that fail delivery to a given consumer can get immediately re-dispatched.
See the ActiveMQ documentation for an example of setting this up in the configuration file.