I have more or less implemented the Reliability Pattern in my Mule application using persistent VM queues on CloudHub, as documented here. While everything works fine, it has left me with a number of questions about actually ensuring reliable delivery of my messages. To illustrate the points below, assume I have an http-request component within my "application logic flow" (see the diagram on the link above) that is throwing an exception because the endpoint is down, and I want to ensure that the in-flight message will eventually get delivered to the endpoint:
As detailed on the link above, I have observed that when the exception is thrown within my "application logic flow", and I have made the flow transactional, the message is put back on the VM queue. However, all that happens is that the message is then repeatedly taken off the queue and processed by the flow, and the exception is thrown again, ad infinitum. There appears to be no way of configuring any sort of retry delay or maximum number of retries on VM queues, as is possible, for example, with ActiveMQ. The best workaround I have come up with is to surround the http-request message processor with the until-successful scope, but I'd rather have these sorts of things apply to my whole flow (without having to wrap the whole flow in until-successful). Is this sort of thing possible using only VM queues and CloudHub?
I have configured my until-successful to place the message on another VM queue, which I want to use as a dead-letter queue. Again, this works fine, and I can log in to CloudHub and see the messages populated on my DLQ, but it appears to offer no way of moving messages from this queue back into the flow when the endpoint comes back up. All it seems you can do in CloudHub is clear your queue. Again, is this possible using VM queues and CloudHub only (i.e. no other queueing tool)?
VM queues are very basic, whether you use them in CloudHub or not.
VM queues have no capacity for delaying redelivery (like exponential back-offs). Use JMS queues if you need such features.
You need to create a flow for processing the DLQ, for example one that regularly consumes the queue via the requester module and re-injects the messages into the main queue. Again, with JMS, you would have better control.
As an alternative to JMS, you could consider hosted queues like CloudAMQP, Iron.io or AWS SQS. You would lose transaction support on the inbound endpoint but would gain better control over the (re)delivery behaviour.
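To make the DLQ-processing idea above more concrete, here is a minimal, tool-agnostic sketch in Python. It is not Mule configuration: the queues are plain in-memory deques, and the names `drain_dlq`, `backoff_delay`, `redeliveries` and `max_redeliveries` are hypothetical. It only illustrates the logic a scheduled "drain the DLQ" flow would need, namely a redelivery counter carried on the message, a cap on redeliveries, and an exponential back-off that the scheduler could use between runs.

```python
from collections import deque


def backoff_delay(attempt, base_seconds=1.0, cap_seconds=60.0):
    """Exponential back-off: base * 2^attempt, capped at cap_seconds."""
    return min(cap_seconds, base_seconds * (2 ** attempt))


def drain_dlq(dlq, main_queue, max_redeliveries=3):
    """Move every message currently on the DLQ back to the main queue.

    Messages that already used up their redeliveries are not re-injected;
    they are returned to the caller so they can be parked or inspected.
    (Hypothetical message shape: a dict with a 'redeliveries' counter.)
    """
    parked = []
    for _ in range(len(dlq)):
        msg = dlq.popleft()
        if msg["redeliveries"] >= max_redeliveries:
            parked.append(msg)
        else:
            msg["redeliveries"] += 1
            main_queue.append(msg)
    return parked
```

A scheduler (a poll-driven flow in Mule, a cron job elsewhere) would call `drain_dlq` periodically and could use `backoff_delay` to space out the runs while the endpoint stays down.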
Related
I am currently learning RabbitMQ and AMQP in general. I started working with some tutorials I found online and all of them show more or less the same example: a Spring Boot web app that, upon a REST call, produces a message and puts it onto a RabbitMQ queue, and then another class from the same app, configured as the consumer of that message, consumes it and runs the handler method.
I can't wrap my head around why this is beneficial in any way. The upside I understand is that the handler is executed in a separate thread, while the controller method can return right after sending the message to the queue. However, why would this be in any way better than just using Spring's @Async annotation on that handler method and calling it explicitly? In that case I suppose we would achieve the same thing, while not having to host and manage a separate instance of a message broker like RabbitMQ.
Can someone please explain? Thanks.
Very simply:
With RabbitMQ you can have persistent messages and much safer and more consistent exception management. If the machine crashes, messages that have already been pushed are not lost.
A message can be pushed to an exchange and consumed by several parallel consumers, which helps scale the application if the consumer code is too slow.
and a lot of other reasons...
Background
We're using langohr to interact with RabbitMQ. We've tried two different approaches to let RabbitMQ resend messages that have not yet been properly handled by our service. One way that works is to send a basic.nack with requeue set to true, but this will resend the message immediately until the service responds with a basic.ack. This is a bit problematic if the service, for example, tries to persist the message to a datastore that is currently down (and is down for a while). It would be better for us to just fetch the undelivered messages, say, every 20 seconds or so (i.e. we send neither a basic.ack nor a basic.nack if the datastore is down; we just let the messages be retained in the queue). We've tried to implement this using an ExecutorService whose gist is implemented like this:
(let [chan (lch/open conn)] ; We create a new channel since channels in Langohr are not thread-safe
  (log/info "Triggering \"recover\" for channel" chan)
  (try
    (lb/recover chan)
    (catch Exception e (log/error "Failed to call recover" e))
    (finally (lch/close chan))))
Unfortunately this doesn't seem to work (the messages are not redelivered and just remain in the queue). If we restart the service, the queued messages are consumed correctly. However, we have other services that are implemented using spring-rabbitmq (in Java) and they seem to take care of this out of the box. I've tried looking in the source code to figure out how they do it, but I haven't managed to do so yet.
Question
How do you instruct RabbitMQ to (re-)deliver messages in the queue periodically (preferably using Langohr)?
I am not sure what you are doing with your Spring AMQP apps, but there's nothing built into RabbitMQ for this.
However, it's pretty easy to set up dead-lettering using a TTL to requeue back to the original queue after some period of time. See this answer for examples, links etc.
EDIT
However, Spring AMQP does have a retry interceptor which can be configured to suspend the consumer thread for some period(s) during retry.
Stateful retry rejects and requeues; stateless retry handles the retries internally and has no interaction with the broker during retries.
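The stateless variant can be sketched in a few lines. The following Python sketch is not Spring AMQP code; `handle_with_stateless_retry` and its parameters are hypothetical names. It shows the idea only: the consumer thread itself sleeps between attempts, and the broker sees nothing until the handler finally succeeds or the last attempt fails.

```python
import time


def handle_with_stateless_retry(handler, message, max_attempts=3,
                                initial_delay=1.0, multiplier=2.0,
                                sleep=time.sleep):
    """Call handler(message), retrying in-process with a growing delay.

    No ack/nack is sent between attempts. Once the attempts are exhausted,
    the last exception propagates so the caller can decide whether to
    reject, requeue or dead-letter the message.
    """
    delay = initial_delay
    for attempt in range(1, max_attempts + 1):
        try:
            return handler(message)
        except Exception:
            if attempt == max_attempts:
                raise
            sleep(delay)          # suspends the consumer thread
            delay *= multiplier   # back off before the next attempt
```

The `sleep` parameter is injectable only so the behaviour can be exercised without real waiting. Note the trade-off: while the thread sleeps, it consumes nothing else.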
See this answer which has instructions: we Nack the message, the nack puts the message into a holding queue for N seconds, then it TTLs out of that queue and into another queue that puts it back in the original queue.
It took a little bit of work to set up, but it works great!
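The holding-queue setup described above comes down to a few queue arguments. The sketch below only builds the argument maps; the function names and the queue names in the usage note are hypothetical, while the `x-message-ttl`, `x-dead-letter-exchange` and `x-dead-letter-routing-key` keys are RabbitMQ's own queue arguments. It uses a simplified two-queue variant (work queue and holding queue) and assumes both are bound through the default exchange (the empty string), which routes by queue name.

```python
def work_queue_arguments(holding_queue):
    """Arguments for the work queue: rejected messages (nack/reject with
    requeue=false) are dead-lettered to the holding queue."""
    return {
        "x-dead-letter-exchange": "",  # default exchange routes by queue name
        "x-dead-letter-routing-key": holding_queue,
    }


def holding_queue_arguments(work_queue, delay_ms):
    """Arguments for the holding queue: it has no consumers, so messages sit
    there until the TTL expires and are then dead-lettered back to the
    work queue."""
    return {
        "x-message-ttl": delay_ms,
        "x-dead-letter-exchange": "",
        "x-dead-letter-routing-key": work_queue,
    }
```

For example, `holding_queue_arguments("orders", 20000)` gives a roughly 20-second redelivery delay for a work queue named `orders`. The maps would be passed as the arguments of the queue declaration in whichever client is used. The consumer must nack with requeue set to false, otherwise the message goes straight back to the work queue.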
I am using reliable delivery in a Mule flow. It is a very simple case that takes a message from a JMS queue (ActiveMQ based), invokes several actions depending on its content and, if everything is fine, delivers it to another JMS queue.
The flow is synchronous, both JMS queues are transactional (first BEGINS, second JOINS transaction), redelivery is used, and there is a DLQ for undelivered messages. In other words, I expect every message to be either processed properly or delivered to the DLQ.
For processing orchestration I am using the Scatter/Gather flow control, which works quite well until I call an external HTTP service using the HTTP connector. When I use the default threading profile, some messages are lost (about 3 of 5000 messages). They just disappear, with no trace even in the DLQ.
On the other hand, when I use a custom profile (one that does not use threads), all messages are processed without any problems.
What I have noticed is that the default threading profile uses 'ScatterGatherWorkManager' threads, while the custom one uses 'ActiveMQ Session Task' threads.
So my question is: what is the possible cause of losing these messages?
I am using Mule Server 3.6.1 CE Runtime.
By default, Scatter-Gather is set up to expect no failed routes. You can define your own aggregation strategy to handle lost messages, using the custom-aggregation-strategy element:
https://docs.mulesoft.com/mule-user-guide/v/3.6/scatter-gather
I am trying to understand how RabbitMQ per-connection flow control works with multiple consumers. In particular, what would happen if one consumer were to hang? Would flow control be invoked, and how would it affect the rest of the consumers? Would the behaviour depend upon whether the queues were durable or auto-delete?
Thanks.
RabbitMQ uses "Credit Flow Control".
Essentially, whenever a message is received on a channel, a credit is deducted. Credit starts at a default level, e.g. 200, and when it runs out (reaches 0), connections are blocked. After a certain number of messages are consumed and ACKed, the credit is bumped up by a certain amount.
You can read more about it here:
http://videlalvaro.github.io/2013/09/rabbitmq-internals-credit-flow-for-erlang-processes.html
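As a toy model of the credit mechanism described above (not RabbitMQ's actual implementation), the following Python sketch tracks a sender's credit. The class name `CreditFlow` and the batch size of 50 are assumptions for illustration; only the starting credit of 200 comes from the answer.

```python
class CreditFlow:
    """Toy credit-flow model: each send costs one credit, and the receiver
    grants credit back in batches once it has processed enough messages."""

    def __init__(self, initial_credit=200, ack_batch=50):
        self.credit = initial_credit
        self.ack_batch = ack_batch      # assumed batch size, illustrative only
        self.processed_since_grant = 0

    @property
    def blocked(self):
        # The sender is blocked once its credit is used up.
        return self.credit <= 0

    def send(self):
        """Try to send one message; returns False if the sender is blocked."""
        if self.blocked:
            return False
        self.credit -= 1
        return True

    def processed(self):
        """Receiver finished one message; grant credit after a full batch."""
        self.processed_since_grant += 1
        if self.processed_since_grant >= self.ack_batch:
            self.credit += self.ack_batch
            self.processed_since_grant = 0
```

In this model a stalled receiver never calls `processed()`, so the sender's credit drains and it blocks. A receiver that keeps up keeps handing credit back, and the sender is never blocked.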
Per-connection flow control describes what happens when a publisher (or group of publishers) is sending messages to queues faster than the queues are being processed. This is a safety feature as RabbitMQ becomes unstable at some point when the queue fills without bound. From the documentation, this is automatic:
RabbitMQ will block connections which are publishing too quickly for queues to keep up. No configuration is required.
Unfortunately, the documentation is not terribly specific on when/how this flow control is implemented, other than "several times per second." So, if one consumer gets stuck, as long as the other consumer(s) can keep up, flow control should not be triggered.
I was looking for an ActiveMQ broker admin command, to tell it to pause a queue - that is:
continue accepting messages from producing clients
cease delivering to consuming clients, allowing the queue backlog to grow until the queue is resumed, whereupon the backlog is sent to clients.
I was unable to find such a command. The commonest answer was that it should be managed at the client end -- that is, locate every consumer and stop it. Other answers were workarounds, like manipulating network routes or firewalls so that the clients and broker could no longer communicate.
A cursory survey of other message queues indicates that ActiveMQ is not unusual in this regard.
It seems to me there are two reasons this functionality might not be implemented:
It is difficult to implement -- but I can't think of any reason why.
It is counter to the design philosophy of message queues
Which is it, and why?
Being able to pause a queue is supported in the newly released ActiveMQ 5.12.0:
When the queue is "paused":
NO messages sent to the associate consumers
messages still to be enqueued on the queue
ability to be able to browse the queue
all the JMX counters for the queue to be available and correct.
...
implemented pause/resume/isPaused queue view mbean ops and attribute
when paused, there is no dispatch to regular queue consumers, send
and browse work as normal. Any inflight messages will continue inflight
till ackes as normal.
See https://issues.apache.org/jira/browse/AMQ-5229
If you have Jolokia enabled (I think it is enabled by default nowadays), you can use something like the following curl request to pause the queue:
curl --user admin:admin http://127.0.0.1:8161/api/jolokia/exec/org.apache.activemq:brokerName=localhost,destinationName=myQueue,destinationType=Queue,type=Broker/pause
(Using the default username, password and broker name and a queue called myQueue)
Replace "pause" with "resume" in order to resume the queue.
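If you want to script this rather than type the curl command, a small helper can build the same URL. This is a sketch: the function name `jolokia_queue_op_url` is made up, the defaults mirror the curl example above, and no escaping is done, so queue or broker names containing special characters such as `/` would need Jolokia's own escaping.

```python
def jolokia_queue_op_url(operation, queue, broker="localhost",
                         host="127.0.0.1", port=8161):
    """Build the Jolokia 'exec' URL for pausing or resuming a queue."""
    if operation not in ("pause", "resume"):
        raise ValueError("operation must be 'pause' or 'resume'")
    mbean = (
        "org.apache.activemq:brokerName={broker},destinationName={queue},"
        "destinationType=Queue,type=Broker"
    ).format(broker=broker, queue=queue)
    return "http://{host}:{port}/api/jolokia/exec/{mbean}/{op}".format(
        host=host, port=port, mbean=mbean, op=operation
    )
```

The resulting URL can be passed to any HTTP client together with the broker's credentials, exactly as in the curl call.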
Probably not too complicated to implement - as you say.
I don't know if it's an active design decision or if there has simply been no demand. Other similar products, such as IBM WebSphere MQ, implement "get/put inhibited" on queues, so it is obviously not totally against the philosophy of messaging; rather, it is a tool to operate and troubleshoot live systems.
I'm a bit biased, but I actually like to decouple the sender from the receiver (if they are two different systems that might eventually get switched, upgraded or changed).
An easy way to decouple the systems, and be able to do what you want, is to make the sender send to one queue, "DATA.OUT", and the receiver listen to another, "DATA.IN". Then you can use Apache Camel (which is typically bundled with ActiveMQ to provide Enterprise Integration Patterns) to route from DATA.OUT to DATA.IN.
A Camel route can be started and stopped via JMX, which achieves something similar to what you described.
I guess the ActiveMQ design would rather have you do this kind of thing in a middleware layer, such as Apache Camel, than directly on the queues.