RabbitMQ: Separate rejected and expired dead-lettered messages by header

Is there a way to deliver messages that have been dead-lettered because they were rejected to a different queue than messages that have expired?
I can imagine building a service that reads the 'all dead' queue, parses the reason out of the x-death header, and reposts the messages to different queues by reason (sketched below).
But is there also a way to do this within RabbitMQ itself?
(The use case: I want to process the expired messages with a very fast service to quickly reduce the backlog, for instance after a database failure. But the rejected messages need to be handled by a service with a lot of debug logging enabled.)
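For illustration, a minimal sketch of that separator service using the RabbitMQ .NET client. All queue names are hypothetical, and it assumes the x-death header layout RabbitMQ attaches to dead-lettered messages (a list of tables whose "reason" field is e.g. "rejected" or "expired"):

```csharp
using System;
using System.Collections.Generic;
using System.Text;
using RabbitMQ.Client;
using RabbitMQ.Client.Events;

var factory = new ConnectionFactory { HostName = "localhost" };
using var connection = factory.CreateConnection();
using var channel = connection.CreateModel();

var consumer = new EventingBasicConsumer(channel);
consumer.Received += (_, ea) =>
{
    // x-death is a list of tables; the first entry describes the most
    // recent dead-lettering and carries a "reason" field such as
    // "rejected" or "expired".
    var reason = "unknown";
    if (ea.BasicProperties.Headers != null
        && ea.BasicProperties.Headers.TryGetValue("x-death", out var xDeath)
        && xDeath is List<object> deaths && deaths.Count > 0
        && deaths[0] is IDictionary<string, object> death
        && death.TryGetValue("reason", out var r) && r is byte[] raw)
    {
        reason = Encoding.UTF8.GetString(raw);
    }

    // Repost to a reason-specific queue, then ack the original.
    var target = reason == "expired" ? "dead.expired" : "dead.rejected";
    channel.BasicPublish("", target, ea.BasicProperties, ea.Body);
    channel.BasicAck(ea.DeliveryTag, multiple: false);
};
channel.BasicConsume("all-dead", autoAck: false, consumer);
Console.ReadLine();
```

As an aside: newer RabbitMQ releases (3.7+) also stamp a single x-first-death-reason header on dead-lettered messages; where that is available, making the dead-letter exchange a headers exchange and binding the two queues on that header value would separate the streams inside RabbitMQ, without an intermediate service.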

Related

MassTransit consumer fault handling when the destination system is down for long time

I have a queue consumer that gets messages from RabbitMQ using MassTransit and sends them to a cloud system through an HTTP API. I have handled all possible exceptions and also wrapped the HTTP calls in a Polly retry for transient network errors. But the problem with this approach is that a message is simply abandoned once the retries are exhausted.
I have read the MassTransit documentation on error handling and faults, put in some code to publish the fault, and written a fault consumer that listens for the fault message after some number of retries with Polly.
Assume the destination system is down for 10 hours (we don't know about this outage beforehand, otherwise I would plan to stop the consumer service). What is the best strategy with MassTransit to stop pulling messages from the queue into the consumer? Is there a way to stop receiving messages based on the number of failures, etc.?
Thanks
You need a circuit breaker; it's a well-known pattern in distributed systems. The circuit breaker activates when the remote system is struggling under load and sending it more requests would potentially strangle it. It also allows you to stop sending messages to the remote system when it is down.
The circuit breaker is available in MassTransit out of the box.
I would also not recommend implementing retries with Polly inside the consumer. MassTransit has a comprehensive set of retry policies, and using them lets MassTransit see how many failures occur in the consumer, which it cannot do when you use Polly. For example, the circuit breaker middleware won't know about failures inside a Polly-wrapped call and therefore won't react properly.
If the remote system is down for a long time (hours, as you described), any retry policy with a limited number of attempts will eventually give up. The circuit breaker will open, but it resets from time to time and tries calling the consumer again; otherwise it would never know when the remote system has recovered. So you would either need to recover messages from the error queue or add the redelivery middleware.
You can therefore configure your receive pipeline this way:
redelivery -> circuit breaker -> retry -> consumer
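A sketch of that pipeline on the RabbitMQ transport. All names, thresholds, and intervals are illustrative, not recommendations; note that UseDelayedRedelivery relies on the RabbitMQ delayed-message-exchange plugin (UseScheduledRedelivery with a message scheduler is the alternative):

```csharp
using System;
using System.Threading.Tasks;
using MassTransit;

// Hypothetical message and consumer that calls the cloud HTTP API.
public record SyncMessage(string Payload);

public class CloudApiConsumer : IConsumer<SyncMessage>
{
    public async Task Consume(ConsumeContext<SyncMessage> context)
    {
        // ... call the cloud system; let exceptions escape so the
        // retry/circuit-breaker/redelivery filters can see them ...
        await Task.CompletedTask;
    }
}

class Program
{
    static async Task Main()
    {
        var bus = Bus.Factory.CreateUsingRabbitMq(cfg =>
        {
            cfg.ReceiveEndpoint("cloud-sync", e =>
            {
                // Outermost: reschedule failed messages with long delays
                // instead of sending them to the error queue.
                e.UseDelayedRedelivery(r => r.Intervals(
                    TimeSpan.FromMinutes(5),
                    TimeSpan.FromMinutes(30),
                    TimeSpan.FromHours(1)));

                // Stops invoking the consumer after enough failures, then
                // periodically lets a call through to probe whether the
                // remote system has recovered.
                e.UseCircuitBreaker(cb =>
                {
                    cb.TrackingPeriod = TimeSpan.FromMinutes(1);
                    cb.TripThreshold = 15;   // % of messages that failed
                    cb.ActiveThreshold = 10; // min. messages before tripping
                    cb.ResetInterval = TimeSpan.FromMinutes(5);
                });

                // Innermost: quick retries for transient faults, in place
                // of the Polly wrapper.
                e.UseMessageRetry(r => r.Interval(3, TimeSpan.FromSeconds(5)));

                e.Consumer<CloudApiConsumer>();
            });
        });

        await bus.StartAsync();
    }
}
```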

Distribute work with RabbitMQ

I'm using RabbitMQ and need a message to be queued and consumed by multiple servers, but only one of the servers should confirm the message, and only that server should remove it from the queue, even if other servers have consumed the message without removing it. I wonder how RabbitMQ can handle this scenario?
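For reference, the closest built-in fit is RabbitMQ's competing-consumers pattern with manual acknowledgements: each message on a queue is delivered to exactly one of the subscribed servers at a time, and only that server's ack removes it; if the server dies before acking, the broker redelivers the message to another server. (If every server truly must see every message, you would instead bind one queue per server to a fanout exchange.) A minimal sketch with the .NET client, queue name hypothetical:

```csharp
using System;
using System.Text;
using RabbitMQ.Client;
using RabbitMQ.Client.Events;

var factory = new ConnectionFactory { HostName = "localhost" };
using var connection = factory.CreateConnection();
using var channel = connection.CreateModel();

channel.QueueDeclare("work", durable: true, exclusive: false, autoDelete: false);

// Give each server only one unacknowledged message at a time.
channel.BasicQos(prefetchSize: 0, prefetchCount: 1, global: false);

var consumer = new EventingBasicConsumer(channel);
consumer.Received += (_, ea) =>
{
    var body = Encoding.UTF8.GetString(ea.Body.ToArray());
    // ... process the message ...

    // Only the server that received this delivery acknowledges it, and
    // the ack is what removes the message from the queue.
    channel.BasicAck(ea.DeliveryTag, multiple: false);
};
channel.BasicConsume("work", autoAck: false, consumer);
Console.ReadLine();
```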

How do you replay missed messages when using STOMP to connect to RabbitMQ?

I've got an iOS application which uses a STOMP Client to talk to RabbitMQ. The application loads a lot of state during startup, and then keeps that state in sync by receiving updates published on STOMP. Of course, if it loses its connection, it can no longer be sure it's in sync, and therefore has to re-load that large initial blob. Any kind of network interruption triggers this behavior and makes my customers sad.
There are a lot of big-picture ways to fix this (and I'm working on them) but in the meantime, I'm trying to use persistent queues to solve this problem. The idea is that the server will create a queue, bind it to the appropriate topics, and then start building the large startup bundle. When finished, it will hand everything off to the client. The client will set itself up with the startup bundle, open a subscription to the queue, and then process any updates which happened while the server was getting things ready. Similarly, if the client should become disconnected, it can simply reconnect and resume reading the messages it finds in the queue.
My problem is that while the client successfully receives messages sent after it connects, if there were any messages in the queue before it connected, they are not read. Likewise, if the client becomes disconnected, when it reconnects, it won't see any messages which arrived while it was away.
Can anyone suggest how I might get the client to be able to read those missing messages?
It turns out that the STOMP adapter was consuming the messages but failing to deliver them. Thus, when the client reconnected, it wouldn't have any messages waiting for it.
To fix the problem, I changed the "ack" setting on the subscribe request to "client", meaning that STOMP shouldn't consider the message delivered until the client sends back an ACK frame. By changing my client appropriately, messages now get delivered even after the client has been away.
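For reference, a rough sketch of the frames involved, in STOMP 1.0 style (the destination is made up, ^@ stands for the NUL byte that terminates each frame, and later STOMP versions name the ACK headers slightly differently):

```
SUBSCRIBE
destination:/queue/client-updates
ack:client

^@

MESSAGE
destination:/queue/client-updates
message-id:abc-123

{"state":"update"}^@

ACK
message-id:abc-123

^@
```

With ack:auto (the default), the adapter considers each MESSAGE delivered as soon as it is sent down the connection, which is why messages consumed while the client was away were lost.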

MSMQ, WCF and robustness

Not being an expert on MSMQ or WCF, I have read up a fair bit about them, and they sound and look great.
I am trying to develop something (eventually; first some theory) which needs to be robust and durable.
MSMQ, I guess, will be hosted on a separate server.
There will be 2 WCF services. One for incoming messages and the other for outgoing messages (it takes a message, does some internal processing/validation, then places it on the outgoing messages queue, or maybe sends an email/text message/whatever).
I understand that with the right configuration the system can be transactional (no messages are ever lost) and messages can be delivered exactly once, so there is no chance of duplication.
The applications/services will be multithreaded to process messages, of which there will be hundreds of thousands.
BUT during the processing of a message, or during the service's lifetime, what if the server crashes? What if the server reboots? What if the service throws an exception for whatever reason? How is it possible not to lose that message, but somehow put it back on the queue to be processed again?
Also, how is it possible to make the service robust enough that it will restart itself?
I'd appreciate any advice and details here. There is quite a lot to take in, and WCF/MSMQ exposes quite a lot of options.
Your assumption:
MSMQ, I guess, will be hosted on a separate server.
is incorrect. MSMQ is installed on all machines which want to participate in message queuing.
There will be 2 WCF services. One for incoming messages and the other for outgoing messages
In the most typical configuration, the destination queues are local to the listening service.
For example, your ServiceA would have a local queue from which it reads. ServiceB also has a local queue from which it reads. If ServiceA wants to call ServiceB it will put a message into ServiceB's local queue.
I understand that with the right configuration the system can be transactional (no messages are ever lost)
This is correct. This is because MSMQ uses a messaging pattern called store-and-forward. See here for an explanation.
Essentially, the reason it is safe to assume no message loss is that the transmission of a message from one machine to another actually takes place under three distinct transactions.
The first transaction: ServiceA writes to its own temporary local queue. If this fails, the transaction rolls back and ServiceA can handle the exception.
The second transaction: the queue manager on ServiceA's machine transmits the message to the queue manager on ServiceB's machine. On failure, the message remains in the temporary queue.
The third transaction: ServiceB reads the message off its local queue. If ServiceB's message handler method throws an exception, the transaction rolls the message back onto the local queue.
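A sketch of the first and third of those transactions with System.Messaging (the queue path and payload are hypothetical, and the queues must have been created as transactional; the second transaction is carried out by the queue managers themselves and involves no application code):

```csharp
using System;
using System.Messaging; // reference System.Messaging.dll

var queue = new MessageQueue(@".\private$\serviceb");

// First transaction: the send is atomic on the sender's machine. If
// Commit is never reached, the message is never handed to the queue
// manager for transmission.
using (var tx = new MessageQueueTransaction())
{
    tx.Begin();
    queue.Send("hello", tx);
    tx.Commit();
}

// Third transaction (on the receiving side): if processing throws
// before Commit, Abort puts the message back on the local queue.
using (var tx = new MessageQueueTransaction())
{
    tx.Begin();
    try
    {
        Message msg = queue.Receive(TimeSpan.FromSeconds(30), tx);
        // ... process the message ...
        tx.Commit();
    }
    catch
    {
        tx.Abort();
        throw;
    }
}
```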
The applications/services will be multithreaded to process messages
This is fine except if you require order to be preserved in the message processing chain. If you need ordered processing then you cannot have multiple threads without implementing a re-sequencer to reapply order.
I thought that MSMQ can be hosted separately and have x servers share that queue?
All servers which want to participate in the exchange of messages have MSMQ installed. Each server can then write to any queue on any other server.
The reason for my thinking was: what if the server goes down? Then how will the messages get sent/received into MSMQ?
If the queues are transactional then that means messages on them are persisted to disk. If the server goes down then when it comes back up the messages are still there. While a server is down it obviously cannot participate in the exchange of messages. However, messages can still be "sent" to that server - they just remain local to the sender (in a temporary queue) until the destination server comes back on-line.
so by having one central MSMQ server (and having it mirrored/failover) then there will be a guarantee of uptime
The whole point of using message queuing is that it is a fault-tolerant transport, so you don't need to guarantee uptime. If you had 100% availability, there would be little reason to use message queuing.
how will WCF be notified of messages that are incoming?
Each service will listen on its own local queue. When a message arrives, the WCF runtime causes the handling method to be called and the message to be handled.
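For illustration, a minimal one-way WCF service listening on its local queue over NetMsmqBinding (the contract, queue address, and settings are hypothetical):

```csharp
using System;
using System.ServiceModel;

// MSMQ-bound WCF operations must be one-way: there is no connected
// client waiting for a reply.
[ServiceContract]
public interface IIncomingMessages
{
    [OperationContract(IsOneWay = true)]
    void Submit(string payload);
}

public class IncomingMessagesService : IIncomingMessages
{
    // Enlist in the MSMQ receive transaction so that an unhandled
    // exception rolls the message back onto the queue.
    [OperationBehavior(TransactionScopeRequired = true)]
    public void Submit(string payload)
    {
        // The WCF runtime pulls the message off the local queue and
        // invokes this method with the deserialized payload.
    }
}

class Program
{
    static void Main()
    {
        var host = new ServiceHost(typeof(IncomingMessagesService));
        var binding = new NetMsmqBinding(NetMsmqSecurityMode.None)
        {
            Durable = true,    // persist messages to disk
            ExactlyOnce = true // requires a transactional queue
        };
        host.AddServiceEndpoint(
            typeof(IIncomingMessages),
            binding,
            "net.msmq://localhost/private/incoming");
        host.Open();
        Console.ReadLine();
        host.Close();
    }
}
```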
how will the service be notified of failures of sending messages
If ServiceA fails to transmit a message to ServiceB then ServiceB will never be notified of that failure. Nor should it be. ServiceA will handle the failure to transmit, not ServiceB. Your expectation in this instance creates a hard coupling between the services, something which message queueing is supposed to remove.
MSMQ can store messages even if the service is temporarily shut down or the computer reboots.
The main goal of WCF is to transport a message from source to destination; the particular transport doesn't matter. In your case MSMQ is the transport for WCF, and it is not necessary for the client and the service to be online/available simultaneously. But once a message is received, it is your responsibility to process it correctly, regardless of which transport was used to send it.

Behavior of channels in "confirm" mode with RabbitMQ

I'm having some trouble understanding RabbitMQ confirms. I found the following explanation from RabbitMQ:
Notes

The broker loses persistent messages if it crashes before said messages are written to disk. Under certain conditions, this causes the broker to behave in surprising ways. For instance, consider this scenario:

1. a client publishes a persistent message to a durable queue,
2. a client consumes the message from the queue (noting that the message is persistent and the queue durable), but doesn't yet ack it,
3. the broker dies and is restarted, and
4. the client reconnects and starts consuming messages.

At this point, the client could reasonably assume that the message will be delivered again. This is not the case: the restart has caused the broker to lose the message. In order to guarantee persistence, a client should use confirms. If the publisher's channel had been in confirm mode, the publisher would not have received an ack for the lost message (since the consumer hadn't ack'd it and it hadn't been written to disk).
Then I am using this example, http://hg.rabbitmq.com/rabbitmq-java-client/file/default/test/src/com/rabbitmq/examples/ConfirmDontLoseMessages.java, to do some basic tests and verify confirms, but I get some odd results:
The waitForConfirmsOrDie method doesn't block the producer, which differs from my expectation; I supposed waitForConfirmsOrDie would block the producer until all the messages have been ack'd or one of them is nack'd.
I removed channel.confirmSelect() and channel.waitForConfirmsOrDie() from the publisher and changed the consumer from auto ack to manual ack. I published all the messages to the queue and consumed them one by one, then stopped the RabbitMQ server during the consuming process. What I expected was that the remaining messages would be lost after the RabbitMQ server restarted, because the channel was not in confirm mode, but I still see all the other messages in the queue after the server restart.
Since I am new to RabbitMQ, can anyone tell me where my understanding of confirms goes wrong?
My understanding is that "channel confirmation" means the broker confirms that it successfully received the message from the producer, regardless of whether a consumer has ack'd the message. Depending on the queue type and message delivery mode (see http://www.rabbitmq.com/confirms.html for details), messages are confirmed when:
- it decides a message will not be routed to queues (if the mandatory flag is set then the basic.return is sent first), or
- a transient message has reached all its queues (and mirrors), or
- a persistent message has reached all its queues (and mirrors) and been persisted to disk (and fsynced), or
- a persistent message has been consumed (and if necessary acknowledged) from all its queues
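A minimal confirm-mode sketch with the .NET client (the Java example linked in the question does the equivalent). Note that waitForConfirmsOrDie does block, but only the thread that calls it, and only until the messages published so far are confirmed; individual publishes remain fire-and-forget:

```csharp
using System;
using System.Text;
using RabbitMQ.Client;

var factory = new ConnectionFactory { HostName = "localhost" };
using var connection = factory.CreateConnection();
using var channel = connection.CreateModel();

channel.QueueDeclare("confirm-test", durable: true, exclusive: false, autoDelete: false);
channel.ConfirmSelect(); // put the channel into confirm mode

var props = channel.CreateBasicProperties();
props.Persistent = true; // delivery mode 2

channel.BasicPublish(exchange: "",
                     routingKey: "confirm-test",
                     basicProperties: props,
                     body: Encoding.UTF8.GetBytes("hello"));

// Blocks until the broker confirms every outstanding message on this
// channel; throws (and closes the channel) if any message is nack'd
// or the timeout elapses.
channel.WaitForConfirmsOrDie(TimeSpan.FromSeconds(10));
```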
Old question but oh well..
I published all the messages to the queue and consumed them one by one, then stopped the RabbitMQ server during the consuming process. What I expected was that the remaining messages would be lost after the RabbitMQ server restarted, because the channel was not in confirm mode, but I still see all the other messages in the queue after the server restart.
This is actually how it should work, IF persistence is enabled. If the server crashes or something else goes wrong, the messages cannot be confirmed, and thus won't be removed from the queue.
Messages are only removed from the queue once they have been acknowledged as handled; a message only disappears if the broker hadn't yet written it to disk before the server crashed.
Confirming and acknowledging can both be switched off if desired, in which case the producer won't wait for acks: confirm mode is opt-in (a channel is only in confirm mode after confirm.select has been issued on it), and a consumer can subscribe with auto-ack enabled.
More on the acks and confirms: https://www.rabbitmq.com/reliability.html