RabbitMQ - upgraded to a new version and got a lot of "PRECONDITION_FAILED unknown delivery tag 1" - rabbitmq

Just upgraded to a new version of RabbitMQ -- 2.3.1 -- and now the following error occurs:
PRECONDITION_FAILED unknown delivery tag 1
...followed by the channel closing. This worked on an older RabbitMQ with no client-side changes.
In terms of application behavior:
When App A wants to send an async message to App B and receive an answer from B, this is the algorithm:
App A generates a unique ID and puts it in the message object.
Then App A subscribes to a new queue whose name and routing key both equal the UUID.
App B opens the message, does some calculations and returns the result to the channel with the routing key that it received.
App A gets the answer and closes the queue.
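The steps above can be sketched without a broker. This is only a minimal model of the message flow, assuming in-process queues stand in for RabbitMQ; the names reply_queues, request_queue, app_a_call and app_b_worker are made up for illustration and are not part of any client library.

```python
import queue
import threading
import uuid

# In-process stand-in for the broker: reply queues keyed by routing key.
# This only models the message flow; it is not a RabbitMQ client.
reply_queues = {}
request_queue = queue.Queue()


def app_b_worker():
    """App B: take one request, compute, reply using the routing key it received."""
    message = request_queue.get()
    result = message["payload"] * 2  # placeholder for "some calculations"
    reply_queues[message["reply_key"]].put(result)


def app_a_call(payload, timeout=2.0):
    """App A: create a uniquely named reply queue, send, wait, then drop the queue."""
    reply_key = str(uuid.uuid4())            # unique ID carried in the message
    reply_queues[reply_key] = queue.Queue()  # queue name == routing key == uuid
    try:
        request_queue.put({"payload": payload, "reply_key": reply_key})
        return reply_queues[reply_key].get(timeout=timeout)
    finally:
        del reply_queues[reply_key]          # "close the queue" once answered


worker = threading.Thread(target=app_b_worker)
worker.start()
answer = app_a_call(21)
worker.join()
print(answer)
```

Note that nothing in this flow requires App B to acknowledge anything explicitly; the acknowledgement mode is a separate choice made when the consumer is set up.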
So far everything went really well in 1.7.0. What went wrong in 2.3.1?
When Application A calls basicPublish(), application B immediately throws the following exception:
com.rabbitmq.client.ShutdownSignalException: channel error; reason: {#method<channel.close>(reply-code=406,reply-text=PRECONDITION_FAILED - unknown delivery tag 1,class-id=60,method-id=80),null,""}
at com.rabbitmq.client.impl.ChannelN.processAsync(ChannelN.java:191)
at com.rabbitmq.client.impl.AMQChannel.handleCompleteInboundCommand(AMQChannel.java:159)
at com.rabbitmq.client.impl.AMQChannel.handleFrame(AMQChannel.java:110)
at com.rabbitmq.client.impl.AMQConnection$MainLoop.run(AMQConnection.java:438)
Caused by: com.rabbitmq.client.ShutdownSignalException: channel error; reason: {#method<channel.close>(reply-code=406,reply-text=PRECONDITION_FAILED - unknown delivery tag 1,class-id=60,method-id=80),null,""}

The only codepath that can cause that exception is through the broker handling a 'basic.ack', so this sounds like a client issue; check the client code.
In particular, check that you aren't ack'ing messages more than once. Doing so is in violation of the AMQP 0-9-1 spec:
A message MUST not be acknowledged more than once. The receiving peer MUST validate that a non-zero delivery-tag refers to a delivered message, and raise a channel exception if this is not the case
A great place to ask such questions is the rabbitmq-discuss mailing list; all the RabbitMQ developers read that list and make a point of not leaving questions unanswered.
It's also worth noting that previous versions of Rabbit were more lax and did not throw an error in this case, but more recent versions do.

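Just set noAck to false on the BasicConsume method. With noAck set to true the broker treats every message as acknowledged on delivery, so an explicit ack from the client refers to a delivery tag the broker is no longer tracking.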
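Both answers above come down to how the broker tracks delivery tags per channel. The following is a toy model of that bookkeeping, not real broker code: ModelChannel, ChannelClosed and try_ack are hypothetical names, and the assumption is simply that only deliveries still awaiting an ack are remembered.

```python
class ChannelClosed(Exception):
    """Raised when the modelled broker closes the channel."""


class ModelChannel:
    """Toy model of per-channel delivery-tag bookkeeping (not real broker code)."""

    def __init__(self, no_ack=False):
        self.no_ack = no_ack
        self.next_tag = 1
        self.unacked = set()

    def deliver(self):
        tag = self.next_tag
        self.next_tag += 1
        if not self.no_ack:
            # Only deliveries that still await an ack are remembered.
            self.unacked.add(tag)
        return tag

    def basic_ack(self, tag):
        if tag not in self.unacked:
            raise ChannelClosed(f"PRECONDITION_FAILED - unknown delivery tag {tag}")
        self.unacked.remove(tag)


def try_ack(channel, tag):
    """Return "ok" or the error text the modelled broker would report."""
    try:
        channel.basic_ack(tag)
        return "ok"
    except ChannelClosed as exc:
        return str(exc)


# Manual-ack consumer that acks the same delivery twice.
manual = ModelChannel(no_ack=False)
tag = manual.deliver()
first = try_ack(manual, tag)
second = try_ack(manual, tag)

# Auto-ack consumer that still sends an explicit ack.
auto = ModelChannel(no_ack=True)
auto_result = try_ack(auto, auto.deliver())
```

In this model the first ack succeeds, while the repeated ack and the ack sent under auto-ack both fail with the error from the question.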

Related

A failure to decode a rabbit message fails the Reactive Messaging readiness check

I've encountered a problem when using SmallRye Reactive Messaging with Quarkus, for a RabbitMQ incoming handler.
The message being published to rabbit has a JSON content type, and the method signature of the handling code is written accordingly:
@Incoming("event-name")
@Acknowledgment(Acknowledgment.Strategy.POST_PROCESSING)
public void processEvent(final JsonObject payload)
However, if the message body contains bad JSON that cannot be parsed, this method is never invoked and an io.vertx.core.json.DecodeException is thrown. Handling that failure calls into io.smallrye.reactive.messaging.providers.extension.HealthCenter.reportApplicationFailure(), which means the health check endpoint will produce a DOWN response. The app in question runs in k8s, so the pod gets restarted, but the new instance picks up the same message and fails the same way. The only way to deal with the issue seems to be to manually remove the bad message from the queue.
Looking in the docs, https://quarkus.io/guides/rabbitmq-reference#health-reporting suggests that a failed message should be nacked and the failure-strategy should handle it, but because the message isn't being parsed properly, it doesn't get as far as the processing, so the failure strategy isn't called.
I'm actually not certain whether this is the intended behaviour in this circumstance, whether I can do something about it, or whether it genuinely is a bug. I'm using Quarkus 2.12.0.
My expectation is that it should be possible to handle this circumstance without causing the health check to fail, and to dequeue the message so that the bad message isn't picked up again and again.

TimedOut in python-telegram-bot but message is sent

I got the following error while trying to send a message to a specific Telegram channel:
TimedOut: Timed out
The read operation timed out
The method I used from python-telegram-bot was send_message.
Although my bot got this error, it still sent the message to the channel. Because I did not catch that exception, all data from the message was lost, but I really need to remove my messages from that channel after a specific period of time.
Is it OK that the bot sent the message even though it got Timed Out? How can I prevent this from happening again, or remove this kind of message from the channel after it has been sent?
Timeout errors mean that TG didn't send a response to your send_message request quickly enough. It does not necessarily mean that the request wasn't processed, which is why the message may still be sent. However, without a response from TG, you don't have the message id of the resulting message, and it will be hard or impossible to delete it.
You can try to increase the time that PTB waits for a response from TG. This can be done in different ways:
with the timeout parameter of send_message
with Defaults.timeout, if you're using PTB's Defaults setup
by specifying it via the request_kwargs that you pass to Updater
You may want to have a look at this wiki page on networking.
Disclaimer: I'm currently the maintainer of python-telegram-bot
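Since the question mentions losing the message data because the exception was not caught, here is a minimal sketch of catching it and keeping a record either way. It assumes a hypothetical send callable that wraps the real send call and returns an object with a message_id attribute. The exception type is a parameter because it depends on the library; with python-telegram-bot it would be telegram.error.TimedOut. The function and log names are made up for illustration.

```python
import time


def send_and_record(send, sent_log, pending_log, text, timeout_exc=TimeoutError):
    """Call send(text); remember the message id, or note that the outcome is unknown.

    send         -- hypothetical callable wrapping the real send call; returns an
                    object with a message_id attribute
    timeout_exc  -- exception type that signals a timeout (an assumption here;
                    with python-telegram-bot this would be telegram.error.TimedOut)
    """
    try:
        message = send(text)
    except timeout_exc:
        # The request may still have been processed; keep the text so it is not lost.
        pending_log.append({"text": text, "at": time.time()})
        return None
    # Normal case: the id is what a later delete call needs.
    sent_log.append({"message_id": message.message_id, "at": time.time()})
    return message.message_id
```

Entries in pending_log are messages whose fate is unknown. The sketch deliberately does not retry them, because a blind retry could post the same message twice.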
After a couple of hours of reading here and there, passing timeout=30 to context.bot.send_audio, and getting an error that says unknown parameter even though send_audio's docs clearly state it takes a timeout param, I found that you can fix this by passing the timeouts to the Application when building it:
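from telegram.ext import ApplicationBuilder

# Parentheses let the builder chain span several lines.
application = (
    ApplicationBuilder()
    .token(bot_data["token"])
    .read_timeout(30)
    .write_timeout(30)
    .build()
)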
This fixed my bot. Hope this helps you as well.

How does "Publisher Returns" happen/work in Spring AMQP?

I am working on RabbitMQ integration. I have a microservice which receives messages from other services. I am currently looking into how to handle messages which encounter exceptions during processing.
The scenario could be:
1. ServiceA sends message to engine's queue.
2. Engine processes the message received.
3. During processing, engine encountered an exception (say a NullPointerException).
4. Engine returns the message to ServiceA for reprocessing.
5. ServiceA holds the message until the exception in the engine is resolved (resending to engine can be manually triggered).
I bumped into the Spring AMQP documentation about Publisher Returns but I could not totally grasp the context. I would like to know how this works and whether it could be a solution for item #4 above. Or is there another solution for this?
Thank you in advance!
For #4 on your list the solution is quite simple: don't acknowledge the message automatically, but only when the processing is finished. That way:
If the client (subscriber) dies (for whatever reason) while processing the message, then that message is re-queued (so sent to ServiceA for reprocessing in your case).
If you want to explicitly re-queue the message you can send a negative acknowledgment (search for it here).
In any case of re-queuing (manual or automatic), you should be careful that a single message that causes subscribers to die doesn't end up being processed forever by the subscriber(s). That is, make sure that the exception that happened during processing was a random event and not a guaranteed one. An example would be a message containing invalid XML: you process it, see it's invalid, handle the exception and re-queue, but then another (or the same) subscriber gets it and handles the same exception, since the content of the message and the XML inside it didn't change, and so on...
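One way to keep a guaranteed failure from looping forever is to cap the number of attempts. The following is a minimal sketch of that decision only, assuming the number of earlier failed attempts is known to the handler. How that count is tracked is left open, and handle_delivery, max_attempts and the "dead-letter" outcome are hypothetical names rather than Spring AMQP API.

```python
def handle_delivery(body, attempts, process, max_attempts=3):
    """Decide what to do with one delivery.

    Returns "ack" on success, "requeue" while attempts remain,
    and "dead-letter" once the message has failed max_attempts times.
    attempts is the number of earlier failed tries for this message
    (how that count is tracked is left open here).
    """
    try:
        process(body)
    except Exception:
        if attempts + 1 >= max_attempts:
            return "dead-letter"  # park it instead of looping forever
        return "requeue"          # negative acknowledgement with requeue
    return "ack"                  # acknowledge only after processing finished
```

With the default limit, a message that always fails is re-queued twice and then parked on its third failure, where someone can inspect it.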

Multiple acknowledge for the same delivery tag

In my project I saw that there is a chance of acknowledging the same delivery tag twice. When this happens, the consumer gets unbound from the queue and no further messages come to the consumer (Observed using the RabbitMQ management dashboard).
How can I check that a given delivery tag has already been acknowledged? Is there a recommended way to handle such a scenario using the RabbitMQ API?
I tried to avoid acknowledging twice in my code but unfortunately it is not possible due to some design issues.
The AMQP protocol reference is pretty clear about this:
A message MUST not be acknowledged more than once. The receiving peer MUST validate that a non-zero delivery-tag refers to a delivered message, and raise a channel exception if this is not the case. ...
A quick test reveals that, at least in current versions, this does not cause a consumer to stop working, but that behavior might be implementation-dependent.
In short, you would have to review your design to avoid this situation.
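If the design really cannot rule out a second ack, one stopgap is to track pending tags on the client and forward each tag at most once. This is a sketch under assumptions, not a RabbitMQ API: AckOnce is a hypothetical helper, ack_fn stands for whatever call actually sends the ack, and acks that cover multiple tags at once are not handled. Delivery tags are scoped to a channel, so one guard per channel would be needed.

```python
import threading


class AckOnce:
    """Client-side guard that forwards each delivery tag to ack_fn at most once."""

    def __init__(self, ack_fn):
        self._ack_fn = ack_fn        # e.g. the channel's real ack call (assumed)
        self._pending = set()
        self._lock = threading.Lock()

    def delivered(self, tag):
        """Record a tag when the message is received."""
        with self._lock:
            self._pending.add(tag)

    def ack(self, tag):
        """Ack the tag if it is still pending; return False for a repeat."""
        with self._lock:
            if tag not in self._pending:
                return False
            self._pending.remove(tag)
        self._ack_fn(tag)
        return True
```

A False return tells the caller that the tag was unknown or already acknowledged, so the duplicate never reaches the broker.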

MassTransit with RabbitMQ: When is a message moved to the error queue

I am using RabbitMQ version 3.0.2 and I see close to 1000 messages in the error queue. I want to know:
1. At what point are messages moved to the error queue?
2. Is there a way to know why a certain message was moved to an error queue?
3. Is there any way to move a message from the error queue to the normal queue?
Thank you
1. When a) they fail to deserialize or b) the consumer throws an exception processing that message five times.
2. Not really... If you peek at the message in the queue, the payload headers might contain a note but I don't think we did that. If you turn logging on (NLog, log4net, etc) you should be able to see the exceptions in your log. You'll have to correlate message ids at that point to figure out exactly why.
3. There is no built-in way via MassTransit, mostly because there doesn't seem to be a great, generic way to handle this. Everyone wants some process around this. Dru did create a BusDriver app (in the main MT source repo) that could be used to move messages back to the exchange in question. This default behaviour is there so you at least know things have been failing if you don't put in the infrastructure to handle it.
To add to Travis' answer: during my development I found some other reasons for messages going onto the error queue:
The published message type has no consumer
A SAGA and a consumer are expecting the same concrete message type. Even if you try and differentiate using "Accepts" and ".Selected", both a SAGA and a Consumer should not be programmed to receive the same message type.