Symfony Messenger: retry delay not working with Redis transport

I have a Symfony 4 application using the Symfony Messenger component (version 4.3.2) to dispatch messages.
For asynchronous message handling, some Redis transports are configured and they work fine. But then I decided that one of them should retry a few times when message handling fails. I configured a retry strategy and the transport actually started retrying on failure, but it seems to ignore the delay configuration (the delay, multiplier and max_delay keys): all retry attempts are made without any delay, within one second or a similarly short timespan, which is really undesirable in this use case.
My Messenger configuration (config/packages/messenger.yaml) looks like this:
framework:
    messenger:
        default_bus: messenger.bus.default
        transports:
            transport_without_retry:
                dsn: '%env(REDIS_DSN)%/without_retry'
                retry_strategy:
                    max_retries: 0
            transport_with_retry:
                dsn: '%env(REDIS_DSN)%/with_retry'
                retry_strategy:
                    max_retries: 5
                    delay: 10000 # 10 seconds
                    multiplier: 3
                    max_delay: 3600000
        routing:
            'App\Message\RetryWorthMessage': transport_with_retry
I tried replacing Redis with Doctrine as the implementation of the retrying transport, and voilà: the delays started to work as expected. I therefore suspect that the Redis transport implementation doesn't support delayed retry. But I read the docs carefully, searched related GitHub issues, and still couldn't find a definitive answer.
So my question is: does the Redis transport support delayed retry? If it does, how do I make it work?

It turned out that the Redis transport does support delayed retry, but only since Messenger version 4.4.
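So the fix is upgrading the Messenger component. A minimal sketch, assuming the rest of the application's dependencies allow moving to 4.4:

composer require symfony/messenger:^4.4

The retry_strategy configuration above should then work unchanged.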

Related

ActiveMQ failover transport options not working as expected

I would like to use the ActiveMQ failover transport as described in https://activemq.apache.org/failover-transport-reference.html.
The default "retry forever" failover options work as expected.
However, since "forever" is sometimes too long, I tried to set some options in order to terminate the retry earlier.
For example, at startup I would like to terminate the application immediately if the connection to a broker cannot be established at the first attempt.
I tried the simplest option:
failover:tcp://localhost:61616?startupMaxReconnectAttempts=0
but to my surprise, the retry goes on "forever" nevertheless.
I have tried many other combinations of options, like
failover:tcp://localhost:61616?startupMaxReconnectAttempts=0&maxReconnectDelay=10&maxReconnectAttempts=0&timeout=10
but without the desired result.
What am I doing wrong? How can I configure the failover transport such that it will terminate reconnection attempts at startup if a broker is not available?
I am using ActiveMQ version 5.15.9 (https://hub.docker.com/r/rmohr/activemq) and the Apache.NMS.ActiveMQ lib version 1.8.
The relevant code snippet is:
var factory = new ConnectionFactory(connectionString);
var connection = factory.CreateConnection();
var session = connection.CreateSession(); // hangs here
There is an Apache.NMS.ActiveMQ-specific URI configuration reference (https://activemq.apache.org/components/nms/providers/activemq/uri-configuration) which is not consistent with https://activemq.apache.org/failover-transport-reference.html, and that inconsistency brings a lot of confusion.
Following the NMS documentation I came up with a working solution:
failover:(tcp://localhost:61616)?transport.startupMaxReconnectAttempts=1
- The composite URI must be in parentheses: failover:(tcp://localhost:61616)?... and not failover:tcp://localhost:61616?....
- Transport-specific options must be prefixed with transport.
- The option transport.startupMaxReconnectAttempts=0 corresponds to infinite retries, which is why my original attempts kept reconnecting forever.
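For completeness, a minimal sketch of the earlier snippet with the corrected URI (same Apache.NMS.ActiveMQ API as in the question):

var connectionString =
    "failover:(tcp://localhost:61616)?transport.startupMaxReconnectAttempts=1";
var factory = new ConnectionFactory(connectionString);
using (var connection = factory.CreateConnection())
using (var session = connection.CreateSession())
{
    // With startupMaxReconnectAttempts=1 this now fails fast with an NMS
    // connection exception when no broker is reachable, instead of hanging.
}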

Why is "await Publish<T>" hanging / not completing / not finishing

The following piece of code has been working for some time and it has suddenly stopped returning:
await availableChangedPublishEndpoint
    .Publish<IAvailableStockChanged>(
        AvailableStockCounter.ConvertSkuQtyToAvailableStockChangedEvent(
            newAvailable,
            absMessage.Warehouse)
    );
There is nothing clever in ConvertSkuQtyToAvailableStockChangedEvent - it just maps one simple class to another.
We added logs before and after this code and it's definitely stopping at this point. Other systems are publishing fine, and other messages are being sent from this application (e.g. logs are actually sent via RabbitMQ). We have redeployed and we have upgraded to the latest MassTransit version. We can see that the messages are being published (possibly multiple times), but this Publish method never returns.
We had a broken RabbitMQ node and a clean service restart on one node fixed it. I appreciate there might be other reasons for this behaviour, but this was our problem.
systemctl restart rabbitmq-server
Looking further into RabbitMQ, we saw that some of the empty queues connected to this exchange were not synchronized, and when we tried to synchronize them it wouldn't work.
We also couldn't delete some of these unsynchronized queues.
We believe an unexpected shutdown of one of the nodes had caused this problem - but it left most queues / exchanges completely OK.
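As a diagnostic sketch, assuming classic mirrored queues, the synchronisation state can be inspected and (where the broker cooperates) repaired with rabbitmqctl:

# list queues together with the pids of their synchronised mirrors
rabbitmqctl list_queues name synchronised_slave_pids

# ask the broker to synchronise a lagging queue explicitly
rabbitmqctl sync_queue <queue-name>

# overall view of the cluster and its running nodes
rabbitmqctl cluster_status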

Error while running Mule

While running Mule, I am facing the below error:
Timeout waiting for mule context to be completely started
Please let me know a workaround for this. The same integration works fine on another system running Mule (i.e. the query fetching happens fine there), but it is not working on my system. Please suggest a way to overcome this.
Thanks in advance!
Goutham, did you configure a timeout in your flow? If it is configured:
1. Is it configured in MUnit? Then we need to look into the run-and-wait scope.
2. Or is this coming during the shutdown of Mule?
You can set a timeout value to enable the current flow to complete. However, there is no built-in method or utility to check which messages are in flight. You can connect a profiler and look at the active threads (or just take a thread dump); this should give you an overview of what's happening at the JVM level.
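For example, a plain thread dump with the JDK's jstack tool (the process id is whatever the Mule JVM runs under on your machine):

jstack <mule-jvm-pid> > mule-threads.txt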
To ensure all in-flight messages are processed, you can shut down Mule in two steps:
1. Stop the flow(s) manually (this prevents new messages from coming in)
2. Stop Mule
Alternatively, you can set shutdownTimeout to a value in milliseconds for a flow; however, this is not a global value.
https://docs.mulesoft.com/mule-user-guide/v/3.8/starting-and-stopping-mule-esb
http://grepcode.com/file/repo1.maven.org/maven2/org.mule/mule-core/3.7.0/org/mule/transport/AbstractMessageDispatcher.java
The second link shows the internal implementation of Mule's AbstractMessageDispatcher. Hope this helps.
Thanks

Handling cache warm-up with twisted and systemd

I have a simple twisted application which I run using a systemd service, executing a script, which subsequently executes a .tac file.
The application is structured as a JSON RPC endpoint (fastjsonrpc), built into a t.w.r.Resource, which is in a t.w.s.Site, served by a t.a.i.TCPServer, and the whole thing packed into a t.a.Application. This works fine.
Where I do run into trouble is when I try to warm up caches at startup. This warm-up process is pretty slow (~300 seconds), and makes systemd timeout and kill the process. Increasing the timeout is not really a viable option, since I wouldn't want this to block system boot.
Analogous code is used in a separate stack running on Flask from within Apache and wsgi. That server starts itself off and lets systemd go on while it takes its time building the caches. This behaviour is fine for me.
I've tried calling the warmup function using the following within the setup function of the t.w.r.Resource:
reactor.callLater(1, ep.warmup, None)
I've not yet tried using this from within systemd, and have been testing it from twistd directly on the command line. The server does work as expected, however it no longer responds to SIGINT (^C). Removing the callLater is all that's needed to let the server respond to SIGINT.
If the warmup function is called directly (not by callLater, i.e., the arrangement which makes systemd give up while waiting for warm up to complete), the resulting server also continues to respond to SIGINT.
Is there a better / good way to handle this sort of long-running warmup code?
Why would twistd / the reactor not respond to SIGINT? Am I missing something here?
Twisted is a single-threaded thing. It sounds like your "cache warmup" code is blocking the reactor for those 300 seconds, which would also explain why it stops responding to SIGINT: while the reactor is blocked it cannot process the signal. One easy way to fix this would be using deferToThread to let the warm-up run without blocking the reactor.
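A minimal sketch of that approach (build_caches is a hypothetical stand-in for the real warm-up routine):

from twisted.internet import reactor
from twisted.internet.threads import deferToThread
from twisted.python import log

def warmup():
    # The blocking ~300 s cache build runs in the reactor's thread pool,
    # so the event loop keeps serving requests and handling SIGINT.
    build_caches()  # hypothetical stand-in for the real warm-up

def start_warmup():
    deferToThread(warmup).addErrback(log.err)

# e.g. instead of reactor.callLater(1, ep.warmup, None):
reactor.callWhenRunning(start_warmup)

This also lets systemd consider the service started right away while the caches fill in the background.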

How to be sure that channel.basic_publish has succeeded (internet connection error, ...)?

I'm doing this:
channel.basicPublish("myexchange", "routing", MessageProperties.PERSISTENT_TEXT_PLAIN,
        "message".getBytes());
I would like to retry sending the message later if the publish didn't succeed (connection loss, ...), but basicPublish is a void method and there is no callback among the arguments.
Any idea?
You are looking for an HA client; by default you have to implement the feature yourself.
For Java there is https://github.com/joshdevins/rabbitmq-ha-client (it's a bit old, but I think it still works).
Anyway, if you want to implement the functionality yourself, you have to catch the exception and retry later. If the client loses the connection, you should reconnect it before re-sending the message.
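A minimal sketch of that catch-and-retry idea, with a hypothetical attempt count and back-off (and assuming reconnection is handled elsewhere between attempts):

import java.io.IOException;
import com.rabbitmq.client.Channel;
import com.rabbitmq.client.MessageProperties;

// Hypothetical helper: try the publish a few times with a growing delay.
boolean publishWithRetry(Channel channel, byte[] body) throws InterruptedException {
    for (int attempt = 1; attempt <= 3; attempt++) {
        try {
            channel.basicPublish("myexchange", "routing",
                    MessageProperties.PERSISTENT_TEXT_PLAIN, body);
            return true; // the write raised no I/O error
        } catch (IOException e) {
            // connection problem: back off, reconnect if needed, retry
            Thread.sleep(1000L * attempt);
        }
    }
    return false; // give up for now; keep the message for a later attempt
}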
Since version 3.3.0, automatic reconnection is implemented by default in the Java client; from the release notes:
java client
enhancements
14587 support automatically reconnecting to server(s) if connection is
interrupted
This point is very important if you want to send the messages sequentially.
A simple solution is to put the messages in a client-side list and remove each message from the list only once it has been sent correctly.
I think you will also find Publisher Acknowledgements interesting.
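As a sketch of how publisher confirms look with the Java client (confirmSelect and waitForConfirmsOrDie are the standard confirm APIs; the 5 s timeout is an arbitrary example):

// Put the channel into confirm mode once, right after creating it.
channel.confirmSelect();

channel.basicPublish("myexchange", "routing",
        MessageProperties.PERSISTENT_TEXT_PLAIN, "message".getBytes());

// Block until the broker confirms the message (or throw after 5 s),
// so a failed publish can be detected and the message re-sent later.
channel.waitForConfirmsOrDie(5000);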