We are facing an issue while producing a message to an ActiveMQ 5.15.4 broker.
The thread trying to produce the message is blocked indefinitely:
Thread 464: (state = BLOCKED)
- java.lang.Object.wait(long) #bci=0 (Compiled frame; information may be imprecise)
- org.apache.activemq.transport.failover.FailoverTransport.oneway(java.lang.Object) #bci=370, line=620 (Compiled frame)
- org.apache.activemq.transport.MutexTransport.oneway(java.lang.Object) #bci=12, line=68 (Interpreted frame)
It seems that the FailoverTransport object is waiting to get a valid connection (transport object not null), but the reconnection task is never launched.
Any idea how we could end up in that situation and how to fix it?
In the end we took several actions, and we have not seen the issue since:
Switch from CachingConnectionFactory to PooledConnectionFactory
Use a dedicated connection factory for message senders, distinct from the one used by the listeners
Define one JmsTemplate per destination, and set the defaultDestination only once (we used to have a single JmsTemplate, with the destination passed to the send method)
I have a very basic demo application for testing the RabbitMQ blocking behaviour. I use RabbitMQ 3.10.6 with the .NET library RabbitMQ.Client 6.2.4 in .NET Framework 4.8.
The connection is created using ConnectionFactory.CreateConnection() and uses AutomaticRecoveryEnabled = true.
The application creates one channel and one queue for sending messages:
IModel sendChannel = Connection.CreateModel();
sendChannel.ConfirmSelect();
sendChannel.QueueDeclare("sendQueueName", true, false, false);
For receiving messages, again one channel and one queue are created:
IModel receiveChannel = Connection.CreateModel();
receiveChannel.ConfirmSelect();
receiveChannel.QueueDeclare("receiveQueueName", true, false, false);
var receiveQueueConsumer = new QueueConsumer(receiveChannel); // This is my own class which inherits from 'DefaultBasicConsumer' and passes 'receiveChannel' to its base in the constructor.
receiveChannel.BasicConsume("receiveQueueName", false, receiveQueueConsumer);
Now I fill my disk until the configured threshold in the RabbitMQ config file is reached.
As expected, the ConnectionBlocked event is fired. The connection now is in state "blocking".
Now I queue a message. AMQP properties are added to the message using channel.CreateBasicProperties() with Persistent = true. It is then queued:
sendChannel.BasicPublish("", "sendQueueName", amqpProperties, someBytes);
sendChannel.WaitForConfirms(TimeSpan.FromSeconds(5)); // Returns as expected after 5 seconds with return value 'true'.
The connection now is in state "blocked".
Now I shut down my demo application and find that disposing does not work as expected.
sendChannel.Close(); // Blocks for 10 seconds.
if (receiveChannel.IsOpen) receiveChannel.BasicCancel(ConsumerTags.First()); // In 'receiveQueueConsumer'. Throws a 'TimeoutException'.
Connection.Close(); // Freezes for at least a minute.
The behaviour is the same when calling Dispose() or Abort() instead of Close(). When I finally force kill the application (or when I set a timeout for Abort()) then the application closes but the underlying connection and channels are not removed. The connection still is in state "blocked".
At least, once there is enough space on the disk again, the broker automatically removes the blocked connections and their channels, without needing a restart.
Here and here it sounds like the broker just won't react when it is "blocked".
There can be a broad number of reasons for a timeout, from a genuine connection interruption to a resource alarm in effect that prevents target node from reading any data coming from clients unless the alarm clears.
Nodes will temporarily block publishing connections by suspending reading from client connection.
That would mean I can't free my resources unless I restart the broker or make sure the broker has enough resources to clear the resource alarm. Is there an official confirmation for this? Or how do I need to adjust the dispose mechanism to make it work when the broker is blocked?
Setting up a CMS consumer with a listener involves two separate calls: first, acquiring a consumer:
cms::MessageConsumer* cms::Session::createConsumer( const cms::Destination* );
and then, setting a listener on the consumer:
void cms::MessageConsumer::setMessageListener( cms::MessageListener* );
Could messages be lost if the implementation subscribes to the destination (and receives messages from the broker/router) before the listener is activated? Or are such messages queued internally and delivered to the listener upon activation?
Why isn't there an API call to create the consumer with a listener as a construction argument? (Is it because the JMS spec doesn't have it?)
(Addendum: this is probably a flaw in the API itself. A more logical order would be to instantiate a consumer from a session, and have a cms::Consumer::subscribe( cms::Destination*, cms::MessageListener* ) method in the API.)
I don't think the API is flawed necessarily. Obviously it could have been designed a different way, but I believe the solution to your alleged problem comes from the start method on the Connection object (inherited via Startable). The documentation for Connection states:
A CMS client typically creates a connection, one or more sessions, and a number of message producers and consumers. When a connection is created, it is in stopped mode. That means that no messages are being delivered.
It is typical to leave the connection in stopped mode until setup is complete (that is, until all message consumers have been created). At that point, the client calls the connection's start method, and messages begin arriving at the connection's consumers. This setup convention minimizes any client confusion that may result from asynchronous message delivery while the client is still in the process of setting itself up.
A connection can be started immediately, and the setup can be done afterwards. Clients that do this must be prepared to handle asynchronous message delivery while they are still in the process of setting up.
This is the same pattern that JMS follows.
In any case I don't think there's any risk of message loss regardless of when you invoke start(). If the consumer is using an auto-acknowledge mode then messages should only be automatically acknowledged once they are delivered synchronously via one of the receive methods or asynchronously through the listener's onMessage. To do otherwise would be a bug in my estimation. I've worked with JMS for the last 10 years on various implementations and I've never seen any kind of condition where messages were lost related to this.
If you want to add consumers after you've already invoked start() you could certainly call stop() first, but I don't see any problem with simply adding them on the fly.
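To make the "stopped mode" convention quoted above concrete, here is a toy model in plain Java. It is not the CMS or JMS API and says nothing about how a particular provider is implemented; the class `StoppedModeModel` and its methods are made up for illustration. It only shows the ordering argument: anything that arrives before the listener is set and before start() is held back rather than dropped.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;
import java.util.function.Consumer;

/** Toy model only: hypothetical names, not the CMS or JMS API. */
final class StoppedModeModel {
    private final Queue<String> pending = new ArrayDeque<>();
    private Consumer<String> listener;   // null until setMessageListener is called
    private boolean started;             // a new connection begins in stopped mode

    synchronized void setMessageListener(Consumer<String> l) { listener = l; drain(); }

    synchronized void start() { started = true; drain(); }

    /** Simulates a message arriving from the broker. */
    synchronized void arrive(String message) { pending.add(message); drain(); }

    // Deliver only when the connection is started and a listener is present.
    private void drain() {
        while (started && listener != null && !pending.isEmpty()) {
            listener.accept(pending.poll());
        }
    }

    static List<String> demo() {
        StoppedModeModel consumer = new StoppedModeModel();
        List<String> received = new ArrayList<>();
        consumer.arrive("m1");                      // arrives before any listener exists
        consumer.setMessageListener(received::add); // still stopped: nothing is delivered
        consumer.arrive("m2");
        consumer.start();                           // m1 and m2 are delivered now, in order
        consumer.arrive("m3");                      // delivered immediately
        return received;
    }
}
```

In this model `demo()` ends with all three messages delivered in arrival order, which matches the expectation that an unacknowledged message is held, not lost.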
I have multiple durable subscribers listening to a durable topic. Say all the subscribers are configured with a concurrency of '2-8'.
Among these subscribers, one cannot process messages because of a runtime dependency (say, an external service is unavailable), so it throws a custom RuntimeException to let ActiveMQ redeliver the message 7 times (default). What I see in the ActiveMQ administrative console is far too many redelivery attempts for this particular subscriber. I also see the dequeue count increase drastically: for a single message it rises by more than 36, and not consistently. Why is that? Am I doing anything wrong?
My Listener factory configuration
DefaultJmsListenerContainerFactory factory = new DefaultJmsListenerContainerFactory();
factory.setMessageConverter(messageConverter());
factory.setConnectionFactory(connectionFactory);
factory.setPubSubDomain(true);
factory.setSubscriptionDurable(true);
factory.setSessionTransacted(true);
factory.setSessionAcknowledgeMode(Session.SESSION_TRANSACTED);
factory.setConcurrency("2-8");
factory.setClientId("topicListener2");
return factory;
I searched for a solution to my problem but could not find one on Stack Overflow.
Issue
When a user tries to declare a queue or exchange in a corner case where the RabbitMQ server is having some issue, the client keeps waiting without any timeout, so the thread calling RabbitMQ remains in the waiting state forever.
Below is the stack trace:
java.lang.Thread.State: WAITING (on object monitor)
at java.lang.Object.wait(Native Method)
at java.lang.Object.wait(Object.java:502)
at com.rabbitmq.utility.BlockingCell.get(BlockingCell.java:50)
- locked <0x00000007bb0464c8> (a com.rabbitmq.utility.BlockingValueOrException)
at com.rabbitmq.utility.BlockingCell.uninterruptibleGet(BlockingCell.java:89)
- locked <0x00000007bb0464c8> (a com.rabbitmq.utility.BlockingValueOrException)
at com.rabbitmq.utility.BlockingValueOrException.uninterruptibleGetValue(BlockingValueOrException.java:33)
at com.rabbitmq.client.impl.AMQChannel$BlockingRpcContinuation.getReply(AMQChannel.java:343)
at com.rabbitmq.client.impl.AMQChannel.privateRpc(AMQChannel.java:216)
at
(AMQChannel.java:118)
at com.rabbitmq.client.impl.ChannelN.queueDeclare(ChannelN.java:833)
at com.rabbitmq.client.impl.ChannelN.queueDeclare(ChannelN.java:61)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.springframework.amqp.rabbit.connection.CachingConnectionFactory$CachedChannelInvocationHandler.invoke(CachingConnectionFactory.java:917)
- locked <0x00000007bb555300> (a java.lang.Object)
at com.sun.proxy.$Proxy293.queueDeclare(Unknown Source)
at org.springframework.amqp.rabbit.core.RabbitAdmin.declareQueues(RabbitAdmin.java:575)
at org.springframework.amqp.rabbit.core.RabbitAdmin.access$200(RabbitAdmin.java:66)
at org.springframework.amqp.rabbit.core.RabbitAdmin$12.doInRabbit(RabbitAdmin.java:504)
at org.springframework.amqp.rabbit.core.RabbitTemplate.doExecute(RabbitTemplate.java:1456)
at org.springframework.amqp.rabbit.core.RabbitTemplate.execute(RabbitTemplate.java:1412)
at org.springframework.amqp.rabbit.core.RabbitTemplate.execute(RabbitTemplate.java:1388)
at org.springframework.amqp.rabbit.core.RabbitAdmin.initialize(RabbitAdmin.java:500)
at org.springframework.amqp.rabbit.core.RabbitAdmin$11.onCreate(RabbitAdmin.java:419)
at org.springframework.amqp.rabbit.connection.CompositeConnectionListener.onCreate(CompositeConnectionListener.java:33)
at org.springframework.amqp.rabbit.connection.CachingConnectionFactory.createConnection(CachingConnectionFactory.java:553)
- locked <0x00000007bb057828> (a java.lang.Object)
at org.springframework.amqp.rabbit.core.RabbitTemplate.doExecute(RabbitTemplate.java:1431)
at org.springframework.amqp.rabbit.core.RabbitTemplate.execute(RabbitTemplate.java:1412)
at org.springframework.amqp.rabbit.core.RabbitTemplate.execute(RabbitTemplate.java:1388)
at org.springframework.amqp.rabbit.core.RabbitAdmin.declareQueue(RabbitAdmin.java:207)
Any help will be highly appreciated. Declaration of queues is currently in the postconstruct of beans that call our messaging component, so the hang prevents any further beans from being created.
UPDATE
The issue occurred again on our prod server. Connecting via amqp-client-3.4.2 directly seems to work, but going through spring-rabbit-1.6.7.RELEASE and spring-amqp-1.6.7.RELEASE does not.
Via amqp-client-3.4.2
ConnectionFactory factory = new ConnectionFactory();
factory.setHost("<<HOST NAME>>");
factory.setUsername("<<USERNAME>>");
factory.setPassword("<<PASSWORD>>");
factory.setVirtualHost("<<VIRTUAL HOST>>");
Connection connection = factory.newConnection();
Channel channel = connection.createChannel();
channel.queueDeclare(QUEUE_NAME, true, false, false, null);
Code flow with rabbit-amqp client
Spring way which is not working
CachingConnectionFactory factory = new CachingConnectionFactory();
factory.setHost("<<HOST NAME>>");
factory.setUsername("<<USERNAME>>");
factory.setPassword("<<PASSWORD>>");
factory.setVirtualHost("<<VIRTUAL HOST>>");
RabbitAdmin admin = new RabbitAdmin(factory);
Queue queue = new Queue(QUEUE_NAME);
admin.declareQueue(queue);
Code flow with spring amqp
This issue occurs rarely and we are still trying to figure out the reason behind this behavior. We tried setting a connection timeout, but that did not work in our test program.
On debugging further, it looks like an exception is preventing the notification from being sent back to our code. For "client not found" kinds of issues, we get the exception properly.
We are using RabbitMQ 3.6.10 and Erlang 19.3.4 on CentOS Linux 7 (Core)
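Independent of the root cause, one generic way to keep the calling thread from waiting forever is to run the blocking call on a helper thread and wait for it with a deadline. The sketch below uses only java.util.concurrent; `BoundedCall` and `callWithTimeout` are hypothetical names, not part of amqp-client or Spring AMQP, and the sleep stands in for the declare call. Note the limitation: the trace above shows an uninterruptible wait, so in that situation the helper thread would stay stuck and keep any monitors it holds. Only the caller is released.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

/** Hypothetical helper: bounds how long the caller waits for a blocking call. */
final class BoundedCall {

    /** Runs the call on a helper thread and stops waiting after timeoutMillis. */
    static <T> T callWithTimeout(Callable<T> call, long timeoutMillis) throws Exception {
        ExecutorService helper = Executors.newSingleThreadExecutor(r -> {
            Thread t = new Thread(r, "bounded-call");
            t.setDaemon(true);   // a stuck helper must not keep the JVM alive
            return t;
        });
        Future<T> future = helper.submit(call);
        try {
            return future.get(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (ExecutionException e) {
            // Surface the original failure instead of the wrapper where possible.
            Throwable cause = e.getCause();
            throw (cause instanceof Exception) ? (Exception) cause : e;
        } finally {
            future.cancel(true); // interrupts the helper; no effect on an uninterruptible wait
            helper.shutdownNow();
        }
    }

    /** Demo: the sleep stands in for a declare that may hang. */
    static String demo(long workMillis, long timeoutMillis) {
        try {
            return callWithTimeout(() -> {
                Thread.sleep(workMillis);
                return "declared";
            }, timeoutMillis);
        } catch (TimeoutException e) {
            return "timed out";
        } catch (Exception e) {
            return "failed: " + e;
        }
    }
}
```

In real code the `Callable` would wrap the `admin.declareQueue(queue)` call, and the timeout branch would log and fail the startup step instead of returning a string.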
Declaration of queues is currently in my postconstruct of beans
I can't speak to the hang but you should NEVER interact with the broker from post construct, afterPropertiesSet() etc. It is too early in the application context lifecycle.
There are several workarounds: implement SmartLifecycle, return true from isAutoStartup(), and put the bean in an early phase (see Phased). start() will be called after the application context is fully created.
However, it's generally better to just define the queues, bindings etc as beans and let the framework take care of doing all the declarations for you.
I had something semi-similar happen, which I'll share in case it helps anyone.
It appears to me that a call to "rabbitAdmin.declareQueue" will wait for any ongoing publisher-confirm callbacks to complete. I couldn't find this documented anywhere but this was the behaviour I witnessed.
In my case, a separate thread (Thread #2) was processing a publisher-confirmation while Thread #1 was trying to declare a queue (and hanging). Thread #1 was waiting for Thread #2 to complete, but in my case I had a deadlock due to some funky database-locking I was doing which actually caused Thread #2 to also wait for Thread #1 to complete.
The solution was for me to stop doing significant processing in publisher-confirmation callbacks. In my callback, I actually just launch yet another thread to do the real processing. This allows my publisher-confirmation callback to return almost-immediately, releasing any potential deadlocks.
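A minimal sketch of that hand-off, using only java.util.concurrent. The class `ConfirmHandoff` and its `onConfirm` method are illustrative names, not the Spring AMQP callback interface; the idea is that the method invoked from the confirm callback only enqueues the work and returns.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.function.BiConsumer;

/** Hypothetical handler: moves confirm processing off the callback thread. */
final class ConfirmHandoff {
    // Dedicated pool so slow work never runs on the thread that delivers confirms.
    private final ExecutorService workers = Executors.newFixedThreadPool(2);
    private final BiConsumer<String, Boolean> processor;

    ConfirmHandoff(BiConsumer<String, Boolean> processor) {
        this.processor = processor;
    }

    /** Call this from the confirm callback; it only enqueues the work and returns. */
    void onConfirm(String correlationId, boolean ack) {
        workers.submit(() -> processor.accept(correlationId, ack));
    }

    void shutdown() throws InterruptedException {
        workers.shutdown();
        workers.awaitTermination(5, TimeUnit.SECONDS);
    }

    /** Demo: true if onConfirm returned while the slow work was still pending. */
    static boolean demo() {
        CountDownLatch release = new CountDownLatch(1);
        CountDownLatch finished = new CountDownLatch(1);
        ConfirmHandoff handoff = new ConfirmHandoff((id, ack) -> {
            try {
                release.await();      // stands in for database locking or other slow work
                finished.countDown();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        try {
            handoff.onConfirm("msg-1", true);
            boolean returnedEarly = finished.getCount() == 1; // work cannot have finished yet
            release.countDown();                               // let the slow work complete
            boolean completed = finished.await(2, TimeUnit.SECONDS);
            handoff.shutdown();
            return returnedEarly && completed;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }
}
```

A fixed pool rather than a new thread per confirm keeps the number of threads bounded when confirms arrive in bursts; that is a design choice of this sketch, not something the original setup used.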
We are facing a random issue with ActiveMQ and its consumers. We observe that a few consumers are not receiving messages even though they are connected to the ActiveMQ queue. It works fine again after the consumer is restarted.
We have a queue named testQueue on the ActiveMQ side. A consumer is trying to dequeue messages from this queue. We are using Spring's DefaultMessageListenerContainer for this purpose. The message is delivered to the consumer node by the ActiveMQ broker. The tcpdump also shows that the message reaches the consumer node, but the actual consumer code does not see it. In other words, the message seems to be stuck either in the ActiveMQ consumer code or in Spring's DefaultMessageListenerContainer.
Please refer to the figure below for more clarity on the issue. The message reaches the consumer node but not the “Actual Consumer Class”, which means it got stuck either in the AMQ consumer code or in Spring's DMLC.
Below are the details captured from ActiveMQ admin.
Queue-Name /Pending-Message-Count /Consumer-Count /Messages-Enqueued /Messages-Dequeued
testQueue /9 /1 /9 /0
Below are the more details.
Connection-ID /SessionId /Selector /Enqueues /Dequeues /Dispatched /Dispatched-Queue /Prefetch
ID:bearsvir52-45176-1375519181268-3:5 /1 / /9 /0 /9 /9 /250
From the second table it is obvious that messages are being delivered to the consumer, but the consumer is not acknowledging them. Hence the messages are stuck in the Dispatched-Queue on the broker side.
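The reading of the second table can be checked with a little arithmetic. The sketch below plugs in the numbers from that table; `DispatchCheck` and its methods are illustrative names, and it assumes the usual ActiveMQ meaning of the columns (Dispatched = sent to the consumer, Dequeues = acknowledged, Prefetch = maximum unacknowledged messages the broker will push).

```java
/** Hypothetical helper for reading the subscription counters. */
final class DispatchCheck {

    /** Messages handed to the consumer but not yet acknowledged. */
    static long unacked(long dispatched, long dequeued) {
        return dispatched - dequeued;
    }

    /** How many more messages the broker may push before the prefetch limit is hit. */
    static long prefetchHeadroom(long prefetch, long unacked) {
        return Math.max(0, prefetch - unacked);
    }

    public static void main(String[] args) {
        long unacked = unacked(9, 0);                   // Dispatched = 9, Dequeues = 0
        long headroom = prefetchHeadroom(250, unacked); // Prefetch = 250
        System.out.println("unacked=" + unacked + ", headroom=" + headroom);
    }
}
```

All 9 dispatched messages are unacknowledged, which matches the Dispatched-Queue size of 9, and the prefetch limit of 250 is nowhere near exhausted. So the broker is not holding anything back; the messages are sitting on the consumer side.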
A few points to note:
1) There is no time difference between the broker node and the consumer node.
2) We observed the tcpdump on the consumer side. We can see the MessageDispatch (OpenWire) packet being transferred to the consumer node, but could not find the corresponding MessageAck (OpenWire).
3) Sometimes it works on a node, and sometimes the problem appears on the same node.
One cause of this can be incorrectly using a CachingConnectionFactory (with cached consumers) with a listener container that dynamically adjusts the consumers (max consumers > consumers). You can end up with a cached consumer just sitting in the pool and not being actively used. You never need to cache consumers with a listener container.
For problems like this, I generally recommend running with TRACE logging and you can see all the consumer activity.
It took a lot of time to figure out the solution. There seems to be an issue in the org.apache.activemq.ActiveMQConnection class in the case of an AMQ failover: the connection object is not started on the consumer side in such cases.
The following is the fix I added to ActiveMQConnection.java, after which I compiled the sources to create activemq-core-x.x.x.jar:
private final Object startMutex = new Object();
I also added a check in the createSession method:
public Session createSession(boolean transacted, int acknowledgeMode) throws JMSException {
synchronized (startMutex) {
if(!isStarted()) {
start();
}
}