Spring AMQP client hangs - rabbitmq

RabbitMQ server has a behavior where, once it reaches its memory high-watermark value, it stops accepting new connections and blocks operations until it rebalances itself.
The plain RabbitMQ client times out gracefully in this situation once the connection timeout elapses, but with Spring AMQP the call continues to hang.
Steps to Reproduce
- Create a RabbitMQ HA cluster
- Create a simple program which produces and consumes messages
a) Using Spring AMQP
b) Using the RabbitMQ client
- Make the RabbitMQ server reach its high-watermark value in memory so that it cannot accept any new connections or perform any operations, say for 10 min
- Create a queue and send a message from
a) Spring AMQP (it will hang)
b) RabbitMQ client (it will time out, say after 1 min, if the connection timeout is set to 1 min)
Spring Binaries Version
a) spring-rabbit-1.6.7.RELEASE.jar
b) spring-core-4.3.6.RELEASE.jar
c) spring-amqp-1.6.7.RELEASE.jar
We tried upgrading to Spring Rabbit and AMQP version 2.0.2 as well, but it didn't help.

You don't describe what your "RabbitMQ client" is, but the Java amqp-client uses classic Sockets by default, so you should get the same behavior with both (since Spring AMQP uses that client). Perhaps you are referring to some other language client.
With Java Sockets, when the connection is blocked, the thread is "stuck" in a socket write, which is neither interruptible nor subject to a timeout.
To handle this condition, you have to use the 4.0 client or above and use NIO.
Here is an example application that demonstrates the technique.
@SpringBootApplication
public class So48699178Application {

    private static Logger logger = LoggerFactory.getLogger(So48699178Application.class);

    public static void main(String[] args) {
        SpringApplication.run(So48699178Application.class, args);
    }

    @Bean
    public ApplicationRunner runner(RabbitTemplate template, CachingConnectionFactory ccf) {
        ConnectionFactory cf = ccf.getRabbitConnectionFactory();
        NioParams nioParams = new NioParams();
        nioParams.setWriteEnqueuingTimeoutInMs(20_000);
        cf.setNioParams(nioParams);
        cf.useNio();
        return args -> {
            Message message = MessageBuilder.withBody(new byte[100_000])
                    .andProperties(MessagePropertiesBuilder.newInstance()
                            .setDeliveryMode(MessageDeliveryMode.NON_PERSISTENT)
                            .build())
                    .build();
            while (true) {
                try {
                    template.send("foo", message);
                }
                catch (Exception e) {
                    logger.info(e.getMessage());
                }
            }
        };
    }

    @Bean
    public Queue foo() {
        return new Queue("foo");
    }

}
and the resulting output:
2018-02-09 12:00:29.803 INFO 9430 --- [ main] com.example.So48699178Application : java.io.IOException: Frame enqueuing failed
2018-02-09 12:00:49.803 INFO 9430 --- [ main] com.example.So48699178Application : java.io.IOException: Frame enqueuing failed
2018-02-09 12:01:09.807 INFO 9430 --- [ main] com.example.So48699178Application : java.io.IOException: Frame enqueuing failed
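For comparison, the same NIO write-enqueuing timeout can be set directly on the plain Java amqp-client. This is a minimal sketch (my addition, not from the original answer); the host is a placeholder:

import com.rabbitmq.client.Connection;
import com.rabbitmq.client.ConnectionFactory;
import com.rabbitmq.client.impl.nio.NioParams;

public class PlainClientNio {

    public static void main(String[] args) throws Exception {
        ConnectionFactory cf = new ConnectionFactory();
        cf.setHost("localhost"); // placeholder host
        cf.useNio(); // requires amqp-client 4.0 or above
        NioParams nioParams = new NioParams();
        // writes that cannot be enqueued within 20s fail with an IOException
        nioParams.setWriteEnqueuingTimeoutInMs(20_000);
        cf.setNioParams(nioParams);
        try (Connection conn = cf.newConnection()) {
            // publish as usual; a memory-blocked broker now surfaces as an
            // exception instead of a hung thread
        }
    }
}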

Parallel Flux blocking call

My application setup is described in another question, Correct way of using spring webclient in spring amqp, where I am trying to use Spring WebClient to make API calls in Spring AMQP RabbitMQ consumer threads.
The issue seems to be that the parallel flux blocking call just stalls or takes a very long time after the first few requests are fired.
To simulate this, I did the below minimalistic set up -
Dependencies used
Spring Boot 2.2.6.RELEASE
spring-boot-starter-web
spring-boot-starter-webflux
reactor-netty 0.9.14.RELEASE
As mentioned in the other linked question, below is the configuration for the WebClient -
@Bean
public WebClient webClient() {
    ConnectionProvider connectionProvider = ConnectionProvider
            .builder("fixed")
            .lifo()
            .pendingAcquireTimeout(Duration.ofMillis(200000))
            .maxConnections(100)
            .pendingAcquireMaxCount(3000)
            .maxIdleTime(Duration.ofMillis(290000))
            .build();
    HttpClient client = HttpClient.create(connectionProvider);
    // tcpConfiguration returns a new immutable HttpClient, so reassign it
    client = client.tcpConfiguration(<<connection timeout, read timeout, write timeout is set here....>>);
    WebClient.Builder builder =
            WebClient.builder().baseUrl(<<base URL>>).clientConnector(new ReactorClientHttpConnector(client));
    return builder.build();
}
Below is the @Service class with the parallel flux WebClient calls -
@Service
public class FluxtestService {

    @Autowired
    private WebClient webClient; // injected from the configuration above

    public Flux<Response> getFlux(List<Request> reqList) {
        return Flux
                .fromIterable(reqList)
                .parallel()
                .runOn(Schedulers.elastic())
                .flatMap(s -> {
                    return webClient
                            .method(HttpMethod.POST)
                            .uri(<<downstream url>>)
                            .body(BodyInserters.fromValue(s))
                            .exchange()
                            .flatMap(response -> {
                                if (response.statusCode().isError()) {
                                    return Mono.just(new Response());
                                }
                                return response.bodyToMono(Response.class);
                            });
                }).sequential();
    }
}
To simulate the Spring AMQP RabbitMQ consumer/listener, I created the @RestController below -
@RestController
public class FluxTestController {

    @Autowired
    private FluxtestService service;

    @PostMapping("/fluxtest")
    public List<Response> getFlux(@RequestBody List<Request> reqlist) {
        return service.getFlux(reqlist).collectList().block();
    }
}
I tried firing requests from JMeter with around 15 threads. The first few sets of requests are processed very quickly. While requests are being served, I can see the below set of logs in the log file -
Channel cleaned, now 32 active connections and 68 inactive connections
Once I submit more sets of requests, the active connection count keeps increasing until it reaches the configured maximum of 100; I don't see it decreasing at all. Up to this point, response time is OK.
But any subsequent requests start taking a very long time. Also, I don't see the active connections reducing much at all, even though no requests are being fired.
Also, after some time, I see the below exceptions -
reactor.netty.internal.shaded.reactor.pool.PoolAcquireTimeoutException: Pool#acquire(Duration) has been pending for more than the configured timeout of 200000 ms
This probably shows that the downstream connection is not being released. Please advise on this issue and possible fixes.
It seems the issue was that the underlying connection was not being properly released when the WebClient downstream call responded with an error status. When using "exchange" with WebClient, we need to ensure that the response body is always consumed or released; otherwise it leads to a connection leak. Below are the changes that fixed this issue -
Replace
.flatMap(response -> {
    if (response.statusCode().isError()) {
        return Mono.just(new Response());
    }
    return response.bodyToMono(Response.class);
})
with
.flatMap(response -> {
    if (response.statusCode().isError()) {
        // release the body and only then emit the fallback value
        return response.releaseBody().thenReturn(new Response());
    }
    return response.bodyToMono(Response.class);
})
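A further option (sketch only, not part of the original fix) is to use retrieve() instead of exchange(); it consumes or releases the body automatically and signals error statuses as WebClientResponseException, which can be mapped to the same fallback:

.flatMap(s -> webClient
        .method(HttpMethod.POST)
        .uri(<<downstream url>>)
        .body(BodyInserters.fromValue(s))
        .retrieve() // releases the underlying connection automatically
        .bodyToMono(Response.class)
        // map error statuses to the same fallback the exchange() version used
        .onErrorResume(WebClientResponseException.class, e -> Mono.just(new Response())))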

Concurrency > 1 is not supported by reactive consumer, given that project reactor maintains its own concurrency mechanism

I'm migrating to the new Spring Cloud Stream version.
Spring Cloud Stream 3.1.0
And I have the following consumer configuration:
someChannel-in-0:
  destination: Response1
  group: response-channel
  consumer:
    concurrency: 4
someChannel-out-0:
  destination: response2
I have connected this channel to the new function binders:
// just sample dummy code
@Bean
public Function<Flux<String>, Flux<String>> aggregate() {
    return inbound -> inbound
            .map(w -> w.toLowerCase());
}
And when I'm starting the app I'm getting the following error:
Concurrency > 1 is not supported by reactive consumer, given that project reactor maintains its own concurrency mechanism.
My question is: what is the equivalent of concurrency: 4 in Project Reactor, and how do I implement it?
Basically, Spring Cloud Stream manages consumers through a MessageListenerContainer, and it provides a hook that allows users to create a bean and inject some advanced configuration. So here is the solution if you are using RabbitMQ as the messaging middleware.
@Bean
public ListenerContainerCustomizer<AbstractMessageListenerContainer> listenerContainerCustomizer() {
    return (container, destinationName, group) -> {
        if (container instanceof SimpleMessageListenerContainer) {
            ((SimpleMessageListenerContainer) container).setConcurrency("3");
        }
    };
}
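If you also want to bound concurrency inside the reactive pipeline itself, Project Reactor exposes it as an operator argument rather than container configuration. A rough equivalent of concurrency: 4 (my sketch, reusing the dummy body from the sample above):

// sketch: flatMap's concurrency argument limits in-flight inner publishers to 4
@Bean
public Function<Flux<String>, Flux<String>> aggregate() {
    return inbound -> inbound
            .flatMap(w -> Mono.fromCallable(w::toLowerCase)
                    .subscribeOn(Schedulers.boundedElastic()), 4);
}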

How to handle errors when RabbitMQ exchange doesn't exist (and messages are sent through a messaging gateway interface)

I'd like to know what is the canonical way to handle errors in the following situation (code is a minimal working example):
Messages are sent through a messaging gateway which defines its defaultRequestChannel and a @Gateway method:
@MessagingGateway(name = MY_GATEWAY, defaultRequestChannel = INPUT_CHANNEL)
public interface MyGateway
{
    @Gateway
    public void sendMessage(String message);
}
Messages are read from the channel and sent through an AMQP outbound adapter:
@Bean
public IntegrationFlow apiMutuaInputFlow()
{
    return IntegrationFlows
            .from(INPUT_CHANNEL)
            .handle(Amqp.outboundAdapter(rabbitConfig.myTemplate()))
            .get();
}
The RabbitMQ configuration is skeletal:
@Configuration
public class RabbitMqConfiguration
{
    @Autowired
    private ConnectionFactory rabbitConnectionFactory;

    @Bean
    public RabbitTemplate myTemplate()
    {
        RabbitTemplate r = new RabbitTemplate(rabbitConnectionFactory);
        r.setExchange(INPUT_QUEUE_NAME);
        r.setConnectionFactory(rabbitConnectionFactory);
        return r;
    }
}
I generally include a bean to define the RabbitMQ configuration I'm relying upon (exchange, queues and bindings), and it actually works fine. But while testing for failure scenarios, I found a situation I don't know how to properly handle using Spring Integration. The steps are:
Remove the beans that configure RabbitMQ
Run the flow against an unconfigured, vanilla RabbitMQ instance.
What I would expect is:
The message cannot be delivered because the exchange cannot be found.
Either I get an exception from the messaging gateway on the caller thread,
or I find some way to otherwise intercept this error.
What I find:
The message cannot be delivered because the exchange cannot be found, and indeed this error message is logged every time the #Gateway method is called.
2020-02-11 08:18:40.746 ERROR 42778 --- [ 127.0.0.1:5672] o.s.a.r.c.CachingConnectionFactory : Channel shutdown: channel error; protocol method: #method<channel.close>(reply-code=404, reply-text=NOT_FOUND - no exchange 'my.exchange' in vhost '/', class-id=60, method-id=40)
The gateway is not failing, nor have I found a way to configure it to do so (e.g.: adding throws clauses to the interface methods, configuring a transactional channel, setting wait-for-confirm and a confirm-timeout).
I haven't found a way to otherwise catch that CachingConnectionFactory error (e.g.: configuring a transactional channel).
I haven't found a way to catch an error message on another channel (specified on the gateway's errorChannel), or in Spring Integration's default errorChannel.
I understand such a failure may not be propagated upstream by the messaging gateway, whose job is isolating callers from the messaging API, but I definitely expect such an error to be interceptable.
Could you point me in the right direction?
Thank you.
RabbitMQ is inherently async, which is one reason that it performs so well.
You can, however, block the caller by enabling confirms and returns and setting this option:
/**
 * Set to true if you want to block the calling thread until a publisher confirm has
 * been received. Requires a template configured for returns. If a confirm is not
 * received within the confirm timeout or a negative acknowledgment or returned
 * message is received, an exception will be thrown. Does not apply to the gateway
 * since it blocks awaiting the reply.
 * @param waitForConfirm true to block until the confirmation or timeout is received.
 * @since 5.2
 * @see #setConfirmTimeout(long)
 * @see #setMultiSend(boolean)
 */
public void setWaitForConfirm(boolean waitForConfirm) {
    this.waitForConfirm = waitForConfirm;
}
(With the DSL .waitForConfirm(true)).
This also requires a confirm correlation expression. Here's an example from one of the test cases:
@Bean
public IntegrationFlow flow(RabbitTemplate template) {
    return f -> f.handle(Amqp.outboundAdapter(template)
            .exchangeName("")
            .routingKeyFunction(msg -> msg.getHeaders().get("rk", String.class))
            .confirmCorrelationFunction(msg -> msg)
            .waitForConfirm(true));
}

@Bean
public CachingConnectionFactory cf() {
    CachingConnectionFactory ccf = new CachingConnectionFactory(
            RabbitAvailableCondition.getBrokerRunning().getConnectionFactory());
    ccf.setPublisherConfirmType(CachingConnectionFactory.ConfirmType.CORRELATED);
    ccf.setPublisherReturns(true);
    return ccf;
}

@Bean
public RabbitTemplate template(ConnectionFactory cf) {
    RabbitTemplate rabbitTemplate = new RabbitTemplate(cf);
    rabbitTemplate.setMandatory(true); // for returns
    rabbitTemplate.setReceiveTimeout(10_000);
    return rabbitTemplate;
}
Bear in mind this will slow things down considerably (similar to using transactions), so you may want to reconsider whether you want to do this on every send (unless performance is not an issue).
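If blocking every send costs too much, a non-blocking alternative (my sketch, not part of the original answer) is to register confirm and returns callbacks on the template, so nacks and unroutable messages are reported asynchronously; this assumes Spring AMQP 2.3+ for setReturnsCallback:

// sketch: asynchronous failure interception instead of blocking sends
@Bean
public RabbitTemplate callbackTemplate(CachingConnectionFactory ccf) {
    ccf.setPublisherConfirmType(CachingConnectionFactory.ConfirmType.CORRELATED);
    ccf.setPublisherReturns(true);
    RabbitTemplate template = new RabbitTemplate(ccf);
    template.setMandatory(true); // required for returned messages
    template.setConfirmCallback((correlation, ack, cause) -> {
        if (!ack) {
            // negative confirm, e.g. the channel was closed because the
            // exchange does not exist; route to your error handling here
        }
    });
    template.setReturnsCallback(returned ->
            // the broker could not route the message (no matching binding)
            System.err.println("Returned: " + returned.getReplyText()));
    return template;
}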

Spring AMQP Inbound Adapter with empty queue name

I'm developing a consumer application using Spring AMQP that receives messages from RabbitMQ. There is a topic exchange declared. To connect to RabbitMQ I create a queue with an empty name, because the broker will then provide an automatically generated queue name; see the InterCor M4 Upgraded Specifications Hybrid specifications:
@Bean
public TopicExchange exchange() {
    TopicExchange topicExchange = new TopicExchange(topicExchangeName);
    topicExchange.setShouldDeclare(false);
    return topicExchange;
}

@Bean
public Queue queue() {
    return new Queue("", queueDurable, queueExclusive, queueAutoDelete, queueParameters);
}

@Bean
public Binding binding(Queue queue, TopicExchange exchange) {
    return BindingBuilder.bind(queue).to(exchange).with(routingKey);
}
But when I try to configure an AMQP Inbound Channel Adapter using the Spring Integration Java DSL:
@Autowired
private Queue queue;

@Bean
public IntegrationFlow amqpInbound(ConnectionFactory connectionFactory) {
    return IntegrationFlows.from(Amqp.inboundAdapter(connectionFactory, queue))
            .handle(m -> System.out.println(m.getPayload()))
            .get();
}
I get the error 'queueName' cannot be null or empty:
2018-05-25 13:39:15.080 ERROR 14636 --- [erContainer#0-1] o.s.a.r.l.SimpleMessageListenerContainer : Failed to check/redeclare auto-delete queue(s).
java.lang.IllegalArgumentException: 'queueName' cannot be null or empty
at org.springframework.util.Assert.hasText(Assert.java:276) ~[spring-core-5.0.6.RELEASE.jar:5.0.6.RELEASE]
at org.springframework.amqp.rabbit.core.RabbitAdmin.getQueueProperties(RabbitAdmin.java:337) ~[spring-rabbit-2.0.3.RELEASE.jar:2.0.3.RELEASE]
at org.springframework.amqp.rabbit.listener.AbstractMessageListenerContainer.redeclareElementsIfNecessary(AbstractMessageListenerContainer.java:1604) ~[spring-rabbit-2.0.3.RELEASE.jar:2.0.3.RELEASE]
at org.springframework.amqp.rabbit.listener.SimpleMessageListenerContainer$AsyncMessageProcessingConsumer.run(SimpleMessageListenerContainer.java:963) [spring-rabbit-2.0.3.RELEASE.jar:2.0.3.RELEASE]
at java.lang.Thread.run(Thread.java:748) [na:1.8.0_162]
How can I set a value of the message queue name to an empty string?
That is not a good solution.
The problem is that with a broker-generated queue name, if the connection is lost and re-established, the queue name will change, but the container won't know about the new queue and will try to consume from the old one.
AnonymousQueue solves this problem by the framework generating the random name.
But, anonymous queues are not durable, are exclusive and are auto-delete.
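For reference (my addition, a sketch rather than part of the original answer), declaring such a framework-named queue is a one-liner:

// sketch: AnonymousQueue is non-durable, exclusive, auto-delete,
// with a framework-generated random name that survives reconnects
@Bean
public Queue queue() {
    return new AnonymousQueue();
}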
If you want a Queue with different properties to that, but still want a random name, use
@Bean
public Queue queue() {
    return new Queue(new AnonymousQueue.Base64UrlNamingStrategy().generateName(),
            queueDurable, queueExclusive, queueAutoDelete, queueParameters);
}
That way, if the connection is lost and re-established, the queue will get the same name.
The AMQP-816 issue has been fixed, and the fix is available starting with Spring Boot 2.1.0.
Updating the parent of the project fixes the issue:
<parent>
    <groupId>org.springframework.boot</groupId>
    <artifactId>spring-boot-starter-parent</artifactId>
    <version>2.1.0.RELEASE</version>
</parent>
Empty queue name
spring:
  rabbitmq:
    queue:
      name:
      durable: false
      exclusive: true
      autoDelete: true
creates an automatic queue name such as amq.gen-U1vKiSfIvy8bO11jLD29Sw.
Non-empty queue name
spring:
  rabbitmq:
    queue:
      name: abc
      durable: false
      exclusive: true
      autoDelete: true
creates a queue named abc.
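Note that spring.rabbitmq.queue.* is not a standard Spring Boot property group, so these entries are presumably bound by the application itself. A minimal sketch of one way to wire them into the Queue bean shown earlier (the binding is my assumption):

// sketch: binding the custom spring.rabbitmq.queue.* properties
@Value("${spring.rabbitmq.queue.name:}")
private String queueName;

@Value("${spring.rabbitmq.queue.durable}")
private boolean queueDurable;

@Value("${spring.rabbitmq.queue.exclusive}")
private boolean queueExclusive;

@Value("${spring.rabbitmq.queue.autoDelete}")
private boolean queueAutoDelete;

@Bean
public Queue queue() {
    return new Queue(queueName, queueDurable, queueExclusive, queueAutoDelete);
}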

Can't connect to a remote zookeeper from a Kafka producer

I've been playing with Apache Kafka for a few days, and here is my problem.
If I set up the local test described in the "quick start" section of the website, everything is fine: the Kafka producer/consumer, ZooKeeper server and Kafka broker work perfectly.
Now if I run on a remote server (let's call it node2):
- Zookeeper - port 2181
- Kafka Broker - port 9092
- kafka consumer
And then if I run from my local computer:
- kafka producer
Assuming that there is no firewall on node2, the connection ends up with a timeout.
Here is the error log:
/etc/java/jdk1.6.0_41/bin/java -Didea.launcher.port=7533 -Didea.launcher.bin.path=/home/kevin/Documents/idea-IU-123.169/bin -Dfile.encoding=UTF-8 -classpath /etc/java/jdk1.6.0_41/lib/dt.jar:/etc/java/jdk1.6.0_41/lib/tools.jar:/etc/java/jdk1.6.0_41/lib/jconsole.jar:/etc/java/jdk1.6.0_41/lib/htmlconverter.jar:/etc/java/jdk1.6.0_41/lib/sa-jdi.jar:/home/kevin/Desktop/kafka-0.7.2/examples/target/scala_2.8.0/classes:/home/kevin/Desktop/kafka-0.7.2/project/boot/scala-2.8.0/lib/scala-compiler.jar:/home/kevin/Desktop/kafka-0.7.2/project/boot/scala-2.8.0/lib/scala-library.jar:/home/kevin/Desktop/kafka-0.7.2/core/target/scala_2.8.0/classes:/home/kevin/Desktop/kafka-0.7.2/core/lib_managed/scala_2.8.0/compile/jopt-simple-3.2.jar:/home/kevin/Desktop/kafka-0.7.2/core/lib_managed/scala_2.8.0/compile/log4j-1.2.15.jar:/home/kevin/Desktop/kafka-0.7.2/core/lib_managed/scala_2.8.0/compile/zookeeper-3.3.4.jar:/home/kevin/Desktop/kafka-0.7.2/core/lib_managed/scala_2.8.0/compile/zkclient-0.1.jar:/home/kevin/Desktop/kafka-0.7.2/core/lib_managed/scala_2.8.0/compile/snappy-java-1.0.4.1.jar:/home/kevin/Desktop/kafka-0.7.2/examples/lib_managed/scala_2.8.0/compile/jopt-simple-3.2.jar:/home/kevin/Desktop/kafka-0.7.2/examples/lib_managed/scala_2.8.0/compile/log4j-1.2.15.jar:/home/kevin/Documents/idea-IU-123.169/lib/idea_rt.jar com.intellij.rt.execution.application.AppMain kafka.examples.KafkaConsumerProducerDemo
log4j:WARN No appenders could be found for logger (org.I0Itec.zkclient.ZkConnection).
log4j:WARN Please initialize the log4j system properly.
Exception in thread "Thread-0" java.net.ConnectException: Connection timed out
at sun.nio.ch.Net.connect(Native Method)
at sun.nio.ch.SocketChannelImpl.connect(SocketChannelImpl.java:532)
at kafka.producer.SyncProducer.connect(SyncProducer.scala:173)
at kafka.producer.SyncProducer.getOrMakeConnection(SyncProducer.scala:196)
at kafka.producer.SyncProducer.send(SyncProducer.scala:92)
at kafka.producer.SyncProducer.send(SyncProducer.scala:125)
at kafka.producer.ProducerPool$$anonfun$send$1.apply$mcVI$sp(ProducerPool.scala:114)
at kafka.producer.ProducerPool$$anonfun$send$1.apply(ProducerPool.scala:100)
at kafka.producer.ProducerPool$$anonfun$send$1.apply(ProducerPool.scala:100)
at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:57)
at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:43)
at kafka.producer.ProducerPool.send(ProducerPool.scala:100)
at kafka.producer.Producer.zkSend(Producer.scala:137)
at kafka.producer.Producer.send(Producer.scala:99)
at kafka.javaapi.producer.Producer.send(Producer.scala:103)
at kafka.examples.Producer.run(Producer.java:53)
Process finished with exit code 0
And here is my Producer's code :
import java.util.Properties;
import kafka.javaapi.producer.ProducerData;
import kafka.producer.ProducerConfig;

public class Producer extends Thread {

    private final kafka.javaapi.producer.Producer<String, String> producer;
    private final String topic;
    private final Properties props = new Properties();

    public Producer(String topic)
    {
        props.put("zk.connect", "node2:2181");
        props.put("connect.timeout.ms", "5000");
        props.put("socket.timeout.ms", "30000");
        props.put("serializer.class", "kafka.serializer.StringEncoder");
        props.put("producer.type", "sync");
        props.put("compression.codec", "0");
        producer = new kafka.javaapi.producer.Producer<String, String>(new ProducerConfig(props));
        this.topic = topic;
    }

    public void run() {
        String messageStr = new String("Message_test");
        producer.send(new ProducerData<String, String>(topic, messageStr));
    }
}
So I also tried switching
props.put("zk.connect", "node2:2181");
to
props.put("broker.list", "0:node2:9082");
and in that case I can connect successfully.
See item #3 in http://kafka.apache.org/faq.html
The workaround is to explicitly set the hostname property in Kafka's server.properties.
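A minimal sketch of that entry (assuming node2 is a name the producer machine can resolve; adjust to your environment):

# server.properties on the broker (Kafka 0.7.x)
hostname=node2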
You can verify this by using ZooKeeper. If you are using Kafka 0.7*, open the ZkCli console and do get /brokers/ids/0; you should get all the broker metadata. Make sure the IP address/hostname there matches the Zk connect string you are using in the producer code -
props.put("zk.connect", "node2:2181");
In my case, I was using a producer running on my local machine connecting to an Ubuntu VM (same box, different IP) and this workaround helped.