I am trying to tune a 3-node RabbitMQ cluster (there is no separate load balancer in the configuration) by setting ha-sync-batch-size. After experimenting with it, I am observing that failover latency actually seems to increase when a batch size is set. The default value (every message) seems to work better, with a faster switchover to a new master node. Is this the general observation, or are there other considerations for setting the batch size?
To test, I am using a load tester with 20-50 concurrent users and shutting down one node at a time, starting with the master node. Typically, a few messages error out and then the new master node kicks in. Is there a better way of reducing the window for the new master election? Any feedback would be appreciated.
I think I found an answer. I failed to mention that I am using RabbitTemplate. I wired a RetryTemplate into it as follows:
@Bean
public RabbitTemplate templateFactory() {
    log.debug("Creating a template factory.....");
    RabbitTemplate r = new RabbitTemplate(connectionFactory);
    r.setExchange(rabbitExchange);
    r.setRoutingKey(rabbitBinding);
    r.setRetryTemplate(retryTemplate());
    return r;
}
@Bean
public RetryTemplate retryTemplate() {
    RetryTemplate retryTemplate = new RetryTemplate();
    // Back off 500ms initially, growing by a factor of 10 but capped at 5000ms
    ExponentialBackOffPolicy backOffPolicy = new ExponentialBackOffPolicy();
    backOffPolicy.setInitialInterval(500);
    backOffPolicy.setMultiplier(10.0);
    backOffPolicy.setMaxInterval(5000);
    retryTemplate.setBackOffPolicy(backOffPolicy);
    // Give up after 3 attempts
    SimpleRetryPolicy policy = new SimpleRetryPolicy();
    policy.setMaxAttempts(3);
    retryTemplate.setRetryPolicy(policy);
    return retryTemplate;
}
And Spring quite reliably retries (3 times in this case), which is enough to get the request through. If anybody has a better idea, please post. Thanks.
I have a consumer:
@Bean
public Function<Flux<Message<byte[]>>, Mono<Void>> myReactiveConsumer() {
    return flux ->
        flux.doOnNext(this::processMessage)
            .doOnError(this::isRepetableError, message -> sendToTimeoutQueue(message)) // what I want, roughly
            .doOnError(this::allOtherErrors, message -> sendToDlq(message))            // what I want, roughly
            .then();
}
In case of a deterministic error I want the message to be sent to a dead letter queue,
but if the error isn't deterministic, I want the message to be sent to a specific timeout queue (depending on how many times it has failed).
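Roughly, the per-message handling I am after would look something like the sketch below (only a sketch of intent: StreamBridge, the timeout-queue and dlq binding names, and the shape of the isRepetableError(Throwable) check are assumptions on my part, not working code I have):

import java.util.function.Function;
import org.springframework.cloud.stream.function.StreamBridge;
import org.springframework.context.annotation.Bean;
import org.springframework.messaging.Message;
import reactor.core.publisher.Flux;
import reactor.core.publisher.Mono;

@Bean
public Function<Flux<Message<byte[]>>, Mono<Void>> myReactiveConsumer(StreamBridge streamBridge) {
    return flux -> flux
        // handle each message individually so a failure can be routed per message
        .concatMap(message -> Mono.fromRunnable(() -> processMessage(message))
            .onErrorResume(error -> {
                if (isRepetableError(error)) {
                    streamBridge.send("timeout-queue", message); // assumed binding name
                } else {
                    streamBridge.send("dlq", message);           // assumed binding name
                }
                return Mono.empty();
            }))
        .then();
}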
I have tried configuring a RetryTemplate, but it doesn't seem to give me enough information to redirect the message to a different queue:
@StreamRetryTemplate
public RetryTemplate myRetryTemplate() {
    return new RetryTemplate(...init);
}
Also, configuring it through the yaml file lets me get close to what is needed, but not exactly.
A solution like this seems good, but I was unable to get it to work, as Spring Cloud uses different beans.
How can I implement this retry logic?
We have Spring Boot web services hosted on AWS. They make frequent calls to a Redis Cluster cache using Jedis.
During load testing, we're seeing increased latency around ValueOperations that we're having trouble figuring out.
The method we've zoomed in on does two operations, a get followed by an expire.
public MyObject get(String key) {
    var obj = (MyObject) valueOps.get(key);
    if (obj != null) {
        // refresh the TTL on every read
        valueOps.getOperations().expire(key, TIMEOUT_S, TimeUnit.SECONDS);
    }
    return obj;
}
Taking measurements in our environment, we see that it takes 200ms to call "valueOps.get" and another 160ms to call "expire", which isn't an acceptable amount of latency.
We've investigated these leads:
Thread contention. We don't currently suspect this. To test, we configured our JedisConnectionFactory with a JedisPoolConfig that has blockWhenExhausted=true and maxWaitMs=100, which, if I understand correctly, means that if the connection pool is empty, a thread will block for up to 100ms waiting for a connection to be released before it fails (see the sketch after this list). We had 0 failures running a load test with these settings.
Slow deserializer. We have our Redis client configured to use GenericJackson2JsonRedisSerializer. We also see latency on the "expire" call, and we wouldn't expect that call to use the deserializer at all.
Redis latency. We used Redis Insights to inspect our cluster, and it's not pegged on memory or CPU when the load test is running. We also examined slowlog, and our slowest commands are not related to this operation (our slowest commands are at 20ms, which we're going to investigate).
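For reference, the pool settings described under the thread-contention lead above look roughly like this (a sketch only, using redis.clients.jedis.JedisPoolConfig and Spring's JedisConnectionFactory; the maxTotal value is an assumption, and the 100ms is set through the pool's max-wait property):

// Sketch of the JedisPoolConfig used to rule out thread contention.
JedisPoolConfig poolConfig = new JedisPoolConfig();
poolConfig.setMaxTotal(50);              // assumed pool size
poolConfig.setBlockWhenExhausted(true);  // block instead of failing immediately when the pool is empty
poolConfig.setMaxWaitMillis(100);        // wait at most 100 ms for a connection to be released
JedisConnectionFactory factory = new JedisConnectionFactory(poolConfig);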
Does anyone have any ideas? Even a "it could be this" would be appreciated.
I'm trying to create a distributed transaction across multiple services. For this I'm using the MassTransit framework (the Courier feature) and RabbitMQ. My routing slip configuration is:
public class RoutingSlipPublisher
{
    private readonly IBusControl _bus;

    public RoutingSlipPublisher(IBusControl bus)
    {
        _bus = bus;
    }

    public async Task<Guid> PublishInsertCoding(Coding coding)
    {
        var builder = new RoutingSlipBuilder(NewId.NextGuid());

        // The three activities are executed in order, each addressed by its execute queue.
        builder.AddActivity("Core_Coding_Insert", new Uri($"{RabbitMqConstants.RabbitMqUri}Core_Coding_Insert"));
        builder.AddActivity("Kachar_Coding_Insert", new Uri($"{RabbitMqConstants.RabbitMqUri}Kachar_Coding_Insert"));
        builder.AddActivity("Rahavard_Coding_Insert", new Uri($"{RabbitMqConstants.RabbitMqUri}Rahavard_Coding_Insert"));

        builder.SetVariables(coding);

        var routingSlip = builder.Build();
        await _bus.Execute(routingSlip);

        return routingSlip.TrackingNumber;
    }
}
Issue:
When the Kachar_Coding_Insert consumer is not connected to RabbitMQ within the specified time, I want to compensate the transaction. But this does not happen; the transaction does not complete until the Kachar_Coding_Insert consumer connects to RabbitMQ and executes the activity.
How do you solve this problem?
There is no centralized orchestrator with Courier; the routing slip is the source of truth. If the routing slip is in a queue waiting to be executed by an activity, and that activity is not available, the routing slip will stay in that queue until the service is started.
There isn't any way to change this behavior; your routing slip activity services should be available. You may want to monitor your services to ensure they're running, since a required activity service not being available seems like an unhealthy condition.
Background:
My service caches data in a Redis standalone setup in the prod environment, using spring-data-redis RedisTemplate and the @Cacheable annotation. I cache data for 3 minutes; however, I saw that my Redis memory was gradually increasing (this observation was made over 1-2 weeks). I suspected that my Redis keys were not getting evicted, as the number of keys constantly increased (this could be due to constant load too). So I disconnected my service from Redis for 3 minutes and observed Redis memory: all the keys expired and memory usage dropped.
However, when I restarted my service to cache data in Redis, within 1-2 minutes I had the same number of keys as before (this was expected, as load was high on my service), but the memory usage of Redis was significantly lower.
Below is a graph of the number of keys in Redis before, while Redis was not being used, and after reconnecting my service to the cache.
Below is the graph of memory used by Redis for the above scenarios.
As you can see, for the same number of keys, Redis consumes very high memory when it has been running for a long duration (1-2 weeks). When I disconnected my service from Redis to empty all keys and then restarted it to use the Redis cache again, my memory usage was very low for the same number of keys.
What could explain this behaviour? Could it be a memory leak? For the connection I have a class which extends CachingConfigurerSupport. The connection bean and redis template bean are as follows:
@Bean
public JedisConnectionFactory redisConnectionFactory() {
    JedisConnectionFactory jedisConnFactory = new JedisConnectionFactory();
    jedisConnFactory.setUsePool(true);
    jedisConnFactory.setHostName(redisMasterUrl);
    return jedisConnFactory;
}

@Bean
public RedisTemplate<Object, Object> redisTemplate(RedisConnectionFactory cf) {
    RedisTemplate<Object, Object> redisTemplate = new RedisTemplate<Object, Object>();
    redisTemplate.setConnectionFactory(cf);
    return redisTemplate;
}
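(For context, the 3-minute TTL itself is applied through the cache manager rather than these beans; with this older spring-data-redis style it is configured roughly as below. This is only a sketch of that kind of bean, not my exact code.)

@Bean
public CacheManager cacheManager(RedisTemplate<Object, Object> redisTemplate) {
    RedisCacheManager cacheManager = new RedisCacheManager(redisTemplate);
    cacheManager.setDefaultExpiration(180); // default TTL in seconds, i.e. 3 minutes
    return cacheManager;
}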
Is there anything I am missing pertaining to connections? Do I need to close connections anywhere when using RedisTemplate?
I think the answer is that once Redis reaches its peak memory usage, it never frees that memory until it is restarted. This is the nature of the memory allocator it uses.
Reference:
https://groups.google.com/forum/#!topic/redis-db/ibhYDLT_n68
I've got a working solution to my problem but wonder if there is a cleaner way of doing this.
My architecture is composed of several services emitting messages through a RabbitMQ broker.
Some workers consume those messages and do background jobs.
The thing is that I wanted to be able to create different types of workers, all consuming from the same services, and to have several workers of the same type share the jobs via round robin.
To do this, the messages are published by the services in a pub/sub fashion and consumed by a process that redistributes them into a work queue dedicated to a set of workers.
Is there a more elegant way to do this?
Sorry if the explanation is not clear; I'll edit it.
Thanks!
(I could have created one queue per worker in the services, but with my solution I can subscribe as many workers as I want without touching any code.)
Sounds like a perfect fit for a topic exchange.
Please look at https://www.rabbitmq.com/tutorials/tutorial-two-java.html, section "Round-robin dispatching".
You have to set the prefetch count with channel.basicQos(X):
channel.basicQos(1);
final Consumer consumer = new DefaultConsumer(channel) {
    @Override
    public void handleDelivery(String consumerTag, Envelope envelope,
                               AMQP.BasicProperties properties, byte[] body) throws IOException {
        String message = new String(body, "UTF-8");
        System.out.println(" [x] Received '" + message + "'");
        try {
            doWork(message);
        } finally {
            System.out.println(" [x] Done");
            channel.basicAck(envelope.getDeliveryTag(), false);
        }
    }
};
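To tie this back to the pub/sub part of the question: the topology side can be declared roughly as below (the exchange and queue names are only examples). Each worker type gets one shared durable queue bound to the topic exchange the services publish to, so instances of that type are load-balanced round-robin while other worker types still receive their own copy of every message.

// Topology sketch: one topic exchange, one shared queue per worker type.
channel.exchangeDeclare("events", "topic", true);
channel.queueDeclare("worker-type-A", true, false, false, null);
channel.queueBind("worker-type-A", "events", "#"); // "#" matches every routing key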