We have Spring Boot web services hosted on AWS. They make frequent calls to a Redis Cluster cache using Jedis.
During load testing, we're seeing increased latency around ValueOperations that we're having trouble figuring out.
The method we've zoomed in on does two operations, a get followed by an expire.
public MyObject get(String key) {
    var obj = (MyObject) valueOps.get(key);  // round trip 1: GET
    if (obj != null) {
        // round trip 2: refresh the TTL on every cache hit
        valueOps.getOperations().expire(key, TIMEOUT_S, TimeUnit.SECONDS);
    }
    return obj;
}
Taking measurements in our environment, we see that the "valueOps.get" call takes 200ms and the "expire" call another 160ms, which is not an acceptable amount of latency.
We've investigated these leads:
Thread contention. We don't currently suspect this. To test, we configured our JedisConnectionFactory with a JedisPoolConfig that has blockWhenExhausted=true and maxWaitMillis=100 (sketched below, after these leads), which, if I understand correctly, means that if the connection pool is empty, a thread will block for up to 100ms waiting for a connection to be released before it fails. We had 0 failures running a load test with these settings.
Slow deserializer. We have our Redis client configured to use GenericJackson2JsonRedisSerializer. But we also see latency on the "expire" call, which shouldn't involve the deserializer at all.
Redis latency. We used Redis Insights to inspect our cluster, and it's not pegged on memory or CPU when the load test is running. We also examined slowlog, and our slowest commands are not related to this operation (our slowest commands are at 20ms, which we're going to investigate).
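For reference, a minimal sketch of the pool settings used in that contention test (only the pool config is shown; how it gets wired into the JedisConnectionFactory depends on the Spring Data Redis version):

import redis.clients.jedis.JedisPoolConfig;

JedisPoolConfig poolConfig = new JedisPoolConfig();
poolConfig.setBlockWhenExhausted(true); // block when the pool is exhausted...
poolConfig.setMaxWaitMillis(100);       // ...but fail after waiting 100ms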
Does anyone have any ideas? Even a "it could be this" would be appreciated.
Related
I'm using pipelining with Lettuce, and I have a design question. When trying to send a block of commands to Redis using the 'sendBlock' method below, I'm considering 2 options:
(1) Having one instance of the connection already established in the class, and reuse it:
private void sendBlock()
{
    this.conn.setAutoFlushCommands(false);
    (...)
    this.conn.flushCommands();
}
(2) Every time I send a block of commands get a connection from redis, perform the action and close it.
private void sendBlock()
{
    StatefulRedisModulesConnection<String, String> conn = RedisClusterImpl.connect();
    conn.setAutoFlushCommands(false);
    (...)
    conn.flushCommands();
    conn.close();
}
Since established connections seem to be shared between all threads in Lettuce, I'm not sure whether option 1 is correct. If not, I have to go with option 2, and in that case I don't know how costly it is to obtain a connection from Redis, so I'm wondering if I need to use pooling (something the Lettuce docs recommend against). In our use case the 'sendBlock' method can be called hundreds of times simultaneously, so it is used intensively by a lot of different threads.
Any help would be really appreciated.
Joan.
Lettuce connections are thread-safe and can be shared if you don't use Redis-blocking commands (e.g. BLPOP) or transactions.
Those should be performed on separate connections, as the transaction will apply to the entire connection, and blocking operations will block the connection until they're complete.
Whether or not you should share a manually-flushed connection depends only on the number of operations you perform between flushes. For example, if each block is 10k commands and you have 10 threads, you could end up queueing 100k commands to send at once where you expected 10k. Whether this matters depends on your application, so you should measure your individual case.
If each block does not send many commands, you may not even need to flush manually, as Lettuce pipelines with auto-flush enabled (see this answer).
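To make option (1) concrete, here is a minimal sketch of a manually-flushed block over a shared connection (names are placeholders, and a plain StatefulRedisConnection stands in for the poster's StatefulRedisModulesConnection). Note the caveat above: setAutoFlushCommands applies to the whole connection, so anything other threads queue before flushCommands() goes out in the same write:

import io.lettuce.core.LettuceFutures;
import io.lettuce.core.RedisFuture;
import io.lettuce.core.api.StatefulRedisConnection;
import io.lettuce.core.api.async.RedisAsyncCommands;
import java.time.Duration;
import java.util.ArrayList;
import java.util.List;

static void sendBlock(StatefulRedisConnection<String, String> conn, List<String> keys)
{
    conn.setAutoFlushCommands(false);      // queue commands locally
    RedisAsyncCommands<String, String> async = conn.async();
    List<RedisFuture<String>> futures = new ArrayList<>();
    for (String key : keys) {
        futures.add(async.get(key));       // queued, not yet written to the socket
    }
    conn.flushCommands();                  // one network write for the whole block
    LettuceFutures.awaitAll(Duration.ofSeconds(5), futures.toArray(new RedisFuture[0]));
    conn.setAutoFlushCommands(true);       // restore auto-flush for other callers
}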
I am trying to use the Apache Commons Pool library to implement object pooling for objects that are expensive to create in my application. For resource pooling I have used the GenericObjectPool class to get the default pooling implementation provided by the API. To ensure that we do not end up with several idle objects in memory, I set the minEvictableIdleTimeMillis and timeBetweenEvictionRunsMillis properties to 30 minutes.
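A minimal sketch of that configuration, assuming commons-pool2 and a hypothetical PooledObjectFactory<MyExpensiveObject> named factory (setter names differ slightly across pool versions):

import org.apache.commons.pool2.impl.GenericObjectPool;
import org.apache.commons.pool2.impl.GenericObjectPoolConfig;
import java.util.concurrent.TimeUnit;

GenericObjectPoolConfig<MyExpensiveObject> config = new GenericObjectPoolConfig<>();
config.setMinEvictableIdleTimeMillis(TimeUnit.MINUTES.toMillis(30));    // evict objects idle for 30+ min
config.setTimeBetweenEvictionRunsMillis(TimeUnit.MINUTES.toMillis(30)); // evictor thread wakes every 30 min
GenericObjectPool<MyExpensiveObject> pool = new GenericObjectPool<>(factory, config);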
As I understand from other questions, blogs, and the API documentation, these properties trigger a separate thread that evicts idle objects from the pool.
Could someone tell me whether this has any adverse impact on application performance, and whether there is any way to test that the eviction thread is actually executed?
The library comes with a performance disclaimer for when the evictor is enabled:
Eviction runs contend with client threads for access to objects in the pool, so if they run too frequently performance issues may result.
Reference: https://commons.apache.org/proper/commons-pool/api-1.6/org/apache/commons/pool/impl/GenericObjectPool.html
However, we have a high-TPS system running eviction every 1 second, and we don't see much of a performance bottleneck.
As far as the eviction thread runs are concerned, you can override the evict() method in your implementation of GenericObjectPool and add a log line:
@Override
public void evict() throws Exception {
    // log here to confirm the evictor thread actually runs
    super.evict();
}
From the Jedis wiki (https://github.com/xetorthio/jedis/wiki/Getting-started), "Using Jedis in a multithreaded environment":
You shouldn't use the same instance from different threads because you'll have strange errors. And sometimes creating lots of Jedis instances is not good enough because it means lots of sockets and connections, which leads to strange errors as well.
A single Jedis instance is not threadsafe! To avoid these problems, you should use JedisPool, which is a threadsafe pool of network connections. You can use the pool to reliably create several Jedis instances, given you return the Jedis instance to the pool when done. This way you can overcome those strange errors and achieve great performance.
=================================================
I want to know why. Can anyone help me, please?
A single Jedis instance is not threadsafe because it was implemented this way. That's the decision that the author of the library made.
You can check the source code of BinaryJedis, which is a supertype of Jedis: https://github.com/xetorthio/jedis/blob/master/src/main/java/redis/clients/jedis/BinaryJedis.java
For example, these lines:
public Transaction multi() {
    client.multi();
    client.getOne(); // expected OK
    transaction = new Transaction(client);
    return transaction;
}
As you can see, the transaction field is shared by all threads using the Jedis instance and is initialized in this method. Later this transaction can be used in other methods. Imagine two threads performing transactional operations at the same time: a transaction created by one thread may be unintentionally accessed by the other. The transaction field is shared state whose access is not synchronized, which makes Jedis non-threadsafe.
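A hypothetical illustration (not code from the library) of how that shared field bites: two threads driving transactions through one shared instance can overwrite each other's transaction:

import redis.clients.jedis.Jedis;
import redis.clients.jedis.Transaction;

Jedis shared = new Jedis("localhost", 6379); // one instance shared across threads

Runnable work = () -> {
    Transaction tx = shared.multi(); // writes the shared 'transaction' field
    tx.set("key", "value");
    tx.exec();                       // may operate on the other thread's transaction
};
new Thread(work).start();
new Thread(work).start();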
The reason the author decided to make Jedis non-threadsafe and JedisPool threadsafe might be to provide flexibility for clients: in a single-threaded environment you can use Jedis and get better performance, while in a multithreaded environment you can use JedisPool and get thread safety.
I'm running a Tomcat application which uses Jedis to access a Redis database. From time to time the whole application blocks. By monitoring Tomcat with JavaMelody, I found that the problem seems to be related to the JedisPool when an object requests a Jedis instance.
catalina-exec-74
java.lang.Object.wait(Native Method)
java.lang.Object.wait(Object.java:503)
org.apache.commons.pool.impl.GenericObjectPool.borrowObject(GenericObjectPool.java:1104)
redis.clients.util.Pool.getResource(Pool.java:20)
....
This is the JedisPoolConfig I'm using
JedisPoolConfig poolConfig = new JedisPoolConfig();
poolConfig.setMaxActive(20);
poolConfig.setTestOnBorrow(true);
poolConfig.setTestOnReturn(true);
poolConfig.setMaxIdle(5);
poolConfig.setMinIdle(5);
poolConfig.setTestWhileIdle(true);
poolConfig.setNumTestsPerEvictionRun(10);
poolConfig.setTimeBetweenEvictionRunsMillis(10000);
jedisPool = new JedisPool(poolConfig, "localhost");
So apparently some threads try to get a Jedis instance, but the pool is empty and cannot return one, so the default pool behavior is to wait.
I've already double-checked my whole code and I'm pretty sure I return every Jedis instance to the pool after use. So I'm not sure why I'm running out of instances.
Is there a way to check how many instances are left in the pool? I'm trying to find a sensible value for the maxActive parameter to prevent the application from blocking.
Are there any other ways to leak connections besides not returning the Jedis instances to the pool?
Returning the resource to the pool is important, so remember to do it. Otherwise, when closing your app, it will wait for the resource to be returned.
https://groups.google.com/forum/?fromgroups=#!topic/jedis_redis/UeOhezUd1tQ
After each Jedis method call, return the resource to the pool. Your app has probably used up all the pooled connections and is waiting for some to be returned, which may cause the behavior you're describing: the app blocks.
Jedis jedis = JedisFactory.jedisPool.getResource();
try {
    jedis.set("key", "val");
} finally {
    JedisFactory.jedisPool.returnResource(jedis);
}
Partial answer to hopefully be of some help to people in similar scenarios, though I'm unsure if my problem was the same as yours (if you've since figured it out, please let us know!).
I've already double checked my whole code and I'm pretty sure I return every Jedis instance to the pool that I used before. So I'm not sure why I'm running out of instances.
I thought I had as well: I always put my code in try / finally blocks, but it turns out I did have a leak:
Jedis jedis = DBHelper.pool.getResource();
try {
    // Next line causes the leak: it overwrites the first resource,
    // which is then never returned to the pool
    jedis = DBHelper.pool.getResource();
    ...
} finally {
    DBHelper.pool.returnResource(jedis);
}
No idea how that second call snuck in, but it did and caused both a leak and the web app to block.
Is there a way to check how many instances are left in the pool? I'm trying to find a sensible value for the maxActive parameter to prevent the application from blocking.
In my case, I both found the bug and optimized based on the number of clients seen by the Redis server. I set the loglevel in redis.conf to verbose (the default is notice), which reports the number of connected clients roughly every 5-10 seconds. Once I found my memory leak, I repeatedly sent requests to the page calling that method and watched the client count reported in the Redis logs climb, but never drop. I would think that would be a good start for optimizing.
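Another option, if your Jedis version exposes them: newer Jedis pools delegate the underlying commons-pool counters, so you can sample them from inside the app. A sketch, assuming the jedisPool field from the question (older versions, like the one in the stack trace above, may lack these getters):

// availability of these getters depends on the Jedis version
System.out.printf("active=%d idle=%d%n",
        jedisPool.getNumActive(), jedisPool.getNumIdle());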
Anyways, hope this is helpful to someone!
When you use a Jedis pool, every time you get a resource using getResource() you have to give it back with returnResource(). And if the number of threads exceeds the number of resources, you will have thread contention. I found it much simpler to have one Jedis connection per thread using a Java ThreadLocal: for each thread, check whether a Jedis connection already exists; if yes, use it, otherwise create one for the running thread. This ensures there is no lock contention or error condition to look after.
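A minimal sketch of that per-thread approach (host and port are placeholders; note that a plain ThreadLocal never closes connections when threads die, so long-lived thread pools need separate lifecycle handling):

import redis.clients.jedis.Jedis;

public class PerThreadJedis
{
    // each thread lazily creates, then reuses, its own connection
    private static final ThreadLocal<Jedis> JEDIS =
            ThreadLocal.withInitial(() -> new Jedis("localhost", 6379));

    public static void set(String key, String value)
    {
        JEDIS.get().set(key, value); // no pool, no contention between threads
    }
}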
I'm using the ServiceStack RedisClient for caching. How can I set a timeout? For example, if a call takes longer than 5 seconds, return null instead of a result?
Does anyone know?
Thanks
There are some operations, like blocking LPOP/RPOP, that include a timeout.
In general, Redis runs in memory and is extremely fast, so it's rare for it to time out on its own. However, the network can be down, so RedisNativeClient (the base class of RedisClient) includes a SendTimeout which you can set to do this.