Redis HA setup: AOF issue with replicas - redis

I tested a standalone Redis server with AOF persistence flushed on every write operation, as I cannot afford to lose even one updated value from distributed clients.
In the Redis HA setup, does it make sense to use AOF along with Redis replication?
Is it possible to use Redis server replicas with synchronous replication?
In my environment the write throughput is not very high, i.e. I'm OK with a write request latency of even ~50-80 milliseconds.

In the Redis HA setup, does it make sense to use AOF along with Redis replication?
Yes, you can use AOF along with Redis replication.
Is it possible to use Redis server replicas with synchronous replication?
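No. Redis replication is asynchronous. However, you can use the WAIT command to mitigate the problem: it blocks the client until the preceding writes have been acknowledged by a given number of replicas, or until a timeout expires.
NOTE: Flushing to disk on every write operation can hurt performance significantly. You should benchmark it before applying it to production.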
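To illustrate the WAIT suggestion, here is a minimal sketch. It assumes a client object with redis-py style `set()` and `wait()` methods. The helper name `write_with_ack`, the `FakeClient` stand-in, and the key name are hypothetical and exist only so the example runs without a server. Keep in mind that WAIT only reports how many replicas acknowledged the write; it does not turn Redis into a strongly consistent store.

```python
def write_with_ack(client, key, value, min_replicas=1, timeout_ms=80):
    """SET a key, then block until replicas acknowledge it or the timeout expires.

    `client` is assumed to expose redis-py style `set()` and `wait()` methods.
    Returns True if at least `min_replicas` replicas acknowledged the write.
    """
    client.set(key, value)
    # WAIT returns the number of replicas that acknowledged the writes sent
    # earlier on this connection; it can be fewer than requested on timeout.
    acked = client.wait(min_replicas, timeout_ms)
    return acked >= min_replicas


class FakeClient:
    """Hypothetical stand-in for a real connection (no server needed)."""

    def __init__(self, replicas_acking):
        self.replicas_acking = replicas_acking
        self.data = {}

    def set(self, key, value):
        self.data[key] = value
        return True

    def wait(self, num_replicas, timeout_ms):
        return self.replicas_acking


if __name__ == "__main__":
    print(write_with_ack(FakeClient(replicas_acking=1), "balance:42", "100"))
    print(write_with_ack(FakeClient(replicas_acking=0), "balance:42", "100"))
```

With a real client, the write and the WAIT have to travel over the same connection, since WAIT refers to the writes previously issued by that client. When the helper returns False, the application has to decide whether to retry or report the write as not safely replicated.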

Related

Redis as a queue - Configuration review

I need to set up a Redis DB (2.8) that I intend to use as a queue, which means that it must be fully persistent (no message can be lost).
I'm pretty new to Redis, and I would like to get a review of my configuration:
I want to use both the AOF and RDB persistence models, with always selected as the appendfsync policy. According to the documentation, always is not recommended, but I must select this option because I use Redis as a queue and cannot tolerate any missing messages.
I would like to create a Master-Slave-Slave cluster using Sentinel with automatic failover.
The Redis service will be started automatically after server boot.
Any comments and suggestions would be great. The administration point of view is more important to me (persistence, backup, restore, high availability, etc).

How to do a redis FLUSHALL without initiating a sentinel failover?

We have a redis configuration with two redis servers. We also have 3 sentinels to monitor the two instances and initiate a fail over when needed.
We currently have a process where we periodically have to do a FLUSHALL on the redis server. This is a blocking operation that takes longer than the time we have allotted for the sentinels to timeout. In other words, we have our sentinel configuration with:
sentinel down-after-milliseconds OurMasterName 5000
and doing a redis-cli FLUSHALL on the server takes > 5000 milliseconds, so the sentinels initiate a fail over.
We acknowledge that doing a FLUSHALL isn't great, and we also know that we could increase the down-after-milliseconds value, but for the purposes of this question assume that neither of these is an option.
The question is: how can we do a FLUSHALL (or equivalent operation) WITHOUT having our sentinels initiate a fail over due to the FLUSHALL blocking for greater than 5000 milliseconds? Has anyone encountered and solved this problem?
You could just create new instances: if you are using something like AWS or Azure, then you have an API for creating a new Redis cluster. Start it, load it with data and, once it is ready, modify the DNS, again with an API call, so all of this can be handled by some part of your application. On premises things can get more complex, because it will require some automation with ansible/chef/puppet.
The next best option you currently have is to delete keys in batches to reduce the amount of work done at once. You can build a list of keys, assuming you don't have one, using SCAN. Then delete in whatever batch size works for you.
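The batch approach could look roughly like the sketch below. It assumes a client with redis-py style `scan_iter()` and `delete()` methods. The function name `delete_in_batches`, the default batch size, and the `FakeClient` stand-in are hypothetical; the fake only exists so the example runs without a server.

```python
from fnmatch import fnmatch


def delete_in_batches(client, pattern="*", batch_size=500):
    """Delete keys matching `pattern` in small batches instead of one FLUSHALL.

    `client` is assumed to expose redis-py style `scan_iter()` and `delete()`.
    Returns the number of keys actually removed.
    """
    deleted = 0
    batch = []
    # SCAN walks the keyspace incrementally, so no single call blocks for long.
    for key in client.scan_iter(match=pattern, count=batch_size):
        batch.append(key)
        if len(batch) >= batch_size:
            deleted += client.delete(*batch)
            batch = []
    if batch:  # remove the final, partially filled batch
        deleted += client.delete(*batch)
    return deleted


class FakeClient:
    """Hypothetical in-memory stand-in for a real connection."""

    def __init__(self, keys):
        self.data = {key: "x" for key in keys}

    def scan_iter(self, match="*", count=None):
        # Iterate over a snapshot so deletions during the scan are safe.
        for key in list(self.data):
            if fnmatch(key, match):
                yield key

    def delete(self, *keys):
        return sum(1 for key in keys if self.data.pop(key, None) is not None)


if __name__ == "__main__":
    client = FakeClient([f"job:{i}" for i in range(1234)] + ["keep:1"])
    print(delete_in_batches(client, pattern="job:*", batch_size=100))
    print(sorted(client.data))
```

Each DEL call stays short, so the server keeps answering Sentinel pings between batches. The batch size is something to tune against your own down-after-milliseconds setting.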
Edit: as you are not interested in keeping data, disable persistence, delete the RDB file, then just restart the instance. This way you don't have to update Sentinel like you would if you took the route of provisioning new hosts.
Out of curiosity, if you're just going to be flushing all the time and don't care about the data as you'll be wiping it, why bother with sentinel?

Are AOF or RDB required for Redis Cluster?

Redis has 2 persistence options: RDB and AOF. But I'm not sure whether it uses them to replicate data from masters to slaves. Should I keep one of them enabled for Redis Cluster, or does it replicate data in some other way?
In the documentation I found:
"If you wish, you can disable persistence at all, if you want your data to just exist as long as the server is running."
but I'm not sure whether this is also true for a cluster.
Persistence is separate from replication; Redis uses the network for replication. You can disable persistence and still have replication from masters to slaves.
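As a rough illustration, persistence can be switched off at runtime with CONFIG SET (the redis.conf equivalents are `save ""` and `appendonly no`), after which the replication section of INFO should still list the attached slaves. The sketch assumes a client with redis-py style `config_set()` and `info()` methods. The function name `disable_persistence` and the `FakeClient` stand-in with its canned INFO reply are hypothetical.

```python
def disable_persistence(client):
    """Turn off RDB snapshots and AOF at runtime, then report replication state.

    `client` is assumed to expose redis-py style `config_set()` and `info()`.
    Returns a (role, connected_slaves) tuple read from INFO replication.
    """
    client.config_set("save", "")          # empty schedule: no RDB snapshots
    client.config_set("appendonly", "no")  # no append-only file
    info = client.info("replication")
    return info["role"], info.get("connected_slaves", 0)


class FakeClient:
    """Hypothetical stand-in that records settings and returns canned INFO."""

    def __init__(self):
        self.config = {}

    def config_set(self, name, value):
        self.config[name] = value
        return True

    def info(self, section=None):
        return {"role": "master", "connected_slaves": 2}


if __name__ == "__main__":
    client = FakeClient()
    print(disable_persistence(client))
    print(client.config)
```

In a cluster this would have to be applied to every node. Note also that a full resynchronization still transfers a dump from master to slave, even when no persistence is configured.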

ElastiCache Redis Servers

AWS ElastiCache servers with redis come in everything from very small to very large multi cpu boxes. But redis is single threaded. Anyone know what Amazon is doing to make it use all the cores? I'm assuming that they do, otherwise it's kind of strange that they would be offering it.
The response from AWS was that redis is indeed single threaded. But it's a good suggestion to have more than one CPU to handle OS and network chores, so that Redis gets the resources to run. This makes sense.

What is the meaning of partial resynchronization in Redis?

Starting with Redis 2.8, Redis added a feature named "partial resynchronization". I read the official documentation, but I don't understand it. Can anyone help me?
It is about master-slave replication.
The normal behavior of a Redis slave (set up with the SLAVEOF command or in the configuration) is to connect to the master, ask the master to accumulate master-slave traffic, request a complete dump to the filesystem from the master, download this dump, load it, and finally replay the accumulated traffic until the slave catches up with the master.
This mechanism is quite robust but not very efficient at handling transient connection drops between the slave and the master. If the master-slave link is down for a couple of seconds, the slave will request a full resynchronization (involving a dump, etc.), even if only a few commands have been missed.
Starting with 2.8, Redis includes a partial replication mechanism, so a slave can reconnect to the master and, if some conditions are met (like a transient connection drop), ask the master to resynchronize without having to dump the whole in-memory dataset.
In order to support this feature, the master has to buffer and keep a backlog of commands, so they can be served to the slaves at any time if needed. If the slave is too far behind the master, the backlog may no longer contain the required data. In that case, a normal full synchronization is done, as in previous versions.
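The backlog decision described above can be pictured with a toy model. This is not Redis code, just a sketch of the idea under simplifying assumptions: the class name `ReplicationBacklog`, its methods, and the semicolon-separated command strings are all made up for illustration. Real Redis also checks that the slave was following the same master instance before allowing a partial resync, which is left out here.

```python
class ReplicationBacklog:
    """Toy model of a master's fixed-size replication backlog."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.buffer = bytearray()  # most recent bytes of the replication stream
        self.master_offset = 0     # total bytes ever written to the stream

    def feed(self, payload):
        """Append replicated bytes, dropping the oldest ones beyond capacity."""
        self.buffer += payload
        self.master_offset += len(payload)
        overflow = len(self.buffer) - self.capacity
        if overflow > 0:
            del self.buffer[:overflow]

    def resync(self, replica_offset):
        """Return ("partial", missing_bytes) if the backlog still covers the
        slave's offset, otherwise ("full", None)."""
        first_available = self.master_offset - len(self.buffer)
        if first_available <= replica_offset <= self.master_offset:
            start = replica_offset - first_available
            return "partial", bytes(self.buffer[start:])
        return "full", None


if __name__ == "__main__":
    backlog = ReplicationBacklog(capacity=16)
    backlog.feed(b"SET a 1;")
    replica_offset = backlog.master_offset  # slave is in sync, then the link drops
    backlog.feed(b"SET b 2;")
    print(backlog.resync(replica_offset))   # short outage: backlog still covers it
    backlog.feed(b"SET c 3;SET d 4;")
    print(backlog.resync(replica_offset))   # long outage: old bytes were evicted
```

The first call can be served from the backlog, while the second falls back to a full synchronization because the bytes the slave needs have already been overwritten. A larger backlog therefore tolerates longer disconnections at the cost of more memory on the master.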