How to do a redis FLUSHALL without initiating a sentinel failover? - redis

We have a redis configuration with two redis servers. We also have 3 sentinels to monitor the two instances and initiate a fail over when needed.
We currently have a process where we periodically have to do a FLUSHALL on the redis server. This is a blocking operation that takes longer than the time we have allotted for the sentinels to timeout. In other words, we have our sentinel configuration with:
sentinel down-after-milliseconds OurMasterName 5000
and doing a redis-cli FLUSHALL on the server takes > 5000 milliseconds, so the sentinels initiate a fail over.
We acknowledge that doing a FLUSHALL isn't great and we also know that we could increase the down-after-milliseconds to but for the purposes of this question assume that neither of these are options.
The question is: how can we do a FLUSHALL (or equivalent operation) WITHOUT having our sentinels initiate a fail over due to the FLUSHALL blocking for greater than 5000 milliseconds? Has anyone encountered and solved this problem?

You could just create new instances: if you are using something like AWS or Azure than you have API for creating a new Redis cluster. Start it, load it with data and once ready just modify the DNS, again with API call -so all these can be handled by some part of your application. But on premises things can get more complex because it will require some automation with ansible/chef/puppet.

The next best option you currently have to is to delete keys in batches to reduce the amout of work to at once. You can build a list, assuming you don't have one, using scan Then delete in whatever batch size works for you.
Edit: as you are not interested in keeping data, disable persistence, delete the RDB file, then just restart the instance. This way you do t have to update sentinel like you would if you take the provision new hosts.
Out of curiosity, if you're just going to be flushing all the time and don't care about the data as you'll be wiping it, why bother with sentinel?

Related

Replica node with connection issue to primary node in Redis

We are using ElastiCache for redis with cluster-mode on. Recently we are seeing random replica-to-primary connection issue which leads our client connection timeout at mget or xread. Usually this lasts for about 20 minutes and will recover itself but it still brought us customer experience impact. My question here is:
if redis fail to read from one replica, would it re-route the same request to another replica in the same shard?
what's the recommended way to mitigate the impact?
thanks in advance!

Is it safe to run SCRIPT FLUSH on Redis cluster?

Recently, I started to have some trouble with one of me Redis cluster. used_memroy and used_memory_rss increasing constantly.
According to some Googling, I found following discussion:
https://github.com/antirez/redis/issues/4570
Now I am wandering if it is safe to run SCRIPT FLUSH command on my production Redis cluster?
Yes - you can run the SCRIPT FLUSH command safely in a production cluster. The only potential side effect is blocking the server while it executes. Note, however, that you'll want to call it in each of your nodes.

Best Redis setup for session caching

I see there are multiple modes of operation for Redis (cluster, sentinel, master-slave, etc?). I don't fully understand the implications of each, but my question is this:
If I have a web application that requires distributed session persistence, which configuration of Redis makes the most sense? The main reason I'm using redis is to achieve some level of fault tolerance. If one of my frontend servers fails, I want the sessions to be available for other nodes to pickup the workload. If a redis node goes down, I don't want this to affect the user experiences, and I don't want to have to wake up a developer at midnight to correct the matter.
From everything I've read, Redis Sentinel is the way to go for fault tolerance.

redis sentinel out of sync with servers in a cluster

We have a setup with a number of redis (2.8) servers (lets say 4) and as many redis sentinels. On startup of each machine, we set a pre-select machine as master through the command line and all the rest as slaves of that. and the sentinels all monitor these machines. The clients first connect to the local sentinel and retrieve the master's IP address and then connect there.
This setup is trouble free most of the time but sometimes the sentinels go out of sync with servers. if I name the machines A,B,C and D - sentinels will think B is master while redis servers are all connected to A as the master. bringing down redis server on B doesnt help either. I had to bring it down and manually "Sentinel failover" on A to fix the issue. Question is
1. What causes this to happen and whats the easiest and quickest way to fix this ?
2. What is best configuration - is there something better than this ?
The only time you should set a master is the first time. Once sentinel has taken over management of replication you should let it do it. This includes on restarts. Don't use the command line to set replication. Let sentinel and redis manage it. This is why you're getting issues - you've told sentinel it is authoritative, but you are telling the Redis servers to ignore sentinel.
Sentinel stores the status in its Config file, so when it restarts it can resume the last configuration. So even on restart, let sentinel do it's job.
Also, if you have 4 servers (be specific, not "let's say") you should be running a quorum of three on your monitor statement in sentinel. With a quorum of two you can wind up with two masters

Is it recommended to run redis using Supervisor

Is it a good practice to run redis in production with Supervisor?
I've googled around, but haven't seen many examples of doing so. If not, what is the proper way of running redis in production?
I personally just use Monit on Redis in production. If Redis crash Monit will restart it but more importantly Monit will be able to monitor (and alert when a threeshold is reach) the amount of RAM that Redis currently takes (which is the biggest issue)
Configuration could be something like this (if maxmemory was set to 1Gb in Redis)
check process redis
with pidfile /var/run/redis.pid
start program = "/etc/init.d/redis-server start"
stop program = "/etc/init.d/redis-server stop"
if 10 restarts within 10 cycles
then timeout
if failed host 127.0.0.1 port 6379 then restart
if memory is greater than 1GB for 2 cycles then alert
Well..it depends. If I were do use redis under daemon control I would use runit. I do use monit but only for monitoring. I like to see the green light.
However, for redis to exploit the true power, you dont run redis as a deamon esp a master. If a master goes down, you will have to switch a slave to a master. Quit simply, I just shoot the node in the head and I have a chef recipe bring up a new node.
But then again....it also depends on how often you snapshot. I do not snapshot thus no need for deamon control.
People use reids for brute force speed. that means not writing to disk and keep all data in ram. If a node goes down...and you dont snapshot...data is lost.