How can I stop Redis gracefully when memory and swap are full? - redis

Last night I ran a job to insert data into a Redis set (because I want to keep my data unique). When I woke up this morning, I found that the insert operations had become very slow.
htop shows memory usage of 1884/2015MB and swap usage of 1019/1021MB.
I realize that 2GB of memory cannot hold my Redis dataset.
Then I ran shutdown in redis-cli, but nothing happened; it just kept waiting.
I also tried service redis_6379 stop, but the terminal hangs at "Stopping ...".
What can I do to make Redis save all data to dump.rdb and shut down gracefully?

Normally, a simple redis-cli shutdown should suffice.
Are you using periodic snapshots? If so, you might be safe to reboot your machine. One important thing to note is that periodic snapshots can as much as double memory usage while a save is running: Redis forks a child process to write the dataset to disk, and every memory page modified during the save is duplicated by copy-on-write.
Another important thing is to follow the advice in the Redis setup hints, if you haven't already.
This might not answer your question, but it should help you keep this from happening again.

Related

Efficient way to take hot snapshots from redis in production?

We have a Redis cluster that holds more than 2 million keys, and these keys are updated at an interval of 1 minute. Now we have a requirement to take a snapshot of the Redis DB at a particular interval, e.g. every 10 minutes. This snapshot should not pause Redis command execution.
Is there any async way of taking a snapshot from Redis?
It would be really helpful to get any suggestions on open source tools or frameworks.
The Redis BGSAVE command is asynchronous and takes a snapshot.
It calls the fork() function of the OS. According to the Redis manual:
Fork() can be time consuming if the dataset is big, and may result in Redis to stop serving clients for some millisecond or even for one second if the dataset is very big and the CPU performance not great
Two million updates per minute is 30K+ QPS.
So you really have to try it out: run a benchmark that simulates your workload, then issue BGSAVE, monitor the I/O and CPU usage of your system, and see if there's a spike in your Redis call latency.
Then issue LASTSAVE, which will tell you when your last successful snapshot happened, so you can adjust your backup schedule.
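The BGSAVE-then-LASTSAVE check described above can be scripted. The following is only a minimal sketch: the function name, the timeout, and the polling interval are illustrative assumptions, and `client` is assumed to be any object exposing `bgsave()` and `lastsave()` (redis-py's `redis.Redis` client provides both).

```python
import time


def snapshot_and_wait(client, timeout=600.0, poll_interval=1.0, sleep=time.sleep):
    """Trigger BGSAVE and wait until LASTSAVE reports a newer save.

    `client` is assumed to expose bgsave() and lastsave().
    Returns the new LASTSAVE value, or raises TimeoutError.
    """
    before = client.lastsave()   # time of the last successful save so far
    client.bgsave()              # returns immediately; a forked child does the work
    waited = 0.0
    while waited < timeout:
        current = client.lastsave()
        if current != before:    # LASTSAVE only changes after a successful save
            return current
        sleep(poll_interval)
        waited += poll_interval
    raise TimeoutError("BGSAVE did not finish within %.0f seconds" % timeout)
```

Timing this call while the benchmark is running gives a rough idea of how long a snapshot takes under load. Note that LASTSAVE has one-second resolution, so a save that completes in the same second as the previous one would not be detected by this comparison.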

Redis DEL operation causes slow log entries

Recently, I ran into a problem when using SETBIT in Redis. I use Redis as the storage for a Bloom filter; 0.2 billion items cost 380MB of memory for 99.99% accuracy. Every day I need to delete the Bloom filter key and create a new one, but the deletion shows up in the slow log, and this may affect other services in the production environment. Could anybody suggest a better way to avoid this? Thanks a lot.
The corresponding command and its cost:
DEL bloomFilterKey
time used (microseconds): 83886 (about 84 ms)
Freeing such a large amount of memory, i.e. 380MB, takes too much time and blocks Redis.
To avoid this, you can upgrade Redis to version 4.0 or later and use the new UNLINK command to delete the key. This command frees the memory in a different thread and won't block Redis.
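As a rough illustration of the advice above, a client could pick UNLINK or DEL based on the server version. This is a sketch under assumptions: the helper names are made up for this example, and `client` is assumed to behave like redis-py's `redis.Redis` (with `info()`, `unlink()` and `delete()`).

```python
def supports_unlink(redis_version):
    """Return True if a version string such as '4.0.9' is 4.0 or later."""
    major = int(redis_version.split(".")[0])
    return major >= 4


def drop_big_key(client, key):
    """Delete `key`, preferring the non-blocking UNLINK when available.

    Returns the name of the command that was used.
    """
    version = client.info("server")["redis_version"]
    if supports_unlink(version):
        client.unlink(key)   # memory is reclaimed in a background thread
        return "UNLINK"
    client.delete(key)       # older servers: blocking DEL is the only option
    return "DEL"
```

For the daily rotation in the question, this would be called as `drop_big_key(client, "bloomFilterKey")` before the new filter is created.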

How does "Disk-backed" replication work in Redis Cluster

the redis.conf says:
1) Disk-backed: The Redis master creates a new process that writes the RDB
file on disk. Later the file is transferred by the parent
process to the slaves incrementally
I just don't know what "transferred by the parent process to the slaves" means.
Thank you.
It is simple. The master first reads the RDB file into a buffer, then uses a socket write to send it to the slave, which is listening for it.
The implementation is more complex than that, but this is essentially what Redis does. You can refer to replication.c in redis/src for more details.
EDITED:
Yes, the disk-less mechanism just has the child process send the RDB directly over the wire to the slaves, without using the disk as intermediate storage.
Actually, if you use the disk to save the RDB, the Redis master can serve many slaves at the same time without queuing. With disk-less replication, once a transfer to one slave has started, another slave that arrives and wants a full sync has to be queued until the first transfer finishes. So there is another setting, repl-diskless-sync-delay, which makes the master wait for more slaves to arrive so that they can be served in parallel.
And these two methods only come into play after something goes wrong. In the normal case, the Redis master and slave are connected over a healthy link, and the master streams its commands to the slave to keep the two in sync. If the link breaks or the slave goes down, the slave needs a partial resync to obtain the part it missed. If a partial resync (PSYNC) is not possible, it will try a full resync. The full resync is what we talked about.
This is how a full synchronization works in more detail:
The master starts a background saving process in order to produce an RDB file. At the same time it starts to buffer all new write commands received from the clients. When the background saving is complete, the master transfers the database file to the slave, which saves it on disk, and then loads it into memory. The master will then send all buffered commands to the slave. This is done as a stream of commands and is in the same format of the Redis protocol itself.
Disk-less replication is just a newer feature that supports the full resync in that case, to deal with the stress of slow disks. For more about it, refer to https://redis.io/topics/replication, where you can find answers to questions such as how PSYNC works and why it can fail.
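To make the order of the full-synchronization steps quoted above concrete, here is a toy model using plain dictionaries. It is not how Redis is implemented; the function and variable names are invented for illustration, and real writes are arbitrary commands rather than simple key/value assignments.

```python
def full_resync(master_data, writes_during_save):
    """Toy model of a full resync; returns the replica's final dataset.

    master_data: dict holding the master's dataset when the sync starts.
    writes_during_save: list of (key, value) writes that reach the master
    while the background save is running.
    """
    # 1. Background save: the RDB reflects the dataset at the moment of the fork.
    rdb_snapshot = dict(master_data)

    # 2. The master keeps serving writes and buffers them for the replica.
    backlog = []
    for key, value in writes_during_save:
        master_data[key] = value
        backlog.append((key, value))

    # 3. The replica receives the RDB file and loads it into memory.
    replica_data = dict(rdb_snapshot)

    # 4. The master streams the buffered commands and the replica replays them.
    for key, value in backlog:
        replica_data[key] = value

    return replica_data
```

After step 4 the replica holds the same data as the master, which is why the buffered writes must be replayed after the snapshot is loaded and not before.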

How to know whether the sync process has finished in Redis?

It seems that the only way to sync data between Redis servers is to use the slaveof command, but how do I know whether the data has been replicated successfully? I mean, I want to be notified right after the sync is done.
I've read some of the Redis source code, mainly replication.c, and found nothing official. The only way I know for now is to use the Redis info command and check a specific flag by polling, which looks bad.
Is there any better way to do this?
The approach you're trying, i.e. slaveof, syncs data between a Redis master and a Redis slave. Whenever data is written to the master, it is synced to the slave. So, technically, the sync will never be DONE.
If what you want is a snapshot of the current dataset, you can use the BGSAVE command to save the dataset to an RDB file. With the LASTSAVE command, you can check whether the BGSAVE has completed. Then copy the file to another host and load it with Redis.
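For the polling approach mentioned in the question, the check might look like the sketch below. It parses the text returned by INFO replication on the slave and looks at the `role`, `master_link_status` and `master_sync_in_progress` fields; the helper names are assumptions made for this example, and how often to poll is left to the caller.

```python
def parse_info(text):
    """Turn the 'key:value' lines of an INFO reply into a dict of strings."""
    fields = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):   # skip blanks and section headers
            continue
        key, _, value = line.partition(":")
        fields[key] = value
    return fields


def initial_sync_done(info_text):
    """Return True if a slave reports that its initial sync has completed."""
    fields = parse_info(info_text)
    return (
        fields.get("role") == "slave"
        and fields.get("master_link_status") == "up"
        and fields.get("master_sync_in_progress") == "0"
    )
```

This only says that the initial bulk transfer is over and the link is up; as noted above, replication itself continues for as long as the master receives writes.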

Prevent Redis processes from making RDB snapshots at the same time

Imagine a Redis Cluster setup, for example, or just a usual sharded setup, where we have N > 1 Redis processes per physical node. All our processes have the same redis.conf, with SAVE options enabled and the same SAVE period. So if all our main Redis processes started at the same time, all of them will start SAVE at or around the same time.
When we have 9 Redis processes and all of them start RDB snapshotting at the same time, it:
Affects performance, because we create 9 forked processes that start consuming CPU and doing I/O at the same time.
Requires too much reserved additional memory that can't be used as actual storage, because in a write-heavy application Redis may use up to 2x its normal memory during snapshotting. So if we want Redis processes holding 100GB on this node, we should reserve an additional 100GB to be safe when all processes fork at the same time.
Is there any best practice for modifying this setup so that the Redis processes start saving one by one, or at least with some randomization?
I have only one idea: disable the schedule in redis.conf and write a cron script that starts the saves one by one with a time lag. But this solution looks like a hack, and there should be some other practice for this.
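The cron idea from the question could start from a schedule like the one sketched below, which spreads one save per instance evenly across the save period. This is only an illustration of that idea, not an established best practice; the function name and the port numbers in the usage note are hypothetical.

```python
def staggered_save_offsets(ports, period_seconds):
    """Map each instance port to a start offset (seconds) within the period.

    Offsets are spaced evenly so that no two instances start saving together.
    """
    if not ports:
        return {}
    step = period_seconds // len(ports)   # gap between consecutive saves
    return {port: index * step for index, port in enumerate(ports)}
```

With 9 instances on ports 7000 to 7008 and a 900-second period, each instance gets a slot 100 seconds after the previous one. A scheduler could then wait for each offset and run `redis-cli -p <port> BGSAVE`. Whether one save finishes before the next slot begins depends on dataset size and disk speed, so the spacing would need to be checked against real save durations.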