Avoid redis processes to make RDB snapshots on the same time

Avoid redis processes to make RDB snapshots on the same time - redis

Imagine setup of Redis Cluster for example, or just usual sharded setup, where we have N > 1 Redis processes per physical node. All our processes have same redis.conf and enabled SAVE options there with same SAVE period. So, if all our main Redis processes started on the same time - all of them will start SAVE on the same time or around it.
When we have 9 Redis processes and all of them start RDB snapshotting on the same time it:
Affects performance, because we make 9 forked processes that start consume CPU and do IO on the same time.
Requires too much reserved additional memory that can't be used as actual storage, because on write-heavy application Redis may use up to 2x the memory normally used during snapshotting. So... if we want to have redis processes for 100Gb on this node - we should take additional 100Gb for forking all processes on the same time to be safe.
Is there any best practice to modify this setup and make Redis processes start saving one by one or at least with some randomization?
I have only one idea with disabling schedule in redis.conf and write cron script that will start save one by one with time lag. But this solution looks like a hack and it should be some other practices here.

Related

Efficient way to take hot snapshots from redis in production?

We have redis cluster which holds more than 2 million and these keys has been updated with the time interval of 1 minute. Now we have a requirement to take the snapshot of the redis db in a particular interval For eg every 10 minute. This snapshot should not pause the redis command execution.
Is there any async way of taking snapshot from redis ?
It would be really helpful if we get any suggestion on open source tools or frameworks.

The Redis BGSAVE is async and takes a snapshot.
It calls the fork() function of the OS. According to the Redis manual,
Fork() can be time consuming if the dataset is big, and may result in Redis to stop serving clients for some millisecond or even for one second if the dataset is very big and the CPU performance not great
Two million updates in one minutes, that is 30K+ QPS.
So you really have to try it out, run the benchmark that similutes your business, then issue BGSAVE, monitor the I/O and CPU usage of your system, and see if there's a spike in your redis calling latency.
Then issue LASTSAVE, which will tell you when your last success snapshot happened. So you can adjust your backup schedule.

Redis using as disk persistance with RDB and AOF file

We are using redis server in production with 6 GB data size, Initially
we thought redis can be used as memory cache only, If it restarts then we can repopulate from the persistants data store with minimal downtime.
Now we realized that re-population of data from persistence store is not a good idea at all, It is causing major service downtime.
We want to evaluate redis persistant option by using RDB and AOF combination.We tried taking RDB snapshot once in a hour and committing to the AOF file with one second interval in test environments. AOF file is growing too big in test environment only. We tried to analyze the AOF file content and noticed that lot of keys we don't want to persist to the disk, We need them only in redis memory.
Is there any way to stop logging certain type of keys (block list keys) while logging to the AOF file

Generally, Redis does not provide a way to exclude certain types of keys from persistency. If you need some keys to persist to disk and others not to, you should use two independent Redis instances - one for each type and configure their persistency settings approriately. Divide and conquer.
Note: it is possible, however, to control what gets persisted in AOF inside the context if a Lua script - see the "Selective replication of commands" section of EVAL's documentation. That said, besides the consistency risks, it would be too much of a hassle to use this approach for what you need imo.

How can I stop redis graceful when mem and swap is full?

Last night, I run a job to insert data into a redis set(because I want keep my data unique).After I wake up this morning, I find insert operation because very slowly.
Htop shows Memory usage 1884/2015MB and swap usage 1019/1021MB
I realize that 2G memory can not hold redis.
Then I run shutdown in redis-cli, but no action, waiting and waiting...
I also try service redis_6379 stop, but terminal stop at stoping....
What can I do to make redis save all data to dump.rdb and close it graceful?

Normally, a simple redis-cli shutdown should suffice.
Are you using periodical snapshots? If yes, you might be safe to reboot your machine. One important thing to note is that enabling periodical snapshots doubles the memory usage since Redis has to create an in-memory copy of the dataset before writing it to disk.
Another important thing is to follow the advices from Redis setup hints, if you haven't already.
This might not answer your question, but should help you avoid it from happening again.

Redis configuration for production

I'm developing project with redis.My redis configuration is normal redis setup configuration.
I don't know how should I do redis configuration? Master-Slave? Cluster?
Do you have anything suggestion redis configuration for production?

Standard approach would be to have one master and at least one slave. Depending on your I/O requirements and number of ops/sec, you can always have multiple read-only slaves. Slaves can be read from but not written to. So you'll want to design your application to take advantage of doing round-robin requests to the slaves and writes only to the single master.
Depending on your data storage/backup requirement, you can set fsync for append-only mode to be every second. So while this means you can lose up to one second worth of data, it's really much less than that because your slaves serve as hot backups, and they will have the data within milliseconds.
You'll at least want to do a BGSAVE every hour to get a dump.rdp produced. You can then save this file live while the server is still running, and store it to some off-site backup facility.
But if you're just using Redis as a standard memcache replacement and don't care about data, then you can ignore all of this. Much of it will be changing in Redis Cluster in the 3.0 version.

It depends on what your Read/Writes requirements are. Could you give us more informations on that matter ?
I think 10,000 people use instant my application.I persist member login token on redis.It's important for me.If I don't write redis, member don't login on application.
Even a Redis single instance will be enough to process 10K users (start redis-bench to the throughput available), so just to be sure use a Master/Slave configuration with autopromotion of the slave if the master goes down.
Since you want persistence, use RDB (maybe along with AOF), see this topic on Redisio.

How do I back up Redis sever RDB and AOF files for recovery to ensure minimal data loss?

Purpose:
I am trying to make backup copies of both dump.rdb every X time and appendonly.aof every Y time so if the files get corrupted for whatever reason (or even just AOF's appendonly.aof file) I can restore my data from the dump.rdb.backup snapshot and then whatever else has changed since with the most recent copy of appendonly.aof.backup I have.
Situation:
I backup dump.rdb every 5 minutes, and backup appendonly.aof every 1 second.
Problems:
1) Since dump.rdb is being written in the background into a temporary file by a child process - what happens to the key changes that occurs while the child process is creating a new image? I know the AOF file will keep appending regardless of the background write, but does the new dump.rdb file contain the key changes too?
2) If dump.rdb does NOT contain the key changes, is there some way to figure out the exact point where the child process is being forked? That way I can keep track of the point after which the AOF file would have the most up to date information.
Thanks!

Usually, people use either RDB, either AOF as a persistence strategy. Having both of them is quite expensive. Running a dump every 5 min, and copying the aof file every second sounds awfully frequent. Except if the Redis instances only store a tiny amount of data, you will likely kill the I/O subsystem of your box.
Now, regarding your questions:
1) Semantic of the RDB mechanism
The dump mechanism exploits the copy-on-write mechanism implemented by modern OS kernels when the clone/fork processes. When the fork is done, the system just creates the background process by copying the page table. The pages themselves are shared between the two processes. If a write operation is done on a page by the Redis process, the OS will transparently duplicate the page (so than the Redis instance has its own version, and the background process the previous version). The background process has therefore the guarantee that the memory structures are kept constant (and therefore consistent).
The consequence is any write operation started after the fork will not be taken in the dump. The dump is a consistent snapshot taken at fork time.
2) Keeping track of the fork point
You can estimate the fork timestamp by running the INFO persistence command and calculating rdb_last_save_time - rdb_last_bgsave_time_sec, but it is not very accurate (second only).
To be a bit more accurate (millisecond), you can parse the Redis log file to extract the following lines:
[3813] 11 Apr 10:59:23.132 * Background saving started by pid 3816
You need at least the "notice" log level to see these lines.
As far as I know, there is no way to correlate a specific entry in the AOF to the fork operation of the RDB (i.e. it is not possible to be 100% accurate).

We Keep Coding

sql objective-c vba vb.net react-native apache vue.js tensorflow api pandas