Redis starts slowly when the AOF file is too large

Redis starts slowly when the AOF file is too large. The AOF file is still large after rewriting. How can we deal with it?
We cannot disable AOF, and we need Redis to start up quickly.

The aof file is still large after rewriting
That means your data set is really large.
To speed up restarts, you can persist with an RDB file. Loading an RDB file should be faster than replaying an AOF file.
You can also try splitting your big data set across several small Redis instances, or moving your data to Redis Cluster, so that each node has a smaller data set and reloads faster.
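To illustrate the splitting idea, here is a minimal sketch of client-side sharding in Python. The instance addresses and the pick_instance helper are made up for illustration, and CRC32 modulo the instance count is just one simple choice (Redis Cluster itself uses CRC16 over 16384 hash slots).

```python
import zlib

# Hypothetical instance addresses; replace with your own.
INSTANCES = ["127.0.0.1:6379", "127.0.0.1:6380", "127.0.0.1:6381"]


def pick_instance(key: str, instances=INSTANCES) -> str:
    """Map a key to one instance with a stable hash (CRC32 modulo count)."""
    index = zlib.crc32(key.encode("utf-8")) % len(instances)
    return instances[index]


if __name__ == "__main__":
    for k in ("user:1", "user:2", "session:abc"):
        print(k, "->", pick_instance(k))
```

Each instance then holds only part of the keys, so each one has a smaller file to load on restart. Note that with plain modulo sharding, changing the number of instances remaps most keys.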

Related

Using Redis for disk persistence with RDB and AOF files

We are using a Redis server in production with a 6 GB data set. Initially we thought Redis could be used as a memory cache only: if it restarted, we could repopulate it from the persistent data store with minimal downtime.
Now we have realized that repopulating the data from the persistent store is not a good idea at all. It is causing major service downtime.
We want to evaluate Redis persistence using a combination of RDB and AOF. In our test environments we tried taking an RDB snapshot once an hour and committing to the AOF file at one-second intervals. The AOF file is growing too big, and that is in the test environment alone. We analyzed the AOF file content and noticed a lot of keys that we don't want to persist to disk; we need them only in Redis memory.
Is there any way to stop logging certain types of keys (a block list of keys) when writing to the AOF file?
Generally, Redis does not provide a way to exclude certain types of keys from persistence. If you need some keys to persist to disk and others not to, you should use two independent Redis instances, one for each type, and configure their persistence settings appropriately. Divide and conquer.
Note: it is possible, however, to control what gets persisted in the AOF inside the context of a Lua script - see the "Selective replication of commands" section of EVAL's documentation. That said, besides the consistency risks, it would be too much of a hassle to use this approach for what you need imo.
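The two-instance approach could be wired up on the client side roughly like this. This is only a sketch: the key prefixes and the choose_instance helper are hypothetical, and it assumes the "durable" instance runs with appendonly yes while the "volatile" one runs with appendonly no and save "".

```python
# Hypothetical key prefixes for data that should stay in memory only.
VOLATILE_PREFIXES = ("cache:", "session:", "tmp:")


def choose_instance(key: str) -> str:
    """Return 'volatile' for memory-only keys and 'durable' for everything else."""
    # str.startswith accepts a tuple and matches any of its entries.
    return "volatile" if key.startswith(VOLATILE_PREFIXES) else "durable"


if __name__ == "__main__":
    for k in ("cache:homepage", "order:1001"):
        print(k, "->", choose_instance(k))
```

The application would keep one connection per instance and look up the right one by the returned name, so block-listed keys never reach the instance that writes the AOF.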

AOF and RDB backups in redis

This question is about Redis persistence.
I'm using redis as a 'fast backend' for a social networking website. It's a single server set up. I've been transferring PostgreSQL responsibilities to Redis steadily. Currently in etc/redis/redis.conf, the appendonly setting is set to appendonly no. Snapshotting settings are save 900 1, save 300 10, save 60 10000. All this is true for production and development both. As per production logs, save 60 10000 gets invoked heavily. Does this mean that practically, I'm getting backups every 60 seconds?
Some literature suggests using AOF and RDB backups together. Thus I was weighing in on turning appendonly on and using appendfsync everysec. For anyone who has had experience of both sides of the coin:
1) Will using appendonly on and appendfsync everysec cause a performance downgrade? Will it hit the CPU? The write load is on the high side.
2) Once I restart the redis server with these new settings, I'll still lose the last 60 secs of my data, correct?
3) Are restart times something to worry about? My dump.rdb file is small; ~90MB.
I'm trying to find out more about redis persistence, and getting my expectations right. Personally, I'm fine with losing 60s of data in the case of a catastrophe, thus whether I should use AOF is also something I'm pondering. Feel free to chime in. Thanks!
Does this mean that practically, I'm getting backups every 60 seconds?
NO. Redis starts a background save after 60 seconds only if at least 10000 keys have changed in that time. Otherwise, this rule does not trigger a background save (the save 300 10 and save 900 1 rules still apply).
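A simplified model of how the three save points from the question combine, as a Python sketch. The should_bgsave helper is made up for illustration and is not Redis code; it only mirrors the rule "enough changes AND enough time since the last save".

```python
# Save points from the question: (seconds, minimum number of changes).
SAVE_POINTS = [(900, 1), (300, 10), (60, 10000)]


def should_bgsave(seconds_since_last_save: int,
                  changes_since_last_save: int,
                  save_points=SAVE_POINTS) -> bool:
    """True if any save point has both its time and change thresholds met."""
    return any(
        seconds_since_last_save > seconds and changes_since_last_save >= changes
        for seconds, changes in save_points
    )


if __name__ == "__main__":
    print(should_bgsave(61, 10000))  # busy minute
    print(should_bgsave(61, 9999))   # just under the threshold
```

So under a heavy write load you get a snapshot roughly every minute, but in a quiet minute with fewer than 10000 changes the next snapshot waits for one of the slower rules.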
Will using appendonly on and appendfsync everysec cause a performance downgrade? Will it hit the CPU? The write load is on the high side.
It depends on many things, e.g. disk performance (SSD vs. HDD), write/read load (QPS), data model, and so on. You need to run a benchmark with your own data in your specific environment.
Once I restart the redis server with these new settings, I'll still lose the last 60 secs of my data, correct?
NO. If you turn on both AOF and RDB, the AOF file is used to rebuild the database when Redis restarts. Since you configured appendfsync everysec, you will lose only about the last second of data.
Are restart times something to worry about? My dump.rdb file is small; ~90MB.
If you turn on AOF, Redis replays the log in the AOF file on restart to rebuild the database. Normally the AOF file is larger than the RDB file, so recovery might be slower than recovering from an RDB file. Should you worry about that? Run a benchmark with your own data in your specific environment.
EDIT
IMPORTANT NOTICE
Assume that you already set Redis to use RDB saving, and write lots of data to Redis. After a while, you want to turn on AOF saving. NEVER MODIFY THE CONFIG FILE TO TURN ON AOF AND RESTART REDIS, OTHERWISE YOU'LL LOSE EVERYTHING.
Because, once you set appendonly yes in redis.conf, and restart Redis, it will load data from AOF file, no matter whether the file exists or not. If the file doesn't exist, it creates an empty file, and tries to load data from that empty file. So you'll lose everything.
In fact, you don't have to restart Redis to turn on AOF. Instead, you can use config set command to dynamically turn it on: config set appendonly yes.
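For what it's worth, the runtime switch could be scripted along these lines. This is a sketch under assumptions: client is taken to be a redis-py style object exposing config_set, info and config_rewrite, and enable_aof_live is a made-up helper name. It waits for the initial AOF rewrite to finish and then writes the setting back to redis.conf, so that a later restart does not come up with AOF turned off again.

```python
import time


def enable_aof_live(client, poll_seconds: float = 1.0, timeout: float = 600.0) -> None:
    """Turn AOF on without a restart, wait for the first rewrite, persist the config.

    `client` is assumed to behave like a redis-py client.
    """
    client.config_set("appendonly", "yes")

    deadline = time.monotonic() + timeout
    while True:
        info = client.info("persistence")
        busy = info.get("aof_rewrite_in_progress") or info.get("aof_rewrite_scheduled")
        if info.get("aof_enabled") and not busy:
            break
        if time.monotonic() > deadline:
            raise TimeoutError("initial AOF rewrite did not finish in time")
        time.sleep(poll_seconds)

    # Save the running configuration to redis.conf (requires that Redis
    # was started with a config file).
    client.config_rewrite()
```

With redis-py this would be called as enable_aof_live(redis.Redis()), assuming a server on the default host and port. Checking that the AOF file exists and is non-empty before any restart is still a good idea.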

Redis / Limit of .rdb file

I use Redis, and it saves its .rdb file (on every transaction).
I noticed that the .rdb file in production grows by 15 MB every day (it now stands at 75 MB).
Is there any limit on the size of the .rdb file? Does its size affect the performance of the Redis DB?
The rdb on disk has no direct impact on the running redis instance.
The only limit seems to be the filesystem.
We have a 10 GB compressed RDB file that takes about 28 GB in memory, and we have had much bigger ones.
However, you may encounter interruptions when saving datasets as large as ours to disk (even if you use http://redis.io/commands/bgsave ).
When the forked Redis process writes the latest diff, Redis is unresponsive until it is completely written to disk. How long this lasts depends on factors such as the number of writes during the bgsave, the overall number of keys, the size of hashes and so on.
And, be sure to correctly set up the "save" configuration depending on your needs.

Some confusion about backing up the whole data set in Redis

The documentation says:
Whenever Redis needs to dump the dataset to disk, this is what happens:
Redis forks. We now have a child and a parent process.
The child starts to write the dataset to a temporary RDB file.
When the child is done writing the new RDB file, it replaces the old one.
Because I want to back up all the data, I typed the shutdown command in redis-cli, expecting it to shut down and save all data to dump.rdb. After it shut down completely, I went to the db location and saw that dump.rdb is 423.9MB and temp-21331.rdb is 180.5MB. The temp file still exists and is smaller than dump.rdb. Apparently, Redis did not replace dump.rdb with the temp file.
I am wondering whether dump.rdb contains the whole database at this point, and whether it is safe to delete the temp file.
What does the file mod timestamp of temp-21331.rdb say? It sounds like a leftover from a crash.
You can delete it.
The documentation is definitely correct. When rewriting, all data is written to a temp file (compressed), and when that is complete, the dump.rdb file is replaced by this temp file. There should, however, be no leftovers during normal usage. What is important: you always need enough free disk space for this operation to succeed. A safe guideline is 140% of the Redis memory limit (it would be 200% if no compression was applied).
Hope this helps, TW
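The 140% guideline from the answer above can be turned into a quick pre-flight check. A rough sketch in Python: the helper names are made up, and the percentage is the answer's rule of thumb rather than anything enforced by Redis.

```python
import shutil


def required_free_bytes(memory_limit_bytes: int, percent: int = 140) -> int:
    """Free disk space suggested by the guideline (140% of the memory limit)."""
    return memory_limit_bytes * percent // 100


def has_room_for_dump(dump_dir: str, memory_limit_bytes: int, percent: int = 140) -> bool:
    """Compare free space on the dump directory's filesystem with the guideline."""
    free = shutil.disk_usage(dump_dir).free
    return free >= required_free_bytes(memory_limit_bytes, percent)


if __name__ == "__main__":
    six_gb = 6 * 1024 ** 3
    print(required_free_bytes(six_gb), "bytes suggested for a 6 GB limit")
    print("enough room here:", has_room_for_dump(".", six_gb))
```

Pass percent=200 to model the uncompressed case mentioned in the answer.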

How do I back up Redis server RDB and AOF files for recovery to ensure minimal data loss?

Purpose:
I am trying to make backup copies of both dump.rdb every X time and appendonly.aof every Y time so if the files get corrupted for whatever reason (or even just AOF's appendonly.aof file) I can restore my data from the dump.rdb.backup snapshot and then whatever else has changed since with the most recent copy of appendonly.aof.backup I have.
Situation:
I backup dump.rdb every 5 minutes, and backup appendonly.aof every 1 second.
Problems:
1) Since dump.rdb is being written in the background into a temporary file by a child process - what happens to the key changes that occur while the child process is creating a new image? I know the AOF file will keep appending regardless of the background write, but does the new dump.rdb file contain the key changes too?
2) If dump.rdb does NOT contain the key changes, is there some way to figure out the exact point where the child process is being forked? That way I can keep track of the point after which the AOF file would have the most up to date information.
Thanks!
Usually, people use either RDB or AOF as a persistence strategy. Having both of them is quite expensive. Running a dump every 5 minutes and copying the AOF file every second sounds awfully frequent. Unless the Redis instance stores only a tiny amount of data, you will likely kill the I/O subsystem of your box.
Now, regarding your questions:
1) Semantics of the RDB mechanism
The dump mechanism exploits the copy-on-write mechanism implemented by modern OS kernels when they clone/fork processes. When the fork is done, the system just creates the background process by copying the page table. The pages themselves are shared between the two processes. If a write operation is done on a page by the Redis process, the OS will transparently duplicate the page (so that the Redis instance has its own version, and the background process keeps the previous version). The background process therefore has the guarantee that the memory structures are kept constant (and therefore consistent).
The consequence is that any write operation started after the fork will not be included in the dump. The dump is a consistent snapshot taken at fork time.
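The fork-time snapshot behaviour is easy to see in a toy example. The Python sketch below is not Redis code and only runs on POSIX systems (it relies on os.fork); the dict stands in for the data set, and snapshot_after_fork is a made-up name.

```python
import json
import os


def snapshot_after_fork():
    """Fork, modify data in the parent, and return (child_view, parent_view)."""
    data = {"counter": 1}
    read_fd, write_fd = os.pipe()

    pid = os.fork()
    if pid == 0:
        # Child: dump its own view of memory, which is frozen at fork time.
        os.close(read_fd)
        with os.fdopen(write_fd, "w") as out:
            json.dump(data, out)
        os._exit(0)

    # Parent: this write happens after the fork, so the child never sees it.
    os.close(write_fd)
    data["counter"] = 2
    with os.fdopen(read_fd) as src:
        child_view = json.load(src)
    os.waitpid(pid, 0)
    return child_view, data


if __name__ == "__main__":
    child_view, parent_view = snapshot_after_fork()
    print("child dumped:", child_view)
    print("parent now has:", parent_view)
```

The child's dump still holds the value from before the parent's write, just as a background save does not pick up commands executed after the fork.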
2) Keeping track of the fork point
You can estimate the fork timestamp by running the INFO persistence command and calculating rdb_last_save_time - rdb_last_bgsave_time_sec, but it is not very accurate (second only).
To be a bit more accurate (millisecond), you can parse the Redis log file to extract the following lines:
[3813] 11 Apr 10:59:23.132 * Background saving started by pid 3816
You need at least the "notice" log level to see these lines.
As far as I know, there is no way to correlate a specific entry in the AOF to the fork operation of the RDB (i.e. it is not possible to be 100% accurate).
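Extracting the fork time from the log can be scripted. A minimal sketch in Python, assuming the exact line format quoted above (other Redis versions may format the prefix differently) and an English locale for the month name. The log line carries no year, so the caller has to supply one; parse_fork_line is a made-up helper name.

```python
import re
from datetime import datetime

# Matches lines like:
# [3813] 11 Apr 10:59:23.132 * Background saving started by pid 3816
LOG_RE = re.compile(
    r"^\[(?P<pid>\d+)\] "
    r"(?P<ts>\d{1,2} \w{3} \d{2}:\d{2}:\d{2}\.\d{3}) \* "
    r"Background saving started by pid (?P<child>\d+)$"
)


def parse_fork_line(line: str, year: int):
    """Return (fork_time, child_pid), or None if the line is not a match."""
    match = LOG_RE.match(line.strip())
    if match is None:
        return None
    # The year is not in the log line; it is an assumption made by the caller.
    fork_time = datetime.strptime(
        f"{year} {match.group('ts')}", "%Y %d %b %H:%M:%S.%f"
    )
    return fork_time, int(match.group("child"))
```

Running this over the log file and keeping the last match gives the start time of the most recent background save with millisecond precision.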