Is there a way to save the AOF file on Amazon S3 instead of storing it locally?
This would avoid running out of space on disk for large datasets.
You can mount S3 as a drive letter (for example with TntDrive) and set Redis to write to that drive, but it's a very bad idea; the latency will kill you. See http://redis.io/topics/latency under "Latency due to AOF and disk I/O".
No, you can't do that. Simply use BGREWRITEAOF periodically to compact the AOF, or work with dump files and not AOF files.
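If you want to trigger that compaction by hand (or from cron), a minimal sketch, assuming a default local instance (the schedule and log path below are just placeholders):

    # Ask Redis to rewrite the AOF in the background, producing a compact file
    redis-cli BGREWRITEAOF

    # Hypothetical cron entry: rewrite every night at 03:00
    0 3 * * * /usr/bin/redis-cli BGREWRITEAOF >> /var/log/redis-aof-rewrite.log 2>&1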
Related
AOF file size greater than memory allocated to redis nodes.
I have Redis nodes with 16 GiB each, but the AOF file has grown past 16 GiB and has reached 24 GiB.
Is the AOF file a journal of all the keys?
I am adding keys, processing them, and deleting them, then adding new keys. So will the AOF keep a record of all the deleted keys as well?
It should be fine
The answer to your fundamental question is that the AOF file exceeding the size of your Redis instance is not something that should overly concern you. The AOF file is a record of all the commands that have been executed against Redis up to the current point. Its purpose is to allow you to re-run all those commands to put Redis back into its current state, so the fact that it has grown larger than the dataset is not a concern: when the AOF file has been replayed all the way through, your Redis instance will be exactly as large as it currently is.
You might want to have a think about AOF rewrites
That said, it may be worth looking into how AOF rewrites are operating on your Redis instance. Redis can rewrite the AOF file periodically to make it smaller and more efficient to replay in case of a disaster.
A couple of points:
If you are running Redis 2.2 or earlier, you will want to call BGREWRITEAOF from time to time to keep the size of the AOF file under control.
If you are on a more modern version of Redis, take a look at the rewrite settings in your redis.conf, e.g. auto-aof-rewrite-percentage and auto-aof-rewrite-min-size.
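For instance, a minimal sketch of adjusting those two settings at runtime with redis-cli (the values shown are simply the shipped defaults, not a recommendation):

    # Rewrite once the AOF has grown 100% beyond its size after the last rewrite...
    redis-cli CONFIG SET auto-aof-rewrite-percentage 100
    # ...but never bother for files smaller than 64 MB
    redis-cli CONFIG SET auto-aof-rewrite-min-size 64mb
    # Optionally persist the runtime change back into redis.conf (Redis 2.8+)
    redis-cli CONFIG REWRITE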
I have an appendonly.aof file that's grown too large (1.5 GB and 29,558,054 lines).
When I try to load Redis it hangs on "Loading data into memory" for what seems like all day (it still hasn't finished).
Is there anything I can do to optimize this file, as it likely contains many redundant operations (like deleting the same record)?
Or anything I can do to see progress, so I know whether I'm waiting for nothing or roughly how long it will take, before I try to restore an older backup?
With Redis 4+ you can use the mixed format to optimize the append-only file by setting aof-use-rdb-preamble to yes.
With this setting in place, Redis dumps the data in RDB format to the AOF file on every BGREWRITEAOF call, which you can verify from the AOF file's contents, which start with the REDIS magic string.
On restart, when the AOF file starts with that REDIS preamble and aof-use-rdb-preamble is enabled, Redis loads the RDB portion first and then replays the AOF contents that follow it.
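A minimal sketch of turning this on and checking the result (the file path below assumes a default install; adjust it to your dir/appendfilename settings):

    # Enable the RDB preamble for AOF rewrites (Redis 4+)
    redis-cli CONFIG SET aof-use-rdb-preamble yes
    # Regenerate the AOF in the mixed format
    redis-cli BGREWRITEAOF
    # The rewritten file now starts with the RDB magic string "REDIS"
    head -c 5 /var/lib/redis/appendonly.aof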
You can configure your Redis server based on this AOF setup.
And if you are using Docker, be careful about how frequently your container is being restarted.
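As for the other half of the question, watching progress during a long load: one option, assuming redis-cli can still reach the server (Redis answers INFO while loading, even though normal commands get a -LOADING error), is to poll the persistence section:

    # loading, loading_loaded_perc and loading_eta_seconds show how far
    # the AOF/RDB load has progressed
    watch -n 5 'redis-cli INFO persistence | grep ^loading'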
I'm looking for the best way to backup my Redis data.
I read about RDB and AOF, but I think the best way would be to combine them in the following way:
Create an RDB snapshot periodically, and only keep the AOF from that point on.
That way, when you restart, Redis can restore the RDB file (which is faster than replaying the whole AOF) and then replay the AOF only for the last few seconds of writes.
The AOF file contains every write since the last RDB.
My question is, is this available in Redis? Or are there any downsides about it?
This is how Redis works by default.
See the comments about the aof-use-rdb-preamble configuration in the default redis.conf.
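If you want to confirm your instance is actually set up this way, a quick check (assuming an otherwise default configuration):

    # AOF must be enabled for the combined scheme to be used on restart
    redis-cli CONFIG GET appendonly
    # Should be "yes" (the default on recent versions), meaning each rewrite
    # embeds an RDB snapshot at the start of the AOF
    redis-cli CONFIG GET aof-use-rdb-preamble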
RDB files hold the snapshot info and the AOF holds the appended commands.
So why does Redis not first load the RDB file, and then replay the commands recorded after the snapshot?
The load code: loadDataFromDisk
From Redis doc:
It is possible to combine both AOF and RDB in the same instance. Notice that, in this case, when Redis restarts the AOF file will be used to reconstruct the original dataset since it is guaranteed to be the most complete.
I have tens of thousands of directories with a few files in each one on a non-AWS VPS (approx 1TB of data). I want to move them all to S3.
I can either zip these into chunks of 7 GB, move them (with wget or whatever) to EC2 (an 8 GB Ubuntu instance), unzip them and s3cmd them to S3.
OR
go straight to S3 from my VPS with s3cmd sync on the directory?
Which method would be best for performance and reliability?
Thanks
There are two factors that would make the decision for me.
What is the avg file size? (Thousands of little files can take more time than several large files)
What kind of compression can you get?
If you decide to use an intermediary instance, you can attach a 1 TB EBS volume to handle the files while transferring. It will add a bit of cost, but you won't need to keep the volume once you are done.
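For what it's worth, a minimal sketch of each route (bucket name and paths are placeholders):

    # Direct from the VPS: sync skips files already uploaded with the same
    # size/md5, so the command can be safely re-run after interruptions
    s3cmd sync /data/ s3://my-bucket/data/

    # Via an intermediary EC2 instance: pack directories into ~7 GB archives,
    # then upload each archive
    tar czf chunk-001.tar.gz dir-000*/
    s3cmd put chunk-001.tar.gz s3://my-bucket/archives/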