RethinkDB backup data

I read the article about backing up data, but some issues are not clear to me:
What happens to data that is changed after the backup process has started?
Does the backup operation work only on the current machine, or will it collect data from all shards in the cluster? If only the current one, should I start the backup process on all servers?
Is it a slow operation, such that I should forbid all operations on the db while the backup is in progress?

If a row changes while the backup is going on, the new value may or may not be in the backup. This is generally OK because RethinkDB only offers single-row atomicity anyway, but if you have a workload where that isn't OK then your other options are to use a filesystem that lets you snapshot the data on disk, or to add a new server to your cluster and set it as a replica of the table you want to back up.
It collects data from all shards.
It can take a very long time.
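For completeness, here is a rough sketch of how that backup could be scripted by shelling out to the `rethinkdb dump` tool from Python. The cluster address, the db.table name, and the output path are assumptions to adapt; check `rethinkdb dump --help` for the exact flags in your version.

```python
# A sketch only: scripting a backup by shelling out to the `rethinkdb dump` tool.
import datetime
import subprocess

cluster = "localhost:28015"   # any one server works; dump collects data from all shards
stamp = datetime.datetime.now().strftime("%Y%m%d_%H%M%S")
outfile = f"backup_{stamp}.tar.gz"

subprocess.run(
    ["rethinkdb", "dump",
     "-c", cluster,           # connect to a node of the cluster
     "-e", "mydb.mytable",    # hypothetical db.table to export; omit to dump everything
     "-f", outfile],          # archive written as a tar.gz of JSON documents
    check=True,               # raise if the dump exits with an error
)
print("wrote", outfile)
```

Since the dump collects data from every shard, it is enough to point it at any one server in the cluster; there is no need to run it on every machine.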

Related

Might a transaction log backup cause operations to deadlock, etc.?

In SQL Server, I take a full and a transaction log backup (full: once a day, transaction log: hourly during working hours). As far as I can see, there are some advantages of transaction log backups over differential backups. Regarding these issues, could you clarify the following points for me?
1. When taking a transaction log backup hourly while employees continue their operations on the data, might there be problems like deadlocks or corruption of the data? I use a job script in SQL Server Management Studio to take the backup, but I have no idea how SQL Server treats records that are currently being edited.
2. Generally speaking, what do you suggest for backups in addition to the full backup: transaction log or differential backup?
No :)
Backups using the backup command do not require locks on any user tables.
Transaction log backups are usually more frequent than hourly; would your company really be okay with losing an hour's worth of data if something bad happened to your database disks?
Your schedule needs to depend on your requirements for RPO (recovery point objective) and RTO (recovery time objective). If you can only sustain 5 minutes' worth of lost data, then a transaction log backup every 5 minutes is required. If you can only cope with 1 hour of downtime, then you need to make sure you have data backups that can be restored and recovered in that amount of time. The first part will depend on how optimized your restore is (i.e. how long it takes to read the backups from your backup drives and write the data files back to your data drives; https://www.mssqltips.com/sqlservertip/4935/optimize-sql-server-database-restore-performance/ has some ideas). The second part will depend on how much transaction log data needs to be read and applied back to the database to recover it to the desired point.
You might find that you simply can't do full database backups fast enough; in those cases differential backups could work, as there's less data to write, but SQL Server will then have to put it back together at restore time.
Of course, if the restore is happening manually then you also need to account for human time in there!
It's a good idea to try out your backup and recovery process (before PROD!); that way you can tell whether you're going to need to optimize the process further.
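For reference, here is a rough sketch of what the scheduled backups could look like if driven from a script instead of a SQL Server Agent job. It assumes pyodbc, a database named MyDb in the FULL recovery model, and placeholder paths and connection settings; it is not the asker's actual job script.

```python
# A sketch of issuing a full backup plus a log backup from Python via pyodbc.
import datetime
import pyodbc

conn = pyodbc.connect(
    "DRIVER={ODBC Driver 17 for SQL Server};SERVER=myserver;"
    "DATABASE=master;Trusted_Connection=yes;",
    autocommit=True,  # BACKUP cannot run inside a user transaction
)
cur = conn.cursor()
stamp = datetime.datetime.now().strftime("%Y%m%d_%H%M%S")

# Full backup (e.g. nightly). It takes no locks on user tables.
cur.execute(
    f"BACKUP DATABASE MyDb TO DISK = N'D:\\backups\\MyDb_full_{stamp}.bak' WITH CHECKSUM;"
)
while cur.nextset():  # drain informational result sets so the backup finishes
    pass

# Transaction log backup (run as often as your RPO demands, e.g. every 5-15 minutes).
cur.execute(
    f"BACKUP LOG MyDb TO DISK = N'D:\\backups\\MyDb_log_{stamp}.trn' WITH CHECKSUM;"
)
while cur.nextset():
    pass
conn.close()
```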

How can I dump my redis database daily?

I took over a project that is using Redis for temporary storage of objects in a queue. I want to know how I can dump all the objects on a nightly basis. The business only operates from 8am to 8pm and the jobs are completed daily, so every day we start fresh.
So I was wondering how to dump or delete all the data from Redis at midnight.
FYI - dumping data in Redis terminology actually means saving it to disk. It appears that what you want is to remove all the data.
There are many ways to do what you want, but basically you need to set up a cron job or similar that executes FLUSHALL, for example. Alternatively, if your Redis server isn't configured with disk persistence, you can simply restart it.
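A minimal sketch of such a cleanup script, assuming redis-py and a Redis server on localhost; the paths in the example cron line are placeholders.

```python
# Nightly cleanup sketch. Schedule it from cron, e.g.:
#   0 0 * * * /usr/bin/python3 /opt/scripts/flush_redis.py
import redis

r = redis.Redis(host="localhost", port=6379)

# Optional: write one last RDB snapshot first, so the day's data is kept on disk
# in dump.rdb before it is wiped. Skip this if you truly don't need the old jobs.
r.save()

# Remove every key from every database on this instance.
r.flushall()
```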

Redis data recovery from Append Only File?

If we enable the appendonly option in the redis.conf file, every operation that changes the Redis database is logged in that file.
Now, suppose Redis has used all the memory allocated to it by the "maxmemory" directive in the redis.conf file.
To store more data, it starts removing data according to one of the eviction policies (volatile-lru, allkeys-lru, etc.) specified in the redis.conf file.
Suppose some data gets removed from main memory, but its log will still be there in the AppendOnlyFile (correct me if I am wrong). Can we get that data back using this AppendOnlyFile?
Simply, I want to ask: is there any way we can get that removed data back into main memory? For example, can we store that data on disk and load it into main memory when required?
I got this answer from Google Groups; I'm sharing it here.
----->
Eviction of keys is recorded in the AOF as explicit DEL commands, so when the file is replayed, full consistency is maintained.
The AOF is used only to recover the dataset after a restart, and is not used by Redis for serving data. If the key still exists in it (with a subsequent eviction DEL), the only way to "recover" it is by manually editing the AOF to remove the respective deletion and restarting the server.
-----> Another answer for this
The AOF, as its name suggests, is a file that's appended to. It's not a database that Redis searches through and deletes the creation record when a deletion record is encountered. In my opinion, that would be too much work for too little gain.
As mentioned previously, a configuration that re-writes the AOF (see the BGREWRITEAOF command as one example) will erase any keys from the AOF that had been deleted, and now you can't recover those keys from the AOF file. The AOF is not the best medium for recovering deleted keys. It's intended as a way to recover the database as it existed before a crash - without any deleted keys.
If you want to be able to recover data after it was deleted, you need a different kind of backup. More likely a snapshot (RDB) file that's been archived with the date/time that it was saved. If you learn that you need to recover data, select the snapshot file from a time you know the key existed, load it into a separate Redis instance, and retrieve the key with RESTORE or GET or similar commands. As has been mentioned, it's possible to parse the RDB or AOF file contents to extract data from them without loading the file into a running Redis instance. The downside to this approach is that such tools are separate from the Redis code and may not always understand changes to the data format of the files the way the Redis server does. You decide which approach will work with the level of speed and reliability you want.
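As an illustration of that snapshot-based approach, here is a small sketch, assuming redis-py, an archived RDB file loaded into a separate recovery instance on port 6380, and the production instance on 6379; the key name and ports are placeholders.

```python
# Copy a key from a recovery instance (started from an archived dump.rdb)
# back into production using DUMP and RESTORE.
import redis

recovery = redis.Redis(host="localhost", port=6380)  # loaded from the archived dump.rdb
prod = redis.Redis(host="localhost", port=6379)

key = "user:1234"             # hypothetical key we want back
payload = recovery.dump(key)  # value in Redis' internal serialization format
if payload is None:
    raise SystemExit(f"{key} is not present in this snapshot")

# RESTORE recreates the key from the serialized payload; ttl=0 means no expiry,
# replace=True overwrites the key if it already exists in production.
prod.restore(key, 0, payload, replace=True)
print(prod.type(key), prod.ttl(key))
```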
But its log will still be there in the AppendOnlyFile (correct me if I am wrong). Can we get that data back using this AppendOnlyFile?
NO, you CANNOT get the data back. When Redis evicts a key, it also appends a delete command to the AOF. After rewriting the AOF, anything about the evicted key will be removed.
is there any way we can get that removed data back into main memory? For example, can we store that data on disk and load it into main memory when required?
NO, you CANNOT do that. You have to use another durable data store (e.g. MySQL, MongoDB) for saving data to disk, and use Redis as a cache.
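To make that last point concrete, here is a minimal cache-aside sketch, assuming redis-py; load_from_database() is a purely hypothetical stand-in for whatever durable store you use. With this pattern an evicted key is just a cache miss, not lost data.

```python
import json
import redis

r = redis.Redis(host="localhost", port=6379)

def load_from_database(user_id: int) -> dict:
    # Placeholder for the real query against the durable store (MySQL, MongoDB, ...).
    return {"id": user_id, "name": "example"}

def get_user(user_id: int) -> dict:
    key = f"user:{user_id}"
    cached = r.get(key)
    if cached is not None:                  # cache hit
        return json.loads(cached)
    user = load_from_database(user_id)      # cache miss: read from the durable store
    r.set(key, json.dumps(user), ex=3600)   # repopulate the cache with a 1-hour TTL
    return user
```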

How can I dump a single Redis DB index?

Redis SAVE and BGSAVE commands dump the complete Redis data to a persistent file.
But is there a way to dump only one DB index?
I am using the same Redis server with multiple DB indices.
I use DB 0 for config; it is edited manually and contains just a small number of keys. I wish to dump it to a file as a (versioned) config snapshot to keep track of manual changes in the prod environment.
The rest of the DBs have a large number of items that would take too long to dump, and I don't need to back them up.
Redis' persistence scope is the entire instance, meaning all shared/numbered databases and all keys in them. Saving only a subset of these is not supported.
Instead, use two independent Redis instances and configure each to persist (or not) per your needs. The overhead of running an instance is a few megabytes, so it is practically negligible.
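If splitting into two instances isn't an option, a client-side export of just DB 0 is a possible workaround for the versioned-config use case. The sketch below assumes redis-py; it writes each key's DUMP payload into a dated JSON file rather than producing a real RDB file.

```python
# Snapshot only DB 0 by walking it with SCAN and serializing each value with DUMP.
import base64
import datetime
import json
import redis

r = redis.Redis(host="localhost", port=6379, db=0)  # DB 0 only

snapshot = {}
for key in r.scan_iter():      # SCAN walks only the currently selected database
    payload = r.dump(key)      # Redis' internal serialization of the value
    if payload is not None:
        snapshot[key.decode()] = base64.b64encode(payload).decode()

stamp = datetime.date.today().isoformat()
with open(f"db0_snapshot_{stamp}.json", "w") as f:
    json.dump(snapshot, f, indent=2, sort_keys=True)
```

Restoring would be the reverse: base64-decode each value and replay it into a fresh DB 0 with RESTORE.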

What if Redis keys are never deleted programmatically?

What will happen to my Redis data if no expiry is set and no DEL command is used?
Will it be removed after some default time?
One more thing:
How does Redis store data, is it in some file format? Because I can access the data even after restarting the computer. So which files are created by Redis, and where?
Thanks.
Redis is an in-memory data store, meaning all your data is kept in RAM (i.e. volatile). So theoretically your data will live as long as you don't turn the power off.
However, it also provides persistence in two modes:
RDB mode, which takes snapshots of your dataset and saves them to disk in a file called dump.rdb. This is the default mode.
AOF mode, which records every write operation executed by the server in an append-only file and then replays it, thus reconstructing the original data.
Redis persistence is explained very well here and here by the creator of Redis himself.
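As a quick way to see which of the two modes a server is actually using, here is a small sketch assuming redis-py and a server on localhost; the config keys queried are standard Redis settings.

```python
import redis

r = redis.Redis(host="localhost", port=6379)

print(r.config_get("save"))        # RDB snapshot schedule; empty means snapshots are off
print(r.config_get("appendonly"))  # 'yes' when AOF mode is enabled
print(r.config_get("dir"))         # directory where dump.rdb / the AOF files live

r.bgsave()                         # ask the server to write an RDB snapshot in the background
print("last successful save:", r.lastsave())
```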