How to know whether the sync process has finished in Redis?

It seems that the only way to sync data between Redis servers is to use the slaveof command, but how can I know whether the data has been replicated successfully? I mean, I want to be notified as soon as the sync is done.
I've read some of the Redis source code, mainly replication.c, and found nothing official. The only way I know of for now is to use the Redis command info and poll a specific flag, which looks bad.
Is there any better way to do this?

The approach you're trying, i.e. slaveof, syncs data between a Redis master and a Redis slave. Whenever some data is written to the master, it is synced to the slave. So, technically, the sync is never DONE.
If what you want is a snapshot of the current data set, you can use the BGSAVE command to save the data set into an RDB file. With the LASTSAVE command, you can check whether the BGSAVE has completed. Then copy the file to another host and load it with Redis.
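For example, a minimal polling sketch along those lines, assuming the redis-py client and a locally reachable instance:

import time
import redis

r = redis.Redis(host="localhost", port=6379)

# Remember the timestamp of the last completed save, then trigger a new one.
last_save = r.lastsave()
r.bgsave()

# LASTSAVE only advances once the background save has completed.
while r.lastsave() == last_save:
    time.sleep(0.5)

print("BGSAVE finished; the RDB file can now be copied to another host.")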

Related

How does "Disk-backed" replication work in a Redis cluster?

The redis.conf file says:
1) Disk-backed: The Redis master creates a new process that writes the RDB
file on disk. Later the file is transferred by the parent
process to the slaves incrementally
I just don't know what "transferred by the parent process to the slaves" means.
Thank you.
It is simple: the master reads the RDB file into a buffer and writes it to the socket connected to the slave.
The actual implementation is more complex than that, but this is essentially what Redis does. You can refer to replication.c in redis/src for more details.
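As a conceptual sketch only (not Redis's actual implementation, which lives in replication.c), the "transferred by the parent process" part amounts to something like this, assuming an already-connected slave socket:

import socket

def send_rdb_to_slave(rdb_path: str, slave_socket: socket.socket, chunk_size: int = 16 * 1024) -> None:
    # The parent process reads the on-disk RDB incrementally...
    with open(rdb_path, "rb") as rdb:
        while True:
            chunk = rdb.read(chunk_size)
            if not chunk:
                break
            # ...and pushes each chunk down the replication socket to the slave.
            slave_socket.sendall(chunk)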
EDITED:
Yes, with the diskless mechanism the child process directly sends the RDB over the wire to the slaves, without using the disk as intermediate storage.
Actually, if you use the disk to save the RDB, the Redis master can serve many slaves at the same time without queuing. With diskless replication, once a transfer to one slave has started, any other slave that asks for a full sync has to be queued until the first transfer finishes. That is why there is another setting, repl-diskless-sync-delay, which delays the start of the transfer so that more slaves can join it in parallel.
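For reference, the relevant knobs in redis.conf look like this (the 5-second delay is just an illustrative value):

# Send the RDB straight over the socket instead of writing it to disk first.
repl-diskless-sync yes
# Wait a few seconds before starting, so more slaves can join the same transfer.
repl-diskless-sync-delay 5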
And these two methods only come into play after something goes wrong. In the normal case, the Redis master streams its write commands to the slave over a healthy connection, keeping master and slave identical. If the connection breaks or the slave goes down, a partial resynchronization is attempted to transfer only the part the slave missed. If the partial resync (PSYNC) is not possible, Redis falls back to a full resync, which is what we have been talking about.
This is how a full synchronization works in more detail:
The master starts a background saving process in order to produce an RDB file. At the same time it starts to buffer all new write commands received from the clients. When the background saving is complete, the master transfers the database file to the slave, which saves it on disk, and then loads it into memory. The master will then send all buffered commands to the slave. This is done as a stream of commands and is in the same format of the Redis protocol itself.
Diskless replication is just a newer feature that performs that full resync without stressing a slow disk. For more about it, such as how PSYNC works and why it can fail, refer to https://redis.io/topics/replication.

Redis data recovery from Append Only File?

If we enable the appendonly option in the redis.conf file, every operation which changes the Redis database is logged in that file.
Now, suppose Redis has used all the memory allocated to it by the "maxmemory" directive in the redis.conf file.
To store more data, it starts evicting data according to one of the behaviours (volatile-lru, allkeys-lru, etc.) specified in the redis.conf file.
Suppose some data gets removed from main memory, but its log will still be there in the append-only file (correct me if I am wrong). Can we get that data back using this append-only file?
Simply put, is there any way we can get that removed data back into main memory? For example, can we store that data on disk and load it into main memory when required?
I got this answer from Google Groups; I'm sharing it:
----->
Eviction of keys is recorded in the AOF as explicit DEL commands, so when the file is replayed, full consistency is maintained.
The AOF is used only to recover the dataset after a restart, and is not used by Redis for serving data. If the key still exists in it (with a subsequent eviction DEL), the only way to "recover" it is by manually editing the AOF to remove the respective deletion and restarting the server.
-----> Another answer for this
The AOF, as its name suggests, is a file that's appended to. It's not a database that Redis searches through and deletes the creation record when a deletion record is encountered. In my opinion, that would be too much work for too little gain.
As mentioned previously, a configuration that re-writes the AOF (see the BGREWRITEAOF command as one example) will erase any keys from the AOF that had been deleted, and now you can't recover those keys from the AOF file. The AOF is not the best medium for recovering deleted keys. It's intended as a way to recover the database as it existed before a crash - without any deleted keys.
If you want to be able to recover data after it was deleted, you need a different kind of backup. More likely a snapshot (RDB) file that's been archived with the date/time that it was saved. If you learn that you need to recover data, select the snapshot file from a time you know the key existed, load it into a separate Redis instance, and retrieve the key with RESTORE or GET or similar commands. As has been mentioned, it's possible to parse the RDB or AOF file contents to extract data from them without loading the file into a running Redis instance. The downside to this approach is that such tools are separate from the Redis code and may not always understand changes to the data format of the files the way the Redis server does. You decide which approach will work with the level of speed and reliability you want.
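As a hedged sketch of that snapshot-based recovery, assuming redis-py, an archived RDB already loaded into a separate instance on port 6380 (a hypothetical port), and a hypothetical key name:

import redis

recovery = redis.Redis(host="localhost", port=6380)  # instance loaded from the archived RDB
live = redis.Redis(host="localhost", port=6379)      # live instance missing the key

key = "some:deleted:key"                             # hypothetical key to recover

# DUMP serializes the value in Redis's internal format; RESTORE recreates it.
payload = recovery.dump(key)
if payload is not None:
    live.restore(key, 0, payload, replace=True)      # ttl=0 means no expiry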
But its log will still be there in the append-only file (correct me if I am wrong). Can we get that data back using this append-only file?
NO, you CANNOT get the data back. When Redis evicts a key, it also appends a delete command to the AOF. After rewriting the AOF, anything about the evicted key will be removed.
Is there any way we can get that removed data back into main memory? Like, can we store that data on disk and load it into main memory when required?
NO, you CANNOT do that. You have to use another durable data store (e.g. MySQL, MongoDB) for saving data to disk, and use Redis as a cache.
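A minimal cache-aside sketch of that division of labour, assuming redis-py and a hypothetical load_from_database function backed by the durable store:

import redis

r = redis.Redis(host="localhost", port=6379)

def load_from_database(key):
    # Hypothetical: fetch the authoritative value from MySQL/MongoDB/etc.
    raise NotImplementedError

def get_value(key):
    value = r.get(key)
    if value is None:                    # evicted from (or never in) the cache
        value = load_from_database(key)  # the durable store is the source of truth
        r.set(key, value, ex=3600)       # repopulate the cache with a TTL
    return value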

Propagating data from Redis slave to a SQL database

I'm using Redis for storing simple key-value pairs, where the value is also of string type. In my Redis cluster, I have a master and two slaves. I want to propagate any changes to the data from one of the slaves to another store (actually, an Oracle database). How can I do that reliably? The sink database only needs to be eventually consistent; some delay is allowed.
Strategies I can think of:
a) Read the AOF file written by the slave machine and propagate the changes. (Requires parsing the AOF file and getting notified of every change to the file.)
b) Use RPOPLPUSH (the reliable queue pattern). But how can I make the slave insert into that queue whenever it receives a SET event from the master?
Any other possibility?
This is a very common problem faced by Redis developers. In a nutshell, you want to:
Know all changes since the last check.
Keep this change data atomic.
I believe any solution will revolve around these two issues. So yes, the AOF is one of the best choices in this case, but there are no production-ready tools for consuming it. It is not a very complex solution in the case of a single server, but with master/slave or cluster setups it can become very complex.
Using Keyspace notifications
It looks like the Keyspace Notifications feature may be an alternative. Keyspace notifications have been available since 2.8.0 and work in Redis Cluster too. From the original documentation:
Keyspace notifications allow clients to subscribe to Pub/Sub channels in order to receive events affecting the Redis data set in some way. Examples of the events that can be received are the following:
All the commands affecting a given key.
All the keys receiving an LPUSH operation.
All the keys expiring in the database 0.
Events are delivered using the normal Pub/Sub layer of Redis, so clients implementing Pub/Sub are able to use this feature without modifications.
Because Redis Pub/Sub is fire-and-forget, there is currently no way to use this feature if your application demands reliable notification of events; that is, if your Pub/Sub client disconnects and reconnects later, all the events delivered while the client was disconnected are lost. This can be mitigated by running two groups of workers around this Pub/Sub channel:
A group of N workers subscribes to the notifications and adds the affected key names to a SET-based "sync" list. Using a SET controls overhead, since the same key is never written to the sync list twice (see the sketch after this list).
Another group of workers pops records with SPOP and writes them to the other store.
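A minimal sketch of the first group of workers, assuming redis-py, database 0, the "sync" set from this answer, and a server configured with notify-keyspace-events "E$" so key-event notifications fire for string commands:

import redis

r = redis.Redis(host="localhost", port=6379)

p = r.pubsub()
p.subscribe("__keyevent@0__:set")      # fired once per SET in database 0

for message in p.listen():
    if message["type"] != "message":
        continue                       # skip the subscribe confirmation
    key = message["data"]              # the payload of a key-event message is the key name
    r.sadd("sync", key)                # a SET dedupes repeated writes of the same key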
Using manual update list
The other way is to maintain a special SET-based "sync" list updated with every write operation (as I understand it, SET/HSET in your case). Something like:
MULTI
SET myKey value
SADD sync myKey
EXEC
Each time you modify a key, you add its name to the SET. Then, in another process or worker, you can SPOP a key name, read the value, and update the target store.
Also, instead of a SET with SPOP, you can keep the sync queue as a list and use RPOPLPUSH to move each key into some kind of in-progress list, protecting against keys being lost if a worker fails. In this case each worker first RPOPLPUSHes the key from the sync list to the in-progress list, pushes the data to storage, and then removes the key from the in-progress list.
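A sketch of that reliable-worker loop, assuming redis-py, a sync queue kept as a list (RPUSHed to instead of SADDed, since RPOPLPUSH works on lists), and a hypothetical write_to_oracle function:

import redis

r = redis.Redis(host="localhost", port=6379)

def write_to_oracle(key, value):
    # Hypothetical: persist the pair to the target database.
    raise NotImplementedError

while True:
    # Atomically move a key name from the sync list to the in-progress list.
    key = r.rpoplpush("sync", "sync:in-progress")
    if key is None:
        break                              # nothing left to propagate
    value = r.get(key)                     # read the current value of that key
    write_to_oracle(key, value)            # push it to the durable store
    r.lrem("sync:in-progress", 1, key)     # drop it from in-progress only after the write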

Migrate Redis Data To Cluster

We have a single Redis instance with a good amount of data (over 100GB). We also have an empty Redis Cluster with 6 nodes. What would be the best way to move all that data from the stand-alone instance to the Redis Cluster and have it distributed evenly?
After some searching around, I came across a post detailing how to move data over to a cluster. It may take some time to move lots of data over but this is the best way I've seen so far.
You can read about it here: https://fnordig.de/2014/03/11/redis-cluster-with-pre-existing-data/
You could make it easier by using redis-rdb-tools and a cluster proxy program like redis-cerberus: after you dump the data into an RDB file, run
rdb --command protocol RDB_FILE_PATH | nc PROXY_HOST PROXY_PORT
Piping an AOF file into a proxy may not work if the AOF file contains cross-slot commands like RPOPLPUSH (depending on the proxy's implementation). However, if you are actually using that kind of command, you are probably not supposed to be using a cluster anyway.

If I run a long transaction or Lua script on a master Redis instance, does it block the read-only slaves?

I want to be able to access a very recent copy of my master Redis server's keys. It doesn't have to be completely up to date, as I will be polling the read-only copy, but I don't want the transactions and Lua scripts I run on the master instance to block the read-only instance while I SCAN through its keys.
Can anyone confirm/deny this behaviour?
It won't block the slaves from serving reads, but while the master is busy processing your logic, replication is paused. Once the logic ends (possibly generating writes), replication will resume with the previously buffered contents plus the new ones (if any).
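If you want to observe that pause, here is a hedged sketch comparing replication offsets via INFO replication, assuming redis-py and hypothetical host/port values:

import redis

master = redis.Redis(host="localhost", port=6379)   # hypothetical master
replica = redis.Redis(host="localhost", port=6380)  # hypothetical read-only slave

# master_repl_offset grows as the master produces the replication stream;
# slave_repl_offset shows how much of it the slave has consumed.
lag = (master.info("replication")["master_repl_offset"]
       - replica.info("replication")["slave_repl_offset"])
print(f"replication lag: {lag} bytes")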