Redis async replication of a bitset

I am using Redis to store some pretty large bitsets. Redis runs in master/slave mode with Sentinel.
I got curious about the replication performance for very big bitsets (my bitsets are roughly 100 KB each).
From the documentation: async replication works by sending a stream of commands from the master to the slave.
Can I expect those commands to update a single bit on the slave, or do they copy the entire key each time? Obviously I would prefer SETBIT commands to be passed, rather than entire keys being re-set, in order to decrease network traffic.

Async replication will only pass the write command (e.g. SETBIT) to the replica in most cases.
If the replica falls too far behind, however, it will be flushed (cleared out) and a full resync will occur. This happens when there is a lot of latency combined with a large number of writes. If you see this happening, you can tune your replication buffers to lower the chance of a full resync.
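For example, the replication backlog (which is what makes partial resyncs possible) and the slave output buffer can both be raised at runtime. A minimal redis-cli sketch; the sizes below are illustrative rather than recommendations, and replica-host is a placeholder:

    # A larger backlog lets a lagging replica catch up with a partial resync:
    redis-cli CONFIG SET repl-backlog-size 64mb
    # Raise the hard/soft limits on the slave client output buffer:
    redis-cli CONFIG SET client-output-buffer-limit "slave 512mb 128mb 60"
    # To see what actually crosses the wire, attach MONITOR to the replica
    # and watch the individual SETBIT commands arrive:
    redis-cli -h replica-host MONITOR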

Related

How to restart a Scylla DB cluster without any data loss

I want to restart my Scylla DB cluster, but I don't want to lose any data.
Will I lose any data if I restart the nodes one after another?
No, you will not lose data if you do a rolling restart.
Scylla keeps the data replicated across multiple nodes (usually 3 or more).
Depending on your Replication Factor (RF) and Consistency Level (CL), you might see read or write operations fail during the restart. See the interactive calculator here: https://docs.scylladb.com/getting-started/consistency/#consistency-level-calculator
If "restarting a node" just involves restarting Scylla or rebooting the kernel on which it runs, then you're safe: Scylla is a distributed database, and is designed to support durability and availability even when nodes temporarily disappear from the network. When a node is temporarily down, all its data is still available for reads (from two other replicas), and also writes continue to work normally and will be eventually replicated to the down node when it finally comes up (using the "hinted handoff" and/or "repair" mechanisms).
However, if by "restarting a node" you mean something more destructive, such as replacing it with a brand-new node with empty storage (as in some cloud setups where nodes have transient storage), then you have to be more careful: if the node's data is lost, there are still two more replicas and the database continues to be available, but you should tell the cluster to "stream" the data the node lost back to it before repeating this destructive restart on additional nodes. If you have RF=3 and destroy three nodes at the same time, you will surely lose data.
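For the safe, rolling case, a minimal sketch assuming a systemd-managed install (service name scylla-server); drain each node, restart it, and confirm it is back before moving to the next:

    # On each node, one at a time:
    nodetool drain                        # flush memtables, stop accepting writes
    sudo systemctl restart scylla-server
    nodetool status                       # wait for this node to show UN (Up/Normal)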

Could you please explain the replication feature of Redis?

I am very new to Redis cache implementations.
Could you please let me know what the replication factor means?
How does it work, and what is its impact?
Thanks.
At the base of Redis replication (excluding the high availability features provided as an additional layer by Redis Cluster or Redis Sentinel) there is a very simple to use and configure leader follower (master-slave) replication: it allows replica Redis instances to be exact copies of master instances. The replica will automatically reconnect to the master every time the link breaks, and will attempt to be an exact copy of it regardless of what happens to the master.
This system works using three main mechanisms:
1. When a master and a replica instance are well-connected, the master keeps the replica updated by sending it a stream of commands, in order to replicate the effects on the dataset happening on the master side due to client writes, keys expired or evicted, and any other action changing the master dataset.
2. When the link between the master and the replica breaks, because of network issues or because a timeout is sensed in the master or the replica, the replica reconnects and attempts a partial resynchronization: it tries to obtain just the part of the command stream it missed during the disconnection.
3. When a partial resynchronization is not possible, the replica asks for a full resynchronization. This involves a more complex process in which the master creates a snapshot of all its data, sends it to the replica, and then continues sending the stream of commands as the dataset changes.
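To see which mechanism was used, Redis exposes sync counters; a quick check against the master (the values shown are illustrative):

    redis-cli INFO stats | grep sync_
    # sync_full:0           full resynchronizations performed
    # sync_partial_ok:2     partial resync requests accepted
    # sync_partial_err:0    partial resync requests rejected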
Redis uses asynchronous replication by default, which, being low-latency and high-performance, is the natural replication mode for the vast majority of Redis use cases.
Synchronous replication of certain data can be requested by clients using the WAIT command. However, WAIT is only able to ensure that there is a specified number of acknowledged copies in the other Redis instances; it does not turn a set of Redis instances into a CP system with strong consistency: acknowledged writes can still be lost during a failover, depending on the exact configuration of the Redis persistence. With WAIT, however, the probability of losing a write after a failure event is greatly reduced to certain hard-to-trigger failure modes.
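A minimal redis-cli sketch of WAIT; the key name, replica count, and timeout are arbitrary:

    redis-cli SET mykey "value"
    # Block until at least 1 replica has acknowledged the write, or 100 ms pass;
    # returns the number of replicas that acknowledged it:
    redis-cli WAIT 1 100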

Redis replication for large data to new slave

I have a Redis master with 30 GB of data on a machine with 90 GB of memory. We use this setup because we have few writes and many reads; normally we would provision RAM at 3x the DB size.
The problem here is that one slave went corrupt, and later, when we added it back using Sentinel, it got stuck in the wait_bgsave state on the master (as seen in INFO on the master).
The reason was this setting on the master:
client-output-buffer-limit slave 256mb 64mb 60
Since the required memory is not available, this breaks replication for the new slave.
I saw this question, Redis replication and client-output-buffer-limit, where a similar issue is discussed, but I have a broader question.
We can't use a lot of memory, so what are the possible ways to do replication in this context that prevent any failure on the master (w.r.t. memory and latency impact)?
I have a few things in mind:
1 - Should I do diskless replication? Will it have any impact on the latency of writes and reads?
2 - Should I just copy the dump file from another slave to this new slave and restart Redis? Will that work?
3 - Should I increase the slave output buffer limit to a greater value? If yes, how much? I want to do this only until replication completes and then revert to the normal setting, but I am skeptical about this approach.
You got this problem because you have a slow replica that cannot read the replication data as fast as needed.
To solve it, you can try increasing the client-output-buffer-limit. You can also try disabling persistence on the replica while it syncs from the master, and re-enabling it afterwards; with persistence disabled, the replica might consume the data faster (see the sketch below). However, if the bandwidth between master and replica is really small, you might need to consider re-deploying your replica closer to the master to get more bandwidth.
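A minimal sketch of that persistence toggle, assuming redis-cli access to the replica; the save schedule at the end is a placeholder for whatever your normal snapshot settings are:

    # On the replica, before (re)attaching it to the master:
    redis-cli CONFIG SET appendonly no
    redis-cli CONFIG SET save ""
    # Wait until INFO replication shows master_link_status:up, then restore:
    redis-cli CONFIG SET appendonly yes
    redis-cli CONFIG SET save "900 1 300 10"   # placeholder snapshot schedule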
1 - Should I do diskless replication? Will it have any impact on latency of writes and reads?
IMHO, this has nothing to do with diskless replication.
2 - Should I just copy the dump file from another slave to this new slave and restart Redis? Will that work?
NO, it won't work.
3 - Should I increase the slave output buffer limit to a greater value? If yes, how much? I want to do this only until replication completes and then revert to the normal setting.
YES, you can try to increase the limit. In your case, since your data size is 30 GB, a hard limit of 30 GB should solve the problem. However, that is a lot, and it might have other impacts; you need to benchmark to find the right limit.
YES, you can change this setting dynamically with the CONFIG SET command.
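A sketch of what that runtime change might look like, run against the master with redis-cli; the 30gb hard limit mirrors the data size discussed above and should be validated for your workload:

    # Before attaching the new slave:
    redis-cli CONFIG SET client-output-buffer-limit "slave 30gb 30gb 60"
    # After the slave reports master_link_status:up, revert to the old limit:
    redis-cli CONFIG SET client-output-buffer-limit "slave 256mb 64mb 60"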

Behaviour of redis client-output-buffer-limit during resynchronization

I'm assuming that during replica resynchronisation (full or partial), the master will attempt to send data as fast as possible to the replica. Wouldn't this mean the replica output buffer on the master would rapidly fill up since the speed the master can write is likely to be faster than the throughput of the network? If I have client-output-buffer-limit set for replicas, wouldn't the master end up closing the connection before the resynchronisation can complete?
Yes, the Redis master will close the connection, and the synchronization will start from the beginning again. But please find some details below.
Do you need to touch this configuration parameter at all, and what is the purpose/benefit/cost of doing so?
There is an (almost) zero chance of this happening with the default configuration and moderately modern hardware.
"By default normal clients are not limited because they don't receive data
without asking (in a push way), but just after a request, so only asynchronous clients may create a scenario where data is requested faster than it can read." - the chunk from documentation .
Even if it does happen, the replication will start from the beginning again, and this may lead to an infinite loop in which slaves continuously ask for synchronization over and over. Each time, the master needs to fork in order to take a whole memory snapshot (perform a BGSAVE) and can use up to 3x the initial snapshot size in RAM during the synchronization. This causes higher CPU utilization, memory spikes, network utilization, and IO.
General recommendations to avoid production issues when tweaking this configuration parameter:
Don't decrease this buffer, and before increasing its size make sure you have enough memory on your box.
Budget the total amount of RAM as the snapshot memory size (doubled for the copy-on-write BGSAVE process), plus the size of any other configured buffers, plus some extra capacity.
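A quick way to sanity-check those numbers before touching the limit, assuming redis-cli access to the master:

    redis-cli INFO memory | grep used_memory_human       # approximate snapshot size
    redis-cli CONFIG GET maxmemory                       # configured memory ceiling
    redis-cli CONFIG GET client-output-buffer-limit      # current per-class limits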

How to build a simplified Redis cluster (supporting data sharding and load balancing)?

Since Redis Cluster is still a work in progress, I want to build a simplified one myself at the current stage. The system should support data sharding, load balancing, and master-slave backup. A preliminary plan is as follows:
Master-slave: use multiple master-slave pairs in different locations to enhance data security. Masters are responsible for write operations, while both masters and slaves can serve reads. Data is sent to all the masters during one write operation. Use Keepalived between the master and the slave to detect failures and switch master and slave automatically.
Data sharding: write a consistent hash on the client side to support data sharding during writes/reads, in case the memory of a single machine is not enough.
Load balancing: use LVS to redirect read requests to the corresponding server.
My question is how to combine LVS and the data sharding together.
For example, because of data sharding, all keys are split and stored on servers A, B and C without overlap. Considering the slave backups and other master-slave pairs, the system will contain groups 1(A,B,C), 2(A,B,C), 3(A,B,C) and so on, where each group has three servers. How should LVS be configured to support this kind of redirection when a read request comes in? Or is there another approach in Redis to achieve the same goal?
Thanks :)
You can get really close to what you need by using:
twemproxy to shard data across multiple Redis nodes (it also supports node ejection and connection pooling)
Redis master/slave replication
Redis Sentinel to handle master failover
Depending on your needs, you probably also want a script listening for failovers (see the Sentinel docs) that cleans things up when a master goes down.
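For the sharding piece, a minimal twemproxy (nutcracker) configuration sketch; the pool name, ports, and server addresses are placeholders, and ketama gives you the consistent hashing you planned to write by hand:

    cat > nutcracker.yml <<'EOF'
    mypool:
      listen: 127.0.0.1:22121
      hash: fnv1a_64
      distribution: ketama        # consistent hashing across shards
      redis: true
      auto_eject_hosts: true
      server_retry_timeout: 30000
      server_failure_limit: 3
      servers:
        - 10.0.0.1:6379:1         # master of pair A
        - 10.0.0.2:6379:1         # master of pair B
        - 10.0.0.3:6379:1         # master of pair C
    EOF
    nutcracker -c nutcracker.yml

Clients then talk to the proxy on port 22121 as if it were a single Redis server, and twemproxy routes each key to the right shard.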