AWS MemoryDB minimum number of nodes - redis-cluster

I'm trying to use AWS MemoryDB for an application that has high availability requirements but only a small amount of data to store.
I want to minimize costs, and was going to go with a cluster that has 1 shard, and 2 nodes in separate availability zones.
However, I'm seeing this warning within the web console:
"Warning: To architect for high availability, we recommend that you retain at least 3 nodes per shard (1 primary and 2 replicas)."
I can't find any explanation for why 3 nodes would be necessary instead of 2. Does anyone know the reason for that recommendation? And does it hold with a small dataset within 1 shard?
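One commonly cited motivation for recommendations like this is majority-based failure detection: with an even number of nodes, losing one can leave the remainder unable to form a majority, which complicates automatic failover. The warning text itself doesn't say this, so treat the following as a generic sketch of quorum arithmetic, not MemoryDB's actual election logic:

```python
# Minimal sketch of majority-quorum arithmetic, assuming the 3-node
# recommendation is quorum-motivated (an assumption; this is not
# MemoryDB's internal failover code).

def majority(n: int) -> int:
    """Smallest number of nodes that constitutes a majority of n."""
    return n // 2 + 1

def failures_tolerated(n: int) -> int:
    """Node losses the group can absorb while still reaching a majority."""
    return n - majority(n)

for n in (2, 3):
    print(f"{n} nodes: tolerate {failures_tolerated(n)} failure(s)")
# With 2 nodes, a single loss leaves 1 of 2, which is not a majority;
# with 3 nodes, one loss still leaves a 2-of-3 majority.
```

If that is indeed the reasoning, it holds regardless of dataset size, since it concerns node counts rather than data volume.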

Related

Replication across AZ

We have a 6-node cluster setup in which 3 server nodes are spread across 3 availability zones and each zone also has a client node. All is set up in a Kubernetes-based service.
Important configurations:
storeKeepBinary = true
cacheMode = PARTITIONED
atomicityMode = ATOMIC (about 5-8 of the 25 caches are TRANSACTIONAL)
backups = 1
readFromBackups = false
no persistence for these tests
When we run it locally on physical boxes, we get decent throughput. However, when we deploy this in the cloud in an AZ-based setup on Kubernetes, we see a steep drop. We can only get performance comparable to the on-prem cluster tests when we keep a single cache node without any backups (backups=0).
I get that different hardware and network latency in the cloud come into play. While I investigate all of that with respect to the cloud differences, I want to understand whether there are some obvious behavioral issues under the covers with respect to Ignite. I'm trying to understand a few things, outlined below:
Why should cache get calls be slower? The data is partitioned, so lookup should be by key, and since we have turned off readFromBackups, it should always go to the primary partition. So adding more cache server nodes should not change any of the get-call latencies.
Similarly for inserts/puts: other than the caches where atomicity is TRANSACTIONAL, everything else should behave the same when we go from one cache node to three.
Are there any other areas anyone can suggest that I could look at, in configuration or elsewhere?
TIA
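The get-call reasoning above can be sketched with a toy model (plain Python, illustrative only, not Ignite's internals): each key hashes to one partition, each partition has one primary node, and with readFromBackups=false a get always touches exactly that primary, regardless of the backup count.

```python
# Toy model of primary-only reads in a partitioned cache (illustration
# only; the hash below is a stand-in for Ignite's affinity function).

PARTITIONS = 8
NODES = ["node-a", "node-b", "node-c"]

def partition_of(key: str) -> int:
    # Stand-in key-to-partition hash.
    return sum(ord(c) for c in key) % PARTITIONS

def primary_of(key: str) -> str:
    # Each partition has exactly one primary node.
    return NODES[partition_of(key) % len(NODES)]

# A get for a given key hits one deterministic node, so adding backup
# copies should not change the read path: one network hop either way.
print(primary_of("trade:123"))
```

Under this model the question's expectation holds for reads; write latency, by contrast, does depend on backups=1, since the primary must synchronously (or asynchronously, depending on writeSynchronizationMode) update the backup copy.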

Why do we need distributed lock in Redis if we partition on Key?

Folks,
I read through "Why we need distributed lock in the Redis" but it does not answer my question.
I went through the following:
https://redis.io/topics/distlock (to be specific, https://redis.io/topics/distlock#the-redlock-algorithm) and
https://redis.io/topics/partitioning
I understand that partitioning is used to distribute the load across the N nodes. Does this not mean that the data of interest will always live on one node? If so, why should I go for a distributed lock that locks across all N nodes?
Example:
Say I persist a trade with the id 123; the partitioning, based on the hashing function, will work out which node it has to go to. Say, for the sake of this example, it goes to node 0.
Now my clients (multiple instances of my service) will want to access trade 123.
Redis, again based on the hashing, is going to find which node trade 123 lives on.
This means one client (in reality, only one instance among the multiple instances of my service) will lock trade 123 on node 0.
So why would it care to lock all N nodes?
Unless partitioning is done on a particular record across all the nodes, i.e.
say a single trade is 1 MB in size and it is partitioned across 4 nodes with 0.25 MB on each node.
Kindly share your thoughts. I am sure I am missing something but any leads are much appreciated.
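The intuition in the question can be sketched like this. Redis Cluster routes a key with CRC16(key) mod 16384 hash slots; the toy hash below is a simplified stand-in for that, but the routing property it demonstrates is the same: every client computes the same owner for a given key, so a per-key lock only needs that one node.

```python
# Toy illustration of single-node key ownership (simplified stand-in
# for Redis Cluster's CRC16(key) mod 16384 slot routing).

NODES = 4

def owning_node(key: str) -> int:
    # Deterministic key -> node mapping.
    return sum(ord(c) for c in key) % NODES

# Every client instance computes the same owner for trade 123, so they
# all contend for a lock held on that single node; no cross-node
# locking is needed as long as the record is not split across nodes.
owners = [owning_node("trade:123") for _ in range(5)]
print(owners)
```

Redlock-style locking across N independent Redis instances addresses a different scenario: instances that are not coordinating via cluster slot ownership, where a lock on one node alone is not safe against that node failing or a client talking to a stale replica.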

What are the recommended settings for fast queue binding in a RabbitMQ cluster

When binding a queue in a RabbitMQ cluster with com.rabbitmq:amqp-client:5.4.3, this takes a considerable amount of time when many queues are bound to an exchange within a short duration (averaging ca. 1 second, with max times reaching 10 seconds). The RabbitMQ cluster is running 3 nodes in version 3.7.8 (Erlang 20.3.4) inside Rancher/Kubernetes.
When reducing the number of nodes down to a single node, maximum times stay well below 1 second (<700 ms. Still somewhat long, if you ask me, but acceptable). With a single node, average times range between 10 and 100 milliseconds.
I understand that replicating this information between the cluster nodes can take some time, but 1 to 2 orders of magnitude worse performance with just 3 nodes? (the same happens with 2 nodes as well, but 3 nodes is the minimum for a meaningful cluster setup)
Are there some knobs to turn to bring the time to bind a queue to an acceptable level for a cluster? Is there a configuration I'm missing? Having only a single node is a non-option with HA in mind. I couldn't find anything helpful in https://www.rabbitmq.com/clustering.html until now, maybe I missed it?
TLDR:
Are these timings expected for a RabbitMQ cluster?
What is the performance one can expect from a simple RabbitMQ (3 nodes) when binding queues to exchanges?
How many "bind" operations can be performed in e.g. 1 second?
Which factors affect this? Number of exchanges, number of queues, number of existing bindings, frequency of the operation?
What are the options to reduce the time required to create bindings?
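For comparing setups in a controlled way, a throwaway harness like the following can record average and worst-case bind latency. The bind() here is a stub (an assumption for illustration); in a real run you would replace it with your client's actual bind call, e.g. Channel#queueBind in the Java client, against each cluster configuration.

```python
# Throwaway benchmark sketch for bind latency. bind() is a placeholder;
# substitute the real AMQP bind call for actual measurements.
import time

def bind(queue: str, exchange: str) -> None:
    pass  # placeholder for the real broker round-trip

def measure(n: int = 100):
    """Return (average, max) latency in seconds over n bind calls."""
    samples = []
    for i in range(n):
        start = time.perf_counter()
        bind(f"queue-{i}", "my-exchange")
        samples.append(time.perf_counter() - start)
    return sum(samples) / n, max(samples)

avg, worst = measure()
```

Running the same harness against 1-node and 3-node clusters, and while varying the number of existing queues and bindings, would at least isolate which of the factors listed above dominates.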

Forcing Riak to store data on distinct physical servers

I'm concerned by this note in Riak's documentation:
N=3 simply means that three copies of each piece of data will be stored in the cluster. That is, three different partitions/vnodes will receive copies of the data. There are no guarantees that the three replicas will go to three separate physical nodes; however, the built-in functions for determining where replicas go attempts to distribute the data evenly.
https://docs.basho.com/riak/kv/2.1.3/learn/concepts/replication/#so-what-does-n-3-really-mean
I have a cluster of 6 physical servers with N=3. I want to be 100% sure that total loss of some number of nodes (1 or 2) will not lose any data. As I understand the caveat above, Riak cannot guarantee that. It appears that there is some (admittedly low) portion of my data that could have all 3 copies stored on the same physical server.
In practice, this means that for a sufficiently large data set I'm guaranteed to completely lose records if I have a catastrophic failure on a single node (gremlins eat/degauss the drive or something).
Is there a Riak configuration that avoids this concern?
Unfortunate confounding reality: I'm on an old version of Riak (1.4.12).
There is no configuration that avoids the minuscule possibility that a partition might have 2 or more copies on one physical node (although having 5+ nodes in your cluster makes it extremely unlikely that a single node will have more than 2 copies of a partition). With your 6-node cluster it is extremely unlikely that you would have 3 copies of a partition on one physical node.
The riak-admin command-line tool can help you explore your partitions/vnodes. Running riak-admin vnode-status (http://docs.basho.com/riak/kv/2.1.4/using/admin/riak-admin/#vnode-status) on each node, for example, will output the status of all vnodes that are running on the local node the command is run on. If you run it on every node in your cluster, you can confirm whether or not your data is distributed in a satisfactory way.
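The "same physical node" caveat is easier to see with a toy ring (plain Python, not Riak's actual claim algorithm, which tries harder than random assignment): replicas for a key occupy N consecutive partitions on the ring, so they land on distinct physical nodes only if no window of N adjacent partitions repeats a node.

```python
# Toy consistent-hashing ring illustrating the replica-placement caveat
# (illustration only; Riak's real claim algorithm is smarter than the
# random assignment used here).
import random

RING_SIZE, N, PHYSICAL_NODES = 64, 3, 6
random.seed(1)
# Which physical node owns each ring partition (random claim).
owner = [random.randrange(PHYSICAL_NODES) for _ in range(RING_SIZE)]

def replica_nodes(partition: int) -> list:
    """Physical nodes holding the N replicas starting at this partition."""
    return [owner[(partition + i) % RING_SIZE] for i in range(N)]

# Partitions whose replica set repeats a physical node.
unsafe = [p for p in range(RING_SIZE) if len(set(replica_nodes(p))) < N]
```

With a purely random claim, some windows will almost certainly repeat a node; a strict round-robin assignment of partitions to nodes would leave the unsafe list empty, which is roughly what the even-distribution effort in the documentation is aiming for without guaranteeing it.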

some questions about couchbase's replicas detail

Here I have several questions about the replica functionality in Couchbase, and I hope they can be answered. First of all, I want to give my own understanding of Couchbase: if there are 10 nodes in my cluster, and I set the number of replicas to 3 on each bucket (actually I find that the maximal value is 3, and I couldn't find any way to make it larger), then does it mean that each piece of data in a bucket can only be copied to three other nodes (I guess the three nodes are randomly chosen, but can they be selected manually?) out of the 10 in total? Furthermore, if some of the 10 nodes go down, will it cause loss of data?
I summarize my questions as follows:
1. The maximal value of the replica number in Couchbase is 3, right or wrong? If wrong, how could it be made larger than 3?
2. I guess the three nodes are randomly chosen, but can they be selected manually?
3. If my understanding is right, there will be data loss when some nodes go down. How can we avoid the loss in that situation?
The maximal value of the replica number in couchbase is 3, right or wrong? If wrong, how could it be larger than 3.
The maximum number of replicas that you can have is 3. We run in production with 1 replica, but it all comes down to how large your cluster is and the performance impact: the more replicas you have, the more inter-node communication and transfer will occur.
When you have 3 replicas, each node's data is replicated to 3 other nodes, which means you could handle 3 node failures in your cluster. That could happen, but it is unlikely; what's more likely is that a single machine dies, and then Couchbase can automatically fail over and promote a replica held on another node to serve requests/updates.
Couchbase's system is nice because it means you can scale up and down by failing over a node and automatically rebalancing.
I guess the three nodes should be randomly chosen, but could I select them manually?
You have no say in which nodes replicas are held on. In fact, I think it's a great thing that all of Couchbase's sharding and replica placement is taken out of the developer's hands; it's all an automatic process.
If my understanding is right, it will have loss of data when we find some nodes being in downtime. How could we avoid data loss under that condition?
As I said before, if one node goes down then a replica is promoted; with 3 backups you'd need 3 nodes to fail before you noticed something happening. In a production environment you should already have a warning system for each individual node, be it New Relic, Nagios, etc., that would report if a server dies. If there was a catastrophic problem and you lost 4 or more nodes at once, then yes, you would have data loss.
Bear in mind that automatic failover in Couchbase isn't instantaneous, but it's still pretty quick. If you need downtime across the cluster, say for server maintenance that needs a restart, then it is possible to fail a node over, remove it from the cluster, perform operations and tasks on it, then add it back into the cluster and rebalance. Repeat those steps for as many nodes as you need; I've personally done that when I forgot to set specific Linux settings that needed a system restart.
Overall, to avoid data loss: monitor your servers, make (daily/hourly) backups of the data in your cluster, and have machines that are well provisioned for your workload.
Hope that helps!
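The failover reasoning above can be sketched with a toy vBucket map (illustrative Python with a made-up placement function, not Couchbase's real mapping): each vBucket keeps one active copy plus R replicas on distinct nodes, so a vBucket's data is lost only if all R+1 holders fail before a rebalance restores the replicas.

```python
# Toy vBucket map: 1 active + R replica copies per vBucket on distinct
# nodes. Placement here is a deterministic stand-in (an assumption),
# not Couchbase's actual vBucket-to-node mapping.

NODES = 10
REPLICAS = 3
VBUCKETS = 64

def holders(vb: int) -> set:
    # Active copy on node (vb mod NODES), replicas on the next R nodes.
    return {(vb + i) % NODES for i in range(REPLICAS + 1)}

def data_lost(failed: set) -> bool:
    """True if every holder of some vBucket is in the failed set."""
    return any(holders(vb) <= failed for vb in range(VBUCKETS))

assert not data_lost({0, 1, 2})   # any 3 simultaneous failures survivable
assert data_lost({0, 1, 2, 3})    # 4 failures can wipe all 4 copies of vBucket 0
```

This also illustrates why the rolling-maintenance procedure described above is safe: failing over and rebalancing one node at a time never removes more than one copy of any vBucket at once.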