I went through the Redis Cluster documentation, where it says there are 16384 slots in a Redis cluster (cluster mode enabled). Does this mean a cluster can have a maximum of only 16384 master nodes?
If yes, then how do we scale beyond 16384 master nodes?
If no, then how would that work, since at least two masters would have to be assigned the same hash slot?
The key space is split into 16384 slots, effectively setting an upper limit for the cluster size of 16384 master nodes (however the suggested max size of nodes is in the order of ~ 1000 nodes).
Since every slot must be served by exactly one master, a cluster of 16384 masters would own exactly one slot each, and there would be nothing left to assign to an additional node.
For more information, refer to the cluster spec: https://redis.io/topics/cluster-spec
Hope this helps :)
Redis Multi Datacenter
We have many datacenters, but datacenter1 is the main one.
The master in datacenter1 is monitored by Sentinel, so if the master goes down, one of the replicas becomes master, and all data is synced continuously.
We want to have one Redis replica in each datacenter that replicates all data from datacenter1 but without the ability to become master (data always comes from datacenter1; only replica 1 may be promoted, the other replicas must not be).
Is there a Redis config for this, or any other idea?
The Redis config file [1] has a replica-priority parameter which should serve your purpose.
The replica priority is an integer number published by Redis in the INFO
output. It is used by Redis Sentinel in order to select a replica to promote
into a master if the master is no longer working correctly.
A replica with a low priority number is considered better for promotion, so
for instance if there are three replicas with priority 10, 100, 25 Sentinel
will pick the one with priority 10, that is the lowest.
However a special priority of 0 marks the replica as not able to perform the
role of master, so a replica with priority of 0 will never be selected by
Redis Sentinel for promotion.
By default the priority is 100.
The idea is to set a low replica-priority on the replica in datacenter1 that is allowed to take over, and replica-priority 0 on the replicas in all other datacenters, so that Sentinel can never promote them.
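For example, a minimal redis.conf sketch under the assumption of one promotable replica in datacenter1 (the concrete priority values are illustrative):

# replica in datacenter1 that may take over (lower number = preferred)
replica-priority 10

# replicas in every other datacenter: priority 0 means Sentinel will
# never promote them; they only serve the replicated data
replica-priority 0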
[1] redis.conf file of Redis version 6.2.6: https://github.com/redis/redis/blob/6.2.6/redis.conf
I have read the Redis Cluster documents but couldn't get the gist of them. Can someone help me understand it from the basics?
Redis Cluster does not use consistent hashing, but a different form of
sharding where every key is conceptually part of what we call an hash
slot.
Hash slots are how Redis maps keys to the different nodes in the cluster: the 16384 slots are divided up and distributed across the nodes.
For example, in a 3-node cluster one node can hold slots 0 to 5460, the next 5461 to 10922, and the third 10923 to 16383. The input key (or a part of it) is run through a hash function (CRC16, taken modulo 16384) to determine the slot number, and hence the node to add the key to.
Think of it as logical shards. So redis has 16384 logical shards and these logical shards are mapped to the available physical machines in the cluster.
The mapping may look something like:
0-1000 : Machine 1
1001-2000 : Machine 2
2001-3000 : Machine 3
...
...
When Redis gets a key, it does the following:
Calculate CRC16(key) % 16384 -> this finds the logical shard where the given key belongs; let's say it comes out to 1500.
Look at the logical-shard-to-physical-machine mapping to identify the physical machine. In the mapping above, logical shard 1500 is served by Machine 2, so the request is routed to physical machine #2.
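As a rough illustration, recent Jedis versions expose the slot function they use internally, so you can reproduce the lookup yourself (the key name and the three-node range layout are just assumptions for this example):

import redis.clients.jedis.util.JedisClusterCRC16;

public class SlotDemo {
    public static void main(String[] args) {
        // Redis Cluster uses CRC16(key) mod 16384 as the hash function
        int slot = JedisClusterCRC16.getSlot("user:1500:profile");
        System.out.println("slot = " + slot); // always in 0..16383

        // With the example 3-node layout (0-5460, 5461-10922, 10923-16383),
        // finding the owning node is a simple range lookup
        int machine = slot <= 5460 ? 1 : (slot <= 10922 ? 2 : 3);
        System.out.println("served by machine #" + machine);
    }
}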
You can take "slot" quite literally, like a slot in the real world.
Every key belongs to exactly one slot according to a fixed rule, and every slot belongs to a certain Redis node according to the cluster configuration.
I have a Redis Cluster. I am using JedisCluster client to connect to my Redis.
My application is a bit complex and I want to basically control to which partition data from my application goes. For example, my application consists of sub-module A, B, C. Then I want that all data from sub-module A should go to partition 1 for example. Similarly data from sub-module B should go to partition 2 for example and so on.
I am using JedisCluster, but I can't find any API to write to a particular partition on my cluster. I am assuming I will have the same partition names on all my Redis nodes, that which data goes to which node will be handled automatically, and that which partition it goes to will be handled by me.
I tried going through the JedisCluster lib at
https://github.com/xetorthio/jedis/blob/b03d4231f4412c67063e356a7c3acf9bb7e62534/src/main/java/redis/clients/jedis/JedisCluster.java
but couldn't find anything. Please help?
Thanks in advance for the help.
That's not how Redis Cluster works. With Redis Cluster, each master node (partition) serves a defined set of keys (slots). Writing a key whose slot is not served by that master results in the command being rejected with a MOVED redirection.
From the Redis Cluster Spec:
Redis Cluster implements a concept called hash tags that can be used in order to force certain keys to be stored in the same node.
[...]
The key space is split into 16384 slots, effectively setting an upper limit for the cluster size of 16384 master nodes (however the suggested max size of nodes is in the order of ~ 1000 nodes).
Each master node in a cluster handles a subset of the 16384 hash slots.
At the cluster configuration level you define which master node exclusively serves a particular slot or set of slots; this slot assignment is what determines data locality.
The slot is calculated from the key. The good news is that you can enforce a particular slot for a key by using key hash tags:
There is an exception for the computation of the hash slot that is used in order to implement hash tags. Hash tags are a way to ensure that multiple keys are allocated in the same hash slot. This is used in order to implement multi-key operations in Redis Cluster.
Example:
{user1000}.following
Only the content between {…} is used to calculate the slot, so keys that share the same hash tag are guaranteed to end up in the same slot; hash tags therefore let you group related keys on a particular node.
You can also go a step further by using hash tags that are known to map to particular slots (you'd need to either precalculate a table or see this Gist). By using a hash tag known to map to a specific slot, you can select the slot, and therefore the master node, on which the data is located.
Everything else is handled by your Redis client.
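A small sketch of both techniques, using the slot function that recent Jedis versions ship (the key names and the target slot are made-up examples):

import redis.clients.jedis.util.JedisClusterCRC16;

public class HashTagDemo {
    public static void main(String[] args) {
        // Only the text between { and } is hashed, so these two keys
        // are guaranteed to live in the same slot (same master node)
        System.out.println(JedisClusterCRC16.getSlot("{user1000}.following"));
        System.out.println(JedisClusterCRC16.getSlot("{user1000}.followers"));

        // Going a step further: brute-force a tag that maps to a slot
        // you want, e.g. one served by the master chosen for sub-module A
        int targetSlot = 42; // assumption: a slot owned by "partition 1"
        for (int i = 0; ; i++) {
            String tag = "t" + i;
            if (JedisClusterCRC16.getSlot(tag) == targetSlot) {
                System.out.println("use keys like {" + tag + "}.whatever");
                break;
            }
        }
    }
}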
I'm using Apache Cassandra 2.1.1, and when I run nodetool status, the Load for one of my nodes is about half that of the other two, while Owns is almost equal on all the nodes. I am somewhat new to Cassandra and don't know if I should be worried about this or not. I have tried running repair and cleanup after restarting all the nodes, but it still appears unbalanced. I am using GossipingPropertyFileSnitch with each node configured with dc=DC1 and rack=RAC1 in cassandra-rackdc.properties. I am also using Murmur3Partitioner with NetworkTopologyStrategy, where my keyspace is defined as
CREATE KEYSPACE awl WITH replication = {'class': 'NetworkTopologyStrategy', 'DC1': '2'} AND durable_writes = true;
I believe the problem is with the awl keyspace, since the size of the data/awl folder matches the size reported by nodetool status. My output for nodetool status is below. Any help would be much appreciated.
Datacenter: DC1
===============
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
-- Address Load Tokens Owns (effective) Host ID Rack
UN 10.1.1.152 3.56 GB 256 68.4% d42945cc-59eb-41de-9872-1fa252762797 RAC1
UN 10.1.1.153 6.8 GB 256 67.2% 065c471d-5025-4bf1-854d-52d579f2a6d3 RAC1
UN 10.1.1.154 6.31 GB 256 64.4% 46f05522-29cc-491c-ab65-334b205fc415 RAC1
I would suspect this is due to the distribution of the key values being inserted. They are probably not well distributed across the possible key values, so many of them hash to the same node. Since you are using replication factor 2, the second replica is placed on the next node in the ring, resulting in two nodes with more data than the third.
You didn't show your table schema, so I don't know what you are using for the partition and clustering keys. You want a partition key with high cardinality and good distribution, to avoid hot spots where many inserts hash to the same node. With a better distribution you will get better performance and more even space usage across the nodes.
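For instance, a hypothetical time-series schema along these lines would distribute well, because the partition key has high cardinality and writes are spread across many partitions (the table and column names are invented for illustration):

CREATE TABLE awl.sensor_readings (
    sensor_id uuid,          -- high-cardinality partition key: spreads rows evenly
    reading_time timestamp,  -- clustering key: orders rows within a partition
    value double,
    PRIMARY KEY (sensor_id, reading_time)
);

Partitioning by a low-cardinality column (a status flag, a handful of customer IDs) would instead funnel most inserts to a few token ranges and produce exactly this kind of imbalance.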
I'm running a redis instance where I have stored a lot of hashes with integer fields and values. Specifically, there are many hashes of the form
{1: <int>, 2: <int>, ..., ~10000: <int>}
I was initially running Redis with the default values for hash-max-ziplist-entries and hash-max-ziplist-value:
hash-max-ziplist-entries 512
hash-max-ziplist-value 64
and redis was using approximately 3.2 GB of memory.
I then changed these values to
hash-max-ziplist-entries 10240
hash-max-ziplist-value 10000
and restarted redis. My memory usage went down to approximately 480 MB, but redis was using 100% CPU. I reverted the values back to 512 and 64, and restarted redis, but it was still only using 480 MB of memory.
I assume that the memory usage went down because a lot of my hashes were stored as ziplists. I would have guessed that after reverting the values and restarting Redis they would automatically be converted back into hash tables, but this doesn't appear to be the case.
So, are these hashes still being stored as a ziplist?
They are still in the optimized "ziplist" format.
Redis stores a hash (created via HSET or similar) in the optimized format as long as the hash ends up with no more than hash-max-ziplist-entries fields and all of its values are smaller than hash-max-ziplist-value bytes.
If either limit is exceeded, Redis stores the item "normally", i.e. not optimized.
Relevant section in documentation (http://redis.io/topics/memory-optimization):
If a specially encoded value will overflow the configured max size, Redis will automatically convert it into normal encoding.
Once the values are written in an optimized way, they are not "unpacked", even if you lower the max size settings later; the settings are only applied when a hash is written. To convert existing keys back, you have to rewrite them, or make a single write that exceeds one of the limits, which triggers the one-way conversion to the normal encoding.
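If you want to verify the encoding, or force the conversion for a given key, here is a sketch with Jedis (the key name is an example; String.repeat needs Java 11+):

import redis.clients.jedis.Jedis;

public class EncodingDemo {
    public static void main(String[] args) {
        try (Jedis jedis = new Jedis("localhost", 6379)) {
            // prints "ziplist" while the hash is still stored packed
            System.out.println(jedis.objectEncoding("myhash"));

            // one value longer than hash-max-ziplist-value (64 bytes by
            // default) triggers the one-way conversion to a real hash table
            jedis.hset("myhash", "_tmp", "x".repeat(65));
            jedis.hdel("myhash", "_tmp");

            // prints "hashtable"; deleting the field does not convert back
            System.out.println(jedis.objectEncoding("myhash"));
        }
    }
}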