I'm trying to understand if I have a dataset that has some string keys like
data1
data2
and so on
How does sharding work if I have cluster mode enabled? Say I have 6 shards: how does it decide that data1 has to go to shard 1, data2 to shard 2, and so on?
This is a broad question; you can find all the information related to clustering here: Overview of Redis Cluster main components.
I'm leaving the key concepts/summaries here:
Every key is hashed to a numeric hash slot in the range 0 to 16383 (16384 slots in total). Each master node is then assigned a range of hash slots to serve.
Say I have 6 shards, how does it decide data1 has to go to shard 1, data2 to shard 2 and so on?
Ans: When cluster mode is enabled, each cluster node serves a range of hash slots. While the cluster is stable and the slots are spread evenly, each of the 6 shards serves approximately 16384/6 ≈ 2731 hash slots. Keys are not assigned to shards in order; the slot is computed from the key by the following rule. For the key data1:
hash_slot = Hash_Algorithm(data1) % 16384
In general the hash algorithm could be MD5, CRC, etc.; Redis Cluster uses CRC16:
hash_slot = CRC16(data1) % 16384
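To make the formula concrete, here is a minimal Python sketch of the slot calculation. It assumes the CRC16 variant is the XMODEM one (polynomial 0x1021, initial value 0) and it ignores hash tags; the function names `crc16_xmodem` and `key_slot` are made up for illustration and are not part of any Redis client.

```python
def crc16_xmodem(data: bytes) -> int:
    """Bitwise CRC16 with polynomial 0x1021 and initial value 0 (assumed variant)."""
    crc = 0
    for byte in data:
        crc ^= byte << 8
        for _ in range(8):
            if crc & 0x8000:
                crc = ((crc << 1) ^ 0x1021) & 0xFFFF
            else:
                crc = (crc << 1) & 0xFFFF
    return crc


def key_slot(key: str) -> int:
    """Map a key to one of the 16384 hash slots (hash tags are not handled here)."""
    return crc16_xmodem(key.encode("utf-8")) % 16384


if __name__ == "__main__":
    for key in ("data1", "data2"):
        print(key, "->", key_slot(key))
```

The key is served by whichever node owns the slot range that contains the printed number, so consecutive key names such as data1 and data2 do not land on consecutive shards.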
Run this command in redis-cli to see which node serves which slot range: $> cluster slots
Sample Output (new version, includes IDs):
127.0.0.1:30001> cluster slots
1) 1) (integer) 0 // begin slot
   2) (integer) 5460 // end slot
   3) 1) "127.0.0.1"
      2) (integer) 30001
      3) "09dbe9720cda62f7865eabc5fd8857c5d2678366"
   4) 1) "127.0.0.1"
      2) (integer) 30004
      3) "821d8ca00d7ccf931ed3ffc7e3db0599d2271abf"
2) 1) (integer) 5461 // begin slot
   2) (integer) 10922 // end slot
   3) 1) "127.0.0.1"
      2) (integer) 30002
      3) "c9d93d9f2c0c524ff34cc11838c2003d8c29e013"
   4) 1) "127.0.0.1"
      2) (integer) 30005
      3) "faadb3eb99009de4ab72ad6b6ed87634c7ee410f"
.....
.....
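To relate slot ranges like these to the six-shard question, the following Python sketch splits the 16384 slots into near-equal contiguous ranges and looks up which range owns a slot. This is only an illustration: `even_slot_ranges` and `shard_for_slot` are hypothetical helpers, and the boundaries a real cluster ends up with depend on how it was created or resharded, so they may differ from this even split.

```python
from bisect import bisect_right
from typing import List, Tuple

TOTAL_SLOTS = 16384


def even_slot_ranges(shards: int) -> List[Tuple[int, int]]:
    """Split slots 0..16383 into contiguous, near-equal (start, end) ranges."""
    base, extra = divmod(TOTAL_SLOTS, shards)
    ranges, start = [], 0
    for i in range(shards):
        # The first `extra` shards take one additional slot each.
        size = base + (1 if i < extra else 0)
        ranges.append((start, start + size - 1))
        start += size
    return ranges


def shard_for_slot(slot: int, ranges: List[Tuple[int, int]]) -> int:
    """Return the index of the range that contains the slot."""
    starts = [start for start, _ in ranges]
    return bisect_right(starts, slot) - 1


if __name__ == "__main__":
    ranges = even_slot_ranges(6)
    for index, (start, end) in enumerate(ranges):
        print(f"shard {index}: slots {start}-{end}")
```

In a live cluster the client does the same lookup against the ranges reported by cluster slots rather than against a computed split.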
As docs describe:
Keys distribution model
The key space is split into 16384 slots, effectively setting an upper limit for the cluster size of 16384 master nodes (however the suggested max size of nodes is in the order of ~ 1000 nodes).
Each master node in a cluster handles a subset of the 16384 hash slots. The cluster is stable when there is no cluster reconfiguration in progress (i.e. where hash slots are being moved from one node to another). When the cluster is stable, a single hash slot will be served by a single node (however the serving node can have one or more slaves that will replace it in the case of net splits or failures, and that can be used in order to scale read operations where reading stale data is acceptable).
The base algorithm used to map keys to hash slots is the following (read the next paragraph for the hash tag exception to this rule):
HASH_SLOT = CRC16(key) mod 16384
I have a redis cluster with the following configuration :
91d426e9a569b1c1ad84d75580607e3f99658d30 127.0.0.1:7002@17002 myself,master - 0 1596197488000 1 connected 0-5460
9ff311ae9f413b48578ff0519e97fef2ced57b1e 127.0.0.1:7003@17003 master - 0 1596197490000 2 connected 5461-10922
4de4d36b968bd0b5b5dc8023cb00a5a2ab62effc 127.0.0.1:7004@17004 master - 0 1596197492253 3 connected 10923-16383
a32088043c31c5d3f20828bfe06306b9f0717635 127.0.0.1:7005@17005 slave 91d426e9a569b1c1ad84d75580607e3f99658d30 0 1596197490251 1 connected
b5e9ec7851dfd8dc5ab0cf35c230a0e716dd934c 127.0.0.1:7006@17006 slave 9ff311ae9f413b48578ff0519e97fef2ced57b1e 0 1596197489000 2 connected
a34cc74321e1c75e4cf203248bc0883833c928c7 127.0.0.1:7007@17007 slave 4de4d36b968bd0b5b5dc8023cb00a5a2ab62effc 0 1596197492000 3 connected
I want to build a set of all keys in the cluster by listening to key operations with RedisGears and storing the key names in a Redis set called keys.
To do that, I run this RedisGears command:
RG.PYEXECUTE "GearsBuilder('KeysReader').foreach(lambda x: execute('sadd', 'keys', x['key'])).register(readValue=False)"
It works, but only if the updated key is stored on the same node as the key keys.
Example:
With my cluster configuration, the key keys is stored on node 91d426e9a569b1c1ad84d75580607e3f99658d30 (the first node).
If I run:
SET foo bar
SET bar foo
SMEMBERS keys
I have the following result :
127.0.0.1:7002> SET foo bar
-> Redirected to slot [12182] located at 127.0.0.1:7004
OK
127.0.0.1:7004> SET bar foo
-> Redirected to slot [5061] located at 127.0.0.1:7002
OK
127.0.0.1:7002> SMEMBERS keys
1) "bar"
2) "keys"
127.0.0.1:7002>
The first key name foo is not saved in the set keys.
Is it possible to have the names of keys stored on other nodes saved in the keys set with RedisGears?
Redis version : 6.0.6
Redis gears version : 1.0.1
Thanks.
If the key was written to a shard that does not hold the 'keys' key, you need to move the record to the shard that does, using the repartition operation (https://oss.redislabs.com/redisgears/operations.html#repartition), so this should work:
RG.PYEXECUTE "GearsBuilder('KeysReader').repartition(lambda x: 'keys').foreach(lambda x: execute('sadd', 'keys', x['key'])).register(readValue=False)"
The repartition operation will move the record to the correct shard and the 'sadd' will succeed.
Another option is to maintain a set per shard and collect them using another Gear function. To do that you need to use the hashtag function (https://oss.redislabs.com/redisgears/runtime.html#hashtag) to make sure the set created belongs to the current shard. So the following registration will maintain a set per shard:
RG.PYEXECUTE "GearsBuilder('KeysReader').foreach(lambda x: execute('sadd', 'keys{%s}' % hashtag(), x['key'])).register(mode='sync', readValue=False)"
Notice that the sync mode tells RedisGears not to start a distributed execution and it should be much faster to follow the keys this way.
Then to collect all the values:
RG.PYEXECUTE "GB('ShardsIDReader').flatmap(lambda x: execute('smembers', 'keys{%s}' % hashtag())).run()"
The first approach is good for read-intensive use cases and the second for write-intensive ones. Choose the approach that fits your use case.
I'm using Redis on a clustered db (locally). I'm trying the MULTI command, but it seems that it is not working. Individual commands work and I can see how the shard moves.
Is there anything else I should be doing to make MULTI work? The documentation is unclear about whether or not it should work. https://redis.io/topics/cluster-spec
In the example below I just set individual keys (note how the port, and with it the node, changes), then try a MULTI command. The commands execute before EXEC is called:
127.0.0.1:30001> set a 1
-> Redirected to slot [15495] located at 127.0.0.1:30003
OK
127.0.0.1:30003> set b 2
-> Redirected to slot [3300] located at 127.0.0.1:30001
OK
127.0.0.1:30001> MULTI
OK
127.0.0.1:30001> HSET c f val
-> Redirected to slot [7365] located at 127.0.0.1:30002
(integer) 1
127.0.0.1:30002> HSET c f2 val2
(integer) 1
127.0.0.1:30002> EXEC
(error) ERR EXEC without MULTI
127.0.0.1:30002> HGET c f
"val"
127.0.0.1:30002>
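MULTI transactions, as well as any multi-key operations, are supported only within a single hash slot in a clustered Redis deployment. In your example, MULTI was sent to 127.0.0.1:30001, but redis-cli then followed the redirect for key c to 127.0.0.1:30002. No transaction is open on that connection, so each HSET runs immediately and EXEC fails with "ERR EXEC without MULTI".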
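One way to check up front whether a set of keys can take part in the same transaction is to group them by hash slot. The sketch below is an illustration, not a client feature: `key_slot` and `group_by_slot` are hypothetical names, hash tags are ignored for brevity, and it relies on Python's `binascii.crc_hqx` with an initial value of 0 producing the CRC16 variant that Redis Cluster uses.

```python
from binascii import crc_hqx
from collections import defaultdict


def key_slot(key: str) -> int:
    # crc_hqx(data, 0) is CRC-CCITT with polynomial 0x1021 and initial value 0.
    return crc_hqx(key.encode("utf-8"), 0) % 16384


def group_by_slot(keys):
    """Group key names by the hash slot they map to (hash tags not handled)."""
    groups = defaultdict(list)
    for key in keys:
        groups[key_slot(key)].append(key)
    return dict(groups)


if __name__ == "__main__":
    # The keys from the transcript above land in three different slots.
    print(group_by_slot(["a", "b", "c"]))
```

If the result has more than one group, the keys cannot be used together in one MULTI block; only keys that fall into the same group can.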
When I run the info command in redis-cli against a redis 3.2.4 server, it shows me this for expires:
expires=223518
However, when I then run a keys * command and ask for the ttl for each key, and only print out keys with a ttl > 0, I only see a couple hundred.
I thought that the expires is a count of the number of expiring keys but I am not even within an order of magnitude of this number.
Can someone clarify exactly what expires is meant to convey? Does this include both to-be-expired and previously expired but not yet evicted keys?
Update:
Here is how I counted the number of keys expiring:
task count_tmp_keys: :environment do
  redis = Redis.new(timeout: 100)
  keys = redis.keys '*'
  ct_expiring = 0
  keys.each do |k|
    ttl = redis.ttl(k)
    if ttl > 0
      ct_expiring += 1
      puts "Expiring: #{k}; ttl is #{ttl}; total: #{ct_expiring}"
      STDOUT.flush
    end
  end
  puts "Total expiring: #{ct_expiring}"
  puts "Done at #{Time.now}"
end
When I ran this script, it reported a total of 78 expiring keys.
When I run info, it says db0:keys=10237963,expires=224098,avg_ttl=0
Because 224098 is so much larger than 78, I am very confused. Is there perhaps a better way for me to obtain a list of all 225k expiring keys?
Also, how is it that my average ttl is 0? Wouldn't you expect it to be nonzero?
UPDATE
I have new information and a simple, 100% repro of this situation locally!
To repro: set up two Redis processes locally on your laptop. Make one a slave of the other. On the slave process, set the following:
config set slave-serve-stale-data yes
config set slave-read-only no
Now, connect to the slave (not the master) and run:
set foo 1
expire foo 10
After 10 seconds, you will no longer be able to access foo, but info command will still show that you have 1 key expiring with an average ttl of 0.
Can someone explain this behavior?
expires counts the existing keys that have a TTL set and will expire; it does not include keys that have already expired.
Example (extra output from the info command omitted for brevity):
127.0.0.1:6379> flushall
OK
127.0.0.1:6379> SETEX mykey1 1000 "1"
OK
127.0.0.1:6379> SETEX mykey2 1000 "2"
OK
127.0.0.1:6379> SETEX mykey3 1000 "3"
OK
127.0.0.1:6379> info
# Keyspace
db0:keys=3,expires=3,avg_ttl=992766
127.0.0.1:6379> SETEX mykey4 1 "4"
OK
127.0.0.1:6379> SETEX mykey5 1 "5"
OK
127.0.0.1:6379> info
# Keyspace
db0:keys=3,expires=3,avg_ttl=969898
127.0.0.1:6379> keys *
1) "mykey2"
2) "mykey3"
3) "mykey1"
127.0.0.1:6379>
Given that in your situation you are asking about key expiry on slaves, per https://github.com/antirez/redis/issues/2861:
keys on a slave are not actively expired, and thus the avg_ttl is
never calculated
And per https://groups.google.com/forum/#!topic/redis-db/NFTpdmpOPnc:
avg_ttl is never initialized on a slave and thus it can be what ever
arbitrary value resides in memory at that place.
Thus, it is to be expected that the info command behaves differently on slaves.
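For the counting part of the question, a gentler alternative to keys * on a database with ten million keys is to iterate with SCAN. The sketch below assumes a redis-py style client, whose `scan_iter` and `ttl` methods are used here; `count_expiring_keys` is a hypothetical helper name. On a slave the result will not necessarily match expires, for the reasons given above.

```python
def count_expiring_keys(client, batch_size=1000):
    """Count keys that currently report a TTL, iterating with SCAN instead of KEYS."""
    expiring = 0
    for key in client.scan_iter(count=batch_size):
        # TTL returns -1 for a key without an expiry and -2 for a missing key.
        if client.ttl(key) >= 0:
            expiring += 1
    return expiring


# Possible usage, assuming the redis-py package and a reachable server:
#   import redis
#   print(count_expiring_keys(redis.Redis(host="127.0.0.1", port=6379)))
```

Unlike the ttl > 0 check in the question, this also counts keys whose remaining TTL has rounded down to 0 seconds.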
expires is simply the number of keys that have an expiry set, not a time.
From the 3.2.4 source code:
long long keys, vkeys;
keys = dictSize(server.db[j].dict);
vkeys = dictSize(server.db[j].expires);
if (keys || vkeys) {
    info = sdscatprintf(info,
        "db%d:keys=%lld,expires=%lld,avg_ttl=%lld\r\n",
        j, keys, vkeys, server.db[j].avg_ttl);
}
It just reports the size of server.db[j].expires (j is the database index).
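The format string above also makes the keyspace line easy to read back programmatically. A small Python sketch, with `parse_keyspace_line` as a hypothetical helper name:

```python
def parse_keyspace_line(line: str) -> dict:
    """Parse a line such as 'db0:keys=3,expires=3,avg_ttl=992766' into a dict."""
    db, _, fields = line.strip().partition(":")
    stats = {}
    for pair in fields.split(","):
        name, _, value = pair.partition("=")
        stats[name] = int(value)
    stats["db"] = db
    return stats


if __name__ == "__main__":
    print(parse_keyspace_line("db0:keys=10237963,expires=224098,avg_ttl=0"))
```

Here keys and expires are the two dictionary sizes from the C code, so expires can never exceed keys on a healthy instance.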
The function below deletes the keys returned by SMEMBERS; they are not passed in as EVAL arguments. Is this proper in Redis Cluster?
def ClearLock():
    key = 'Server:' + str(localIP) + ':UserLock'
    script = '''
    local keys = redis.call('smembers', KEYS[1])
    local count = 0
    for k,v in pairs(keys) do
        -- the Redis command is DEL; 'delete' is not a Redis command
        redis.call('del', v)
        count = count + 1
    end
    redis.call('del', KEYS[1])
    return count
    '''
    ret = redisObj.eval(script, 1, key)
You're right to be worried about using keys that aren't passed as EVAL arguments.
Redis Cluster doesn't guarantee that those keys are present on the node that's running the Lua script, and some of those delete commands will fail as a result.
One thing you can do is mark all those keys with a common hash tag. This guarantees that, whenever node rebalancing isn't in progress, keys with the same hash tag are present on the same node. See the sections on hash tags in the Redis Cluster spec: http://redis.io/topics/cluster-spec
(While cluster node rebalancing is in progress this script can still fail, so you'll need to figure out how you want to handle that.)
Perhaps add the local ip for all entries in that set as the hash tag. The main key could become:
key = 'Server:{' + str(localIP) + '}:UserLock'
Adding the {} around the ip in the string will have redis read this as the hashtag.
You would also need to add that same hash tag, i.e. the local IP wrapped in {}, to the key of every entry you are later going to delete as part of this operation.
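To check that hash-tagged names really end up in one slot, the following Python sketch applies the hash tag rule (only the text between the first { and the first following } is hashed, provided it is not empty) before computing the slot. The names `hash_tag_part` and `key_slot`, the IP value, and the `User:` member key are all made up for illustration; `binascii.crc_hqx` with an initial value of 0 is assumed to match the CRC16 that Redis Cluster uses.

```python
from binascii import crc_hqx


def hash_tag_part(key: str) -> str:
    """Return the part of the key that is hashed: the hash tag if present, else the whole key."""
    start = key.find("{")
    if start != -1:
        end = key.find("}", start + 1)
        # Only a non-empty {...} counts as a hash tag.
        if end != -1 and end > start + 1:
            return key[start + 1:end]
    return key


def key_slot(key: str) -> int:
    return crc_hqx(hash_tag_part(key).encode("utf-8"), 0) % 16384


if __name__ == "__main__":
    local_ip = "10.0.0.5"  # hypothetical stand-in for localIP
    set_key = "Server:{" + local_ip + "}:UserLock"
    member_key = "User:{" + local_ip + "}:42"  # hypothetical member key name
    print(key_slot(set_key), key_slot(member_key))  # both print the same slot
```

Because every key carrying the same tag maps to the same slot, the Lua script and all the keys it deletes are served by one node while the cluster is stable.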
To add keys to redis I did the following via the redis CLI:
127.0.0.1:6379> KEYS *
1) "key1"
2) "key2"
3) "key3"
127.0.0.1:6379> SET name "rahul"
OK
127.0.0.1:6379> KEYS *
1) "key1"
2) "name"
3) "key2"
4) "key3"
127.0.0.1:6379>
To validate the persistence of the data in my Redis data store, I restarted the server. Upon checking the keys, I found some keys to be missing:
127.0.0.1:6379> KEYS *
1) "key3"
2) "key2"
3) "key1"
127.0.0.1:6379>
Are there any specific naming conventions for Redis keys? I was using a Windows system. Any idea what has gone wrong? TIA.
If you do a graceful shutdown, values are written to disk before the service shuts down. With an abrupt shutdown or a power failure, values written since the last save are lost. To guard against that you can enable persistence (RDB or AOF). By default Redis uses RDB snapshotting, and it takes a snapshot when any of three conditions is met:
1) at least 1 key changed in the last 15 minutes.
2) at least 10 keys changed in the last 5 minutes.
3) at least 10,000 keys changed in the last 1 minute.
You can change these values in redis.conf file under SNAPSHOTTING.
Try reading the redis.conf file fully, it will give you more detailed explanations.
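The three snapshot conditions can be modelled in a few lines of Python, which also shows why a single SET followed shortly by an abrupt restart is lost. This is a simplified model, not Redis code: `SAVE_POINTS` and `snapshot_due` are hypothetical names, the values simply mirror the three conditions listed above, and the exact handling of the time boundary is an assumption.

```python
# (seconds, minimum changed keys) pairs mirroring the three conditions above.
SAVE_POINTS = [(900, 1), (300, 10), (60, 10000)]


def snapshot_due(seconds_since_last_save, changes, save_points=SAVE_POINTS):
    """True if any save point is met: enough time has passed AND enough keys changed."""
    return any(
        seconds_since_last_save >= seconds and changes >= min_changes
        for seconds, min_changes in save_points
    )


if __name__ == "__main__":
    # One key written 30 seconds before a crash: no snapshot has been taken yet.
    print(snapshot_due(30, 1))    # False
    # The same single change is saved once 15 minutes have passed.
    print(snapshot_due(901, 1))   # True
```

With these values, a key added just before an ungraceful stop only survives if one of the save points has already fired or if AOF is enabled.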