Assignment of slots to nodes method - redis

I'm trying to modify a script in which I was using redis-trib. For various reasons I can't use it now, but I can use redis-cli.
The point is: when assigning slots to nodes, is it done in blocks or in some other way? That is, if there are 3 nodes, do I divide the 16384 slots among the 3 nodes and assign the leftover ones to the last node, or do I have to follow some other process?
Thank you very much.
Edit:
I'm asking because I'm looking at some unofficial documentation (I didn't see anything about it in the official one that wasn't using redis-trib), which shows the following:
for slot in {0..5461}; do redis-cli -p 7000 CLUSTER ADDSLOTS $slot > /dev/null; done;
for slot in {5462..10923}; do redis-cli -p 7001 CLUSTER ADDSLOTS $slot > /dev/null; done;
for slot in {10924..16383}; do redis-cli -p 7002 CLUSTER ADDSLOTS $slot > /dev/null; done;
I do not know the criteria on which this distribution is based.

In the end I distribute the 16384 slots using integer division by the number of nodes. For the leftover slots, I compute the remainder (modulo) and spread them as evenly as possible among the nodes.
In any case I try to keep the number of slots per node the same or very similar, and I always assign contiguous slots.
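The approach described above can be sketched roughly as follows. This is only an illustration of the arithmetic, not what redis-trib does internally; the function name slot_ranges is made up for this example, and it hands the leftover slots to the first nodes, one each.

```python
TOTAL_SLOTS = 16384  # fixed number of hash slots in a Redis cluster


def slot_ranges(node_count, total_slots=TOTAL_SLOTS):
    """Split slots 0..total_slots-1 into contiguous, near-equal ranges.

    Hypothetical helper: the first (total_slots % node_count) nodes
    receive one extra slot each.
    """
    if node_count < 1:
        raise ValueError("node_count must be at least 1")
    base, extra = divmod(total_slots, node_count)
    ranges = []
    start = 0
    for index in range(node_count):
        size = base + (1 if index < extra else 0)
        ranges.append((start, start + size - 1))  # inclusive bounds
        start += size
    return ranges


for first, last in slot_ranges(3):
    print(first, last)
```

Each (first, last) pair could then drive one loop of CLUSTER ADDSLOTS calls against the corresponding node.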

Related

Why does the EVALSHA command come at such a performance cost when compared to native commands run on the redis-cli client?

Here are some tests and results I have run against the redis-benchmark tool.
C02YLCE2LVCF:Downloads xxxxxx$ redis-benchmark -p 7000 -q -r 1000000 -n 2000000 JSON.SET fooz . [9999]
JSON.SET fooz . [9999]: 93049.23 requests per second
C02YLCE2LVCF:Downloads xxxxxx$ redis-benchmark -p 7000 -q -r 1000000 -n 2000000 evalsha 8d2d42f1e3a5ce869b50a2b65a8bfaafe8eff57a 1 fooz [5555]
evalsha 8d2d42f1e3a5ce869b50a2b65a8bfaafe8eff57a 1 fooz [5555]: 61132.17 requests per second
C02YLCE2LVCF:Downloads xxxxxx$ redis-benchmark -p 7000 -q -r 1000000 -n 2000000 eval "return redis.call('JSON.SET', KEYS[1], '.', ARGV[1])" 1 fooz [5555]
eval return redis.call('JSON.SET', KEYS[1], '.', ARGV[1]) 1 fooz [5555]: 57423.41 requests per second
That is a significant drop in performance for something that is supposed to have a performance advantage, given that the script runs server side versus the client running the logic client side.
From client to EVALSHA = 34% performance loss
From EVALSHA to EVAL = 6% performance loss
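For reference, the two percentages can be reproduced from the requests-per-second figures above. This is just the arithmetic; throughput_drop is a name made up for this sketch.

```python
def throughput_drop(before_rps, after_rps):
    """Relative loss in requests per second, as a percentage."""
    return (before_rps - after_rps) / before_rps * 100


json_set_rps = 93049.23  # JSON.SET run directly
evalsha_rps = 61132.17   # same write through EVALSHA
eval_rps = 57423.41      # same write through EVAL

print(round(throughput_drop(json_set_rps, evalsha_rps)))
print(round(throughput_drop(evalsha_rps, eval_rps)))
```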
The results are similar for a NON-JSON insert set command
C02YLCE2LVCF:Downloads xxxxxx$ redis-benchmark -p 7000 -q -r 1000000 -n 2000000 set fooz 3333
set fooz 3333: 116414.43 requests per second
C02YLCE2LVCF:Downloads xxxxxxx$ redis-benchmark -p 7000 -q -r 1000000 -n 2000000 evalsha e32aba8d03c97f4418a8593ed4166640651e18da 1 fooz [2222]
evalsha e32aba8d03c97f4418a8593ed4166640651e18da 1 fooz [2222]: 78520.67 requests per second
I first noticed this when I ran INFO commandstats and observed the poorer per-call time for the EVALSHA command
# Commandstats
cmdstat_ping:calls=331,usec=189,usec_per_call=0.57
cmdstat_eval:calls=65,usec=4868,usec_per_call=74.89
cmdstat_del:calls=2,usec=21,usec_per_call=10.50
cmdstat_ttl:calls=78,usec=131,usec_per_call=1.68
cmdstat_psync:calls=51,usec=2515,usec_per_call=49.31
cmdstat_command:calls=5,usec=3976,usec_per_call=795.20
cmdstat_scan:calls=172,usec=1280,usec_per_call=7.44
cmdstat_replconf:calls=185947,usec=217446,usec_per_call=1.17
cmdstat_json.set:calls=1056,usec=26635,usec_per_call=25.22
cmdstat_evalsha:calls=1966,usec=68867,usec_per_call=35.03
cmdstat_expire:calls=1073,usec=1118,usec_per_call=1.04
cmdstat_flushall:calls=9,usec=694,usec_per_call=77.11
cmdstat_monitor:calls=1,usec=1,usec_per_call=1.00
cmdstat_get:calls=17,usec=21,usec_per_call=1.24
cmdstat_cluster:calls=102761,usec=23379827,usec_per_call=227.52
cmdstat_client:calls=100551,usec=122382,usec_per_call=1.22
cmdstat_json.del:calls=247,usec=2487,usec_per_call=10.07
cmdstat_script:calls=207,usec=10834,usec_per_call=52.34
cmdstat_info:calls=4532,usec=229808,usec_per_call=50.71
cmdstat_json.get:calls=1615,usec=11923,usec_per_call=7.38
cmdstat_type:calls=78,usec=115,usec_per_call=1.47
From JSON.SET to EVALSHA there is ~30% performance reduction which is what I observed in the direct testing.
The question is, why? And, is this anything to be concerned with or is this observation within fair expectations?
For context, I am using EVALSHA and not the direct JSON.SET command for 2 reasons:
1. The IORedis client library doesn't have direct support for RedisJson.
2. Because of that, I would have had to use send_command(), which sends the command directly to the server but doesn't work with pipelining while using TypeScript. So I would have had to issue every other command separately and forgo pipelining.
I thought this was supposed to give better performance?
Update:
So in the end, based on the answer below, I refactored my code to use a single EVALSHA only for the write, because the write needs 2 commands: a set and an expire. Again, I can't do this as a single RedisJson command, which is the reason why.
Here is the code for anyone's reference. It shows the EVALSHA call and the EVAL fallback:
let evalSHAFail = false;
// Try the cached script first; record the failure so we can fall back to EVAL.
await this.client.evalsha(this.luaWriteCommand, '1', documentChange.id, JSON.stringify(documentChange), expirationSeconds)
  .catch((error) => {
    console.error(error);
    evalSHAFail = true;
  });
if (evalSHAFail) {
  console.error('EVALSHA for write not processed, using EVAL');
  // Send the full script body instead of its SHA1 digest.
  await this.client.eval("return redis.pcall('JSON.SET', KEYS[1], '.', ARGV[1]), redis.pcall('expire', KEYS[1], ARGV[2]);", '1', documentChange.id, JSON.stringify(documentChange), expirationSeconds);
  console.log('SRANS FRUNDER');
  this.luaWriteCommand = undefined;
}
Why is the Lua script slower in your case?
Because EVALSHA needs to do more work than a single JSON.SET or SET command. When running EVALSHA, Redis needs to push the arguments onto the Lua stack, run the Lua script, and pop the return values off the Lua stack. That is bound to be slower than a direct C function call for JSON.SET or SET.
So when does a server-side script have a performance advantage?
First of all, you must run more than one command in the script; otherwise there won't be any performance advantage, as I mentioned above.
Secondly, a server-side script runs faster than sending several commands to Redis one by one, getting the results from Redis, and doing the computation on the client side, because the Lua script saves a lot of round-trip time.
Thirdly, if you need to do really complex computation in a Lua script, it might not be a good idea. Redis runs the script in a single thread, so if the script takes too much time, it will block other clients. On the client side, by contrast, you can take advantage of multiple cores to do the complex computation.
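To make the round-trip argument concrete, here is a rough cost model. All numbers are assumptions chosen for illustration: a 0.5 ms network round trip, about 25 µs per command and about 10 µs of extra Lua overhead (loosely inspired by the usec_per_call figures in the question). Both function names are made up.

```python
def sequential_ms(commands, rtt_ms, exec_ms):
    """Commands sent one by one: each pays a round trip plus execution."""
    return commands * (rtt_ms + exec_ms)


def scripted_ms(commands, rtt_ms, exec_ms, script_overhead_ms):
    """Commands wrapped in one script: a single round trip plus Lua overhead."""
    return rtt_ms + script_overhead_ms + commands * exec_ms


RTT_MS = 0.5          # assumed network round trip
EXEC_MS = 0.025       # assumed per-command execution time
OVERHEAD_MS = 0.010   # assumed cost of entering and leaving Lua

for count in (1, 2, 5):
    print(count,
          sequential_ms(count, RTT_MS, EXEC_MS),
          scripted_ms(count, RTT_MS, EXEC_MS, OVERHEAD_MS))
```

Under these assumptions the script loses slightly for a single command and wins as soon as it replaces two or more round trips.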

How to start certain number of nodes in Redis cluster

To create and start a cluster in Redis, I use the create-cluster.sh file inside
/redis-3.04/utils/create-cluster
With it I can create as many nodes as I want by changing the settings:
PORT=30000
TIMEOUT=2000
NODES=10
REPLICAS=1
I wonder if I can create, for example, 10 nodes (5 masters with 5 slaves) at the beginning but start only 4 masters and 4 slaves (meet and join).
Thanks in advance.
Yes. You can add more nodes later if the load on your existing cluster increases.
The basic steps are:
1. Start the new Redis instances. Let's say you want to add 2 more masters and their slaves (4 Redis instances in total).
2. Then, using the redis-trib utility, do the following:
redis-trib.rb add-node <new master node:port> <any existing master>
e.g. ./redis-trib.rb add-node 192.168.1.16:7000 192.168.1.15:7000
3. After this, the new node will be assigned an ID. Note that ID and run the following command to add a slave to the node added in the previous step:
./redis-trib.rb add-node --slave --master-id <master-node-id> <new-node> <master-node>
./redis-trib.rb add-node --slave --master-id 6f9db976c3792e06f9cd252aec7cf262037bea4a 192.168.1.17:7000 192.168.1.16:7000
where 6f9db976c3792e06f9cd252aec7cf262037bea4a is the ID of 192.168.1.16:7000.
4. Using similar steps you can add 1 more master-slave pair.
5. Since these nodes do not contain any slots to serve, you have to move some of the slots from the existing masters to the new masters (resharding).
6. To do that, run the following resharding steps:
6.1 ./redis-trib.rb reshard <any-master-ip>:<master-port>
6.2 It will ask: How many slots do you want to move (from 1 to 16384)? Enter the number of slots you want to move.
6.3 Then it will ask: What is the receiving node ID?
6.4 Enter the ID of the node the slots should be moved to (one of the new masters).
6.5 It will prompt:
Please enter all the source node IDs.
Type 'all' to use all the nodes as source nodes for the hash slots.
Type 'done' once you entered all the source nodes IDs.
Source node #1: (enter a source node ID or all)
6.6 Then it will print the proposed plan, one line per slot, like:
Moving slot 10960 from 37d10f18f349a6a5682c791bff90e0188ae35e49
Moving slot 10961 from 37d10f18f349a6a5682c791bff90e0188ae35e49
Moving slot 10962 from 37d10f18f349a6a5682c791bff90e0188ae35e49
6.7 It will ask: Do you want to proceed with the proposed reshard plan (yes/no)? Type yes, press Enter, and you are done.
Note: If the data set is large, resharding might take some time.
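For the "How many slots do you want to move" prompt in step 6.2, a reasonable starting point is an even share of the 16384 slots. A small sketch of that arithmetic follows; the function name is made up, and it assumes you reshard once per new master using 'all' as the source.

```python
TOTAL_SLOTS = 16384  # fixed number of hash slots in a Redis cluster


def slots_for_new_master(masters_after_adding):
    """Even share of slots for one master, rounded down.

    masters_after_adding counts every master that will hold slots once
    the new node has received its share (hypothetical helper).
    """
    return TOTAL_SLOTS // masters_after_adding


# Going from 4 masters to 5, as in the question:
print(slots_for_new_master(5))
```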
A few useful commands:
To list all nodes in the cluster together with their node IDs:
redis-cli -h node-ip -p node-port cluster nodes
e.g. redis-cli -h 127.0.0.1 -p 7000 cluster nodes
To list all slots in the cluster:
redis-cli -h 127.0.0.1 -p 7000 cluster slots
Ref: https://redis.io/commands/cluster-nodes
Hope this helps.

Cannot add values to Redis cluster - The cluster is down

I have 4 nodes: 3 are masters and 1 is a slave. I am trying to add a simple string with set foo bar, but whenever I do it, I get this error:
(error) CLUSTERDOWN The cluster is down
Below is my cluster info
127.0.0.1:7000> cluster info
cluster_state:fail
cluster_slots_assigned:11
cluster_slots_ok:11
cluster_slots_pfail:0
cluster_slots_fail:0
cluster_known_nodes:4
cluster_size:3
cluster_current_epoch:3
cluster_my_epoch:3
cluster_stats_messages_sent:9262
cluster_stats_messages_received:9160
I am using Redis-x64-3.0.503. Please let me know how to solve this.
Cluster Nodes:
87982f22cf8fb12c1247a74a2c26cdd1b84a3b88 192.168.2.32:7000 slave bc1c56ef4598fb4ef9d26c804c5fabd462415f71 1492000375488 1492000374508 3 connected
9527ba919a8dcfaeb33d25ef805e0f679276dc8d 192.168.2.35:7000 master - 1492000375488 1492000374508 2 connected 16380
ff999fd6cbe0e843d1a49f7bbac1cb759c3a2d47 192.168.2.33:7000 master - 1492000375488 1492000374508 0 connected 16381
bc1c56ef4598fb4ef9d26c804c5fabd462415f71 127.0.0.1:7000 myself,master - 0 0 3 connected 1-8 16383
Just to add to and simplify what @neuront said.
Redis stores data in hash slots. To understand this you need to know how hash maps or hash tables work. For our purposes, Redis has a fixed total of 16384 slots to assign and distribute across all the master servers.
Now if we look at the node configuration you posted and compare it with the Redis documentation, you'll see that the numbers at the end of each line are the slots assigned to the masters.
In your case this is what it looks like:
... slave ... connected
... master ... connected 16380
... master ... connected 16381
... master ... connected 1-8 16383
So all the machines are connected and form the cluster, but not all the hash slots are assigned to store data. It should have been something like this:
... slave ... connected
... master ... connected 0-5460
... master ... connected 5461-10922
... master ... connected 10923-16383
You see, now the whole range of hash slots (0-16383) is assigned, as the documentation says:
slot: A hash slot number or range. Starting from argument number 9, but there may be up to 16384 entries in total (limit never reached). This is the list of hash slots served by this node. If the entry is just a number, is parsed as such. If it is a range, it is in the form start-end, and means that the node is responsible for all the hash slots from start to end including the start and end values.
Specifically, in your case, when you store data under the key foo, the key must have hashed to a slot that is not assigned to any node in the cluster.
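To see which slot a key lands in, you can reproduce the mapping that Redis Cluster documents: CRC16 (XMODEM variant) of the key, modulo 16384, with only the part between the first { and } hashed when a non-empty hash tag is present. The following is an illustrative re-implementation, not code taken from Redis; the function names are made up, and CLUSTER KEYSLOT on a live server remains the authoritative answer.

```python
def crc16_xmodem(data: bytes) -> int:
    """Bitwise CRC16 with polynomial 0x1021 and initial value 0."""
    crc = 0
    for byte in data:
        crc ^= byte << 8
        for _ in range(8):
            if crc & 0x8000:
                crc = ((crc << 1) ^ 0x1021) & 0xFFFF
            else:
                crc = (crc << 1) & 0xFFFF
    return crc


def key_slot(key: str) -> int:
    """Hash slot for a key, honouring a non-empty {hash tag} if present."""
    data = key.encode()
    start = data.find(b"{")
    if start != -1:
        end = data.find(b"}", start + 1)
        if end != -1 and end != start + 1:
            data = data[start + 1:end]
    return crc16_xmodem(data) % 16384


# Slots assigned in the posted cluster: 1-8, 16380, 16381 and 16383.
assigned = set(range(1, 9)) | {16380, 16381, 16383}
print(key_slot("foo"), key_slot("foo") in assigned)
```

The key foo maps to slot 12182, which none of the posted nodes serves.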
Since you're on Windows, you'll have to set up the distribution manually. For that you'll have to do something like this (this is for Linux; translate it to the Windows command line):
for slot in {0..5400}; do redis-cli -h master1 -p 6379 CLUSTER ADDSLOTS $slot; done;
taken from this article
Hope it helped.
Only 11 slots were assigned so your cluster is down, just like the message tells you. The slots are 16380 at 192.168.2.35:7000, 16381 at 192.168.2.33:7000 and 1-8 16383 at 127.0.0.1:7000.
Of course the direct reason is that you need to assign all 16384 slots (0-16383) to the cluster, but I think it was caused by a configuration mistake.
You have a node with the localhost address 127.0.0.1:7000. However, from its own point of view 192.168.2.33:7000 is also 127.0.0.1:7000, and so is 192.168.2.35:7000. With this localhost address a node cannot tell itself apart from another node, and I think this is what causes the chaos.
I suggest you reset all the nodes with the cluster reset command and re-create the cluster, making sure you use their 192.168.*.* addresses this time.
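The count of 11 assigned slots can be double-checked by parsing the CLUSTER NODES output from the question. This is only a sketch: parse_slots is a made-up name, and it assumes the line layout shown above, where slot entries start at the ninth field.

```python
NODES_OUTPUT = """\
87982f22cf8fb12c1247a74a2c26cdd1b84a3b88 192.168.2.32:7000 slave bc1c56ef4598fb4ef9d26c804c5fabd462415f71 1492000375488 1492000374508 3 connected
9527ba919a8dcfaeb33d25ef805e0f679276dc8d 192.168.2.35:7000 master - 1492000375488 1492000374508 2 connected 16380
ff999fd6cbe0e843d1a49f7bbac1cb759c3a2d47 192.168.2.33:7000 master - 1492000375488 1492000374508 0 connected 16381
bc1c56ef4598fb4ef9d26c804c5fabd462415f71 127.0.0.1:7000 myself,master - 0 0 3 connected 1-8 16383
"""


def parse_slots(cluster_nodes_output):
    """Map each node address to the set of slots listed on its line."""
    assigned = {}
    for line in cluster_nodes_output.strip().splitlines():
        fields = line.split()
        slots = set()
        for token in fields[8:]:
            if token.startswith("["):  # skip migrating/importing markers
                continue
            start, _, end = token.partition("-")
            slots.update(range(int(start), int(end or start) + 1))
        assigned[fields[1]] = slots
    return assigned


per_node = parse_slots(NODES_OUTPUT)
covered = set().union(*per_node.values())
missing = set(range(16384)) - covered
print(len(covered), len(missing))
```

A healthy cluster would report 16384 covered slots and none missing.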
@user1829319 Here are the Windows equivalents for adding slots:
for /l %s in (0, 1, 8191) do redis-cli -h 127.0.0.1 -p 6379 CLUSTER ADDSLOTS %s
for /l %s in (8192, 1, 16383) do redis-cli -h 127.0.0.1 -p 6380 CLUSTER ADDSLOTS %s
You should recreate your cluster by running FLUSHALL and CLUSTER RESET, and on the next cluster setup make sure you verify with cluster slots that all slots have been assigned to masters.

Redis benchmarking for HMSET, HGETALL with a data size

Can someone let me know how I can use redis-benchmark to benchmark HMSET and HGETALL with a fixed data size (the -d option in redis-benchmark)? I am using Redis 3.2.5.
I have gone through this answer and tried the command below:
root@cache-server1:~# redis-benchmark -h a.b.c.d -p XXXX hmset hgetall myhash rand_int rand_string -d 2048
====== hmset hgetall myhash rand_int rand_string -d 2048 ======
10000 requests completed in 0.11 seconds
50 parallel clients
3 bytes payload
keep alive: 1
99.64% <= 1 milliseconds
100.00% <= 1 milliseconds
89285.71 requests per second
But looking at the output it seems it is using only 3 bytes payload.
If it is not possible via redis-benchmark can someone suggest some other alternative?
The payload is only 3 bytes (the default) because the -d is taken as part of the command. The command must be the last argument, and all switches must precede it.
Besides that, you can't use redis-benchmark to run two custom commands. Also, the -d option is only applicable to predefined tests (the ones that run by default or with the -t option) and has no meaning if the user specifies the command used in the benchmark.
If you have a specific benchmarking flow that you want to test, the best thing you can do is mock it with any client that you're comfortable with.
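As a rough idea of what such a mock could look like, here is a single-threaded timing loop with a fixed value size. It is a sketch under assumptions: client is any object exposing hmset(name, mapping) and hgetall(name) the way redis-py clients do (redis-py marks hmset as deprecated in favour of hset), and benchmark_hash plus its parameters are made up. With a single connection and no pipelining, the figures are not comparable to redis-benchmark's 50 parallel clients.

```python
import time


def benchmark_hash(client, requests=10000, fields=4, value_size=2048):
    """Time HMSET and HGETALL on one hash whose values have a fixed size."""
    mapping = {f"field{i}": "x" * value_size for i in range(fields)}
    results = {}

    start = time.perf_counter()
    for _ in range(requests):
        client.hmset("myhash", mapping)
    elapsed = max(time.perf_counter() - start, 1e-9)  # avoid division by zero
    results["hmset"] = requests / elapsed

    start = time.perf_counter()
    for _ in range(requests):
        client.hgetall("myhash")
    elapsed = max(time.perf_counter() - start, 1e-9)
    results["hgetall"] = requests / elapsed

    return results  # requests per second for each command
```

With redis-py installed, something like benchmark_hash(redis.Redis(host="a.b.c.d", port=XXXX)) would run it against a real server.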

In Lettuce(4.x) for Redis how to reduce round trips and use output of one command as input for another command, especially for Georadius

I have seen this: "pass results to another command in redis",
and from the command line this command works well:
src/redis-cli keys '*' | xargs src/redis-cli mget
However, how can we achieve the same effect via Lettuce? (I started trying out 4.0.2.Final.)
A solution to this is particularly important in the following scenario:
Say we are using the geolocation capabilities, and we add a set of locations to "my-location-category"
using GEOADD:
GEOADD "category-1" 8.6638775 49.5282537 "location-id:1" 8.3796281 48.9978127 "location-id:2" 8.665351 49.553302 "location-id:3"
Next, say we do a GEORADIUS to get the locations within a 10 km radius of 8.6582361 49.5285495 for "category-1".
We then get "location-id:1" and "location-id:3".
Given that I have already set values for the keys "location-id:1" and "location-id:3",
I want to pipe commands so that the GEORADIUS and an MGET on all the matching results are done together.
Does Redis provide a feature to do that?
And/or how can we achieve this via the Lettuce client library, without first manually iterating through the results of GEORADIUS and then doing a manual MGET?
That would be more efficient for the program that uses it.
Does anyone know how we can do this?
Update
This is the piped command for the scenario I discussed above :
src/redis-cli GEORADIUS "category-1" 8.6582361 49.5285495 10 km | xargs src/redis-cli mget
Now we need to know how to do this via Lettuce
IMPORTANT: never use KEYS, always use SCAN instead if you must.
This isn't really a question about Lettuce nor Java so I can actually answer it :)
What you're trying to do is use the results from a read operation (GEORADIUS) as input (key names) for another read operation (MGET). This type of flow can't be pipelined, precisely because of that: pipelining means that you don't need the answers to operations right away, but in your case you do.
However.
Since you're reading String keys with MGET, you might as well just denormalize everything (remember, we're NoSQL) and store the contents of these keys in the Sorted Set's members, e.g.:
GEOADD "category-1" 8.6638775 49.5282537 "location-id:1:moredata:evenmoredata:{maybe a JSON document here}:orperhapsmsgpack"
This will allow you to get the locations and their "data" with one GEORADIUS call. Of course, any updates to location:1's data will need to be done across all categories.
A note about Lua scripts: while a Lua script could definitely save on the back and forth in this case, any such script will be against best practices/not cluster safe.
After digging around and studying Lua script, my conclusion is that removing round-trips in such a way can only be done via Lua scripts as suggested by Itamar Haber.
I ended up creating a lua script file (myscript.lua) as below
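-- Fetch the members of 'category-1' within 10 km of the given point.
local locationKeys = redis.call('GEORADIUS', 'category-1', '8.6582361', '49.5285495', '10', 'km')
-- MGET needs at least one key, so return early when nothing matched.
if #locationKeys == 0 then
  return nil
else
  return redis.call('MGET', unpack(locationKeys))
end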
Note: of course we should be passing parameters to this; it is just a PoC :)
Now you can execute it with this command:
src/redis-cli EVAL "$(cat myscript.lua)" 0
Then, to reduce the network overhead of sending the entire script to Redis for every execution, we have the option of registering the script with Redis.
Redis will give us a SHA1 digest for that script, which can be used to refer to it in subsequent calls.
This can be done as below:
src/redis-cli SCRIPT LOAD "$(cat myscript.lua)"
This should give back a SHA1 digest, something like this: 49730aa2ed3034ee48f818e486b2bdf1b500b19e
Subsequent calls can be made using this digest,
e.g.
src/redis-cli evalsha 49730aa2ed3034ee48f818e486b2bdf1b500b19e 0
The sad part, however, is that Redis remembers the script under its SHA1 digest only as long as the instance is running. If it is restarted, the cached script is lost, and you have to run SCRIPT LOAD once again. If nothing changes in the script, the SHA1 digest will be the same.
Ideally, when going through a client API, we should first attempt EVALSHA; if that returns a "No matching script" error, then as a fallback do a SCRIPT LOAD, obtain the SHA1 digest once again, keep it in an internal map, and use it for further calls.
This can well be done via Lettuce; I could find the methods for it. Hope this gives a good insight into a solution for the problem.
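The try-EVALSHA-then-reload flow described above could look roughly like this. It is a language-neutral sketch written in Python rather than Lettuce code: CachedScript is a made-up class, and client is assumed to expose evalsha(sha, numkeys, *keys_and_args) and script_load(script) the way redis-py does. The missing-script check is done on the error text here; with a real client library you would catch its dedicated exception type instead.

```python
import hashlib


class CachedScript:
    """Run a Lua script by digest, reloading it when the server forgot it."""

    def __init__(self, client, source):
        self.client = client
        self.source = source
        # Redis identifies a script by the SHA1 hex digest of its source.
        self.sha = hashlib.sha1(source.encode()).hexdigest()

    def run(self, keys=(), args=()):
        params = [*keys, *args]
        try:
            return self.client.evalsha(self.sha, len(keys), *params)
        except Exception as error:
            message = str(error)
            if "NOSCRIPT" not in message and "No matching script" not in message:
                raise  # unrelated failure, do not hide it
            # Script cache was empty (e.g. after a restart): load and retry once.
            self.sha = self.client.script_load(self.source)
            return self.client.evalsha(self.sha, len(keys), *params)
```

The first call after a server restart then costs one extra SCRIPT LOAD, and every later call goes straight through EVALSHA.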