Benchmarking Redis shows low performance when the connection count rises to merely 5000 - redis

Environment:
Redis on a single machine (standalone mode) with 512GB mem and 128 cores.
Benchmark procedure:
Run redis-benchmark -h xx -p xx -c 5000 -n 1000000 -t set,get; the result looks like:
Run redis-benchmark -h xx -p xx -c 1700 -n 1000000 -t set,get three times on the same server (splitting the 5000 connections across 3 processes); the averaged result is roughly:
Run redis-benchmark -h xx -p xx -c 1700 -n 1000000 -t set,get only once; the result is:
I've tried adding the -P pipeline option, and it makes no big difference compared with the results above. I'm wondering why performance suffers when 5000 connections are configured in a single redis-benchmark process. And how can I benchmark the real capability of the current Redis instance? Thanks!
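For reference, the three-process split above can be driven roughly like this (a minimal sketch; host, port, and log paths are placeholders):
#!/bin/bash
# Split the 5000 connections across 3 concurrent redis-benchmark processes.
HOST=xx
PORT=6379
for i in 1 2 3; do
    redis-benchmark -h "$HOST" -p "$PORT" -c 1700 -n 1000000 -t set,get \
        > "bench_$i.log" &
done
wait    # then sum the requests-per-second figures from bench_*.log
If the redis-benchmark binary is from Redis 6.0 or later, its --threads option is another way to generate more load from a single process.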

Related

iperf: Meaning of columns in UDP measurement

I call iperf automatically in a python script for the server and the client and save the output in a csv file:
server.cmd('iperf -s -p 5202 -u -t 50 -y C > result/iperf_server_output.csv')
client.cmd('iperf -c 10.1.1.2 -p 5202 -i 1 -u -b 100m -t 20 -y C > result/iperf_client_output.csv')
result/iperf_server_output.csv stays empty, and result/iperf_client_output.csv looks like this:
20220921142402,10.1.1.1,55922,10.1.1.2,5202,1,0.0-1.0,12502350,100018800,0.000,0,8505,0.000,0
20220921142403,10.1.1.1,55922,10.1.1.2,5202,1,1.0-2.0,12499410,99995280,0.000,0,8503,0.000,0
20220921142404,10.1.1.1,55922,10.1.1.2,5202,1,2.0-3.0,12499410,99995280,0.000,0,8503,0.000,0
20220921142405,10.1.1.1,55922,10.1.1.2,5202,1,3.0-4.0,12500880,100007040,0.000,0,8504,0.000,0
20220921142406,10.1.1.1,55922,10.1.1.2,5202,1,4.0-5.0,12499410,99995280,0.000,0,8503,0.000,0
20220921142407,10.1.1.1,55922,10.1.1.2,5202,1,5.0-6.0,12499410,99995280,0.000,0,8503,0.000,0
20220921142408,10.1.1.1,55922,10.1.1.2,5202,1,6.0-7.0,12502350,100018800,0.000,0,8505,0.000,0
20220921142409,10.1.1.1,55922,10.1.1.2,5202,1,7.0-8.0,12497940,99983520,0.000,0,8502,0.000,0
20220921142410,10.1.1.1,55922,10.1.1.2,5202,1,8.0-9.0,12500880,100007040,0.000,0,8504,0.000,0
20220921142411,10.1.1.1,55922,10.1.1.2,5202,1,9.0-10.0,12500880,100007040,0.000,0,8504,0.000,0
20220921142412,10.1.1.1,55922,10.1.1.2,5202,1,10.0-11.0,12499410,99995280,0.000,0,8503,0.000,0
20220921142413,10.1.1.1,55922,10.1.1.2,5202,1,11.0-12.0,12499410,99995280,0.000,0,8503,0.000,0
20220921142414,10.1.1.1,55922,10.1.1.2,5202,1,12.0-13.0,12499410,99995280,0.000,0,8503,0.000,0
20220921142415,10.1.1.1,55922,10.1.1.2,5202,1,13.0-14.0,12499410,99995280,0.000,0,8503,0.000,0
20220921142416,10.1.1.1,55922,10.1.1.2,5202,1,14.0-15.0,12500880,100007040,0.000,0,8504,0.000,0
20220921142417,10.1.1.1,55922,10.1.1.2,5202,1,15.0-16.0,12500880,100007040,0.000,0,8504,0.000,0
20220921142418,10.1.1.1,55922,10.1.1.2,5202,1,16.0-17.0,12499410,99995280,0.000,0,8503,0.000,0
20220921142419,10.1.1.1,55922,10.1.1.2,5202,1,17.0-18.0,12499410,99995280,0.000,0,8503,0.000,0
20220921142420,10.1.1.1,55922,10.1.1.2,5202,1,18.0-19.0,12500880,100007040,0.000,0,8504,0.000,0
20220921142421,10.1.1.1,55922,10.1.1.2,5202,1,19.0-20.0,12500880,100007040,0.000,0,8504,0.000,0
20220921142421,10.1.1.1,55922,10.1.1.2,5202,1,0.0-20.0,250005840,100001255,0.000,0,-1,-0.000,0
Now I want to plot the bitrate with pandas, but I don't know what each column means.
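As far as I know, iperf 2's -y C report columns for a UDP client are: timestamp, source IP, source port, destination IP, destination port, transfer ID, interval, transferred bytes, bits per second, jitter (ms), lost datagrams, total datagrams, percent lost, and out-of-order datagrams; treat these names as an assumption and verify them against your iperf version. A small sketch that prepends such a header so pandas picks up the column names:
# Hypothetical sketch: prepend assumed -y C column names as a CSV header.
echo 'timestamp,src_ip,src_port,dst_ip,dst_port,id,interval,transferred_bytes,bits_per_second,jitter_ms,lost_datagrams,total_datagrams,percent_lost,out_of_order' \
    | cat - result/iperf_client_output.csv > result/iperf_client_with_header.csv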

GNU Parallel -q option causing BCP "unknown option" errors (different string quoting on local vs remote hosts)

I am seeing very strange behavior when using GNU Parallel to distribute export jobs that use bcp from mssql-tools. It appears that when using the -q option for parallel, strings are interpreted differently on the local host than on remote hosts.
When run as a plain loop over the files on the local host, the bcp processes throw no errors.
However, when the file exports are distributed with parallel, the bcp processes executing on the local host throw
/opt/mssql-tools/bin/bcp: unknown option
errors, while those executing on remote hosts (via a --sshloginfile param) finish successfully. The basic code being run looks like...
# setting some vars to pass
TO_SERVER_ODBCDSN="-D -S MyMSSQLServer"
TO_SERVER_IP="-S 172.18.54.22"
DB="$dest_db" #TODO: enforce being more careful with this value
TABLE="$tablename" # MUST exist beforehand, case matters
USER=$(tail -n+1 $source_home/mssql-creds.txt | head -1)
PASSWORD=$(tail -n+2 $source_home/mssql-creds.txt | head -1)
DATAFILES="/some/path/to/files/"
TARGET_GLOB="*.tsv"
RECOMMEDED_IMPORT_MODE='-c' # makes a HUGE difference, see https://stackoverflow.com/a/16310219/8236733
DELIMITER="\\\t" # (currently not used) DO NOT use format like "'\t'", nested quotes seem to cause hard-to-catch error, want "\t" literal
....
bcpexport() {
filename=$1
TO_SERVER_ODBCDSN=$2
DB=$3
TABLE=$4 # MUST exist beforehand, case matters
USER=$5
PASSWORD=$6
RECOMMEDED_IMPORT_MODE=$7 # makes a HUGE difference, see https://stackoverflow.com/a/16310219/8236733
DELIMITER=$8 # not currently used
WORKDIR=$9
LOGDIR=${10}
....
/opt/mssql-tools/bin/bcp "$TABLE" in "$localfile" \
$TO_SERVER_ODBCDSN \
-U $USER -P $PASSWORD \
-d $DB \
$RECOMMEDED_IMPORT_MODE
-t "\t" \
-e ${localfile}.bcperror.log
}
export -f bcpexport
parallelization_pernode=5
parallel -q -j $parallelization_pernode \
--sshloginfile $source_home/parallel-nodes.txt \
--env bcpexport \
bcpexport {} "$TO_SERVER_ODBCDSN" $DB $TABLE $USER $PASSWORD $RECOMMEDED_IMPORT_MODE $DELIMITER $workingdir $logdir \
::: $DATAFILES/$TARGET_GLOB #from hdfs nfs gateway
Looking at the bash interpretation of the processes (by running ps -aux | grep bcp on the hosts that parallel is given in the --sshloginfile) for the remote hosts, we see...
/bin/bash -c bcpexport() { ... /opt/mssql-tools/bin/bcp "$TABLE" in "$localfile" $TO_SERVER_ODBCDSN -U $USER -P $PASSWORD -d $DB $RECOMMEDED_IMPORT_MODE; -t "\t" -e ${localfile}.bcperror.log; ...
For the local host, the bash interpretation is...
/bin/bash -c bcpexport() { ... /opt/mssql-tools/bin/bcp "$TABLE" in "$localfile" $TO_SERVER_ODBCDSN -U $USER -P $PASSWORD -d $DB $RECOMMEDED_IMPORT_MODE; -t "\t" -e ${localfile}.bcperror.log; ...
that is, they look the same.
My current thought is that the "\t" in the bcp command is being interpreted in a problematic way. Debugging parallel without vs. with the -q option, we see...
$ parallel -j 5 --sshloginfile ./parallel-nodes.txt echo "Number {}: Running on \`hostname\`: \t" ::: 1 2 3 4 5
Number 4: Running on HW04.ucera.local: t
Number 1: Running on HW04.ucera.local: t
Number 2: Running on HW03.ucera.local: t
Number 5: Running on HW03.ucera.local: t
Number 3: Running on HW02.ucera.local: t
$ parallel -q -j 5 --sshloginfile ./parallel-nodes.txt echo "Number {}: Running on \`hostname\`: \t" ::: 1 2 3 4 5
Number 1: Running on `hostname`:
Number 4: Running on `hostname`:
Number 3: Running on `hostname`: \t
Number 2: Running on `hostname`: \t
Number 5: Running on `hostname`: \t
The bcp command needs the literal "\t", not "t" (and I suspect several other similar string corruptions; I do believe \t is the default field terminator for bcp anyway, but this is just an example, and I want to keep \t for code clarity). I am not sure how to get this for both local and remote nodes, or even why this behavior differs between remote and local.
Basically, I need the strings to be exactly the same on both local and remote hosts, even if the strings contain spaces or escape characters (note: I think this used not to be the case; I have older scripts on other machines that don't have this problem).
Not sure if this counts more as a parallel problem or a bcp problem (currently thinking something is going wrong with the -q option in parallel, but not sure). Does anyone have any debugging suggestions or fixes? Ideas of what could be happening?
Firstly, the reason why hostname is not expanded is -q: it quotes the backtick so that it is not expanded.
Secondly, I think what you see is the difference in behaviour between the built-in echo and /bin/echo. The built-in echo depends on the shell. Here I compare echo \\\\t in different shells:
$ parallel --onall --tag -S sh#lo,bash#lo,csh#lo,tcsh#lo,ksh#lo,zsh#lo echo \\\\t ::: a
bash#lo \t a
tcsh#lo a
sh#lo a
ksh#lo \t a
zsh#lo a
csh#lo \t a
That does not, however, get you closer to a solution. If I were you, I would use env_parallel to copy the environment variables, and if the login shell on the remote systems is not the same as your shell, set PARALLEL_SHELL to force using that shell.
So:
#!/bin/bash
env_parallel --session
# setting some vars to pass
TO_SERVER_ODBCDSN="-D -S MyMSSQLServer"
:
:
PARALLEL_SHELL=bash env_parallel -q -j $parallelization_pernode ...
(there is no need to use either --env or 'export -f' when using 'env_parallel --session')
# Cleanup (not needed if this is the last line in the script)
env_parallel --end-session
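To make that skeleton concrete, here is a minimal self-contained sketch of the pattern (the demo function, the test arguments, and parallel-nodes.txt are placeholders; it assumes env_parallel.bash has been sourced, e.g. with . "$(which env_parallel.bash)" in ~/.bashrc):
#!/bin/bash
env_parallel --session                      # only vars/functions defined below get copied
TAB='\t'                                    # an escape-laden string to pass through intact
demo() { printf 'arg=<%s> on %s\n' "$1" "$(hostname)"; }
PARALLEL_SHELL=bash env_parallel -q --sshloginfile parallel-nodes.txt \
    demo "$TAB" ::: run1 run2
env_parallel --end-session
With -q and a forced bash on every node, the \t should arrive as the same literal two characters on both local and remote hosts.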

REDIS //Benchmark tool// Keys in SET using -r <keyspacelen> do not match in GET

I am testing Redis (version: 0.8.8.384) locally, using the benchmark tool and the redis-server.exe that is included in the zip package.
I used the following command to test the keyspace_length:
redis-benchmark -t set,get -n 4 -c 1 -d 888 -r 1000
I have managed to capture a trace (.pcap) locally using RawCap.exe.
What I have noticed is that the keys sent in the SET commands do not match the keys in the GET commands. I would have expected the used keys to be stored somewhere locally and then reused by the GET commands to interrogate the value of each random key.
Am I missing something?
Thanks in advance!
It seems this behavior is expected, since you can run redis-benchmark for the GET command only:
redis-benchmark -t get -n 4 -c 1 -d 888 -r 1000
====== GET ======
4 requests completed in 0.00 seconds
1 parallel clients
888 bytes payload
keep alive: 1
100.00% <= 0 milliseconds
4000.00 requests per second
So each command specified in -t is tested independently.
Edit
You can pass a Lua script to test a SET/GET on the same key. Some thoughts after some post-lunch research :)
1.) You can turn on MONITOR in redis-cli before executing this, to be sure of what is happening. IMPORTANT: this will kill your benchmark numbers; just use it to see the actual commands, with a small number of tests (e.g. redis-benchmark -n 10).
2.) Since you're loading a Lua script, it is executed atomically every time, as if the commands were in a MULTI/EXEC block.
3.) You can lock a single random number to be used by both commands by specifying the __rand_int__ placeholder AND the -r parameter (e.g. -r 1000). The -r parameter defines the range of the random integers used. __rand_int__ WON'T work if you don't specify the -r parameter (you can see this when monitoring).
4.) After turning MONITOR off, you can see that for bigger -n values the simulation seems to be faster. Try with -n 10 and -n 1000 and see if this holds true.
Read https://redis.io/topics/benchmarks :)
The script:
redis-benchmark -r 10000 -n 1000 eval "redis.call('set',KEYS[1],'xpto') return redis.call('get', KEYS[1])" 1 __rand_int__
A sample MONITOR output:
1487868918.656881 [0 127.0.0.1:50561] "eval" "redis.call('set',KEYS[1],'xpto') return redis.call('get', KEYS[1])" "1" "000000009355"
1487868918.657032 [0 lua] "set" "000000009355" "xpto"
1487868918.657051 [0 lua] "get" "000000009355"

Redis: (error) ERR unknown command 'redis-benchmark'

I'm trying to run some redis benchmark tests, but they're all giving the same error, unknown command redis-benchmark:
C:\>redis-cli
127.0.0.1:6379> redis-benchmark -t set,get -r 1000000 -q
(error) ERR unknown command 'redis-benchmark'
You cannot run redis-benchmark inside a redis-cli shell; it is not a redis-cli command but a standalone executable. Try a regular prompt instead:
Not working:
C:\>redis-cli
127.0.0.1:6379> redis-benchmark -t set,get -r 1000000 -q
Working:
C:\>redis-benchmark -t set,get -r 1000000 -q
Outputs something like:
SET: 111856.82 requests per second
GET: 108225.10 requests per second

Copy all keys from one db to another in redis

Instead of moving them, I want to copy all my keys from a particular db to another.
Is this possible in Redis, and if yes, how?
If you can't use MIGRATE COPY because of your Redis version (2.6), you might want to copy each key separately, which takes longer but doesn't require you to log in to the machines themselves, and allows you to move data from one database to another.
Here's how I copy all keys from one database to another (but without preserving TTLs):
#set connection data accordingly
source_host=localhost
source_port=6379
source_db=0
target_host=localhost
target_port=6379
target_db=1
#copy all keys without preserving ttl!
redis-cli -h $source_host -p $source_port -n $source_db keys \* | while read key; do
echo "Copying $key"
redis-cli --raw -h $source_host -p $source_port -n $source_db DUMP "$key" \
| head -c -1 \
| redis-cli -x -h $target_host -p $target_port -n $target_db RESTORE "$key" 0
done
Keys are not going to be overwritten; to overwrite them, delete those keys before copying, or simply flush the whole target database before starting.
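In case the TTLs matter, here is a variant of the loop above (my own sketch, not part of the original answer) that carries them over by querying PTTL per key:
#copy all keys, preserving ttl where one is set
redis-cli -h $source_host -p $source_port -n $source_db keys \* | while read key; do
    ttl=$(redis-cli -h $source_host -p $source_port -n $source_db pttl "$key")
    [ "$ttl" -lt 0 ] && ttl=0    # PTTL returns -1 (no expiry) or -2 (missing key)
    redis-cli --raw -h $source_host -p $source_port -n $source_db DUMP "$key" \
    | head -c -1 \
    | redis-cli -x -h $target_host -p $target_port -n $target_db RESTORE "$key" "$ttl"
done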
The following copies all keys from database number 0 to database number 1 on localhost:
redis-cli --scan | xargs redis-cli migrate localhost 6379 '' 1 0 copy keys
If you use the same server/port you will get a timeout error but the keys seem to copy successfully anyway. GitHub Redis issue #1903
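For clarity, the positional arguments of that one-liner map onto MIGRATE's syntax; a commented restatement of the same command:
# MIGRATE host port key destination-db timeout [COPY] [REPLACE] [KEYS key ...]
# The '' in the key position is required when keys are passed via the KEYS
# clause; 1 is the destination db and 0 is the timeout in milliseconds.
redis-cli --scan | xargs redis-cli migrate localhost 6379 '' 1 0 copy keys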
redis-cli -a $source_password -p $source_port -h $source_ip keys \* | while read key; do
echo "Copying $key"
redis-cli --raw -a $source_password -h $source_ip -p $source_port -n $dbname DUMP "$key" \
| head -c -1 \
| redis-cli -x -a $destination_password -h $destination_IP -p $destination_port RESTORE "$key" 0
done
Latest solution:
Use the RIOT open-source command line tool provided by Redislabs to copy the data.
Reference: https://developer.redis.com/riot/riot-redis/cookbook.html#_performing_migration
GitHub project link: https://github.com/redis-developer/riot
How to install: https://developer.redis.com/riot/riot-redis/
# Source Redis db
SH=test1-redis.com
SP=6379
# Target Redis db
TH=test1-redis.com
TP=6379
# Copy from db0 to db1 (standalone Redis db, or cluster mode disabled)
#
riot-redis -h $SH -p $SP --db 0 replicate -h $TH -p $TP --db 1 --batch 10000 \
--scan-count 10000 \
--threads 4 \
--reader-threads 4 \
--reader-batch 500 \
--reader-queue 2000 \
--reader-pool 4
RIOT is quicker, supports multithreading, and works well for cross-environment Redis data copies (AWS ElastiCache, Redis OSS, and Redislabs).
Not directly. I would suggest using the always-convenient redis-rdb-tools package (from Sripathi Krishnan) to extract the data from a normal rdb dump and reinject it into another instance.
See https://github.com/sripathikrishnan/redis-rdb-tools
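A minimal sketch of that approach (the dump path and target host are placeholders; check rdb --help for the exact options in your redis-rdb-tools version):
# Convert the rdb dump to the Redis wire protocol and pipe it into the target.
rdb --command protocol /var/redis/6379/dump.rdb \
    | redis-cli -h target-host -p 6379 --pipe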
As far as I understand, you need to copy keys from a particular DB (e.g. 5) to another particular DB, say 10. If that is the case, you can use the redis database dumper (https://github.com/r043v/rdd). Although, as per the documentation, it has a switch (-d) to select a database for the operation, that didn't work for me, so here is what I did:
1.) Edit the rdd.c file and look for the int main(int argc, char *argv[]) function
2.) Change the DB as per your requirement
3.) Compile the source with make
4.) Dump all keys using ./rdd -o "save.rdd"
5.) Edit the rdd.c file again and change the DB
6.) Run make again
7.) Import using ./rdd "save.rdd" -o insert -s "IP" -p "Port"
I know this is old, but for those of you coming here from Google:
I just published a command line utility to npm and GitHub that allows you to copy keys that match a given pattern (even *) from one Redis database to another.
You can find the utility here:
https://www.npmjs.com/package/redis-utils-cli
Try using DUMP to first dump all the keys, and then RESTORE them (see the DUMP/RESTORE loop earlier in this thread for a concrete script).
If you are migrating keys inside the same Redis instance, you can use the internal MOVE command (with pipelining for more speed):
#!/bin/bash
#set connection data accordingly
source_host=localhost
source_port=6379
source_db=4
target_db=0
total=$(redis-cli -h $source_host -p $source_port -n $source_db keys \* | sed 's/^/MOVE /g' | sed 's/$/ '$target_db'/g' | wc -c)
#copy all keys without preserving ttl!
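# pv displays progress, sized by the byte count computed in $total above;
# the MOVE replies are discarded via >/dev/null.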
time redis-cli -h $source_host -p $source_port -n $source_db keys \* | \
sed 's/^/MOVE /g' | sed 's/$/ '$target_db'/g' | \
pv -s $total | \
redis-cli -h $source_host -p $source_port -n $source_db >/dev/null