To create and start a cluster in Redis, I use the create-cluster.sh file inside
/redis-3.04/utils/create-cluster
With it I can create as many nodes as I want by changing these settings:
PORT=30000
TIMEOUT=2000
NODES=10
REPLICAS=1
I wonder if I can create, for example, 10 nodes (5 masters with 5 slaves) at the beginning but start only 4 masters and 4 slaves (meet and join).
Thanks in advance.
Yes. You can add more nodes later if the load on your existing cluster increases.
The basic steps are:
Start the new Redis instances. Let's say you want to add 2 more masters and their slaves (4 new Redis instances in total).
Then, using the redis-trib utility, do the following:
redis-trib.rb add-node <new master node:port> <any existing master>
e.g. ./redis-trib.rb add-node 192.168.1.16:7000 192.168.1.15:7000
After this, the new node will be assigned an id. Note that id and run the following command to add a slave to the master we added in the previous step:
./redis-trib.rb add-node --slave --master-id <master-node-id> <new-node> <master-node>
./redis-trib.rb add-node --slave --master-id 6f9db976c3792e06f9cd252aec7cf262037bea4a 192.168.1.17:7000 192.168.1.16:7000
where 6f9db976c3792e06f9cd252aec7cf262037bea4a is the id of 192.168.1.16:7000.
Using similar steps you can add one more master-slave pair, for example:
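A hedged sketch of that second pair, assuming the new instances run at 192.168.1.18:7000 and 192.168.1.19:7000 (placeholders for your actual hosts):
./redis-trib.rb add-node 192.168.1.18:7000 192.168.1.15:7000
./redis-trib.rb add-node --slave --master-id <id-of-192.168.1.18:7000> 192.168.1.19:7000 192.168.1.18:7000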
Since these new nodes do not contain any slots to serve, you have to move some of the slots from the existing masters to the new masters (resharding).
To do that, run the following resharding steps:
6.1 ./redis-trib.rb reshard <any-master-ip>:<master-port>
6.2 It will ask: How many slots do you want to move (from 1 to 16384)? Enter the number of slots you want to move.
6.3 Then it will ask: What is the receiving node ID?
6.4 Enter the node id to which the slots need to be moved (one of the new masters).
6.5 It will prompt:
Please enter all the source node IDs.
Type 'all' to use all the nodes as source nodes for the hash slots.
Type 'done' once you entered all the source nodes IDs.
Source node #1: (enter source node id or all)
6.6 Then it will print which slot is being moved from which source node, like:
Moving slot 10960 from 37d10f18f349a6a5682c791bff90e0188ae35e49
Moving slot 10961 from 37d10f18f349a6a5682c791bff90e0188ae35e49
Moving slot 10962 from 37d10f18f349a6a5682c791bff90e0188ae35e49
6.7 It will ask: Do you want to proceed with the proposed reshard plan (yes/no)? Type yes, press Enter, and you are done.
Note: If the data is large, it might take some time to reshard.
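If you would rather script this than answer the prompts, redis-trib can also take the resharding parameters up front (the ids below are placeholders; double-check the options against your redis-trib version):
./redis-trib.rb reshard --from <source-master-id> --to <new-master-id> --slots 1000 --yes 192.168.1.15:7000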
A few useful commands:
To list all nodes in the cluster along with their node ids:
redis-cli -h node-ip -p node-port cluster nodes
e.g. redis-cli -h 127.0.0.1 -p 7000 cluster nodes
To list all slots in the cluster:
redis-cli -h 127.0.0.1 -p 7000 cluster slots
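To verify that all 16384 slots are covered (for example, after resharding), redis-trib also has a check command:
./redis-trib.rb check 127.0.0.1:7000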
Ref : https://redis.io/commands/cluster-nodes
Hope this helps.
I am a newbie to Redis and am trying to understand the concept of Redis Pub/Sub.
Step 1:
root@01a623a828db:/data# redis-cli -n 1
127.0.0.1:6379[1]> subscribe foo
Reading messages... (press Ctrl-C to quit)
1) "subscribe"
2) "foo"
3) (integer) 1
In step 1, I subscribed to channel foo on database 1.
Step 2:
root@01a623a828db:/data# redis-cli -n 4
127.0.0.1:6379[4]> publish foo 2
(integer) 1
In step 2, I published a message on channel foo on database 4.
Step 3:
root@01a623a828db:/data# redis-cli -n 1
127.0.0.1:6379[1]> subscribe foo
Reading messages... (press Ctrl-C to quit)
..........................................
1) "message"
2) "foo"
3) "2"
In step 3, the subscriber on database 1 received the message that was published on database 4 in step 2.
I tried to find out the reason behind this, but I found the same answer everywhere: "Pub/Sub has no relation to the key space. It was made to not interfere with it on any level, including database numbers. Publishing on db 10, will be heard by a subscriber on db 1. If you need scoping of some kind, prefix the channels with the name of the environment (test, staging, production)." This is as per the official documentation of Redis Pub/Sub.
Questions:
Why is the Redis Pub/Sub architecture independent of the database number?
How do I implement "If you need scoping of some kind, prefix the channels with the name of the environment (test, staging, production)"?
Isn't "Publishing on db 10, will be heard by a subscriber on db 1" in conflict with the statement "It was made to not interfere with it on any level, including database numbers"?
It's a matter of design choice, really.
If you need scoping, you can always prefix the channel name. For example, the channel productupdate on the test environment would be watched via test:productupdate, and on the staging environment via staging:productupdate.
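A minimal sketch of that convention with redis-cli (the channel names are just examples):
# terminal 1: subscriber scoped to the test environment
redis-cli subscribe test:productupdate
# terminal 2: publisher in the test environment; staging code would publish to staging:productupdate instead
redis-cli publish test:productupdate "price changed"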
And it lines up fine with the statement: Pub/Sub ignores databases entirely, so the database number simply doesn't matter here.
I'm trying to modify a script in which I was using redis-trib. For various reasons I can't use it now, but I can use redis-cli.
The point is: when it comes to assigning slots to nodes, is it done in blocks or following some particular scheme? That is, if there are 3 nodes, do I just distribute the 16384 slots among the 3 nodes and assign the remaining ones to the last node, or do I have to go through some other process?
Thank you very much.
Edit:
I'm asking because I'm looking at some unofficial documentation (I didn't see anything about this in the official docs that wasn't using redis-trib), which shows the following:
for slot in {0..5461}; do redis-cli -p 7000 CLUSTER ADDSLOTS $slot > /dev/null; done;
for slot in {5462..10923}; do redis-cli -p 7001 CLUSTER ADDSLOTS $slot > /dev/null; done;
for slot in {10924..16383}; do redis-cli -p 7002 CLUSTER ADDSLOTS $slot > /dev/null; done;
I do not know the criteria on which this distribution is based.
In the end I distribute the 16384 slots by dividing them by the number of nodes. For the remaining unallocated slots (the modulo), I spread them as evenly as possible among the nodes.
In any case I try to keep the number of slots per node the same or very similar, and always as contiguous ranges.
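For what it's worth, here is a rough bash sketch of that approach, assuming three nodes on ports 7000-7002 (adjust NODES and PORTS to your topology; it is not a drop-in replacement for redis-trib):
NODES=3
PORTS=(7000 7001 7002)
TOTAL=16384
PER_NODE=$((TOTAL / NODES))     # 5461 for 3 nodes
REMAINDER=$((TOTAL % NODES))    # leftover slots, handed out one per node from the front
START=0
for i in $(seq 0 $((NODES - 1))); do
  COUNT=$PER_NODE
  if [ $i -lt $REMAINDER ]; then COUNT=$((COUNT + 1)); fi
  END=$((START + COUNT - 1))
  for slot in $(seq $START $END); do
    redis-cli -p ${PORTS[$i]} CLUSTER ADDSLOTS $slot > /dev/null
  done
  START=$((END + 1))
done
This yields the contiguous ranges 0-5461, 5462-10922 and 10923-16383 for three nodes.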
I have 4 nodes, 3 are masters and 1 is a slave. I am trying to add a simple string with set foo bar, but whenever I do it, I get this error:
(error) CLUSTERDOWN The cluster is down
Below is my cluster info
127.0.0.1:7000> cluster info
cluster_state:fail
cluster_slots_assigned:11
cluster_slots_ok:11
cluster_slots_pfail:0
cluster_slots_fail:0
cluster_known_nodes:4
cluster_size:3
cluster_current_epoch:3
cluster_my_epoch:3
cluster_stats_messages_sent:9262
cluster_stats_messages_received:9160
I am using Redis-x64-3.0.503. Please let me know how to solve this.
Cluster Nodes:
87982f22cf8fb12c1247a74a2c26cdd1b84a3b88 192.168.2.32:7000 slave bc1c56ef4598fb4ef9d26c804c5fabd462415f71 1492000375488 1492000374508 3 connected
9527ba919a8dcfaeb33d25ef805e0f679276dc8d 192.168.2.35:7000 master - 1492000375488 1492000374508 2 connected 16380
ff999fd6cbe0e843d1a49f7bbac1cb759c3a2d47 192.168.2.33:7000 master - 1492000375488 1492000374508 0 connected 16381
bc1c56ef4598fb4ef9d26c804c5fabd462415f71 127.0.0.1:7000 myself,master - 0 0 3 connected 1-8 16383
Just to add to and simplify what @neuront said.
Redis Cluster stores data in hash slots. To understand this you need to know how hash maps or hash tables work. For our purposes here, Redis has a fixed total of 16384 slots to assign and distribute across all the master servers.
Now, if we look at the node configuration you posted and compare it with the Redis documentation, you'll see the numbers at the end are the slots assigned to each master.
In your case, this is what it looks like:
... slave ... connected
... master ... connected 16380
... master ... connected 16381
... master ... connected 1-8 16383
So all the machines are connected and form the cluster, but not all the hash slots are assigned to store information. It should have been something like this:
... slave ... connected
... master ... connected 0-5460
... master ... connected 5461-10922
... master ... connected 10923-16383
You see, now we are assigning the full range of hash slots, as the documentation describes:
slot: A hash slot number or range. Starting from argument number 9, but there may be up to 16384 entries in total (limit never reached). This is the list of hash slots served by this node. If the entry is just a number, is parsed as such. If it is a range, it is in the form start-end, and means that the node is responsible for all the hash slots from start to end including the start and end values.
Specifically, in your case, when you stored data with the key foo, it must have hashed to a slot that is not assigned to any node in the cluster.
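If you want to check, CLUSTER KEYSLOT shows the slot a given key hashes to; if the returned number falls outside the assigned ranges above, writes to that key fail while the cluster is incomplete:
redis-cli -h 127.0.0.1 -p 7000 CLUSTER KEYSLOT foo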
Since you're on Windows, you'll have to set up the distribution manually. For that, you'll have to do something like this (this is for Linux; translate it to the Windows equivalent):
for slot in {0..5400}; do redis-cli -h master1 -p 6379 CLUSTER ADDSLOTS $slot; done;
taken from this article
Hope it helped.
Only 11 slots were assigned so your cluster is down, just like the message tells you. The slots are 16380 at 192.168.2.35:7000, 16381 at 192.168.2.33:7000 and 1-8 16383 at 127.0.0.1:7000.
Of course the direct reason is that you need to assign all 16384 slots (0-16383) to the cluster, but I think this was caused by a configuration mistake.
You have a node with the localhost address 127.0.0.1:7000. However, 192.168.2.33:7000 is also "127.0.0.1:7000" from its own point of view, and so is 192.168.2.35:7000. This localhost address problem means a node cannot tell itself apart from another node, and I think that is what causes the chaos.
I suggest you reset all the nodes with the CLUSTER RESET command and re-create the cluster, making sure you use their 192.168.*.* addresses this time.
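For example, something along these lines on each node (FLUSHALL first, since a node that still holds keys refuses CLUSTER RESET; HARD also regenerates the node id):
redis-cli -h 192.168.2.33 -p 7000 FLUSHALL
redis-cli -h 192.168.2.33 -p 7000 CLUSTER RESET HARD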
@user1829319 Here are the Windows equivalents for adding the slots:
for /l %s in (0, 1, 8191) do redis-cli -h 127.0.0.1 -p 6379 CLUSTER ADDSLOTS %s
for /l %s in (8192, 1, 16383) do redis-cli -h 127.0.0.1 -p 6380 CLUSTER ADDSLOTS %s
You should recreate your cluster by doing FLUSHALL and CLUSTER RESET, and during the next cluster setup make sure you verify that all slots have been assigned to the masters using cluster slots.
I have some fairly simple Hadoop streaming jobs that look like this:
yarn jar /usr/lib/hadoop-mapreduce/hadoop-streaming-2.2.0.2.0.6.0-101.jar \
-files hdfs:///apps/local/count.pl \
-input /foo/data/bz2 \
-output /user/me/myoutput \
-mapper "cut -f4,8 -d," \
-reducer count.pl \
-combiner count.pl
The count.pl script is just a simple script that accumulates counts in a hash and prints them out at the end - the details are probably not relevant but I can post it if necessary.
The input is a directory containing 5 files encoded with bz2 compression, roughly the same size as each other, for a total of about 5GB (compressed).
When I look at the running job, it has 45 mappers, but they're all running on one node. The particular node changes from run to run, but it is always just one node. As a result I'm getting poor data locality, since data is transferred over the network to that node, and probably poor CPU utilization too.
The entire cluster has 9 nodes, all the same basic configuration. The blocks of the data for all 5 files are spread out among the 9 nodes, as reported by the HDFS Name Node web UI.
I'm happy to share any requested info from my configuration, but this is a corporate cluster and I don't want to upload any full config files.
It looks like this previous thread [ why map task always running on a single node ] is relevant but not conclusive.
EDIT: at @jtravaglini's suggestion I tried the following variation and saw the same problem - all 45 map tasks running on a single node:
yarn jar \
/usr/lib/hadoop-mapreduce/hadoop-mapreduce-examples-2.2.0.2.0.6.0-101.jar \
wordcount /foo/data/bz2 /user/me/myoutput
At the end of the output of that task in my shell, I see:
Launched map tasks=45
Launched reduce tasks=1
Data-local map tasks=18
Rack-local map tasks=27
which is the number of data-local tasks you'd expect to see on one node just by chance alone.
So I accidentally started 2 ElasticSearch instances on the same machine, one on port 9200 and the other on port 9201. This means there are 2 cluster nodes, each with the same name, and each holding half of the total shards for each index.
If I kill one of the instances, I end up with 1 instance holding only half the shards.
How do I fix this? I want to have just 1 instance with all the shards in it (like it used to be).
SO... there is a clean way to resolve this, although I must say the ElasticSearch documentation is very, very confusing (all these buzzwords like cluster and zen discovery boggle my mind!).
1) Suppose you have 2 instances, one on port 9200 and the other on 9201, and you want ALL the shards to be on 9200.
Run this command to disable allocation on the 9201 instance. You can change persistent to transient if you don't want this change to be permanent; I'd keep it persistent so this never happens again.
curl -XPUT localhost:9201/_cluster/settings -d '{
"persistent" : {
"cluster.routing.allocation.disable_allocation" : true
}
}'
2) Now, run the command to MOVE each shard from the 9201 instance to 9200.
curl -XPOST 'localhost:9200/_cluster/reroute' -d '{
"commands" : [ {
"move" :
{
"index" : "<NAME OF INDEX HERE>", "shard" : <SHARD NUMBER HERE>,
"from_node" : "<ID OF 9201 node>", "to_node" : "<ID of 9200 node>"
}
}
]
}'
You need to run this command for every shard on the 9201 instance (the one you want to get rid of).
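To find the node ids and the full list of shards to move, something like this should work (output format varies a bit across versions):
curl 'localhost:9200/_cat/shards?v'
curl 'localhost:9200/_nodes?pretty'
The first lists every shard and the node it currently lives on; in the second, the node ids are the keys under "nodes".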
If you have ElasticSearch Head, that shard will show up in purple with a RELOCATING status. If you have lots of data, say > 1 GB, it will take a while for the shard to move - perhaps up to an hour or even more, so be patient. Don't shut down the instance/node until everything has finished moving.
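One more hedged note: once everything has moved and the 9201 instance has been shut down for good, you can re-enable allocation by flipping the same setting back:
curl -XPUT localhost:9200/_cluster/settings -d '{
  "persistent" : {
    "cluster.routing.allocation.disable_allocation" : false
  }
}'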
That's it!