I'm using Redis in Cluster mode (6 nodes: 3 masters and 3 slaves) with StackExchange.Redis (SE.Redis). As expected, commands with multiple keys in different hash slots are not supported, so I'm using hash tags ({}) to make sure certain keys belong to a particular hash slot. For example, I have 2 keys like cacheItem:{1} and cacheItem:{94770}.
I set those keys using (each key in a separate request):
SEclient.Database.StringSet(key,value)
This works fine, but now I want to query key1 and key2, which belong to different hash slots:
SEclient.Database.StringGet(redisKeys);
The call above fails and throws an exception because those keys belong to different hash slots. When querying keys, I can't make sure that they will belong to the same hash slot; this example has just 2 keys, but I have hundreds of keys that I want to query.
So I have the following questions:
How can I query multiple keys when they belong to different hash slots?
What's the best practice to do that?
Should I calculate hash slots on my side and then send individual requests per hash slot?
Can I use Twemproxy for my scenario?
Any help is highly appreciated.
I can’t speak to SE.Redis, but you are on the right track. You either need to:
Make individual requests per key to ensure they go to the right cluster node, or...
Precalculate the shard + server each key belongs to, grouping by the host. Then send MGET requests with those keys to each host that owns them
Precalculating will require you (or your client) to know the cluster topology (hash slot owners) and the Redis key hashing method (don’t worry, it is simple and well documented) up front.
You can query cluster info from Redis to get owned slots.
The basic hashing algorithm is HASH_SLOT = CRC16(key) mod 16384. Search around and you can find code for that in about any language 🙂 Remember that the use of hash tags makes this more complicated! See also: https://redis.io/commands/cluster-keyslot
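For reference, here is a minimal, self-contained Java sketch of that algorithm (RedisSlot is a made-up name; the CRC16 variant is CCITT/XMODEM, and the hash-tag rule hashes only the first non-empty {...} section when one exists):

    import java.nio.charset.StandardCharsets;

    public final class RedisSlot {
        // CRC16-CCITT (XMODEM), the variant Redis Cluster uses.
        static int crc16(byte[] bytes) {
            int crc = 0;
            for (byte b : bytes) {
                crc ^= (b & 0xFF) << 8;
                for (int i = 0; i < 8; i++) {
                    crc = ((crc & 0x8000) != 0) ? ((crc << 1) ^ 0x1021) : (crc << 1);
                    crc &= 0xFFFF;
                }
            }
            return crc;
        }

        // HASH_SLOT = CRC16(key) mod 16384, honoring hash tags: if the key
        // contains a non-empty {...} section, only that substring is hashed.
        public static int hashSlot(String key) {
            int open = key.indexOf('{');
            if (open != -1) {
                int close = key.indexOf('}', open + 1);
                if (close != -1 && close != open + 1) {
                    key = key.substring(open + 1, close);
                }
            }
            return crc16(key.getBytes(StandardCharsets.UTF_8)) % 16384;
        }
    }

Note that hashSlot("cacheItem:{1}") depends only on the tag content "1", which is exactly why the hash tags in the question pin related keys to one slot.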
Some Redis cluster clients will do this for you with internal magic (e.g. Lettuce in Java), but they are not all created equal 🙂
Also be aware that cluster topology can change at basically any time, and the above work is complicated. To be durable you’ll want to have retries if you get cross slot errors. Or you can just make many requests for single keys as it is much much simpler to maintain.
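To illustrate the precalculate-and-group idea, here is a rough Java/Jedis sketch (SlotAwareMget is a hypothetical name; JedisClusterCRC16 lives in redis.clients.jedis.util in Jedis 3.x; retries on MOVED/cross-slot errors during a topology change are omitted for brevity):

    import java.util.ArrayList;
    import java.util.HashMap;
    import java.util.List;
    import java.util.Map;
    import redis.clients.jedis.JedisCluster;
    import redis.clients.jedis.util.JedisClusterCRC16; // redis.clients.util in older Jedis

    public class SlotAwareMget {
        // Group keys by hash slot, then issue one MGET per group. Keys in the
        // same slot always live on the same master, so each MGET is legal.
        public static Map<String, String> mgetAcrossSlots(JedisCluster cluster, List<String> keys) {
            Map<Integer, List<String>> bySlot = new HashMap<>();
            for (String key : keys) {
                bySlot.computeIfAbsent(JedisClusterCRC16.getSlot(key), s -> new ArrayList<>()).add(key);
            }
            Map<String, String> values = new HashMap<>();
            for (List<String> group : bySlot.values()) {
                List<String> results = cluster.mget(group.toArray(new String[0]));
                for (int i = 0; i < group.size(); i++) {
                    values.put(group.get(i), results.get(i)); // null for missing keys
                }
            }
            return values;
        }
    }

The same idea ports to SE.Redis: compute the slots client-side and issue one GET batch per slot group.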
Redis 4.0:
The KEYS command can list all keys matching a given pattern.
MEMORY USAGE [key] can return the memory used by a key.
How can I use them together to get the sum of the memory used by all keys matching that pattern?
You'd have to implement that logic in whatever language you're most comfortable with. In pseudocode:
Get all key names using KEYS
For each key, get its MEMORY USAGE
Sum up the numbers
Note: don't use KEYS in production, use SCAN.
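Putting that together might look like the Java/Jedis sketch below (PatternMemory is a made-up name; memoryUsage and getCursor are Jedis 3.x-era method names that vary slightly between client versions):

    import redis.clients.jedis.Jedis;
    import redis.clients.jedis.ScanParams;
    import redis.clients.jedis.ScanResult;

    public class PatternMemory {
        // Sum MEMORY USAGE over all keys matching a pattern, using SCAN so
        // the server is never blocked by one huge KEYS call.
        public static long memoryForPattern(Jedis jedis, String pattern) {
            long total = 0;
            String cursor = ScanParams.SCAN_POINTER_START; // "0"
            ScanParams params = new ScanParams().match(pattern).count(1000);
            do {
                ScanResult<String> page = jedis.scan(cursor, params);
                for (String key : page.getResult()) {
                    Long bytes = jedis.memoryUsage(key); // null if the key expired meanwhile
                    if (bytes != null) {
                        total += bytes;
                    }
                }
                cursor = page.getCursor();
            } while (!"0".equals(cursor));
            return total;
        }
    }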
As @Itamar pointed out, do not use keys <pattern> in production, as this command does a complete scan of all the keys in the Redis server. This query will degrade Redis performance, and almost all Redis queries will take a considerable amount of time (as Redis is a single-threaded application).
What you want to achieve can be done via a Lua script. Though I would recommend not using custom solutions; there exist dashboards (like Zabbix) for monitoring Redis and its memory usage.
I'm using redis cluster 3.0.1.
I think Redis Cluster uses consistent hashing. The hash slots are similar to virtual nodes in consistent hashing. Cassandra's data distribution is almost the same as Redis Cluster's, and this article said it's consistent hashing.
But the Redis Cluster tutorial said Redis Cluster does not use consistent hashing.
What am I missing? Thanks.
You are right, virtual nodes are quite similar to hash slots.
But virtual nodes are not an original concept of consistent hashing; they are more like a trick used by Cassandra on top of consistent hashing. So it's also OK for Redis to say it is not using consistent hashing.
So, don't get too hung up on the terminology.
Consistent hashing gives a lot of nice properties when it hashes servers onto a ring (see the sketch after this list):
servers are randomly distributed around the ring, good for balancing load in a cluster
adding/removing a server only affects its neighbors, minimizing data migration
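For intuition, here is a minimal, hypothetical Java sketch of such a ring with virtual nodes (illustrative only, not Redis internals): it shows the random placement, the cheap add/remove, and why there is no direct key -> server control.

    import java.nio.charset.StandardCharsets;
    import java.security.MessageDigest;
    import java.security.NoSuchAlgorithmException;
    import java.util.Map;
    import java.util.TreeMap;

    public class HashRing {
        private final TreeMap<Long, String> ring = new TreeMap<>();
        private final int vnodes;

        public HashRing(int vnodes) { this.vnodes = vnodes; }

        // Each server appears at many pseudo-random points on the ring
        // (virtual nodes), which gives the balanced distribution.
        public void addServer(String server) {
            for (int i = 0; i < vnodes; i++) {
                ring.put(hash(server + "#" + i), server);
            }
        }

        // Removing a server only reassigns keys between its points and their
        // clockwise neighbors; every other key stays put.
        public void removeServer(String server) {
            for (int i = 0; i < vnodes; i++) {
                ring.remove(hash(server + "#" + i));
            }
        }

        // A key belongs to the first server clockwise from its own hash;
        // note there is no way to pin a specific key to a specific server.
        public String serverFor(String key) {
            Map.Entry<Long, String> e = ring.ceilingEntry(hash(key));
            return (e != null ? e : ring.firstEntry()).getValue();
        }

        private static long hash(String s) {
            try {
                byte[] d = MessageDigest.getInstance("MD5").digest(s.getBytes(StandardCharsets.UTF_8));
                long h = 0;
                for (int i = 0; i < 8; i++) {
                    h = (h << 8) | (d[i] & 0xFF);
                }
                return h;
            } catch (NoSuchAlgorithmException e) {
                throw new IllegalStateException(e);
            }
        }
    }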
However, I don't think you can control which key goes to which server, i.e. I can't do the following assignment:
key 1-99 ==> serverA
key 100 ==> serverB
// I can probably reach the same traffic split, 99:1,
// by giving more virtual nodes to serverA, but it won't guarantee
// that key 1 and key 99 are served by the same machine
This is allowed in Redis: Redis uses hash slots, which I believe are an explicit map from hash values -> servers. This gives you full control; in particular, it enables multi-key transactions, i.e.:
key Alice, key Bob ==> serverA
// move 100$ from Alice's bank account to Bob's in one operation
// no need for special techniques like two-phase commit
The key -> server mapping is now managed by yourself, as opposed to by consistent hashing. The drawback is that there is more work/responsibility for the admins; Redis also provides commands to help you with the management: rebalance, reshard.
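To make the multi-key point concrete, here is a small hypothetical Jedis sketch: a shared {...} hash tag forces both keys into the same slot, so one node owns them and a plain MULTI/EXEC works (key names and the node address are made up):

    import redis.clients.jedis.Jedis;
    import redis.clients.jedis.Transaction;
    import redis.clients.jedis.util.JedisClusterCRC16;

    public class SameSlotTransfer {
        public static void main(String[] args) {
            // The {acct} hash tag means only "acct" is hashed, so both keys
            // land in the same hash slot and therefore on the same master.
            String alice = "balance:{acct}:Alice";
            String bob = "balance:{acct}:Bob";
            System.out.println(JedisClusterCRC16.getSlot(alice) == JedisClusterCRC16.getSlot(bob)); // true

            // Connect to the node that owns that slot (address is illustrative).
            try (Jedis node = new Jedis("127.0.0.1", 7000)) {
                Transaction tx = node.multi();
                tx.incrBy(alice, -100); // move 100 from Alice...
                tx.incrBy(bob, 100);    // ...to Bob, atomically
                tx.exec();
            }
        }
    }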
Disclaimer: this is my own understanding (here are my sources); I wish I could just @redis_dev on Stack Overflow and let them proofread my answer.
Is there an efficient method to count specific class of keys on a Redis cluster?
Here, 'specific class of keys' means keys that are used for a common purpose; for example, session keys. They can have a common key name prefix. There can be multiple classes. From now on, I will refer to such a class of keys simply as the keys.
What I want to do is as follows:
Redis cluster must be used.
The keys must be distributed to the nodes of the Redis cluster.
There must be an efficient way to count the number of the keys on all of the nodes of the Redis cluster.
The keys can have TTL - that is, can expire.
The number of the nodes of the Redis cluster can be changed on runtime, and hash slots can be redistributed.
Clients are implemented using Node.js.
I've read the documentation, but could not find a proper solution.
Thanks in advance.
No, basically. That doesn't exist for "classic" (non-cluster) Redis, either. To do that without an additional storage mechanism, you would need to use SCAN repeatedly to iterate over the entire keyspace. Fortunately it does at least accept a filter (so you don't need to fetch every key), but it is far from efficient - you'd typically only do this periodically as a review feature, not an operational feature. We actually include such a feature in "opserver"'s redis plugin.
When you switch to cluster, you'd need to repeat this but on one of each set of replication verticals. You would typically get that list via the CLUSTER commands, so the dynamic nature of the nodes is moot.
In both classic and cluster, it would be recommended to only do this on a replica - not the master. And again: only as an admin tool, not as a routine part of your system.
Do not use KEYS to do this. Prefer SCAN.
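As a sketch of that admin-style loop (shown here in Java/Jedis; ClusterKeyCount is a hypothetical name, and getClusterNodes/getCursor are Jedis 3.x-era names): enumerate the nodes, keep one per replication vertical, and run a MATCH-filtered SCAN on each:

    import java.util.Map;
    import redis.clients.jedis.Jedis;
    import redis.clients.jedis.JedisCluster;
    import redis.clients.jedis.JedisPool;
    import redis.clients.jedis.ScanParams;
    import redis.clients.jedis.ScanResult;

    public class ClusterKeyCount {
        // Count keys matching a pattern across the cluster by running a
        // MATCH-filtered SCAN on one node per replication vertical.
        public static long countMatching(JedisCluster cluster, String pattern) {
            long total = 0;
            for (Map.Entry<String, JedisPool> entry : cluster.getClusterNodes().entrySet()) {
                try (Jedis node = entry.getValue().getResource()) {
                    // Skip replicas so each key is counted exactly once.
                    if (!node.info("replication").contains("role:master")) continue;
                    String cursor = ScanParams.SCAN_POINTER_START;
                    ScanParams params = new ScanParams().match(pattern).count(1000);
                    do {
                        ScanResult<String> page = node.scan(cursor, params);
                        total += page.getResult().size();
                        cursor = page.getCursor();
                    } while (!"0".equals(cursor));
                }
            }
            return total;
        }
    }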
I'm new to Redis and use Redis 2.8 with the StackExchange.Redis library.
How can I write a KEYS pattern to get all keys with a specific hashed member value?
As I use StackExchange.Redis, I want to get keys with a pattern like this (where username is a member of a key): KEYS "username:*AAA*".
database.HashKeys("suggest me a pattern :) ")
I will call this method many times per HTTP user request to find the user's session data stored in the Redis database. Do you suggest a better alternative solution for this approach?
This simply isn't a direct fit for any Redis feature. You certainly shouldn't use KEYS for this - in addition to being expensive (you should prefer SCAN, btw), it scans the keys, not the values.
The question refers to the sharded configuration of Redis. I have implemented a small test application in Java which creates 100,000 user hashes over Jedis in the form user:userID. Each hash has the elements: name, phone, department, userID. I also created simple key-value pairs with the key phone:phoneNumber, containing the userID of the user the phone number belongs to, and sets for each department holding the userIDs of those who work in that particular department. The latter two types I use only for searching. These structures and the search are similar to Searching in values of a redis db.
In short, the data structures:
user:userID->{name, department, phone, userID}
department:department->([userID1, userID2,....])
phone:phone->userID
Use cases for the search:
access to user-hashes based on key i.e. userID
search for users with a phone number
search for all users of a department
Everything works all right in the single-instance and sharded configurations, but I have the following questions:
In the single-instance configuration it is possible to look for a phone number with a wildcard, e.g. with the KEYS command, but this is not available in the sharded configuration. How would it be possible to look for keys whose first part is known?
The user ID is generated from a zset whose score is increased for each new userID. This can be done in a transaction in the single-instance configuration, but transactions seem not to be supported in sharded configurations over Jedis, even if the participating keys are on the same instance. How would it be possible to solve this problem if multiple client threads can also do the user creation?
Thank you for your responses in advance.
For the 1st part of your question:
There is no magic here: if you want to search across all your shards, you have to iterate over all the shards. Jedis doesn't have this method, but you could extend ShardedJedis to add it (untested):
public Set<String> keys(String pattern) {
    // Fan the KEYS call out to every shard and merge the results.
    HashSet<String> found = new HashSet<String>();
    for (Jedis jedis : getAllShards()) {
        found.addAll(jedis.keys(pattern));
    }
    return found;
}
For the 2nd part of your question:
AFAIK, Jedis doesn't support transactions when using Shards, even if you do force the related keys to be on the same shard (see Jedis Advanced Usage).
The same link suggests a workaround that may apply to a few scenarios:
Mixed approach
If you want the easy load distribution of ShardedJedis but still need transactions/pipelining/pubsub etc., you can also mix the normal and the sharded approach: define a master as normal Jedis, the others as sharded Jedis. Then make all the shards slaveof the master. In your application, direct your write requests to the master and your read requests to ShardedJedis. Your writes don't scale anymore, but you gain good read distribution, and you have transactions/pipelining/pubsub simply by using the master. The dataset should fit in the RAM of the master. Remember that you can improve the performance of the master a lot if you let the slaves do the persistence for the master!