Is there an efficient method to count a specific class of keys on a Redis cluster?
Here, 'a specific class of keys' means keys that are used for a common purpose, for example session keys. They can share a common key name prefix. There can be multiple classes. From now on, I will refer to such a class of keys simply as 'the keys'.
What I want to do is as follows:
Redis cluster must be used.
The keys must be distributed to the nodes of the Redis cluster.
There must be an efficient way to count the number of the keys on all of the nodes of the Redis cluster.
The keys can have TTL - that is, can expire.
The number of nodes in the Redis cluster can change at runtime, and hash slots can be redistributed.
Clients are implemented using Node.js.
I've read the documentation, but could not find a proper solution.
Thanks in advance.
No, basically. That doesn't exist for "classic" (non-cluster), either. To do that without an additional storage mechanism, you would need to use SCAN repeatedly to iterate over the entire keyspace. Fortunately it does at least accept a filter (so you don't need to fetch every key), but is far from efficient - you'd typically only do this periodically as a review feature, not an operational feature. We actually include such a feature in "opserver"'s redis plugin.
When you switch to cluster, you'd need to repeat this on one node from each replication group (a master and its replicas). You would typically get that list via the CLUSTER commands, so the dynamic nature of the nodes is moot.
In both classic and cluster, it would be recommended to only do this on a replica - not the master. And again: only as an admin tool, not as a routine part of your system.
Do not use KEYS to do this. Prefer SCAN.
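To make the SCAN approach concrete, here is a minimal sketch in Python (not the asker's Node.js, and not opserver's implementation). It assumes each element of `nodes` is a client connected directly to one node per shard and exposing a redis-py style `scan(cursor=..., match=..., count=...)` method that returns `(next_cursor, keys)`. The function names and the `session:*` pattern are made up for illustration.

```python
def count_keys_on_node(node, pattern, batch=1000):
    """Count keys matching `pattern` on a single node by iterating SCAN."""
    cursor = 0
    seen = set()  # SCAN may return the same key more than once
    while True:
        cursor, keys = node.scan(cursor=cursor, match=pattern, count=batch)
        seen.update(keys)
        if cursor == 0:  # a returned cursor of 0 means the iteration is complete
            return len(seen)


def count_keys(nodes, pattern, batch=1000):
    """Sum the per-node counts; `nodes` should hold one client per shard."""
    return sum(count_keys_on_node(node, pattern, batch) for node in nodes)
```

Something like `count_keys(replica_clients, "session:*")` would then give a point-in-time estimate. Keys can expire or move between nodes during the scan, so treat the result as approximate, and note that the `seen` set holds every matching key name in memory.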
So I just read about Redlock. What I understood is that it needs 3 independent machines to work. By independent they mean that all the machines are masters and there is no replication among them, which means they are serving different types of data. So why would I need to lock a key present in three independent Redis instances acting as masters? What are the use cases where I would need to use Redlock?
So why would I need to lock a key present in three independent redis instances acting as masters?
It's not that you're locking a key within Redis. Rather, the key is the lock, and used to control access to some other resource. That other resource could be anything, and generally is something outside of Redis since Redis has its own mechanisms for allowing atomic access to its data structures.
What are the use cases where I would need to use redlock?
You would use a distributed lock when you want only one member at a time of a distributed system to do something.
To take a random example from the internet, here's Coinbase talking about their use of a distributed lock to "ensure that multiple processes do not concurrently generate and broadcast separate transactions to the network".
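To illustrate "the key is the lock", here is a minimal single-instance sketch. It is not Redlock itself: Redlock repeats the acquire step against several independent masters and only considers the lock held when a majority succeed within the lock's validity time. The sketch assumes a client with redis-py style `set(name, value, nx=..., px=...)` and `eval(script, numkeys, *keys_and_args)` methods; the function names and the lock key are hypothetical.

```python
import uuid

# Delete the key only if it still holds our token, so that we never
# release a lock that has expired and been acquired by someone else.
RELEASE_SCRIPT = """
if redis.call("get", KEYS[1]) == ARGV[1] then
    return redis.call("del", KEYS[1])
else
    return 0
end
"""


def acquire(client, lock_key, ttl_ms):
    """Try to take the lock; return a random token on success, else None."""
    token = uuid.uuid4().hex
    # NX: only set if the key does not exist. PX: expire after ttl_ms.
    if client.set(lock_key, token, nx=True, px=ttl_ms):
        return token
    return None


def release(client, lock_key, token):
    """Release the lock if we still own it; return True if it was released."""
    return client.eval(RELEASE_SCRIPT, 1, lock_key, token) == 1
```

A worker would call `acquire(client, "lock:broadcast", 30000)`, do the protected work only if a token came back, and then call `release` with that token.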
I'm using redis cluster 3.0.1.
I think Redis Cluster uses consistent hashing. The hash slots are similar to virtual nodes in consistent hashing. Cassandra's data distribution is almost the same as Redis Cluster's, and this article said it's consistent hashing.
But the Redis Cluster tutorial says Redis Cluster does not use consistent hashing.
What do I miss? Thanks.
You are right, virtual nodes are quite similar to hash slots.
But virtual nodes are not an original concept of consistent hashing; they are more like a trick Cassandra builds on top of consistent hashing. So it's also OK for Redis to say it does not use consistent hashing.
So, don't worry too much about the terminology.
Consistent hashing gives a lot of nice properties when it hashes servers into a ring:
servers are randomly distributed in the ring, good for balancing load in a cluster
adding or removing a server only affects its neighbors, which minimizes data migration
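The second property can be checked with a small sketch. This is a bare ring without virtual nodes, written for illustration only; it is not how Cassandra or any particular library implements it, and the class name, the choice of MD5, and the server names are all assumptions.

```python
import bisect
import hashlib


def _point(name: str) -> int:
    """Map a server name or key to a position on the ring."""
    return int.from_bytes(hashlib.md5(name.encode("utf-8")).digest()[:8], "big")


class Ring:
    def __init__(self, servers):
        pairs = sorted((_point(s), s) for s in servers)
        self._hashes = [h for h, _ in pairs]
        self._servers = [s for _, s in pairs]

    def owner(self, key: str) -> str:
        """Return the first server at or after the key's position, wrapping around."""
        i = bisect.bisect_left(self._hashes, _point(key)) % len(self._hashes)
        return self._servers[i]


keys = [f"key{i}" for i in range(1000)]
before = Ring(["serverA", "serverB", "serverC"])
after = Ring(["serverA", "serverB", "serverC", "serverD"])

# Keys that change owner after serverD joins; all of them now belong to
# serverD, and every other key stays where it was.
moved = [k for k in keys if before.owner(k) != after.owner(k)]
```

With only one point per server the load split can be quite uneven, which is the problem virtual nodes are meant to smooth out.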
However, I don't think you can control which key goes to which server: i.e. I can't do the following assignment:
key 1-99 ==> serverA
key 100 ==> serverB
// I can probably reach the same traffic split, 99:1
// by giving more virtual nodes to serverA, but that won't guarantee
// that key 1 and key 99 are served by the same machine
This is allowed in Redis. Redis uses hash slots, which I believe are an explicit map from hash value -> servers. This gives you full control; in particular, it enables multi-key transactions, e.g.:
key Alice, key Bob ==> serverA
// move $100 from Alice's bank account to Bob's in one operation
// no need for a special technique like two-phase commit
The key -> server mapping is now managed by you rather than by consistent hashing. The drawback is that there is more work and responsibility for the admins. Redis also provides commands to help you with the management: rebalance, reshard.
Disclaimer: this is my own understanding (here's my sources). I wish I could just ping #redis_dev on Stack Overflow and let them proofread my answer.
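As a rough picture of what "an explicit map" means, the sketch below models the slot table as a plain list with one owner per slot. It only illustrates the idea; it is not how Redis stores its cluster state, and the helper names and node names are invented.

```python
NUM_SLOTS = 16384  # Redis Cluster has a fixed number of hash slots


def even_slot_table(nodes):
    """Give each node a contiguous, roughly equal range of slots."""
    return [nodes[slot * len(nodes) // NUM_SLOTS] for slot in range(NUM_SLOTS)]


def move_slots(table, slots, target):
    """Return a new table where `slots` belong to `target`; other slots keep their owner."""
    new_table = list(table)
    for slot in slots:
        new_table[slot] = target
    return new_table


table = even_slot_table(["serverA", "serverB", "serverC"])
# An admin decides to hand the first 100 slots to serverC.
resharded = move_slots(table, range(0, 100), "serverC")
```

The point is that ownership changes only where the admin says so: after the move, exactly those 100 entries differ between `table` and `resharded`.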
I'm using Redis in cluster mode (6 nodes: 3 masters and 3 slaves) with SE.Redis. However, as usual, commands with multiple keys in different hash slots are not supported,
so I'm using hash tags ({}) to make sure that a certain key belongs to a particular hash slot. For example, I have 2 keys like cacheItem:{1} and cacheItem:{94770}.
I set those keys using ( each key in a separate request):
SEclient.Database.StringSet(key,value)
this works fine,
but now I want to query key1 and key2, which belong to different hash slots:
SEclient.Database.StringGet(redisKeys);
The above fails and throws an exception because those keys belong to different hash slots.
When querying keys, I can't make sure that my keys will belong to the same hash slot.
This example has just 2 keys; I have hundreds of keys that I want to query.
so I have following questions:
how can I query multiple keys when they belong to different hash slots?
what's the best practice to do that?
should I calculate hash slots on my side and then send individual requests per hash slot?
can I use TwemProxy for my scenario?
Any help is highly appreciated.
I can’t speak to SE.Redis, but you are on the right track. You either need to:
Make individual requests per key to ensure they go to the right cluster node, or...
Precalculate the shard + server each key belongs to, grouping by the host. Then send MGET requests with those keys to each host that owns them
Precalculating will require you (or your client) to know the cluster topology (hash slot owners) and the Redis key hashing method (don’t worry, it is simple and well documented) up front.
You can query the cluster topology from Redis (for example with CLUSTER SLOTS or CLUSTER NODES) to get the slots each node owns.
The basic hashing algorithm is HASH_SLOT = CRC16(key) mod 16384. Search around and you can find code for that for about any language 🙂 Remember that the use of hash tags makes this more complicated! See also: https://redis.io/commands/cluster-keyslot
Some Redis cluster clients will do this for you with internal magic (e.g. Lettuce in Java), but they are not all created equal 🙂
Also be aware that cluster topology can change at basically any time, and the above work is complicated. To be durable you’ll want to have retries if you get cross slot errors. Or you can just make many requests for single keys as it is much much simpler to maintain.
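For reference, the slot calculation and the grouping step can be sketched as follows. This is Python rather than SE.Redis, and it relies on the assumption that `binascii.crc_hqx` with an initial value of 0 is the same CRC16 variant (XMODEM) that Redis Cluster uses. The function names are invented for this example.

```python
import binascii
from collections import defaultdict

NUM_SLOTS = 16384


def key_slot(key: str) -> int:
    """Compute the cluster hash slot for a key, honoring hash tags."""
    data = key.encode("utf-8")
    start = data.find(b"{")
    if start != -1:
        end = data.find(b"}", start + 1)
        # Only a non-empty {...} section counts as a hash tag.
        if end != -1 and end != start + 1:
            data = data[start + 1:end]
    return binascii.crc_hqx(data, 0) % NUM_SLOTS


def group_by_slot(keys):
    """Group keys by hash slot so each group can go out as one multi-key request."""
    groups = defaultdict(list)
    for key in keys:
        groups[key_slot(key)].append(key)
    return dict(groups)
```

Each group could then be sent as a single MGET to whichever node currently owns that slot. Because slots can migrate at any time, the caller still needs to handle MOVED and cross-slot errors by refreshing the topology and retrying.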
I have multiple Redis instances. I made a cluster using different ports. Now I want to transfer the data from the pre-existing Redis instances to the cluster. I know how to transfer data from one instance to the cluster, but when there is more than one instance, I am not able to do it.
You need to define some sort of sharding strategy for your Redis cluster (see Database Sharding). Basically, you need a consistent hashing strategy that decides, given a key, which shard (that is, which Redis instance in your cluster) the key will go to. You also need a script for this data migration that holds an array of all the Redis instances in your cluster.
Then, for each key you read from a standalone Redis, you use the hashing mechanism to find the shard index, pick the matching Redis instance from the list you maintained earlier, and write the data to that cluster node. My assumption in all this is that you have an in-house Redis cluster setup as opposed to the one that Redis Labs provides.
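A migration script along these lines might look like the sketch below. It swaps in a simplification: instead of computing the shard index by hand, it assumes `target` is a cluster-aware client that routes each single-key command to the right node. It also assumes redis-py style methods (`scan_iter`, `dump`, `pttl`, `restore`) on the clients; the function names are hypothetical.

```python
def copy_instance(source, target, batch=1000):
    """Copy every key from one standalone instance into the target; return the count."""
    copied = 0
    for key in source.scan_iter(count=batch):
        payload = source.dump(key)
        if payload is None:
            # The key expired or was deleted between SCAN and DUMP.
            continue
        ttl_ms = source.pttl(key)
        # PTTL is negative when the key has no expiry; RESTORE uses 0 for "no expiry".
        target.restore(key, ttl_ms if ttl_ms > 0 else 0, payload, replace=True)
        copied += 1
    return copied


def migrate(sources, target, batch=1000):
    """Run the copy for each standalone instance in turn."""
    return sum(copy_instance(source, target, batch) for source in sources)
```

With `replace=True`, a key that exists in more than one source instance is overwritten by whichever copy arrives last, so key name collisions between the old instances need to be resolved beforehand. DUMP payloads are also tied to the serialization format, so this is only expected to work when the cluster runs the same or a newer Redis version than the sources.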
I am using 5 databases in my Redis server. I want to evict keys belonging to a particular DB using the LRU mechanism. Is it possible?
I read this: how-to-make-redis-choose-lru-eviction-policy-for-only-some-of-the-keys.
But all my databases use a time to live for their entries, so I can't use the volatile-lru policy.
I tried the volatile-ttl policy, but the other databases have shorter TTLs for their keys, so those keys get evicted, which I don't want.
That's one of the effects of using numbered/shared databases - they all share the same configuration and resources. You should consider using separate Redis servers, one for each of your databases, to have better control over what gets evicted and when. Even more importantly, using dedicated instances allows you to better utilize the cores that your server has.