Redis 4.0
Keys Command can list all required pattern keys
Memory Usage [key] can return the key memory
How to use them together to get sum of the used memory for that pattern keys
You'd have to implement that logic using any language you're most comfortable with. In pseudo code:
Get all key names using KEYS
For each key, get its MEMORY USAGE
Sum up the numbers
Note: don't use KEYS in production, use SCAN.
As #Itamar pointed out, do not use keys <pattern> on production as this command does a complete scan on all the keys in the redis server. This query will degrade the redis performance and almost all of the redis queries will take considerate amount of time (as redis is a single threaded application).
The thing you want to achieve can be achieved via creating a Lua script. Though I would recommend not to use custom solutions, there exists dashboards (like zabbix) for monitoring redis and memory usage.
Related
I have a Spring boot application which is connected to Redis.
I want to perform a redis operation to fetch the keys which matches certain pattern.
I understand this can be achieved in multiple ways
Redis Template and Keys command : But its not suitable to be used on large data sets . As it may block the client (not the server) for long time and also exhaust the server memory due to the response buffer size.
Redis Template and Scan command : Redis docs recommends to use scan command in comparison to Keys. As it does the scanning iteratively which makes faster smaller operations and better on server resources.
Spring Data Redis Repository : Fetch all by creating a hash on the pattern in the Redis Entity.
But i am not sure which will give me overall faster performance to fetch all the matched keys under high load and would be recommended for my scenario.
Best Regards,
Saurav
Redis is single-threaded so traversing all keys on a large dataset (a large number of keys) may block the server for a long time (even several seconds). And so it is not advised to run 'Keys' in production at all.
The scan operation is built to run iteratively but you should note that you might get the same key more than once and also there is a chance that some keys will not be returned. Overall your system will run faster with Scan.
I have several databases in Redis and I want to determine how much each of them takes up in RAM individually.
In Redis documentation there is a command INFO https://redis.io/commands/info but it gives out information on the occupied memory entirely, without splitting into existing databases.
There are also libraries like https://github.com/snmaynard/redis-audit, which essentially exploit the KEYS * command, and then manual calculation.
Is there a built-in ability in Redis to monitor the current value of the memory occupied by each database separately, or maybe there are libraries for this, that are not based on the KEYS * command?
Redis do not offer the separate database memory used. Usually, we attention more on all memory used because of performance.
If your keys and values have less difference in each database, you can use DBSIZE command to sum each database keys, without KEYS * command.
Do not forget SELECT n before DBSIZE command.
I have some scripts that touch a handful of keys. What happens if Elasticache decides to reshard while my script is running? Will it wait for my script to complete before it moves the underlying keys? Or should I assume that it is not the case and design my application with this edge case in mind?
One example would be a script that increment 2 keys at once. I could receive a "cluster error" which means something went wrong and I have to execute my script again (and potentially end up with one key being incremented twice and the other once)
Assuming you are talking about a Lua script, for as long as you're passing the keys in the arguments (and not hardcoded in the script) you should be good. It will be all or nothing. If you are not using a Lua script - consider doing so
From EVAL command:
All Redis commands must be analyzed before execution to determine
which keys the command will operate on. In order for this to be true
for EVAL, keys must be passed explicitly. This is useful in many ways,
but especially to make sure Redis Cluster can forward your request to
the appropriate cluster node.
From AWS ElastiCache - Best Practices: Online Cluster Resizing:
During resharding, we recommend the following:
Avoid expensive commands – Avoid running any computationally and I/O
intensive operations, such as the KEYS and SMEMBERS commands. We
suggest this approach because these operations increase the load on
the cluster and have an impact on the performance of the cluster.
Instead, use the SCAN and SSCAN commands.
Follow Lua best practices – Avoid long running Lua scripts, and always
declare keys used in Lua scripts up front. We recommend this approach
to determine that the Lua script is not using cross slot commands.
Ensure that the keys used in Lua scripts belong to the same slot.
Is there an efficient method to count specific class of keys on a Redis cluster?
Here, 'specific class of keys' means the keys that are used for a common purpose; for example, session keys. They can have a common key name prefix. There can be multiple classes. From now, I will refer the class of keys as simply the keys.
What I want to do is as follows:
Redis cluster must be used.
The keys must be distributed to the nodes of the Redis cluster.
There must be an efficient way to count the number of the keys on all of the nodes of the Redis cluster.
The keys can have TTL - that is, can expire.
The number of the nodes of the Redis cluster can be changed on runtime, and hash slots can be redistributed.
Clients are implemented using Node.js.
I've read the documentation, but could not find a proper solution.
Thanks in advance.
No, basically. That doesn't exist for "classic" (non-cluster), either. To do that without an additional storage mechanism, you would need to use SCAN repeatedly to iterate over the entire keyspace. Fortunately it does at least accept a filter (so you don't need to fetch every key), but is far from efficient - you'd typically only do this periodically as a review feature, not an operational feature. We actually include such a feature in "opserver"'s redis plugin.
When you switch to cluster, you'd need to repeat this but on one of each set of replication verticals. You would typically get that list via the CLUSTER commands, so the dynamic nature of the nodes is moot.
In both classic and cluster, it would be recommended to only do this on a replica - not the master. And again: only as an admin tool, not as a routine part of your system.
Do not use KEYS to do this. Prefer SCAN.
I have a very large set of keys, 200M keys, with small values, <100 bytes, to store and I'm trying to use Redis. The problem is such that I have 10 Redis DB to split the keys over, but currently I'm on a single server with those 10 Redis DB. By a Redis DB I mean using SELECT. From my calculations it looks like I'm going to blow out memory. I think I'll need over 4TB of memory for this case! What are my options? First, my calculation is based on 10000 keys with 100 byte values taking 220MB of RAM (this is from a table I found). So simply put (2*10^8 / 10^4) * 220MB = 4.4TB.
If my calculation looks correct, what are my options? I've read on different posts that Redis VM is no longer an option. Can I use a Redis cluster? This still appears to require too many servers to be practical. I understand I could switch to another DB, but I'd like that to be the last resort option.
Firstly, using shared databases (i.e. the SELECT command) isn't a recommended practice since all of these databases are essentially managed by the same Redis process. It is preferable having 10 separate Redis processes (even on the same server) in order to avoid contention (more info here).
Next, there are ways to reduce the memory footprint of your database. You could, for example, perform client-side compression (see here) or consider other optimizations such as using Hashes to keep multiple values (as described here).
That said, a Redis server is ultimately bound by the amount of RAM that the host provides. Once you've reached that limit you'll need to shard your database and use a Redis cluster. Since you're already using multiple databases this shouldn't pose a big challenge as your code should already be compatible with that to a degree. Sharding can be done in one of three approaches: client, proxy or Redis Cluster. Client-side sharding can be implemented in your code or by the Redis client that you're using (if the client library that you're using supports that). Redis Cluster (v3) is expected to be released in the very near future and already has a stable release candidate. As for proxy-based sharding, there are several open source solutions out there, including Twitter's twemproxy, Netflix's dynomite and codis. Additional information about sharding and partitioning can be found here.
Disclaimer: I work at Redis Labs. Lastly, AFAIK there's only one Redis-as-a-Service provider that already provides built-in support for clustering Redis. Redis Labs' Redis Cloud is a fully-managed service that can scale seamlessly to any required capacity. Our clusters support both the '{}' hashtag standard as well as sharding by RegEx - more about this can be found here.
You can use LMDB with Dynomite to store data beyond your memory capacity. LMDB uses both disk and memory to store data. Dynomite make LMDB to be distributed.
We have done a POC with this combo and they work nicely together.
For more information, please check out our open issue here:
https://github.com/Netflix/dynomite/issues/254