When I create an Apache Ignite cache, I believe it is implemented as a HashMap.
Is it possible to configure it as a TreeMap instead, so that the order of the keys is guaranteed?
I don't think this is possible, since at different times keys may belong to different nodes, so iteration order will depend on cluster topology.
If the cache in question is small, you can use ORDER BY _key together with SQL.
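A minimal sketch of that approach, assuming a cache that is SQL-enabled via indexed types (the cache name and key/value types here are illustrative):

```java
import java.util.List;

import org.apache.ignite.Ignite;
import org.apache.ignite.IgniteCache;
import org.apache.ignite.Ignition;
import org.apache.ignite.cache.query.SqlFieldsQuery;
import org.apache.ignite.configuration.CacheConfiguration;

public class OrderedKeys {
    public static void main(String[] args) {
        try (Ignite ignite = Ignition.start()) {
            CacheConfiguration<Integer, String> cfg = new CacheConfiguration<>("orderedCache");
            // Indexed types expose the cache to the SQL engine; the table name
            // defaults to the value type's simple name ("String").
            cfg.setIndexedTypes(Integer.class, String.class);
            IgniteCache<Integer, String> cache = ignite.getOrCreateCache(cfg);

            cache.put(3, "c");
            cache.put(1, "a");
            cache.put(2, "b");

            // _key and _val are implicit columns on every SQL-enabled cache.
            SqlFieldsQuery qry = new SqlFieldsQuery("SELECT _key, _val FROM String ORDER BY _key");
            for (List<?> row : cache.query(qry))
                System.out.println(row.get(0) + " -> " + row.get(1));
        }
    }
}
```

Keep in mind that the ordering happens at query time; the underlying storage remains unordered.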
Problem
I need to implement a bi-directional key-value store. A cache entry is populated into the cache using a key (String) mapping to a value (Long). However, the cache needs to handle both key->value and value->key lookups. What is the most efficient way to do this?
Note: the value->key mapping does not need to be synchronized with key->value in real time, but it cannot fall behind the key->value mapping for too long (10-15 seconds max).
Naive Ignite Implementation
Use an Ignite continuous query to monitor changes; upon modification, update another (value->key) cache.
Is there a better way of achieving this goal? Is Ignite the wrong tool to use?
You can use SQL queries with a secondary index on the value. Your proposed continuous-query approach should also work; a sketch follows.
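A sketch of the continuous-query variant, assuming two caches named forward and reverse (the names and the sample entry are illustrative):

```java
import javax.cache.Cache;
import javax.cache.event.CacheEntryEvent;

import org.apache.ignite.Ignite;
import org.apache.ignite.IgniteCache;
import org.apache.ignite.Ignition;
import org.apache.ignite.cache.query.ContinuousQuery;
import org.apache.ignite.cache.query.QueryCursor;
import org.apache.ignite.cache.query.ScanQuery;

public class ReverseIndex {
    public static void main(String[] args) {
        Ignite ignite = Ignition.start();
        IgniteCache<String, Long> forward = ignite.getOrCreateCache("forward");
        IgniteCache<Long, String> reverse = ignite.getOrCreateCache("reverse");

        ContinuousQuery<String, Long> qry = new ContinuousQuery<>();
        // Fires on every create/update; mirror the entry into the reverse cache.
        qry.setLocalListener(events -> {
            for (CacheEntryEvent<? extends String, ? extends Long> e : events)
                reverse.put(e.getValue(), e.getKey());
        });
        // The initial query delivers pre-existing entries through the cursor below.
        qry.setInitialQuery(new ScanQuery<>());

        QueryCursor<Cache.Entry<String, Long>> cursor = forward.query(qry);
        for (Cache.Entry<String, Long> e : cursor)
            reverse.put(e.getValue(), e.getKey());
        // Keep the cursor open: closing it cancels the continuous query.

        forward.put("alice", 42L);
    }
}
```

The listener runs asynchronously with respect to the writer, which matches the stated tolerance of a few seconds of lag between the two mappings.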
Redis 4.0
The KEYS command can list all keys matching a given pattern.
MEMORY USAGE [key] can return the memory used by a key.
How can I use them together to get the sum of the memory used by all keys matching that pattern?
You'd have to implement that logic using any language you're most comfortable with. In pseudo code:
Get all key names using KEYS
For each key, get its MEMORY USAGE
Sum up the numbers
Note: don't use KEYS in production, use SCAN.
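A sketch of those steps in Java with Jedis, using SCAN instead of KEYS per the note above (the pattern, connection details, and class name are assumptions):

```java
import redis.clients.jedis.Jedis;
import redis.clients.jedis.ScanParams;
import redis.clients.jedis.ScanResult;

public class PatternMemory {
    public static void main(String[] args) {
        long totalBytes = 0;
        try (Jedis jedis = new Jedis("localhost", 6379)) {
            ScanParams params = new ScanParams().match("session:*").count(500);
            String cursor = ScanParams.SCAN_POINTER_START; // "0"
            do {
                ScanResult<String> page = jedis.scan(cursor, params);
                for (String key : page.getResult()) {
                    Long bytes = jedis.memoryUsage(key); // null if the key expired meanwhile
                    if (bytes != null) totalBytes += bytes;
                }
                cursor = page.getCursor();
            } while (!"0".equals(cursor)); // cursor "0" means the scan is complete
        }
        System.out.println("Approximate memory for session:* keys: " + totalBytes + " bytes");
    }
}
```

The total is approximate: keys can be created, expire, or change size while the scan is in progress.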
As @Itamar pointed out, do not use KEYS <pattern> in production, as this command does a complete scan of all the keys in the Redis server. Such a query will degrade Redis performance, and almost all other Redis queries will take a considerable amount of time (as Redis is a single-threaded application).
What you want can also be achieved by writing a Lua script. That said, I would recommend against custom solutions; there exist dashboards (like Zabbix) for monitoring Redis and its memory usage.
Is there an efficient method to count specific class of keys on a Redis cluster?
Here, a 'specific class of keys' means keys that are used for a common purpose, for example session keys. They can have a common key-name prefix, and there can be multiple classes. From now on, I will refer to such a class of keys simply as 'the keys'.
What I want to do is as follows:
Redis cluster must be used.
The keys must be distributed to the nodes of the Redis cluster.
There must be an efficient way to count the number of the keys on all of the nodes of the Redis cluster.
The keys can have TTL - that is, can expire.
The number of the nodes of the Redis cluster can be changed on runtime, and hash slots can be redistributed.
Clients are implemented using Node.js.
I've read the documentation, but could not find a proper solution.
Thanks in advance.
No, basically. That doesn't exist for "classic" (non-cluster) Redis, either. To do this without an additional storage mechanism, you would need to use SCAN repeatedly to iterate over the entire keyspace. Fortunately, it does at least accept a filter (so you don't need to fetch every key), but it is far from efficient: you'd typically only do this periodically as a review feature, not as an operational feature. We actually include such a feature in "opserver"'s Redis plugin.
When you switch to cluster, you'd need to repeat this on one node from each set of replication verticals. You would typically get that list via the CLUSTER commands, so the dynamic nature of the nodes is moot.
In both classic and cluster, it would be recommended to only do this on a replica - not the master. And again: only as an admin tool, not as a routine part of your system.
Do not use KEYS to do this. Prefer SCAN.
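A sketch of that cluster-wide count in Java with Jedis (the addresses, key prefix, and version-specific client API are assumptions; the same flow applies to a Node.js client such as ioredis):

```java
import redis.clients.jedis.HostAndPort;
import redis.clients.jedis.Jedis;
import redis.clients.jedis.JedisCluster;
import redis.clients.jedis.JedisPool;
import redis.clients.jedis.ScanParams;
import redis.clients.jedis.ScanResult;

public class ClusterKeyCount {
    public static void main(String[] args) {
        long total = 0;
        try (JedisCluster cluster = new JedisCluster(new HostAndPort("127.0.0.1", 7000))) {
            for (JedisPool pool : cluster.getClusterNodes().values()) {
                try (Jedis node = pool.getResource()) {
                    // Counting on masters keeps the sketch simple (each key is
                    // counted once); per the advice above, production use should
                    // target one replica per shard instead.
                    if (node.info("replication").contains("role:slave")) continue;
                    ScanParams params = new ScanParams().match("session:*").count(1000);
                    String cursor = ScanParams.SCAN_POINTER_START;
                    do {
                        ScanResult<String> page = node.scan(cursor, params);
                        total += page.getResult().size();
                        cursor = page.getCursor();
                    } while (!"0".equals(cursor));
                }
            }
        }
        System.out.println("Keys matching session:*: " + total);
    }
}
```

Expired keys drop out of the keyspace and out of SCAN results, so TTLs are handled naturally; the count is a point-in-time approximation while data or topology changes.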
I have a very large set of keys (200M keys with small values, under 100 bytes each) to store, and I'm trying to use Redis. The problem is that I have 10 Redis DBs to split the keys over, but currently I'm on a single server with those 10 Redis DBs (by a Redis DB I mean one selected with SELECT). From my calculations it looks like I'm going to blow out memory: I think I'll need over 4TB of RAM for this case! What are my options? My calculation is based on 10,000 keys with 100-byte values taking 220MB of RAM (this is from a table I found). So simply put: (2*10^8 / 10^4) * 220MB = 4.4TB.
If my calculation looks correct, what are my options? I've read in various posts that Redis VM is no longer an option. Can I use a Redis cluster? That still appears to require too many servers to be practical. I understand I could switch to another DB, but I'd like that to be a last-resort option.
Firstly, using shared databases (i.e. the SELECT command) isn't a recommended practice, since all of these databases are essentially managed by the same Redis process. It is preferable to have 10 separate Redis processes (even on the same server) in order to avoid contention (more info here).
Next, there are ways to reduce the memory footprint of your database. You could, for example, perform client-side compression (see here) or consider other optimizations such as using Hashes to keep multiple values (as described here).
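As a concrete illustration of that Hash-based optimization, small values under numeric keys can be bucketed into Hashes so Redis can use its compact ziplist encoding. A sketch, where the bucket size, key prefix, and class name are illustrative:

```java
import redis.clients.jedis.Jedis;

public class HashBucketing {
    // Keep buckets below hash-max-ziplist-entries (128 by default, and
    // configurable) so each bucket stays in the memory-efficient encoding.
    private static final long BUCKET_SIZE = 100;

    // Store id -> value as a field inside a bucket hash, e.g. id 1234567
    // becomes HSET obj:12345 67 value.
    public static void set(Jedis jedis, long id, String value) {
        jedis.hset("obj:" + (id / BUCKET_SIZE), String.valueOf(id % BUCKET_SIZE), value);
    }

    public static String get(Jedis jedis, long id) {
        return jedis.hget("obj:" + (id / BUCKET_SIZE), String.valueOf(id % BUCKET_SIZE));
    }
}
```

The trade-off is that per-key features such as TTLs no longer apply to individual values, only to whole buckets.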
That said, a Redis server is ultimately bound by the amount of RAM that the host provides. Once you've reached that limit you'll need to shard your database and use a Redis cluster. Since you're already using multiple databases this shouldn't pose a big challenge as your code should already be compatible with that to a degree. Sharding can be done in one of three approaches: client, proxy or Redis Cluster. Client-side sharding can be implemented in your code or by the Redis client that you're using (if the client library that you're using supports that). Redis Cluster (v3) is expected to be released in the very near future and already has a stable release candidate. As for proxy-based sharding, there are several open source solutions out there, including Twitter's twemproxy, Netflix's dynomite and codis. Additional information about sharding and partitioning can be found here.
Disclaimer: I work at Redis Labs. Lastly, AFAIK there's only one Redis-as-a-Service provider that already provides built-in support for clustering Redis. Redis Labs' Redis Cloud is a fully-managed service that can scale seamlessly to any required capacity. Our clusters support both the '{}' hashtag standard as well as sharding by RegEx - more about this can be found here.
You can use LMDB with Dynomite to store data beyond your memory capacity. LMDB uses both disk and memory to store data, and Dynomite makes LMDB distributed.
We have done a POC with this combination, and the two work nicely together.
For more information, please check out our open issue here:
https://github.com/Netflix/dynomite/issues/254
I am new to Redis. I need to know how sorting, intersection, and other aggregate operations happen across shards. Is it possible to perform such operations?
Redis won't transparently handle this for you. You would need to retrieve the results from each shard and then reassemble them (assuming your search is on something other than the sharded key). Some libraries make sharding with Redis easier (see predis: https://github.com/nrk/predis). Basically, though, what you would do is run the query against all the shards, bring back the results, and then merge, sort, intersect, aggregate, etc., as sketched below.
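A scatter-gather sketch of that merge in Java with Jedis (the answer mentions predis for PHP; here the shard addresses and the sorted-set name are assumptions):

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

import redis.clients.jedis.Jedis;
import redis.clients.jedis.Tuple;

public class ShardedUnion {
    public static void main(String[] args) {
        List<Jedis> shards = List.of(new Jedis("shard1", 6379), new Jedis("shard2", 6379));

        // Scatter: pull the sorted set from every shard.
        Map<String, Double> merged = new HashMap<>();
        for (Jedis shard : shards)
            for (Tuple t : shard.zrangeWithScores("scores", 0, -1))
                merged.merge(t.getElement(), t.getScore(), Double::sum); // aggregate duplicates

        // Gather: re-sort the merged result by score, descending.
        List<Map.Entry<String, Double>> sorted = new ArrayList<>(merged.entrySet());
        sorted.sort(Map.Entry.<String, Double>comparingByValue().reversed());
        sorted.forEach(e -> System.out.println(e.getKey() + ": " + e.getValue()));
    }
}
```

This is essentially what ZUNIONSTORE would do on a single node, performed client-side because the sets live on different shards.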
You may want to keep an eye on the Redis Cluster project (http://redis.io/topics/cluster-spec), as it might provide what you want without manual sharding, although it is only in development at the moment.
Finally, you should also be aware that sharding does not provide any redundancy, nor any facility to rebalance a shard when you add or remove nodes. If a shard is gone, you lose all the data in that shard. This is fine if you are using Redis as a cache and not as the final authoritative store of your data. Because of this, make sure you also consider master-slave replication for each shard.