Best way to analyze ~500K keys in Redis?

I have a database system that processes read operations against logical objects identified by OIDs (object identifiers), for example 1.2.4.3.544. Users are identified by GUIDs. The API gets results from the DB and puts them into Redis. An example key:
SomePrefix_<oid>1.2.4.3.544</oid>_somedetails_<user>1f0c6cfe-ee9d-472c-b320-190df55f9527</user>
There are a couple of hundred unique OIDs in the system and about a hundred registered users. Keys for anonymous requests have no user part. The number of keys can vary from 80K to, I suppose, 500K.
I need to provide per-OID and per-user statistics to the UI and implement delete-per-OID (and the same per user), so the task splits into two parts.
My first version was unsuccessful: I used the .Keys("*") method in the C# app, which turns into a Redis SCAN over all keys, to pull every key into the app, run through them, collect the distinct OIDs/users and show them on the UI. Pulling all keys into the app took too long, so I switched to another approach: on every key save/delete I incremented/decremented counters stored in another Redis DB. The counters are plain integers stored under their own keys. This was almost okay, but then I got a requirement to set a TTL on every cached request, so now I face a dilemma: how do I track my statistics so that they stay in sync with the keys actually stored? The options I see:
A) Run a Lua script that scans all keys, collects the unique OIDs and unique users with counts, and returns them to the app. For deletion, run SCAN inside a Lua script and DEL all keys matching the pattern. Pros: no need for separate stats tracking. Cons: every call has to scan all keys. I have a Zabbix dashboard showing statistics requested through my app, and scanning the keys every N seconds could be painful.
B) In a separate Redis DB, store sets where the key is the OID (or, for users, the GUID) and the members are the keys of the cached requests. But how would I remove members when the cached-request keys expire? Can I somehow link a value stored in a set to a key in another DB so that the value disappears from the set when the key expires?
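To make the counting step in option A concrete, here is a minimal sketch in Python (chosen only for brevity; the thread itself uses C#). It assumes keys follow the layout shown above, with an optional user part. The function name collect_stats, the "(anonymous)" placeholder, and the second and third sample keys are made up for illustration.

```python
import re
from collections import Counter

# Key layout taken from the example key in the question; the <user> part is optional.
OID_RE = re.compile(r"<oid>(.*?)</oid>")
USER_RE = re.compile(r"<user>(.*?)</user>")


def collect_stats(keys):
    """Return (per_oid, per_user) Counters for an iterable of cache keys."""
    per_oid, per_user = Counter(), Counter()
    for key in keys:
        oid = OID_RE.search(key)
        if oid:
            per_oid[oid.group(1)] += 1
        user = USER_RE.search(key)
        # Anonymous requests carry no <user> part; count them under a placeholder.
        per_user[user.group(1) if user else "(anonymous)"] += 1
    return per_oid, per_user


# Illustrative keys only: the first is from the question, the others are invented.
sample_keys = [
    "SomePrefix_<oid>1.2.4.3.544</oid>_somedetails_<user>1f0c6cfe-ee9d-472c-b320-190df55f9527</user>",
    "SomePrefix_<oid>1.2.4.3.544</oid>_somedetails",
    "SomePrefix_<oid>1.2.4.3.545</oid>_otherdetails_<user>1f0c6cfe-ee9d-472c-b320-190df55f9527</user>",
]
per_oid, per_user = collect_stats(sample_keys)
```

The same function works on any iterable of key names, so it could be fed from an incremental SCAN cursor rather than a fully materialized key list.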

Related

Is there a way to increment all fields in a Redis hash at once?

I want to use Redis to cache a large amount of highly dynamic counters. In my case users are subscribing to sources and I want to cache each user's counter for that source. When a new item arrives at the source it's natural that the counters for all users subscribed to this source should be incremented by 1. Some sources have hundreds of thousands of subscribers, so it's important that this happens immediately.
However, Redis doesn't have a native method to increment all hash fields at once, only HINCRBY. Is there a solution for this?
While searching, I stumbled upon some threads where people wanted a bulk version of HINCRBY, but my case is different. I want to increment all fields in the hash.

This can be done with a Lua script; see:
https://redis.io/commands/evalsha
https://redis.io/commands/eval
You can use a Lua script to batch several operations on a single hash.

Query multiple keys in Redis in Cluster mode

I'm using Redis in Cluster mode (6 nodes: 3 masters and 3 slaves) with SE.Redis. However, commands with multiple keys in different hash slots are not supported as usual,
so I'm using hash tags ({}) to make sure that a certain key belongs to a particular hash slot. For example, I have 2 keys like cacheItem:{1} and cacheItem:{94770}.
I set those keys using (each key in a separate request):
SEclient.Database.StringSet(key,value)
This works fine,
but now I want to query key1 and key2, which belong to different hash slots:
SEclient.Database.StringGet(redisKeys);
This fails and throws an exception because the keys belong to multiple hash slots.
When querying, I can't guarantee that my keys will belong to the same hash slot.
This example has just 2 keys, but I have hundreds of keys that I want to query.
So I have the following questions:
How can I query multiple keys when they belong to different hash slots?
What's the best practice for doing that?
Should I calculate hash slots on my side and then send individual requests per hash slot?
Can I use Twemproxy for my scenario?
Any help is highly appreciated.

I can’t speak to SE.Redis, but you are on the right track. You either need to:
Make individual requests per key to ensure they go to the right cluster node, or...
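Precalculate the hash slot and server each key belongs to and group the keys accordingly, then send MGET requests with those keys to each host that owns them. Note that Redis Cluster rejects a multi-key command whose keys span more than one hash slot (a CROSSSLOT error), even if a single node owns all of those slots, so each MGET should carry keys from one slot only.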
Precalculating will require you (or your client) to know the cluster topology (hash slot owners) and the Redis key hashing method (don’t worry, it is simple and well documented) up front.
You can query cluster info from Redis to get owned slots.
The basic hashing algorithm is HASH_SLOT = CRC16(key) mod 16384. Search around and you can find code for that in just about any language 🙂 Remember that the use of hash tags makes this more complicated! See also: https://redis.io/commands/cluster-keyslot
Some Redis cluster clients will do this for you with internal magic (e.g. Lettuce in Java), but they are not all created equal 🙂
Also be aware that cluster topology can change at basically any time, and the above work is complicated. To be durable you’ll want to have retries if you get cross slot errors. Or you can just make many requests for single keys as it is much much simpler to maintain.
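As a rough illustration of the precalculation described above, here is a sketch in Python (not SE.Redis) that computes the hash slot for a key, including the hash-tag rule, and groups keys by slot. It relies on binascii.crc_hqx with an initial value of 0, which is the CRC16/XMODEM variant; the function names are illustrative. Mapping slots to the nodes that own them is left out and would need the topology from the cluster itself.

```python
import binascii
from collections import defaultdict

NUM_SLOTS = 16384


def hash_slot(key: str) -> int:
    """HASH_SLOT = CRC16(key) mod 16384, honoring {hash tags}."""
    data = key.encode("utf-8")
    start = data.find(b"{")
    if start != -1:
        end = data.find(b"}", start + 1)
        # Only a non-empty {...} section counts as a hash tag;
        # otherwise the whole key is hashed.
        if end != -1 and end != start + 1:
            data = data[start + 1:end]
    # crc_hqx with initial value 0 is CRC16/XMODEM.
    return binascii.crc_hqx(data, 0) % NUM_SLOTS


def group_by_slot(keys):
    """Group keys so that each group can go into a single multi-key request."""
    groups = defaultdict(list)
    for key in keys:
        groups[hash_slot(key)].append(key)
    return dict(groups)


# "other:{1}" is an invented key that shares the hash tag of "cacheItem:{1}".
groups = group_by_slot(["cacheItem:{1}", "cacheItem:{94770}", "other:{1}"])
```

Each group could then be sent as one MGET to whichever node currently owns that slot, with a retry path for the case where the topology changed in the meantime.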

remove all matching keys

I want to remove all keys matching SomePrefix* from my Redis. Is it possible?
I see only m_connectionMultiplexer.GetDatabase().KeyDelete(), but no KeyMatch() or GetAllKeys() within the library.
Preferably without Lua scripting, such as the approach linked by Leonid Beschastny.
I want to use this during initialization of the web application while it is in development.

SE.Redis directly mimics the features exposed by the server. The server does not have a "delete keys matching this pattern" feature. It does have "scan for keys matching this pattern" (via GetServer().GetKeys(...)), and it has "delete this key / these keys" (via GetDatabase().KeyDelete(...)). You could iterate in batches over the matching keys, deleting each batch in turn. Because you can pipeline requests, you don't pay latency per batch.
As an alternative implementation: partition the data by numeric database (SELECT) or by server, and use FLUSHDB / FLUSHALL.
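The batch-and-delete loop can be sketched as follows, in Python rather than SE.Redis, with an in-memory dict standing in for the server so the example runs on its own. The names batched, delete_matching, fake_scan and fake_delete are made up for illustration; in a real client the two callables would be replaced by the library's scan and delete calls.

```python
from itertools import islice


def batched(iterable, size):
    """Yield lists of at most `size` items from `iterable`."""
    it = iter(iterable)
    while True:
        batch = list(islice(it, size))
        if not batch:
            return
        yield batch


def delete_matching(scan_keys, delete_keys, batch_size=500):
    """Delete keys batch by batch.

    scan_keys:   iterable of matching key names (hypothetical stand-in for a scan)
    delete_keys: callable that deletes a list of keys and returns how many it removed
    """
    deleted = 0
    for batch in batched(scan_keys, batch_size):
        deleted += delete_keys(batch)
    return deleted


# In-memory stand-in for the server, for demonstration only.
store = {f"SomePrefix{i}": i for i in range(1200)}
store["Other"] = 0


def fake_scan(prefix):
    # Snapshot the matching keys so the store can be modified while deleting.
    return [k for k in list(store) if k.startswith(prefix)]


def fake_delete(keys):
    return sum(1 for k in keys if store.pop(k, None) is not None)


removed = delete_matching(fake_scan("SomePrefix"), fake_delete)
```

The batch size of 500 is an arbitrary choice; smaller batches keep individual delete calls short at the cost of more round trips.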

Redis Keyspace Notifications and Key Expiration

The documentation about Redis Keyspace Notifications http://redis.io/topics/notifications says near its end, that a key with timeout is removed from the database
"When the key is accessed by a command and is found to be expired."
..
Question: Is it enough to retrieve just the key itself, e.g. via KEYS *, or do I have to access the content the key refers to?
Background: The second process, which I omitted (the .. above), is probabilistic, so the actual deletion of an expired key may be delayed, and with it the delivery of the EXPIRED event. I want to ensure the notification reaches a subscriber, so simply accessing the keys would be the easiest way.

Redis periodically checks keys for expiry: on each pass it picks a small random sample of keys that have a TTL (the EXPIRE documentation describes 20 keys per pass, ten passes per second) and deletes the ones that have expired.
As I understand it, you're concerned that with this logic some keys will already have expired but not yet been deleted, so their EXPIRED events will be delayed.
To avoid that, simply checking the keys for existence is enough to delete them. Keep the cost of the Redis calls in mind, though: design a Lua script or bulk command that is invoked periodically, iterates over a list of keys and runs EXISTS on each, which deletes any that have expired.
To test this you would need a large dataset.

How to list all Memcachier's keys on Heroku?

Is it possible to list all memcachier keys of a Rails app? My app used just 3 keys and there are more than 30 on Memcachier app's page.
Thanks

Use this script: https://gist.github.com/bkimble/1365005
It can be used from within your own apps as well.

You can't list all keys in memcached. memcached is a cache, not a database; if you need to consistently retrieve all keys, then memcached is probably not the tool you want to use.
With that in mind, 2 things:
It's actually possible to retrieve the first meg or so of keys: http://www.darkcoding.net/software/memcached-list-all-keys/ . Your prod server should not depend on this.
You could set up a system where you keep a key in memcached (named, for example, index) whose value is a list of all the keys stored. Every time you add or delete a key, you also update index's list of keys. You can then retrieve index to get a list of all the keys. However, keep in mind that memcached can evict keys before they expire, so your app shouldn't rely on this technique for critical stuff.
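The index-key idea can be sketched like this, in Python with a tiny in-memory client standing in for memcached so the example runs on its own. IndexedCache and FakeClient are hypothetical names, and the wrapper only assumes a client exposing get, set and delete. The read-modify-write on the index is not atomic here, so concurrent writers could lose updates; a real setup would need something like compare-and-swap.

```python
class IndexedCache:
    """Wraps a cache client and mirrors the stored key names under one index key."""

    INDEX_KEY = "index"

    def __init__(self, client):
        self.client = client  # anything exposing get / set / delete

    def set(self, key, value):
        self.client.set(key, value)
        keys = self.client.get(self.INDEX_KEY) or []
        if key not in keys:
            self.client.set(self.INDEX_KEY, keys + [key])

    def delete(self, key):
        self.client.delete(key)
        keys = self.client.get(self.INDEX_KEY) or []
        self.client.set(self.INDEX_KEY, [k for k in keys if k != key])

    def keys(self):
        # May be stale: the cache can evict entries (or the index itself) at any time.
        return list(self.client.get(self.INDEX_KEY) or [])


class FakeClient:
    """In-memory stand-in for a memcached client, for demonstration only."""

    def __init__(self):
        self.data = {}

    def get(self, key):
        return self.data.get(key)

    def set(self, key, value):
        self.data[key] = value

    def delete(self, key):
        self.data.pop(key, None)


cache = IndexedCache(FakeClient())
cache.set("a", 1)
cache.set("b", 2)
cache.delete("a")
listed = cache.keys()
```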