Are smaller Redis keys more efficient?

I am working on implementing a page cache using Redis. It works, but I am currently using the URL as the Redis key. Hashing will of course cost me CPU time, but it will make the key smaller and less complicated. Will smaller, hashed Redis keys outperform keys based on the page URL?
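For concreteness, here is a rough sketch in Python with redis-py of the two variants being weighed; the page: prefix, TTL, and connection details are assumptions, not anything stated in the question:

```python
import hashlib

import redis

r = redis.Redis(host="localhost", port=6379)  # assumed local instance


def cache_page_raw(url: str, html: str, ttl: int = 300) -> None:
    # Variant 1: the URL itself is the key (long, but no hashing cost).
    r.set("page:" + url, html, ex=ttl)


def cache_page_hashed(url: str, html: str, ttl: int = 300) -> None:
    # Variant 2: a SHA-1 digest gives a fixed 40-character key no matter
    # how long the URL is, at the cost of one hash computation per request.
    key = "page:" + hashlib.sha1(url.encode("utf-8")).hexdigest()
    r.set(key, html, ex=ttl)
```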

Related

Fetch keys with matched pattern

I have a Spring Boot application which is connected to Redis.
I want to perform a Redis operation to fetch the keys which match a certain pattern.
I understand this can be achieved in multiple ways:
RedisTemplate and the KEYS command: but it is not suitable for large data sets, as it may block the client (not the server) for a long time and also exhaust server memory due to the response buffer size.
RedisTemplate and the SCAN command: the Redis docs recommend SCAN over KEYS, as it scans iteratively, which makes for smaller, faster operations and is easier on server resources.
Spring Data Redis repository: fetch all by creating a hash on the pattern in the Redis entity.
But I am not sure which will give me the fastest overall performance for fetching all the matched keys under high load, and which would be recommended for my scenario.
Best Regards,
Saurav
Redis is single-threaded, so traversing all keys of a large dataset (a large number of keys) may block the server for a long time (even several seconds). That is why it is not advised to run KEYS in production at all.
The SCAN operation is built to run iteratively, but note that you might get the same key more than once, and keys that are added or removed while the scan is in progress may be missed. Overall, your system will run faster with SCAN.
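For reference, a minimal sketch of the SCAN approach using Python's redis-py (the pattern, batch size, and connection details are assumptions); the same iterative idea applies from Spring Data Redis:

```python
import redis

r = redis.Redis(host="localhost", port=6379)  # assumed local instance

# scan_iter wraps the SCAN cursor loop: each round trip returns a small
# batch of keys, so the server is never blocked the way a single KEYS
# call over millions of keys would block it.
matched = set()  # use a set, because SCAN may return the same key twice
for key in r.scan_iter(match="session:*", count=500):
    matched.add(key)

print(f"found {len(matched)} keys matching session:*")
```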

Is there an extra cost to cache misses on Redis

Is there an advantage to setting a default value for an entry that will be heavily queried in Redis, or will querying the unset key take the same time?
Given that the keys are stored in a distributed hash, Redis will have to check that the key is not in the bucket before returning on a miss, which may be a bit slower than finding a hit and stopping there. Is the bucket sorted or linear? Does anything else make one slower than the other?
Redis is set up in a cluster and has many millions of entries in this case.
I'm assuming you're just talking about strings and hashes here (so the only operations you care about are SET/GET, maybe HGET/HSET). From Redis' perspective, a cache hit and a cache miss have the same time complexity; if anything, a cache miss will be faster, because Redis will not have to transfer any data back over the socket to your app.
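As a tiny illustration with redis-py (the key name and fallback function are hypothetical):

```python
import redis

r = redis.Redis(host="localhost", port=6379)  # assumed local instance


def render_page(path: str) -> bytes:
    return b"<html>rendered</html>"  # stand-in for the real page renderer


value = r.get("page:/pricing")  # key was never set, so this is a miss
if value is None:
    # The miss did the same bucket lookup a hit would have done; the reply
    # was simply nil (None here) instead of a payload over the socket.
    value = render_page("/pricing")
    r.set("page:/pricing", value, ex=60)
```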

Efficiency of hashing a long string Redis key before storing

tldr version:
I have a long encoded JSON payload that I store in Redis as a key. I would like to know if hashing the key before storing it will improve lookup performance, and which hashing algorithm is recommended (I'm considering MD5/SHA-1).
P.S. I'm using Python for the code.
other notes:
The TTL for the key is short (30 seconds), hence I don't care about hash collisions.
I only need to check whether a key exists in Redis.
long story version:
I have a stream of transactions in JSON, encoded in protobuf, flowing into my application via a message queue at a high rate. I run worker nodes that read the data from the queue and process it. However, I realized that there were instances where duplicates were being sent.
My solution was to store the data in a "global" cache (Redis) that my workers check before attempting to process. As the flow rate is high, decoding and reading the data is expensive, hence I'm storing the strings whole.
Transactions expire after 30 seconds, so I use a TTL of 30 seconds.
Therefore I'm wondering whether hashing the strings before storing them would be a good idea, as I only need to check for existence.
Thanks for reading.
You really don't need a cryptographic hash. You want the fastest non-cryptographic algorithm that is good at collision avoidance.
Here is a good discussion of various options.
Fastest hash for non-cryptographic uses?
The Redis documentation discusses optimal key size here:
https://redis.io/topics/data-types-intro under the section "Redis keys"
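A possible shape of that dedup check in Python with redis-py, hashing the still-encoded payload and relying on SET NX EX as an atomic check-and-mark; the txn: prefix and the MD5 choice are assumptions (MD5 here is only a fast fingerprint, not a security measure):

```python
import hashlib

import redis

r = redis.Redis(host="localhost", port=6379)  # assumed local instance


def is_duplicate(raw_payload: bytes, ttl: int = 30) -> bool:
    # Fingerprint the raw, still-encoded payload so we never pay for decoding.
    key = "txn:" + hashlib.md5(raw_payload).hexdigest()
    # SET NX EX is atomic: only the first worker to see this payload within
    # the TTL window gets True, so the check and the mark are a single step.
    first_time = r.set(key, b"1", nx=True, ex=ttl)
    return not first_time
```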

Redis Combined Keys and Memory Usage command

Redis 4.0
The KEYS command can list all keys matching a given pattern.
MEMORY USAGE [key] returns the memory used by a key.
How do I use them together to get the sum of the memory used by the keys matching that pattern?
You'd have to implement that logic using any language you're most comfortable with. In pseudo code:
Get all key names using KEYS
For each key, get its MEMORY USAGE
Sum up the numbers
Note: don't use KEYS in production, use SCAN.
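That pseudo code could look roughly like this in Python with redis-py, using SCAN as the note advises (the key pattern and batch size are assumptions):

```python
import redis

r = redis.Redis(host="localhost", port=6379)  # assumed local instance

total_bytes = 0
for key in r.scan_iter(match="cache:*", count=500):
    # MEMORY USAGE returns the bytes attributed to the key, or None if the
    # key expired between the scan batch and this call.
    used = r.memory_usage(key)
    if used is not None:
        total_bytes += used

print(f"~{total_bytes / 1024:.1f} KiB used by cache:* keys")
```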
As @Itamar pointed out, do not use KEYS <pattern> in production, as this command does a complete scan of all the keys in the Redis server. This query will degrade Redis performance, and almost all other Redis queries will take a considerable amount of time (as Redis is a single-threaded application).
What you want to achieve can also be done with a Lua script. That said, rather than a custom solution, I would recommend existing dashboards (like Zabbix) for monitoring Redis and its memory usage.

Query multiple keys in Redis in Cluster mode

I'm using Redis in Cluster mode (6 nodes: 3 masters and 3 slaves) and I'm using SE.Redis. However, commands with multiple keys in different hash slots are not supported, as usual,
so I'm using hash tags ({}) to make sure a given key belongs to a particular hash slot. For example, I have 2 keys like cacheItem:{1} and cacheItem:{94770}.
I set those keys using (each key in a separate request):
SEclient.Database.StringSet(key,value)
This works fine,
but now I want to query both keys, which belong to different hash slots:
SEclient.Database.StringGet(redisKeys);
The above fails and throws an exception because those keys belong to different hash slots.
While querying keys, I can't make sure that my keys will belong to the same hash slot,
and this example is just 2 keys; I have hundreds of keys that I want to query.
So I have the following questions:
How can I query multiple keys when they belong to different hash slots?
What's the best practice to do that?
Should I calculate hash slots on my side and then send individual requests per hash slot?
Can I use Twemproxy for my scenario?
Any help is highly appreciated.
I can’t speak to SE.Redis, but you are on the right track. You either need to:
Make individual requests per key to ensure they go to the right cluster node, or...
Precalculate the shard + server each key belongs to, grouping by the host. Then send MGET requests with those keys to each host that owns them
Precalculating will require you (or your client) to know the cluster topology (hash slot owners) and the Redis key hashing method (don’t worry, it is simple and well documented) up front.
You can query cluster info from Redis to get owned slots.
The basic hashing algorithm is HASH_SLOT = CRC16(key) mod 16384. Search around and you can find code for that in about any language 🙂 Remember that the use of hash tags makes this more complicated! See also: https://redis.io/commands/cluster-keyslot
Some Redis cluster clients will do this for you with internal magic (e.g. Lettuce in Java), but they are not all created equal 🙂
Also be aware that cluster topology can change at basically any time, and the above work is complicated. To be durable you'll want retries if you get cross-slot errors. Or you can just make many requests for single keys, as that is much, much simpler to maintain.
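To make the grouping approach concrete, here is a rough sketch using Python's redis-py cluster client rather than SE.Redis (the key names and connection details are assumptions). Its keyslot() helper computes CRC16(key) mod 16384 client-side, honoring hash tags, and the cluster-aware client routes each per-slot MGET to the node that owns that slot:

```python
from collections import defaultdict

from redis.cluster import RedisCluster

# Any reachable node will do; the client discovers the rest of the cluster.
rc = RedisCluster(host="localhost", port=6379)

keys = ["cacheItem:{1}", "cacheItem:{94770}", "cacheItem:{42}"]

# Group keys by hash slot: keys that share a slot always live on the same
# node, so a multi-key MGET for that group cannot hit a CROSSSLOT error.
by_slot = defaultdict(list)
for key in keys:
    by_slot[rc.keyslot(key)].append(key)

# One MGET round trip per slot instead of one GET per key.
results = {}
for slot_keys in by_slot.values():
    for key, value in zip(slot_keys, rc.mget(slot_keys)):
        results[key] = value
```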