I'm under the impression that one should hash (e.g. SHA-3) their Redis key before adding data to it. (It might even have been with regard to memcache.) I don't remember why I have this impression or where it came from, but I can't find anything to validate (or refute) it. The reasoning was that the hash would help with even distribution across a cluster.
When using Redis (in clustered or non-clustered mode, or both), is it best practice to hash the key before calling SET? e.g. set(sha3("username:123"), "owlman123")
No, you shouldn't hash the key. Redis Cluster hashes the key itself for the purpose of choosing the node:
There are 16384 hash slots in Redis Cluster, and to compute what is the hash slot of a given key, we simply take the CRC16 of the key modulo 16384.
You can also use hash tags to control which keys share the same slot.
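For illustration, here's a minimal Python sketch of that slot computation, including the hash-tag rule. It assumes nothing beyond the standard library; binascii.crc_hqx with an initial value of 0 matches the CRC16/XMODEM variant Redis Cluster uses:

import binascii

def hash_slot(key: bytes) -> int:
    # Hash-tag rule: if the key contains a non-empty {...} section,
    # only that part is hashed, so related keys can share a slot.
    start = key.find(b"{")
    if start != -1:
        end = key.find(b"}", start + 1)
        if end > start + 1:
            key = key[start + 1:end]
    # CRC16/XMODEM of the key, modulo the 16384 slots.
    return binascii.crc_hqx(key, 0) % 16384

# Both keys hash only the "user:1000" tag, so they land in the same slot.
print(hash_slot(b"{user:1000}:profile") == hash_slot(b"{user:1000}:followers"))  # True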
It might be a good idea if your keys are very long, as recommended in the official documentation:
A few other rules about keys:
Very long keys are not a good idea. For instance a key of 1024 bytes is a bad idea not only memory-wise, but also because the lookup of the key in the dataset may require several costly key-comparisons. Even when the task at hand is to match the existence of a large value, hashing it (for example with SHA1) is a better idea, especially from the perspective of memory and bandwidth.
source: https://redis.io/docs/data-types/tutorial/#keys
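As a hedged sketch of that advice (assuming a local Redis, the redis-py client, and a made-up seen: key namespace), you can store the SHA-1 digest of a large value instead of the value itself:

import hashlib
import redis

r = redis.Redis()  # assumes a local instance

large_value = "imagine several kilobytes of text here" * 100
# A fixed 40-character SHA-1 digest instead of a multi-kilobyte key:
key = "seen:" + hashlib.sha1(large_value.encode()).hexdigest()
r.set(key, 1)
print(r.exists(key))  # 1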
I may have 100 million keys that are long but partially static, like:
someReal...LongStaticPrefix:12345
someReal...LongStaticPrefix:12
someReal...LongStaticPrefix:123456
Where only the last part of the key is dynamic, the rest is static.
Does Redis store every key in full, or does it create an internal alias or something like that?
Should I worry about storage or performance?
Or is it better if I create an internal alias for the keys to keep them short?
Redis does keep the whole key. This long prefix will impact your memory usage.
Given Redis uses a hash map to store the keys, the performance impact is low. The hash map load factor is usually between 0.5 and 1, which means there is typically just one or two keys per hash slot. So the performance impact is the extra network payload for the long key, the longer effort to hash it, and the longer comparisons against the one or two keys in the slot. It's likely negligible unless your prefix is really, really long.
Consider a shorter key prefix.
Before considering using a hash structure (HSET), consider if you are using redis cluster or if you may need to eventually. A single hash key cannot be sharded.
A minor optimization: consider making the static part a suffix (i.e. putting the dynamic part first), so key comparisons within a hash slot chain can fail fast on the first differing bytes.
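Here's a minimal sketch of the last two suggestions combined; the alias p1 and the make_key helper are made-up names:

PREFIX_ALIAS = "p1"  # stands in for "someReal...LongStaticPrefix"

def make_key(dynamic_id: int) -> str:
    # Dynamic part first, short static alias as a suffix:
    # unequal keys now differ in their leading bytes.
    return f"{dynamic_id}:{PREFIX_ALIAS}"

print(make_key(12345))  # "12345:p1"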
Does Redis rehash the key under the hood?
i.e. do I need to hash my key before writing it to Redis?
for example:
Normally I would do:
redis.put(rehash(key), value)
Is it really necessary?
Redis uses the CRC16 algorithm to map keys to hash slots. There are 16384 hash slots in Redis Cluster, and to compute what is the hash slot of a given key, we simply take the CRC16 of the key modulo 16384.
There is no need for rehashing the key before writing to Redis, Redis does it for you.
In general, a hash function maps keys to small integers (buckets). An ideal hash function maps the keys to the integers in a random-like manner, so that bucket values are evenly distributed even if there are regularities in the input data.
CRC16 ensures an even load across the nodes in terms of the number of keys, i.e. a uniform distribution.
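To see why pre-hashing buys nothing, here's a small illustrative check using only the standard library (binascii.crc_hqx with initial value 0 matches Redis Cluster's CRC16/XMODEM): plain sequential keys already spread evenly across the 16384 slots.

import binascii
from collections import Counter

def slot(key: str) -> int:
    return binascii.crc_hqx(key.encode(), 0) % 16384  # CRC16/XMODEM

counts = Counter(slot(f"username:{i}") for i in range(100_000))
print(len(counts))           # close to 16384: almost every slot gets used
print(max(counts.values()))  # no slot is heavily overloaded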
What are the different key types supported by Redis? The documentation mentions all the various types (strings, sets, hashes, etc.) of values supported by Redis, but I couldn't quite find the key type information.
From redis documentation (Data types intro):
Redis keys
Redis keys are binary safe, this means that you can use any binary sequence as a key, from a string like "foo" to the content of a JPEG file. The empty string is also a valid key. A few other rules about keys:

Very long keys are not a good idea. For instance a key of 1024 bytes is a bad idea not only memory-wise, but also because the lookup of the key in the dataset may require several costly key-comparisons. Even when the task at hand is to match the existence of a large value, hashing it (for example with SHA1) is a better idea, especially from the perspective of memory and bandwidth.

Very short keys are often not a good idea. There is little point in writing "u1000flw" as a key if you can instead write "user:1000:followers". The latter is more readable and the added space is minor compared to the space used by the key object itself and the value object. While short keys will obviously consume a bit less memory, your job is to find the right balance.

Try to stick with a schema. For instance "object-type:id" is a good idea, as in "user:1000". Dots or dashes are often used for multi-word fields, as in "comment:1234:reply.to" or "comment:1234:reply-to".

The maximum allowed key size is 512 MB.
In my experience, any binary sequence typically means a string, though I may not be familiar with languages where you can achieve this using other data types.
Keys in Redis are all strings, so it doesn't really matter what kind of value you pass into a client. Under the hood the RESP protocol is used, and it will pass the value as a string to the engine.
Example:
ZADD some_key 1 some_value
some_key is always a string; even if you pass 3 as the key, it is handled as a string. This is true for every client.
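A quick illustration with redis-py (assuming a local instance): passing the integer 3 as a key still creates the string key "3" on the server, because the client serializes everything to a string for RESP.

import redis

r = redis.Redis()  # assumes a local instance
r.zadd(3, {"some_value": 1})      # the integer 3 is sent as the string "3"
print(r.type("3"))                # b'zset' -- same key, addressed as a string
print(r.zscore(3, "some_value"))  # 1.0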
I am writing a JAR file that fetches a large amount of data from an Oracle db and stores it in Redis. The details are properly stored, but the set keys and hash keys I have defined in the jar are getting limited in the Redis db. There should be nearly 200 hash and 300 set keys, but I am getting only 29 keys when running keys * in Redis. Please help on how to increase the limit of the Redis memory, or the hash or set key storage size.
Note: I changed the
hash-max-zipmap-entries 1024
hash-max-zipmap-value 64
manually in the redis.conf file, but it's not taking effect. Does it need to be changed anywhere else?
There is no limit on the number of set or hash keys you can put in a Redis instance, except for the size of the memory (check the maxmemory and maxmemory-policy parameters).
The hash-max-zipmap-entries parameter is completely unrelated: it only controls memory optimization.
I suggest using the MONITOR command to check which queries are actually sent to the Redis instance.
hash-max-zipmap-value keeps the memory-optimized hash encoding in use in Redis; searching for a field inside such a hash is amortized O(N), so longer values would increase the latency of the system.

These settings are available in redis.conf.

If you add more entries than the specified number, the hash is converted internally to the basic key-value (hashtable) encoding and no longer provides the memory advantage that the compact encoding offers.
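As an illustrative check (assuming redis-py and a recent server, where these parameters are now named hash-max-listpack-entries and hash-max-listpack-value, formerly *-ziplist-*), you can watch the encoding switch with OBJECT ENCODING:

import redis

r = redis.Redis()
r.delete("h")
r.hset("h", mapping={f"f{i}": i for i in range(10)})
print(r.object("encoding", "h"))   # compact encoding: b'listpack' (older servers: b'ziplist')

r.hset("h", mapping={f"f{i}": i for i in range(1000)})
print(r.object("encoding", "h"))   # b'hashtable' once the entry limit is exceeded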
Does using a GUID or ulong key impact Redis DB performance?
Similar: Does name length impact performance in Redis?
This question is an old one, but other answers are a bit misleading. Eric's answer is totally unrelated to Redis. Pfreixes's answer is based on personal assumptions and is simply wrong.
In fact, it's fairly safe to use GUID keys (performance-wise) as even 300+ character keys don't affect performance significantly on O(1) operations. Check this benchmark: Does name length impact performance in Redis?.
A GUID typically has a length of 32-36 characters in its hex representation. As Evan Carrol noticed in the comments, Redis strings are binary safe, so you can use the binary value and reduce the key size down to 128 bits (16 bytes). Keys of that length won't hurt performance at all.
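A hedged sketch with redis-py (assuming a local instance): since keys are binary safe, a UUID can be stored as its 16 raw bytes rather than the 36-character textual form.

import uuid
import redis

r = redis.Redis()
guid = uuid.uuid4()
r.set(guid.bytes, "owlman123")          # 16-byte binary key
print(r.get(guid.bytes))                # b'owlman123'
print(len(guid.bytes), len(str(guid)))  # 16 vs 36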
Also, documentation suggests to use hashing functions for really large keys: http://redis.io/topics/data-types-intro
Redis uses a hash strategy to store all keys: every key is stored via a hash function, and all Redis key performance comes down to this function, or something related to it.

The original key is also stored, to resolve future collisions between different keys, and yes, big keys can have an impact on memory handling and everything related to it: memory fragmentation, cache hits/misses, etc.