Is there any limit on the number of fields for the Redis command HMSET? - redis

What is the maximum number of fields for the Redis command HMSET? If I set 100,000 fields on one key with HMSET, would it cause performance issues compared to using each field as a separate key?

It is quite large: 2^64-1 in 64-bit systems, and 2^32-1 in 32-bit systems.
https://groups.google.com/d/msg/redis-db/eArHCH9kHKA/UFFRkp0iQ4UJ
1) Number of keys in every Redis database: 2^64-1 in 64 bit systems, 2^32-1 in 32 bit systems.
2) Number of hash fields in every hash: 2^64-1 in 64 bit systems, 2^32-1 in 32 bit systems.
Given that a 32 bit instance has at max 4GB of addressable space, the
limit is unreachable. For 64 bit instances, given how big is 2^64-1,
the limit is unreachable.
So for every practical point of view consider keys and hashes only
limited by the amount of RAM you have.
Salvatore

I did a couple of quick tests for this using the lua client.
I tried storing 100,000 fields using a single hmset command, individual hmset commands, and pipelined individual commands, and timed how long they took to complete:
hmset 100000 fields: 3.164817
hmset individual fields: 9.564578
hmset in pipeline: 4.784714
I didn't try larger values, as 1,000,000+ fields were taking too long, but the code is here if you'd like to tinker: https://gist.github.com/kraftman/1f15dc75649f07ee044eccab5379a8e3
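For reference, a rough Python sketch of the same three approaches using redis-py (a minimal sketch assuming a local Redis instance; HSET with a mapping is the modern replacement for HMSET, and the key names are made up):

import time
import redis

r = redis.Redis()  # assumes a local Redis instance on the default port

fields = {f"field:{i}": i for i in range(100_000)}

# 1) all fields in a single HSET call (equivalent to one big HMSET)
start = time.perf_counter()
r.hset("bench:single", mapping=fields)
print("single command:", time.perf_counter() - start)

# 2) one HSET per field, one network round trip each
start = time.perf_counter()
for field, value in fields.items():
    r.hset("bench:individual", field, value)
print("individual commands:", time.perf_counter() - start)

# 3) one HSET per field, batched into a single round trip with a pipeline
start = time.perf_counter()
pipe = r.pipeline()
for field, value in fields.items():
    pipe.hset("bench:pipelined", field, value)
pipe.execute()
print("pipelined commands:", time.perf_counter() - start)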
Depending on the application, bear in mind that you lose the storage efficiency of hashes once you add too many fields ('too many' is configurable; see here for more info).

According to Redis documentation, there's no such limitation.
actually the number of fields you can put inside a hash has no practical limits (other than available memory)
I think there's no performance penalty to save data in a HASH. However, if you have a very large HASH, it's always a bad idea to call HGETALL. Because HGETALL returns all fields and values of the HASH, and that would block the Redis instance for a long time when the HASH is very large.
Whether a HASH is better than key-value store, largely depends on your scenario.
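If you do have to read a very large HASH, one common alternative to HGETALL is to iterate it incrementally with HSCAN, so no single command blocks the server for long. A minimal redis-py sketch (the key name is illustrative):

import redis

r = redis.Redis(decode_responses=True)  # assumes a local Redis instance

# HGETALL returns every field at once and can stall Redis on huge hashes;
# HSCAN walks the same hash in small chunks instead.
for field, value in r.hscan_iter("some:huge:hash", count=1000):
    print(field, value)  # replace with your per-field processing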

Related

Redis multi Key or multi Hash field

I have about 300k rows of data with keys like Session:Hist:[account]
Session:Hist:100000
Session:Hist:100001
Session:Hist:100002
.....
Each has 5-10 child entries of the form [session]:[time]
b31c2a43-e61b-493a-b8d4-ff0729fe89de:1846971068807
5552daa2-c9f6-4635-8a7c-6f027b4aa1a3:1846971065461
.....
I have 2 options:
Option 1: Using a Hash per account; key is Session:Hist:[account], field is [session], value is [time]
Option 2: Using one flat Hash for all accounts; key is Session:Hist, field is [account]:[session], value is [time]
My Redis setup has 1 master and 4-5 slaves, used to store and push sessions (about 300k * 5 in 2 hours) every day, and cleared at the end of each day.
So the question is: which option is better for performance (faster master-slave sync, smaller memory footprint, faster for large requests)? Thanks for your help!
Comparing the two options mentioned, option #2 is less optimal.
According to official Redis documentation:
It is worth noting that small hashes (i.e., a few elements with small values) are encoded in a special way in memory that makes them very memory efficient.
More details here.
So having one huge hash with key Session:Hist would affect memory consumption. It would also affect clustering (sharding) since you would have one hash (hot-spot) located on one instance which would get hammered.
Option #1 does not suffer from the problems mentioned above, as long as you have many well-distributed hashes keyed as Session:Hist:[account] (i.e., all accounts have a similar count of sessions, rather than a few dominant accounts with a huge number of sessions).
If, however, there is a possibility for uneven distribution of sessions into accounts, you could try (and measure) the efficiency of option 1a:
Key: Session:Hist:[account]:[session minus its last two characters]
field: [session's last two characters]
value: [time]
Example:
Key: Session:Hist:100000:b31c2a43-e61b-493a-b8d4-ff0729fe89
field: de
value: 1846971068807
This way, each hash will only contain up to 256 fields (assuming the last 2 characters of a session are hex digits, there are 256 possible combinations). This would be optimal if redis.conf defines hash-max-zipmap-entries 256 (hash-max-ziplist-entries in newer Redis versions).
Obviously option 1a would require some modifications in your application, but with proper benchmarking (i.e. measuring the memory savings) you could decide if it's worth the effort.
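As an illustration only, option 1a could be sketched in Python with redis-py roughly like this (the connection and sample values are assumptions; the key/field split follows the answer above):

import redis

r = redis.Redis(decode_responses=True)  # assumes a local Redis instance

def store_session(account, session, time_value):
    # The key holds the session minus its last two characters;
    # the field is those last two characters (at most 256 distinct fields per hash).
    key = f"Session:Hist:{account}:{session[:-2]}"
    r.hset(key, session[-2:], time_value)

def get_session(account, session):
    return r.hget(f"Session:Hist:{account}:{session[:-2]}", session[-2:])

store_session(100000, "b31c2a43-e61b-493a-b8d4-ff0729fe89de", 1846971068807)
print(get_session(100000, "b31c2a43-e61b-493a-b8d4-ff0729fe89de"))  # "1846971068807"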

Using HSET or SETBIT for storing 6 billion SHA256 hashes in Redis

Problem set: I am looking to store 6 billion SHA256 hashes. I want to check if a hash exists, and if so, an action will be performed. When it comes to storing the SHA256 hash (a 64-byte hex string) just to check whether the key exists, I've come across two approaches:
HSET/HEXISTS and GETBIT/SETBIT
I want to make sure I take the least amount of memory, but also want to make sure lookups are quick.
The use case is "check if a SHA256 hash exists".
The problem:
I want to understand how best to store this data, as currently I see a 200% size increase going from text to Redis. I want to understand what the best sharding options would be using ziplist entries and ziplist values, and how to split the hash effectively so the ziplist encoding is used as much as possible.
I've tried setting ziplist entries to 16^4 (65536) and the value size to 60, based on splitting the hash 4:60.
Any help in understanding the options and techniques to keep the footprint as small as possible while keeping lookups fast would be appreciated.
Thanks
A bit late to the party but you can just use plain Redis keys for this:
# Store a given SHA256 hash
> SET 9f86d081884c7d659a2feaa0c55ad015a3bf4f1b2b0b822cd15d6c15b0f00a08 ""
OK
# Check whether a specific hash exists
> EXISTS 2c26b46b68ffc68ff99b453c1d30413413422d706483bfa0f98a5e886266e7ae
0
Where both SET and EXISTS have a time complexity of O(1) for single keys.
As Redis can handle a maximum of 2^32 keys per instance, you should split your dataset across two or more Redis servers / clusters, also depending on the number of nodes and the total combined memory available to your servers / clusters.
I would also suggest using the binary form of your hashes instead of their hex text representation, as that saves roughly 50% of the memory used for the keys (32 raw bytes vs. a 64-character string).
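A small sketch of the plain-key approach with binary keys, using redis-py (the hashed value below is just an example):

import hashlib
import redis

r = redis.Redis()  # assumes a local Redis instance

hex_hash = hashlib.sha256(b"example").hexdigest()  # 64 hex characters
raw_hash = bytes.fromhex(hex_hash)                 # 32 raw bytes, roughly half the key size

# Store the hash as a binary key with an empty value
r.set(raw_hash, b"")

# Membership check is a single O(1) EXISTS
print(r.exists(raw_hash))                  # 1
print(r.exists(bytes.fromhex("00" * 32)))  # 0, not stored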

Storing 13 Million floats and integer in redis

I have a file with 13 million floats, each of which has an associated integer index. The original size of the file is 80MB.
We want to pass multiple indexes and get back the corresponding floats. The only reason I need a hash (field and value) is that a List does not support fetching multiple indexes in a single call.
I stored them as a hash in Redis, with the index as field and the float as value. On checking memory usage, it was about 970MB.
Storing the 13 million values as a list uses 280MB.
Is there any optimization I can use?
Thanks in advance.
(This is running on ElastiCache.)
You can get a really good optimization by creating buckets of index vs. float values.
Hashes are very memory optimized internally.
So assume your data in original file looks like this:
index, float_value
2,3.44
5,6.55
6,7.33
8,34.55
And currently you have stored each index mapped to one float value, in a hash or a list.
You can do this optimization of bucketing the values:
The hash key would be index/1000 (integer division, discarding the remainder), the sub-key (field) would be the index, and the value would be the float value. That keeps each hash small enough to stay in the memory-efficient encoding.
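A minimal sketch of that bucketing idea with redis-py (the bucket size of 1000 and the key prefix are assumptions, following the quoted Instagram write-up below):

import redis

r = redis.Redis(decode_responses=True)  # assumes a local Redis instance

BUCKET_SIZE = 1000

def put_float(index, value):
    # Indexes in the same bucket of 1000 share one small hash, which Redis
    # can keep in its compact ziplist/listpack encoding.
    r.hset(f"floats:{index // BUCKET_SIZE}", index, value)

def get_float(index):
    return r.hget(f"floats:{index // BUCKET_SIZE}", index)

put_float(2, 3.44)
put_float(5, 6.55)
print(get_float(5))  # "6.55"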
More details here as well:
At first, we decided to use Redis in the simplest way possible: for each ID, the key would be the media ID, and the value would be the user ID:
SET media:1155315 939
GET media:1155315
> 939
While prototyping this solution, however, we found that Redis needed about 70 MB to store 1,000,000 keys this way. Extrapolating to the 300,000,000 we would eventually need, it was looking to be around 21GB worth of data, already bigger than the 17GB instance type on Amazon EC2.
We asked the always-helpful Pieter Noordhuis, one of Redis’ core
developers, for input, and he suggested we use Redis hashes. Hashes in
Redis are dictionaries that can be encoded in memory very
efficiently; the Redis setting ‘hash-zipmap-max-entries’ configures
the maximum number of entries a hash can have while still being
encoded efficiently. We found this setting was best around 1000; any
higher and the HSET commands would cause noticeable CPU activity. For
more details, you can check out the zipmap source file.
To take advantage of the hash type, we bucket all our Media IDs into buckets of 1000 (we just take the ID, divide by 1000 and discard the remainder). That determines which key we fall into; next, within the hash that lives at that key, the Media ID is the lookup key within the hash, and the user ID is the value. An example, given a Media ID of 1155315, which means it falls into bucket 1155 (1155315 / 1000 = 1155):
HSET "mediabucket:1155" "1155315" "939"
HGET "mediabucket:1155" "1155315"
> "939"
The size difference was pretty striking; with our 1,000,000 key prototype (encoded into 1,000 hashes of 1,000 sub-keys each), Redis only needs 16MB to store the information. Expanding to 300 million keys, the total is just under 5GB, which in fact even fits in the much cheaper m1.large instance type on Amazon, about 1/3 of the cost of the larger instance we would have needed otherwise. Best of all, lookups in hashes are still O(1), making them very quick.
If you’re interested in trying these combinations out, the script we
used to run these tests is available as a Gist on GitHub (we also
included Memcached in the script, for comparison — it took about 52MB
for the million keys)

Redis using too much memory with a small number of keys

I have a Redis standalone server with around 8000 keys at any given instant.
used_memory is showing around 8.5 GB.
My individual key-value size is at most around 50 KB; by that calculation, used_memory should be less than 1 GB (50 KB * 8000).
I am using Spring RedisTemplate with the default pool configuration to connect to Redis.
Any idea what I should look into to narrow down where the memory is being consumed?
A zset internally uses two data structures to hold the same elements, in order to get O(log(N)) insert and remove operations on a sorted data structure.
The two data structures, to be specific, are:
Hash Table
Skip list
Storage cost for ideal cases, according to my research, is in the following order:
hset < set < zset
I would recommend you start using hset if you have hierarchical data to store. This will lower your memory consumption but might make lookups a tiny bit slower (only if one key has more than, say, a couple of hundred records).
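To narrow down where the memory actually goes, one option (a sketch, assuming redis-py and a server that supports the MEMORY USAGE command, i.e. Redis 4.0+) is to report the approximate size of each key, largest first:

import redis

r = redis.Redis(decode_responses=True)  # assumes a local Redis instance

sizes = []
for key in r.scan_iter(count=1000):
    # MEMORY USAGE returns the approximate number of bytes a key and its value consume
    sizes.append((r.memory_usage(key) or 0, key))

for size, key in sorted(sizes, reverse=True)[:20]:
    print(f"{size:>12} bytes  {key}")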

How can Redis be optimized for storing lists of GUIDs?

We are using Redis to store shuffled decks of cards. A card is represented by a 20-character GUID, and a deck is an array of shuffled card GUIDs. The primary operations called on the deck list are LLEN (length) and LPOP (pop). The only time we push to a deck is a) when the deck is initially created and b) when the deck runs out of cards and is re-shuffled (which happens rarely). Currently, the length of a deck varies from 10 to 700 items.
What type of memory optimizations can be made in Redis for this sort of problem? Is there any sort of setting we can configure to reduce the memory overhead, or optimize how (zip)list data types are used?
Related Article: http://redis.io/topics/memory-optimization
My first suggestion would be to use 8-byte unsigned integers as your identifiers instead of GUIDs; that saves you several bytes per entry in memory and improves the overall performance of any database you are using, including Redis.
If you want to stay with GUIDs, then considering the size of the lists and the operations you are doing on them, you can tune the Redis defaults to suit your needs:
Redis defaults :
list-max-ziplist-entries 512
list-max-ziplist-value 64
You can change this to :
list-max-ziplist-entries 1024 # to accommodate your 700-card lists
list-max-ziplist-value 256 # to accommodate your 20-byte GUIDs
YMMV, hence you need to benchmark redis with both settings, for storage as well as read/write performance with your sample data.
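As a quick sanity check after changing the settings, something like this redis-py sketch could verify how a sample deck is actually encoded (note the parameter names above apply to older Redis versions; Redis 3.2+ switched lists to quicklist with list-max-ziplist-size, and Redis 7 uses listpack settings, so treat the exact names as something to verify against your version):

import uuid
import redis

r = redis.Redis(decode_responses=True)  # assumes a local Redis instance

# Build a sample deck of 700 20-character GUID-like ids (illustrative data only)
deck = [uuid.uuid4().hex[:20] for _ in range(700)]
r.delete("deck:sample")
r.lpush("deck:sample", *deck)

# OBJECT ENCODING reveals whether the list fits the compact encoding
# (ziplist/listpack) or has fallen back to a more general representation.
print(r.object("encoding", "deck:sample"))
print(r.llen("deck:sample"))  # 700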