How can Redis be optimized for storing lists of GUIDs?

We are using Redis to store shuffled decks of cards. A card is represented by a 20-character GUID, and a deck is an array of shuffled card GUIDs. The primary operations called on a deck list are LLEN (length) and LPOP (pop). The only time we push to a deck is a) when the deck is initially created and b) when the deck runs out of cards and is re-shuffled (which happens rarely). Currently, the length of a deck varies from 10 to 700 items.
What type of memory optimizations can be made in Redis for this sort of problem? Is there any sort of setting we can configure to reduce the memory overhead, or optimize how (zip)list data types are used?
Related Article: http://redis.io/topics/memory-optimization

My first suggestion would be to use 8-byte unsigned integers as your identifier keys instead of GUIDs; that saves several bytes per entry in memory and improves overall performance of any database you use, Redis included.
If you want to stick with GUIDs, then given the size of your lists and the operations you run on them, you can tune the Redis defaults to suit your needs.
Redis defaults:
list-max-ziplist-entries 512
list-max-ziplist-value 64
You can change these to:
list-max-ziplist-entries 1024 # to accommodate your 700-card lists
list-max-ziplist-value 256 # to accommodate your 20-byte GUIDs
YMMV, so benchmark Redis with both settings, measuring memory usage as well as read/write performance against your sample data.
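As a sanity check, here is a minimal sketch of verifying the effect, using Python with the redis-py client and an invented key name (both are assumptions, not something from the question). Note that on Redis 3.2+ lists use the quicklist encoding and these settings were replaced by list-max-ziplist-size (list-max-listpack-size on 7.x), so what OBJECT ENCODING reports depends on your version.

import uuid
import redis

# Assumes a local Redis instance and the redis-py client; both are
# illustrative choices, not something given in the question.
r = redis.Redis(host='localhost', port=6379, decode_responses=True)

# Build a sample deck of 700 20-character identifiers.
deck = [uuid.uuid4().hex[:20] for _ in range(700)]
r.delete('deck:sample')
r.rpush('deck:sample', *deck)

# On old Redis versions you want to see "ziplist" here; newer versions
# report "quicklist" or "listpack" for small lists instead.
print(r.object('encoding', 'deck:sample'))
print(r.llen('deck:sample'), r.lpop('deck:sample'))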

Related

Storing 13 million floats and integers in Redis

I have a file with 13 million floats, each of them with an associated integer index. The original size of the file is 80 MB.
We want to pass multiple indexes and get the float data back. The only reason I needed hash fields and values is that a list does not support fetching multiple indexes in one call.
I stored them as a hash in Redis, with the index as the field and the float as the value. On checking memory usage, it was about 970 MB.
Storing the 13 million values as a list uses 280 MB.
Is there any optimization I can use?
Thanks in advance. (This is running on ElastiCache.)
You can get a really good optimization by bucketing the index/float pairs.
Hashes are very memory-optimized internally.
So assume your data in original file looks like this:
index, float_value
2,3.44
5,6.55
6,7.33
8,34.55
And currently you have stored them one index to one float value, in a hash or a list.
You can optimize this by bucketing the values, as sketched below:
The hash key would be the bucket number (the index divided by 1000, discarding the remainder), the sub-key would be the full index, and the value would be the float value.
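A minimal sketch of that bucketing scheme, using Python and the redis-py client with a bucket size of 1000 (the client, key prefix, and helper names here are illustrative assumptions, not from the question):

import redis

r = redis.Redis(decode_responses=True)
# Keep BUCKET_SIZE at or below hash-max-ziplist-entries (raise that
# setting to 1000 if needed) so each bucket stays in the compact encoding.
BUCKET_SIZE = 1000

def store(index, value):
    # index // 1000 picks the bucket; the full index is the field inside it.
    r.hset('float:bucket:{}'.format(index // BUCKET_SIZE), index, value)

def fetch(indexes):
    # Group requested indexes by bucket so each bucket is read with one HMGET.
    by_bucket = {}
    for i in indexes:
        by_bucket.setdefault(i // BUCKET_SIZE, []).append(i)
    result = {}
    for bucket, fields in by_bucket.items():
        values = r.hmget('float:bucket:{}'.format(bucket), fields)
        result.update(zip(fields, values))
    return result

store(2, 3.44)
store(5, 6.55)
print(fetch([2, 5]))   # {2: '3.44', 5: '6.55'}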
More details here as well, quoted from the Instagram Engineering write-up on storing media-ID-to-user-ID mappings:
At first, we decided to use Redis in the simplest way possible: for
each ID, the key would be the media ID, and the value would be the
user ID:
SET media:1155315 939
GET media:1155315
939
While prototyping this solution, however, we found that Redis needed about 70 MB to store 1,000,000 keys this way. Extrapolating to the 300,000,000 we would eventually need, it was looking to be around 21GB worth of data, already bigger than the 17GB instance type on Amazon EC2.
We asked the always-helpful Pieter Noordhuis, one of Redis’ core
developers, for input, and he suggested we use Redis hashes. Hashes in
Redis are dictionaries that can be encoded in memory very
efficiently; the Redis setting ‘hash-zipmap-max-entries’ configures
the maximum number of entries a hash can have while still being
encoded efficiently. We found this setting was best around 1000; any
higher and the HSET commands would cause noticeable CPU activity. For
more details, you can check out the zipmap source file.
To take advantage of the hash type, we bucket all our Media IDs into
buckets of 1000 (we just take the ID, divide by 1000 and discard the
remainder). That determines which key we fall into; next, within the
hash that lives at that key, the Media ID is the lookup key within
the hash, and the user ID is the value. An example, given a Media ID
of 1155315, which means it falls into bucket 1155 (1155315 / 1000 =
1155):
HSET "mediabucket:1155" "1155315" "939" HGET "mediabucket:1155"
"1155315"
"939" The size difference was pretty striking; with our 1,000,000 key prototype (encoded into 1,000 hashes of 1,000 sub-keys each),
Redis only needs 16MB to store the information. Expanding to 300
million keys, the total is just under 5GB — which in fact, even fits
in the much cheaper m1.large instance type on Amazon, about 1/3 of the
cost of the larger instance we would have needed otherwise. Best of
all, lookups in hashes are still O(1), making them very quick.
If you’re interested in trying these combinations out, the script we
used to run these tests is available as a Gist on GitHub (we also
included Memcached in the script, for comparison — it took about 52MB
for the million keys).

Redis using too much memory for a small number of keys

I have a Redis standalone server with around 8,000 keys at any given instant.
used_memory is showing around 8.5 GB.
My individual key-value size is at most around 50 KB; by that calculation, used_memory should be less than 1 GB (50 KB * 8,000).
I am using Spring RedisTemplate with the default pool configuration to connect to Redis.
Any idea what I should look into to narrow down where the memory is being consumed?
zset internally uses two data structures to hold the same elements in order to get O(log(N)) INSERT and REMOVE operations into a sorted data structure.
The two data structures, to be specific, are:
Hash Table
Skip list
Storage cost in the ideal case, according to my research, is in the following order:
hset < set < zset
I would recommend you start using hashes (hset) if you have hierarchical data to store. This will lower your memory consumption but might make lookups a tiny bit slower (only if one key has more than, say, a couple of hundred records).
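To actually narrow down where the memory is going, here is a quick sketch that lists the biggest keys (Python with redis-py, both assumed; MEMORY USAGE requires Redis 4.0+):

import redis

r = redis.Redis(decode_responses=True)

# MEMORY USAGE reports the per-key footprint including Redis' internal
# overhead, which is why it can be far larger than your serialized values.
sizes = []
for key in r.scan_iter(count=500):
    nbytes = r.memory_usage(key)
    if nbytes:
        sizes.append((nbytes, key))

# Print the 20 largest keys with their type.
for nbytes, key in sorted(sizes, reverse=True)[:20]:
    print(nbytes, r.type(key), key)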

Why does a Redis hash convert from ziplist to hashtable when a key or value is large?

There are two configs controlling the underlying data structure of a hash in Redis: hash-max-ziplist-entries and hash-max-ziplist-value.
It's easy to understand why it should convert to a hashtable when there are too many entries, as lookups would otherwise cost too much time.
But why does it convert to a hashtable when the value is large? As far as I can understand, since there is a "length" field in a ziplist entry, it shouldn't matter whether one entry is 1 bit or 100 bits; it just needs to skip over the whole entry to get to the next one.
In order to traverse both forward and backward, a doubly linked list has to save two pointers (i.e. 16 bytes on a 64-bit machine) for each entry. If the entry data is small, say, 8 bytes, it is very memory inefficient: the data is only 8 bytes, while the extra pointers cost 16 bytes.
In order to solve this problem, a ziplist uses two variable-length encoded numbers to replace the two pointers, and saves all entries in contiguous memory. In this case, if all entry values are less than 64 bytes, these two variable-length encoded numbers only cost 2 bytes (please correct me if I'm wrong). This is very memory efficient. However, if the entry data is very large, say, 1024 bytes, this trick won't save much memory, since the entry data itself costs more.
On the other hand, since a ziplist saves all entries in contiguous memory in a compact way, it has to reallocate memory for almost every write operation. That's very CPU inefficient. Encoding and decoding those variable-length numbers also costs CPU.
So if the entry data/value is small, you can use a ziplist to achieve memory efficiency. However, if the data is large, you CANNOT gain much, while it costs you lots of CPU time.
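A small sketch of the trade-off described above (Python with redis-py, assumed; on Redis 7+ the compact encoding is reported as "listpack" rather than "ziplist"):

import redis

r = redis.Redis(decode_responses=True)
r.delete('h:small', 'h:big')

# Many small fields: the hash stays in the compact encoding.
for i in range(100):
    r.hset('h:small', i, 'x' * 8)
print(r.object('encoding', 'h:small'))  # "ziplist" ("listpack" on Redis 7+)

# A single value above hash-max-ziplist-value (64 bytes by default)
# converts the whole hash to the hashtable encoding.
r.hset('h:big', 'field', 'x' * 1024)
print(r.object('encoding', 'h:big'))    # "hashtable"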

Is there any limit on the number of fields for the Redis command HMSET?

What's the maximum number of fields for the Redis command HMSET? If I set 100,000 fields on one key with HMSET, would it cause a performance issue compared to using each field as its own key?
It is quite large: 2^64-1 in 64 bit systems, and 2^32-1 in 32 bit systems.
https://groups.google.com/d/msg/redis-db/eArHCH9kHKA/UFFRkp0iQ4UJ
1) Number of keys in every Redis database: 2^64-1 in 64 bit systems, 2^32-1 in 32 bit systems.
2) Number of hash fields in every hash: 2^64-1 in 64 bit systems, 2^32-1 in 32 bit systems.
Given that a 32 bit instance has at max 4GB of addressable space, the
limit is unreachable. For 64 bit instances, given how big is 2^64-1,
the limit is unreachable.
So for every practical point of view consider keys and hashes only
limited by the amount of RAM you have.
Salvatore
I did a couple of quick tests for this using the Lua client.
I tried storing 100,000 fields using a single hmset command, individual hmset commands, and pipelined individual commands, and timed how long they took to complete:
hmset 100000 fields: 3.164817
hmset individual fields: 9.564578
hmset in pipeline: 4.784714
I didn't try larger values, as 1,000,000+ fields were taking too long, but the code is here if you'd like to tinker: https://gist.github.com/kraftman/1f15dc75649f07ee044eccab5379a8e3
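For comparison, the pipelined variant looks roughly like this in Python with redis-py (an assumption on my part; the timings above came from the Lua client in the Gist):

import redis

r = redis.Redis()
pipe = r.pipeline(transaction=False)

# Queue HSET commands client-side and flush them in batches of 10,000,
# avoiding one network round trip per field without building a single
# giant HMSET payload.
for i in range(100000):
    pipe.hset('bighash', 'field:{}'.format(i), i)
    if i % 10000 == 9999:
        pipe.execute()
pipe.execute()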
Depending on the application, bear in mind that you lose the storage efficiency of hashes once you add too many fields ('too many' is configurable; see here for more info).
According to Redis documentation, there's no such limitation.
actually the number of fields you can put inside a hash has no practical limits (other than available memory)
I think there's no performance penalty to saving data in a HASH. However, if you have a very large HASH, it's always a bad idea to call HGETALL, because HGETALL returns all fields and values of the HASH, and that would block the Redis instance for a long time when the HASH is very large.
Whether a HASH is better than key-value store, largely depends on your scenario.
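If you do end up with one very large HASH, this sketch (Python with redis-py and an invented key name, both assumed) reads it incrementally with HSCAN so the server never has to produce a single huge HGETALL reply:

import redis

r = redis.Redis(decode_responses=True)

# HSCAN walks the hash in cursor-sized chunks; each call is cheap, so a
# multi-million-field hash never blocks the server the way HGETALL would.
fields = {}
for field, value in r.hscan_iter('bighash', count=1000):
    fields[field] = value
print(len(fields))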

Are Redis hashes kept in a ziplist after changing hash-max-ziplist-entries?

I'm running a redis instance where I have stored a lot of hashes with integer fields and values. Specifically, there are many hashes of the form
{1: <int>, 2: <int>, ..., ~10000: <int>}
I was initially running redis with the default values for hash-max-ziplist-entries:
hash-max-ziplist-entries 512
hash-max-ziplist-value 64
and redis was using approximately 3.2 GB of memory.
I then changed these values to
hash-max-ziplist-entries 10240
hash-max-ziplist-value 10000
and restarted redis. My memory usage went down to approximately 480 MB, but redis was using 100% CPU. I reverted the values back to 512 and 64, and restarted redis, but it was still only using 480 MB of memory.
I assume that the memory usage went down because a lot of my hashes were stored as ziplists. I would have guessed that after changing the values and restarting redis they would automatically be converted back into hash tables, but this doesn't appear to be the case.
So, are these hashes still being stored as a ziplist?
They are still in optimized "ziplist" format.
Redis will store hashes (written via HSET or similar) in the optimized way as long as the hash does not end up with more than hash-max-ziplist-entries entries and its values are smaller than hash-max-ziplist-value bytes.
If these limits are exceeded, Redis will store the item "normally", i.e. not optimized.
Relevant section in documentation (http://redis.io/topics/memory-optimization):
If a specially encoded value will overflow the configured max size, Redis will automatically convert it into normal encoding.
Once the values are written in an optimized way, they are not "unpacked", even if you lower the max size settings later. The settings will apply to new keys that Redis stores.
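If you want to confirm this for your own keys, here is a quick check with OBJECT ENCODING, sketched with Python and redis-py (an assumption; redis-cli's OBJECT ENCODING <key> works just as well):

import redis

r = redis.Redis(decode_responses=True)

# "ziplist" (or "listpack" on Redis 7+) means the hash is still in the
# compact encoding; "hashtable" means it has been converted.
for key in r.scan_iter(count=500):
    if r.type(key) == 'hash':
        print(key, r.object('encoding', key))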