Redis MEMORY USAGE & INFO MEMORY

MEMORY USAGE key reports the number of bytes that a key is taking (https://redis.io/commands/memory-usage).
If I sum the values returned by this command over all of the keys in Redis, should the total match one of the memory stats returned by INFO MEMORY?
If yes, which one would it be?
used_memory_rss
used_memory_rss_human
used_memory_dataset

No, even if you sum up the output of MEMORY USAGE over all keys, you will not arrive at the figures reported by INFO MEMORY.
MEMORY USAGE attempts to estimate the memory usage associated with a given key - the data but also its overheads.
used_memory_rss is the amount of memory allocated, inclusive of server overheads and fragmentation.
used_memory_dataset attempts to account for the data itself, without overheads.
So, roughly: used_memory_dataset < sum of MEMORY USAGE < used_memory_rss
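As a rough illustration of that comparison, here is a minimal Python sketch using the redis-py client (the localhost connection is a placeholder): it sums MEMORY USAGE over every key via SCAN and prints the total next to the INFO MEMORY fields from the question.

import redis

r = redis.Redis(host="localhost", port=6379, db=0)

total = 0
for key in r.scan_iter(count=1000):   # SCAN, so the server is not blocked
    size = r.memory_usage(key)        # MEMORY USAGE <key>, in bytes
    if size is not None:              # the key may have expired since SCAN returned it
        total += size

mem = r.info("memory")                # INFO MEMORY parsed into a dict
print("sum of MEMORY USAGE:", total)
print("used_memory_dataset:", mem["used_memory_dataset"])
print("used_memory_rss:", mem["used_memory_rss"])

On a live instance the total should land between used_memory_dataset and used_memory_rss, per the rough ordering above.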

Related

How to set the capacity while creating table by invoking indexedTable function

I use indexedTable to create a table, the code is:
t1 = indexedTable(`sym`id, 1:0, `sym`id`val, [SYMBOL,INT,INT])
t2 = indexedTable(`sym`id, 100:0, `sym`id`val, [SYMBOL,INT,INT])
I find no difference in writing and querying data, so I wonder: what is the role of capacity?
The parameter "capacity" is a positive integer indicating how much memory (measured in number of records) the system allocates for the table when it is created. When the number of records exceeds the capacity, the system first allocates a new memory block of 1.2 to 2 times the capacity, then copies the data into the new block, and finally releases the original memory. For larger tables, the memory cost of these operations can be very high, so it is recommended to allocate a reasonable capacity in advance when building the table.
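This is not DolphinDB code, but as a rough analogy the grow-by-copy behaviour described above can be sketched in Python; the 1.2x growth factor simply mirrors the figure in the answer, and append_with_growth is a made-up helper for illustration.

# Hypothetical illustration of grow-by-copy; DolphinDB handles this internally.
def append_with_growth(buf, used, value, growth=1.2):
    if used == len(buf):                         # capacity exhausted
        new_cap = max(used + 1, int(len(buf) * growth))
        new_buf = [None] * new_cap               # allocate a bigger block ...
        new_buf[:used] = buf[:used]              # ... copy the existing records ...
        buf = new_buf                            # ... and release the old block
    buf[used] = value
    return buf, used + 1

# Preallocating a realistic capacity (e.g. 100) avoids repeating the
# allocate-copy-release cycle as the table grows from a tiny buffer.
buf, used = [None] * 100, 0
for v in range(1000):
    buf, used = append_with_growth(buf, used, v)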

Does redis key size also include the data size for that key or just the key itself?

I'm trying to analyze the size of our Redis db and tweak the storage of our data, following a few articles such as https://davidcel.is/posts/the-story-of-my-redis-database/
and https://engineering.instagram.com/storing-hundreds-of-millions-of-simple-key-value-pairs-in-redis-1091ae80f74c
I've read documentation about "key sizes" (i.e. https://redis.io/commands/object)
and tried running various tools like:
redis-cli --bigkeys
and also tried to read the output from the redis-cli:
INFO memory
The size semantics are not clear to me.
Does the reported size reflect ONLY the size of the key itself, i.e. if my key is "abc" and the value is "value1", is the reported size just for the "abc" portion? The same question applies to complex data structures stored at that key, such as a hash, array or list.
Trial and error doesn't give me a clear result, and different tools give different answers.
First, read about --bigkeys - it reports big value sizes in the keyspace, excluding the space taken by the key's name. Note that the size of a value means something different for each data type: Strings are sized by their STRLEN (bytes), whereas all other types are sized by the number of their nested elements.
So it gives little indication of actual memory usage, but rather does what it is intended to do - find big keys (not big key names, only estimated big values).
INFO MEMORY is a different story. used_memory is reported in bytes and reflects the entire RAM consumption of key names, their values and all the associated overheads of the internal data structures.
There is also DEBUG OBJECT, but note that its output is not a reliable way to measure the memory consumption of a key in Redis - the serializedlength field is the number of bytes needed to persist the object, not its actual footprint in memory, which includes various administrative overheads on top of the data itself.
Lastly, as of v4 we have the MEMORY USAGE command that does a much better job - see https://github.com/antirez/redis-doc/pull/851 for the details.
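To make the difference concrete, here is a small Python sketch (redis-py, placeholder connection, using the "abc"/"value1" example from the question) that prints the three numbers discussed above for a single String key. Note that DEBUG OBJECT may be disabled on managed Redis services, and the field names are as parsed by redis-py.

import redis

r = redis.Redis(host="localhost", port=6379)
r.set("abc", "value1")

print("STRLEN:", r.strlen("abc"))                                      # value payload only, what --bigkeys ranks Strings by
print("serializedlength:", r.debug_object("abc")["serializedlength"])  # bytes needed to persist, not RAM
print("MEMORY USAGE:", r.memory_usage("abc"))                          # key name + value + overheads, in bytes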

OptaPlanner: What is the allowable size limit?

My problem has a size of 80000, but I get stuck when I exceed this limit.
Is there a limit on the problem size used in OptaPlanner?
What is this limit?
I get a Java heap exception when I exceed this limit (80000).
Some ideas to look into:
1) Give the JVM more memory: -Xmx2G
2) Use a more efficient data structure. 80k instances will easily fit into a small amount of memory. My bet is that you have some sort of cross matrix between 2 collections. For example, a distance matrix for 20k VRP locations needs (20k)² = 400m integers (each of which is at least 4 bytes), so it requires almost 2GB of RAM to keep in memory even in its most efficient form (an array); see the sketch after this list. Use a profiler such as JProfiler or VisualVM to find out which data structures are taking so much memory.
3) Read the chapter about "planning clone". Sometimes splitting a Job up into a Job and a JobAssignment can save memory, because only the JobAssignment needs to be cloned, whereas in the other case everything that references Job needs to be planning cloned too.
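As a back-of-the-envelope check of the distance-matrix figure in point 2, a few lines of illustrative Python (numbers taken from the answer):

locations = 20_000
cells = locations ** 2            # 400 million matrix entries
bytes_as_int_array = cells * 4    # 4 bytes per int -> 1.6e9 bytes
print(cells, bytes_as_int_array / 2**30, "GiB")   # ~1.5 GiB before any JVM object overhead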

Memory utilization in redis for each database

Redis allows storing data in 16 different 'databases' (0 to 15). Is there a way to get the memory and disk space utilized per database? The INFO command only lists the number of keys per database.
No, you cannot control each database individually. These "databases" are just for logical partitioning of your data.
What you can do (depending on your specific requirements and setup) is spin up multiple Redis instances, where each one does a different task and has its own redis.conf file with a memory cap. Disk space can't be capped, though, at least not at the Redis level.
Side note: bear in mind that the number of databases (16) is not hardcoded - you can set it in redis.conf.
I did it by calling DUMP on all the keys in a Redis DB and measuring the total number of bytes used. This will slow down your server and take a while. It seems the size DUMP returns is about 4 times smaller than the actual memory use, but these numbers will give you an idea of which db is using the most space.
Here's my code:
https://gist.github.com/mathieulongtin/fa2efceb7b546cbb6626ee899e2cfa0b
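The gist above is the author's actual script; as a simplified sketch of the same idea (Python, redis-py, placeholder connection, default 16 databases assumed), it boils down to:

import redis

for db in range(16):                       # assumes the default "databases 16"
    r = redis.Redis(host="localhost", port=6379, db=db)
    total = 0
    for key in r.scan_iter(count=1000):    # SCAN keeps the server responsive
        payload = r.dump(key)              # serialized form, smaller than the in-memory size
        if payload is not None:            # key may have expired meanwhile
            total += len(payload)
    print("db%d: ~%d serialized bytes" % (db, total))

Per the caveat above, treat the totals as relative sizes between databases rather than real memory consumption.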

“Programming Pearls”: Searching

We can avoid many calls to a storage allocator by keeping a collection of available nodes in its own structure.
This idea can be applied to the binary search tree data structure.
The author says: "Allocating the nodes all at once can greatly reduce the tree's space requirements, which reduces the run time by about a third."
I'm curious how this trick can reduce space requirements. I mean, if we want to build a binary search tree with four nodes, we need to allocate memory for those four nodes, no matter whether we allocate them one by one or all at once.
Memory allocators are notoriously bad at allocating very small objects. The situation has somewhat improved in the last decade, but the trick from the book is still relevant.
Most allocators keep additional information with the block that they allocate to you, so that they could free the memory properly. For example, the malloc/free pair of C or new[]/delete[] pair of C++ needs to save the information about the length of the actual memory chunk somewhere; usually, this data ends up in the four bytes just prior to the address returned to you.
This means that at least four additional bytes will be wasted for each allocation. If your tree node takes twelve bytes (four bytes for each of the two pointers plus four bytes for the number), sixteen bytes would be allocated for each node - a 33.3% increase.
The memory allocator needs to perform additional bookkeeping as well: every time a chunk is taken from the heap, the allocator must account for it.
Finally, the more memory your tree uses, the smaller the chance that an adjacent node will already be in the cache when the current node is processed, because of the distance in memory to the next node.
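A worked version of the arithmetic above, using the answer's illustrative numbers (12-byte node, 4-byte allocator header per allocation), in Python:

node_bytes, header_bytes, nodes = 12, 4, 1_000_000

one_by_one = nodes * (node_bytes + header_bytes)   # header paid once per node
all_at_once = nodes * node_bytes + header_bytes    # one header for the whole pool
print(one_by_one, all_at_once)                     # 16000000 vs 12000004 bytes
print(one_by_one / all_at_once)                    # ~1.33, i.e. the 33.3% increase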
This sort of relates to how Strings are handled in Java: when you concatenate to a String, you are actually using 3 String objects - the old string, the new segment and the new result. Eventually the garbage collector tidies up, but in this situation (my String example and your procedural binary search tree) you are growing memory in a wasteful manner. At least that's how I understand it.