“Programming Pearls”: Searching - binary-search-tree

We can avoid many calls to a storage allocator by keeping a collection
of available nodes in its own structure.
This idea can be applied to the binary search tree data structure.
The author says: "Allocating the nodes all at once can greatly reduce the
tree's space requirements, which reduces the run time by about a third."
I'm curious how this trick can reduce space requirements. I mean, if we
want to build a binary search tree with four nodes, we need to allocate
memory for those four nodes whether we allocate them one by one or all at
once.

Memory allocators are notoriously bad at allocating very small objects. The situation has somewhat improved in the last decade, but the trick from the book is still relevant.
Most allocators keep additional information with the block that they allocate to you, so that they can free the memory properly. For example, the malloc/free pair of C or the new[]/delete[] pair of C++ needs to record the length of the actual memory chunk somewhere; usually, this data ends up in the four bytes just prior to the address returned to you.
This means that at least four additional bytes will be wasted for each allocation. If your tree node takes twelve bytes (four bytes for each of the two pointers plus four bytes for the number), sixteen bytes would be allocated for each node - a 33.3% increase.
The memory allocator needs to perform additional bookkeeping as well: every time a chunk is taken from the heap, the allocator must account for it.
Finally, the more memory your tree uses, the lower the chance that an adjacent node will already be in the cache when the current node is processed, because of the distance in memory between nodes.
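To make the trick concrete, here is a minimal C sketch of a node pool (the names pool_init and node_alloc are mine, not the book's): the tree grabs one large block up front and hands out nodes from it, so the per-node allocator header and bookkeeping described above disappear.

    #include <stdlib.h>

    /* A BST node: two pointers plus the key, as in the example above. */
    typedef struct node {
        struct node *left;
        struct node *right;
        int value;
    } node;

    static node *pool;      /* one big block holding every node           */
    static int   pool_used; /* index of the next unused node in the block */

    /* Grab all nodes with a single malloc: one allocator header and one
     * bookkeeping entry for the whole tree instead of one per node. */
    static int pool_init(int max_nodes)
    {
        pool = malloc(max_nodes * sizeof(node));
        pool_used = 0;
        return pool != NULL;
    }

    /* Hand out the next node from the block; no call into malloc here. */
    static node *node_alloc(void)
    {
        return &pool[pool_used++];
    }

    /* Freeing the whole tree is a single free() of the block. */
    static void pool_free(void)
    {
        free(pool);
        pool = NULL;
        pool_used = 0;
    }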

This sort of relates to how Strings are handled in Java: when you concatenate onto a String, you are actually dealing with three String objects - the old string, the new segment and the resulting string. Eventually the garbage collector tidies up, but in both situations (my string example and your procedural binary search) you are growing memory in a wasteful manner. At least that's how I understand it.
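The same wasteful-growth pattern is easy to reproduce in C; this hypothetical snippet (my own illustration, not from the book) builds a brand-new result buffer on every concatenation and leaves the old one behind, which is roughly what the garbage collector cleans up after in Java:

    #include <stdlib.h>
    #include <string.h>

    /* Naive concatenation: every call produces a fresh result buffer from
     * the old string and the new segment, so repeated calls keep three
     * "strings" alive per step until someone frees the old ones. */
    char *concat(const char *old_str, const char *segment)
    {
        size_t n = strlen(old_str) + strlen(segment) + 1;
        char *result = malloc(n);       /* yet another small allocation */
        if (result == NULL)
            return NULL;
        strcpy(result, old_str);
        strcat(result, segment);
        return result;                  /* old_str becomes garbage */
    }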

Related

Redis ZRANGEBYLEX command complexity

According to the documentation for the ZRANGEBYLEX command, if you store keys in a sorted set with zero score, they can later be retrieved in lexicographical order, and the complexity of the ZRANGEBYLEX operation is O(log(N)+M), where N is the total number of elements and M is the size of the result set. The documentation has some information about string comparison, but says nothing about the structure in which the elements are stored.
But after some experiments and reading the source code, it looks like the ZRANGEBYLEX operation performs a linear search, where every element in the ziplist is matched against the request. If so, the complexity is larger than described above - about O(N), because every element in the ziplist is scanned.
After debugging with gdb, it's clear that the ZRANGEBYLEX command is implemented in the genericZrangebylexCommand function. Control flow continues at eptr = zzlFirstInLexRange(zl,&range);, so the major work of retrieving elements is performed in the zzlFirstInLexRange function. The naming and the subsequent control flow suggest that the ziplist structure is used, and all comparisons with the input operands are done sequentially, element by element.
Inspecting memory after inserting well-known keys into the Redis store, it seems that the ZSET elements really are stored in a ziplist - a byte-by-byte comparison against a reference confirms it.
So the question is: how can the documentation be wrong and claim logarithmic complexity where a linear one appears? Or maybe the ZRANGEBYLEX command works slightly differently? Thanks in advance.
how can the documentation be wrong and claim logarithmic complexity where a linear one appears?
The documentation has been wrong on more than a few occasions, but it is an ongoing open source effort that you can contribute to via the repository (https://github.com/antirez/redis-doc).
Or maybe the ZRANGEBYLEX command works slightly differently?
Your conclusion is correct in the sense that Sorted Set search operations, whether lexicographical or not, exhibit linear time complexity when Ziplists are used for encoding them.
However.
Ziplists are an optimization that trades CPU for memory savings, meaning it is meant for use on small sets (i.e. low N values). It is controlled via configuration (see the zset-max-ziplist-entries and zset-max-ziplist-value directives), and once the data grows above the specified thresholds the ziplist encoding is converted to a skip list.
Because ziplists are small (little Ns), their complexity can be assumed to be constant, i.e. O(1). On the other hand, due to their nature, skip lists exhibit logarithmic search time. IMO that means that the documentation's integrity remains intact, as it provides the worst case complexity.
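To illustrate why the bounded linear scan is acceptable, here is a small C sketch (my own illustration, not Redis source; the 128 cap mirrors the default zset-max-ziplist-entries): finding the start of a lexicographical range in a small, sorted, contiguous array is O(N), but with N capped at the conversion threshold it behaves like a constant.

    #include <string.h>

    #define ZIPLIST_MAX_ENTRIES 128  /* default zset-max-ziplist-entries */

    /* Linear scan for the first member >= min in a small sorted array.
     * O(n), but n never exceeds the threshold before Redis converts the
     * encoding to a skip list, so in practice it is effectively O(1). */
    static int first_in_lex_range(char *members[], int n, const char *min)
    {
        for (int i = 0; i < n && i < ZIPLIST_MAX_ENTRIES; i++) {
            if (strcmp(members[i], min) >= 0)
                return i;            /* start of the requested range */
        }
        return -1;                   /* no member in range */
    }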

Why redis hash convert from ziplist to hashtable when key or value is large?

There are two configs for the data structure of a hash in Redis: hash-max-ziplist-entries and hash-max-ziplist-value.
It's easy to understand why it should convert to a hashtable when there are too many entries, as lookups with the get command would otherwise cost too much time.
But why does it convert to a hashtable when the value is large? As far as I can understand, since there is a "length" field in a ziplist entry, it shouldn't matter whether one entry is 1 byte or 100 bytes; the code just needs to skip over the whole entry to get to the next one.
In order to traverse both forward and backward, a doubly linked list has to store two pointers (i.e. 16 bytes on a 64-bit machine) for each entry. If the entry data is small, say, 8 bytes, that is very memory inefficient: the data is only 8 bytes, while the extra pointers cost 16 bytes.
In order to solve this problem, a ziplist uses two variable-length encoded numbers to replace the two pointers, and stores all entries in contiguous memory. In this case, if all entry values are less than 64 bytes, these two variable-length encoded numbers only cost 2 bytes (please correct me if I'm wrong). This is very memory efficient. However, if the entry data is very large, say, 1024 bytes, this trick won't save much memory, since the entry data dominates the cost.
On the other hand, since a ziplist stores all entries in contiguous memory in a compact way, it has to reallocate memory for almost every write operation. That's very CPU inefficient. Also, encoding and decoding those variable-length numbers costs CPU.
So if the entry data/value is small, you can use a ziplist to achieve memory efficiency. However, if the data is large, you CANNOT gain much, while it costs you lots of CPU time.
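A rough C sketch of the trade-off described above (the layouts are simplified illustrations, not the actual Redis definitions): a doubly linked list pays two 8-byte pointers per entry no matter how small the value is, while a ziplist-style entry pays only a couple of length bytes for small values, a saving that shrinks as the payload grows.

    #include <stdint.h>

    /* Doubly linked list: 16 bytes of pointers per entry on a 64-bit
     * machine, regardless of how small the payload is. */
    struct list_entry {
        struct list_entry *prev;   /* 8 bytes */
        struct list_entry *next;   /* 8 bytes */
        unsigned char data[8];     /* 8-byte payload -> 200% overhead */
    };

    /* Ziplist-style entry (simplified): the pointers are replaced by two
     * small length fields, and entries sit back to back in one block. */
    struct zip_entry_header {
        uint8_t prev_len;          /* 1 byte while the previous entry is small */
        uint8_t encoding_len;      /* 1 byte while this value is under 64 bytes */
        /* payload bytes follow immediately */
    };

    /* With an 8-byte value: 24 bytes per list entry vs ~10 bytes per
     * ziplist entry. With a 1024-byte value: 1040 vs ~1027 bytes - the
     * header trick no longer buys much, yet every write still has to
     * reallocate and shift the whole contiguous block. */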

Does redis key size also include the data size for that key or just the key itself?

I'm trying to analyze the DB size for our Redis DB and tweak the storage of our data, per a few articles such as https://davidcel.is/posts/the-story-of-my-redis-database/
and https://engineering.instagram.com/storing-hundreds-of-millions-of-simple-key-value-pairs-in-redis-1091ae80f74c
I've read documentation about "key sizes" (i.e. https://redis.io/commands/object)
and tried running various tools like:
redis-cli --bigkeys
and also tried to read the output from the redis-cli:
INFO memory
The size semantics are not clear to me.
Does the reported size reflect ONLY the size of the key itself, i.e. if my key is "abc" and the value is "value1", is the reported size for the "abc" portion only? Also, the same question with respect to complex data structures for that key, such as a hash, array or list.
Trial and error doesn't seem to give me a clear result.
Different tools give different answers.
First, read about --bigkeys - it reports big value sizes in the keyspace, excluding the space taken by the key's name. Note that in this case the size of the value means something different for each data type, i.e. Strings are sized by their STRLEN (bytes) whereas all the others by the number of their nested elements.
So that basically means that it gives little indication about actual usage, but rather does as it is intended - finds big keys (not big key names, only estimated big values).
INFO MEMORY is a different story. The used_memory is reported in bytes and reflects the entire RAM consumption of key names, their values and all associated overheads of the internal data structures.
There is also DEBUG OBJECT, but note that its output is not a reliable way to measure the memory consumption of a key in Redis - the serializedlength field is the number of bytes needed for persisting the object, not its actual footprint in memory, which includes various administrative overheads on top of the data itself.
Lastly, as of v4 we have the MEMORY USAGE command that does a much better job - see https://github.com/antirez/redis-doc/pull/851 for the details.
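If you want to see the difference programmatically, here is a small hiredis sketch (the key name mykey is just a placeholder): STRLEN reports only the bytes of the value, while MEMORY USAGE reports the full footprint of key name, value and internal overheads.

    #include <stdio.h>
    #include <hiredis/hiredis.h>

    int main(void)
    {
        redisContext *c = redisConnect("127.0.0.1", 6379);
        if (c == NULL || c->err) {
            fprintf(stderr, "connection failed\n");
            return 1;
        }

        redisReply *r = redisCommand(c, "SET mykey value1");
        freeReplyObject(r);

        /* Just the value's bytes: 6 for "value1". */
        r = redisCommand(c, "STRLEN mykey");
        printf("STRLEN:       %lld\n", r->integer);
        freeReplyObject(r);

        /* Key name + value + internal overheads, in bytes (Redis >= 4). */
        r = redisCommand(c, "MEMORY USAGE mykey");
        printf("MEMORY USAGE: %lld\n", r->integer);
        freeReplyObject(r);

        redisFree(c);
        return 0;
    }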

Explain (buffers, analyse) in postgresql

I am new to PostgreSQL and I am trying to understand the EXPLAIN (BUFFERS, ANALYSE) instruction. I have a query and I execute it using EXPLAIN (BUFFERS, ANALYSE).
The first time I execute it the performance is worse than the second time. Also, the first time I get a 'read' figure next to 'hit', while the second time the 'read' does not appear.
Can somebody help me understand?
The first time you select, the pages get warm - they are loaded into the cache. Once they are in RAM, all subsequent selects are faster (RAM is much faster than disk).
Accordingly, BUFFERS shows read when pages are not in the cache, because Postgres has to read them, and no read when they are already warm, so the cache is hit...
Update with docs:
BUFFERS
Include information on buffer usage. Specifically, include the
number of shared blocks hit, read, dirtied, and written, the number of
local blocks hit, read, dirtied, and written, and the number of temp
blocks read and written. A hit means that a read was avoided because
the block was found already in cache when needed. Shared blocks
contain data from regular tables and indexes; local blocks contain
data from temporary tables and indexes; while temp blocks contain
short-term working data used in sorts, hashes, Materialize plan nodes,
and similar cases. The number of blocks dirtied indicates the number
of previously unmodified blocks that were changed by this query; while
the number of blocks written indicates the number of
previously-dirtied blocks evicted from cache by this backend during
query processing. The number of blocks shown for an upper-level node
includes those used by all its child nodes. In text format, only
non-zero values are printed. This parameter may only be used when
ANALYZE is also enabled. It defaults to FALSE.
And surprisingly not much about buffers here.
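If you want to reproduce the cold-versus-warm difference from a program, a sketch along these lines runs the same EXPLAIN twice over libpq (my_table and the connection string are placeholders):

    #include <stdio.h>
    #include <libpq-fe.h>

    static void run_explain(PGconn *conn)
    {
        /* my_table is a placeholder; BUFFERS requires ANALYZE. */
        PGresult *res = PQexec(conn,
            "EXPLAIN (ANALYZE, BUFFERS) SELECT * FROM my_table");
        if (PQresultStatus(res) == PGRES_TUPLES_OK) {
            for (int i = 0; i < PQntuples(res); i++)
                puts(PQgetvalue(res, i, 0));   /* one plan line per row */
        }
        PQclear(res);
    }

    int main(void)
    {
        PGconn *conn = PQconnectdb("dbname=postgres");
        if (PQstatus(conn) != CONNECTION_OK) {
            fprintf(stderr, "%s", PQerrorMessage(conn));
            return 1;
        }
        run_explain(conn);   /* cold run: expect "Buffers: shared ... read=..." */
        run_explain(conn);   /* warm run: typically only "shared hit=..."       */
        PQfinish(conn);
        return 0;
    }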

Optaplanner; What is the allowable size limit?

My problem has a size of 80,000, but I get stuck when I exceed this limit.
Is there a limit on the problem size in OptaPlanner?
What is this limit?
I get a Java heap exception when I exceed this limit (80,000).
Some ideas to look into:
1) Give the JVM more memory: -Xmx2G
2) Use a more efficient data structure. 80k instances will easily fit into a small amount of memory. My bet is you have some sort of cross matrix between 2 collections. For example, a distance matrix for 20k VRP locations needs (20k)² = 400m integers (each of which is at least 4 bytes), so it requires almost 2GB of RAM to keep in memory in its most efficient form (an array); see the back-of-the-envelope sketch after this list. Use a profiler such as JProfiler or VisualVM to find out which data structures are taking so much memory.
3) Read the chapter about "planning clone". Sometimes splitting a Job up into a Job and a JobAssignment can save memory, because only the JobAssignment needs to be cloned, whereas otherwise everything that references Job needs to be planning-cloned too.
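As a back-of-the-envelope check on point 2 (a plain C sketch of the arithmetic, not OptaPlanner code):

    #include <stdio.h>

    int main(void)
    {
        /* A full distance matrix for 20,000 VRP locations, stored as ints. */
        unsigned long long locations = 20000ULL;
        unsigned long long cells = locations * locations;   /* 400 million  */
        unsigned long long bytes = cells * sizeof(int);     /* ~1.6 GB here */

        printf("%llu cells, %.1f GB\n", cells, bytes / 1e9);
        return 0;
    }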