Binary Search Tree Minimum Value - binary-search-tree

I am new to the binary search tree data structure. One thing I don't understand is why the leftmost node is the smallest.
        10
       /  \
      5    12
     / \   / \
    1   6 0   14
In the above instance, 0 is the smallest value, not 1.
Let me know where I got mixed up.
Thank you!

That tree is not a binary search tree.
Creating a binary search tree is a process that starts with adding elements.
You can do it with an array.
At first there are no elements, so the first value becomes the root (stored at array[1]). Then add the remaining values as nodes, one at a time. Call the index of the node you are comparing against n: if the new value is bigger than the value at n, move to array[2 x n + 1]; if it is smaller, move to array[2 x n]. Repeat until you reach an empty slot and store the value there. This way all values to the left of any node are smaller than it, and all values to the right of any node are bigger than it. This holds for 10 and 6 as well: 6 sits in the left subtree of 10, so it could not be 11 (and in your tree it isn't). That's all!
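The array scheme described above can be sketched roughly as follows. This is only an illustration: the function name `insert`, the use of a dict keyed by 1-based index instead of a fixed-size array, and sending equal values to the right are all assumptions, not part of the answer.

```python
def insert(tree, value):
    """Insert value into a BST stored as {index: value}, root at index 1."""
    n = 1
    while n in tree:
        if value < tree[n]:
            n = 2 * n        # left child slot
        else:
            n = 2 * n + 1    # right child slot (assumption: ties go right)
    tree[n] = value
    return n

tree = {}
for v in [10, 5, 12, 1, 6, 11, 14]:
    insert(tree, v)

# The smallest value is reached by following left children (index 2n) only.
n = 1
while 2 * n in tree:
    n = 2 * n
smallest = tree[n]
print(smallest)  # 1
```

Because every insertion goes left only for smaller values, the walk along indices 1, 2, 4, ... always ends at the minimum.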

For a tree to be considered as a binary search tree, it must satisfy the following property:
... the key in each node must be greater than all keys stored in the left sub-tree, and smaller than all keys in right sub-tree
Source: https://en.wikipedia.org/wiki/Binary_search_tree
The tree you posted is not a binary search tree because the root node (10) is not smaller than all keys in the right sub-tree (which contains 0).
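The property quoted above can be checked mechanically. The sketch below is one possible way to do it, assuming a tree is written as nested `(value, left, right)` tuples with `None` for a missing child; the name `is_bst` and the tuple layout are illustrative choices, not something from the question.

```python
import math

def is_bst(node, low=-math.inf, high=math.inf):
    """Return True if every key lies strictly between the bounds set by its ancestors."""
    if node is None:
        return True
    value, left, right = node
    if not (low < value < high):
        return False
    return is_bst(left, low, value) and is_bst(right, value, high)

# The tree from the question: 0 sits in the right sub-tree of 10.
posted = (10, (5, (1, None, None), (6, None, None)),
              (12, (0, None, None), (14, None, None)))
# The same shape with 0 replaced by 11 (a hypothetical repair).
fixed = (10, (5, (1, None, None), (6, None, None)),
             (12, (11, None, None), (14, None, None)))

print(is_bst(posted))  # False
print(is_bst(fixed))   # True
```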

I'm not really sure of your question, but binary search works by comparing the search-value to the value of the node, starting with the root node (value 10 here). If the search-value is less, it then looks at the left node of the root (value 5), otherwise it looks next at the right node (12).
It doesn't matter so much where in the tree the value is as long as the less and greater rule is followed.
In fact, you want to have trees set up like this (except for the bad 0 node), because the more balanced a tree is (number of nodes on left vs. number of nodes on right), the faster your search will be!
A tree balancing algorithm might, for example, look for the median value in a list of values and make that the value of the root node.
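As a rough illustration of both points (searching by comparison, and balancing by picking the median as root), here is a small sketch. It assumes the values are already sorted and represents nodes as `(value, left, right)` tuples; `build_balanced`, `contains` and `height` are hypothetical helper names.

```python
def build_balanced(sorted_values):
    """Build a balanced BST by making the median the root, recursively."""
    if not sorted_values:
        return None
    mid = len(sorted_values) // 2
    return (sorted_values[mid],
            build_balanced(sorted_values[:mid]),
            build_balanced(sorted_values[mid + 1:]))

def contains(node, target):
    """Search by comparing with each node and going left or right."""
    while node is not None:
        value, left, right = node
        if target == value:
            return True
        node = left if target < value else right
    return False

def height(node):
    return 0 if node is None else 1 + max(height(node[1]), height(node[2]))

tree = build_balanced([1, 5, 6, 10, 11, 12, 14])
print(tree[0])            # 10
print(height(tree))       # 3
print(contains(tree, 6))  # True
print(contains(tree, 0))  # False
```

With seven values the balanced tree has height 3, so a search inspects at most three nodes.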

Related

Size of a serialized complete binary tree

I'm trying to work out the size of a serialized binary tree having N nodes (also mentioned in Leetcode). This is how I calculate the size:
If we assume the storage required to store a value is V bits for each node, then the storage needed to store N nodes will be N.V. We also need to store NULL for the leaves; since there are exactly Ceiling(N/2) leaves in a complete tree, and assuming only one bit is enough to represent NULL, an additional 2 x Ceiling(N/2) bits will be required. 2 x Ceiling(N/2) translates to N+1, as in a complete tree N is always an odd number.
So, N.V + (N+1) bits are required in total.
However, I can see that in Leetcode and some other places (e.g. this), it's calculated as N.V + 2N.
What am I missing?
The two references you provided (LeetCode and blog article) deal with arbitrary binary trees, not necessarily complete. So let me first deal with arbitrary binary trees:
Although a NULL reference could be represented with one bit (e.g. with value 0), you also need to store the fact that a reference is not a NULL (value 1). You cannot just omit the bit, as then the next bit (belonging to a node value) could be misinterpreted as indicating a NULL reference. So you should not only count that bit for each NULL reference, but count it in for all branches.
The serialised format would for each node represent:
The node's value (𝑉 bits)
The fact whether or not its left child is a NULL (1 bit)
The fact whether or not its right child is a NULL (1 bit)
Example:
Let 𝑉 be 4
Tree to serialise:
    10
   /  \
  7    13
         \
          14
Serialisation process (level order):
node value   has left   has right   serialised   without spacing
10           yes        yes         1010 1 1     101011
7            no         no          0111 0 0     011100
13           no         yes         1101 0 1     110101
14           no         no          1110 0 0     111000
Complete:
101011011100110101111000
If we were only to store the 0 when there is a NULL, then we would get this:
101001110011010111000
    ^
But now the bit at the indicated position is ambiguous, because that bit could be interpreted as representing a NULL reference, but actually it is the first of 𝑉 bits 0111 representing the value 7.
It is however possible to shorten the serialised string by 2 bits: the very last 2 bits will always be 0 in a tree traversal that is guaranteed to end with a leaf. This is for example the case for level-order and pre-order traversals. So then you could just omit those 2 bits.
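The level-order format described above (𝑉 value bits followed by two presence bits per node) can be reproduced with a short sketch. The function name `serialise` and the `(value, left, right)` tuple representation are assumptions made for the example; the final two bits are kept rather than omitted.

```python
from collections import deque

def serialise(root, v_bits):
    """Level-order serialisation: value (v_bits), has-left bit, has-right bit."""
    out = []
    queue = deque([root] if root is not None else [])
    while queue:
        value, left, right = queue.popleft()
        out.append(format(value, f"0{v_bits}b"))
        out.append("1" if left is not None else "0")
        out.append("1" if right is not None else "0")
        if left is not None:
            queue.append(left)
        if right is not None:
            queue.append(right)
    return "".join(out)

# The example tree: 10 has children 7 and 13; 13 has a right child 14.
tree = (10, (7, None, None), (13, None, (14, None, None)))
bits = serialise(tree, 4)
print(bits)       # 101011011100110101111000
print(len(bits))  # 24, i.e. N*V + 2N with N = 4 and V = 4
```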
The case for complete binary trees
First of all about the definition of a complete binary tree. You write:
in a complete tree N is always an odd number.
I suppose then that your definition of a complete tree is what Wikipedia calls a perfect tree. We can however also look at (nearly) complete binary trees (and then 𝑁 is not necessarily odd).
For complete binary trees the case is simpler, as a level order traversal of a complete binary tree will never include NULLs, i.e. there are no "gaps" in such a traversal.
So you can just serialise the node's values in that order, giving each 𝑉 bits. This is actually the array representation that is used for binary heaps:
The parent / child relationship is defined implicitly by the elements' indices in the array.
If serialisation happens in a string data type that implicitly has a length attribute, then that's it. If there is no such meta data, then you need to prefix the value of 𝑁 in the serialisation, reserving a predefined number of bits for it. Alternatively, if there is a special value of 𝑉 bits that will never occur as actual node value, you could append it as a terminator (much like \0 in C-strings).
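To make the implicit parent / child relationship concrete, here is a minimal sketch using 0-based indices, as is common for binary heaps. The helper names `children` and `parent` and the sample values are assumptions for illustration only.

```python
def children(i, n):
    """Indices of the children of node i in a 0-based array of n nodes."""
    return [c for c in (2 * i + 1, 2 * i + 2) if c < n]

def parent(i):
    """Index of the parent of node i, or None for the root."""
    return None if i == 0 else (i - 1) // 2

values = [10, 7, 13, 3, 8, 11]   # level order of a complete tree, N = 6
v_bits = 4
bits = "".join(format(v, f"0{v_bits}b") for v in values)

print(children(0, len(values)))  # [1, 2]
print(children(2, len(values)))  # [5]
print(parent(5))                 # 2
print(len(bits))                 # 24, i.e. N*V with no NULL markers
```

Note that 𝑁 = 6 is even here, which is fine for a (nearly) complete tree.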

Finding minimum distance in a binary search tree

I have a binary search tree in which I am trying to find the minimum distance using the following characteristic:
distance = [a + b - x]
where a and b are nodes and x is a value given by the user. I'm not sure how to do this. I thought I would start at the root of the tree, use an in-order traversal to index all the nodes, and then maintain a separate array in which I can compare all the absolute values, but this seems inefficient.
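One possible way to make the idea above cheaper is sketched below. It rests on assumptions that the question does not confirm: that the distance is the absolute value |a + b - x|, and that a and b are the values of two distinct nodes. An in-order traversal of a BST yields the values in sorted order, so a two-pointer scan (a swapped-in technique, not something from the question) finds the best pair in linear time instead of comparing all pairs. The names `in_order` and `closest_pair` are hypothetical.

```python
def in_order(node, out):
    """Append the values of a (value, left, right) tuple tree in sorted order."""
    if node is not None:
        value, left, right = node
        in_order(left, out)
        out.append(value)
        in_order(right, out)
    return out

def closest_pair(root, x):
    """Return (distance, a, b) minimising |a + b - x|, or None if fewer than 2 nodes."""
    values = in_order(root, [])
    lo, hi = 0, len(values) - 1
    best = None
    while lo < hi:
        total = values[lo] + values[hi]
        candidate = (abs(total - x), values[lo], values[hi])
        if best is None or candidate[0] < best[0]:
            best = candidate
        if total < x:
            lo += 1      # need a larger sum
        elif total > x:
            hi -= 1      # need a smaller sum
        else:
            break        # exact match, cannot improve
    return best

tree = (10, (5, (1, None, None), (6, None, None)),
            (12, (11, None, None), (14, None, None)))
print(closest_pair(tree, 20))  # (0, 6, 14)
print(closest_pair(tree, 3))   # (3, 1, 5)
```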

Where to start with Binary Search Tree?

By my understanding, when performing a binary search you start with the middle value and apply a divide and conquer algorithm to it until you find the correct value.
However, when I looked at Binary Search Trees, it was my understanding that the search is done in the same way, with the initial node being the middle value. Yet I have seen examples built from unsorted lists where the first node is the first value in the array.
Which method is correct?
Thanks
Typically, you start with the middle node, then examine the left and right halves.
Divide and conquer algorithms approach the problem recursively by breaking the original problem into sub-problems of smaller size. The problem will be reduced down until it is small enough to be solved in a straightforward manner.
In the case of the Binary Search Tree, the algorithm takes the middle node, then recursively solves the left or the right sub-problem, depending on how the searched value compares with that node.
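def binary_search(arr, value):
    # arr must be sorted in ascending order
    return binary_search_aux(arr, value, 0, len(arr) - 1)

def binary_search_aux(arr, value, start, end):
    # Search arr[start..end], both bounds inclusive
    if start > end:
        return False
    mid = start + (end - start) // 2
    if value == arr[mid]:
        return True
    if value < arr[mid]:
        # value can only be in the left half
        return binary_search_aux(arr, value, start, mid - 1)
    # value can only be in the right half
    return binary_search_aux(arr, value, mid + 1, end)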

calculating binary tree internal nodes

I found a question related to full binary trees.
A full binary tree is a rooted tree in which every internal node has exactly two children. How many internal nodes are there in a full binary tree with 500 leaves?
I feel the answer is 250. Please explain.
Take any two leaves and combine them to create an internal node. Now you can increase the number of internal nodes by one and delete the two used leaves, which transforms that internal node into a new leaf.
Thus, if we call f(n) the number of internal nodes with n leaves, the previous argument leads us to f(n) = 1 + f(n - 1), where f(2) = 1. Therefore, f(n) = n - 1.
Thus, for 500 the result is 499.
If a full binary tree (T) has 500 leaves (L), then the number of internal nodes is I = L - 1, i.e. I = 500 - 1.
Result is 499.
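The merging argument can be replayed as a tiny simulation. This is only a sketch of that counting argument (the function name `internal_nodes` is made up for the example), not a proof.

```python
def internal_nodes(leaves):
    """Count internal nodes by merging two leaves into one until one node remains."""
    count = 0
    while leaves > 1:
        leaves -= 1   # two leaves are replaced by one new leaf (their parent)
        count += 1    # that parent is an internal node of the original tree
    return count

print(internal_nodes(2))    # 1
print(internal_nodes(500))  # 499
```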

Gain maximization on trees

Consider a tree in which each node is associated with a system state and contains a sequence of actions that are performed on the system.
The root is an empty node associated with the original state of the system. The state associated with a node n is obtained by applying the sequence of actions contained in n to the original system state.
The sequence of actions of a node n is obtained by queuing a new action to the parent's sequence of actions.
Moving from a node to another (i.e., adding a new action to the sequence of actions) produces a gain, which is attached to the edge connecting the two nodes.
Some "math":
each system state S is associated with a value U(S)
the gain achieved by a node n associated with the state S cannot be greater than U(S) or smaller than 0
If n and m are nodes in the tree and n is the parent of m, U(n) - U(m) = g(n,m), i.e., the gain on the edge between n and m represents the reduction of U from n to m
See the figure for an example.
My objective is to find the path in the tree that guarantees the highest gain (where the gain of a path is computed by summing all the gains of the edges on the path):
Path* = arg max_{path} (sum g(n,m), for each adjacent n,m in path)
Notice that the tree is NOT known at the beginning, so the best option would be a solution that does not require visiting the entire tree to find the optimal solution (discarding those paths that certainly do not lead to it).
NOTE: I obtained an answer here and here for a similar problem in offline mode, i.e., when the graph was known. However, in this context the tree is not known and thus algorithms such as Bellman-Ford would perform no better than a brute-force approach (as suggested). Instead, I would like to build something that resembles backtracking without building the entire tree to find the best solution (branch and bound?).
EDIT: U(S) becomes smaller and smaller as depth increases.
As you have noticed, branch and bound can be used to solve your problem. Just expand the nodes that seem the most promising until you find complete solutions, while keeping track of the best known solution. If a node has a U(S) lower than the best known solution during the process, just skip it. When you have no more nodes, you are done.
Here is an algorithm:
pending_nodes <- (root)
best_solution <- nothing
while pending_nodes is not empty
    Drop the node n from pending_nodes having the highest U(n) + gain(n)
    if n is a leaf
        if best_solution = nothing
            best_solution <- n
        else if gain( best_solution ) < gain( n )
            best_solution <- n
        end if
    else
        if best_solution ≠ nothing
            if U(n) + gain(n) < gain(best_solution)
                stop. best_solution is the best
            end if
        end if
        append the children of n to pending_nodes
    end if
end while
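A minimal Python sketch of this best-first branch and bound might look as follows. Several things here are assumptions rather than part of the answer: U(n) is treated as an upper bound on the gain still obtainable below n, gain(n) is the gain accumulated from the root to n, children are produced lazily by a caller-supplied `expand` function, and the bound test is applied to every popped node rather than only to non-leaves. The example tree, its gains and its bounds are invented purely for illustration.

```python
import heapq
from itertools import count

def branch_and_bound(root, U, expand):
    """Best-first search for the leaf with the highest accumulated gain.

    U(node): assumed upper bound on the gain still obtainable below node.
    expand(node): returns a list of (child, edge_gain); called only on visited nodes.
    """
    tie = count()  # tie-breaker so the heap never compares nodes directly
    # heapq is a min-heap, so the bound U(n) + gain(n) is stored negated
    pending = [(-U(root), next(tie), 0, root, [root])]
    best_gain, best_path = None, None
    while pending:
        neg_bound, _, gain, node, path = heapq.heappop(pending)
        if best_gain is not None and -neg_bound < best_gain:
            break  # no pending node can beat the best known solution
        children = expand(node)
        if not children:  # leaf: a complete solution
            if best_gain is None or gain > best_gain:
                best_gain, best_path = gain, path
            continue
        for child, edge_gain in children:
            g = gain + edge_gain
            entry = (-(U(child) + g), next(tie), g, child, path + [child])
            heapq.heappush(pending, entry)
    return best_gain, best_path

# Hypothetical example tree: node -> list of (child, edge gain)
EDGES = {"root": [("a", 4), ("b", 1)], "a": [("a1", 3), ("a2", 1)], "b": [("b1", 1)]}
# Hypothetical upper bounds on the gain still obtainable below each node
POTENTIAL = {"root": 8, "a": 3, "b": 2, "a1": 0, "a2": 0, "b1": 0}
expanded = []

def expand(node):
    expanded.append(node)  # record which nodes were actually generated
    return EDGES.get(node, [])

result = branch_and_bound("root", POTENTIAL.get, expand)
print(result)    # (7, ['root', 'a', 'a1'])
print(expanded)  # ['root', 'a', 'a1']
```

In this toy run the subtree under `b` is never expanded, because its bound (3) is below the gain of the solution already found (7).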