Binary Search Tree inorder predecessor space complexity

I am studying binary search trees and have not been able to find much information on the space required to find the predecessor of a given node. Based on an iterative approach, I believe I would need O(1) space (in-place), because we only need one variable plus a single node on a stack. To accomplish this recursively, we would have to maintain a call stack. Since we may have to traverse down to the leftmost (minimum) node, we could traverse the entire height of the binary search tree. Therefore, the space complexity would be O(h), where h is the height of the tree.
Are these assumptions correct or am I missing anything?

Keep in mind that each recursive call decreases the height, and there is only a single call for each height value. Therefore, we can perform an iterative search.
Let n1, n2 be nodes such that n2 is the root and n1 is null.
Let v be the node you are looking for.
While n2 is not v:
    if v.value > n2.value: n1 := n2, n2 := n2.right
    else: n2 := n2.left
If v.left is not null:
    n1 := the rightmost node of the subtree rooted at v.left
return n1
I've kept a constant number (2) of pointers, therefore the complexity is O(1).
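A minimal Python sketch of the constant-space idea, assuming distinct keys and a hypothetical Node class (neither comes from the question). Note that it remembers the last node at which the search turned right, not the immediate parent, and it checks v's left subtree first:

```python
class Node:
    """Hypothetical BST node used only for this illustration."""
    def __init__(self, value, left=None, right=None):
        self.value = value
        self.left = left
        self.right = right


def predecessor(root, v):
    """Return the in-order predecessor of node v, or None if v is the minimum."""
    if v.left is not None:
        # The predecessor is the rightmost node of v's left subtree.
        node = v.left
        while node.right is not None:
            node = node.right
        return node
    # Otherwise it is the last ancestor at which the search for v went right.
    pred, node = None, root
    while node is not v:
        if v.value > node.value:
            pred, node = node, node.right
        else:
            node = node.left
    return pred
```

Only a fixed number of references (pred and node) are kept, so the extra space is O(1), while the running time is still O(h).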

BFS bad complexity

I am using adjacency lists to represent a graph in OCaml. I then wrote the following implementation of BFS, starting at node s.
let bfs graph s =
  let size = Array.length graph in
  let seen = Array.make size false and next = [s] in
  let rec aux = function
    | [] -> ()
    | t :: q ->
      if not seen.(t) then begin
        seen.(t) <- true;
        aux (q @ graph.(t))
      end else aux q
  in aux next
size is the number of nodes in the graph, seen is an array where seen.(t) = true if we have already seen node t, and next is the list of nodes still to visit.
The thing is that BFS normally has linear time complexity, O(V + E), yet I feel that my implementation does not. If I am not mistaken, q @ graph.(t) is expensive because it is O(|q|). So my complexity is quite bad, since at each step I concatenate two lists, which is costly.
So how can I adapt this code to make an efficient BFS? The problem (I think) comes from implementing a queue with lists. Does adding an element with OCaml's Queue module take O(1)? If so, how can I use this module to make my BFS work, given that I can't pattern match on a Queue as easily as on a list?
"the complexity of q @ graph.(t) is quite big since it's O(|q|). So my complexity is quite bad since at each step I am concatenating two lists and this is heavy in time."
You are absolutely right: this is the bottleneck of your BFS. You can use the Queue module, because according to https://ocaml.org/learn/tutorials/comparison_of_standard_containers.html inserting and taking elements are both O(1).
One of the differences between queues and lists in OCaml is that queues are mutable structures, so you will need to use impure functions such as add, take and top, which respectively insert an element in place, pop the element at the front, and return the first element without removing it.
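To illustrate the queue-based loop, here is a rough sketch in Python rather than OCaml, where collections.deque plays the role of Queue (append and popleft are both O(1), like add and take). The function returns the visiting order, which is an addition for illustration only:

```python
from collections import deque


def bfs(graph, s):
    """Return the vertices reachable from s in breadth-first order.

    graph is an adjacency list: graph[t] lists the neighbours of vertex t.
    """
    seen = [False] * len(graph)
    order = []
    queue = deque([s])
    while queue:
        t = queue.popleft()          # O(1), unlike taking the head after q @ ...
        if not seen[t]:
            seen[t] = True
            order.append(t)
            queue.extend(graph[t])   # O(len(graph[t])), independent of len(queue)
    return order
```

Each vertex is expanded at most once and each adjacency list is pushed at most once, so the total work is O(V + E).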
"If I am not mistaken the complexity of q @ graph.(t) is quite big since it's O(|q|)."
That is indeed the problem. What you should be using is graph.(t) @ q, whose complexity is O(|graph.(t)|). Be aware that prepending makes the list behave like a stack, so the nodes are then visited in depth-first rather than breadth-first order; a true BFS needs a queue with O(1) insertion at the back.
You might ask: what difference does that make?
The difference is that |q| can be anything from 0 to E, and you pay that cost at every step. |graph.(t)|, on the other hand, you can work with: you expand every vertex in the graph at most once, so overall the complexity will be
O(\sum_{v \in V} |graph.(v)|)
that is, the sum of the adjacency-list lengths over all vertices, or in other words E.
That brings you to the overall complexity of O(V + E).

How to effectively get the N lowest values from the collection (Top N) in Kotlin?

Is there any other way besides collectionOrSequence.sortedBy { it.value }.take(n)?
Assume I have a collection with more than 100500 elements and I need to find the 10 lowest. I'm afraid that sortedBy will create a new temporary collection, from which only 10 items will then be taken.
You could keep a list of the n smallest elements and just update it on demand, e.g.
fun <T : Comparable<T>> top(n: Int, collection: Iterable<T>): List<T> {
    return collection.fold(ArrayList<T>()) { topList, candidate ->
        if (topList.size < n || candidate < topList.last()) {
            // ideally insert at the right place
            topList.add(candidate)
            topList.sort()
            // trim to size
            if (topList.size > n)
                topList.removeAt(n)
        }
        topList
    }
}
That way you compare each element of your collection only once against the largest of the current top n elements, which is usually faster than sorting the entire list: https://pl.kotl.in/SyQPtDTcQ
If you're running on the JVM, you could use Guava's Comparators.least(int, Comparator), which uses a more efficient algorithm than any of these suggestions, taking O(n + k log k) time and O(k) memory to find the lowest k elements in a collection of size n, as opposed to zapl's algorithm (O(nk log k)) or Lior's (O(nk)).
You have more to worry about.
collectionOrSequence.sortedBy { it.value } runs java.util.Arrays.sort, which runs TimSort (or merge sort if requested).
TimSort is great, but it usually needs about n*log(n) operations, which is much more than the O(n) cost of copying the array.
Each of the O(n log n) operations will call a function (the lambda you provided, { it.value }), which is a significant additional overhead.
Lastly, java.util.Arrays.sort will convert the collection to an array and back to a List: 2 additional conversions (which you wanted to avoid, but this is secondary).
The efficient way to do it is probably:
1. Map the values for comparison into a list: O(n) conversions (once per element) rather than O(n log n) or more.
2. Iterate over the list (or array) created, collecting the N smallest elements in one pass:
   - Keep a list of the N smallest elements found so far and their indices in the original list. If it is small (e.g. 10 items), a mutableList is a good fit.
   - Keep a variable holding the max value of the small-element list.
   - While iterating over the original collection, compare the current element against the max value of the small-element list. If it is smaller, replace the max in the "small list" and find the updated max value in it.
3. Use the indices from the "small list" to extract the N smallest elements of the original list.
That would allow you to go from O(n log n) to O(n) for a small, fixed N.
Of course, if time is critical - it is always best to benchmark the specific case.
If you managed, on the first step, to extract primitives for the basis of comparison (e.g. int or long) - that would be even more efficient.
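The one-pass approach described above could look roughly like this. It is a sketch in Python rather than Kotlin, and the function name n_smallest and its key parameter are made up for illustration:

```python
def n_smallest(items, n, key=lambda x: x):
    """Return the n items of the list `items` with the smallest keys, ascending."""
    if n <= 0:
        return []
    keys = [key(item) for item in items]   # extract each key exactly once
    small = []    # (key, original index) pairs of the best candidates so far
    max_pos = 0   # position in `small` of the current largest key
    for index, k in enumerate(keys):
        if len(small) < n:
            small.append((k, index))
            if k > small[max_pos][0]:
                max_pos = len(small) - 1
        elif k < small[max_pos][0]:
            # Replace the current maximum, then rescan the small list: O(n) work.
            small[max_pos] = (k, index)
            max_pos = max(range(n), key=lambda i: small[i][0])
    small.sort()  # only the n survivors are sorted
    return [items[index] for _, index in small]
```

For example, n_smallest(["ccc", "a", "bb"], 2, key=len) returns ["a", "bb"]. The cost is O(len(items) * n) in the worst case, which behaves like a linear scan when n is small; for larger n, a bounded heap (as in Guava's Comparators.least or Python's heapq.nsmallest) is the usual alternative.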
I suggest implementing your own sort method based on a typical quicksort algorithm (in descending order, taking the first N elements) if the collection has 1k+ values spread randomly.

Where to start with Binary Search Tree?

By my understanding, when performing a binary search you start with the middle value and apply a divide-and-conquer algorithm until you find the correct value.
However, when I looked at binary search trees, I understood that they work the same way, with the initial node being the middle value. Yet I have seen examples built from unsorted lists where the first node is simply the first value in the array.
Which method is correct?
Thanks
For a binary search on a sorted array, you start with the middle element and then continue in the left or right half. In a binary search tree you start at the root instead; in a plain (non-self-balancing) tree the root is simply the first value inserted, so it is the middle value only if the tree was built or rebalanced that way.
Divide and conquer algorithms approach the problem recursively by breaking the original problem into sub-problems of smaller size. The problem is reduced until it is small enough to be solved in a straightforward manner.
In the case of binary search on a sorted array, the algorithm takes the middle element, then recursively solves the left or right sub-problem, whichever can still contain the value.
BinarySearch(Array arr, value)
    return BinarySearchAux(arr, value, 0, arr.length - 1)
BinarySearchAux(Array arr, value, start, end)
    if start > end
        return false
    mid = start + floor((end - start) / 2)
    if value == arr[mid]
        return true
    if value < arr[mid]
        return BinarySearchAux(arr, value, start, mid - 1)
    return BinarySearchAux(arr, value, mid + 1, end)
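To show why examples built from unsorted lists start with the first value, here is a small Python sketch of a plain, non-balancing BST. The names Node, insert, contains and build are illustrative only:

```python
class Node:
    def __init__(self, value):
        self.value = value
        self.left = None
        self.right = None


def insert(root, value):
    """Insert value into the tree and return the (possibly new) root."""
    if root is None:
        return Node(value)
    node = root
    while True:
        if value < node.value:
            if node.left is None:
                node.left = Node(value)
                return root
            node = node.left
        else:
            if node.right is None:
                node.right = Node(value)
                return root
            node = node.right


def contains(root, value):
    """Search always starts at the root and follows one branch per step."""
    node = root
    while node is not None:
        if value == node.value:
            return True
        node = node.left if value < node.value else node.right
    return False


def build(values):
    """Build a tree by inserting the values in the order given."""
    root = None
    for value in values:
        root = insert(root, value)
    return root
```

With build([7, 2, 9, 4]) the root holds 7, the first value inserted. Both approaches are valid; the search is only guaranteed to halve the remaining candidates at each step when the tree is balanced.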

Return highest or lowest value in Z notation (formal methods)

I am new to Z notation.
Let's say I have a function f defined as X |--> Y,
where X is a string and Y is a number.
How can I get the highest Y value of this function? Do loops exist in formal methods, so that I could solve it with a loop?
I know there is recursion in Z notation, but in the material I have, I only found it applied to multisets (bags). Can it be applied to functions?
Any extra reference on applying loops or recursion would be appreciated. Sorry for my English.
You can just use the predefined function max that takes a set of integers as input and returns the maximum number. The input values here are the range (the set of all values) of the function:
max(ran(f))
Please note that the maximum is not defined for empty sets.
Regarding your question about recursion or loops: you can define a function recursively, but I think your question is really about a way to compute something. This is not easily expressed in Z, and in my opinion that is a good thing, because Z is used for specifications and is not a programming language. Even if there were no max or ran function, you could still specify the number m you are looking for by:
\exists s:String @ (s,m):f /\
\forall s2:String, i2:Z @ (s2,i2):f ==> i2 <= m
("m is a value of f, belonging to some s, and all other values i2 of f are smaller or equal")
Once you get used to the style, it is usually far easier to understand than any programming language (unless you are trying to describe an algorithm itself and not its expected outcome).
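For readers who think in code, the difference between computing the maximum and specifying it can be mimicked in Python. This is only an analogy: f is modelled as a hypothetical finite dict, and satisfies_max_spec is an invented name that checks the predicate rather than computing anything:

```python
def satisfies_max_spec(f, m):
    """Check the specification: m is a value of f and no value of f exceeds m."""
    # exists s @ (s, m) : f
    is_value = any(value == m for value in f.values())
    # forall s2, i2 @ (s2, i2) : f ==> i2 <= m
    is_upper_bound = all(value <= m for value in f.values())
    return is_value and is_upper_bound


f = {"a": 3, "b": 7, "c": 5}   # hypothetical finite function String -> Z
m = max(f.values())            # counterpart of max(ran(f)); fails for an empty f
```

Exactly one number satisfies the predicate for a non-empty f, and it is the one max returns; for an empty f, no m satisfies it, which matches the maximum being undefined there.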
Just for reference: an example of a recursive definition (let's call it rmax) for the maximum would consist of a base case:
\forall e:Z @ rmax({e}) = e
and a recursive case:
\forall e:Z; S:\pow(Z) @
S \neq {} \implies
rmax({e} \cup S) = \IF e > rmax(S) \THEN e \ELSE rmax(S)
But note that this is still not a "computation rule" for rmax, because e in the second rule can be an arbitrary element of S. In more complex scenarios it might not even be obvious that the defined relation is a function at all, because different results could be computed depending on the chosen elements.
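If one does want an executable reading of the two rules, a sketch in Python might look as follows. It makes one choice the Z definition leaves open: it always removes the picked element e from the set before recursing, so that the recursion terminates:

```python
def rmax(values):
    """Maximum of a non-empty finite set, following the base and recursive cases."""
    values = set(values)
    if not values:
        raise ValueError("rmax is undefined for the empty set")
    e = next(iter(values))     # an arbitrary element; the result must not depend on it
    rest = values - {e}
    if not rest:               # base case: rmax({e}) = e
        return e
    best_of_rest = rmax(rest)  # recursive case: compare e with rmax(S)
    return e if e > best_of_rest else best_of_rest
```

Whichever element is picked first, the result is the same, which is exactly the property one would have to prove to show that the recursive definition describes a function.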

Gain maximization on trees

Consider a tree in which each node is associated with a system state and contains a sequence of actions that are performed on the system.
The root is an empty node associated with the original state of the system. The state associated with a node n is obtained by applying the sequence of actions contained in n to the original system state.
The sequence of actions of a node n is obtained by queuing a new action to the parent's sequence of actions.
Moving from a node to another (i.e., adding a new action to the sequence of actions) produces a gain, which is attached to the edge connecting the two nodes.
Some "math":
each system state S is associated with a value U(S)
the gain achieved by a node n associated with the state S cannot be greater than U(S) or smaller than 0
If n and m are nodes in the tree and n is the parent of m, U(n) - U(m) = g(n,m), i.e., the gain on the edge between n and m represents the reduction of U from n to m
See the figure for an example.
My objective is to find the path in the tree that guarantees the highest gain (where the gain of a path is computed by summing the gains of all the edges on the path):
Path^* = \arg\max_{path} \sum_{(n,m) \in path} g(n,m), where (n,m) ranges over the adjacent nodes on the path
Notice that the tree is NOT known at the beginning, so the best option would be a solution that does not need to visit the entire tree, discarding those paths that certainly do not lead to the optimal solution.
NOTE: I obtained an answer here and here for a similar problem in offline mode, i.e., when the graph was known. However, in this context the tree is not known, and thus algorithms such as Bellman-Ford would perform no better than a brute-force approach (as suggested). Instead, I would like to build something that resembles backtracking, without building the entire tree, to find the best solution (branch and bound?).
EDIT: U(S) becomes smaller and smaller as depth increases.
As you have noticed, branch and bound can be used to solve your problem. Expand the nodes that seem the most promising until you find complete solutions, while keeping track of the best known solution. If, during the process, a node has U(n) + gain(n) lower than the gain of the best known solution, skip it. When no nodes are left, you are done.
Here is an algorithm:
pending_nodes <- (root)
best_solution <- nothing
while pending_nodes is not empty
    remove from pending_nodes the node n with the highest U(n) + gain(n)
    if n is a leaf
        if best_solution = nothing
            best_solution <- n
        else if gain(best_solution) < gain(n)
            best_solution <- n
        end if
    else
        if best_solution ≠ nothing
            if U(n) + gain(n) < gain(best_solution)
                stop. best_solution is the best
            end if
        end if
        append the children of n to pending_nodes
    end if
end while
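A possible Python rendering of this loop, as a sketch under a few assumptions that are not in the question: the tree is explored lazily through a hypothetical callback children(node) that returns (child, edge_gain) pairs (an empty list marks a leaf), and upper(node) returns U(node), taken to be an upper bound on the gain still obtainable below that node:

```python
import heapq
from itertools import count


def best_leaf(root, children, upper):
    """Best-first branch and bound; returns (best_gain, best_node)."""
    tie = count()  # tie-breaker so the heap never compares nodes directly
    # heapq is a min-heap, so the bound U(n) + gain(n) is stored negated.
    pending = [(-upper(root), next(tie), 0, root)]
    best_gain, best_node = None, None
    while pending:
        neg_bound, _, gain, node = heapq.heappop(pending)
        if best_gain is not None and -neg_bound < best_gain:
            break  # no pending node can beat the best known leaf
        kids = children(node)  # the tree is only generated on demand
        if not kids:
            if best_gain is None or gain > best_gain:
                best_gain, best_node = gain, node
            continue
        for child, edge_gain in kids:
            child_gain = gain + edge_gain
            bound = upper(child) + child_gain
            heapq.heappush(pending, (-bound, next(tie), child_gain, child))
    return best_gain, best_node
```

The priority queue makes "take the node with the highest U(n) + gain(n)" an O(log k) operation for k pending nodes. How much of the tree is pruned depends entirely on how tight U is as a bound.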