Binary Search Trees queries - binary-search-tree

I have these couple of questions:
1) Given a BST of floats, find the highest number just below a given float value.
2) Implement a binary search tree for floating-point values.
My ideas: I thought a greedy search from the root toward the given value would give the right answer for 1). For 2), I would basically consider subtrees of depth equal to the precision of the value. This would give a standard BST, but with subtrees used to access the floating-point data points.
Let me know if these are correct.

I don't think there is a significant difference between a BST with integer nodes and one with floating-point nodes, and the answers to 1) and 2) are straightforward. Do an in-order traversal of the BST, keeping track of the highest number below the given float value, until you encounter a value that is greater than or equal to the given value or the traversal is done.
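As a rough illustration, here is a minimal Python sketch of question 1). Note that it uses a root-to-leaf descent (the "greedy" idea from the question) rather than a full in-order traversal; both give the same result, but the descent only touches one path of the tree. The names Node, insert and largest_below are made up for this example, and the tree is a plain unbalanced BST.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    key: float
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def insert(root: Optional[Node], key: float) -> Node:
    """Standard (unbalanced) BST insert; floats are compared like any other key."""
    if root is None:
        return Node(key)
    if key < root.key:
        root.left = insert(root.left, key)
    else:
        root.right = insert(root.right, key)
    return root


def largest_below(root: Optional[Node], target: float) -> Optional[float]:
    """Return the largest key strictly less than target, or None if there is none."""
    best = None
    node = root
    while node is not None:
        if node.key < target:
            best = node.key      # candidate; a better one can only be to the right
            node = node.right
        else:
            node = node.left     # this key is too large; look at smaller keys
    return best


root = None
for k in [5.5, 2.25, 8.0, 1.5, 3.75, 6.1, 9.9]:
    root = insert(root, k)

print(largest_below(root, 6.0))  # 5.5
print(largest_below(root, 1.5))  # None
```

Nothing here is specific to floats, which is the point of the answer above: the same code works for any keys that support <.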

Related

Binary Search: Number of comparisons in the worst case

I am trying to figure out the number of comparisons that binary search does on an array of a given size in the worst case.
Let's say there is an array A with 123,456 elements (or any other number). Binary search is applied to find some element E. The comparison is to determine whether A[i] = E. How many times would this comparison be executed in the worst case?
According to this post, the number of worst case comparisons is 2logn+1.
Result: 50
According to this post, the max. number of binary search comparisons is log2(n+1).
Result: 25
According to this post, the number of comparisons is 2logn-1.
Result: 50
I am confused by the different answers. Can anyone tell me which one is correct and how I can determine the maximum number of comparisons in the worst case?
According to this Wiki page:
In the worst case, binary search makes floor(log2(n)+1) iterations of the comparison loop, where the floor notation denotes the floor function that yields the greatest integer less than or equal to the argument, and log2 is the binary logarithm. This is because the worst case is reached when the search reaches the deepest level of the tree, and there are always floor(log2(n)+1) levels in the tree for any binary search.
Also, it's not enough to consider only the comparisons A[i] = E. Binary search also includes comparisons E <= A[mid], where mid is the midpoint of the index interval.
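To check the formula for the array size from the question, here is a small Python sketch that counts both loop iterations and element comparisons. The function name binary_search_counting is made up for this example, and the search key is chosen to be larger than every element, which is one worst-case input for this particular midpoint convention.

```python
import math


def binary_search_counting(a, key):
    """Classic binary search that also reports loop iterations and element comparisons."""
    lo, hi = 0, len(a) - 1
    iterations = 0
    comparisons = 0
    while lo <= hi:
        iterations += 1
        mid = (lo + hi) // 2
        comparisons += 1          # the equality test a[mid] == key
        if a[mid] == key:
            return mid, iterations, comparisons
        comparisons += 1          # the ordering test a[mid] < key
        if a[mid] < key:
            lo = mid + 1
        else:
            hi = mid - 1
    return -1, iterations, comparisons


n = 123_456
a = range(n)                      # sorted values 0..n-1; range supports indexing
index, iterations, comparisons = binary_search_counting(a, n)  # key is not present

print(iterations, comparisons)           # 17 34
print(math.floor(math.log2(n)) + 1)      # 17
```

So the loop runs floor(log2(n)) + 1 = 17 times for n = 123,456, and a version that does two element comparisons per iteration performs about twice that many comparisons, which is where the "2 log n" style formulas come from.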

Why is the time complexity of binary search logN but the time complexity of a BST is N?

In Algorithms, 4th edition by Robert Sedgewick, the time complexity table for different algorithms is given as:
Based on this table, the searching time complexity of a BST is N, and of binary search in and of itself is logN.
What is the difference between the two? I have seen explanations of these separately and they made sense. However, I can't seem to understand why the searching time complexity of a BST isn't logN, as we are searching by continually breaking the tree in half and ignoring the other parts.
From binary-search-trees-bst-explained-with-examples
...on average, each comparison allows the operations to skip about half of the tree, so that each lookup, insertion or deletion takes time proportional to the logarithm of the number of items stored in the tree, O(log n). However, sometimes the worst case can happen, when the tree isn't balanced and the time complexity is O(n) for all three of these functions.
So, you kind of expect log(N) but it's not absolutely guaranteed.
the searching time complexity of a BST is N, and of binary search in and of itself is logN. What is the difference between the two?
The difference is that a binary search on a sorted array always starts at the middle element (i.e. the median when n is odd). This cannot be guaranteed in a BST. The root might be the middle element, but it doesn't have to be.
For instance, this is a valid BST:
        10
       /
      8
     /
    5
   /
  2
 /
1
...but it is not a balanced one, and so the process of finding the value 1 given the root of that tree, will include visiting all its nodes. If however the same values were presented in a sorted list (1,2,5,8,10), a binary search would start at 5 and never visit 8 or 10.
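The comparison above can be reproduced with a short Python sketch. The helper names (Node, bst_insert, bst_search_path, binary_search_path) are invented for this illustration; the BST is built by inserting the values in the order 10, 8, 5, 2, 1, which produces exactly the left-leaning tree shown.

```python
class Node:
    def __init__(self, key):
        self.key = key
        self.left = None
        self.right = None


def bst_insert(root, key):
    """Plain BST insert without any rebalancing."""
    if root is None:
        return Node(key)
    if key < root.key:
        root.left = bst_insert(root.left, key)
    else:
        root.right = bst_insert(root.right, key)
    return root


def bst_search_path(root, key):
    """Return the keys visited while searching the BST for key."""
    visited = []
    node = root
    while node is not None:
        visited.append(node.key)
        if key == node.key:
            break
        node = node.left if key < node.key else node.right
    return visited


def binary_search_path(a, key):
    """Return the values probed while binary-searching the sorted list a for key."""
    probed = []
    lo, hi = 0, len(a) - 1
    while lo <= hi:
        mid = (lo + hi) // 2
        probed.append(a[mid])
        if a[mid] == key:
            break
        if a[mid] < key:
            lo = mid + 1
        else:
            hi = mid - 1
    return probed


root = None
for k in [10, 8, 5, 2, 1]:
    root = bst_insert(root, k)

print(bst_search_path(root, 1))                  # [10, 8, 5, 2, 1]
print(binary_search_path([1, 2, 5, 8, 10], 1))   # [5, 1]
```

The BST search walks through all five nodes, while the binary search on the sorted list probes only 5 and 1.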
Adding self-balancing trees to the table
We can extend the given table with self-balancing search trees, like AVL, and then we get this:
| implementation | search | insert | delete |
| --- | --- | --- | --- |
| sequential search (unordered list) | N | N | N |
| binary search (ordered array) | lg N | N | N |
| BST | N | N | N |
| AVL | lg N | lg N | lg N |

What is the maximum possible height when the binary search tree has n nodes?

Is there a mathematical formula for the maximum possible height of a tree with exactly n nodes?
It depends on the order of insertion. If you are not using a self-balancing binary search tree (such as an AVL tree or a red-black tree), the height of the tree depends on the inputs you give. In the worst case, the height equals the number of nodes, n (counting levels; n - 1 if you count edges). This happens when each inserted value is greater than the previous one, or each is less than the previous one. If you need more info, please describe the specific use case behind this question.
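A quick Python sketch makes the dependence on insertion order visible. The helpers (Node, insert, build, height, median_first) are hypothetical names for this example, and height is counted in levels, so a single node has height 1.

```python
class Node:
    def __init__(self, key):
        self.key = key
        self.left = None
        self.right = None


def insert(root, key):
    """Plain BST insert without rebalancing."""
    if root is None:
        return Node(key)
    if key < root.key:
        root.left = insert(root.left, key)
    else:
        root.right = insert(root.right, key)
    return root


def build(keys):
    root = None
    for k in keys:
        root = insert(root, k)
    return root


def height(root):
    """Height counted in levels (an empty tree has height 0)."""
    if root is None:
        return 0
    return 1 + max(height(root.left), height(root.right))


def median_first(sorted_keys):
    """Reorder sorted keys so that each subtree's median is inserted first."""
    if not sorted_keys:
        return []
    mid = len(sorted_keys) // 2
    return ([sorted_keys[mid]]
            + median_first(sorted_keys[:mid])
            + median_first(sorted_keys[mid + 1:]))


keys = list(range(1, 16))                  # 15 keys: 1..15

print(height(build(keys)))                 # 15 (ascending inserts: a chain)
print(height(build(median_first(keys))))   # 4  (best case for 15 nodes)
```

The same 15 keys give a height anywhere from 4 to 15 depending only on the order in which they arrive.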

Big numbers in Redis Sorted Set

I would like to store values as a score in a Redis sorted set that can be as big as 10^24 (and if possible even 2^256).
What are the integer size limits with ZRANGE?
For some context I'm trying to implement a ranking of top holders for a custom ethereum token. e.g. https://etherscan.io/token/0xdac17f958d2ee523a2206206994597c13d831ec7#balances
I want to hold the balances in a Redis DB and access it through node.js. I can retrieve the actual balances using web3, in case the DB crashes or something. The point is I would like to have the data sorted and I would like to be able to access the data blazingly fast.
Quotation from the Redis documentation about sorted sets:
Range of integer scores that can be expressed precisely
Redis sorted sets use a double 64-bit floating point number to represent the score. In all the architectures we support, this is represented as an IEEE 754 floating point number, that is able to represent precisely integer numbers between -(2^53) and +(2^53) included. In more practical terms, all the integers between -9007199254740992 and 9007199254740992 are perfectly representable. Larger integers, or fractions, are internally represented in exponential form, so it is possible that you get only an approximation of the decimal number, or of the very big integer, that you set as score.
So if you leave the precise range and an approximation of the score is good enough for your use case, Wikipedia says that the highest possible exponent is 1023, which means scores up to just under 2^1024 (about 1.8 × 10^308).
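The effect of the 2^53 limit can be seen without Redis at all, since it is a property of IEEE 754 doubles. The following Python sketch assumes that Python's float is such a double (true on all common platforms); the variable names are made up for this example.

```python
import sys

LIMIT = 2 ** 53                      # 9007199254740992

# Every integer up to 2^53 survives a round trip through a double.
print(int(float(LIMIT - 1)) == LIMIT - 1)      # True

# Above the limit, neighbouring integers start to collapse onto the same double.
print(float(LIMIT) == float(LIMIT + 1))        # True

# A balance of 10^24 is stored only approximately ...
balance = 10 ** 24
print(int(float(balance)) == balance)          # False

# ... and two balances that differ by 1 get the same score.
print(float(balance) == float(balance + 1))    # True

# 2^256 still fits in the exponent range (the largest finite double is about 1.8e308).
print(float(2 ** 256) < sys.float_info.max)    # True
```

So scores of 10^24 or 2^256 can be stored, but holders whose balances differ by less than the spacing between adjacent doubles at that magnitude would get identical scores.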

Which is faster, a ternary search tree or a binary search tree?

A ternary search tree requires O(log(n) + k) comparisons, where n is the number of strings and k is the length of the string being searched for, while a binary search tree requires log(n) comparisons. Why, then, is a TST faster than a BST?
Because in the ternary case it is log3(n) where in the binary case it is log2(n).
Ternary search trees are specifically designed to store strings, so our analysis is going to need to take into account that each item stored is a string with some length. Let's say that the length of the longest string in the data structure is L and that there are n total strings.
You're correct that a binary search tree makes only O(log n) comparisons when doing a lookup. However, since the items stored in the tree are all strings, each one of those comparisons takes time O(L) to complete. Consequently, the actual runtime of doing a search with a binary search tree in this case would be O(L log n), since there are O(log n) comparisons costing O(L) time each.
Now, let's consider a ternary search tree. With a standard TST implementation, for each character of the input string to look up, we do a BST lookup to find the tree to descend into. This takes time O(log n) and we do it L times for a total runtime of O(L log n), matching the BST lookup time. However, you can actually improve upon this by replacing the standard BSTs in the ternary search tree with weight-balanced trees whose weight is given by the number of strings in each subtree. A more careful analysis can then be used to show that the total runtime for a lookup will be O(L + log n), which is significantly faster than a standard BST lookup.
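For readers who have not seen one, here is a minimal Python sketch of a plain ternary search tree. It is the standard, unbalanced variant, not the weight-balanced one described above, so it only illustrates the structure: each node holds one character, the lo/hi links form a small BST over the characters at one position, and the eq link moves on to the next character. The names TSTNode, tst_insert and tst_contains are made up for this example, and words are assumed to be non-empty.

```python
class TSTNode:
    def __init__(self, ch):
        self.ch = ch
        self.lo = None        # characters smaller than ch at this position
        self.eq = None        # next character of words that match ch here
        self.hi = None        # characters larger than ch at this position
        self.is_end = False   # True if a stored word ends at this node


def tst_insert(node, word, i=0):
    """Insert a non-empty word; returns the (possibly new) subtree root."""
    ch = word[i]
    if node is None:
        node = TSTNode(ch)
    if ch < node.ch:
        node.lo = tst_insert(node.lo, word, i)
    elif ch > node.ch:
        node.hi = tst_insert(node.hi, word, i)
    elif i + 1 < len(word):
        node.eq = tst_insert(node.eq, word, i + 1)
    else:
        node.is_end = True
    return node


def tst_contains(node, word):
    """Look up a non-empty word using single-character comparisons only."""
    i = 0
    while node is not None:
        ch = word[i]
        if ch < node.ch:
            node = node.lo
        elif ch > node.ch:
            node = node.hi
        elif i + 1 < len(word):
            node = node.eq
            i += 1
        else:
            return node.is_end
    return False


root = None
for w in ["cat", "car", "cart", "dog"]:
    root = tst_insert(root, w)

print(tst_contains(root, "car"))   # True
print(tst_contains(root, "ca"))    # False (only a prefix of stored words)
print(tst_contains(root, "cow"))   # False
```

Each step of the lookup compares a single character, whereas a BST of strings compares whole strings (up to L characters) at every node.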
Hope this helps!