I don't know whether the term "Lazy" Binary Search is valid, but I was going through some old materials and I just wanted to know if anyone can explain the algorithm of a Lazy Binary Search and compare it to a non-lazy Binary Search.
Let's say, we have this array of numbers:
2, 11, 13, 21, 44, 50, 69, 88
How to look for the number 11 using a Lazy Binary Search?
Justin was wrong.
The common binarySearch algorithm first checks whether the target is equal to current middle entry before proceeding to the left or right halves if required. Lazy binarySearch algorithm postpones the equality check until the very end.
algorithm lazyBinarySearch(array, n, target)
left<-0
right<-n-1
while (left<right) do
mid<-(left+right)/2
if(target>array[mid])then
left<-mid+1
else
right<-mid
endif
endwhile
if(target==array[left])then
display "found at position", left
else
display "not found"
endif
In your case, in an array,
2 11 13 21 44 50 69 88
and you want to search for 11
First we do a trace of common binary search,
index 0 1 2 3 4 5 6 7
2 11 13 21 44 50 69 88
left mid right
First while loop:
left <= right, we enter the first while loop. We calculated the mid index by (0+7)/2=3.5=3 by integer division, mid = 3. straight away we check if target 11 is equal to the mid index entry, 11 != 21, then we decide whether to go left or right, we finds out 11 < 21, should go left. left index remains unchanged, right index becomes mid index -1, right = 3 - 1 = 2. Done this step.
Second while loop:
left <= right, 0 <= 2, we enter the seond while loop. Mid index is recalcuated: (0+2)/2=1, mid = 1. At once we do the equality check, target 11 is the same as the index 1 entry, 11 == 11. We found this entry, leaving behind all the left right mid indexes (don't care) and breaks out the while loop, return index 1.
Now we trace this search by lazy binazySearch algorithm, initial array with left/right indexes set up the same as previous.
First while loop:
left < right, we enter the first while loop. Mid index is calculated as the same above = 3. Instead of doing an equality check in common binarySearch we do a comparison with the mid index entry this time. And the comparison only checks if our target 11 is greater than the mid index entry, leaving whether they equal or not to the very end outside the while loop. So we find out 11 < 21, right index is reset to the mid index, right = 3. Done this step.
Second while loop:
left < right, 0 < 3, we enter the second while loop. mid index is recalculated as mid = (0+3)/2 = 1 by integer division. Again we do a comparison with mid index entry 11, we realise it's not greater than mid index entry. We fall into the else part of the while loop and reset the right index to be mid index, right = 1. Done this step.
Third while loop:
left < right, 0 < 1, this time we have to re-enter the while loop again since it still satisfies the while condition. Mid index becomes (0+1)/2=0, mid = 0. After comparing target 11 with mid index entry 2 we found out it's greater than it (11 > 2), left index is reset to mid + 1, left = 0 + 1 = 1. Done this step.
With left index = 1 and right index = 1, left = right, while loop condition is no longer satisfied, so there's no need to re-enter. We fall into the if part down below. Target 11 equals left index entry 11, we found the target and returns left index 1.
As you can see, lazy binarySearch does one more while loop step to finally realise the index is actually 1. And this is how the word "postpones the equality check" means in the definition I mentioned in the very beginning. Clearly the lazy binarySearch algorithm does more things than common binarySearch before reaching the program termination. And the term "lazy" is reflected in the time of when to check the target equals the mid index entry.
However lazy binarySearch is more preferable to use under some other circumstances, but it's not in the context of this case.
(The reasoning part of the algorithm is done by me, anyone wishes to copy please do credit)
source: "Data Structures Outside In With Java" by Sesh Venugopal, Prentice Hall
As far as I am aware "Lazy binary search" is just another name for "Binary search".
Related
L = {w | 2|w|a != 3|w|b + 2} ∪ {aaab, bbba}.
|w|a = number of a's, same for b.
How do I use the top of the stack when only the number of a's/b's counts and they can be in any order?
It is not completely clear what you mean by "use the top of the stack."
To construct a PDA may start with one for the language {w: |w|a = |w|b}.
When it reads an a it
puts an a on the stack if there is already an a or the stack is empty
removes a b from the stack
For the case of reading a b symmetrically. The PDA accepts if the stack is empty when the entire input has been read. So the stack indicates whether so far more a or more b have been read, because the majority symbol is the one that it contains.
With the factors 2 and 3 and the added 2 b it becomes a bit more complicated. I would not handle this in the stack but in the states. This means, we implement a counter for 0 or 1 a and for 0,1 or 2 b there. When we read a an input symbol x, we first try to increment the respective counter in the state. If this is possible, this is the only thing we do. If the counter is full, we set it to zero and take the action corresponding to this symbol in the PDA above for the stack.
For the +2, we count the first two b in the states before we actually start filling the counter.
The following SQL query is supposed to return the max consecutive numbers in a set.
WITH RECURSIVE Mystery(X,Y) AS (SELECT A AS X, A AS Y FROM R)
UNION (SELECT m1.X, m2.Y
FROM Mystery m1, Mystery m2
WHERE m2.X = m1.Y + 1)
SELECT MAX(Y-X) + 1 FROM Mystery;
This query on the set {7, 9, 10, 14, 15, 16, 18} returns 3, because {14 15 16} is the longest chain of consecutive numbers and there are three numbers in that chain. But when I try to work through this manually I don't see how it arrives at that result.
For example, given the number set above I could create two columns:
m1.x
m2.y
7
7
9
9
10
10
14
14
15
15
16
16
18
18
If we are working on rows and columns, not the actual data, as I understand it WHERE m2.X = m1.Y + 1 takes the value from the next row in Y and puts it in the current row of X, like so
m1.X
m2.Y
9
7
10
9
14
10
15
14
16
15
18
16
18
Null?
The main part on which I am uncertain is where in the SQL recursion actually happens. According to Denis Lukichev recursion is the R part - or in this case the RECURSIVE Mystery(X,Y) - and stops when the table is empty. But if the above is true, how would the table ever empty?
Since I don't know how to proceed with the above, let me try a different direction. If WHERE m2.X = m1.Y + 1 is actually a comparison, the result should be:
m1.X
m2.Y
14
14
15
15
16
16
But at this point, it seems that it should continue recursively on this until only two rows are left (nothing else to compare). If it stops here to get the correct count of 3 rows (2 + 1), what is actually stopping the recursion?
I understand that for the above example the MAX(Y-X) + 1 effectively returns the actual number of recursion steps and adds 1.
But if I have 7 consecutive numbers and the recursion flows down to 2 rows, should this not end up with an incorrect 3 as the result? I understand recursion in C++ and other languages, but this is confusing to me.
Full disclosure, yes it appears this is a common university question, but I am retired, discovered this while researching recursion for my use, and need to understand how it works to use similar recursion in my projects.
Based on this db<>fiddle shared previously, you may find it instructive to alter the CTE to include an iteration number as follows, and then to show the content of the CTE rather than the output of final SELECT. Here's an amended CTE and its content after the recursion is complete:
Amended CTE
WITH RECURSIVE Mystery(X,Y) AS ((SELECT A AS X, A AS Y, 1 as Z FROM R)
UNION (SELECT m1.X, m2.A, Z+1
FROM Mystery m1
JOIN R m2 ON m2.A = m1.Y + 1))
CTE Content
x
y
z
7
7
1
9
9
1
10
10
1
14
14
1
15
15
1
16
16
1
18
18
1
9
10
2
14
15
2
15
16
2
14
16
3
The Z field holds the iteration count. Where Z = 1 we've simply got the rows from the table R. The, values X and Y are both from the field A. In terms of what we are attempting to achieve these represent sequences consecutive numbers, which start at X and continue to (at least) Y.
Where Z = 2, the second iteration, we find all the rows first iteration where there is a value in R which is one higher than our Y value, or one higher than the last member of our sequence of consecutive numbers. That becomes the new highest number, and we add one to the number of iterations. As only three numbers in our original data set have successors within the set, there are only three rows output in the second iteration.
Where Z = 3, the third iteration, we find all the rows of the second iteration (note we are not considering all the rows of the first iteration again), where there is, again, a value in R which is one higher than our Y value, or one higher than the last member of our sequence of consecutive numbers. That, again, becomes the new highest number, and we add one to the number of iterations.
The process will attempt a fourth iteration, but as there are no rows in R where the value is one more than the Y values from our third iteration, no extra data gets added to the CTE and recursion ends.
Going back to the original db<>fiddle, the process then searches our CTE content to output MAX(Y-X) + 1, which is the maximum difference between the first and last values in any consecutive sequence, plus one. This finds it's value from the record produced in the third iteration, using ((16-14) + 1) which has a value of 3.
For this specific piece of code, the output is always equivalent to the value in the Z field as every addition of a row through the recursion adds one to Z and adds one to Y.
I have read this: https://www.topcoder.com/community/competitive-programming/tutorials/binary-search.
I can't understand some parts==>
What we can call the main theorem states that binary search can be
used if and only if for all x in S, p(x) implies p(y) for all y > x.
This property is what we use when we discard the second half of the
search space. It is equivalent to saying that ¬p(x) implies ¬p(y) for
all y < x (the symbol ¬ denotes the logical not operator), which is
what we use when we discard the first half of the search space.
But I think this condition does not hold when we want to find an element(checking for equality only) in an array and this condition only holds when we're trying to find Inequality for example when we're searching for an element greater or equal to our target value.
Example: We are finding 5 in this array.
indexes=0 1 2 3 4 5 6 7 8
1 3 4 4 5 6 7 8 9
we define p(x)=>
if(a[x]==5) return true else return false
step one=>middle index = 8+1/2 = 9/2 = 4 ==> a[4]=5
and p(x) is correct for this and from the main theory, the result is that
p(x+1) ........ p(n) is true but its not.
So what is the problem?
We CAN use that theorem when looking for an exact value, because we
only use it when discarding one half. If we are looking for say 5,
and we find say 6 in the middle, the we can discard the upper half,
because we now know (due to the theorem) that all items in there are > 5
Also notice, that if we have a sorted sequence, and want to find any element
that satisfies an inequality, looking at the end elements is enough.
Problem description:
A set of lists of N integers i1,i2,....,iN with 0<= i1<=i2<=i3<=....<=iN <=M, is created by starting with one integer 0<=i1<=M, and repeatedly adding one integer that is greater or equal to the last integer added.
When adding the last integer to get the final set of lists, the index runs starting from 0 to BinomialC[M+N,N)]-1.
For example, for M=3, i1=0,1,2,3
so the lists are
{0},{1},...,{3}.
Adding another integer i2>=i1 will result in
{0,0},{0,1},{0,2},{0,3},
{1,1},{1,2},{1,3},
{2,2},{2,3}
{3,3}
with indices
0,1,2,3,
4,5,6,
7,8,
9.
This index can be represented in terms of i1,i2,...,iN and M. If the conditions >= were not present, then it would be simply i1*(M+1)^(N-1)+i2*(M+1)^(N-2)+...+iN*(M+1)^(N-N). But, in the case above, there is a negative shift in the index due to the restrictions. For example, N=2 the shift is -i1(i1+1)/2 and index is i = i1*(M+1)^1 + i2*(M+1)^0 -i1(i1+1)/2.
Question:
Does anyone especially from mathematics background knows how to write the index for general N element case? or just the final expression? Any help would be appriciated!
Thanks!
I want to create a visualization of a BST, but every example I find online stops after inserting only 7 or less values. Let's say I'm doing the following sequence:
insert(5),insert(7),insert(9),insert(8),insert(3),insert(2),insert(4),insert(6),insert(10).
Up until insert(6), I end up with:
My main question is: where do I go from here? Do I add on to my left-most leaf or do I add on to my "lowest" leaf?
Also: according to wikipedia the code for an insertion is:
void insert(Node* node, int value) {
if (value < node->key) {
if (node->leftChild == NULL)
node->leftChild = new Node(value);
else
insert(node->leftChild, value);
} else {
if(node->rightChild == NULL)
node->rightChild = new Node(value);
else
insert(node->rightChild, value);
}
}
But according to this, once I'm at 8 and I get insert(3), it would add 3 to the left of 8, as it would compare 3 with the node 9, see that the less-than spot is already taken by 8, then rerun the insertion with 8 being the node compared to, and place the 3 as the left child of 8. But this would just create kind of a list.
Thanks.
What seems to mislead you is that on every insert, you must start from the root (which, in your case, is 5). So, let's take the graph above and try to insert 3 following the algorithm you pasted:
3 < 5, we go left and we meet 3
3 == 3, we go right (here it's the same, the code above says "go right") and we meet 4
3 < 4, we go left. Since 4 has no left child, 3 becomes its left child.
Try to build your tree from zero using the algorithm above. Incidentally, you won't find many examples with n > 10 nodes, because they tend to be exceedingly long without any benefit for the reader.