Recursive Runtime - Space Complexity (pg. 44 of Cracking the Coding Interview)

On pg. 44 of Cracking the Coding Interview, there is the following algo:
int f(int n) {
    if (n <= 1) {
        return 1;
    }
    return f(n - 1) + f(n - 1);
}
The book says that this has a time complexity of O(2^n) and a space complexity of O(n). I get the time complexity part, since there are O(2^n) nodes created. I don't understand why the space complexity is not also that. The book says it's because only O(n) nodes exist at any given time.
How can that be? Wouldn't the call-stack have all 2^n calls when we are at the bottom level of f(1)? What am I missing?
Please let me know if I can provide more detail.
Thanks,

No. The second call to f(n-1) doesn't take place until after the first one returns, so they do not occupy stack space at the same time. When the first one returns, its stack space is freed and may be reused for the second call.
The same applies at every level of the recursion. The memory used is proportional to the maximum depth of the call tree, not the overall number of nodes.
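To see this concretely, here is a minimal sketch (my own illustration, not from the book) that instruments f(n) with two counters: one for the total number of calls made and one for the maximum number of frames that are live at the same time. The class and counter names are purely illustrative.
public class StackDepthDemo {
    static long totalCalls = 0;  // every invocation of f over the whole run
    static int currentDepth = 0; // frames currently on the call stack
    static int maxDepth = 0;     // the deepest the stack ever got

    static int f(int n) {
        totalCalls++;
        currentDepth++;
        maxDepth = Math.max(maxDepth, currentDepth);
        int result;
        if (n <= 1) {
            result = 1;
        } else {
            // The second call only starts after the first one has returned,
            // so the two subtrees never occupy stack space at the same time.
            result = f(n - 1) + f(n - 1);
        }
        currentDepth--; // this frame is about to be popped
        return result;
    }

    public static void main(String[] args) {
        int n = 20;
        f(n);
        System.out.println("total calls:    " + totalCalls); // 2^n - 1, grows exponentially
        System.out.println("max live depth: " + maxDepth);   // n, grows linearly
    }
}
Running it for n = 20 shows roughly a million calls in total but a maximum live depth of only 20, which is exactly the O(2^n) time versus O(n) space distinction the book is making.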

Related

Binary search start or end is target

Why is it that when I see example code for binary search, there is never an if statement to check whether the start or the end of the array is the target?
import java.util.Arrays;

public class App {
    public static int binary_search(int[] arr, int left, int right, int target) {
        if (left > right) {
            return -1;
        }
        int mid = (left + right) / 2;
        if (target == arr[mid]) {
            return mid;
        }
        if (target < arr[mid]) {
            return binary_search(arr, left, mid - 1, target);
        }
        return binary_search(arr, mid + 1, right, target);
    }

    public static void main(String[] args) {
        int[] arr = { 3, 2, 4, -1, 0, 1, 10, 20, 9, 7 };
        Arrays.sort(arr);
        for (int i = 0; i < arr.length; i++) {
            System.out.println("Index: " + i + " value: " + arr[i]);
        }
        // search the full index range [0, arr.length - 1]
        System.out.println(binary_search(arr, 0, arr.length - 1, -1));
    }
}
In this example, if the target was -1 or 20, the search would enter the recursion. But the code does add an if statement to check whether the target is at mid, so why not add two more statements that also check whether it is at left or right?
EDIT:
As pointed out in the comments, I may have misinterpreted the initial question. The answer below assumes that OP meant having the start/end checks as part of each step of the recursion, as opposed to checking once before the recursion even starts.
Since I don't know for sure which interpretation was intended, I'm leaving this post here for now.
Original post:
You seem to be under the impression that "they added an extra check for mid, so surely they should also add an extra check for start and end".
The check "Is mid the target?" is in fact not a mere optimization they added. Recursively checking "mid" is the whole point of a binary search.
When you have a sorted array of elements, a binary search works like this:
Compare the middle element to the target
If the middle element is smaller, throw away the first half
If the middle element is larger, throw away the second half
Otherwise, we found it!
Repeat until we either find the target or there are no more elements.
The act of checking the middle is fundamental to determining which half of the array to continue searching through.
Now, let's say we also add a check for start and end. What does this gain us? Well, if at any point the target happens to be at the very start or end of a segment, we skip a few steps and end slightly sooner. Is this a likely event?
For small toy examples with a few elements, yeah, maybe.
For a massive real-world dataset with billions of entries? Hm, let's think about it. For the sake of simplicity, we assume that we know the target is in the array.
We start with the whole array. Is the first element the target? The odds of that are one in a billion. Pretty unlikely. Is the last element the target? The odds of that are also one in a billion. Pretty unlikely too. You've wasted two extra comparisons to speed up an extremely unlikely case.
We limit ourselves to, say, the first half. We do the same thing again. Is the first element the target? Probably not, since the odds are one in half a billion.
...and so on.
The bigger the dataset, the more useless the start/end "optimization" becomes. In fact, in terms of (maximally optimized) comparisons, each step of the algorithm has three comparisons instead of the usual one. VERY roughly estimated, that suggests that the algorithm on average becomes three times slower.
Even for smaller datasets, it is of dubious use, since it basically becomes a quasi-linear search instead of a binary search. Yes, the odds are higher, but on average we can expect a larger number of comparisons before we reach our target.
The whole point of a binary search is to reach the target with as few wasted comparisons as possible. Adding more unlikely-to-succeed comparisons is typically not the way to improve that.
Edit:
The implementation as posted by OP may also confuse the issue slightly. It chooses to make two comparisons between target and mid. A better implementation would instead make a single three-way comparison (i.e. determine ">", "=" or "<" in a single step instead of two separate ones). This is, for instance, how Java's compareTo or C++'s <=> normally works.
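As an illustration, here is a minimal sketch of a binary search built around a single three-way comparison per step, using Java's Integer.compare. This is not the OP's code; the method name and iterative structure are just for demonstration.
public static int binarySearchThreeWay(int[] arr, int target) {
    int left = 0, right = arr.length - 1;
    while (left <= right) {
        int mid = left + (right - left) / 2;         // avoids overflow of (left + right)
        int cmp = Integer.compare(target, arr[mid]); // one three-way comparison per step
        if (cmp == 0) {
            return mid;
        } else if (cmp < 0) {
            right = mid - 1; // target is in the left half
        } else {
            left = mid + 1;  // target is in the right half
        }
    }
    return -1;
}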
BambooleanLogic's answer is correct and comprehensive. I was curious about how much slower this 'optimization' made binary search, so I wrote a short script to test the change in how many comparisons are performed on average:
Given an array of integers 0, ... , N
do a binary search for every integer in the array,
and count the total number of array accesses made.
To be fair to the optimization, I made it so that after checking arr[left] against target, we increase left by 1, and similarly for right, so that every comparison is as useful as possible. You can try this yourself at Try it online
Results:
Binary search on size 10: Standard 29 Optimized 43 Ratio 1.4828
Binary search on size 100: Standard 580 Optimized 1180 Ratio 2.0345
Binary search on size 1000: Standard 8987 Optimized 21247 Ratio 2.3642
Binary search on size 10000: Standard 123631 Optimized 311205 Ratio 2.5172
Binary search on size 100000: Standard 1568946 Optimized 4108630 Ratio 2.6187
Binary search on size 1000000: Standard 18951445 Optimized 51068017 Ratio 2.6947
Binary search on size 10000000: Standard 223222809 Optimized 610154319 Ratio 2.7334
So the total number of comparisons does seem to tend toward triple the standard number, implying the optimization becomes increasingly unhelpful for larger arrays. I'd be curious whether the limiting ratio is exactly 3.
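The linked script isn't reproduced here, but a rough Java sketch of the same experiment could look like the following. The counting rule (one counted access per array element read) and the exact shape of the boundary-checking variant are my assumptions, so the numbers it prints will not necessarily match the table above digit for digit.
public class ComparisonCountExperiment {
    static long accesses = 0; // array elements read and compared against the target

    static int standard(int[] arr, int left, int right, int target) {
        if (left > right) return -1;
        int mid = left + (right - left) / 2;
        accesses++;
        int value = arr[mid];
        if (target == value) return mid;
        if (target < value) return standard(arr, left, mid - 1, target);
        return standard(arr, mid + 1, right, target);
    }

    static int optimized(int[] arr, int left, int right, int target) {
        if (left > right) return -1;
        accesses++;
        if (arr[left] == target) return left;
        accesses++;
        if (arr[right] == target) return right;
        // Both boundary elements are now ruled out, so shrink the range by one
        // on each side ("every comparison is as useful as possible").
        left++;
        right--;
        if (left > right) return -1;
        int mid = left + (right - left) / 2;
        accesses++;
        int value = arr[mid];
        if (target == value) return mid;
        if (target < value) return optimized(arr, left, mid - 1, target);
        return optimized(arr, mid + 1, right, target);
    }

    public static void main(String[] args) {
        for (int size = 10; size <= 1_000_000; size *= 10) {
            int[] arr = new int[size];
            for (int i = 0; i < size; i++) arr[i] = i;

            accesses = 0;
            for (int t = 0; t < size; t++) standard(arr, 0, size - 1, t);
            long std = accesses;

            accesses = 0;
            for (int t = 0; t < size; t++) optimized(arr, 0, size - 1, t);
            long opt = accesses;

            System.out.printf("size %d: standard %d optimized %d ratio %.4f%n",
                    size, std, opt, (double) opt / std);
        }
    }
}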
Adding extra checks for the start and end along with the mid value does not gain you much.
In algorithm design, the main concern is the complexity, whether time complexity or space complexity. Most of the time, time complexity is treated as the more important aspect.
It is also worth learning about the binary search algorithm in its different use cases, for example (a sketch of the leftmost-index variant follows below):
If the array does not contain any repeated elements
If the array has repeated elements, in which case you may want to
a) return the leftmost index/value
b) return the rightmost index/value
and many more.
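As an example of the "leftmost index" variant mentioned above, here is a minimal Java sketch (my own, not part of the original answer): when the target occurs several times, the search keeps going left even after a match, remembering the best index seen so far.
public static int leftmostIndex(int[] arr, int target) {
    int left = 0, right = arr.length - 1;
    int result = -1;
    while (left <= right) {
        int mid = left + (right - left) / 2;
        if (arr[mid] == target) {
            result = mid;    // remember this match...
            right = mid - 1; // ...but keep searching the left half for an earlier one
        } else if (arr[mid] < target) {
            left = mid + 1;
        } else {
            right = mid - 1;
        }
    }
    return result;
}
The rightmost-index variant is symmetric: on a match, remember mid and continue with left = mid + 1 instead.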

What is the time complexity of the function below?

I was reading a book about competitive programming and encountered a problem where we have to count all possible paths in an n*n matrix.
The conditions are:
1. Every cell must be visited exactly once (no cell may be left unvisited or visited more than once)
2. The path should start at (1,1) and end at (n,n)
3. Possible moves are right, left, up, and down from the current cell
4. You cannot go out of the grid
Here is my code for the problem:
#include <vector>
using std::vector;

typedef long long ll;

ll path_count(ll n, vector<vector<bool>>& done, ll r, ll c) {
    ll count = 0;
    done[r][c] = true;
    if (r == (n - 1) && c == (n - 1)) {
        // Reached the target cell: count it as a path only if every cell was visited.
        for (ll i = 0; i < n; i++) {
            for (ll j = 0; j < n; j++) {
                if (!done[i][j]) {
                    done[r][c] = false;
                    return 0;
                }
            }
        }
        count++;
    } else {
        if ((r + 1) < n  && !done[r + 1][c]) count += path_count(n, done, r + 1, c);
        if ((r - 1) >= 0 && !done[r - 1][c]) count += path_count(n, done, r - 1, c);
        if ((c + 1) < n  && !done[r][c + 1]) count += path_count(n, done, r, c + 1);
        if ((c - 1) >= 0 && !done[r][c - 1]) count += path_count(n, done, r, c - 1);
    }
    done[r][c] = false; // backtrack
    return count;
}
Here, if we define a recurrence relation, it could be something like: T(n) = 4T(n-1) + n^2.
Is this recurrence relation correct? I don't think so, because using the master theorem it would give a result of O(4^n * n^2), and I don't think the algorithm can be of that order.
The reason I say this is that when I run it for a 7*7 matrix it takes around 110.09 seconds, and I don't think an O(4^n * n^2) algorithm should take that much time for n=7.
If we calculate it for n=7, the approximate number of instructions would be 4^7 * 7^2 = 802816 ≈ 10^6. For that many instructions it should not take that much time. So I conclude that my recurrence relation is wrong.
The code outputs 111712 for n=7, which is the same as the book's output, so the code is correct.
So what is the correct time complexity??
No, the complexity is not O(4^n * n^2).
Consider the 4^n in your notation. It would mean going to a depth of at most n (7 in your case), with 4 choices at each level. But that is not the case: at the 8th level you still have multiple choices of where to go next. In fact, you keep branching until you complete the path, which has depth n^2.
So a non-tight bound gives us O(4^(n^2) * n^2). This bound, however, is far from tight, as it assumes you have 4 valid choices in each recursive call, which is not the case.
I am not sure how much tighter it can be, but a first improvement drops it to O(3^(n^2) * n^2), since you can never move back to the cell you just came from. This bound is still far from optimal.
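One way to get a feel for the real growth rate, without pinning down the exact bound, is to instrument the recursion and count calls for small n. Below is a rough Java port of the search with such a counter; it is my own sketch, not the poster's code, and the counts it prints are only meant to show how quickly the call tree grows.
public class PathCountCalls {
    static long calls = 0; // total number of recursive invocations

    static long pathCount(int n, boolean[][] done, int r, int c) {
        calls++;
        long count = 0;
        done[r][c] = true;
        if (r == n - 1 && c == n - 1) {
            // Reached the target corner: count it only if every cell was visited.
            boolean allVisited = true;
            for (int i = 0; i < n && allVisited; i++)
                for (int j = 0; j < n && allVisited; j++)
                    if (!done[i][j]) allVisited = false;
            if (allVisited) count = 1;
        } else {
            if (r + 1 < n  && !done[r + 1][c]) count += pathCount(n, done, r + 1, c);
            if (r - 1 >= 0 && !done[r - 1][c]) count += pathCount(n, done, r - 1, c);
            if (c + 1 < n  && !done[r][c + 1]) count += pathCount(n, done, r, c + 1);
            if (c - 1 >= 0 && !done[r][c - 1]) count += pathCount(n, done, r, c - 1);
        }
        done[r][c] = false; // backtrack
        return count;
    }

    public static void main(String[] args) {
        for (int n = 2; n <= 6; n++) {
            calls = 0;
            long paths = pathCount(n, new boolean[n][n], 0, 0);
            System.out.println("n=" + n + " paths=" + paths + " recursive calls=" + calls);
        }
    }
}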

Space complexity of a recursive algorithm from the CTCI book [duplicate]

I am going through the CTCI book and can't understand one of their examples. They start with:
int sum(int n) {
    if (n <= 0) {
        return 0;
    }
    return n + sum(n - 1);
}
and explain that it's O(n) time and O(n) space because each of the calls is added to the call stack and takes up actual memory.
The next example is:
int f(int n) {
    if (n <= 0) {
        return 1;
    }
    return f(n - 1) + f(n - 1);
}
and states that the time complexity is O(2^n) and the space is O(n). Although I understand why the time is O(2^n), I am not sure why the space is O(n). Their explanation is that "only O(n) nodes exist at any given time". Why don't we count the space taken by each call on the stack, as in the first example?
P.S. After reading similar questions, should I assume that a stack frame's space is reclaimed once we start moving back up the recursion?
Unlike the time complexity, which is simply the total time needed to run a program, the space complexity describes the space required to execute the program. So it doesn't really matter that there are 2^n nodes in the execution tree of the program. The call stack automatically unwinds and releases the memory it used. What matters is the maximal depth of the call tree, which is O(n) for this program. It should be noted, though, that recursion is a special case that naturally releases memory as the stack unwinds. If memory is allocated explicitly during runtime, it should be released explicitly as well.
Regarding the first example, the call tree is simply a chain of depth n, resulting in the same O(n) space complexity.

What is the time complexity of this?

Code:
int main()
{
    for (long long i = 0; i < 10000000; i++)
    {
    }
    return 0;
}
I asked this because I wanted to know whether an empty loop adds to the running time of the program. For example, say we do have a function within the loop, but it does not run on every iteration due to some condition:
Code:
int main()
{
    for (long long i = 0; i < 10000; i++)
    {
        for (long long j = 1; j < 10000; j++)
        {
            if (/* some condition */)
            {
                // some function which we know is going to run only one-hundredth of
                // the time due to the condition; the time complexity of func() is O(1)
                func();
            }
        }
    }
    return 0;
}
Will the time complexity be O(N*N)?
Time-complexity is only meaningful in the context of variable-sized data-set; it describes how quickly the program's total execution time will increase as the size of the data-set increases. For example, if you have N items to process, and your algorithm needs to read each of those items a fixed number of times, then your algorithm is considered to be O(N).
In your first case, if we assume you have a "data set" whose current size is 10000000, then your single for-loop would be O(N) -- but note that since your for-loop doesn't have any observable effects, an optimizing compiler would probably just omit the loop entirely, reducing it to effectively O(1).
In your second (nested-loop) example (assuming the variable set size is 10000), the algorithm is O(N^2), because the number of steps the program has to run increases with the square of the set size. That is true regardless of how often the internal if test evaluates to true, because the program will have to do some steps (such as evaluating the if condition) N*N times no matter how often (or rarely) the if test evaluates to true. (Again, the exception would be if the compiler could somehow prove that the if statement never evaluates to true, or that the func() function had no observable side effects, in which case it could legally omit the whole thing and just return 0 immediately.)
Your first code has a worst-case complexity of O(n), because it iterates n times. Regardless of whether it does nothing or a million things in each iteration, it is always of O(n) complexity. It may not be optimized away, and the optimizer may not skip the empty loop.
Similarly, your second program has a complexity of O(n^2) because it iterates n^2 times. The if condition inside may or may not be satisfied in any given iteration, and the body may not execute when the if is not satisfied, but the loop still visits n^2 cases, which is enough to establish an O(n^2) complexity.
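To make that concrete, here is a small Java sketch that counts both the loop iterations and the calls to func(); the condition used is an arbitrary stand-in, since the original question leaves it unspecified. The iteration count is always N*N, no matter how rarely the condition fires.
public class NestedLoopCount {
    static long funcCalls = 0;

    static void func() { // stand-in for the O(1) function from the question
        funcCalls++;
    }

    public static void main(String[] args) {
        final int N = 10_000;
        long iterations = 0;
        for (int i = 0; i < N; i++) {
            for (int j = 0; j < N; j++) {
                iterations++;              // this step happens N*N times regardless
                if ((i + j) % 100 == 0) {  // arbitrary condition that holds rarely
                    func();
                }
            }
        }
        System.out.println("iterations: " + iterations); // N*N = 100000000
        System.out.println("func calls: " + funcCalls);  // only about 1/100 of that
    }
}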

Calculating the expected running time of function

I have a question about calculating the expected running time of a given function. I understand just fine how to analyze code fragments with loops in them (for / while / if, etc.), but functions without them seem a bit odd to me. For example, let's say that we have the following code fragment:
public void Add(T item)
{
    var newArr = new T[this.arr.Length + 1];
    Array.Copy(this.arr, newArr, this.arr.Length);
    newArr[newArr.Length - 1] = item;
    this.arr = newArr;
}
If my logic works correctly, the function Add has a complexity of O(1), because in the best/worst/average case it will just read every line of code once, right?
You always have to consider the time complexity of the function calls, too. I don't know how Array.Copy is implemented, but I'm going to guess it's O(N), making the whole Add function O(N) as well. Your intuition is right, though - the rest of it is in fact O(1).
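To make the hidden work visible, here is a rough Java rendering of the same method (my own sketch; the original is C#, and Array.Copy's internals are not shown here) with the copy written out as an explicit loop:
class GrowByOneList {
    private int[] arr = new int[0];

    // Growing the array by one element still requires copying every existing
    // element, which is where the O(N) cost of Add comes from.
    public void add(int item) {
        int[] newArr = new int[arr.length + 1];
        for (int i = 0; i < arr.length; i++) { // N copy steps
            newArr[i] = arr[i];
        }
        newArr[newArr.length - 1] = item;
        arr = newArr;
    }
}
Appending N items this way costs 1 + 2 + ... + N copies in total, i.e. O(N^2), which is why growable array types typically over-allocate (for example by doubling capacity) instead of growing one slot at a time.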
If you have multiple sub-operations, e.g. O(n) + O(log n), the costliest step dominates the cost of the whole operation; by default, big O refers to the worst case. Here, since you copy the array, it is an O(n) operation.
Complexity is calculated following these 2 rules:
- Calling a method (complexity + 1)
- Encountering one of the following keywords: if, while, repeat, for, &&, ||, catch, case, etc. (complexity + 1)
In your case, given that you are copying an array and not a single value, the algorithm performs N copy operations, giving you an O(N) operation.