I have some difficulties in understanding the time complexity analysis for one solution for the Happy Number Question from Leet code, for my doubts on complexity analysis, I marked them in bold and really appreciate your advice
Here is the question:
Link: https://leetcode.com/problems/happy-number/
Question:
Write an algorithm to determine if a number is "happy".
A happy number is a number defined by the following process: Starting with any positive integer, replace the number by the sum of the squares of its digits, and repeat the process until the number equals 1 (where it will stay), or it loops endlessly in a cycle which does not include 1. Those numbers for which this process ends in 1 are happy numbers.
Example:
Input: 19
Output: true
Explanation:
1^2(square of 1) + 9^2 = 82
8^2 + 2^2 = 68
6^2 + 8^2 = 100
1^2 + 0^2 + 0^2 = 1
Here is the code:
class Solution(object):
def isHappy(self, n):
#getnext function will compute the sum of square of each digit of n
def getnext(n):
totalsum = 0
while n>0:
n,v = divmod(n,10)
totalsum+=v**2
return totalsum
#we declare seen as a set to track the number we already visited
seen = set()
#we stop checking if: either the number reaches one or the number was visited #already(ex.a cycle)
while n!=1 and (n not in seen):
seen.add(n)
n = getnext(n)
return n==1
Note: feel free to let me know if I need to explain how the code works
Time Complexity Analysis:
Time complexity : O(243 * 3 + logN + loglogN + log loglog N)...=O(logN).
Finding the next value for a given number has a cost of O(log n)because we are processing each digit in the number, and the number of digits in a number is given by logN.
My doubt: why the number of digits in a number is given by logN? what is N here? the value of a specific number or something else?
To work out the total time complexity, we'll need to think carefully about how many numbers are in the chain, and how big they are.
We determined above that once a number is below 243, it is impossible for it to go back up above 243.Therefore, based on our very shallow analysis we know for sure that once a number is below 243, it is impossible for it to take more than another 243 steps to terminate.
Each of these numbers has at most 3 digits. With a little more analysis, we could replace the 243 with the length of the longest number chain below 243, however because the constant doesn't matter anyway, we won't worry about it.
My doubt: I think the above paragraph is related to the time complexity component of 243*3, but I cannot understand why we multiply 243 by 3
For an n above 243, we need to consider the cost of each number in the chain that is above 243. With a little math, we can show that in the worst case, these costs will be O(log n) + O(log log n) + O(log log log N)... Luckily for us, the O(logN) is the dominating part, and the others are all tiny in comparison (collectively, they add up to less than logN), so we can ignore them. My doubt: what is the reasoning behind O(log log n) O(log log log N) for an n above 243?
Well, my guess for the first doubt is that the number of digits of a base 10 number is given by it's value (N) taken to the logarithm at base 10, rounded down. So for example, 1023 would have floor(log10(1023)) digits, which is 3. So yes, the N is the value of the number. the log in time complexity indicates a logarithm, not specifically that of base 2 or base e.
As for the second doubt, it probably has to do with the work required to reduce a number to below 243, but I am not sure. I'll edit this answer once I work that bit out.
Let's say N has M digits. Than getnext(N) <= 81*M. The equality happens when N only has 9's.
When N < 1000, i.e. at most 3 digits, getnext(N) <= 3*81 = 243. Now, you will have to call getnext(.) at most O(243) times to figure out if N is indeed happy.
If M > 3, number of digits of getnext(N) must be less than M. Try getnext(9999), getnext(99999), and so on [1].
Notes:
[1] Adding a digit to N can make it at most 10*N + 9, i.e. adding a 9 at the end. But the number of digits increases to M+1 only. It's a logarithmic relationship between N and M. Hence, the same relationship holds between N and 81*M.
Using the Leetcode solution
class Solution {
private int getNext(int n) {
int totalSum = 0;
while (n > 0) {
int d = n % 10;
n = n / 10;
totalSum += d * d;
}
return totalSum;
}
public boolean isHappy(int n) {
Set<Integer> seen = new HashSet<>();
while (n != 1 && !seen.contains(n)) {
seen.add(n);
n = getNext(n);
}
return n == 1;
}
}
}
O(243*3) for n < 243
3 is the max number of digits in n
e.g. For n = 243
getNext() will take a maximum of 3 iterations because there are 3 digits for us to loop over.
isHappy() can take a maximum of 243 iterations to find a cycle or terminate, because we can store a max of 243 numbers in our hash set.
O(log n) + O(log log n) + O(log log log N)... for n > 243
1st iteration + 2nd iteration + 3rd iteration ...
getNext() will be called a maximum of O(log n) times. Because log10 n is the number of digits.
isHappy() will be called a maximum of 9^2 per digit. This is the max we can store in the hash set before we find a cycle or terminate.
First Iteration
9^2 * number of digits
O(81*(log n)) drop the constant
O(log n)
+
Second Iteration
O(log (81*(log n))) drop the constant
O(log log n)
+
Third Iteration
O(log log log N)
+
ect ...
Related
char[] chars = String.valueOf(num).toCharArray();
int n = chars.length;
for (int i = n - 1; i > 0; i--) {
if (chars[i - 1] > chars[i]) {
chars[i - 1]--;
Arrays.fill(chars, i, n, '9');
}
}
return Integer.parseInt(new String(chars));
What is the time complexity of this code? Could you teach me how to calculate it? Thank you!
Time complexity is a measure of how a program's run-time changes as input size grows. The first thing you have to determine is what (if any) aspect of your input can cause run-time to vary, and to represent that aspect as a variable. Framing run-time as a function of that variable (or those variables) is how you determine the program's time complexity. Conventionally, when only a single aspect of the input causes run-time to vary, we represent that aspect by the variable n.
Your program primarily varies in run-time based on how many times the for-loop runs. We can see that the for-loop runs the length of chars times (note that the length of chars is the number of digits of num). Conveniently, that length is already denoted as n, which we will use as the variable representing input size. That is, taking n to be the number of digits in num, we want to determine exactly how run-time varies with n by expressing run-time as a function of n.
Also note that when doing complexity analysis, you are primarily concerned with the growth in run-time as n gets arbitrarily large (how run-time scales with n as n goes to infinity), so you typically ignore constant factors and only focus on the highest order terms, writing run-time in "Big O" notation. That is, because 3n, n, and n/2 all grow in about the same way as n goes to infinity, we would represent them all as O(n), because our primary goal is to distinguish this kind of linear growth, from the quadratic O(n^2) growth of n^2, 5n^2 + 10, or n^2 + n, from the logarithmic O(logn) growth of log(n), log(2n), or log(n) + 1, or the constant O(1) time (time that doesn't scale with n) represented by 1, 5, 100000 etc.
So, let's try to express the total number of operations the program does in terms of n. It can be helpful at first to just go line-by-line for this and to add everything up in the end.
The first line, turning num into an n character string and then turning that string to an length n array of chars, does O(n) work. (n work to turn each of n digits into a character, than another n work to put each of those characters into an array. n + n = 2n = O(n) total work)
char[] chars = String.valueOf(num).toCharArray(); // O(n)
The next line just reads the length value from the array and saves it as n. This operation takes the same amount of time no matter how long the array is, so it is O(1).
int n = chars.length; // O(1)
For every digit of num, our program runs 1 for loop, so O(n) total loops:
for (int i = n - 1; i > 0; i--) { // O(n)
Inside the for loop, a conditional check is performed, and then, if it returns true, a value may be decremented and the array from i to n filled.
if (chars[i - 1] > chars[i]) { // O(1)
chars[i - 1]--; // O(1)
Arrays.fill(chars, i, n, '9'); // O(n-i) = O(n)
}
The fill operation is O(n-i) because that is how many characters may be changed to '9'. O(n-i) is O(n), because i is just a constant and lower order than n, which, as previously mentioned, means it gets ignored in big O.
Finally, you parse the n characters of chars as an int, and return it. Altogether:
static Integer foo(int num) {
char[] chars = String.valueOf(num).toCharArray(); // O(n)
int n = chars.length; // O(1)
for (int i = n - 1; i > 0; i--) { // O(n)
if (chars[i - 1] > chars[i]) { // O(1)
chars[i - 1]--; // O(1)
Arrays.fill(chars, i, n, '9'); // O(n)
}
}
return Integer.parseInt(new String(chars)); // O(n)
}
When we add everything up, we get, the total time-complexity as a function of n, T(n).
T(n) = O(n) + O(1) + O(n)*(O(1) + O(1) + O(n)) + O(n)
There is a product in the expression to represent the total work done across all iterations of the for-loop: O(n) iterations times O(1) + O(1) + O(n) work in each iteration. In reality, some iterations the for loop might only do O(1) work (when the condition is false), but in the worst case the whole body is executed every iteration, and complexity analysis is typically done for the worst case unless otherwise specified.
You can simplify this function for run-time by using the fact that big O strips constants and lower-order terms, along with the facts that that O(a) + O(b) = O(a + b) and a*O(b) = O(a*b).
T(n) = O(n+1+n) + O(n)*O(1 + 1 + n)
= O(2n+1) + O(n)*O(n+2)
= O(n) + O(n)*O(n)
= O(n) + O(n^2)
= O(n^2 + n)
= O(n^2)
So you would say that the overall time complexity of the program is O(n^2), meaning that run-time scales quadratically with input size in the worst case.
I have an integer, N.
I denote f[i] = number of appearances of the digit i in N.
Now, I have the following algorithm.
FOR i = 0 TO 9
FOR j = 1 TO f[i]
k = k*10 + i;
My teacher said this is O(N). It seems to me more like a O(logN) algorithm.
Am I missing something?
I think that you and your teacher are saying the same thing but it gets confused because the integer you are using is named N but it is also common to refer to an algorithm that is linear in the size of its input as O(N). N is getting overloaded as the specific name and the generic figure of speech.
Suppose we say instead that your number is Z and its digits are counted in the array d and then their frequencies are in f. For example, we could have:
Z = 12321
d = [1,2,3,2,1]
f = [0,2,2,1,0,0,0,0,0,0]
Then the cost of going through all the digits in d and computing the count for each will be O( size(d) ) = O( log (Z) ). This is basically what your second loop is doing in reverse, it's executing one time for each occurence of each digits. So you are right that there is something logarithmic going on here -- the number of digits of Z is logarithmic in the size of Z. But your teacher is also right that there is something linear going on here -- counting those digits is linear in the number of digits.
The time complexity of an algorithm is generally measured as a function of the input size. Your algorithm doesn't take N as an input; the input seems to be the array f. There is another variable named k which your code doesn't declare, but I assume that's an oversight and you meant to initialise e.g. k = 0 before the first loop, so that k is not an input to the algorithm.
The outer loop runs 10 times, and the inner loop runs f[i] times for each i. Therefore the total number of iterations of the inner loop equals the sum of the numbers in the array f. So the complexity could be written as O(sum(f)) or O(Σf) where Σ is the mathematical symbol for summation.
Since you defined that N is an integer which f counts the digits of, it is in fact possible to prove that O(Σf) is the same thing as O(log N), so long as N must be a positive integer. This is because Σf equals how many digits the number N has, which is approximately (log N) / (log 10). So by your definition of N, you are correct.
My guess is that your teacher disagrees with you because they think N means something else. If your teacher defines N = Σf then the complexity would be O(N). Or perhaps your teacher made a genuine mistake; that is not impossible. But the first thing to do is make sure you agree on the meaning of N.
I find your explanation a bit confusing, but lets assume N = 9075936782959 is an integer. Then O(N) doesn't really make sense. O(length of N) makes more sense. I'll use n for the length of N.
Then f(i) = iterate over each number in N and sum to find how many times i is in N, that makes O(f(i)) = n (it's linear). I'm assuming f(i) is a function, not an array.
Your algorithm loops at most:
10 times (first loop)
0 to n times, but the total is n (the sum of f(i) for all digits must be n)
It's tempting to say that algorithm is then O(algo) = 10 + n*f(i) = n^2 (removing the constant), but f(i) is only calculated 10 times, each time the second loops is entered, so O(algo) = 10 + n + 10*f(i) = 10 + 11n = n. If f(i) is an array, it's constant time.
I'm sure I didn't see the problem the same way as you. I'm still a little confused about the definition in your question. How did you come up with log(n)?
So these are the for loops that I have to find the time complexity, but I am not really clearly understood how to calculate.
for (int i = n; i > 1; i /= 3) {
for (int j = 0; j < n; j += 2) {
... ...
}
for (int k = 2; k < n; k = (k * k) {
...
}
For the first line, (int i = n; i > 1; i /= 3), keeps diving i by 3 and if i is less than 1 then the loop stops there, right?
But what is the time complexity of that? I think it is n, but I am not really sure.
The reason why I am thinking it is n is, if I assume that n is 30 then i will be like 30, 10, 3, 1 then the loop stops. It runs n times, doesn't it?
And for the last for loop, I think its time complexity is also n because what it does is
k starts as 2 and keeps multiplying itself to itself until k is greater than n.
So if n is 20, k will be like 2, 4, 16 then stop. It runs n times too.
I don't really think I am understanding this kind of questions because time complexity can be log(n) or n^2 or etc but all I see is n.
I don't really know when it comes to log or square. Or anything else.
Every for loop runs n times, I think. How can log or square be involved?
Can anyone help me understanding this? Please.
Since all three loops are independent of each other, we can analyse them separately and multiply the results at the end.
1. i loop
A classic logarithmic loop. There are countless examples on SO, this being a similar one. Using the result given on that page and replacing the division constant:
The exact number of times that this loop will execute is ceil(log3(n)).
2. j loop
As you correctly figured, this runs O(n / 2) times;
The exact number is floor(n / 2).
3. k loop
Another classic known result - the log-log loop. The code just happens to be an exact replicate of this SO post;
The exact number is ceil(log2(log2(n)))
Combining the above steps, the total time complexity is given by
Note that the j-loop overshadows the k-loop.
Numerical tests for confirmation
JavaScript code:
T = function(n) {
var m = 0;
for (var i = n; i > 1; i /= 3) {
for (var j = 0; j < n; j += 2)
m++;
for (var k = 2; k < n; k = k * k)
m++;
}
return m;
}
M = function(n) {
return ceil(log(n)/log(3)) * (floor(n/2) + ceil(log2(log2(n))));
}
M(n) is what the math predicts that T(n) will exactly be (the number of inner loop executions):
n T(n) M(n)
-----------------------
100000 550055 550055
105000 577555 577555
110000 605055 605055
115000 632555 632555
120000 660055 660055
125000 687555 687555
130000 715055 715055
135000 742555 742555
140000 770055 770055
145000 797555 797555
150000 825055 825055
M(n) matches T(n) perfectly as expected. A plot of T(n) against n log n (the predicted time complexity):
I'd say that is a convincing straight line.
tl;dr; I describe a couple of examples first, I analyze the complexity of the stated problem of OP at the bottom of this post
In short, the big O notation tells you something about how a program is going to perform if you scale the input.
Imagine a program (P0) that counts to 100. No matter how often you run the program, it's going to count to 100 as fast each time (give or take). Obviously right?
Now imagine a program (P1) that counts to a number that is variable, i.e. it takes a number as an input to which it counts. We call this variable n. Now each time P1 runs, the performance of P1 is dependent on the size of n. If we make n a 100, P1 will run very quickly. If we make n equal to a googleplex, it's going to take a little longer.
Basically, the performance of P1 is dependent on how big n is, and this is what we mean when we say that P1 has time-complexity O(n).
Now imagine a program (P2) where we count to the square root of n, rather than to itself. Clearly the performance of P2 is going to be worse than P1, because the number to which they count differs immensely (especially for larger n's (= scaling)). You'll know by intuition that P2's time-complexity is equal to O(n^2) if P1's complexity is equal to O(n).
Now consider a program (P3) that looks like this:
var length= input.length;
for(var i = 0; i < length; i++) {
for (var j = 0; j < length; j++) {
Console.WriteLine($"Product is {input[i] * input[j]}");
}
}
There's no n to be found here, but as you might realise, this program still depends on an input called input here. Simply because the program depends on some kind of input, we declare this input as n if we talk about time-complexity. If a program takes multiple inputs, we simply call those different names so that a time-complexity could be expressed as O(n * n2 + m * n3) where this hypothetical program would take 4 inputs.
For P3, we can discover it's time-complexity by first analyzing the number of different inputs, and then by analyzing in what way it's performance depends on the input.
P3 has 3 variables that it's using, called length, i and j. The first line of code does a simple assignment, which' performance is not dependent on any input, meaning the time-complexity of that line of code is equal to O(1) meaning constant time.
The second line of code is a for loop, implying we're going to do something that might depend on the length of something. And indeed we can tell that this first for loop (and everything in it) will be executed length times. If we increase the size of length, this line of code will do linearly more, thus this line of code's time complexity is O(length) (called linear time).
The next line of code will take O(length) time again, following the same logic as before, however since we are executing this every time execute the for loop around it, the time complexity will be multiplied by it: which results in O(length) * O(length) = O(length^2).
The insides of the second for loop do not depend on the size of the input (even though the input is necessary) because indexing on the input (for arrays!!) will not become slower if we increase the size of the input. This means that the insides will be constant time = O(1). Since this runs in side of the other for loop, we again have to multiply it to obtain the total time complexity of the nested lines of code: `outside for-loops * current block of code = O(length^2) * O(1) = O(length^2).
The total time-complexity of the program is just the sum of everything we've calculated: O(1) + O(length^2) = O(length^2) = O(n^2). The first line of code was O(1) and the for loops were analyzed to be O(length^2). You will notice 2 things:
We rename length to n: We do this because we express
time-complexity based on generic parameters and not on the ones that
happen to live within the program.
We removed O(1) from the equation. We do this because we're only
interested in the biggest terms (= fastest growing). Since O(n^2)
is way 'bigger' than O(1), the time-complexity is defined equal to
it (this only works like that for terms (e.g. split by +), not for
factors (e.g. split by *).
OP's problem
Now we can consider your program (P4) which is a little trickier because the variables within the program are defined a little cloudier than the ones in my examples.
for (int i = n; i > 1; i /= 3) {
for (int j = 0; j < n; j += 2) {
... ...
}
for (int k = 2; k < n; k = (k * k) {
...
}
}
If we analyze we can say this:
The first line of code is executed O(cbrt(3)) times where cbrt is the cubic root of it's input. Since i is divided by 3 every loop, the cubic root of n is the number of times the loop needs to be executed before i is smaller or equal to 1.
The second for loop is linear in time because j is executed
O(n / 2) times because it is increased by 2 rather than 1 which
would be 'normal'. Since we know that O(n/2) = O(n), we can say
that this for loop is executed O(cbrt(3)) * O(n) = O(n * cbrt(n)) times (first for * the nested for).
The third for is also nested in the first for, but since it is not nested in the second for, we're not going to multiply it by the second one (obviously because it is only executed each time the first for is executed). Here, k is bound by n, however since it is increased by a factor of itself each time, we cannot say it is linear, i.e. it's increase is defined by a variable rather than by a constant. Since we increase k by a factor of itself (we square it), it will reach n in 2log(n) steps. Deducing this is easy if you understand how log works, if you don't get this you need to understand that first. In any case, since we analyze that this for loop will be run O(2log(n)) time, the total complexity of the third for is O(cbrt(3)) * O(2log(n)) = O(cbrt(n) *2log(n))
The total time-complexity of the program is now calculated by the sum of the different sub-timecomplexities: O(n * cbrt(n)) + O(cbrt(n) *2log(n))
As we saw before, we only care about the fastest growing term if we talk about big O notation, so we say that the time-complexity of your program is equal to O(n * cbrt(n)).
Here is the code given in the book "Cracking the Coding Interview" by Gayle Laakmann. Here time complexity of the code to find:-
int sumDigits(int n)
{ int sum=0;
while(n >0)
{
sum+=n%10;
n/=10
}
return sum ;
}
I know the time complexity should be the number of digits in n.
According to the book, its run time complexity is O(log n). Book provided brief description but I don't understand.
while(n > 0)
{
sum += n % 10;
n /= 10;
}
so, how much steps will this while loop take so that n comes to 0? What you do in each step, you divide n with 10. So, you need to do it k times in order to come to 0. Note, k is the number of digits in n.
Lets go step by step:
First step is when n > 0, you divide n by 10. If n is still positive, you will divide it by 10. What you get there is n/10/10 or n / (10^2). After third time, its n / (10^3). And after k times, its n/(10^k) = 0. And loop will end. But this is not 0 in mathematical sense, its 0 because we deal with integers. What you really have is |n|/(10^k) < 1, where k∈N.
So, we have this now:
n/(10^k) < 1
n < 10^k
logn < k
Btw. its also n/(10^(k-1)) > 1, so its:
k-1 < logn < k. (btw. don't forget, this is base 10).
So, you need to do logn + 1 steps in order to finish the loop, and thats why its O(log(n)).
The number of times the logic would run is log(n) to the base of 10 which is same as (log(n) to the base 2)/ (log (10) to the base of 2). In terms of time complexity, this would simply be O(log (n)). Note that log(n) to the base of 10 is how you would represent the number of digits in n.
I understand Big-O notation, but I don't know how to calculate it for many functions. In particular, I've been trying to figure out the computational complexity of the naive version of the Fibonacci sequence:
int Fibonacci(int n)
{
if (n <= 1)
return n;
else
return Fibonacci(n - 1) + Fibonacci(n - 2);
}
What is the computational complexity of the Fibonacci sequence and how is it calculated?
You model the time function to calculate Fib(n) as sum of time to calculate Fib(n-1) plus the time to calculate Fib(n-2) plus the time to add them together (O(1)). This is assuming that repeated evaluations of the same Fib(n) take the same time - i.e. no memoization is used.
T(n<=1) = O(1)
T(n) = T(n-1) + T(n-2) + O(1)
You solve this recurrence relation (using generating functions, for instance) and you'll end up with the answer.
Alternatively, you can draw the recursion tree, which will have depth n and intuitively figure out that this function is asymptotically O(2n). You can then prove your conjecture by induction.
Base: n = 1 is obvious
Assume T(n-1) = O(2n-1), therefore
T(n) = T(n-1) + T(n-2) + O(1) which is equal to
T(n) = O(2n-1) + O(2n-2) + O(1) = O(2n)
However, as noted in a comment, this is not the tight bound. An interesting fact about this function is that the T(n) is asymptotically the same as the value of Fib(n) since both are defined as
f(n) = f(n-1) + f(n-2).
The leaves of the recursion tree will always return 1. The value of Fib(n) is sum of all values returned by the leaves in the recursion tree which is equal to the count of leaves. Since each leaf will take O(1) to compute, T(n) is equal to Fib(n) x O(1). Consequently, the tight bound for this function is the Fibonacci sequence itself (~θ(1.6n)). You can find out this tight bound by using generating functions as I'd mentioned above.
Just ask yourself how many statements need to execute for F(n) to complete.
For F(1), the answer is 1 (the first part of the conditional).
For F(n), the answer is F(n-1) + F(n-2).
So what function satisfies these rules? Try an (a > 1):
an == a(n-1) + a(n-2)
Divide through by a(n-2):
a2 == a + 1
Solve for a and you get (1+sqrt(5))/2 = 1.6180339887, otherwise known as the golden ratio.
So it takes exponential time.
I agree with pgaur and rickerbh, recursive-fibonacci's complexity is O(2^n).
I came to the same conclusion by a rather simplistic but I believe still valid reasoning.
First, it's all about figuring out how many times recursive fibonacci function ( F() from now on ) gets called when calculating the Nth fibonacci number. If it gets called once per number in the sequence 0 to n, then we have O(n), if it gets called n times for each number, then we get O(n*n), or O(n^2), and so on.
So, when F() is called for a number n, the number of times F() is called for a given number between 0 and n-1 grows as we approach 0.
As a first impression, it seems to me that if we put it in a visual way, drawing a unit per time F() is called for a given number, wet get a sort of pyramid shape (that is, if we center units horizontally). Something like this:
n *
n-1 **
n-2 ****
...
2 ***********
1 ******************
0 ***************************
Now, the question is, how fast is the base of this pyramid enlarging as n grows?
Let's take a real case, for instance F(6)
F(6) * <-- only once
F(5) * <-- only once too
F(4) **
F(3) ****
F(2) ********
F(1) **************** <-- 16
F(0) ******************************** <-- 32
We see F(0) gets called 32 times, which is 2^5, which for this sample case is 2^(n-1).
Now, we want to know how many times F(x) gets called at all, and we can see the number of times F(0) is called is only a part of that.
If we mentally move all the *'s from F(6) to F(2) lines into F(1) line, we see that F(1) and F(0) lines are now equal in length. Which means, total times F() gets called when n=6 is 2x32=64=2^6.
Now, in terms of complexity:
O( F(6) ) = O(2^6)
O( F(n) ) = O(2^n)
There's a very nice discussion of this specific problem over at MIT. On page 5, they make the point that, if you assume that an addition takes one computational unit, the time required to compute Fib(N) is very closely related to the result of Fib(N).
As a result, you can skip directly to the very close approximation of the Fibonacci series:
Fib(N) = (1/sqrt(5)) * 1.618^(N+1) (approximately)
and say, therefore, that the worst case performance of the naive algorithm is
O((1/sqrt(5)) * 1.618^(N+1)) = O(1.618^(N+1))
PS: There is a discussion of the closed form expression of the Nth Fibonacci number over at Wikipedia if you'd like more information.
You can expand it and have a visulization
T(n) = T(n-1) + T(n-2) <
T(n-1) + T(n-1)
= 2*T(n-1)
= 2*2*T(n-2)
= 2*2*2*T(n-3)
....
= 2^i*T(n-i)
...
==> O(2^n)
Recursive algorithm's time complexity can be better estimated by drawing recursion tree, In this case the recurrence relation for drawing recursion tree would be T(n)=T(n-1)+T(n-2)+O(1)
note that each step takes O(1) meaning constant time,since it does only one comparison to check value of n in if block.Recursion tree would look like
n
(n-1) (n-2)
(n-2)(n-3) (n-3)(n-4) ...so on
Here lets say each level of above tree is denoted by i
hence,
i
0 n
1 (n-1) (n-2)
2 (n-2) (n-3) (n-3) (n-4)
3 (n-3)(n-4) (n-4)(n-5) (n-4)(n-5) (n-5)(n-6)
lets say at particular value of i, the tree ends, that case would be when n-i=1, hence i=n-1, meaning that the height of the tree is n-1.
Now lets see how much work is done for each of n layers in tree.Note that each step takes O(1) time as stated in recurrence relation.
2^0=1 n
2^1=2 (n-1) (n-2)
2^2=4 (n-2) (n-3) (n-3) (n-4)
2^3=8 (n-3)(n-4) (n-4)(n-5) (n-4)(n-5) (n-5)(n-6) ..so on
2^i for ith level
since i=n-1 is height of the tree work done at each level will be
i work
1 2^1
2 2^2
3 2^3..so on
Hence total work done will sum of work done at each level, hence it will be 2^0+2^1+2^2+2^3...+2^(n-1) since i=n-1.
By geometric series this sum is 2^n, Hence total time complexity here is O(2^n)
The proof answers are good, but I always have to do a few iterations by hand to really convince myself. So I drew out a small calling tree on my whiteboard, and started counting the nodes. I split my counts out into total nodes, leaf nodes, and interior nodes. Here's what I got:
IN | OUT | TOT | LEAF | INT
1 | 1 | 1 | 1 | 0
2 | 1 | 1 | 1 | 0
3 | 2 | 3 | 2 | 1
4 | 3 | 5 | 3 | 2
5 | 5 | 9 | 5 | 4
6 | 8 | 15 | 8 | 7
7 | 13 | 25 | 13 | 12
8 | 21 | 41 | 21 | 20
9 | 34 | 67 | 34 | 33
10 | 55 | 109 | 55 | 54
What immediately leaps out is that the number of leaf nodes is fib(n). What took a few more iterations to notice is that the number of interior nodes is fib(n) - 1. Therefore the total number of nodes is 2 * fib(n) - 1.
Since you drop the coefficients when classifying computational complexity, the final answer is θ(fib(n)).
It is bounded on the lower end by 2^(n/2) and on the upper end by 2^n (as noted in other comments). And an interesting fact of that recursive implementation is that it has a tight asymptotic bound of Fib(n) itself. These facts can be summarized:
T(n) = Ω(2^(n/2)) (lower bound)
T(n) = O(2^n) (upper bound)
T(n) = Θ(Fib(n)) (tight bound)
The tight bound can be reduced further using its closed form if you like.
It is simple to calculate by diagramming function calls. Simply add the function calls for each value of n and look at how the number grows.
The Big O is O(Z^n) where Z is the golden ratio or about 1.62.
Both the Leonardo numbers and the Fibonacci numbers approach this ratio as we increase n.
Unlike other Big O questions there is no variability in the input and both the algorithm and implementation of the algorithm are clearly defined.
There is no need for a bunch of complex math. Simply diagram out the function calls below and fit a function to the numbers.
Or if you are familiar with the golden ratio you will recognize it as such.
This answer is more correct than the accepted answer which claims that it will approach f(n) = 2^n. It never will. It will approach f(n) = golden_ratio^n.
2 (2 -> 1, 0)
4 (3 -> 2, 1) (2 -> 1, 0)
8 (4 -> 3, 2) (3 -> 2, 1) (2 -> 1, 0)
(2 -> 1, 0)
14 (5 -> 4, 3) (4 -> 3, 2) (3 -> 2, 1) (2 -> 1, 0)
(2 -> 1, 0)
(3 -> 2, 1) (2 -> 1, 0)
22 (6 -> 5, 4)
(5 -> 4, 3) (4 -> 3, 2) (3 -> 2, 1) (2 -> 1, 0)
(2 -> 1, 0)
(3 -> 2, 1) (2 -> 1, 0)
(4 -> 3, 2) (3 -> 2, 1) (2 -> 1, 0)
(2 -> 1, 0)
The naive recursion version of Fibonacci is exponential by design due to repetition in the computation:
At the root you are computing:
F(n) depends on F(n-1) and F(n-2)
F(n-1) depends on F(n-2) again and F(n-3)
F(n-2) depends on F(n-3) again and F(n-4)
then you are having at each level 2 recursive calls that are wasting a lot of data in the calculation, the time function will look like this:
T(n) = T(n-1) + T(n-2) + C, with C constant
T(n-1) = T(n-2) + T(n-3) > T(n-2) then
T(n) > 2*T(n-2)
...
T(n) > 2^(n/2) * T(1) = O(2^(n/2))
This is just a lower bound that for the purpose of your analysis should be enough but the real time function is a factor of a constant by the same Fibonacci formula and the closed form is known to be exponential of the golden ratio.
In addition, you can find optimized versions of Fibonacci using dynamic programming like this:
static int fib(int n)
{
/* memory */
int f[] = new int[n+1];
int i;
/* Init */
f[0] = 0;
f[1] = 1;
/* Fill */
for (i = 2; i <= n; i++)
{
f[i] = f[i-1] + f[i-2];
}
return f[n];
}
That is optimized and do only n steps but is also exponential.
Cost functions are defined from Input size to the number of steps to solve the problem. When you see the dynamic version of Fibonacci (n steps to compute the table) or the easiest algorithm to know if a number is prime (sqrt(n) to analyze the valid divisors of the number). you may think that these algorithms are O(n) or O(sqrt(n)) but this is simply not true for the following reason:
The input to your algorithm is a number: n, using the binary notation the input size for an integer n is log2(n) then doing a variable change of
m = log2(n) // your real input size
let find out the number of steps as a function of the input size
m = log2(n)
2^m = 2^log2(n) = n
then the cost of your algorithm as a function of the input size is:
T(m) = n steps = 2^m steps
and this is why the cost is an exponential.
Well, according to me to it is O(2^n) as in this function only recursion is taking the considerable time (divide and conquer). We see that, the above function will continue in a tree until the leaves are approaches when we reach to the level F(n-(n-1)) i.e. F(1). So, here when we jot down the time complexity encountered at each depth of tree, the summation series is:
1+2+4+.......(n-1)
= 1((2^n)-1)/(2-1)
=2^n -1
that is order of 2^n [ O(2^n) ].
No answer emphasizes probably the fastest and most memory efficient way to calculate the sequence. There is a closed form exact expression for the Fibonacci sequence. It can be found by using generating functions or by using linear algebra as I will now do.
Let f_1,f_2, ... be the Fibonacci sequence with f_1 = f_2 = 1. Now consider a sequence of two dimensional vectors
f_1 , f_2 , f_3 , ...
f_2 , f_3 , f_4 , ...
Observe that the next element v_{n+1} in the vector sequence is M.v_{n} where M is a 2x2 matrix given by
M = [0 1]
[1 1]
due to f_{n+1} = f_{n+1} and f_{n+2} = f_{n} + f_{n+1}
M is diagonalizable over complex numbers (in fact diagonalizable over the reals as well, but this is not usually the case). There are two distinct eigenvectors of M given by
1 1
x_1 x_2
where x_1 = (1+sqrt(5))/2 and x_2 = (1-sqrt(5))/2 are the distinct solutions to the polynomial equation x*x-x-1 = 0. The corresponding eigenvalues are x_1 and x_2. Think of M as a linear transformation and change your basis to see that it is equivalent to
D = [x_1 0]
[0 x_2]
In order to find f_n find v_n and look at the first coordinate. To find v_n apply M n-1 times to v_1. But applying M n-1 times is easy, just think of it as D. Then using linearity one can find
f_n = 1/sqrt(5)*(x_1^n-x_2^n)
Since the norm of x_2 is smaller than 1, the corresponding term vanishes as n tends to infinity; therefore, obtaining the greatest integer smaller than (x_1^n)/sqrt(5) is enough to find the answer exactly. By making use of the trick of repeatedly squaring, this can be done using only O(log_2(n)) multiplication (and addition) operations. Memory complexity is even more impressive because it can be implemented in a way that you always need to hold at most 1 number in memory whose value is smaller than the answer. However, since this number is not a natural number, memory complexity here changes depending on whether if you use fixed bits to represent each number (hence do calculations with error)(O(1) memory complexity this case) or use a better model like Turing machines, in which case some more analysis is needed.