Why is the time complexity of this code O(log n)? - time-complexity

Here is code given in the book "Cracking the Coding Interview" by Gayle Laakmann McDowell. I need to find the time complexity of this code:
int sumDigits(int n)
{
    int sum = 0;
    while (n > 0)
    {
        sum += n % 10;  // add the last digit of n to the sum
        n /= 10;        // drop the last digit of n
    }
    return sum;
}
I know the time complexity should be the number of digits in n.
According to the book, its run time is O(log n). The book provides a brief explanation, but I don't understand it.

while (n > 0)
{
    sum += n % 10;
    n /= 10;
}
So, how many steps does this while loop take before n reaches 0? In each step you divide n by 10, so you need to do it k times to reach 0. Note that k is the number of digits in n.
Let's go step by step:
The first step is when n > 0: you divide n by 10. If n is still positive, you divide it by 10 again, and what you get is n/10/10, or n / (10^2). After the third time it's n / (10^3), and after k times it's n / (10^k) = 0, and the loop ends. But this is not 0 in the mathematical sense; it's 0 because we are dealing with integer division. What you really have is |n| / (10^k) < 1, where k ∈ N.
So, we have this now:
n / (10^k) < 1
n < 10^k
log n < k
By the way, it's also n / (10^(k-1)) ≥ 1, so:
k - 1 ≤ log n < k (and don't forget, this is base 10).
So you need floor(log n) + 1 steps to finish the loop, and that's why it's O(log n).
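To make this concrete, here is a minimal sketch (my own, not from the book) that counts the loop's iterations and checks them against floor(log10(n)) + 1:

// Sketch: the digit loop runs floor(log10(n)) + 1 times for n > 0.
public class DigitSteps {
    static int countIterations(int n) {
        int steps = 0;
        while (n > 0) {
            n /= 10;   // same update as in sumDigits
            steps++;
        }
        return steps;
    }

    public static void main(String[] args) {
        for (int n : new int[]{7, 42, 999, 5000, 123456789}) {
            int predicted = (int) Math.floor(Math.log10(n)) + 1;
            System.out.println(n + ": " + countIterations(n)
                    + " iterations, predicted " + predicted);
        }
    }
}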

The number of times the loop body runs is log(n) to the base 10, which is the same as (log(n) to the base 2) / (log(10) to the base 2). Since that denominator is a constant, in terms of time complexity this is simply O(log n). Note that log(n) to the base 10 is (up to rounding) how you count the number of digits in n.
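As a quick illustration (my own sketch, not part of the original answer), the digit count comes out the same whether you take the log base 10 directly or compute it via base 2, because the change of base is just a constant factor:

// Sketch: digit count of n computed three ways; all three agree.
public class DigitCount {
    public static void main(String[] args) {
        int n = 98765;
        int viaString = String.valueOf(n).length();              // 5
        int viaLog10  = (int) Math.floor(Math.log10(n)) + 1;     // 5
        double log2n    = Math.log(n)  / Math.log(2);            // log base 2 of n
        double log2of10 = Math.log(10) / Math.log(2);            // constant, ~3.32
        int viaLog2   = (int) Math.floor(log2n / log2of10) + 1;  // 5
        System.out.println(viaString + " " + viaLog10 + " " + viaLog2);
    }
}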

Related

How to Understand Time Complexity of Happy Number Problem Solution from Leetcode

I have some difficulty understanding the time complexity analysis for one solution to the Happy Number question from LeetCode. I have marked my doubts on the complexity analysis below, and I'd really appreciate your advice.
Here is the question:
Link: https://leetcode.com/problems/happy-number/
Question:
Write an algorithm to determine if a number is "happy".
A happy number is a number defined by the following process: Starting with any positive integer, replace the number by the sum of the squares of its digits, and repeat the process until the number equals 1 (where it will stay), or it loops endlessly in a cycle which does not include 1. Those numbers for which this process ends in 1 are happy numbers.
Example:
Input: 19
Output: true
Explanation:
1^2 + 9^2 = 82
8^2 + 2^2 = 68
6^2 + 8^2 = 100
1^2 + 0^2 + 0^2 = 1
Here is the code:
class Solution(object):
    def isHappy(self, n):
        # getnext computes the sum of the squares of the digits of n
        def getnext(n):
            totalsum = 0
            while n > 0:
                n, v = divmod(n, 10)
                totalsum += v ** 2
            return totalsum

        # seen is a set that tracks the numbers we have already visited
        seen = set()
        # we stop checking when either the number reaches 1 or the
        # number was already visited (i.e. we are in a cycle)
        while n != 1 and n not in seen:
            seen.add(n)
            n = getnext(n)
        return n == 1
Note: feel free to let me know if I need to explain how the code works
Time Complexity Analysis:
Time complexity: O(243 * 3 + log N + log log N + log log log N + ...) = O(log N).
Finding the next value for a given number has a cost of O(log n), because we are processing each digit in the number, and the number of digits in a number is given by log N.
My doubt: why is the number of digits in a number given by log N? What is N here: the value of a specific number, or something else?
To work out the total time complexity, we'll need to think carefully about how many numbers are in the chain, and how big they are.
We determined above that once a number is below 243, it is impossible for it to go back up above 243. Therefore, based on our very shallow analysis, we know for sure that once a number is below 243, it is impossible for it to take more than another 243 steps to terminate.
Each of these numbers has at most 3 digits. With a little more analysis, we could replace the 243 with the length of the longest number chain below 243; however, because the constant doesn't matter anyway, we won't worry about it.
My doubt: I think the above paragraph is related to the 243 * 3 component of the time complexity, but I cannot understand why we multiply 243 by 3.
For an n above 243, we need to consider the cost of each number in the chain that is above 243. With a little math, we can show that in the worst case, these costs will be O(log n) + O(log log n) + O(log log log n) + ... Luckily for us, the O(log n) is the dominating part, and the others are all tiny in comparison (collectively, they add up to less than log n), so we can ignore them.
My doubt: what is the reasoning behind O(log log n) and O(log log log n) for an n above 243?
Well, my guess for the first doubt is that the number of digits of a base-10 number is given by the logarithm base 10 of its value (N), rounded down, plus one. So, for example, 1023 has floor(log10(1023)) + 1 = 4 digits. So yes, N is the value of the number. The log in a time complexity indicates a logarithm generically, not specifically base 2 or base e, since changing the base only changes a constant factor.
As for the second doubt, it probably has to do with the work required to reduce a number to below 243, but I am not sure. I'll edit this answer once I work that bit out.
Let's say N has M digits. Then getnext(N) <= 81*M; equality happens when N consists only of 9's.
When N < 1000, i.e. N has at most 3 digits, getnext(N) <= 3*81 = 243. Now, you will have to call getnext(.) at most O(243) times to figure out whether N is indeed happy.
If M > 3, the number of digits of getnext(N) must be less than M. Try getnext(9999), getnext(99999), and so on [1].
Notes:
[1] Adding a digit to N can make it at most 10*N + 9 (i.e. appending a 9 at the end), but the number of digits only increases to M+1. So the relationship between N and M is logarithmic, and hence the same relationship holds between N and 81*M.
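Here is a small sketch (my own, just to illustrate the bound) showing that getnext of an M-digit number never exceeds 81*M, with equality for all-9's numbers:

// Sketch: the sum of squared digits of an M-digit number is at most 81*M.
public class DigitSquareBound {
    static int getNext(int n) {
        int totalSum = 0;
        while (n > 0) {
            int d = n % 10;
            n /= 10;
            totalSum += d * d;
        }
        return totalSum;
    }

    public static void main(String[] args) {
        int allNines = 0;
        for (int m = 1; m <= 9; m++) {
            allNines = allNines * 10 + 9;   // 9, 99, 999, ...
            System.out.println(m + " digits: getNext(" + allNines + ") = "
                    + getNext(allNines) + " = 81*" + m);
        }
    }
}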
Using the Leetcode solution
class Solution {
    private int getNext(int n) {
        int totalSum = 0;
        while (n > 0) {
            int d = n % 10;
            n = n / 10;
            totalSum += d * d;
        }
        return totalSum;
    }

    public boolean isHappy(int n) {
        Set<Integer> seen = new HashSet<>();
        while (n != 1 && !seen.contains(n)) {
            seen.add(n);
            n = getNext(n);
        }
        return n == 1;
    }
}
O(243 * 3) for n < 243
3 is the maximum number of digits in n.
e.g. for n = 243:
getNext() will take a maximum of 3 iterations, because there are 3 digits for us to loop over.
isHappy() can take a maximum of 243 iterations to find a cycle or terminate, because we can store at most 243 numbers in our hash set.
O(log n) + O(log log n) + O(log log log n) + ... for n > 243
1st iteration + 2nd iteration + 3rd iteration + ...
getNext() will be called a maximum of O(log n) times, because log10(n) is the number of digits.
Each call to getNext() produces at most 9^2 per digit, which bounds the numbers we can store in the hash set before we find a cycle or terminate.
First iteration:
9^2 * number of digits
O(81 * log n), drop the constant:
O(log n)
+
Second iteration:
O(log(81 * log n)), drop the constant:
O(log log n)
+
Third iteration:
O(log log log n)
+
etc.
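To see why the chain of costs shrinks so quickly, here is a small demo (my own sketch, not from the LeetCode solution) that tracks only an upper bound on the digit count after each getNext() step: a d-digit number maps to at most 81*d, which has roughly log10(81*d) digits.

// Sketch: upper bound on digit counts along the happy-number chain.
// A d-digit number maps to at most 81*d, so the digit count collapses
// at a log-log-... rate, matching O(log n) + O(log log n) + ...
public class ChainShrink {
    public static void main(String[] args) {
        long digits = 1_000_000;  // imagine an n with a million digits
        while (digits > 3) {
            long bound = 81 * digits;                           // max next value
            digits = (long) Math.floor(Math.log10(bound)) + 1;  // its digit count
            System.out.println("next value <= " + bound
                    + ", which has at most " + digits + " digits");
        }
    }
}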

How is the complexity of the code below O(log n) and not O(n)?

Could someone explain how this algorithm is O(log(n)) and not O(n)?
The loop runs once for every digit in the given number, so isn't the complexity O(n)?
while (x != 0) {
    int pop = x % 10;
    x /= 10;
    if (rev > Integer.MAX_VALUE / 10 || (rev == Integer.MAX_VALUE / 10 && pop > 7))
        return 0;
    if (rev < Integer.MIN_VALUE / 10 || (rev == Integer.MIN_VALUE / 10 && pop < -8))
        return 0;
    rev = rev * 10 + pop;
}
It depends on what n is. If n is x itself, a numeric value, then the complexity is O(log(n)). If you multiply x by 10, the while loop will only be one iteration longer, not ten times as long. Likewise, multiplying x by 100 will only add two iterations.
On the other hand, if there was a variable s which was the string representation of x, and n was the length of string s, then the complexity would be O(n). Note that in this case, the length of s is proportional to log(x), so the logarithm is implicit from the viewpoint of the numeric value.
Here's a thought experiment:
The algorithm depends on the number of digits in n, not the value of n. n = 10 takes as many iterations as n = 99 because they both have 2 digits.
The number of digits in n grows at a log(n) rate, since adding a single digit requires n to become at least 10 times bigger.
Hence the algorithm has a complexity of O(log(n)).
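A quick way to convince yourself (my own sketch, following the thought experiment above): multiplying x by 10 adds exactly one iteration to the digit loop, rather than making it ten times longer:

// Sketch: iteration count of the digit loop for x versus 10 * x.
public class TenTimes {
    static int iterations(int x) {
        int steps = 0;
        while (x != 0) {
            x /= 10;   // same update as in the reverse loop
            steps++;
        }
        return steps;
    }

    public static void main(String[] args) {
        int x = 99;
        System.out.println("x = " + x + ": " + iterations(x) + " iterations");           // 2
        System.out.println("x = " + 10 * x + ": " + iterations(10 * x) + " iterations"); // 3
    }
}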

Time complexity of for loops, I cannot really understand a thing

So these are the for loops for which I have to find the time complexity, but I don't really understand how to calculate it.
for (int i = n; i > 1; i /= 3) {
    for (int j = 0; j < n; j += 2) {
        ... ...
    }
    for (int k = 2; k < n; k = k * k) {
        ...
    }
}
For the first loop, (int i = n; i > 1; i /= 3) keeps dividing i by 3, and once i is no longer greater than 1 the loop stops, right?
But what is its time complexity? I think it is n, but I am not really sure.
The reason I think it is n is: if I assume n is 30, then i goes 30, 10, 3, 1 and the loop stops. It runs n times, doesn't it?
And for the last for loop, I think its time complexity is also n, because k starts at 2 and keeps multiplying itself by itself until k is greater than n.
So if n is 20, k goes 2, 4, 16 and then stops. It runs n times too.
I don't think I really understand this kind of question, because time complexity can be log(n) or n^2 or whatever, but all I ever see is n.
I don't really know when a log or a square comes into it. Every for loop runs n times, I think. How can a log or a square be involved?
Can anyone help me understand this? Please.
Since all three loops are independent of each other, we can analyse them separately and combine the results at the end (summing the two inner loops, then multiplying by the outer one).
1. i loop
A classic logarithmic loop. There are countless examples on SO, this being a similar one. Using the result given on that page and replacing the division constant:
The exact number of times that this loop will execute is ceil(log3(n)).
2. j loop
As you correctly figured, this runs O(n / 2) times;
The exact number is floor(n / 2).
3. k loop
Another classic known result - the log-log loop. The code just happens to be an exact replicate of this SO post;
The exact number is ceil(log2(log2(n)))
Combining the above steps, the total number of iterations is ceil(log3(n)) * (floor(n / 2) + ceil(log2(log2(n)))), giving a total time complexity of O(n log n).
Note that the j-loop overshadows the k-loop.
Numerical tests for confirmation
JavaScript code:
T = function(n) {
    var m = 0;
    for (var i = n; i > 1; i /= 3) {
        for (var j = 0; j < n; j += 2)
            m++;
        for (var k = 2; k < n; k = k * k)
            m++;
    }
    return m;
}

M = function(n) {
    return Math.ceil(Math.log(n) / Math.log(3))
         * (Math.floor(n / 2) + Math.ceil(Math.log2(Math.log2(n))));
}
M(n) is what the math predicts that T(n) will exactly be (the number of inner loop executions):
n T(n) M(n)
-----------------------
100000 550055 550055
105000 577555 577555
110000 605055 605055
115000 632555 632555
120000 660055 660055
125000 687555 687555
130000 715055 715055
135000 742555 742555
140000 770055 770055
145000 797555 797555
150000 825055 825055
M(n) matches T(n) perfectly, as expected. A plot of T(n) against n log(n) (the predicted time complexity) comes out as a convincing straight line.
tl;dr: I describe a couple of examples first, then analyze the complexity of OP's stated problem at the bottom of this post.
In short, the big O notation tells you something about how a program is going to perform if you scale the input.
Imagine a program (P0) that counts to 100. No matter how often you run the program, it's going to count to 100 just as fast each time (give or take). Obvious, right?
Now imagine a program (P1) that counts to a number that is variable, i.e. it takes a number as an input and counts to it. We call this variable n. Each time P1 runs, its performance depends on the size of n. If we make n 100, P1 will run very quickly. If we make n equal to a googolplex, it's going to take a little longer.
Basically, the performance of P1 depends on how big n is, and this is what we mean when we say that P1 has time-complexity O(n).
Now imagine a program (P2) that counts to the square of n (that is, to n^2) rather than to n itself. Clearly the performance of P2 is going to be worse than that of P1, because the number it counts to is immensely bigger (especially for larger n's (= scaling)). You'll know by intuition that P2's time-complexity is O(n^2) if P1's complexity is O(n).
Now consider a program (P3) that looks like this:
var length = input.length;
for (var i = 0; i < length; i++) {
    for (var j = 0; j < length; j++) {
        Console.WriteLine($"Product is {input[i] * input[j]}");
    }
}
There's no n to be found here, but as you might realise, this program still depends on an input, here called input. Simply because the program depends on some kind of input, we treat that input as n when we talk about time-complexity. If a program takes multiple inputs, we simply give them different names, so that a time-complexity could be expressed as, say, O(n * n2 + m * n3), where this hypothetical program takes 4 inputs.
For P3, we can discover its time-complexity by first analyzing the number of different inputs, and then analyzing how its performance depends on those inputs.
P3 uses 3 variables, called length, i and j. The first line of code does a simple assignment whose performance does not depend on any input, meaning the time-complexity of that line is O(1), i.e. constant time.
The second line of code is a for loop, implying we're going to do something that may depend on the length of something. And indeed we can tell that this first for loop (and everything in it) executes length times. If we increase the size of length, this line of code does linearly more work, so this line's time-complexity is O(length) (called linear time).
The next line of code is a for loop that again takes O(length) time, following the same logic as before; however, since we execute this loop every time the outer for loop runs, the time complexities multiply: O(length) * O(length) = O(length^2).
The inside of the second for loop does not depend on the size of the input (even though the input is necessary), because indexing into the input (for arrays!) does not become slower when we increase the size of the input. This means the inside is constant time, O(1). Since this runs inside the other for loops, we again multiply to obtain the total time complexity of the nested lines of code: outer for-loops * current block of code = O(length^2) * O(1) = O(length^2).
The total time-complexity of the program is just the sum of everything we've calculated: O(1) + O(length^2) = O(length^2) = O(n^2). The first line of code was O(1) and the for loops were analyzed to be O(length^2). You will notice 2 things:
We renamed length to n: we do this because we express time-complexity in terms of generic parameters, not the names that happen to live within the program.
We removed O(1) from the equation: we do this because we're only interested in the biggest (= fastest growing) terms. Since O(n^2) is way 'bigger' than O(1), the time-complexity is defined equal to it (this only works like that for terms (e.g. split by +), not for factors (e.g. split by *)).
OP's problem
Now we can consider your program (P4), which is a little trickier because the variables within the program are defined a little more opaquely than the ones in my examples.
for (int i = n; i > 1; i /= 3) {
    for (int j = 0; j < n; j += 2) {
        ... ...
    }
    for (int k = 2; k < n; k = k * k) {
        ...
    }
}
If we analyze this, we can say the following:
The outer loop executes O(log3(n)) times, where log3 is the logarithm base 3: since i is divided by 3 on every iteration, it takes about log3(n) divisions before i becomes smaller than or equal to 1.
The second for loop is linear in time, because j is incremented by 2 rather than by 1 (which would be 'normal'), so its body executes O(n / 2) times. Since we know that O(n/2) = O(n), we can say that this for loop executes O(log3(n)) * O(n) = O(n log n) times in total (outer for * the nested for).
The third for loop is also nested in the first for, but since it is not nested in the second for, we're not going to multiply it by the second one (obviously, because it only executes once each time the first for executes). Here, k is bounded by n, but since it is multiplied by a factor of itself each time (we square it), we cannot say it is linear: its growth is driven by the variable itself rather than by a constant. After t iterations k equals 2^(2^t), so k reaches n once 2^t reaches log2(n), i.e. in O(log log n) steps. Deducing this is easy if you understand how log works; if you don't, you need to understand that first. In any case, since this for loop runs O(log log n) times per outer iteration, the total complexity of the third for is O(log3(n)) * O(log log n) = O(log n * log log n).
The total time-complexity of the program is now calculated as the sum of the different sub-complexities: O(n log n) + O(log n * log log n).
As we saw before, we only care about the fastest-growing term in big O notation, so the time-complexity of your program is O(n log n).

What is the time complexity of this do-while loop?

int count = 0;
do
{
    count++;
    n = n / 2;
} while (n > 1);
I'm having trouble seeing the pattern here, even when plugging in numbers for n and writing out each basic operation.
Thanks in advance!
Edit: I want the worst case here.
The first step is to divide n by 2, so you get n/2. Now you divide it by 2 again, if n/2 > 1, and you get n/4. If n/4 > 1 you do it again and you get n/8, which is better written as n/(2^3)... If n/(2^3) > 1 you do it again and you get n/(2^4)... So if you do it k times, you get n/(2^k). How do you calculate the k for which n/(2^k) ≤ 1? Easy:
n/(2^k) ≤ 1
n ≤ 2^k
log2(n) ≤ k
So, your algorithm needs O(log n) iterations to exit the loop.
In your code, k is count.
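As a sanity check, here is a small sketch (mine, not from the question) confirming that count comes out to floor(log2(n)) for n >= 2:

// Sketch: the do-while above runs floor(log2(n)) times for n >= 2.
public class Halvings {
    static int countHalvings(int n) {
        int count = 0;
        do {
            count++;
            n = n / 2;
        } while (n > 1);
        return count;
    }

    public static void main(String[] args) {
        for (int n = 2; n <= 64; n++) {
            int predicted = 31 - Integer.numberOfLeadingZeros(n); // exact floor(log2 n)
            if (countHalvings(n) != predicted) {
                System.out.println("mismatch at n = " + n);
            }
        }
        System.out.println("count == floor(log2(n)) for all n in [2, 64]");
    }
}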

How to calculate the complexity of the Algorithm?

I have to calculate the complexity of this algorithm. I tried to solve it and found the answer to be O(n log n). Is that correct? If not, please explain.
for (i = 5; i < n/2; i += 5)
{
    for (j = 1; j < n; j *= 4)
        op;
    x = 3 * n;
    while (x > 6)
    { op; x--; }
}
Katrina, in this example we've got an O(n*log(n)):
for (int i = 0; i < N; i++) {
    c = i;
    while (c > 0) {
        c = c / 2;
    }
}
However, you have another for loop that encloses both of these loops. I'm not quite sure I understand how the algorithm works, but the standard way, considering another enclosing for loop, should be O(n*n*log n):
for (int j = 0; j < N; j++) {       // O(n)
    for (int i = 0; i < N; i++) {   // O(n)
        c = i;
        while (c > 0) {             // O(log n)
            c = c / 2;              // O(1)
        }
    }
}
Altogether, this standard algorithm would be O(n) * O(n) * O(log n) * O(1) = O(n^2 * log n).
So, I think you forgot to include another O(n).
Hope it helps.
Let's count the number of iterations in each loop.
The outermost loop for (i=5; i<n/2; i+=5) steps through all values between 5 and n / 2 in steps of 5. It will, thus, require approximately n / 10 - 1 iterations.
There are two inner loops. Let's consider the first one: for (j=1; j<n; j*=4). This steps through all values of the form 4^x between 1 and n for integer x. The lowest value of x for which this holds is 0, and the highest is the x that fulfills 4^x < n -- i.e., the integer closest to log_4(n). Thus, labelling the iterations by x, we have iterations 0, 1, ..., log_4(n). In other words, we have approximately log_4(n) + 1 iterations for this loop.
Now consider the second inner loop. It steps through all values from 3 * n down to 7, so the number of iterations is approximately 3 n - 6.
All other operations have constant run time and can therefore be ignored.
How do we put this together? The two inner loops are run sequentially (i.e., they are not nested) so the run time for both of them together is simply the sum:
(log_4(n) + 1) + (3 n - 6) = 3 n + log_4(n) - 5.
The outer loop and the two inner loops are, however, nested. For every iteration of the outer loop, both the inner ones are run. Therefore we multiply the number of iterations of the outer with the total of the inner:
(n / 10 - 1) * (3 n + log_4(n) - 5) =
= 3 n^2 / 10 + n log_4(n) / 10 - 7 n / 2 - log_4(n) + 5.
Finally, complexity is often expressed in Big-O notation -- that is, we're only interested in the order of the run time. This means two things. First, we can ignore all constant factors in all terms. For example, O(3 n^2 / 10) becomes just O(n^2). Thereby we have:
O(3 n^2 / 10 + n log_4(n) / 10 - 7 n / 2 - log_4(n) + 5) =
= O(n^2 + n log_4(n) - n - log_4(n) + 1).
Second, we can ignore all terms that have a lower order than the term with the highest order. For example, n is of a higher order than 1, so we have O(n + 1) = O(n). Thereby we have:
O(n^2 + n log_4(n) - n - log_4(n) + 1) = O(n^2).
Finally, we have the answer. The complexity of the algorithm your code describes is O(n^2).
(In practice, one would never calculate the (approximate) number of iterations as we did here. The simplification we did in the last step can be done earlier which makes the calculations much easier.)
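For a quick empirical check (my own sketch, not part of the original answer), you can count the op executions and watch the ratio ops / n^2 settle near the leading coefficient 3/10 from the calculation above:

// Sketch: count op executions and compare growth against n^2.
// The leading term of the analysis is 3*n^2/10, so ops/n^2 -> 0.3.
public class OpCounter {
    static long countOps(int n) {
        long ops = 0;
        for (int i = 5; i < n / 2; i += 5) {
            for (long j = 1; j < n; j *= 4)
                ops++;                 // the inner for loop's op
            int x = 3 * n;
            while (x > 6) {
                ops++;                 // the while loop's op
                x--;
            }
        }
        return ops;
    }

    public static void main(String[] args) {
        for (int n : new int[]{1_000, 10_000, 30_000}) {
            double ratio = countOps(n) / ((double) n * n);
            System.out.printf("n = %d, ops/n^2 = %.4f%n", n, ratio);
        }
    }
}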
As Fredrik mentioned, the time complexity of the first and third loops is O(n), and the time for the second loop is O(log(n)).
So the complexity of the following algorithm is O(n^2).
for (i = 5; i < n/2; i += 5)
{
    for (j = 1; j < n; j *= 4)
        op;
    x = 3 * n;
    while (x > 6)
    { op; x--; }
}
Note that the complexity of the following algorithm is O(n^2 * log(n)), which is not the same as that of the algorithm above.
for (i = 5; i < n/2; i += 5)
{
    for (j = 1; j < n; j *= 4)
    {
        op;
        x = 3 * n;
        while (x > 6)
        { op; x--; }
    }
}