Why does the following algorithm have runtime log(log(n))? - time-complexity

I don't understand how the runtime of the algorithm can be log(log(n)). Can someone help me?
s=1
while s <= (log n)^2 do
s=3s

Notation note: log(n) indicates log2(n) throughout the solution.
Well, I suppose (log n)^2 indicates the square of log(n), which means log(n)*log(n). Let us try to analyze the algorithm.
It starts from s=1 and goes like 1,3,9,27...
Since it goes by the exponents of 3, after each iteration s can be shown as 3^m, m being the number of iterations starting from 1.
We will do these iterations until s becomes bigger than log(n)*log(n). So at some point 3^m will be equal to log(n)*log(n).
Solve the equation:
3^m = log(n) * log(n)
m = log3(log(n) * log(n))
Time complexity of the algorithm can be shown as O(m). We have to express m in terms of n.
log3(log(n) * log(n)) = log3(log(n)) + log3(log(n))
= 2 * log3(log(n))
For Big-Oh notation, constants do not matter, so let us get rid of the 2.
Time complexity = O(log3(log(n)))
Well ok, here is the deal: by the definition of Big-Oh notation, it represents an upper bound on the runtime of our function. For example, O(n) ⊆ O(n^2).
Notice that log3(a) < log2(a) for every a > 1; the two differ only by a constant factor, since log3(a) = log2(a) / log2(3).
By the same logic we can conclude that O(log3(log(n))) ⊆ O(log(log(n))).
So the time complexity of the algorithm is O(log(log n)).
Not the most scientific explanation, but I hope you got the point.
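If you want to sanity-check this empirically, here is a small Python sketch (my own addition, not part of the original argument) that replays the loop and compares the iteration count against 2 * log3(log2(n)):
import math
def count_iterations(n):
    # Replays the loop: s = 1; while s <= (log2(n))^2: s = 3*s
    limit = math.log2(n) ** 2
    s, iterations = 1, 0
    while s <= limit:
        s *= 3
        iterations += 1
    return iterations
for n in (2 ** 10, 2 ** 20, 2 ** 40, 2 ** 80):
    predicted = 2 * math.log(math.log2(n), 3)
    print(n, count_iterations(n), round(predicted, 2))
The measured counts stay within about one iteration of the prediction and grow extremely slowly, as O(log log n) suggests.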

This follows as a special case of a more general principle. Consider the following loop:
s = 1
while s < k:
s = 3s
How many times will this loop run? Well, the values of s taken on will be equal to 1, 3, 9, 27, 81, ... = 3^0, 3^1, 3^2, 3^3, ... . And more generally, on the ith iteration of the loop, the value of s will be 3^i.
This loop stops running as soon as 3^i overshoots k. To figure out where that is, we can equate and solve:
3^i = k
i = log3 k
So this loop will run a total of log3 k times.
Now, what do you think would happen if we used this loop instead?
s = 1
while s < k:
s = 4s
Using similar logic, the number of loop iterations would be log4 k. And more generally, if we have the following loop:
s = 1
while s < k:
s = c * s
Then assuming c > 1, the number of iterations will be logc k.
Given this, let's look at your loop:
s = 1
while s <= (log n)^2 do
s = 3s
Using the reasoning from above, the number of iterations of this loop works out to log3 ((log n)^2). Using properties of logarithms, we can simplify this to
log3 ((log n)^2)
= 2 log3 (log n)
= O(log log n).
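As a quick illustration of the general rule (a sketch of mine, not from the original answer), the following Python snippet counts the iterations of the multiply-by-c loop and compares them with ceil(logc(k)):
import math
def multiply_loop_iterations(k, c):
    # s = 1; while s < k: s = c * s
    s, iterations = 1, 0
    while s < k:
        s *= c
        iterations += 1
    return iterations
for c in (3, 4, 10):
    k = 10 ** 6
    print(c, multiply_loop_iterations(k, c), math.ceil(math.log(k, c)))
Up to possible floating-point rounding in the logarithm, the two columns agree.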

Related

How is the time complexity of this code O(log n)?

int a = 0, i = N;
while (i > 0)
{
a += i;
i /= 2;
}
How will I calculate the time complexity of the code? Can anyone Explain?
Time complexity is basically the number of times a loop will run. Big O is the worst-case complexity that a particular loop can have. For example, if linear search were being used to find K, which is, say, the (n-1)th element of an array (0-indexed, starting at 0), the program would have to loop through the entire array to find the element. This would mean that the loop has to run n times in the worst case, giving linear search a time complexity of O(n).
In the case of your problem, i is initially equal to N and is halved on each iteration. The loop keeps running only while N / pow(2, m) > 0 (with integer division), so it terminates after m iterations, once pow(2, m) exceeds N. So the loop runs at most m times, which is log(N):
log(N) = log(pow(2, m)) ==> m = log2(N)
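A quick way to convince yourself (an illustrative Python sketch of my own, not part of the answer) is to run the halving loop and compare its iteration count with floor(log2(N)) + 1:
import math
def halving_loop_iterations(N):
    # Replays: i = N; while i > 0: i = i / 2 (integer division)
    i, iterations = N, 0
    while i > 0:
        i //= 2
        iterations += 1
    return iterations
for N in (1, 10, 1000, 10 ** 6):
    print(N, halving_loop_iterations(N), math.floor(math.log2(N)) + 1)
Both columns match, so the loop runs O(log N) times.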

Is this O(N) algorithm actually O(logN)?

I have an integer, N.
I denote f[i] = number of appearances of the digit i in N.
Now, I have the following algorithm.
FOR i = 0 TO 9
    FOR j = 1 TO f[i]
        k = k*10 + i;
My teacher said this is O(N). It seems to me more like a O(logN) algorithm.
Am I missing something?
I think that you and your teacher are saying the same thing but it gets confused because the integer you are using is named N but it is also common to refer to an algorithm that is linear in the size of its input as O(N). N is getting overloaded as the specific name and the generic figure of speech.
Suppose we say instead that your number is Z and its digits are counted in the array d and then their frequencies are in f. For example, we could have:
Z = 12321
d = [1,2,3,2,1]
f = [0,2,2,1,0,0,0,0,0,0]
Then the cost of going through all the digits in d and computing the count for each will be O( size(d) ) = O( log (Z) ). This is basically what your second loop is doing in reverse: it executes once for each occurrence of each digit. So you are right that there is something logarithmic going on here -- the number of digits of Z is logarithmic in the size of Z. But your teacher is also right that there is something linear going on here -- counting those digits is linear in the number of digits.
The time complexity of an algorithm is generally measured as a function of the input size. Your algorithm doesn't take N as an input; the input seems to be the array f. There is another variable named k which your code doesn't declare, but I assume that's an oversight and you meant to initialise e.g. k = 0 before the first loop, so that k is not an input to the algorithm.
The outer loop runs 10 times, and the inner loop runs f[i] times for each i. Therefore the total number of iterations of the inner loop equals the sum of the numbers in the array f. So the complexity could be written as O(sum(f)) or O(Σf) where Σ is the mathematical symbol for summation.
Since you defined that N is an integer which f counts the digits of, it is in fact possible to prove that O(Σf) is the same thing as O(log N), so long as N must be a positive integer. This is because Σf equals how many digits the number N has, which is approximately (log N) / (log 10). So by your definition of N, you are correct.
My guess is that your teacher disagrees with you because they think N means something else. If your teacher defines N = Σf then the complexity would be O(N). Or perhaps your teacher made a genuine mistake; that is not impossible. But the first thing to do is make sure you agree on the meaning of N.
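To make the Σf = number of digits ≈ log10(N) relationship concrete, here is a tiny Python check (my own sketch; the value of N is just borrowed from the answer below):
import math
N = 9075936782959
# f[i] = number of appearances of the digit i in N, as defined in the question
f = [str(N).count(str(i)) for i in range(10)]
print(sum(f))                          # total number of inner-loop iterations
print(len(str(N)))                     # number of digits of N
print(math.floor(math.log10(N)) + 1)   # roughly log10(N): the same digit count
All three print 13 for this N, which is why O(Σf) and O(log N) describe the same growth.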
I find your explanation a bit confusing, but let's assume N = 9075936782959 is an integer. Then O(N) doesn't really make sense; O(length of N) makes more sense. I'll use n for the length of N.
Then f(i) means: iterate over each digit of N and count how many times i appears in N, which makes O(f(i)) = n (it's linear). I'm assuming f(i) is a function, not an array.
Your algorithm loops at most:
10 times (first loop)
0 to n times, but the total is n (the sum of f(i) for all digits must be n)
It's tempting to say that the algorithm is then O(algo) = 10 + n*f(i) = n^2 (removing the constant), but f(i) is only calculated 10 times, each time the second loop is entered, so O(algo) = 10 + n + 10*f(i) = 10 + 11n = O(n). If f(i) is an array, the lookup is constant time.
I'm sure I didn't see the problem the same way as you. I'm still a little confused about the definition in your question. How did you come up with log(n)?

Understanding time complexity: iterative algorithm

I'm new to working out time complexities and I can't seem to understand the logic behind getting this result at the end:
100 (n(n+1) / 2)
For this function:
function a() {
    int i, j, k, n;
    for(i=1; i<=n; i++) {
        for(j=1; j<=i; j++) {
            for(k=1; k<=100; k++) {
                print("hello");
            }
        }
    }
}
Here's how I understand its algorithm:
i = 1, 2, 3, 4...n
j = 1, 2, 3, 4...(dependent to i, which can be 'n')
k = 1(100), 2(100), 3(100), 4(100)...n(100)
= 100 [1, 2, 3, 4.....]
If I'll use this algorithm above to simulate the end equation, I'll get this result:
End Equation:
100 (n(n+1) / 2)
Simulation
i = 1, 2, 3, 4... n
j = 1, 2, 3, 4... n
k = 100, 300, 600, 10000
I usually study these in youtube and get the idea of Big O, Omega & Theta but when it comes to this one, I can't figure out how they end with the equation such as what I have given. Please help and if you have some best practices, please share.
EDIT:
As for my own assumption of answer, it think it should be this one:
100 ((n+n)/2) or 100 (2n / 2)
Source:
https://www.youtube.com/watch?v=FEnwM-iDb2g
At around: 15:21
I think you've got i and j correct, except that it's not clear why you say k = 100, 300, 600, ... In every loop, k simply runs from 1 to 100.
So let's think through the inner loop first:
k from 1 to 100:
// Do something
The inner loop is O(100) = O(1) because its runtime does not depend on n. Now we analyze the outer loops:
i from 1 to n:
j from 1 to i:
// Do inner stuff
Now let's count how many times "Do inner stuff" executes:
i = 1    1 time
i = 2    2 times
i = 3    3 times
...      ...
i = n    n times
This is our classic triangular sum 1 + 2 + 3 + ... n = n(n+1) / 2. Therefore, the time complexity of the outer two loops is O(n(n+1)/2) which reduces to O(n^2).
The time complexity of the entire thing is O(1 * n^2) = O(n^2) because nesting loops multiplies the complexities (assuming the runtime of the inner loop is independent of the variables in the outer loops). Note here that if we had not reduced at various phases, we would be left with O(100(n)(n+1)/2), which is equivalent to O(n^2) because of the properties of big-O notation.
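If you would rather see the exact count than the big-O, here is a rough Python sketch (names are mine, purely for illustration) that mirrors the triple loop and compares the number of prints to 100 * n(n+1)/2:
def count_prints(n):
    # Mirrors: for i in 1..n, for j in 1..i, for k in 1..100: print("hello")
    total = 0
    for i in range(1, n + 1):
        for j in range(1, i + 1):
            for k in range(1, 101):
                total += 1
    return total
for n in (5, 10, 50):
    print(n, count_prints(n), 100 * n * (n + 1) // 2)
The two columns are identical, and dropping the constant 100 and the lower-order term leaves O(n^2).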
SOME TIPS:
You asked for some best practices. Here are some "rules" that I made use of in analyzing the example you posted.
In time complexity analysis, we can ignore multiplication by a constant. This is why the inner loop is still O(1) even though it executes 100 times. Understanding this is the basis of time complexity. We are analyzing runtime on a large scale, not counting the number of clock cycles.
With nested loops where the runtime is independent of each other, just multiply the complexity. Nesting the O(1) loop inside the outer O(N^2) loops resulted in O(N^2) code.
Some more reduction rules: http://courses.washington.edu/css162/rnash/quarters/current/labs/bigOLab/lab9.htm
If you can break code up into smaller pieces (in the same way we analyzed the k loop separately from the outer loops) then you can take advantage of the nesting rule to find the combined complexity.
Note on Omega/Theta:
Theta is the "exact bound" for time complexity whereas Big-O and Omega are upper and lower bounds respectively. Because there is no random data (like there is in a sorting algorithm), we can get an exact bound on the time complexity and the upper bound is equal to the lower bound. Therefore, it does not make any difference if we use O, Omega or Theta in this case.

Time complexity of for loops, I cannot really understand a thing

So these are the for loops that I have to find the time complexity, but I am not really clearly understood how to calculate.
for (int i = n; i > 1; i /= 3) {
    for (int j = 0; j < n; j += 2) {
        ... ...
    }
    for (int k = 2; k < n; k = k * k) {
        ...
    }
}
For the first line, (int i = n; i > 1; i /= 3), it keeps dividing i by 3, and if i is less than or equal to 1 then the loop stops there, right?
But what is the time complexity of that? I think it is n, but I am not really sure.
The reason why I am thinking it is n is: if I assume that n is 30, then i will be like 30, 10, 3, 1, and then the loop stops. It runs n times, doesn't it?
And for the last for loop, I think its time complexity is also n because what it does is
k starts as 2 and keeps multiplying itself by itself until k is greater than n.
So if n is 20, k will be like 2, 4, 16 then stop. It runs n times too.
I don't really think I am understanding this kind of questions because time complexity can be log(n) or n^2 or etc but all I see is n.
I don't really know when it comes to log or square. Or anything else.
Every for loop runs n times, I think. How can log or square be involved?
Can anyone help me understanding this? Please.
Since all three loops are independent of each other, we can analyse them separately and combine the results at the end.
1. i loop
A classic logarithmic loop. There are countless examples on SO, this being a similar one. Using the result given on that page and replacing the division constant:
The exact number of times that this loop will execute is ceil(log3(n)).
2. j loop
As you correctly figured, this runs O(n / 2) times;
The exact number is floor(n / 2).
3. k loop
Another classic known result - the log-log loop. The code just happens to be an exact replica of this SO post;
The exact number is ceil(log2(log2(n)))
Combining the above steps, the total time complexity is given by
O( log3(n) * (n / 2 + log2(log2(n))) ) = O(n log n).
Note that the j-loop overshadows the k-loop.
Numerical tests for confirmation
JavaScript code:
T = function(n) {
    var m = 0;
    for (var i = n; i > 1; i /= 3) {
        for (var j = 0; j < n; j += 2)
            m++;
        for (var k = 2; k < n; k = k * k)
            m++;
    }
    return m;
}
M = function(n) {
    return Math.ceil(Math.log(n) / Math.log(3)) * (Math.floor(n / 2) + Math.ceil(Math.log2(Math.log2(n))));
}
M(n) is what the math predicts that T(n) will exactly be (the number of inner loop executions):
n T(n) M(n)
-----------------------
100000 550055 550055
105000 577555 577555
110000 605055 605055
115000 632555 632555
120000 660055 660055
125000 687555 687555
130000 715055 715055
135000 742555 742555
140000 770055 770055
145000 797555 797555
150000 825055 825055
M(n) matches T(n) perfectly as expected. A plot of T(n) against n log n (the predicted time complexity):
I'd say that is a convincing straight line.
tl;dr: I describe a couple of examples first; I analyze the complexity of OP's stated problem at the bottom of this post.
In short, the big O notation tells you something about how a program is going to perform if you scale the input.
Imagine a program (P0) that counts to 100. No matter how often you run the program, it's going to count to 100 equally fast each time (give or take). Obvious, right?
Now imagine a program (P1) that counts to a number that is variable, i.e. it takes a number as an input and counts up to it. We call this variable n. Now each time P1 runs, the performance of P1 depends on the size of n. If we make n 100, P1 will run very quickly. If we make n equal to a googolplex, it's going to take a little longer.
Basically, the performance of P1 is dependent on how big n is, and this is what we mean when we say that P1 has time-complexity O(n).
Now imagine a program (P2) where we count to the square of n (that is, n^2), rather than to n itself. Clearly the performance of P2 is going to be worse than that of P1, because the number to which it counts is immensely larger (especially for larger n's (= scaling)). You'll know by intuition that P2's time-complexity is O(n^2) if P1's complexity is O(n).
Now consider a program (P3) that looks like this:
var length = input.Length;
for (var i = 0; i < length; i++) {
    for (var j = 0; j < length; j++) {
        Console.WriteLine($"Product is {input[i] * input[j]}");
    }
}
There's no n to be found here, but as you might realise, this program still depends on an input, called input here. Simply because the program depends on some kind of input, we declare this input as n when we talk about time-complexity. If a program takes multiple inputs, we simply give those different names, so that a time-complexity could be expressed as, say, O(n * n₂ + m * n₃) for a hypothetical program taking 4 inputs.
For P3, we can discover its time-complexity by first analyzing the number of different inputs, and then by analyzing in what way its performance depends on the input.
P3 has 3 variables that it's using, called length, i and j. The first line of code does a simple assignment, whose performance does not depend on any input, meaning the time-complexity of that line of code is O(1), i.e. constant time.
The second line of code is a for loop, implying we're going to do something that might depend on the length of something. And indeed we can tell that this first for loop (and everything in it) will be executed length times. If we increase the size of length, this line of code will do linearly more, thus this line of code's time complexity is O(length) (called linear time).
The next line of code will take O(length) time again, following the same logic as before; however, since we execute this line every time we execute the for loop around it, the time complexities are multiplied, which results in O(length) * O(length) = O(length^2).
The insides of the second for loop do not depend on the size of the input (even though the input is necessary), because indexing into the input (for arrays!!) does not become slower if we increase the size of the input. This means that the insides run in constant time = O(1). Since this runs inside the other for loops, we again have to multiply it to obtain the total time complexity of the nested lines of code: outside for-loops * current block of code = O(length^2) * O(1) = O(length^2).
The total time-complexity of the program is just the sum of everything we've calculated: O(1) + O(length^2) = O(length^2) = O(n^2). The first line of code was O(1) and the for loops were analyzed to be O(length^2). You will notice 2 things:
1. We rename length to n: we do this because we express time-complexity based on generic parameters and not on the ones that happen to live within the program.
2. We removed O(1) from the equation. We do this because we're only interested in the biggest (= fastest growing) terms. Since O(n^2) is way 'bigger' than O(1), the time-complexity is defined to be equal to it (this only works like that for terms (e.g. split by +), not for factors (e.g. split by *)).
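To see the O(length^2) behaviour of P3 concretely, here is an illustrative Python equivalent (my own sketch) that counts how many times the inner line runs:
def p3_inner_iterations(data):
    # Mirrors P3: two nested loops over the same input
    length = len(data)
    count = 0
    for i in range(length):
        for j in range(length):
            count += 1  # stands in for printing data[i] * data[j]
    return count
for size in (10, 20, 40):
    print(size, p3_inner_iterations(list(range(size))))  # 100, 400, 1600
Doubling the input size quadruples the count, which is exactly the quadratic scaling described above.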
OP's problem
Now we can consider your program (P4), which is a little trickier because the variables within the program are defined a little less clearly than the ones in my examples.
for (int i = n; i > 1; i /= 3) {
    for (int j = 0; j < n; j += 2) {
        ... ...
    }
    for (int k = 2; k < n; k = k * k) {
        ...
    }
}
If we analyze it, we can say this:
The first for loop executes O(log3(n)) times, where log3 is the base-3 logarithm. Since i is divided by 3 on every iteration, log3(n) is (roughly) the number of iterations needed before i becomes smaller than or equal to 1.
The second for loop is linear in time, because its body executes O(n / 2) times: j is increased by 2 rather than by 1, which would be 'normal'. Since O(n/2) = O(n), we can say that this for loop, nested in the first one, executes O(log3(n)) * O(n) = O(n log n) times (first for * the nested for).
The third for is also nested in the first for, but since it is not nested in the second for, we're not going to multiply it by the second one (obviously, because it is only executed each time the first for is executed). Here, k is bounded by n, but since it is multiplied by a factor of itself each time, we cannot say it is linear: its growth is governed by a variable rather than by a constant. Since we square k each time, it will reach n in about log2(log2(n)) steps, because k takes the values 2, 2^2, 2^4, 2^8, ..., i.e. 2^(2^m) after m steps. Deducing this is easy if you understand how log works; if you don't get this you need to understand that first. In any case, since this for loop runs O(log(log(n))) times, the total complexity of the third for (together with the outer loop) is O(log3(n)) * O(log(log(n))) = O(log(n) * log(log(n))).
The total time-complexity of the program is now calculated as the sum of the different sub-complexities: O(n log n) + O(log(n) * log(log(n))).
As we saw before, we only care about the fastest-growing term when we talk about big O notation, so we say that the time-complexity of your program is O(n log n).

Big O notation and measuring time according to it

Suppose we have an algorithm that is of order O(2^n). Furthermore, suppose we multiply the input size n by 2, so now we have an input of size 2n. How is the time affected? Do we look at the problem as if the original time was 2^n and it now became 2^(2n), so the answer would be that the new time is the square (power of 2) of the previous time?
Big O is not for telling you the actual running time, just how the running time is affected by the size of the input. If you double the size of the input the complexity is still O(2^n), n is just bigger.
number of elements (n)    units of work (2^n)
1                         2
2                         4
3                         8
4                         16
5                         32
...                       ...
10                        1024
20                        1048576
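To connect the table to the doubling question: for a running time of exactly 2^n, going from n to 2n squares the work, since 2^(2n) = (2^n)^2; the table shows this with n = 10 and n = 20 (1048576 = 1024^2). A tiny Python check of that identity (just an illustration):
for n in (5, 10, 20):
    print(n, 2 ** (2 * n) == (2 ** n) ** 2, 2 ** n, 2 ** (2 * n))
Keep in mind, as noted above, that this describes the bounding function 2^n itself, not the actual running time of every O(2^n) algorithm.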
There's a misunderstanding here about how Big-O relates to execution time.
Consider the following formulas which define execution time:
f1(n) = 2^n + 5000n^2 + 12300
f2(n) = (500 * 2^n) + 6
f3(n) = 500n^2 + 25000n + 456000
f4(n) = 400000000
Each of these functions is O(2^n); that is, each can be shown to be less than M * 2^n for some constant M beyond some starting value n0. But obviously, the change in execution time you notice when doubling the size from n1 to 2 * n1 will vary wildly between them (not at all in the case of f4(n)). You cannot use Big-O analysis to determine effects on execution time. It only defines an upper bound on the execution time (and not even necessarily a tight one).
Some related academia below:
There are three notable bounding functions in this category:
O(f(n)): Big-O - This defines an upper-bound.
Ω(f(n)): Big-Omega - This defines a lower-bound.
Θ(f(n)): Big-Theta - This defines a tight-bound.
A given time function f(n) is Θ(g(n)) only if it is also Ω(g(n)) and O(g(n)) (that is, both upper and lower bounded).
You are dealing with Big-O, which is the usual "entry point" to the discussion; we will neglect the other two entirely.
Consider the definition from Wikipedia:
Let f and g be two functions defined on some subset of the real numbers. One writes:
f(x)=O(g(x)) as x tends to infinity
if and only if there is a positive constant M such that for all sufficiently large values of x, the absolute value of f(x) is at most M multiplied by the absolute value of g(x). That is, f(x) = O(g(x)) if and only if there exists a positive real number M and a real number x0 such that
|f(x)| <= M|g(x)| for all x > x0
Going from here, assume we have g1(n) = 2^n. If we were to compare that to g2(n) = 2^(2n) = 4^n, how would g1(n) and g2(n) relate to each other in Big-O terms?
Is 2^n <= M * 4^n for some constant M and n0 value? Of course! Using M = 1 and n0 = 1, it is true. Thus, 2^n is upper-bounded by O(4^n).
Is 4^n <= M * 2^n for some constant M and n0 value? This is where you run into problems... for no constant value of M will M * 2^n stay above 4^n as n gets arbitrarily large. Thus, 4^n is not upper-bounded by O(2^n).
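A quick illustrative check of that last point (my own sketch): the ratio 4^n / 2^n equals 2^n, so it eventually exceeds any fixed constant M you pick.
M = 1000  # pick any constant you like
for n in (1, 5, 10, 20):
    ratio = (4 ** n) // (2 ** n)  # equals 2^n
    print(n, ratio, ratio <= M)
With M = 1000 the bound already fails at n = 10, and the same happens for any fixed M once n is large enough, which is why 4^n is not O(2^n).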
See the comments for further explanation, but indeed, this is just an example I came up with to help you grasp the Big-O concept; it is not the actual algorithmic meaning.
Suppose you have an array, arr = [1, 2, 3, 4, 5].
An example of an O(1) operation would be directly accessing an index, such as arr[0] or arr[2].
An example of a O(n) operation would be a loop that could iterate through all your array, such as for elem in arr:.
n would be the size of your array. If your array is twice as big as the original array, n would also be twice as big. That's how variables work.
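A minimal Python illustration of those two operations (my own sketch, using the same example array):
arr = [1, 2, 3, 4, 5]
x = arr[0]        # O(1): direct index access, independent of len(arr)
for elem in arr:  # O(n): visits every element, so the work grows with len(arr)
    print(elem)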
See Big-O Cheat Sheet for complementary information.