Order of growth - time-complexity

for
f = n(log(n))^5
g = n^1.01
is
f = O(g)
f = Theta(g)
f = Omega(g)?
I tried dividing both by n and I got
f = log(n)^5
g = n^0.01
But I am still clueless as to which one grows faster. Can someone help me with this and explain the reasoning behind the answer? I really want to know how (without a calculator) one can determine which one grows faster.

Probably easiest to compare their logarithmic profiles:
If (for some C1, C2, a, k > 0)
f < C1 n log(n)^a
g < C2 n^(1+k)
Then (for large enough n)
log(f) < log(n) + a log(log(n)) + log(C1)
log(g) < log(n) + k log(n) + log(C2)
Both are dominated by the log(n) term, so the question is which residual is bigger. The k log(n) residual grows faster than the a log(log(n)) residual, regardless of how small k is or how large a is, so g eventually grows faster than f.
So in terms of big-O notation: g grows faster than f, so f can (asymptotically) be bounded from above by a function like g:
f(n) < C3 g(n)
So f = O(g). Similarly, g can be bounded from below by f, so g = Omega(f). But f cannot be bounded from below by a function like g, since g will eventually outgrow it. So f != Omega(g) and f != Theta(g).
But aaa makes a very good point: g does not begin to dominate over f until n becomes obscenely large.
I don't have a lot of experience with algorithm scaling, so corrections are welcome.
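For a concrete sanity check, here is a rough Python sketch (my own illustration, not part of the original argument) that compares the two functions in log space; log space is necessary because the raw values overflow floating point long before the crossover:
import math

# Compare f(n) = n*(log n)^5 and g(n) = n^1.01 via their logarithms:
#   log f(n) = log n + 5*log(log n)
#   log g(n) = 1.01*log n
for exp in [1, 2, 3, 6, 10, 100, 1000, 2000, 5000]:
    log_n = exp * math.log(10)            # n = 10**exp
    log_f = log_n + 5 * math.log(log_n)
    log_g = 1.01 * log_n
    leader = "f > g" if log_f > log_g else "g > f"
    print(f"n = 10^{exp:<4}  log f = {log_f:10.2f}  log g = {log_g:10.2f}  {leader}")
In this sketch f is still ahead at n = 10^1000 and only falls behind somewhere before n = 10^2000, which is exactly the "obscenely large" crossover point mentioned above.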

I would break this up into several easy, reusable lemmas:
Lemma 1: For a positive constant k, f = O(g) if and only if f = O(k g).
Proof: Suppose f = O(g). Then there exist constants c and N such that |f(n)| < c |g(n)| for n > N.
Thus |f(n)| < (c/k) (k |g(n)| ) for n > N and constant (c/k), so f = O (k g). The converse is trivially similar.
Lemma 2: If h is a positive, monotonically increasing function with the additional property that h(c x) = O(h(x)) for every constant c > 0 (any power function h(x) = x^a has this property), and f and g are positive for sufficiently large n, then f = O(g) if and only if h(f) = O(h(g)).
Proof: Suppose f = O(g). Then there exist constants c and N such that |f(n)| < c |g(n)| for n > N. Since f and g are positive for n > M, f(n) < c g(n) for n > max(N, M). Since h is monotonically increasing, h(f(n)) < h(c g(n)) for n > max(N, M), and by the extra property h(c g(n)) < c' h(g(n)) for some constant c' and sufficiently large n. Since h is positive, |h(f(n))| < c' |h(g(n))| for sufficiently large n, and thus h(f) = O(h(g)).
The converse follows similarly; the key fact being that if h is monotonically increasing, then h(a) < h(b) implies a < b.
Lemma 3: If h is an invertible, monotonically increasing function, then f = O(g) if and only if f(h(n)) = O(g(h(n))).
Proof: Suppose f = O(g). Then there exist constants c, N such that |f(n)| < c |g(n)| for n > N. Thus |f(h(n))| < c |g(h(n))| whenever h(n) > N. Since h is invertible and monotonically increasing, h(n) > N whenever n > h^-1(N). Thus h^-1(N) is the new threshold we need, and f(h(n)) = O(g(h(n))).
The converse follows similarly, using g's inverse.
Lemma 4: If h(n) is nonzero for n > M, f = O(g) if and only if f(n)h(n) = O(g(n)h(n)).
Proof: If f = O(g), then for constants c, N, |f(n)| < c |g(n)| for n > N. Since |h(n)| is positive for n > M, |f(n)h(n)| < c |g(n)h(n)| for n > max(N, M) and so f(n)h(n) = O(g(n)h(n)).
The converse follows similarly by using 1/h(n).
Lemma 5a: log n = O(n).
Proof: Let f = log n, g = n. Then f' = 1/n and g' = 1, so for n > 1, g increases more quickly than f. Moreover g(1) = 1 > 0 = f(1), so |f(n)| < |g(n)| for n > 1 and f = O(g).
Lemma 5b: n != O(log n).
Proof: Suppose otherwise for contradiction, and let f = n and g = log n. Then for some constants c, N, |n| < c |log n| for n > N.
Let d = max(2, 2c, sqrt(N+1)). By the calculation in Lemma 5a, since d >= 2 > 1, log d < d. Thus
|f(2d^2)| = 2d^2 > 2d(log d) >= d log d + d log 2 = d log(2d) >= 2c log(2d) > c log(2d^2) = c g(2d^2) = c |g(2d^2)|, with 2d^2 > N, a contradiction. Thus f != O(g).
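Lemma 5b is the only place where an explicit witness point is constructed, so it is worth a quick numeric sanity check; here is a small Python sketch of mine (using the natural log, consistent with the derivative argument in Lemma 5a):
import math

# Lemma 5b: for any proposed constants c and N, the point 2*d^2 (with
# d = max(2, 2c, sqrt(N+1))) lies beyond N and satisfies n > c*log(n).
def witness(c, N):
    d = max(2, 2 * c, math.sqrt(N + 1))
    x = 2 * d * d
    return x, c * math.log(x)

for c, N in [(1, 10), (5, 100), (100, 10**6), (10**6, 10**9)]:
    x, bound = witness(c, N)
    assert x > N and x > bound
    print(f"c={c}, N={N}:  n={x:.3e} > c*log(n)={bound:.3e}")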
So now we can assemble the answer to the question you originally asked.
Step 1:
log n = O(n^a)
n^a != O(log n)
For any positive constant a.
Proof: log n = O(n) by Lemma 5a. Thus log n = 1/a log n^a = O(1/a n^a) = O(n^a) by Lemmas 3 (for h(n) = n^a), 4, and 1. The second fact follows similarly by using Lemma 5b.
Step 2:
log^5 n = O(n^0.01)
n^0.01 != O(log^5 n)
Proof: log n = O(n^0.002) by step 1. Then by Lemma 2 (with h(n) = n^5), log^5 n = O( (n^0.002)^5 ) = O(n^0.01). The second fact follows similarly.
Final answer:
n log^5 n = O(n^1.01)
n^1.01 != O(n log^5 n)
In other words,
f = O(g)
f != Theta(g)
f != Omega(g)
Proof: Apply Lemma 4 (using h(n) = n) to step 2.
With practice these rules become "obvious" and second nature, and unless your test requires that you prove your answer, you'll find yourself whipping through these kinds of big-O problems.

how about checking their intersections?
Solve[Log[n] == n^(0.01/5), n]
{{n -> 2.72374}, {n -> 8.70811861815*10^1809}}
I cheated with Mathematica
you can also reason with derivatives,
In[71]:= D[Log[n], n]
Out[71]= 1/n

In[72]:= D[n^(0.01/5), n]
Out[72]= 0.002/n^0.998
Consider what happens as n gets really large: the derivative of Log[n] decays like 1/n, while the derivative of n^0.002 decays only like 1/n^0.998, so their ratio, 500/n^0.002, tends to zero and the power eventually outgrows the log.
This tells you which one is more complex theoretically.
However, in the practical region, the first function is going to grow faster.
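If Mathematica is not at hand, the large root can be reproduced with a few lines of Python by moving to log space: with t = log(n), the equation Log[n] == n^0.002 becomes log(t) = 0.002*t, which ordinary floats handle fine. (A sketch of mine, not part of the original answer.)
import math

# Solve log(t) = 0.002*t for the large root by bisection, where t = log(n).
def h(t):
    return math.log(t) - 0.002 * t

lo, hi = 10.0, 100000.0      # h(10) > 0 and h(100000) < 0, so a root lies between
for _ in range(100):
    mid = (lo + hi) / 2
    if h(mid) > 0:
        lo = mid
    else:
        hi = mid

t = (lo + hi) / 2
print("t = log(n) ~", t)                  # about 4168
print("n ~ 10^", t / math.log(10))        # about 10^1810, matching the Solve result above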

This is not 100% mathematically kosher without proving something about logs, but here it goes:
f = log(n)^5
g = n^0.01
We take logs of both:
log(f) = log(log(n)^5) = 5*log(log(n)) = O(log(log(n)))
log(g) = log(n^0.01) = 0.01*log(n) = O(log(n))
From this we see that the first one grows asymptotically slower, because it has a double log in it and logs grow slowly. A non-formal argument for why this reasoning by taking logs is valid is this: log(n) tells you roughly how many digits there are in the number n. So if the number of digits of g is growing asymptotically faster than the number of digits of f, then surely the actual number g is growing faster than the number f!

Related

What would be the time complexity of a loop that runs n-2 times?

I have a loop that runs n-2 times; what would be the time complexity in this case?
for(int m=1; m<arr.length-1; m++) {
}
I am not convinced that it is O(n), because it will never run n times, not even in the worst case.
O(n) just means "on the order of n". Specifically, the definition is that some function f(n) is O(g(n)) if there exist some k and c such that for all n > k, f(n) < c * g(n). In this case, set f(n) = n - 2 and g(n) = n; with c = 1 (and any k), n - 2 < n for all n, so n - 2 is indeed O(n). It is also Omega(n): for c = 2 and k = 10, n < 2 * (n - 2) for all n > 10. So the loop runs Θ(n) times, i.e. "on the order of n" times, even though it never runs exactly n times.
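As a quick numeric sanity check, here is a small Python sketch (my own illustration, not part of the answer) that counts the iterations and tests those inequalities:
# Count the iterations of a loop running m = 1 .. len(arr) - 2, i.e. n - 2 times,
# and check the Big-O inequalities from the definition above. (Illustrative sketch.)
for n in [5, 10, 100, 1000]:
    arr = list(range(n))
    iterations = sum(1 for m in range(1, len(arr) - 1))
    assert iterations == n - 2
    assert iterations < 1 * n        # n - 2 = O(n) with c = 1
    assert n < 2 * iterations        # n = O(n - 2) with c = 2 (holds for n > 4)
    print(n, iterations)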

Proving or Refuting Time Complexity

I have an exam soon and I haven't been at university for a long time because I was in the hospital.
Prove or refute the following statements:
log(n) = O(√n)
3^(n-1)= O(2^n)
f(n) + g(n) = O(f(g(n)))
2^(n+1) = O(2^n)
Could someone please help me and explain these to me?
(1) is true because log(n) grows asymptotically slower than any polynomial, including sqrt(n) = n^(1/2). To prove this we can observe that both log(n) and sqrt(n) are strictly increasing functions for n > 0 and then focus on a sequence where both evaluate easily, e.g., n = 2^(2k). Now we see log(2^(2k)) = 2k, but sqrt(2^(2k)) = 2^k. For k = 2, 2k = 2^k, and for k > 2, 2k < 2^k. To finish, note that for any n between 2^(2k) and 2^(2(k+1)) with k >= 2 we have log(n) <= 2(k+1) <= 2*2^k = 2*sqrt(2^(2k)) <= 2*sqrt(n), so the constant 2 takes care of all n >= 16.
(2) it is not true that 3^(n-1) is O(2^n). Suppose this were true. Then there exists an n0 and c such that for n > n0, 3^(n-1) <= c*2^n. First, eliminate the -1 by writing 3^(n-1) = (1/3)*3^n, so (1/3)*3^n <= c*2^n. Next, divide through by 2^n: (1/3)*(3/2)^n <= c. Multiply by 3: (3/2)^n <= 3c. Finally, take the log of both sides with base 3/2: n <= log_(3/2)(3c). The RHS is a constant and n is a variable, so this cannot hold for arbitrarily large n as required. This is a contradiction, so our supposition was wrong; that is, 3^(n-1) is not O(2^n).
(3) this is not true. f(n) = 1 and g(n) = n is an easy counterexample. In this case, f(n) + g(n) = 1 + n, but f(g(n)) = 1, so O(f(g(n))) = O(1), and 1 + n is certainly not O(1).
(4) this is true. Rewrite 2^(n+1) as 2*2^n and it becomes obvious that this is true for n >= 1 by choosing c > 2.
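If you want to convince yourself numerically before writing the proofs, here is a rough Python sketch (mine, just for illustration) of the ratios behind (1), (2) and (4); statement (3) is settled by the counterexample above:
import math

# Check the ratios numerically:
#   log(n)/sqrt(n) -> 0        (so log n = O(sqrt n))
#   3^(n-1)/2^n    -> infinity (so 3^(n-1) is not O(2^n))
#   2^(n+1)/2^n    = 2         (so 2^(n+1) = O(2^n))
for n in [10, 100, 500, 1000]:
    r1 = math.log(n) / math.sqrt(n)
    r2 = 3 ** (n - 1) / 2 ** n
    r4 = 2 ** (n + 1) / 2 ** n
    print(f"n={n:5d}  log/sqrt={r1:.4f}  3^(n-1)/2^n={r2:.3e}  2^(n+1)/2^n={r4}")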

Time Complexity of nested loops including if statement

I'm unsure of the general time complexity of the following code.
Sum = 0
for i = 1 to N
    if i > 10
        for j = 1 to i do
            Sum = Sum + 1
Assuming i and j are incremented by 1.
I know that the first loop is O(n) but the second loop is only going to run when N > 10. Would the general time complexity then be O(n^2)? Any help is greatly appreciated.
Consider the definition of Big O Notation.
________________________________________________________________
Let f: ℝ → ℝ and g: ℝ → ℝ.
Then, f(x) = O(g(x))
⇔
∃ k ∈ ℝ and ∃ M > 0, M ∈ ℝ, such that ∀ x ≥ k, |f(x)| ≤ M ⋅ |g(x)|
________________________________________________________________
Which can be read less formally as:
________________________________________________________________
Let f and g be functions defined on a subset of the real numbers.
Then, f is O of g if, for big enough x's (this is what the k is for in the formal definition), there is a constant M (from the real numbers, of course) such that M times g(x) will always be greater than or equal to f(x) (really, you can just increase M and it will always be strictly greater, but I digress).
________________________________________________________________
(You may note that if a function is O(n), then it is also O(n²) and O(e^n), but of course we are usually interested in the "smallest" function g such that it is O(g). In fact, when someone says f is O of g then they almost always mean that g is the smallest such function.)
Let's translate this to your problem. Let f(N) be the amount of time your process takes to complete as a function of N. Now, pretend that addition takes one unit of time to complete (and checking the if statement and incrementing the for-loop take no time), then
f(1) = 0
f(2) = 0
...
f(10) = 0
f(11) = 11
f(12) = 23
f(13) = 36
f(14) = 50
We want to find a function g(N) such that for big enough values of N, f(N) ≤ M ⋅g(N). We can satisfy this by g(N) = N² and M can just be 1 (maybe it could be smaller, but we don't really care). In this case, big enough means greater than 10 (of course, f is still less than M⋅g for N <11).
tl;dr: Yes, the general time complexity is O(n²) because Big O assumes that your N is going to infinity.
Let's assume your code is
Sum = 0
for i = 1 to N
    for j = 1 to i do
        Sum = Sum + 1
There are 1 + 2 + ... + N = N(N+1)/2 sum operations in total, which is on the order of N^2. Your code with if i > 10 does 1 + 2 + ... + 10 = 55 fewer sum operations. As a result, for big enough N we have
N(N+1)/2 - 55
operations. That is
O(N^2) - O(1) = O(N^2)
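A short Python sketch (my own, for illustration) confirms both the exact count N(N+1)/2 - 55 and the fact that it is on the order of N^2; it also reproduces the small table from the previous answer:
# Count the "Sum = Sum + 1" operations performed by the guarded nested loop.
def count_ops(N):
    total = 0
    for i in range(1, N + 1):
        if i > 10:
            for j in range(1, i + 1):
                total += 1
    return total

for N in [10, 11, 12, 13, 14, 100, 1000]:
    exact = count_ops(N)
    formula = N * (N + 1) // 2 - 55 if N > 10 else 0
    print(N, exact, formula, exact / N**2)   # last column approaches 1/2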

How is the time complexity of gcd Θ(log n)?

I was solving a time-complexity question on Interview Bit, given in the image below.
The answer given is Θ(log n), and I am not able to grasp how the log n term arrives in the time complexity of this program.
Can someone please explain how the answer is Θ(log n)?
Theorem: given any x, gcd(n, m) where n <= fib(x) is called recursively at most x times.
Note: fib(x) is fibonacci(x), where fib(x) = fib(x-1) + fib(x-2)
Proof
Basis
For every n <= fib(1), gcd(n, m) is gcd(1, m), which makes only one recursive call.
Inductive step
Assume the theorem holds for every number up to x, which means:
calls(gcd(n, m)) <= x for every n <= fib(x)
Now consider n where n <= fib(x+1):
if m > fib(x) (then n <= fib(x+1) < 2m, so n % m = n - m)
calls(gcd(n, m))
= calls(gcd(m, (n-m))) + 1
= calls(gcd(n-m, m%(n-m))) + 2 because n - m <= fib(x-1)
<= x - 1 + 2
= x + 1
if m <= fib(x)
calls(gcd(n, m))
= calls(gcd(m, (n%m))) + 1 because m <= fib(x)
<= x + 1
So the theorem also holds for x + 1; by mathematical induction, the theorem holds for every x.
Conclusion
In the worst case, gcd(n, m) makes Θ(fib^-1(n)) recursive calls (the inverse of fib), which is Θ(log n) because fib grows exponentially.
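To see the Fibonacci bound in action, here is a small Python sketch of mine (assuming the usual Euclidean recursion gcd(a, b) -> gcd(b, a % b) that the proof above analyzes); consecutive Fibonacci numbers are the worst case, and the call count grows with the index x, i.e. logarithmically in n:
import math

# Euclidean gcd with a counter for the number of recursive calls.
def gcd_calls(a, b, calls=0):
    if b == 0:
        return a, calls
    return gcd_calls(b, a % b, calls + 1)

# Consecutive Fibonacci numbers are the worst case.
fib = [1, 1]
while len(fib) < 40:
    fib.append(fib[-1] + fib[-2])

for x in [10, 20, 30, 39]:
    n, m = fib[x], fib[x - 1]
    _, calls = gcd_calls(n, m)
    print(f"x={x:2d}  fib(x)={n:10d}  calls={calls:2d}  log2(fib(x))={math.log2(n):5.1f}")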
This algorithm generates a decreasing sequence of integer (m, n) pairs. We can show that such a sequence decays fast enough.
Let's say we start with m_1 and n_1, with m_1 < n_1.
At each step we take n_1 % m_1, which is < m_1, and repeat recursively on the pair m_2 = n_1 % m_1 and n_2 = m_1. So after one step the maximum of the pair drops from n_1 to m_1.
Now, let's say n_1 % m_1 = m_1 - p for some p with 0 < p < m_1, so m_2 = m_1 - p.
Take another step, (m_2, n_2) -> (m_3, n_3) = (m_1 % (m_1 - p), m_1 - p), and then one more, (m_3, n_3) -> (m_4, n_4). After these two extra steps, the maximum of the pair is max(m_4, n_4) = m_3 = m_1 % (m_1 - p), which is less than m_1 - p and also at most p (since m_1 % (m_1 - p) = p % (m_1 - p) <= p).
So we can write max(m_4, n_4) <= min(m_1 - p, p) <= m_1 / 2 = max(m_2, n_2) / 2. This expresses the fact that the larger element of the pair at least halves every two steps, i.e. the sequence decreases geometrically; therefore the algorithm has to terminate in at most about 2 * log_2(m_1) steps.
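And here is a companion sketch (again mine, for illustration) that tracks the pairs and checks the halving claim directly: the larger element at least halves every two steps, so the number of steps stays within roughly 2*log2 of the starting value:
import math
import random

# Iterate (m, n) -> (n % m, m) and record the maximum of each pair.
def euclid_maxes(m, n):
    maxes = [max(m, n)]
    while m != 0:
        m, n = n % m, m
        maxes.append(max(m, n))
    return maxes

random.seed(0)
for _ in range(5):
    n = random.randint(10**6, 10**9)
    m = random.randint(1, n - 1)
    maxes = euclid_maxes(m, n)
    # The maximum of the pair at least halves every two steps...
    assert all(maxes[i + 2] <= maxes[i] // 2 for i in range(len(maxes) - 2))
    # ...so the number of steps is O(log n).
    print(f"n={n}  steps={len(maxes) - 1}  2*log2(n)={2 * math.log2(n):.1f}")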

Haskell Heap Issues with Parameter Passing Style

Here's a simple program that blows my heap to Kingdom Come:
intersect n k z s rs c
  | c == 23   = rs
  | x == y    = intersect (n+1) (k+1) (z+1) (z+s) (f : rs) (c+1)
  | x < y     = intersect (n+1) k (z+1) s rs c
  | otherwise = intersect n (k+1) z s rs c
  where x = (2*n*n) + 4 * n
        y = (k * k + k)
        f = (z, (x `div` 2), (z+s))

p = intersect 1 1 1 0 [] 0

main = do
  putStr (show p)
What the program does is calculate the intersection of two infinite series, stopping when it reaches 23 elements. But that's not important to me.
What's interesting is that as far as I can tell, there shouldn't be much here that is sitting on the heap. The function intersect is recursive, with all recursive calls written as tail calls. State is accumulated in the arguments, and there is not much of it: five integers and a small list of tuples.
If I were a betting person, I would bet that somehow thunks are being built up in the arguments as I do the recursion, particularly on arguments that aren't evaluated on a given recursion. But that's just a wild hunch.
What's the true problem here? And how does one fix it?
If you have a problem with the heap, run the heap profiler, like so:
$ ghc -O2 --make A.hs -prof -auto-all -rtsopts -fforce-recomp
[1 of 1] Compiling Main ( A.hs, A.o )
Linking A.exe ...
Which when run:
$ ./A.exe +RTS -M1G -hy
Produces an A.hp output file:
$ hp2ps -c A.hp
which renders the profile as a graph.
So your heap is full of Integer, which indicates some problem in the accumulating parameters of your functions -- where all the Integers are.
Modifying the function so that it is strict in the lazy Integer arguments (based on the fact you never inspect their value), like so:
{-# LANGUAGE BangPatterns #-}

intersect n k !z !s rs c
  | c == 23   = rs
  | x == y    = intersect (n+1) (k+1) (z+1) (z+s) (f : rs) (c+1)
  | x < y     = intersect (n+1) k (z+1) s rs c
  | otherwise = intersect n (k+1) z s rs c
  where x = (2*n*n) + 4 * n
        y = (k * k + k)
        f = (z, (x `div` 2), (z+s))

p = intersect 1 1 1 0 [] 0

main = do
  putStr (show p)
And your program now runs in constant space, aside from the list you're accumulating in rs (though it doesn't terminate for c == 23 in any reasonable time).
If it is OK to get the resulting list reversed, you can take advantage of Haskell's laziness and return the list as it is computed, instead of passing it recursively as an accumulating argument. Not only does this let you consume and print the list as it is being computed (thereby eliminating one space leak right there), you can also factor out the decision about how many elements you want from intersect:
{-# LANGUAGE BangPatterns #-}

intersect n k !z s
  | x == y    = f : intersect (n+1) (k+1) (z+1) (z+s)
  | x < y     = intersect (n+1) k (z+1) s
  | otherwise = intersect n (k+1) z s
  where x = (2*n*n) + 4 * n
        y = (k * k + k)
        f = (z, (x `div` 2), (z+s))

p = intersect 1 1 1 0

main = do
  putStrLn (unlines (map show (take 23 p)))
As Don noted, we need to be careful that the accumulating arguments are evaluated in a timely fashion instead of building up big thunks. By making the argument z strict we ensure that all the arguments will be demanded.
By outputting one element per line, we can watch the result being produced:
$ ghc -O2 intersect.hs && ./intersect
[1 of 1] Compiling Main ( intersect.hs, intersect.o )
Linking intersect ...
(1,3,1)
(3,15,4)
(10,120,14)
(22,528,36)
(63,4095,99)
(133,17955,232)
(372,139128,604)
(780,609960,1384)
...