how to evaluate equality of big O notation? - time-complexity

I've been asked a question that seems strange to me. Given the following two equalities, which is true, and which is false (or are they both either true or false)?
O(n^2) = O(n^3)
O(n^3) = O(n^2)
To me, this question seems absurd, since O(f(n)) just means that some time function T(n) satisfies T(n) <= c * f(n) for some constant c and all sufficiently large n.

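O(f(n)) can be thought of as the class of all functions whose growth is bounded above by f(n), up to a constant factor. Read as equalities between two such classes, both statements are false: every function in O(n^2) is also in O(n^3), but n^3 is in O(n^3) and not in O(n^2), so the two classes differ. (If "=" is instead read one-way, as "is contained in", which is how f(n) = O(g(n)) is conventionally used, then the first statement holds and the second does not.)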
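As a rough numeric illustration of why the two classes differ (a sketch, not a proof; the helper name `ratios` is made up for this example), the ratio n^2 / n^3 stays bounded while n^3 / n^2 keeps growing:

```python
from fractions import Fraction


def ratios(n):
    """Return (n^2 / n^3, n^3 / n^2) as exact fractions."""
    return Fraction(n**2, n**3), Fraction(n**3, n**2)


for n in (10, 100, 1000):
    small, large = ratios(n)
    # small is 1/n, so n^2 <= 1 * n^3 for every n >= 1: n^2 is in O(n^3).
    # large is n itself, so no constant c gives n^3 <= c * n^2 for all large n.
    print(n, float(small), int(large))
```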

Which is faster? Switch statement or dictionary?

I see a lot of the following enum-to-string conversion in Objective-C/C at work. Stuff like:
typedef NS_ENUM(NSInteger, MyAnimal) {
    MyAnimalDog,
    MyAnimalCat,
    MyAnimalFish,
};

static NSString *_TranslateMyAnimalToNSString(MyAnimal animal)
{
    switch (animal) {
        case MyAnimalDog:
            return @"my_animal_dog";
        case MyAnimalCat:
            return @"my_animal_cat";
        case MyAnimalFish:
            return @"my_animal_fish";
    }
}
Wouldn't it be faster and/or smaller to have a static dictionary? Something like:
static NSDictionary *animalsAndNames = @{@(MyAnimalCat) : @"my_animal_cat",
                                         @(MyAnimalDog) : @"my_animal_dog",
                                         @(MyAnimalFish) : @"my_animal_fish"};
The difference is small, but I'm trying to optimize the binary size and speed, which makes me inclined toward the latter.
Thanks for helping clarify.
Answer
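The dictionary should be faster when there is a large number of cases. A dictionary is a hash map, which grants O(1) expected lookup. A switch statement, on the other hand, may have to test its cases one after another, which requires ϴ(n) time in the worst case. (This assumes the switch is compiled to a chain of comparisons; compilers commonly turn a dense integer switch into a jump table, which is constant time as well.)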
A quick explanation of big O/ϴ/Ω notation
Big-O notation is used to give an asymptotic upper bound to a particular function. That is, f(n) = O(g(n)) means that f(n) does not grow faster than g(n) as n goes to infinity (up to a constant). Similarly, big Ω denotes a lower bound. Therefore, the function f(n)=n+3 is both in Ω(1) and O(n^4), which is not very useful.
Big ϴ, then, denotes a tight bound. If f(n) = O(g(n)) and f(n) = Ω(g(n)) then also f(n) = ϴ(g(n)).
Often big O suffices, since in practice nobody advertises a linear algorithm as being in O(n^3), even though that would be technically correct. In the case above, however, the emphasis is on the relative slowness of the switch statement, which big O cannot express because it only bounds from above, hence the use of ϴ/Ω.
(However, I'm not sure sacrificing readability for correctness was the right choice.)

What is a*<a[j] in pseudocode?

I was trying the time complexity multiple-choice questions on CodeChef, under practice for Data Structures and Algorithms. One of the questions had a line containing a* < a[j]. What does that line mean?
I know that without the and condition the complexity would have been O(n^2). But the a* < is completely alien to me. I searched for it on the internet, but all I got was about the A-star algorithm and asterisks! I tried running the program in Python with a print statement, but it says that * is invalid. Does it mean something like a pointer to the first element of the array?
Find the time complexity of the following function
n = len(a)
j = 0
for i = 0 to n-1:
    while (j < n and a* < a[j]):
        j += 1
The answer is given as O(n), but there are nested loops, so I would expect O(n^2). Help required! Thanks
It doesn't actually matter what a* means. The question is to determine the time complexity of the algorithm. Notice that although there are two nested loops, the inner while loop isn't a full independent loop. Its index is j, which starts at 0 and is only ever incremented, with an upper bound of n. So the inner loop can only run a maximum of n times in total. This means that the overall complexity is only O(n).
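The argument above can be checked by counting how often the inner loop body runs. This is only a sketch: the function name `count_inner_steps` and the `cond` parameter are made up here, and `cond` stands in for the unreadable `a* < a[j]` test, since the bound does not depend on what that test is.

```python
def count_inner_steps(a, cond):
    """Run the loop from the question and count executions of `j += 1`.

    `cond(a, i, j)` is a hypothetical stand-in for the `a* < a[j]` test.
    """
    n = len(a)
    j = 0
    steps = 0
    for i in range(n):
        # j is never reset, so across ALL outer iterations it can
        # only climb from 0 to n once.
        while j < n and cond(a, i, j):
            j += 1
            steps += 1
    return steps


data = list(range(1000))
# Even a test that is always true cannot push the total past n.
print(count_inner_steps(data, lambda a, i, j: True))
# One possible reading of the test; the total is still at most n.
print(count_inner_steps(data, lambda a, i, j: a[i] >= a[j]))
```

The total work is the n outer iterations plus at most n inner steps, which is where the O(n) comes from.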

Is a nested for loop automatically O(n^2)?

I was recently asked an interview question about testing the validity of a Sudoku board. A basic answer involves for loops. Essentially:
for(int x = 0; x != 9; ++x)
    for(int y = 0; y != 9; ++y)
        // ...
Do this nested for loop to check the rows. Do it again to check the columns. Do one more for the sub-squares, but that one is more funky because we're dividing the Sudoku board into sub-boards, so we end up with more than two nested loops, maybe three or four.
I was later asked the complexity of this code. Frankly, as far as I'm concerned, all the cells of the board are visited exactly three times so O(3n). To me, the fact that we have nested loops doesn't mean this code is automatically O(n^2) or even O(n^highest-nesting-level-of-loops). But I have a suspicion that that's the answer the interviewer expected...
Posed another way, what is the complexity of these two pieces of code:
for(int i = 0; i != n; ++i)
    // ...
and:
for(int i = 0; i != sqrt(n); ++i)
    for(int j = 0; j != sqrt(n); ++j)
        // ...
Your general intuition is correct. Let's clarify a bit about Big-O notation:
Big-O gives you an upper bound for the worst-case (time) complexity for your algorithm, in relation to n - the size of your input. In essence, it is a measurement of how the amount of work changes in relation to the size of the input.
When you say something like
all the cells of the board are visited exactly three times so O(3n).
you are implying that n (the size of your input) is the number of cells in the board, and therefore visiting all cells three times would indeed be an O(3n) (which is O(n)) operation. If this is the case, you would be correct.
However, usually when referring to Sudoku problems (or problems involving a grid in general), n is taken to be the number of cells in each row/column (an n x n board). In this case, the runtime complexity would be O(3n²) (which is indeed equal to O(n²)).
In the future, it is perfectly valid to ask your interviewer what n is.
As for the question in the title (Is a nested for loop automatically O(n^2)?) the short answer is no.
Consider this example:
for(int i = 0 ; i < n ; i++) {
    for(int j = 1 ; j < n ; j *= 2) {
        ... // some constant time operation
    }
}
The outer loop makes n iterations while the inner loop makes about log2(n) iterations, because j doubles on every pass. Therefore the time complexity will be O(n log n).
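A quick way to convince yourself is to count the constant-time operations. The sketch below assumes the inner index starts at 1 and doubles on every pass; `count_ops` is a name made up for this example.

```python
def count_ops(n):
    """Count how many times the innermost operation would run."""
    ops = 0
    for i in range(n):
        j = 1
        while j < n:
            ops += 1   # stands in for "some constant time operation"
            j *= 2     # doubling, so the inner loop ends after about log2(n) passes
    return ops


for n in (16, 1024, 65536):
    # When n is a power of two the inner loop runs exactly log2(n) times.
    print(n, count_ops(n))
```

For n = 1024 the inner loop runs 10 times per outer iteration, which gives 10240 operations in total, far fewer than the 1024^2 a "nested loops means n^2" rule would predict.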
In your examples, in the first one you have a single for-loop making n iterations, therefore a complexity of (at least) O(n) (the operation is performed an order of n times).
In the second one you have two nested for loops, each making sqrt(n) iterations, therefore a total runtime complexity of (at least) O(n) as well. The second function isn't automatically O(n^2) simply because it contains a nested loop. The number of operations performed is still of the same order (n), so these two examples have the same complexity, since we assume n is the same for both.
This is the most crucial point to drive home. To compare the performance of two algorithms, you must measure both against the same definition of n. In your Sudoku problem you could have defined n in a few different ways, and the definition you pick directly affects the complexity you calculate, even though the amount of work stays the same.
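The two loop shapes from the question can be compared by counting iterations for the same n. This is a minimal sketch; the function names are made up, and `math.isqrt` (the integer square root) is used in place of `sqrt(n)` so that the loop bounds are whole numbers.

```python
import math


def single_loop(n):
    """One loop making n iterations."""
    count = 0
    for i in range(n):
        count += 1
    return count


def nested_sqrt_loops(n):
    """Two nested loops, each making floor(sqrt(n)) iterations."""
    root = math.isqrt(n)
    count = 0
    for i in range(root):
        for j in range(root):
            count += 1
    return count


for n in (81, 10_000, 1_000_000):
    # For perfect squares both counts are exactly n.
    print(n, single_loop(n), nested_sqrt_loops(n))
```

With n fixed as the number of cells, the nested version does the same amount of work as the single loop; it is the definition of n, not the nesting depth, that decides the bound.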
*NOTE - this is unrelated to your question, but in the future avoid using != in loop conditions. In your second example, if sqrt(n) is not a whole number, the condition i != sqrt(n) is never false and the loop could run forever, depending on the language and how it is defined. It is therefore recommended to use < instead.
It depends on how you define the so-called N.
If the size of the board is N-by-N, then yes, the complexity is O(N^2).
But if you say the total number of cells is N (i.e., the board is sqrt(N)-by-sqrt(N)), then the complexity is O(N), or 3O(N) if you mind the constant.

Analyzing time complexity (Poly log vs polynomial)

Say an algorithm runs at
[5n^3 + 8n^2(lg (n))^4]
Which is the first order term? Would it be the one with the poly log or the polynomial?
For any two constants a>0, b>0, log(n)^a is in o(n^b) (note the small-o notation here).
One way to prove this claim is to examine what happens when we apply a monotonically increasing function, the log function, to both sides:
log(log(n)^a) = a * log(log(n))
log(n^b) = b * log(n)
Since we can ignore constant factors in asymptotic notation, the question of which is bigger, log(n)^a or n^b, comes down to which is bigger, log(log(n)) or log(n). That question is much more intuitive to answer: log(n) wins. For the expression above this gives (lg(n))^4 in o(n), so 8n^2(lg(n))^4 is in o(n^3) and the first-order term is the polynomial one, 5n^3.
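As a rough numeric check on the expression from the question (a sketch; `term_ratio` is a name made up here), the ratio of the polylog term 8n^2(lg n)^4 to the polynomial term 5n^3 can be evaluated at n = 2^k, where lg n is exactly k:

```python
def term_ratio(k):
    """Ratio 8*n^2*(lg n)^4 / (5*n^3) at n = 2**k, which simplifies to 8*k^4 / (5*2^k)."""
    return (8 * k**4) / (5 * 2**k)


for k in (10, 20, 40, 80):
    # The ratio falling toward 0 means the n^3 term dominates for large n.
    print(f"n = 2^{k}: ratio = {term_ratio(k):.3e}")
```

For small inputs the polylog term is actually the larger one (the ratio at n = 2^10 is above 1), which is a reminder that asymptotic dominance only says something about sufficiently large n.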

Renaming variables to solve recursion method

I know the idea of renaming variables: it transforms the recurrence into one that you have seen before.
I'm OK with the slide until line 4, where they rename T(2^m) to S(m). Doesn't this mean they set 2^m = m?
So S(m) should be:
S(m)= 2T(m^(0.5)) + m
Also, I think we shouldn't leave m as it is, because here it means 2^m, but the two are not really the same.
Could anyone explain this to me?
And how can I know which variables to use to make it easy for me?
Everything you're saying is correct up to the point where you claim that since S(m) = T(2^m), then m = 2^m.
The step of defining S(m) = T(2^m) is similar to defining some new function g in terms of an old function f. For example, if you define a new function g(x) = 2f(5x), you're not saying that x = 5x. You're just defining a new function that's evaluated in terms of f.
So let's see what happens from here. We've defined S(m) = T(2^m). That means that
S(m) = T(2^m)
     = 2T(√(2^m)) + lg(2^m)
We can do some algebraic simplification to see that
S(m) = 2T(2^(m/2)) + m
And, using the connection between T and S, we see that
S(m) = 2S(m/2) + m
Notice that we ended up with the recurrence S(m) = 2S(m/2) + m not by just replacing T with S in the original recurrence, but by doing algebraic substitutions and simplifications.
Once we're here, we can use the master theorem to solve S(m) and get that S(m) = O(m log m), so
T(n) = S(lg n) = O(lg n lg lg n).
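The result can be sanity-checked numerically. The sketch below assumes the recurrence is T(n) = 2T(√n) + lg n, as the derivation above implies, and picks an arbitrary base case T(2) = 1 (the base case is an assumption, not something given in the question). It only evaluates n of the form 2^(2^k), so every square root is exact.

```python
import math


def T(n):
    """Evaluate T(n) = 2*T(sqrt(n)) + lg(n) for n = 2**(2**k), with assumed T(2) = 1."""
    if n == 2:
        return 1
    lg_n = n.bit_length() - 1          # exact lg(n) because n is a power of two
    return 2 * T(math.isqrt(n)) + lg_n


for k in range(1, 6):
    m = 2**k                           # m = lg n
    n = 2**m
    # With this base case the closed form is m * (lg m + 1), i.e. Theta(lg n * lg lg n).
    print(m, T(n), m * (k + 1))
```

The two printed columns agree, matching the O(lg n lg lg n) bound.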
As for how you'd come up with this in the first place - that just takes practice. The key insight is that to use the master theorem you need to shrink the size of the problem by a constant factor each time, so you need to find a transformation that converts square roots into division by a constant. Square roots are a kind of exponentiation, and logarithms are specifically designed to convert exponentiation into multiplication and division, so it's reasonable to try a log or exponential substitution. Now that you know the trick, I suspect that you'll see it in a lot more places.
You could, as an alternative, also just divide the first equation by log(n) to get
T(n)/log(n)=T(sqrt(n))/log(sqrt(n)) + 1
and then just use
S(n) = T(n)/log(n) with S(n) = S(sqrt(n)) + 1
or in a different way
S(k) = T(n^(2^(-k)))/log(n^(2^(-k)))
where then
S(k) = S(k+1) + 1
is again a well-known recursive equation.