algorithm what(n)
begin
    if n = 1 then call A
    else
    begin
        what(n - 1);
        call B(n)
    end
end.
In the above program, I was asked to find the time complexity, where procedure A takes O(1) time and procedure B takes O(1/n) time.
I formed the recurrence relation T(n) = T(n-1) + O(1/n)
Solving it by back substitution produces a harmonic series, whose sum is O(log n), so I got T(n) = O(log n). But the answer is given as O(n), and I am not able to figure out how they got it. In the explanation they add a constant times n to the recurrence relation, and I don't understand why that constant times n should be added. Please help me understand this.
This is likely a trick question set by the author / examiner to catch you out. Note that the O(1) operations involved in each call to what (pushing arguments onto the stack, etc.) overshadow the O(1/n) cost of B, at least asymptotically speaking. So the actual recurrence is T(n) = T(n - 1) + O(1), which gives the stated answer of O(n).
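A quick way to see this numerically is to tally the two kinds of work separately as the recursion unwinds. This is only a sketch; the function name and the cost model (1 unit of overhead per call, 1/k units for B(k)) are my own assumptions:

```python
def what_cost(n):
    """Tally call overhead (Theta(1) per call) and B's work
    (modeled as 1/k for B(k)) separately for what(n)."""
    overhead = 0.0  # constant work per call: stack push, test, etc.
    b_work = 0.0    # work done by B(k) at each level
    k = n
    while k > 1:
        overhead += 1.0
        b_work += 1.0 / k
        k -= 1
    overhead += 1.0  # base case: the call to A
    return overhead, b_work

overhead, b_work = what_cost(1000)
# overhead grows linearly (1000 units), while b_work is only the
# harmonic sum minus 1, roughly 6.5: the O(1) overhead dominates
```

The harmonic part really is O(log n), but it rides on top of a linear number of constant-cost calls, which is why the O(n) answer wins.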
I came across a problem where O(nm) is being run against O(n log n), and they say that one is better, but it's not so obvious to me.
O(n log n) is better than O (n^2), but what about compared to O(n*m) with m being a non-constant?
When you examine complexity, you must define your input.
When your input is N and you want to compute a program complexity, given N, the result will be a function with N as an argument.
It is easy to compare between two different functions with the same input.
However, in your case, you are comparing a function that takes one argument (N) with a function that takes two arguments (N and M), while M is unknown relatively to N. Therefore, you can't really compare between them and get the answer you want.
For example, if M is defined as M=N*C (when C is a constant), you can say that O(N*M)=O(N^2)>O(N*logN). But if M is defined as M=log(log(N))*C, then O(N*M)=O(N*log(log(N)))<O(NlogN).
The key point here is that comparing time complexities requires every input parameter to be well defined relative to the others being compared.
So, the answer to O(N*M) ? O(N*logN) depends on the relation between M and N (more specifically, on whether M*C < log(N)*K for constants C and K).
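To make the two cases above concrete, here is a small numeric check of the two example definitions of M (the constants N and C are chosen arbitrarily):

```python
import math

N = 2 ** 20
C = 3  # arbitrary constant

# Case 1: M = N * C, so N*M grows like N^2, worse than N*log(N)
M = N * C
print(N * M > N * math.log(N))   # prints True

# Case 2: M = log(log(N)) * C, so N*M grows slower than N*log(N)
M = math.log(math.log(N)) * C
print(N * M < N * math.log(N))   # prints True
```

The same N appears in both expressions; only the assumed relation between M and N changes which side wins.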
You can't answer for at least three reasons:
if N and M are independent, there is no relation between log N and M;
asymptotic complexities depend on the constants in the big-O notation, and even for two O(N) functions you don't know which is "better";
big-O's are just upper bounds and if they are not tight, the true function can be quite different.
This is not a homework problem. I'm prepping for an interview, and have done quite a bit of research on links on this post. I coded up a solution based on a suggestion, but I disagree with the time complexity proposed. I'd like to know whether I'm incorrect/correct in my assertion.
Below is a function that outputs groups of anagrams. It sorts each input word and puts the sorted word in a dictionary. I wrote the code myself based on a hint from a geeksforgeeks posting that suggests:
Using sorting: We can sort the array of strings so that all anagrams come together. Then print all anagrams by linearly traversing the sorted array. The time complexity of this solution is O(mnLogn) (we would be doing O(nLogn) comparisons in sorting, and a comparison would take O(m) time), where n is the number of strings and m is the maximum length of a string.
I disagree with the time complexity mentioned. I think the time complexity of the following code is O(n * m log m).
Space complexity is O(2n) = O(n) for the results and sorted_dict variables.
n = number of words, m = number of characters in a word.
def groupAnagrams(strs):
    sorted_dict = {}
    results = []
    for each in strs:  # loop: O(n)
        # sorting the characters of one word: O(m log m)
        sorted_str = "".join(sorted(each.lower()))  # join: O(m)
        if not sorted_dict.get(sorted_str):  # O(1)
            sorted_dict[sorted_str] = []
        sorted_dict[sorted_str].append(each)  # O(1)
    for k, v in sorted_dict.items():  # loop: O(n)
        results.append(v)
    return results
Your algorithm has time complexity O(mn log m), dominated by the time it takes to sort each of the strings in the array; so your analysis is correct. However, your result differs from the one you quoted not because the quote is wrong, but because your algorithm is different to the one analysed in the quote. Note that the quote says:
We can sort array of strings so that all anagrams come together.
Your algorithm does not do this; it does not sort the array of strings at all, rather it sorts the characters in each string individually. Here's an implementation in Python of the algorithm that this quote is talking about:
from itertools import groupby

NO_OF_CHARS = 256

def char_freqs(word):
    count = [0] * NO_OF_CHARS
    for c in word:
        count[ord(c)] += 1
    return count

def print_anagrams_together(words):
    words = sorted(words, key=char_freqs)
    for _, group in groupby(words, key=char_freqs):
        print(*group, sep=', ')
The time complexity can be determined as follows:
char_freqs takes O(m) time because of iterating over a string of length m.
Sorting takes O(mn + n log n) time, because the key function takes O(m) time and is called for n strings, and then the strings are sorted in O(n log n) time. The comparisons in the sort are done on lists of length NO_OF_CHARS (a constant), so the comparisons take constant time.
Grouping words together takes O(mn) time because it's dominated by calling char_freqs again n times; this could be improved to O(n) by reusing the already-computed keys from the sort, but this part is dominated by sorting anyway.
That gives an overall time complexity of O(mn + n log n), which is not the same as quoted, but you would get O(mn log n) if the key function char_freqs were called for every comparison, instead of once per element and cached. For example, if you did the sorting in Java using something like:
// assuming that charFreqs returns something comparable
Collections.sort(words, Comparator.comparing(Solution::charFreqs));
Then the comparisons would take O(m) time instead of O(1) time, and the overall time complexity would be O(mn log n). So the quote isn't wrong, it's just talking about a different algorithm to the one you were thinking of, and it assumes a suboptimal implementation of it.
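Python's sorted already calls the key function once per element and caches the result, which is exactly what gives the O(mn + n log n) bound. The decorate-sort-undecorate sketch below makes that caching explicit (the function names are my own, not from the quoted source):

```python
from itertools import groupby

NO_OF_CHARS = 256

def char_freqs(word):
    # O(m): frequency vector used as the grouping key
    count = [0] * NO_OF_CHARS
    for c in word:
        count[ord(c)] += 1
    return tuple(count)  # tuple, so keys are comparable values

def anagram_groups(words):
    # Decorate: compute each key exactly once -- O(mn) total
    decorated = [(char_freqs(w), w) for w in words]
    # Sort: comparisons only touch precomputed constant-length keys,
    # so each comparison is O(1) and the sort is O(n log n)
    decorated.sort(key=lambda pair: pair[0])
    # Undecorate and group without recomputing char_freqs -- O(n)
    return [[w for _, w in grp]
            for _, grp in groupby(decorated, key=lambda pair: pair[0])]
```

Because the keys are computed once and reused for both sorting and grouping, no O(m) work happens inside a comparison, which is the difference from the Java Comparator version above.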
I asked my teacher this question in class and he couldn't answer it, which is why I am asking here.
I asked: if we have a loop in some code that runs from 1 to 10, would the complexity be O(1) (big O of 1)? He answered yes. So here's the question: what if I had written a loop to run from 1 to 1 million? Is it still O(1)? Or is it O(n), or something else?
pseudo code:
for i in range(1, 1000000):
    print("hey")
What is the time complexity of that loop?
Now, if you think the answer is O(n), how can you say it is O(n), given that O(n) means the complexity is linear?
And where is the dividing line between O(1) and O(n)?
If I had written a loop for 10 or 100 or 1000 or 10000 or 100000 iterations, at what point would it transform from O(1) to O(n)?
By definition, O(10000000) and O(1) are equal. Let me quickly explain what complexity means.
What we try to represent with the abstraction of time (and space) complexity isn't how fast a program will run; it is how the runtime (or space) grows as the input length grows.
For instance, given a loop with a fixed number of iterations (let's say 10), it doesn't matter whether your input has length 1 or 10000000000000: the loop will ALWAYS run the same number of iterations, so there is no growth in runtime (even if those 10 iterations take a week to run, they will always take a week).
But if your algorithm's steps depend on your input length, then the longer the input, the more steps your algorithm takes; the question is how many more.
In summary, time (and space) complexity is an abstraction. It is not there to tell us how long things will take; it tells us how the runtime grows as the input grows. O(1) == O(10000000) because it's not about how long it takes, but about how the runtime changes: an O(1) algorithm can take 10 years, but it will always take 10 years, even for a very large input.
I think you are confusing the term. Time complexity for a given algorithm is given by the relationship between change in execution time with respect to change in input size.
If you are running a fixed loop from 1 to 10, doing something in each iteration, then that counts as O(10) = O(1), meaning it takes the same time on each run.
But as soon as the number of iterations starts depending on the number of elements or tasks, the loop becomes O(n), meaning the complexity is linear: proportionally more tasks, proportionally more time.
I hope that clears some things up. :-)
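The dividing line can be shown with a minimal sketch (the function names are mine): the first function's iteration count is fixed regardless of the input, the second's grows with it.

```python
def fixed_work(items):
    # O(1): always exactly 1000 iterations, whatever the input size
    steps = 0
    for _ in range(1000):
        steps += 1
    return steps

def linear_work(items):
    # O(n): one iteration per input element
    steps = 0
    for _ in items:
        steps += 1
    return steps

# fixed_work does identical work for any input;
# linear_work scales in proportion to the input length
assert fixed_work(range(10)) == fixed_work(range(10_000))
assert linear_work(range(10_000)) == 1000 * linear_work(range(10))
```

So the switch from O(1) to O(n) happens not at any particular count, but the moment the bound of the loop is tied to the input.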
I am working with a very specific divide and conquer algorithm that always divides a problem with n elements into two subproblems with n/2 - 1 and n/2 + 1 elements.
I am pretty sure the time complexity remains O(n log n), but I wonder how could I formally prove it.
Take the "useful work done" at each recursion level to be some function f(n), so that:

T(n) = T(n/2 - 1) + T(n/2 + 1) + f(n)

Let's observe what happens when we repeatedly substitute this back into itself.

T(n) terms:

depth 1:  T(n/2 - 1) + T(n/2 + 1)
depth 2:  T(n/4 - 3/2) + T(n/4 + 1/2) + T(n/4 - 1/2) + T(n/4 + 3/2)
...

Spot the pattern?
At recursion depth m:
There are 2^m recursive calls to T.
The first term in each parameter for T is n/2^m.
The second term ranges from -(2^m - 1)/2^(m-1) to +(2^m - 1)/2^(m-1), in steps of 1/2^(m-2).
Thus the sum of all T-terms at depth m is given by:

sum over k = 0 .. 2^m - 1 of  T( n/2^m + (2k + 1 - 2^m)/2^(m-1) )

f(n) terms:

depth 1:  f(n)
depth 2:  f(n/2 - 1) + f(n/2 + 1)
...

Look familiar?
The f(n) terms are exactly one recursion level behind the T(n) terms. Therefore, adapting the previous expression, the f-terms at depth m are:

sum over k = 0 .. 2^(m-1) - 1 of  f( n/2^(m-1) + (2k + 1 - 2^(m-1))/2^(m-2) )

However, note that we only start with one f-term, so this sum has an invalid edge case. This is simple to rectify: the special-case result for m = 1 is simply f(n).
Combining the above, and summing the f-terms for each recursion level, we arrive at the (almost) final expression for T(n): the T-terms at the terminal depth, plus f(n), plus the f-term sums for every depth m >= 2.
We next need to find the depth at which the T-terms terminate. Let's assume that is when n <= c, for some constant c.
The last call to terminate intuitively has the largest argument, i.e. the call to:

T( n/2^m + (2^m - 1)/2^(m-1) ), which is approximately T( n/2^m + 2 )

Setting n/2^m + 2 = c gives a terminal depth of m = log2( n/(c - 2) ) = O(log n).
Back to the original problem: what is f(n)?
You haven't stated what it is, so I can only assume that the amount of work done per call is ϴ(n) (proportional to the array length). Then, because the offsets (2k + 1 - 2^(m-1))/2^(m-2) are symmetric about zero, they cancel in the sum, and the f-terms at each depth add up to roughly 2^(m-1) * n/2^(m-1) = n. With ϴ(n) work per level over O(log n) levels:

T(n) = ϴ(n log n)

Your hypothesis was correct.
Note that even if we had something more general, like

T(n) = T(n/2 - a) + T(n/2 + a) + f(n)

where a is some constant not equal to 1, we would still have ϴ(n log n) as the result, since the +/- a terms cancel out in exactly the same way.
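One can also check the conclusion numerically by memoizing the recurrence with f(n) = n, using floored integer halves and an arbitrary constant base case (both are my own modeling choices), and watching T(n)/(n log2 n) settle toward a constant:

```python
from functools import lru_cache
import math

@lru_cache(maxsize=None)
def T(n):
    # T(n) = T(n/2 - 1) + T(n/2 + 1) + n, with integer (floored) halves
    if n <= 2:
        return 1  # arbitrary constant base case
    return T(n // 2 - 1) + T(n // 2 + 1) + n

for k in (12, 16, 20):
    n = 2 ** k
    print(n, T(n) / (n * math.log2(n)))  # ratios approach a constant
```

Memoization keeps the number of distinct arguments small (the calls at each depth cluster around n/2^m), so even large n evaluates quickly.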
The Three For Loops:
I'm fairly new to this Big-Oh stuff and I'm having difficulty seeing the difference in complexity between these three loops.
They all seem to run less than O(n^2) but more than O(n).
Could someone explain to me how to evaluate the complexity of these loops?
Thanks!
Could someone explain to me how to evaluate the complexity of these loops?
Start by clearly defining the problem. The linked image has little to go on, so let's start making up stuff:
The parameter being varied is integer n.
C is a constant positive integer value greater than one.
the loop variables are integers
integers do not overflow
The costs of addition, comparison, assignment, multiplication and indexing are all constant.
The cost whose complexity we want is the total cost of the constant operations of the innermost loop; we ignore all the additions and whatnot in the actual computations of the loop variables.
In each case the innermost statement is the same, and is of constant cost, so let's just call that cost "one unit" of cost.
Great.
What is the cost of the first loop?
The cost of the inner statement is one unit.
The cost of the "j" loop containing it is ten units every time.
How many times does the "i" loop run? Roughly n divided by C times.
So the total cost of the "i" loop is 10 * n / C, which is O(n).
Now can you do the second and third loops? Say more clearly where you are running into trouble. Start with:
The cost of the first run of the "j" loop is 1 unit.
The cost of the second run of the "j" loop is C units
The cost of the third run of the "j" loop is C * C units
...
and go from there.
Remember that you don't need to work out the exact cost function. You just need to figure out the dominating cost. Hint: what do we know about C * C * C ... at the last run of the outer loop?
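For the second loop, the runs costing 1, C, C*C, ... form a geometric series dominated by the last term. Since the original loops were only in an image, the sketch below is my reconstruction of that loop from the cost pattern described above:

```python
def second_loop_units(n, C=2):
    # Reconstructed guess at the second loop: the outer "i" multiplies
    # by C each time, and each run of the inner "j" loop costs i units,
    # giving the run costs 1, C, C*C, ...
    units = 0
    i = 1
    while i < n:
        units += i  # cost of this run of the "j" loop
        i *= C
    return units

# The geometric series 1 + C + C^2 + ... + C^k (with C^k < n) sums to
# less than n * C / (C - 1), so the total is O(n): the last run dominates.
```

This is the hint in action: the final value of C * C * C * ... is bounded by n, and a geometric series is within a constant factor of its largest term.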
You can analyse the loops using Sigma notation. Note that for the purpose of studying the asymptotic behaviour of loop (a), the constant C just describes the linear increment of the loop; since the inner loop runs a fixed number of iterations, we can freely choose any integer value C > 0 in our analysis. Hence, for loop (a), choose C = 1. For loop (b), we'll include C and assume C > 1 (integer): if C = 1 in loop (b), it never terminates, as i is never incremented. Finally, define the innermost operations in all loops as our basic operations, with cost O(1).
The Sigma notation analysis yields:
(a) is O(n)
(b) is O(n)
(c) is O(n*log(n))