As defined (wiki) time complexity of an algorithm quantifies the amount of time taken by an algorithm to run as a function of the length of the string representing the input.
Then how is that we count the number of elementary operations and call if time complexity ?
Doing so we are not even thinking of length of string representing the input ? Isn't it.
Related
I'm learning a course about big O notation on Coursera. I watched a video about the big O of a Fibonacci algorithm (non-recursion method), which is like this:
Operation Runtime
create an array F[0..n] O(n)
F[0] <-- 0 O(1)
F[1] <-- 1 O(1)
for i from 2 to n: Loop O(n) times
F[i] <-- F[i-1] + F[i-2] O(n) => I don't understand this line, isn't it O(1)?
return F[n] O(1)
Total: O(n)+O(1)+O(1)+O(n)*O(n)+O(1) = O(n^2)
I understand every part except F[i] <-- F[i-1] + F[i-2] O(n) => I don't understand this line, isn't it O(1) since it's just a simple addition? Is it the same with F[i] <-- 1+1?
The explanation they give me is:"But the addition is a bit worse. And normally additions are constant time. But these are large numbers. Remember, the nth Fibonacci number has about n over 5 digits to it, they're very big, and they often won't fit in the machine word."
"Now if you think about what happens if you add two very big numbers together, how long does that take? Well, you sort of add the tens digit and you carry, and you add the hundreds digit and you carry, and add the thousands digit, you carry and so on and so forth. And you sort of have to do work for each digits place.
And so the amount of work that you do should be proportional to the number of digits. And in this case, the number of digits is proportional to n, so this should take O(n) time to run that line of code".
I'm still a bit confusing. Does it mean a large number affects time complexity too? For example a = n+1 is O(1) while a = n^50+n^50 isn't O(1) anymore?
Video link for anyone who needed more information (4:56 to 6:26)
Big-O is just a notation for keeping track of orders of magnitude. But when we apply that in algorithms, we have to remember "orders of magnitude of WHAT"? In this case it is "time spent".
CPUs are set up to execute basic arithmetic on basic arithmetic types in constant time. For most purposes, we can assume we are dealing with those basic types.
However if n is a very large positive integer, we can't assume that. A very large integer will need O(log(n)) bits to represent. Which, whether we store it as bits, bytes, etc, will need an array of O(log(n)) things to store. (We would need fewer bytes than bits, but that is just a constant factor.) And when we do a calculation, we have to think about what we will actually do with that array.
Now suppose that we're trying to calculate n+m. We're going to need to generate a result of size O(log(n+m)), which must take at least that time to allocate. Luckily the grade school method of long addition where you add digits and keep track of carrying, can be adapted for big integer libraries and is O(log(n+m)) to track.
So when you're looking at addition, the log of the size of the answer is what matters. Since log(50^n) = n * log(50) that means that operations with 50^n are at least O(n). (Getting 50^n might take longer...) And it means that calculating n+1 takes time O(log(n)).
Now in the case of the Fibonacci sequence, F(n) is roughly φ^n where φ = (1 + sqrt(5))/2 so log(F(n)) = O(n).
Let's say I have to iterate over every character in an array of strings, in which every string has a different length, so arr[0].length != arr[1].length and so on, as this for example:
#prints every char in all the array
for str in arr:
for c in str:
print(c)
How should the time complexity of an algorithm of this nature be represented? A summation of every length of the element in the array? or just like O(N*M), taking N as number of elements and M as max length of array, which it overbounds accordingly?
There is a precise mathematical theory called complexity theory which answers your question and many more. In complexity theory, we have what is called a Turing machine which is a type of computer. The time complexity of a Turing machine doing a computation is then defined as the function f defined on natural numbers such that f(n) is the worst case running time of the machine on inputs of length n. In your case it just needs to copy its input into somewhere else, which is clearly has O(n) time complexity (n here is the combined length of your array). Since NM is greater than n, it means that your Turing machine doing the algorithm you described will not run longer than some constant times NM but it may halt sooner due to irregularities of the lengths of elements of the array.
If you are interested in learning about complexity theory, I recommend the book Introduction to the Theory of Computation by Michael Sipser, which explains these concepts from scratch.
There are many ways you could do this. Your bound of O(NM) is a conservative upper bound. You could also define a parameter L indicating the total length of all the strings and say that the runtime is Θ(N + L), which is essentially your sum idea made a bit cleaner by assigning a name to the summation. That’s a more precise bound that more clearly indicates where the work is being done.
I just did an interview where my solution involved this algorithm, but I could not say with confidence what the time complexity was.
Psuedocode example:
arr = ["hello", "this", "is", "some", "different", "length", "strings"]
function (arr)
for string in arr
for char in string
// do stuff in constant time
I initially thought that the complexity as O(N * M) where N is the length of the array and M is the length of each string, but if the strings vary in length I cannot characterize all their lengths with a constant M.
EDIT: strings in the array don't have to be real words and can be any string of arbitrary length
In this case, where the amount of work you do doesn't depend solely on the number of elements in the input array but also on a property of those elements (namely, their length), you're correct that simply describing the algorithm as linear or O(n) would be inadequate.
If all the strings are indeed of arbitrary length, you could technically describe the time-complexity as being 'pseudo-polynomial' or 'pseudo-linear'. Although the lengths of the strings are arbitrary (as you put it, you can't give a fixed value to 'm'), you could still describe the complexity as O(nm) where m is the length of the largest string in the input (It's not important that m is unknown or arbitrary: you could say the same thing about n). It would also be correct to say the algorithm is in O(nm) or even Θ(nm) for m as the average length of the input strings. These are just specific ways of saying that it is pseudo-linear.
But if you make more qualifying assumptions or re-frame what you interpret as the 'input' of the algorithm, then you could describe it as linear. For example, if you can bound the maximum length of any string in the input by any constant (eg if you know there will be no string longer than 10,000 characters), then saying the algorithm is O(n) would be completely correct. You could also say that the algorithm is linear in the total number of characters in the input (rather than in the number of words), or linear in the average input string length.
It is known that binary search takes t units of time in the worst case for a sorted array of size n. How long will the algorithm take in the worst case if input size is n/2?
Our new input differs from the original input by a factor of 2. Because the worst-case time of binary search is known to have a logarithmic complexity, the asymptotic upper bound on the time should be the same. This should hold true for any input that's a multiple of 2 of the original.
I'v got some problem to understand the difference between Logarithmic(Lcc) and Uniform(Ucc) cost criteria and also how to use it in calculations.
Could someone please explain the difference between the two and perhaps show how to calculate the complexity for a problem like A+B*C
(Yes this is part of an assignment =) )
Thx for any help!
/Marthin
Uniform Cost Criteria assigns a constant cost to every machine operation regardless of the number of bits involved WHILE Logarithm Cost Criteria assigns a cost to every machine operation proportional to the number of bits involved
Problem size influence complexity
Since complexity depends on the size of the
problem we define complexity to be a function
of problem size
Definition: Let T(n) denote the complexity for
an algorithm that is applied to a problem of
size n.
The size (n) of a problem instance (I) is the
number of (binary) bits used to represent the
instance. So problem size is the length of the
binary description of the instance.
This is called Logarithmic cost criteria
Unit Cost Criteria
If you assume that:
- every computer instruction takes one time
unit,
- every register is one storage unit
- and that a number always fits in a register
then you can use the number of inputs as
problem size since the length of input (in bits)
will be a constant times the number of inputs.
Uniform cost criteria assume that every instruction takes a single unit of time and that every register requires a single unit of space.
Logarithmic cost criteria assume that every instruction takes a logarithmic number of time units (with respect to the length of the operands) and that every register requires a logarithmic number of units of space.
In simpler terms, what this means is that uniform cost criteria count the number of operations, and logarithmic cost criteria count the number of bit operations.
For example, suppose we have an 8-bit adder.
If we're using uniform cost criteria to analyze the run-time of the adder, we would say that addition takes a single time unit; i.e., T(N)=1.
If we're using logarithmic cost criteria to analyze the run-time of the adder, we would say that addition takes lgn time units; i.e., T(N)=lgn, where n is the worst case number you would have to add in terms of time complexity (in this example, n would be 256). Thus, T(N)=8.
More specifically, say we're adding 256 to 32. To perform the addition, we have to add the binary bits together in the 1s column, the 2s column, the 4s column, etc (columns meaning the bit locations). The number 256 requires 8 bits. This is where logarithms come into our analysis. lg256=8. So to add the two numbers, we have to perform addition on 8 columns. Logarithmic cost criteria say that each of these 8 addition calculations takes a single unit of time. Uniform cost criteria say that the entire set of 8 addition calculations takes a single unit of time.
Similar analysis can be made in terms of space as well. Registers either take up a constant amount of space (under uniform cost criteria) or a logarithmic amount of space (under uniform cost criteria).
I think you should do some research on Big O notation... http://en.wikipedia.org/wiki/Big_O_notation#Orders_of_common_functions
If there is a part of the description you find difficult edit your question.