Time complexity and integer inputs

I came across a question asking to describe the computational complexity in Big O of the following code:
i = 1;
while (i < N) {
    i = i * 2;
}
I found this Stack Overflow question asking for the answer, with the most voted answer saying it is Log2(N).
On first thought that answer looks correct; however, I remember learning about pseudo-polynomial runtimes, and how computational complexity measures difficulty with respect to the length of the input rather than its value.
So for integer inputs, the complexity should be expressed in terms of the number of bits in the input.
Therefore, shouldn't this function be O(N)? Every iteration of the loop increases the number of bits in i by 1, until i has roughly as many bits as N.

This code might be found in a function like the one below:
function FindNextPowerOfTwo(N) {
    i = 1;
    while (i < N) {
        i = i * 2;
    }
    return i;
}
Here, the input can be thought of as a k-bit unsigned integer, which we might as well imagine as a string of k bits. The input size is therefore k = floor(log2(N)) + 1 bits.
The assignment i = 1 should be interpreted as creating a new bit string and assigning it the length-one bit string 1. This is a constant-time operation.
The loop condition i < N compares the two bit strings to see which represents the larger number. If implemented intelligently, this will take time proportional to the length of the shorter of the two bit strings, which will always be i's. As we will see, the length of i's bit string begins at 1 and increases by 1 until it is greater than or equal to the length of N's bit string, k. When N is not a power of two, the length of i's bit string will reach k + 1. Thus, the time taken by evaluating the condition is proportional to 1 + 2 + ... + (k + 1) = (k + 1)(k + 2)/2 = O(k^2) in the worst case.
Inside the loop, we multiply i by two over and over. The complexity of this operation depends on how multiplication is to be interpreted. Certainly, it is possible to represent our bit strings in such a way that we could intelligently multiply by two by performing a bit shift and inserting a zero on the end. This could be made a constant-time operation. If we are oblivious to this optimization and perform standard long multiplication, we scan i's bit string once to write out a row of 0s and again to write out i with an extra 0, and then we perform regular addition with carry by scanning both of these strings. The time taken by each step here is proportional to the length of i's bit string (more precisely, that plus one), so the whole thing is proportional to i's bit-string length. Since the bit-string length of i assumes the values 1, 2, ..., (k + 1), the total time is 2 + 3 + ... + (k + 2) = (k + 2)(k + 3)/2 - 1 = O(k^2).
Returning i is a constant time operation.
Taking everything together, the runtime is bounded from above and from below by functions of the form c * k^2, and so the worst-case complexity is Theta(k^2) = Theta(log(N)^2).
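The bit-string accounting above can be sanity-checked with a rough cost model (a sketch; the cost function and test values are mine, not from the answer): charge each comparison and each doubling a number of bit operations equal to i's current bit length.

```python
def counted_run(N):
    """Run the loop from the question, charging ~bit_length(i) bit
    operations for the comparison i < N and for the doubling of i."""
    bit_ops = 0
    i = 1
    while i < N:
        bit_ops += i.bit_length()   # cost model for evaluating i < N
        i *= 2
        bit_ops += i.bit_length()   # cost model for the multiplication by 2
    return bit_ops

# For a k-bit N that is not a power of two, this model totals exactly
# k^2 + 2k bit operations, i.e. Theta(k^2) as derived above.
for k in (8, 16, 32):
    N = (1 << k) - 1
    print(k, counted_run(N), k * k + 2 * k)
```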

In the given example, you are not increasing the value of i by 1 but doubling it each time, so it moves toward N much faster. By multiplying by two you are cutting the search space (between i and N) in half; i.e., reducing the remaining space by a factor of 2. Thus the complexity of your program is log_2(N).
If instead you did
    i = i * 3;
the complexity of your program would be log_3(N).
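A quick empirical check of this claim (my sketch; the helper and the test values are illustrative): counting iterations for a general multiplier m shows the loop runs ceil(log_m(N)) times.

```python
def iterations(N, m):
    """Count loop iterations of: i = 1; while i < N: i *= m."""
    count, i = 0, 1
    while i < N:
        i *= m
        count += 1
    return count

print(iterations(1024, 2))  # log_2(1024) = 10 iterations
print(iterations(729, 3))   # log_3(729)  = 6 iterations
```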

It depends on an important question: is multiplication a constant-time operation?
In the real world it is usually considered constant, because you have fixed 32- or 64-bit numbers and multiplying them always takes the same (= constant) time.
On the other hand, you then have the limitation that N must fit in 32/64 bits (or whatever width you use).
In theory, where you do not consider multiplication a constant operation, or for special algorithms where N can grow too large to ignore the cost of multiplication, you are right: you have to start thinking about the complexity of multiplying.
Multiplying by a constant (in this case 2) requires touching each bit, and the number has log_2(N) bits.
And you have to do this log_2(N) times before i reaches N,
which gives a complexity of log_2(N) * log_2(N) = O(log_2^2(N)).
PS: Akash has a good point that multiplying by 2 can be written as a constant operation, because the only thing you need in binary is to "add a zero" (similar to multiplying by 10 in "human-readable" format, where you just append a zero: 4333 * 10 = 43330).
However, if multiplying is not that simple (you have to go through all the bits), the previous analysis is correct.
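The "add a zero" observation is easy to see directly in Python (a trivial illustration; the values are mine):

```python
# Appending a zero digit in decimal multiplies by 10; appending a zero
# bit (a left shift) multiplies by 2 in binary.
print(4333 * 10)               # 43330
print(bin(21), bin(21 << 1))   # 0b10101 0b101010
```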

Related

Time Complexity of Algorithms With Addition [duplicate]

I'm learning a course about big O notation on Coursera. I watched a video about the big O of a Fibonacci algorithm (non-recursion method), which is like this:
Operation                        Runtime
create an array F[0..n]          O(n)
F[0] <-- 0                       O(1)
F[1] <-- 1                       O(1)
for i from 2 to n:               Loop O(n) times
    F[i] <-- F[i-1] + F[i-2]     O(n)  => I don't understand this line, isn't it O(1)?
return F[n]                      O(1)
Total: O(n) + O(1) + O(1) + O(n)*O(n) + O(1) = O(n^2)
I understand every part except the line F[i] <-- F[i-1] + F[i-2]. Isn't it O(1), since it's just a simple addition? Is it the same as F[i] <-- 1 + 1?
The explanation they give me is:"But the addition is a bit worse. And normally additions are constant time. But these are large numbers. Remember, the nth Fibonacci number has about n over 5 digits to it, they're very big, and they often won't fit in the machine word."
"Now if you think about what happens if you add two very big numbers together, how long does that take? Well, you sort of add the tens digit and you carry, and you add the hundreds digit and you carry, and add the thousands digit, you carry and so on and so forth. And you sort of have to do work for each digits place.
And so the amount of work that you do should be proportional to the number of digits. And in this case, the number of digits is proportional to n, so this should take O(n) time to run that line of code".
I'm still a bit confused. Does a large number affect time complexity too? For example, a = n + 1 is O(1), while a = n^50 + n^50 isn't O(1) anymore?
Video link for anyone who needs more information (4:56 to 6:26)
Big-O is just a notation for keeping track of orders of magnitude. But when we apply that in algorithms, we have to remember "orders of magnitude of WHAT"? In this case it is "time spent".
CPUs are set up to execute basic arithmetic on basic arithmetic types in constant time. For most purposes, we can assume we are dealing with those basic types.
However, if n is a very large positive integer, we can't assume that. A very large integer needs O(log(n)) bits to represent, which, whether we store it as bits, bytes, etc., means an array of O(log(n)) items. (We would need fewer bytes than bits, but that is just a constant factor.) And when we do a calculation, we have to think about what we will actually do with that array.
Now suppose that we're trying to calculate n+m. We're going to need to generate a result of size O(log(n+m)), which must take at least that much time to allocate. Luckily the grade-school method of long addition, where you add digits and keep track of carries, can be adapted for big-integer libraries and runs in O(log(n+m)) time.
So when you're looking at addition, the log of the size of the answer is what matters. Since log(50^n) = n * log(50) that means that operations with 50^n are at least O(n). (Getting 50^n might take longer...) And it means that calculating n+1 takes time O(log(n)).
Now in the case of the Fibonacci sequence, F(n) is roughly φ^n where φ = (1 + sqrt(5))/2 so log(F(n)) = O(n).
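Since Python integers are arbitrary precision, the digit growth the answer describes is easy to observe (a sketch; the sample indices are my choice): the bit length of F(n) grows linearly in n, so each addition in the loop costs O(n) bit operations.

```python
def fib(n):
    """Iterative Fibonacci, F(0) = 0, F(1) = 1."""
    a, b = 0, 1
    for _ in range(n):
        a, b = b, a + b
    return a

# The bit length grows linearly: roughly n * log2(phi) ~ 0.694 * n bits,
# so doubling n roughly doubles the cost of each addition.
for n in (100, 200, 400):
    print(n, fib(n).bit_length())
```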

How can I compare the time-complexity O(n^2) with O(N+log(M))?

My Lua function:
for y = userPosY + radius, userPosY - radius, -1 do
    for x = userPosX - radius, userPosX + radius, 1 do
        local oneNeighborFound = redis.call('lrange', userPosZone .. x .. y, '0', '0')
        if next(oneNeighborFound) ~= nil then
            table.insert(neighborsFoundInPosition, userPosZone .. x .. y)
            neighborsFoundInPositionCount = neighborsFoundInPositionCount + 1
        end
    end
end
Which leads to this formula: (2n+1)^2
If I understand it correctly, that would be a time complexity of O(n^2).
How can I compare this to the time complexity of the GEORADIUS (Redis) with O(N+log(M))? https://redis.io/commands/GEORADIUS
Time complexity: O(N+log(M)) where N is the number of elements inside the bounding box of the circular area delimited by center and radius and M is the number of items inside the index.
My time complexity does not have a M. I do not know how many items are in the index (M) because I do not need to know that. My index changes often, almost with every request and can be large.
Which time complexity is better, and under what circumstances?
Assuming N and M were independent variables, I would treat O(N + log M) the same way you treat O(N^3 - 7N^2 - 12N + 42): the latter becomes O(N^3) simply because that's the term that has the most effect on the outcome.
This is especially true as time-complexity analysis is not really about runtime. Runtime has to take the lesser terms into account for specific ranges of N. For example, if your algorithm's runtime can be expressed as runtime = N^2 + 9999999999N, and N is always in the range [1, 4], it's the second term that's more important, not the first.
It's better to think of complexity analysis as what happens as N approaches infinity. With the O(N + log M) one, think about what happens when you:
double N?
double M?
The first has a much greater impact so I would simply convert the complexity to O(N).
However, you'll hopefully have noticed the use of the word "independent" in my first paragraph. The only sticking point to my suggestion would be if M was actually some function of N, in which case it may become the more important term.
Any function that reversed the impact of the log M would do this, such as the equality M = 10^(10^(10N)).
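The doubling thought experiment above is easy to run numerically (my illustration; the cost function and values are assumptions, not from the answer):

```python
import math

def cost(N, M):
    """The dominant-term model of O(N + log M)."""
    return N + math.log2(M)

# Doubling N roughly doubles the total cost...
print(cost(1000, 10**6), cost(2000, 10**6))
# ...while even squaring M only nudges the tiny log M term.
print(cost(1000, 10**6), cost(1000, 10**12))
```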

What is the complexity of this program. Is it O(n)?

This is a simple program I want to know the complexity of this program. I assume this is O(n) as it has only a single operation in one for loop.
a = int(input("Enter a:"))
b = int(input("Enter b:"))
sol = a
for i in range(a, b):
    sol = sol & (i + 1)
print("\nSol", sol)
Yes, it is O(n), sort of. You have to remember that O(n) means the number of operations grows with the size of the input. Perhaps you're worried about the & and (i+1) operations in the for loop. What you need to keep in mind is that these operations are constant time, since they're performed on 32-bit integers. Therefore, the only parameter changing how long the program runs is the actual number of iterations of the for loop.
If you're assuming n = b - a, then this program is O(n). In fact, if you break down the actual runtime:
per loop: 1 AND operation, 1 addition operation
now do (b-a) iterations, so 2 operations per loop, (b-a) times = 2*(b-a)
If we assume n = b-a, then this runtime becomes 2*n, which is O(n).
I assume you define n := b - a. The complexity is actually n log(n). There is only one operation in the loop, so the complexity is n * Time(operation in loop), but since i consists of log(n) bits, the complexity is O(n log(n)).
EDIT:
I now regard n := b. It does not affect my original answer, and it makes more sense as it's the size of the input. (It doesn't make sense to say that n=1 for some big a,a+1)
To make it more efficient, notice that you are calculating a & (a+1) & (a+2) & ... & b.
So we just need to set 0s instead of 1s in the binary representation of b, in every position where some a <= k < b has a 0 in that position. How can we know whether to set a digit to 0 or not, then? I'll leave that to you :)
It is possible to do this in log(n) time, the size of the binary representation of b.
So in this case we get that the time is O(log(n)^2) = o(n).
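One concrete way to finish the exercise the answer leaves open (this is my sketch, not necessarily the answer's intended method): the AND of a, a+1, ..., b keeps exactly the common binary prefix of a and b, because every lower bit takes the value 0 somewhere in the range.

```python
from functools import reduce

def range_and(a, b):
    """Compute a & (a+1) & ... & b via the common prefix of a and b."""
    shift = 0
    while a != b:      # strip low bits until only the shared prefix is left
        a >>= 1
        b >>= 1
        shift += 1
    return a << shift

# Matches the question's loop (sol = a, then AND in a+1 .. b):
print(range_and(12, 15), reduce(lambda s, i: s & i, range(13, 16), 12))
```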

Is the approach I have used to find the time complexity correct?

For the following problem I came up with the following algorithm. I just wondering whether I have calculated the complexity of the algorithm correctly or not.
Problem:
Given a list of integers as input, determine whether or not two integers (not necessarily distinct) in the list have a product k. For example, for k = 12 and list [2,10,5,3,7,4,8], there is a pair, 3 and 4, such that 3×4 = 12.
My solution:
// Imagine A is the list containing integer numbers
for(int i=0; i<A.size(); i++)                O(n)
{
    for(int j=i+1; j<A.size()-1; j++)        O(n-1)*O(n-(i+1))
    {
        if(A.get(i) * A.get(j) == K)         O(n-2)*O(n-(i+1))
            return "Success";                O(1)
    }
}
return "FAILURE";                            O(1)
O(n) + O(n-1)*O(n-i-1) + O(n-2)*O(n-i-1)) + 2*O(1) =
O(n) + O(n^2-ni-n) + O(-n+i+1) + O(n^2-ni-n) + O(-2n+2i+2) + 2O(1) =
O(n) + O(n^2) + O(n) + O(n^2) + O(2n) + 2O(1) =
O(n^2)
Apart from my semi-algorithm, is there any more efficient algorithm?
Let's break down what your proposed algorithm is essentially doing.
For every index i (s.t. 0 ≤ i < n) you compare the element at i against the elements at all other indices j (i ≠ j) to determine whether A.get(i) * A.get(j) == K.
An invariant for this algorithm would be that at every iteration, the pair {i,j} being compared hasn't been compared before.
This implementation (assuming it compiles and runs without the runtime exceptions mentioned in the comments) makes a total of nC2 comparisons (where nC2 is the binomial coefficient of n and 2, for choosing all possible unique pairs) and each such comparison would compute at a constant time (O(1)). Note it can be proven that nCk is not greater than n^k.
So O(nC2) makes for a more accurate upper bound for this algorithm - though by common big O notation this would still be O(n^2) since nC2 = n*(n-1)/2 = (n^2-n)/2 which is still order of n^2.
Per your question from the comments:
Is it correct to use "i" in the complexity, as I have used O(n-(i+1))?
i is a running index, whereas the complexity of your algorithm is only affected by the size of your sample, n.
IOW, the total complexity is calculated for all iterations in the algorithm, while i refers to a specific iteration. Therefore it is incorrect to use 'i' in your complexity calculations.
Apart from my semi-algorithm, is there any more efficient algorithm?
Your "semi-algorithm" seems to me the most efficient way to go about this. Any comparison-based algorithm would require querying all pairs in the array, which translates to the runtime complexity detailed above.
Though I have not calculated a lower bound and would be curious to hear if someone knows of a more efficient implementation.
edit: The other answer here shows a good solution to this problem which is (generally speaking) more efficient than this one.
Your algorithm looks like O(n^2) worst case and O(n*log(n)) average case, because the longer the list is, the more likely the loops will exit before evaluating all n^2 pairs.
An algorithm with O(n) worst case and O(log(n)) average case is possible. In real life it would be less efficient than your algorithm for lists where the factors of K are right at the start or the list is short, and more efficient otherwise. (pseudocode not written in any particular language)
var h = new HashSet();
for(int i=0; i<A.size(); i++)
{
    var x = A.get(i);
    if(K % x == 0) // If x is a factor of K
    {
        h.add(x); // Store x in h
        if(h.contains(K / x))
        {
            return "Success";
        }
    }
}
return "FAILURE";
HashSet.add and HashSet.contains are O(1) on average (but slower than List.get even though it is also O(1)). For the purpose of this exercise I am assuming they always run in O(1) (which is not strictly true but close enough for government work). I have not accounted for edge cases, such as the list containing a 0.
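A runnable Python version of the hash-set idea above (the function name is mine; like the pseudocode, it sidesteps the zero edge case the answer mentions by skipping zeros):

```python
def has_pair_with_product(nums, k):
    """Return True if some pair (not necessarily distinct elements) in
    nums multiplies to k. Zeros are skipped, as in the answer above."""
    seen = set()
    for x in nums:
        if x != 0 and k % x == 0:  # x divides k
            seen.add(x)            # add first, matching the answer's order
            if k // x in seen:
                return True
    return False

print(has_pair_with_product([2, 10, 5, 3, 7, 4, 8], 12))  # True (3 * 4)
print(has_pair_with_product([2, 5, 7], 12))               # False
```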

Do numbers in an array contain sides of a valid triangle

Check if an array of n integers contains 3 numbers which can form a triangle (i.e. the sum of any of the two numbers is bigger than the third).
Apparently, this can be done in O(n) time.
(the obvious O(n log n) solution is to sort the array so please don't)
It's difficult to imagine N numbers (where N is moderately large) such that there is no triangle triplet. But we'll try:
Consider a growing sequence where each next value is at the limit N[i] = N[i-1] + N[i-2]. This is nothing other than the Fibonacci sequence. Approximately, it can be seen as a geometric progression with a factor of the golden ratio (GRf ≈ 1.618).
It can be seen that if N_largest < N_smallest * (GRf**(N-1)), then there is sure to be a triangle triplet. This criterion is somewhat fuzzy because of floating point versus integers, and because GRf is a limit rather than an actual geometric factor. Anyway, carefully implemented, it gives an O(n) test that can check whether a triplet is guaranteed to exist. If not, then we have to perform some other tests (still thinking).
EDIT: A direct conclusion from the Fibonacci idea is that for integer input (as specified in the question) a solution is guaranteed for any possible input if the size of the array is larger than log_GRf(MAX_INT), which is 47 for 32 bits or 93 for 64 bits. Actually, we can use the largest value in the input array to tighten this bound.
This gives us the following algorithm:
Step 1) Find MAX_VAL in the input data: O(n)
Step 2) Compute the minimum array size that would guarantee the existence of a solution:
    N_LIMIT = log_base_GRf(MAX_VAL) : O(1)
Step 3.1) If N > N_LIMIT: return true : O(1)
Step 3.2) Else sort and use the direct method: O(n*log(n))
Because for large values of N (and that's the only case when the complexity matters) it is O(n) (or even O(1) in cases when N > log_base_GRf(MAX_INT)), we can say it's O(n).
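The steps above can be sketched as follows (my implementation of the answer's outline; it assumes positive integer sides, and the +2 slack term is my conservative cover for the N_smallest and golden-ratio fuzziness the answer mentions):

```python
import math

PHI = (1 + math.sqrt(5)) / 2  # GRf, the golden ratio

def has_triangle_triplet(a):
    """Return True if three values in a can form a triangle.
    Assumes positive integers, per the question."""
    if len(a) < 3:
        return False
    max_val = max(a)                        # Step 1: O(n)
    n_limit = math.log(max_val, PHI) + 2    # Step 2: O(1), with slack
    if len(a) > n_limit:                    # Step 3.1: Fibonacci pigeonhole
        return True
    a = sorted(a)                           # Step 3.2: O(n log n) fallback
    return any(a[i] + a[i + 1] > a[i + 2] for i in range(len(a) - 2))

print(has_triangle_triplet([2, 10, 5, 3, 7, 4, 8]))  # True
print(has_triangle_triplet([1, 1, 2, 3, 5, 8]))      # False: Fibonacci-like
```

The fallback check over consecutive sorted triples is sound because in a sorted array, if any triple forms a triangle, some consecutive triple does.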