This earlier question addresses some of the factors that might cause an algorithm to have O(log n) complexity.
What would cause an algorithm to have time complexity O(log log n)?
O(log log n) terms can show up in a variety of different places, but there are typically two main routes that will arrive at this runtime.
Shrinking by a Square Root
As mentioned in the answer to the linked question, a common way for an algorithm to have time complexity O(log n) is for that algorithm to work by repeatedly cutting the size of the input down by some constant factor on each iteration. If this is the case, the algorithm must terminate after O(log n) iterations, because after doing O(log n) divisions by a constant, the algorithm has shrunk the problem size down to 0 or 1. This is why, for example, binary search has complexity O(log n).
Interestingly, there is a similar way of shrinking down the size of a problem that yields runtimes of the form O(log log n). Instead of dividing the input in half at each layer, what happens if we take the square root of the size at each layer?
For example, let's take the number 65,536. How many times do we have to divide this by 2 until we get down to 1? If we do this, we get
65,536 / 2 = 32,768
32,768 / 2 = 16,384
16,384 / 2 = 8,192
8,192 / 2 = 4,096
4,096 / 2 = 2,048
2,048 / 2 = 1,024
1,024 / 2 = 512
512 / 2 = 256
256 / 2 = 128
128 / 2 = 64
64 / 2 = 32
32 / 2 = 16
16 / 2 = 8
8 / 2 = 4
4 / 2 = 2
2 / 2 = 1
This process takes 16 steps, and it's also the case that 65,536 = 2^16.
But, if we take the square root at each level, we get
√65,536 = 256
√256 = 16
√16 = 4
√4 = 2
Notice that it only takes four steps to get all the way down to 2. Why is this?
First, an intuitive explanation. How many digits are there in the numbers n and √n? There are approximately log n digits in the number n, and approximately log(√n) = log(n^(1/2)) = (1/2) log n digits in √n. This means that each time you take a square root, you're roughly halving the number of digits in the number. Because you can only halve a quantity k O(log k) times before it drops down to a constant (say, 2), this means you can only take square roots O(log log n) times before you've reduced the number down to some constant (say, 2).
Now, let's do some math to make this rigorous. Let's rewrite the above sequence in terms of powers of two:
√65,536 = √(2^16) = (2^16)^(1/2) = 2^8 = 256
√256 = √(2^8) = (2^8)^(1/2) = 2^4 = 16
√16 = √(2^4) = (2^4)^(1/2) = 2^2 = 4
√4 = √(2^2) = (2^2)^(1/2) = 2^1 = 2
Notice that we followed the sequence 2^16 → 2^8 → 2^4 → 2^2 → 2^1. On each iteration, we cut the exponent of the power of two in half. That's interesting, because this connects back to what we already know: you can only divide the number k in half O(log k) times before it drops down to 1.
So take any number n and write it as n = 2^k. Each time you take the square root of n, you halve the exponent in this equation. Therefore, there can only be O(log k) square roots applied before k drops to 1 or lower (at which point n drops to 2 or lower). Since n = 2^k, we have k = log2 n, and therefore the number of square roots taken is O(log k) = O(log log n). Therefore, if an algorithm works by repeatedly reducing the problem to a subproblem whose size is the square root of the original problem size, that algorithm will terminate after O(log log n) steps.
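To see this numerically, here is a small Python check (my own sketch; the function name sqrt_steps is made up) that counts how many integer square roots it takes to shrink n = 2^k down to 2 and compares that count against log2 log2 n = log2 k:

import math

def sqrt_steps(n):
    # count how many integer square roots it takes to bring n down to 2 or below
    steps = 0
    while n > 2:
        n = math.isqrt(n)
        steps += 1
    return steps

for k in (16, 64, 256, 1024):
    print(k, sqrt_steps(2 ** k), math.log2(k))   # the step count matches log2(k) = log2(log2(n))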
One real-world example of this is the van Emde Boas tree (vEB-tree) data structure. A vEB-tree is a specialized data structure for storing integers in the range 0 ... N - 1. It works as follows: the root node of the tree has √N pointers in it, splitting the range 0 ... N - 1 into √N buckets each holding a range of roughly √N integers. These buckets are then each internally subdivided into √(√ N) buckets, each of which holds roughly √(√ N) elements. To traverse the tree, you start at the root, determine which bucket you belong to, then recursively continue in the appropriate subtree. Due to the way the vEB-tree is structured, you can determine in O(1) time which subtree to descend into, and so after O(log log N) steps you will reach the bottom of the tree. Accordingly, lookups in a vEB-tree take time only O(log log N).
Another example is the Hopcroft-Fortune closest pair of points algorithm. This algorithm attempts to find the two closest points in a collection of 2D points. It works by creating a grid of buckets and distributing the points into those buckets. If at any point in the algorithm a bucket is found that has more than √N points in it, the algorithm recursively processes that bucket. The maximum depth of the recursion is therefore O(log log n), and using an analysis of the recursion tree it can be shown that each layer in the tree does O(n) work. Therefore, the total runtime of the algorithm is O(n log log n).
O(log n) Algorithms on Small Inputs
There are some other algorithms that achieve O(log log n) runtimes by using algorithms like binary search on objects of size O(log n). For example, the x-fast trie data structure performs a binary search over the layers of a tree of height O(log U), so the runtime of some of its operations is O(log log U). The related y-fast trie gets some of its O(log log U) runtimes by maintaining balanced BSTs of O(log U) nodes each, allowing searches in those trees to run in time O(log log U). The tango tree and related multisplay tree data structures end up with an O(log log n) term in their analyses because they maintain trees that contain O(log n) items each.
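As a rough illustration of the "binary search over O(log U) layers" idea, here is a sketch in Python (my own simplification, not the real x-fast trie, which also needs per-level hash tables, threaded leaves, and more). It stores the set of key prefixes at every bit depth and then binary searches over the depths to find the longest stored prefix of a query, which costs O(log log U) set lookups:

def build_levels(keys, bits):
    # levels[d] holds every length-d bit prefix of the stored keys
    levels = [set() for _ in range(bits + 1)]
    for key in keys:
        for d in range(bits + 1):
            levels[d].add(key >> (bits - d))
    return levels

def longest_prefix_depth(levels, query, bits):
    # binary search over depths 0..bits for the deepest level whose prefix set
    # contains the matching prefix of the query (depth 0 always matches)
    lo, hi = 0, bits
    while lo < hi:
        mid = (lo + hi + 1) // 2
        if (query >> (bits - mid)) in levels[mid]:
            lo = mid
        else:
            hi = mid - 1
    return lo

levels = build_levels([3, 9, 12], bits=4)
print(longest_prefix_depth(levels, 11, bits=4))   # prints 2: 11 = 1011 shares the prefix 10 with 9 = 1001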
Other Examples
Other algorithms achieve runtime O(log log n) in other ways. Interpolation search has expected runtime O(log log n) to find a number in a sorted array, but the analysis is fairly involved. Ultimately, the analysis works by showing that the number of iterations is equal to the number k such that n^(2^(-k)) ≤ 2, for which log log n is the correct solution. Some algorithms, like the Cheriton-Tarjan MST algorithm, arrive at a runtime involving O(log log n) by solving a complex constrained optimization problem.
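For reference, here is a short sketch of interpolation search (my own simplified version, assuming a sorted list of distinct integers); the O(log log n) expected bound only holds when the values are spread roughly uniformly:

def interpolation_search(a, target):
    # probe where the target "should" sit if the values were uniformly spread,
    # rather than always probing the middle as binary search does
    lo, hi = 0, len(a) - 1
    while lo <= hi and a[lo] <= target <= a[hi]:
        if a[lo] == a[hi]:                 # avoid dividing by zero
            break
        pos = lo + (target - a[lo]) * (hi - lo) // (a[hi] - a[lo])
        if a[pos] == target:
            return pos
        if a[pos] < target:
            lo = pos + 1
        else:
            hi = pos - 1
    return lo if lo <= hi and a[lo] == target else -1

print(interpolation_search(list(range(0, 1000, 7)), 91))   # prints 13, since 91 = 7 * 13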
One way a factor of O(log log n) shows up in a time complexity is through repeated division, as explained in the other answer, but there is another way to see this factor: when we trade time against space, approximation quality, hardness, and so on, and the algorithm contains some artificially chosen iteration count.
For example, SSSP (single-source shortest path) has an O(n) algorithm on planar graphs, but before that complicated algorithm there was a much easier (though still rather hard) algorithm with running time O(n log log n). The basic idea of the algorithm is as follows (just a very rough description; feel free to skip this part and read the rest of the answer):
1. Divide the graph into parts of size O(log n / log log n), subject to some restrictions.
2. Treat each part as a node of a new graph G', and compute SSSP for G' in time O(|G'| * log |G'|); because |G'| = O(|G| * log log n / log n), this is where the log log n factor appears.
3. Compute SSSP within each part: again, because there are O(|G'|) parts, SSSP for all of them can be computed in time O((n / log n) * (log n / log log n) * log(log n / log log n)).
4. Update the weights; this part can be done in O(n).
For more details, these lecture notes are good.
But my point is this: here we chose the parts to have size O(log n / log log n). We could choose other divisions, such as O(log n / (log log n)^2), which might run faster and give a different bound. In many cases (approximation algorithms, randomized algorithms, or algorithms like the SSSP one above), when we iterate over something (subproblems, candidate solutions, ...), we choose the number of iterations according to the trade-off we are making (time vs. space, complexity of the algorithm, constant factors, ...). So we may well see terms more complicated than "log log n" in real, working algorithms.
Related
I am trying to answer the following question:
Given n = 2k, find the complexity of
func(n):
    if n == 2: return 1
    else: return 1 + func(sqrt(n))
I think because there is an if-else statement, it will loop n times for sure, but I'm confused by the recursive call func(sqrt(n)). Since the argument is square-rooted, I think the time complexity would be
O(sqrt(n) * n) = O(n^(1/2) * n) = O(n^(3/2)) = O((2k)^(3/2)). However, the possible answer choices are only
O(k)
O(2^n)
O(n*k)
Can O((2k)^(3/2)) be considered as O(k)? I'm confused because, although time complexities are often simplified, O(n) and O(n^2) are different, so I thought O((2k)^(3/2)) could only be simplified to O(k^(3/2)).
I’d argue that none of the answers here are the best answer to this question.
If you have a recursive function that does O(1) work per iteration and then reduces the size of its argument from n to √n, as is the case here, the runtime works out to O(log log n). The intuition behind this is that taking the square root of a number throws away half the digits of the number, roughly, so the runtime will be O(log d), where d is the number of digits of the input number. The number of digits of the number n is O(log n), hence the overall runtime of O(log log n).
In your case, you have n = 2k, so O(log log n) = O(log log k). (Unless you meant n = 2^k, in which case O(log log n) = O(log log 2^k) = O(log k).)
Notice that we don’t get O(n × √n) as our result. That would be what happens if we do O(√n) work O(n) times. But here, that’s not what we’re doing. The size of the input shrinks by a square root on each iteration, but that’s not the same thing as saying that we’re doing a square root amount of work. And the number of times that this happens isn’t O(n), since the value of n shrinks too rapidly for that.
Reasoning by analogy, the runtime of this code would be O(log n), not O(n × n / 2):
func(n):
    if n <= 2: return 1
    return func(n / 2)
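If you want to convince yourself numerically, here is a quick instrumented version (a hypothetical Python translation of both snippets; the names calls_sqrt and calls_half are mine) that counts how many calls each recursion makes:

import math

def calls_sqrt(n, count=1):
    # your function: O(1) work per call, then recurse on the square root
    if n <= 2:
        return count
    return calls_sqrt(math.isqrt(n), count + 1)

def calls_half(n, count=1):
    # the analogous function above: O(1) work per call, then recurse on n / 2
    if n <= 2:
        return count
    return calls_half(n // 2, count + 1)

n = 2 ** 64
print(calls_sqrt(n))   # prints 7, on the order of log2(log2(n)) = 6
print(calls_half(n))   # prints 64, on the order of log2(n) = 64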
In big O notation of time complexity in algorithmic analysis, is O(n + k log n) the same as O(n log n) if k is larger than n? I am not entirely sure about this.
I am not 100% sure what you mean by N + K log N. I'm used to seeing K as the size of a subset of N, for example "the top K items out of N"; for large N it is common to simply return the top K items, because then the big-O time is N log K, which is much faster than N log N (because K is a smaller number).
If you literally mean N + K log N, then that would be more complex than plain N log N, since the K term adds to it. For example, as K goes to zero you simply end up with N; otherwise, with K larger than N, you get something greater than N log N, which I hope is obviously more complex.
I hope that does something to answer the question. I confess I feel like I might be missing the point here and if so I apologize.
No, in the specific case you’re mentioning these are not the same. For example, consider this algorithm: given an array of length N and a number K ≥ N, do a linear scan over the array, then do K binary searches on the array. How much work is done here? Well, the linear search takes time O(N), and the K binary searches collectively take time O(K log N), so the total work done is O(N + K log N).
However, the work here is not O(N log N). Since K can be arbitrarily large, the value of K log N can exceed the value of N log N by an arbitrary amount. A different way of seeing this: a bound of O(N log N) means that the runtime depends purely on N and not on K. But that can’t be the case here, since cranking K way, way up definitely increases the runtime, independently of what N is.
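As a concrete (made-up) example of an algorithm with that shape, here is a Python sketch that does one linear pass over a sorted array and then K binary searches using the standard bisect module, for O(N + K log N) total work; note that K is independent of N and can be much larger than it:

import bisect

def scan_then_search(arr, queries):
    # arr must be sorted; len(queries) plays the role of K
    total = sum(arr)                          # the O(N) linear pass
    hits = 0
    for q in queries:                         # K binary searches: O(K log N)
        i = bisect.bisect_left(arr, q)
        if i < len(arr) and arr[i] == q:
            hits += 1
    return total, hits

arr = list(range(0, 100, 3))                  # 0, 3, ..., 99, so N = 34
print(scan_then_search(arr, [9, 10, 99, 0]))  # (1683, 3): 9, 99 and 0 are present, 10 is not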
Hope this helps!
I read it as N + (K log N), where N is the total count and K is the subset count. Now, assuming K is very small compared to N (possibly a constant, e.g. to get the top K numbers from a varying N), it reduces to linear time.
For example, to get the top 100 items from an array of 10,000 elements:
10,000 + (100 * log2(10,000)) ≈ 10,000 + 1,300
Now when N is 20,000, the k log n term only grows to about 1,400.
So as N increases linearly, k log n increases only logarithmically, keeping the overall complexity linear:
O(n + (k log n)) is approximately O(n)
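A quick, purely illustrative calculation of that in Python (with k fixed at 100):

import math

k = 100                                      # fixed subset size
for n in (10_000, 20_000, 40_000, 80_000):
    print(n, round(n + k * math.log2(n)))    # the k * log2(n) term barely grows as n doubles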
for (i = 0; i < n; i++)
{
    enumerate all subsets of size i (2^n subsets in total over all i)
    each subset of size i takes O(n log n) to check whether it is a solution
    from all these solutions I want to find the minimum subset of size S
}
I want to know the complexity of this algorithm. Is it 2^n * O(n log n * n) = o(2^n * n^2)?
If I understand you right:
You iterate over all subsets of a sorted set of n numbers.
For each subset you test in O(n log n) whether it is a solution (however you do this).
After you have all these solutions, you look for the one with exactly S elements and the smallest sum.
The way you write it, the complexity would be O(2^n * n log n) * O(log(2^n)) = O(2^n * n^2 log n). The O(log(2^n)) = O(n) factor is for searching for the minimum solution, and you do this in every round of the for loop; the worst case is i = n/2 with every subset being a solution.
Now I'm not sure whether you're mixing up O() and o().
2^n * O(n log n * n) = o(2^n * n^2) is only right if you mean 2^n * O(n log(n * n)).
f = O(g) means the complexity of f is not bigger than the complexity of g.
f = o(g) means the complexity of f is strictly smaller than the complexity of g.
So, under that reading, 2^n * O(n log(n * n)) = O(2^n * n * log(n^2)) = O(2^n * n * 2 log n) = O(2^n * n log n) < O(2^n * n^2).
Note: O(g) = o(h) is never good notation. You will (most likely, every time) be able to find a function f with f = o(h) but f != O(g), even when g = o(h).
Improvements:
If I understand your algorithm right, you can speed it up a little. You know the size S of the subset you are looking for, so only look at the subsets that have size S. The worst case is S = n/2, and C(n, n/2) ~ 2^(n-1), so this will not reduce the complexity, but it saves you a factor of 2.
You can also just keep the best solution found so far and check whether each new solution is smaller; this way you get the smallest solution without searching for it again. The complexity is then O(2^n * n log n).
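If it helps, here is a rough Python sketch of the improved version: it enumerates only the size-S subsets and keeps the best (smallest-sum) one that passes the test. The function is_solution is just a hypothetical placeholder for whatever O(n log n) check the question actually has in mind:

from itertools import combinations

def is_solution(subset):
    # hypothetical stand-in for the real O(n log n) test
    return sum(subset) % 2 == 0

def best_subset(values, S):
    best = None
    for subset in combinations(values, S):    # only the C(n, S) subsets of size S
        if is_solution(subset) and (best is None or sum(subset) < sum(best)):
            best = subset                     # remember the smallest-sum solution so far
    return best

print(best_subset([5, 1, 7, 2, 9], 2))        # (5, 1): the even-sum pair with the smallest sum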
I'm sitting here with this assignment in a course on algorithms with massive data sets and the use of Little-Oh notation has got me confused, although I'm perfectly confident with Big-Oh.
I do not want a solution to the assignment, and as such I will not present it. Instead, my question is: how should I interpret the time complexity o(log n)?
I know from the definition that an algorithm A must grow asymptotically slower than log n, but I'm uncertain whether this means the algorithm must run in constant time, or whether it is still allowed to be log n under certain conditions (say with constant c = 1), or whether it really has to be something like log(n - 1).
Say an algorithm has a running time of O(log n) but in fact does two passes, so c = 2. Now 2 * log n is still O(log n); am I right in saying that this does not hold for little-oh?
Any help is greatly appreciated and if strictly needed for clarification, I will provide the assignment
To say that f is "little-oh of g", written f = o(g), means that the quotient
|f(x)/g(x)|
approaches 0 as x approaches infinity. Referring to your example of o(log n), that class contains functions like log x / log (log x), sqrt(log x) and many more, so o(log x) definitely doesn't imply O(1). On the other hand, log (x/2) = log x - log 2, so
log (x/2) / log x = 1 - log 2 / log x -> 1
and log (x/2) is not in the class o(log x).
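A quick numerical check of those claims (my own illustration): the first ratio should head toward 0 and the second toward 1.

import math

for x in (10**3, 10**6, 10**9, 10**12):
    in_class = math.sqrt(math.log(x)) / math.log(x)    # sqrt(log x) is in o(log x): ratio -> 0
    not_in_class = math.log(x / 2) / math.log(x)       # log(x/2) is not: ratio -> 1
    print(round(in_class, 3), round(not_in_class, 3))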
For Little-Oh, f(x) does not have to be smaller than g(x) for all x. It has to be smaller only after a certain value of x. (For your question, it is still allowed to be log n under certain conditions.)
For example:
Let f(x) = 5000x and g(x) = x^2.
f(x) / g(x) approaches 0 as x approaches infinity, so f(x) is little-o of g(x). However, at x = 1, f(x) is greater than g(x). Only after x becomes greater than 5000 will g(x) be bigger than f(x).
What little-o really tells us is that g(x) always grows at a faster rate than f(x). For example, look how much f(x) grew between x = 1 and x = 2:
f(1) = 5000
f(2) = 10000 - f(x) became twice as big
Now look at g(x) on the same interval:
g(1) = 1
g(2) = 4 - g(x) became four times bigger
This rate will increase even more at bigger values of x. Now, since g(x) increases at a faster rate and we take x to infinity, at some point it will become larger than f(x). But this is not what little-o is concerned with; it's all about the rate of change.
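A tiny check of that example: the ratio f(x)/g(x) heads to 0 even though f stays ahead for every x below 5000.

def f(x):
    return 5000 * x

def g(x):
    return x ** 2

for x in (1, 100, 5000, 10**6):
    print(x, f(x) > g(x), f(x) / g(x))   # f is larger only while x < 5000; the ratio tends to 0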
Check if an array of n integers contains 3 numbers which can form a triangle (i.e. the sum of any two of them is bigger than the third).
Apparently, this can be done in O(n) time.
(the obvious O(n log n) solution is to sort the array so please don't)
It's difficult to imagine N numbers (where N is moderately large) such that there is no triangle triplet. But we'll try:
Consider a growing sequence where each next value is at the limit N[i] = N[i-1] + N[i-2]. This is nothing other than the Fibonacci sequence. Approximately, it can be seen as a geometric progression with a ratio equal to the golden ratio (GRf ≈ 1.618).
It can be seen that if N_largest < N_smallest * (GRf**(N-1)), then there is sure to be a triangle triplet. This criterion is quite fuzzy because of floating point versus integers, and because GRf is a limit rather than an actual geometric factor. Still, carefully implemented, it gives an O(n) test that can confirm that a triplet surely exists. If it cannot, then we have to perform some other test (still thinking).
EDIT: A direct conclusion from the Fibonacci idea is that for integer input (as specified in the question) a solution is guaranteed to exist for any possible input whenever the size of the array is larger than log_GRf(MAX_INT), which is 47 for 32 bits or 93 for 64 bits. Actually, we can use the largest value in the input array to sharpen this bound.
This gives us the following algorithm:
Step 1) Find MAX_VAL in the input data: O(n)
Step 2) Compute the minimum array size that would guarantee the existence of a solution:
N_LIMIT = log_base_GRf(MAX_VAL): O(1)
Step 3.1) If N > N_LIMIT: return true: O(1)
Step 3.2) Otherwise, sort and use the direct method: O(n log n)
Because for large values of N (and that is the only case when the complexity matters) it is O(n) (or even O(1) when N > log_base_GRf(MAX_INT)), we can say the whole thing is O(n).
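Here is a rough Python sketch of that plan (my own interpretation, assuming the inputs are positive integers; the name has_triangle_triplet is mine, the size limit is deliberately conservative, and the fallback is the usual sort-and-check of consecutive triples):

import math

PHI = (1 + math.sqrt(5)) / 2                  # the golden ratio, about 1.618

def has_triangle_triplet(a):
    n = len(a)
    if n < 3:
        return False
    # Steps 1, 2, 3.1: if the array is long enough relative to its largest value,
    # the Fibonacci argument guarantees that a triangle triplet exists.
    max_val = max(a)
    if n > math.log(max_val, PHI) + 3:        # the +3 keeps the bound safely conservative
        return True
    # Step 3.2: otherwise sort; in a sorted array it suffices to check consecutive triples.
    a = sorted(a)
    return any(a[i] + a[i + 1] > a[i + 2] for i in range(n - 2))

print(has_triangle_triplet([1, 2, 3, 5, 8, 13]))   # False: Fibonacci numbers never form a triangle
print(has_triangle_triplet([4, 6, 9]))             # True: 4 + 6 > 9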