Time complexity of traversing 2 lists of different sizes

What would be the time complexity of comparing the elements of 2 lists of different sizes?
for (Object a : listA) {
    for (Object b : listB) {
        if (a.equals(b)) {
            // do something
        }
    }
}
I know that when both lists have the same size n it is O(n^2), but what is the complexity when the sizes differ?
Thank you

Let's start with the basics:
The complexity of traversing a single list of size N is O(N) - everybody knows that
The complexity of traversing a single list of size M is O(M) - the point I am trying to make here is that the letter inside O(...) does not matter
Now the answer becomes obvious:
The complexity of nesting an O(M) operation inside an O(N) loop is O(M×N) - that's what leads to N^2 when M = N
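To make this concrete, here is a minimal, self-contained Java sketch (the list names and sizes are made up for illustration) that counts how many times the inner comparison runs:
import java.util.ArrayList;
import java.util.List;

public class NestedLoopDemo {
    public static void main(String[] args) {
        List<Integer> listA = new ArrayList<>();
        List<Integer> listB = new ArrayList<>();
        for (int i = 0; i < 1000; i++) listA.add(i);  // N = 1000
        for (int i = 0; i < 250; i++) listB.add(i);   // M = 250

        long comparisons = 0;
        for (Integer a : listA) {
            for (Integer b : listB) {
                if (a.equals(b)) { /* do something */ }
                comparisons++;
            }
        }
        System.out.println(comparisons);  // prints 250000, i.e. exactly N * M
    }
}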

Space complexity of a recursive algorithm from the CTCI book [duplicate]

I am going through the CTCI book and can't understand one of their examples. They start with:
int sum(int n) {
    if (n <= 0) {
        return 0;
    }
    return n + sum(n - 1);
}
and explain that it's O(n) time and O(n) space because each of the calls is added to the call stack and takes up actual memory.
The next example is:
int f(int n) {
    if (n <= 0) {
        return 1;
    }
    return f(n - 1) + f(n - 1);
}
and states that the time complexity is O(2^n) and the space is O(n). Although I understand why the time is O(2^n), I am not sure why the space is O(n). Their explanation is that "only O(n) nodes exist at any given time". Why don't we count the space taken by each call, as we do in the first example?
P.S. After reading similar questions, should I assume that a stack frame's space is reclaimed once we start moving back (or up) the recursion?
Unlike the time complexity, which is the total time needed to run the program, the space complexity describes the space required to execute it. So it doesn't really matter that there are 2^n nodes in the execution tree of the program. The call stack automatically unwinds and releases the additional memory used. What matters is the maximal depth of the call tree, which is O(n) for this program. It should be noted, though, that recursion is a special case that naturally releases any used memory as the stack unwinds. If memory is allocated explicitly during runtime, it must be released explicitly as well.
Regarding the first example, the call tree is simply a chain of depth n, resulting in the same O(n) space complexity.
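To see this concretely, here is a minimal Java sketch (the depth and call counters are added purely for illustration) showing that the second example makes O(2^n) calls in total, yet the stack never holds more than n + 1 frames at once:
public class StackDepthDemo {
    static int depth = 0;     // frames currently on the call stack
    static int maxDepth = 0;  // deepest the stack ever got
    static long calls = 0;    // total number of calls made

    static int f(int n) {
        depth++;
        calls++;
        maxDepth = Math.max(maxDepth, depth);
        int result = (n <= 0) ? 1 : f(n - 1) + f(n - 1);
        depth--;  // this frame's space is reclaimed as we return
        return result;
    }

    public static void main(String[] args) {
        f(20);
        // prints "2097151 calls, max depth 21": O(2^n) calls, but only O(n) space
        System.out.println(calls + " calls, max depth " + maxDepth);
    }
}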

Is a nested for loop automatically O(n^2)?

I was recently asked an interview question about testing the validity of a Sudoku board. A basic answer involves for loops. Essentially:
for (int x = 0; x != 9; ++x)
    for (int y = 0; y != 9; ++y)
        // ...
Use these nested for loops to check the rows. Do it again to check the columns. Do one more pass for the sub-squares, but that one is more funky because we're dividing the sudoku board into sub-boards, so we end up with more than two nested loops, maybe three or four (see the sketch below).
I was later asked the complexity of this code. Frankly, as far as I'm concerned, all the cells of the board are visited exactly three times, so O(3n). To me, the fact that we have nested loops doesn't mean this code is automatically O(n^2) or even O(n^highest-nesting-level-of-loops). But I have a suspicion that that's the answer the interviewer expected...
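To make the counting concrete, here is a small Java sketch of those three passes (the sub-square pass does use four nested loops), verifying that each of the 81 cells is visited exactly three times:
public class SudokuVisits {
    public static void main(String[] args) {
        int[][] visits = new int[9][9];
        for (int x = 0; x != 9; ++x)           // row pass
            for (int y = 0; y != 9; ++y)
                visits[x][y]++;
        for (int y = 0; y != 9; ++y)           // column pass
            for (int x = 0; x != 9; ++x)
                visits[x][y]++;
        for (int bx = 0; bx != 9; bx += 3)     // sub-square pass
            for (int by = 0; by != 9; by += 3)
                for (int x = bx; x != bx + 3; ++x)
                    for (int y = by; y != by + 3; ++y)
                        visits[x][y]++;
        int total = 0;
        for (int[] row : visits)
            for (int v : row)
                total += v;
        System.out.println(total);  // prints 243 = 3 * 81 cell visits
    }
}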
Posed another way, what is the complexity of these two pieces of code:
for (int i = 0; i != n; ++i)
    // ...
and:
for (int i = 0; i != sqrt(n); ++i)
    for (int j = 0; j != sqrt(n); ++j)
        // ...
Your general intuition is correct. Let's clarify a bit about Big-O notation:
Big-O gives you an upper bound for the worst-case (time) complexity for your algorithm, in relation to n - the size of your input. In essence, it is a measurement of how the amount of work changes in relation to the size of the input.
When you say something like
all the cells of the board are visited exactly three times so O(3n).
you are implying that n (the size of your input) is the number of cells in the board, and therefore visiting all cells three times would indeed be an O(3n) (which is O(n)) operation. If this is the case you would be correct.
However usually when referring to Sudoku problems (or problems involving a grid in general), n is taken to be the number of cells in each row/column (an n x n board). In this case, the runtime complexity would be O(3n²) (which is indeed equal to O(n²)).
In the future, it is perfectly valid to ask your interviewer what n is.
As for the question in the title (Is a nested for loop automatically O(n^2)?) the short answer is no.
Consider this example:
for (int i = 0; i < n; i++) {
    for (int j = 1; j < n; j *= 2) {
        ... // some constant time operation
    }
}
The outer loop makes n iterations while the inner loop makes log2(n) iterations - therefore the time complexity will be O(n*log(n)).
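As a quick sanity check, here is a small Java sketch (the value of n is arbitrary) that counts the iterations of this loop nest and compares the count against n*log2(n):
public class LoopCountDemo {
    public static void main(String[] args) {
        int n = 1 << 20;  // 2^20, so log2(n) = 20
        long count = 0;
        for (int i = 0; i < n; i++) {
            for (int j = 1; j < n; j *= 2) {
                count++;  // stands in for the constant time operation
            }
        }
        // both lines print 20971520: the nest does exactly n * log2(n) iterations
        System.out.println(count);
        System.out.println((long) n * 20);
    }
}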
In your examples: in the first one, you have a single for-loop making n iterations, therefore a complexity of (at least) O(n) (the operation is performed on the order of n times).
In the second one you have two nested for loops, each making sqrt(n) iterations, therefore a total runtime complexity of (at least) O(n) as well. The second function isn't automatically O(n^2) simply because it contains a nested loop. The number of operations being made is still of the same order (n), therefore these two examples have the same complexity - since we assume n is the same for both examples.
This is the most crucial point to drive home. To compare the performance of two algorithms, you must measure them against the same definition of the input size. In your sudoku problem you could have defined n in a few different ways, and the definition you chose directly affects the complexity calculation of the problem - even though the amount of work is the same.
*NOTE - this is unrelated to your question, but in the future avoid using != in loop conditions. In your second example, if sqrt(n) is not a whole number, the loop could run forever, depending on the language and how it is defined. It is therefore recommended to use < instead.
It depends on how you define the so-called N.
If the size of the board is N-by-N, then yes, the complexity is O(N^2).
But if you say the total number of cells is N (i.e., the board is sqrt(N)-by-sqrt(N)), then the complexity is O(N), or O(3N) if you mind the constant.

BFS bad complexity

I am using adjacency lists to represent a graph in OCaml. I then made the following implementation of BFS, starting at node s.
let bfs graph s =
  let size = Array.length graph in
  let seen = Array.make size false and next = [s] in
  let rec aux = function
    | [] -> ()
    | t :: q ->
        if not seen.(t) then begin
          seen.(t) <- true;
          aux (q @ graph.(t))
        end else aux q
  in
  aux next
size represents the number of nodes of the graph, seen is an array where seen.(t) = true if we've already seen node t, and next is a list of the nodes we still need to visit.
The thing is that the time complexity of BFS is normally linear, O(|V| + |E|), yet I feel like my implementation doesn't have this complexity. If I am not mistaken, the complexity of q @ graph.(t) is quite big, since it's O(|q|). So my complexity is quite bad: at each step I am concatenating two lists, and this is heavy in time.
Thus I am wondering how I can adapt this code to make an efficient BFS. The problem (I think) comes from implementing the queue as a list. Does the Queue module in OCaml take O(1) to add an element? In that case, how can I use this module to make my BFS work, since I can't do pattern matching with a Queue as easily as with a list?
the complexity of q @ graph.(t) is quite big since it's O(|q|). So my complexity is quite bad since at each step I am concatenating two lists and this is heavy in time.
You are absolutely right - this is the bottleneck of your BFS. You should happily be able to use the Queue module, because according to https://ocaml.org/learn/tutorials/comparison_of_standard_containers.html inserting and taking elements are both O(1).
One of the differences between queues and lists in OCaml is that queues are mutable structures, so you will need to use impure functions like add, take and top, which respectively insert an element in place, pop the element at the front, and return the first element.
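The queue discipline itself is language-independent; for illustration, here is a minimal sketch of the same BFS in Java using an ArrayDeque (the toy adjacency lists in main are made up), where every enqueue and dequeue is O(1):
import java.util.ArrayDeque;
import java.util.Deque;

public class BfsDemo {
    // graph[v] holds the neighbours of vertex v (adjacency lists)
    static void bfs(int[][] graph, int s) {
        boolean[] seen = new boolean[graph.length];
        Deque<Integer> next = new ArrayDeque<>();
        seen[s] = true;
        next.add(s);                    // O(1) enqueue
        while (!next.isEmpty()) {
            int t = next.remove();      // O(1) dequeue from the front
            for (int u : graph[t]) {    // each edge is inspected once
                if (!seen[u]) {
                    seen[u] = true;     // mark on enqueue, so no duplicates enter the queue
                    next.add(u);
                }
            }
        }
    }

    public static void main(String[] args) {
        int[][] graph = { {1, 2}, {0, 3}, {0}, {1} };
        bfs(graph, 0);  // visits every vertex and edge once: O(V + E)
    }
}
Marking vertices as seen when they are enqueued, rather than when they are dequeued, also keeps duplicates out of the queue.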
If I am not mistaken, the complexity of q @ graph.(t) is quite big, since it's O(|q|).
That is indeed the problem. What you should be using is graph.(t) @ q. The complexity of that is O(|graph.(t)|).
You might ask: What difference does that make?
The difference is that |q| can grow as large as the total number of edges, whereas graph.(t) is something you can bound. You visit every vertex in the graph at most once, so overall the complexity will be
O(Σ_{v ∈ V} |graph.(v)|)
the sum of the edge counts of all vertices in the graph - in other words, E.
That brings you to the overall complexity of O(V + E).

How to effectively get the N lowest values from the collection (Top N) in Kotlin?

Is there any other way besides collectionOrSequence.sortedBy { it.value }.take(n)?
Assume I have a collection with 100500+ elements and I need to find the 10 lowest. I'm afraid that sortedBy will create a new temporary collection, from which take will then keep only 10 items.
You could keep a list of the n smallest elements and just update it on demand, e.g.
fun <T : Comparable<T>> top(n: Int, collection: Iterable<T>): List<T> {
    return collection.fold(ArrayList<T>()) { topList, candidate ->
        if (topList.size < n || candidate < topList.last()) {
            // ideally insert at the right place
            topList.add(candidate)
            topList.sort()
            // trim to size
            if (topList.size > n)
                topList.removeAt(n)
        }
        topList
    }
}
That way you only compare the current element of your list once against the largest of the top n elements, which will usually be faster than sorting the entire list: https://pl.kotl.in/SyQPtDTcQ
If you're running on the JVM, you could use Guava's Comparators.least(int, Comparator), which uses a more efficient algorithm than any of these suggestions, taking O(n + k log k) time and O(k) memory to find the lowest k elements in a collection of size n, as opposed to zapl's algorithm (O(nk log k)) or Lior's (O(nk)).
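For reference, a minimal Java sketch of that approach (assuming Guava is on the classpath; Comparators.least returns a Collector):
import com.google.common.collect.Comparators;
import java.util.Comparator;
import java.util.List;

public class LeastDemo {
    public static void main(String[] args) {
        List<Integer> numbers = List.of(42, 7, 19, 3, 88, 5, 66, 1, 23, 14, 9, 2);
        // collects the 10 lowest elements in O(n + k log k) time and O(k) memory
        List<Integer> lowest = numbers.stream()
                .collect(Comparators.least(10, Comparator.<Integer>naturalOrder()));
        System.out.println(lowest);  // [1, 2, 3, 5, 7, 9, 14, 19, 23, 42]
    }
}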
You have more to worry about.
collectionOrSequence.sortedBy { it.value } runs java.util.Arrays.sort, which uses TimSort (or mergesort if requested).
TimSort is great, but it usually takes on the order of n*log(n) operations, which is much more than the O(n) cost of copying the array.
Each of the O(n*log(n)) operations invokes a function (the lambda you provided, { it.value }) - an additional meaningful overhead.
Lastly, java.util.Arrays.sort will convert the collection to an array and back to a list - 2 additional conversions (which you wanted to avoid, but this is secondary).
The efficient way to do it is probably (sketched in code below):
1. Map the values used for comparison into a list: O(n) conversions (once per element) rather than O(n*log(n)) or more.
2. Iterate over the list (or array) created, collecting the N smallest elements in one pass.
3. Keep a list of the N smallest elements found so far and their indexes in the original list. If it is small (e.g. 10 items), a MutableList is a good fit.
4. Keep a variable holding the max value of that small list.
5. When iterating over the original collection, compare the current element against the max value of the small list. If it is smaller, replace the max element in the "small list" and find the updated max value in it.
6. Use the indexes from the "small list" to extract the N smallest elements of the original list.
That would allow you to go from O(n*log(n)) to O(n) (for a small, fixed N).
Of course, if time is critical - it is always best to benchmark the specific case.
If you managed, on the first step, to extract primitives for the basis of comparison (e.g. int or long) - that would be even more efficient.
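Here is one possible Java rendering of that single-pass idea (a sketch only: names are made up, and it keeps the values themselves rather than their indexes, for brevity):
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

public class SmallestN {
    // returns the n smallest values of input in one pass: O(input.size() * n) worst case
    static List<Integer> smallestN(List<Integer> input, int n) {
        List<Integer> small = new ArrayList<>();  // the n smallest seen so far
        int currentMax = Integer.MIN_VALUE;       // max value of the small list
        for (int value : input) {
            if (small.size() < n) {
                small.add(value);
                currentMax = Math.max(currentMax, value);
            } else if (value < currentMax) {
                small.remove(Integer.valueOf(currentMax));  // drop the current max
                small.add(value);
                currentMax = Collections.max(small);        // find the updated max
            }
        }
        return small;
    }

    public static void main(String[] args) {
        System.out.println(smallestN(List.of(9, 4, 7, 1, 8, 3, 6, 2, 5, 0), 3));  // [1, 2, 0]
    }
}
With n fixed at, say, 10, the occasional Collections.max call is a small constant, so the whole pass stays effectively linear.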
I suggest implementing your own sort method based on a typical quicksort algorithm (in ascending order, then take the first N elements), if the collection has 1k+ values spread randomly.

Time complexity of constructing a BST from a pre-order sequence

I tried to write a program that constructs a binary search tree from a pre-order sequence. I know there are many solutions: the min/max algorithm, the classical (or "obvious") recursion, or even iteration instead of recursion.
I tried to implement the classical recursion: the first element of the pre-order traversal is the root. Then I search for all elements that are less than the root. All these elements will be part of the left subtree, and the other values will be part of the right subtree. I repeat that until I have constructed all subtrees. It's a very classical approach.
Here is my code:
public static TreeNode constructFromPreorder(int[] preorder) {
    return constructFromPreorder(preorder, 0, preorder.length - 1);
}

private static TreeNode constructFromPreorder(int[] preorder, int start, int end) {
    if (start > end) {
        return null;
    }
    int rootValue = preorder[start];
    TreeNode root = new TreeNode(rootValue);
    // find the last index in this range whose value belongs to the left subtree
    int k = start;
    for (int i = start + 1; i <= end; i++) {
        if (preorder[i] <= rootValue) {
            k = i;
        }
    }
    root.left = constructFromPreorder(preorder, start + 1, k);
    root.right = constructFromPreorder(preorder, k + 1, end);
    return root;
}
My question is: what is the time complexity of this algorithm? Is it O(n^2) or O(n*log(n))?
I searched here on Stack Overflow, but I found many contradictory answers. Some said it is O(n^2), some O(n*log(n)), and I got really confused.
Can we apply the master theorem here? If yes, "perhaps" we can consider that each time we divide the tree into two subtrees of equal size, so we would have the relation (O(n) being the complexity of searching in the array):
T(n) = 2*T(n/2) + O(n)
which would give us a complexity of O(n*log(n)). But that's not really true, I think: we don't divide the tree into equal parts, because we search the array until we find the adequate elements, no?
Is it possible to apply the master theorem here?
Forethoughts:
No, it is neither O(n^2) nor O(n*log(n)) in the worst case. That is because of the nature of trees and the fact that you don't perform any complex action on each element - all you do is output it, in contrast to sorting it with some comparison algorithm.
The worst case would then be O(n).
That happens when the tree is skewed, i.e. one of the root's subtrees is empty. Then you essentially have a simple linked list, and to output it you must visit each element at least once, giving O(n).
Proof:
Let's assume the right subtree is empty and the per-call effort is constant (only the output). Then:
T(n) = T(n-1) + T(0) + c
T(n) = T(n-2) + 2T(0) + 2c
...
T(n) = nT(0) + nc = n(T(0) + c)
Since T(0) and c are constants, you end up with O(n).