Time complexity of constructing a BST using pre-order

I tried to write a program to construct a binary search tree from a pre-order sequence. I know there are many solutions: the min/max algorithm, the classical ("obvious") recursion, or even iteration instead of recursion.
I tried to implement the classical recursion: the first element of the pre-order traversal is the root. Then I search for all elements which are less than the root; these become the left subtree, and the remaining values become the right subtree. I repeat this until all subtrees are constructed. It's a very classical approach.
Here is my code:
public static TreeNode constructInOrderTree(int[] inorder) {
    // the array actually holds the pre-order sequence
    return constructInOrderTree(inorder, 0, inorder.length - 1);
}

private static TreeNode constructInOrderTree(int[] inorder, int start, int end) {
    if (start > end) {
        return null;
    }
    int rootValue = inorder[start];
    TreeNode root = new TreeNode(rootValue);
    // find k = last index of the left subtree: everything in (start, k]
    // is <= rootValue, everything after it belongs to the right subtree
    int k = start;
    for (int i = start + 1; i <= end; i++) {
        if (inorder[i] <= rootValue) {
            k = i;
        }
    }
    root.left = constructInOrderTree(inorder, start + 1, k);
    root.right = constructInOrderTree(inorder, k + 1, end);
    return root;
}
My question is: What is the time complexity of this algorithm? Is it O(n^2) or O(n log n)?
I searched here on Stack Overflow but found many contradictory answers: sometimes someone says it is O(n^2), sometimes O(n log n), and I got really confused.
Can we apply the master theorem here? If yes, "perhaps" we can consider that each time we divide the tree into two subtrees of equal size, so we get the recurrence (where O(n) is the cost of searching in the array):
T(n) = 2T(n/2) + O(n)
This would give a complexity of O(n log n). But I don't think it's really true: we don't divide the tree into equal parts, because we search in the array until we find the adequate elements, no?
Is it possible to apply the master theorem here?
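For reference, here is a sketch (in LaTeX) of what the master theorem would give under the idealized assumption that every root splits its range exactly in half; it only covers that balanced case:

T(n) = 2\,T(n/2) + \Theta(n), \qquad a = 2,\; b = 2,\; f(n) = \Theta(n) = \Theta\big(n^{\log_b a}\big)
\implies T(n) = \Theta(n \log n) \quad \text{(master theorem, case 2)}

Nothing guarantees an even split, though, which is exactly the doubt raised above.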

Forethoughts:
No, it is neither O(n^2) nor O(n log n) in the worst case, because of the nature of trees and the fact that you don't perform any complex action on each element. All you do is output it, in contrast to sorting it with some comparison algorithm.
The worst case is then O(n).
That is when the tree is skewed, i.e. one of the root's subtrees is empty. Then you essentially have a simple linked list, and to output it you must visit each element at least once, giving O(n).
Proof:
Let's assume the right subtree is empty and the per-call effort is constant (only the print-out). Then
T(n) = T(n-1) + T(0) + c
T(n) = T(n-2) + 2T(0) + 2c
...
T(n) = nT(0) + nc = n(T(0) + c)
Since T(0) and c are constants, you end up in O(n).
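For contrast with the construction code in the question, here is a minimal sketch of the constant-per-call situation the proof assumes - a plain pre-order print-out, assuming the question's TreeNode exposes its value as val:

static void preOrderPrint(TreeNode root) {
    if (root == null) return;       // the T(0) base case: constant cost
    System.out.println(root.val);   // O(1) per-node work (val is an assumption)
    preOrderPrint(root.left);
    preOrderPrint(root.right);
}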

Related

What is the time complexity of the function below?

I was reading a book about competitive programming and encountered a problem where we have to count all possible paths in an n*n matrix.
Now the conditions are:
1. All cells must be visited exactly once (no cell may be left unvisited or visited more than once)
2. The path should start at (1,1) and end at (n,n)
3. Possible moves are right, left, up, down from the current cell
4. You cannot go out of the grid
Now this is my code for the problem:
typedef long long ll;

// Counts paths from (r, c) to (n-1, n-1) that visit every cell exactly once.
ll path_count(ll n, vector<vector<bool>>& done, ll r, ll c) {
    ll count = 0;
    done[r][c] = true;
    if (r == (n - 1) && c == (n - 1)) {
        // Reached the target: the path only counts if every cell was visited.
        for (ll i = 0; i < n; i++) {
            for (ll j = 0; j < n; j++) {
                if (!done[i][j]) {
                    done[r][c] = false;
                    return 0;
                }
            }
        }
        count++;
    } else {
        // Try every move that stays on the grid and on an unvisited cell.
        if ((r + 1) < n  && !done[r + 1][c]) count += path_count(n, done, r + 1, c);
        if ((r - 1) >= 0 && !done[r - 1][c]) count += path_count(n, done, r - 1, c);
        if ((c + 1) < n  && !done[r][c + 1]) count += path_count(n, done, r, c + 1);
        if ((c - 1) >= 0 && !done[r][c - 1]) count += path_count(n, done, r, c - 1);
    }
    done[r][c] = false;  // backtrack
    return count;
}
Here, if we define a recurrence relation, it can be something like: T(n) = 4T(n-1) + n^2
Is this recurrence relation true? I don't think so, because if we use the master theorem it would give the result O(4^n * n^2), and I don't think it can be of this order.
The reason I say this is that when I run it for a 7*7 matrix it takes around 110.09 seconds, and I don't think an O(4^n * n^2) algorithm should take that much time for n=7.
If we calculate it for n=7, the approximate number of instructions is 4^7 * 7^2 = 802816 ≈ 10^6. For that many instructions it should not take that much time. So I conclude that my recurrence relation is false.
This code outputs 111712 for n=7, which is the same as the book's output, so the code is right.
So what is the correct time complexity?
No, the complexity is not O(4^n * n^2).
Consider the 4^n in your notation. It means going to a depth of at most n - or 7 in your case - and having 4 choices at each level. But this is not the case: at the 8th level you still have multiple choices of where to go next. In fact, you keep branching until you complete the path, which has depth n^2.
So a non-tight bound gives us O(4^(n^2) * n^2). This bound, however, is far from tight, as it assumes you have 4 valid choices from each of your recursive calls. This is not the case.
I am not sure how much tighter it can be made, but a first attempt drops it to O(3^(n^2) * n^2), since you cannot go back to the node you came from. This bound is still far from optimal.
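Spelling out the depth argument in the same notation: a path that covers all cells makes n^2 - 1 moves, and the full-grid check at the end costs O(n^2), so

T(n) \le 4^{\,n^2 - 1} \cdot O(n^2) = O\!\left(4^{n^2} n^2\right), \qquad \text{tightened to } O\!\left(3^{n^2} n^2\right) \text{ by excluding the predecessor cell}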

Is a nested for loop automatically O(n^2)?

I was recently asked an interview question about testing the validity of a Sudoku board. A basic answer involves for loops. Essentially:
for (int x = 0; x != 9; ++x)
    for (int y = 0; y != 9; ++y)
        // ...
Do these nested for loops to check the rows. Do it again to check the columns. Do one more pass for the sub-squares, but that one is funkier because we're dividing the sudoku board into sub-boards, so we end up with more than two nested loops, maybe three or four.
I was later asked the complexity of this code. Frankly, as far as I'm concerned, all the cells of the board are visited exactly three times, so O(3n). To me, the fact that we have nested loops doesn't mean this code is automatically O(n^2), or even O(n^(highest nesting level of loops)). But I have a suspicion that that's the answer the interviewer expected...
Posed another way, what is the complexity of these two pieces of code:
for (int i = 0; i != n; ++i)
    // ...
and:
for (int i = 0; i != sqrt(n); ++i)
    for (int j = 0; j != sqrt(n); ++j)
        // ...
Your general intuition is correct. Let's clarify a bit about Big-O notation:
Big-O gives you an upper bound for the worst-case (time) complexity for your algorithm, in relation to n - the size of your input. In essence, it is a measurement of how the amount of work changes in relation to the size of the input.
When you say something like
all the cells of the board are visited exactly three times so O(3n).
you are implying that n (the size of your input) is the number of cells in the board, and therefore visiting all cells three times would indeed be an O(3n) (which is O(n)) operation. If this is the case, you would be correct.
However usually when referring to Sudoku problems (or problems involving a grid in general), n is taken to be the number of cells in each row/column (an n x n board). In this case, the runtime complexity would be O(3n²) (which is indeed equal to O(n²)).
In the future, it is perfectly valid to ask your interviewer what n is.
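As a hedged illustration of that second convention, here is a sketch of just the row pass (rowsValid is a hypothetical helper, not from the question; it assumes digits 1..n with 0 marking an empty cell):

// The nested loops touch each of the n^2 cells once, so one pass is
// O(n^2); a pass each for rows, columns and boxes is O(3n^2) = O(n^2).
static boolean rowsValid(int[][] board) {
    int n = board.length;
    for (int x = 0; x < n; ++x) {
        boolean[] seen = new boolean[n + 1]; // digits 1..n
        for (int y = 0; y < n; ++y) {
            int v = board[x][y];
            if (v != 0) {                    // 0 = empty cell (assumption)
                if (seen[v]) return false;   // duplicate digit in row x
                seen[v] = true;
            }
        }
    }
    return true;
}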
As for the question in the title (Is a nested for loop automatically O(n^2)?) the short answer is no.
Consider this example:
for (int i = 0; i < n; i++) {
    for (int j = 1; j < n; j *= 2) {  // j must actually grow, else the loop never ends
        ... // some constant time operation
    }
}
The outer loop makes n iterations while the inner loop makes log2(n) iterations - therefore the time complexity will be O(n log n).
In your examples, in the first one you have a single for-loop making n iterations, therefore a complexity of (at least) O(n) (the operation is performed an order of n times).
In the second one you have two nested for loops, each making sqrt(n) iterations, therefore a total runtime complexity of (at least) O(n) as well. The second function isn't automatically O(n^2) simply because it contains a nested loop. The number of operations being made is still of the same order (n), therefore these two examples have the same complexity - since we assume n is the same for both examples.
This is the most crucial point to drive home. To compare the performance of two algorithms, you must use the same input for the comparison. In your sudoku problem you could have defined n in a few different ways, and the way you defined it would directly affect the complexity calculation of the problem - even if the amount of work stays the same.
*NOTE - this is unrelated to your question, but in the future avoid using != in loop conditions. In your second example, if sqrt(n) is not a whole number, the loop could run forever, depending on the language and how sqrt is defined. It is therefore recommended to use < instead.
It depends on how you define the so-called N.
If the size of the board is N-by-N, then yes, the complexity is O(N^2).
But if you say the total number of cells is N (i.e., the board is sqrt(N)-by-sqrt(N)), then the complexity is O(N), or 3·O(N) if you mind the constant.

BFS bad complexity

I am using adjacency lists to represent a graph in OCaml. I then wrote the following implementation of BFS, starting at the node s.
let bfs graph s =
  let size = Array.length graph in
  let seen = Array.make size false and next = [s] in
  let rec aux = function
    | [] -> ()
    | t :: q ->
        if not seen.(t) then begin
          seen.(t) <- true;
          aux (q @ graph.(t))  (* append t's neighbours to the worklist *)
        end
        else aux q
  in
  aux next
size is the number of nodes of the graph. seen is an array where seen.(t) = true if we've already seen the node t, and next is the list of nodes we still need to visit.
The thing is that normally the time complexity of BFS is linear, O(V + E), yet I feel like my implementation doesn't have this complexity. If I am not mistaken, the complexity of q @ graph.(t) is quite big, since it's O(|q|). So my complexity is quite bad, since at each step I am concatenating two lists, and this is heavy in time.
Thus I am wondering how I can adapt this code to make an efficient BFS. The problem, I think, comes from implementing the queue as a list. Does the Queue module in OCaml take O(1) to add an element? In that case, how can I use this module to make my BFS work, since I can't pattern match on a Queue as easily as on a list?
the complexity of q @ graph.(t) is quite big since it's O(|q|). So my complexity is quite bad since at each step I am concatenating two lists and this is heavy in time.
You are absolutely right – this is the bottleneck of your BFS. You should be able to use the Queue module happily, because according to https://ocaml.org/learn/tutorials/comparison_of_standard_containers.html the operations of inserting and taking elements are O(1).
One of the differences between queues and lists in OCaml is that queues are mutable structures, so you will need to use impure functions like add, take and top, which respectively insert an element in place, pop the element at the front, and return the first element.
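The shape of the fix is language-agnostic; purely as an illustration (in Java, since the point is just that O(1) enqueue/dequeue makes BFS linear), a queue-based BFS looks like this - the OCaml version would use Queue.add and Queue.take in the same two places:

import java.util.ArrayDeque;

// BFS over an adjacency-list graph: each vertex is enqueued at most once
// and each adjacency list is scanned once, so the total cost is O(V + E).
static boolean[] bfs(int[][] graph, int s) {
    boolean[] seen = new boolean[graph.length];
    ArrayDeque<Integer> queue = new ArrayDeque<>();
    seen[s] = true;
    queue.add(s);                    // O(1) enqueue at the back
    while (!queue.isEmpty()) {
        int t = queue.remove();      // O(1) dequeue from the front
        for (int u : graph[t]) {     // every edge is examined once
            if (!seen[u]) {
                seen[u] = true;
                queue.add(u);
            }
        }
    }
    return seen;
}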
If I am not mistaken the complexity of q @ graph.(t) is quite big since it's O(|q|).
That is indeed the problem. What you should be using is graph.(t) @ q. The complexity of that is O(|graph.(t)|).
You might ask: what difference does that make?
The difference is that |q| can be anything from 0 to V * E. graph.(t), on the other hand, is something you can work with: you visit every vertex in the graph at most once, so overall the complexity will be
O(Σ_{v ∈ V} |graph.(v)|)
the sum, over all vertices, of the number of edges going out of each vertex - in other words, E.
That brings you to the overall complexity of O(V + E).
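Writing out where the V term comes from (each visit does O(1) bookkeeping on top of the neighbour scan):

O\Big(\sum_{v \in V} \big(1 + |graph.(v)|\big)\Big) = O\Big(|V| + \sum_{v \in V} |graph.(v)|\Big) = O(V + E)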

Complexity of traversing 2 lists of different sizes

What would be the complexity of comparing 2 different lists?
for (Object a : listA) {
    for (Object b : listB) {
        if (a.equals(b)) {
            // do something
        }
    }
}
I know that when it is the same list it is O(n^2), but when the lists are different, what is the complexity?
Thank you
Let's start with the basics:
The complexity of traversing a single list of size N is O(N) - everybody knows that
The complexity of traversing a single list of size M is O(M) - the point I am trying to make here is that the letter inside O(...) does not matter
Now the answer becomes obvious:
The complexity of nesting O(M) operations inside an O(N) loop is O(M×N) - that's what leads to N² when M = N.
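Written as a sum:

\sum_{i=1}^{N} O(M) = O(N \cdot M), \qquad \text{which collapses to } O(N^2) \text{ when } M = N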

Determining growth function and Big O

Before anyone asks: yes, this was a previous test question that I got wrong and knew I got wrong, because I honestly just don't understand growth functions and Big O. I've read the technical definition; I know what they are, but not how to calculate them. My textbook gives examples based on real-life situations, but I still find it hard to interpret code. If someone could walk me through their thought process for determining these, that would seriously help (e.g. "this section of code tells me to multiply n by x", etc.).
public static int sort(int lowI, int highI, int nums[]) {
    int i = lowI;
    int j = highI;
    int pivot = nums[lowI + (highI - lowI) / 2];
    int counter = 0;
    while (i <= j) {
        while (nums[i] < pivot) {
            i++;
            counter++;
        }
        while (nums[j] > pivot) {
            j--;
            counter++;
        }
        counter++;
        if (i <= j) {
            NumSwap(i, j, nums); // swaps nums[i] and nums[j] via a temp
            i++;
            j--;
        }
    }
    if (lowI < j) {
        return counter + sort(lowI, j, nums);
    }
    if (i < highI) {
        return counter + sort(i, highI, nums);
    }
    return counter;
}
It might help for you to read some explanations of Big-O. I think of Big-O as the number of "basic operations" computed as the "input size" increases. For sorting algorithms, "basic operations" usually means comparisons (or counter increments, in your case), and the "input size" is the size of the list to sort.
When I analyze for runtime, I'll start by mentally dividing the code into sections. I ignore one-off lines (like int i = lowI;) because they're only run once, and Big-O doesn't care about constants (though note that in your case, int i = lowI; runs once per recursive call, so it's not run only once overall).
For example, I'd mentally divide your code into three overall parts to analyze: the main while (i <= j) loop, the two while loops inside of it, and the two recursive calls at the end. How many iterations will those loops run for, depending on the values of i and j? How many times will the function recurse, depending on the size of the list?
If I'm having trouble thinking about all these parts at once, I'll isolate them. For example, how long will one of the inner while loops run for, depending on the values of i and j? Then, how long does the outer while loop run for?
Once I've thought about the runtime of each part, I'll bring them back together. At this stage, it's important to think about the relationships between the different parts. "Nested" relationships (i.e. the nested block loops a bunch of times each time the outer thing loops once) usually mean that the run times are multiplied. For example, since the inner while loops are nested within the outer while loop, the total number of iterations is (inner run time + other inner) * outer. It also seems like the total run time would look something like ((inner + other inner) * outer) * recursions, too.
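To make that last step concrete for this code (a sketch, assuming one full pass of the outer loop, including both inner loops, costs O(length of the current range); note the early returns mean each call recurses into at most one side):

T(n) = T(n/2) + O(n) = O(n) \quad \text{(even splits)}, \qquad T(n) = T(n-1) + O(n) = O(n^2) \quad \text{(maximally lopsided splits)}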