Why is the time complexity of Heap Sort, O(nlogn)? - time-complexity

I'm trying to understand time complexity of different data structures and began with heap sort. From what I've read, I think collectively people agree heap sort has a time complexity of O(nlogn); however, I have difficulty understanding how that came to be.
Most people seem to agree that the heapify method takes O(logn) and buildmaxheap method takes O(n) thus O(nlogn) but why does heapify take O(logn)?
From my perspective, it seems heapify is just a method that compares a node's left and right node and properly swaps them depending if it is min or max heap. Why does that take O(logn)?
I think I'm missing something here and would really appreciate if someone could explain this better to me.
Thank you.

It seems that you are confusing about the time complexity about heap sort. It is true that build a maxheap from an unsorted array takes your O(n) time and O(1) for pop one element out. However, after you pop out the top element from the heap, you need to move the last element(A) in your heap to the top and heapy for maintaining heap property. For element A, it at most pop down log(n) times which is the height of your heap. So, you need at most log(n) time to get the next maximum value after you pop maximum value out. Here is an example of the process of heapsort.
18
15 8
7 11 1 2
3 6 4 9
After you pop out the number of 18, you need to put the number 9 on the top and heapify 9.
9
15 8
7 11 1 2
3 6 4
we need to pop 9 down because of 9 < 15
15
9 8
7 11 1 2
3 6 4
we need to pop 9 down because of 9 < 11
15
11 8
7 9 1 2
3 6 4
9 > 4 which means heapify process is done. And now you get the next maximum value 15 safely without broking heap property.
You have n number for every number you need to do the heapify process. So the time complexity of heapsort is O(nlogn)

You are missing the recursive call at the end of the heapify method.
Heapify(A, i) {
le <- left(i)
ri <- right(i)
if (le<=heapsize) and (A[le]>A[i])
largest <- le
else
largest <- i
if (ri<=heapsize) and (A[ri]>A[largest])
largest <- ri
if (largest != i) {
exchange A[i] <-> A[largest]
Heapify(A, largest)
}
}
In the worst case, after each step, largest will be about two times i. For i to reach the end of the heap, it would take O(logn) steps.

Related

No of Passes in a Bubble Sort

For the list of items in an array i.e. {23 , 12, 8, 15, 21}; the number of passes I could see is only 3 contrary to (n-1) passes for n elements. I also see that the (n-1) passes for n elements is also the worst case where all the elements are in descending order. So I have got 2 conclusions and I request you to let me know if my understanding is right wrt the conclusions.
Conclusion 1
(n-1) passes for n elements can occur in the following scenarios:
all elements in the array are in descending order which is the worst case
when it is not the best case(no of passes is 1 for a sorted array)
Conclusion 2
(n-1) passes for n elements is more like a theoretical concept and may not hold true in all cases as in this example {23 , 12, 8, 15, 21}. Here the number of passes are (n-2).
In a classic, one-directional bubble sort, no element is moved more than one position to the “left” (to a smaller index) in a pass. Therefore you must make n-1 passes when (and only when) the smallest element is in the largest position. Example:
12 15 21 23 8 // initial array
12 15 21 8 23 // after pass 1
12 15 8 21 23 // after pass 2
12 8 15 21 23 // after pass 3
8 12 15 21 23 // after pass 4
Let's define L(i) as the number of elements to the left of element i that are larger than element i.
The array is sorted when L(i) = 0 for all i.
A bubble sort pass decreases every non-zero L(i) by one.
Therefore the number of passes required is max(L(0), L(1), ..., L(n-1)). In your example:
23 12 8 15 21 // elements
0 1 2 1 1 // L(i) for each element
The max L(i) is L(2): the element at index 2 is 8 and there are two elements left of 8 that are larger than 8.
The bubble sort process for your example is
23 12 8 15 21 // initial array
12 8 15 21 23 // after pass 1
8 12 15 21 23 // after pass 2
which takes max(L(i)) = L(2) = 2 passes.
For a detailed analysis of bubble sort, see The Art of Computer Programming Volume 3: Sorting and Searching section 5.2.2. In the book, what I called the L(i) function is called the “inversion table” of the array.
re: Conclusion 1
all elements in the array are in descending order which is the worst case
Yep, this is when you'll have to do all (n-1) "passes".
when it is not the best case (no of passes is 1 for a sorted array)
No. When you don't have the best-case, you'll have more than 1 passes. So long as it's not fully sorted, you'll need less than (n-1) passes. So it's somewhere in between
re: Conclusion 2
There's nothing theoretical about it at all. You provide an example of a middle-ground case (not fully reversed, but not fully sorted either), and you end up needing a middle-ground number of passes through it. What's theoretical about it?

Discrete Binary Search Main Theory

I have read this: https://www.topcoder.com/community/competitive-programming/tutorials/binary-search.
I can't understand some parts==>
What we can call the main theorem states that binary search can be
used if and only if for all x in S, p(x) implies p(y) for all y > x.
This property is what we use when we discard the second half of the
search space. It is equivalent to saying that ¬p(x) implies ¬p(y) for
all y < x (the symbol ¬ denotes the logical not operator), which is
what we use when we discard the first half of the search space.
But I think this condition does not hold when we want to find an element(checking for equality only) in an array and this condition only holds when we're trying to find Inequality for example when we're searching for an element greater or equal to our target value.
Example: We are finding 5 in this array.
indexes=0 1 2 3 4 5 6 7 8
1 3 4 4 5 6 7 8 9
we define p(x)=>
if(a[x]==5) return true else return false
step one=>middle index = 8+1/2 = 9/2 = 4 ==> a[4]=5
and p(x) is correct for this and from the main theory, the result is that
p(x+1) ........ p(n) is true but its not.
So what is the problem?
We CAN use that theorem when looking for an exact value, because we
only use it when discarding one half. If we are looking for say 5,
and we find say 6 in the middle, the we can discard the upper half,
because we now know (due to the theorem) that all items in there are > 5
Also notice, that if we have a sorted sequence, and want to find any element
that satisfies an inequality, looking at the end elements is enough.

predefined preemption points among processes

I have forked many child processes and assigned priority and core to each of them. Porcess A executes at period of 3 sec and process B at a period of 6 sec. I want them to execute in such a way that the higher priority processes should preempt lower priority ones only at predefined points and tried to acheive it with semaphores. I have used this same code snippets within the 2 processes with different array values in both.
'bubblesort_desc()' sorts the array in descending order and prints it. 'bubblesort_asc()' sorts in ascending order and prints.
while(a<3)
{
printf("time in sort1.c: %d %ld\n", (int)request.tv_sec, (long int)request.tv_nsec);
int array[SIZE] = {5, 1, 6 ,7 ,9};
semaphore_wait(global_sem);
bubblesort_desc(array, SIZE);
semaphore_post(global_sem);
semaphore_wait(global_sem);
bubblesort_asc(array, SIZE);
semaphore_post(global_sem);
semaphore_wait(global_sem);
a++;
request.tv_sec = request.tv_sec + 6;
request.tv_nsec = request.tv_nsec; //if i add 1ms here like an offset to the lower priority one, it works.
semaphore_post(global_sem);
semaphore_close(global_sem); //close the semaphore file
//sleep for the rest of the time after the process finishes execution until the period of 6
clk = clock_nanosleep(CLOCK_MONOTONIC, TIMER_ABSTIME, &request, NULL);
if (clk != 0 && clk != EINTR)
printf("ERROR: clock_nanosleep\n");
}
I get the output like this whenever two processes get activated at the same time. For example at time units of 6, 12,..
time in sort1.c: 10207 316296689
time now in sort.c: 10207 316296689
9
99
7
100
131
200
256
6
256
200
5
131
100
99
1
1
5
6
7
9
One process is not supposed to preempt the other while one set of sorted list is printing. But it's working as if there are no semaphores. I defined semaphores as per this link: http://linux.die.net/man/3/pthread_mutexattr_init
Can anyone tell me what can be the reason for that? Is there a better alternative than semaphores?
Its printf that's causing the ambiguous output. If the results are printed without '\n' then we get a more accurate result. But its always better to avoid printf statements for real time applications. I used trace-cmd and kernelshark to visualize the behaviour of the processes.

Understanding The Modulus Operator %

I understand the Modulus operator in terms of the following expression:
7 % 5
This would return 2 due to the fact that 5 goes into 7 once and then gives the 2 that is left over, however my confusion comes when you reverse this statement to read:
5 % 7
This gives me the value of 5 which confuses me slightly. Although the whole of 7 doesn't go into 5, part of it does so why is there either no remainder or a remainder of positive or negative 2?
If it is calculating the value of 5 based on the fact that 7 doesn't go into 5 at all why is the remainder then not 7 instead of 5?
I feel like there is something I'm missing here in my understanding of the modulus operator.
(This explanation is only for positive numbers since it depends on the language otherwise)
Definition
The Modulus is the remainder of the euclidean division of one number by another. % is called the modulo operation.
For instance, 9 divided by 4 equals 2 but it remains 1. Here, 9 / 4 = 2 and 9 % 4 = 1.
In your example: 5 divided by 7 gives 0 but it remains 5 (5 % 7 == 5).
Calculation
The modulo operation can be calculated using this equation:
a % b = a - floor(a / b) * b
floor(a / b) represents the number of times you can divide a by b
floor(a / b) * b is the amount that was successfully shared entirely
The total (a) minus what was shared equals the remainder of the division
Applied to the last example, this gives:
5 % 7 = 5 - floor(5 / 7) * 7 = 5
Modular Arithmetic
That said, your intuition was that it could be -2 and not 5. Actually, in modular arithmetic, -2 = 5 (mod 7) because it exists k in Z such that 7k - 2 = 5.
You may not have learned modular arithmetic, but you have probably used angles and know that -90° is the same as 270° because it is modulo 360. It's similar, it wraps! So take a circle, and say that its perimeter is 7. Then you read where is 5. And if you try with 10, it should be at 3 because 10 % 7 is 3.
Two Steps Solution.
Some of the answers here are complicated for me to understand. I will try to add one more answer in an attempt to simplify the way how to look at this.
Short Answer:
Example 1:
7 % 5 = 2
Each person should get one pizza slice.
Divide 7 slices on 5 people and every one of the 5 people will get one pizza slice and we will end up with 2 slices (remaining). 7 % 5 equals 2 is because 7 is larger than 5.
Example 2:
5 % 7 = 5
Each person should get one pizza slice
It gives 5 because 5 is less than 7. So by definition, you cannot divide whole 5items on 7 people. So the division doesn't take place at all and you end up with the same amount you started with which is 5.
Programmatic Answer:
The process is basically to ask two questions:
Example A: (7 % 5)
(Q.1) What number to multiply 5 in order to get 7?
Two Conditions: Multiplier starts from `0`. Output result should not exceed `7`.
Let's try:
Multiplier is zero 0 so, 0 x 5 = 0
Still, we are short so we add one (+1) to multiplier.
1 so, 1 x 5 = 5
We did not get 7 yet, so we add one (+1).
2 so, 2 x 5 = 10
Now we exceeded 7. So 2 is not the correct multiplier.
Let's go back one step (where we used 1) and hold in mind the result which is5. Number 5 is the key here.
(Q.2) How much do we need to add to the 5 (the number we just got from step 1) to get 7?
We deduct the two numbers: 7-5 = 2.
So the answer for: 7 % 5 is 2;
Example B: (5 % 7)
1- What number we use to multiply 7 in order to get 5?
Two Conditions: Multiplier starts from `0`. Output result and should not exceed `5`.
Let's try:
0 so, 0 x 7 = 0
We did not get 5 yet, let's try a higher number.
1 so, 1 x 7 = 7
Oh no, we exceeded 5, let's get back to the previous step where we used 0 and got the result 0.
2- How much we need to add to 0 (the number we just got from step 1) in order to reach the value of the number on the left 5?
It's clear that the number is 5. 5-0 = 5
5 % 7 = 5
Hope that helps.
As others have pointed out modulus is based on remainder system.
I think an easier way to think about modulus is what remains after a dividend (number to be divided) has been fully divided by a divisor. So if we think about 5%7, when you divide 5 by 7, 7 can go into 5 only 0 times and when you subtract 0 (7*0) from 5 (just like we learnt back in elementary school), then the remainder would be 5 ( the mod). See the illustration below.
0
______
7) 5
__-0____
5
With the same logic, -5 mod 7 will be -5 ( only 0 7s can go in -5 and -5-0*7 = -5). With the same token -5 mod -7 will also be -5.
A few more interesting cases:
5 mod (-3) = 2 i.e. 5 - (-3*-1)
(-5) mod (-3) = -2 i.e. -5 - (-3*1) = -5+3
It's just about the remainders. Let me show you how
10 % 5=0
9 % 5=4 (because the remainder of 9 when divided by 5 is 4)
8 % 5=3
7 % 5=2
6 % 5=1
5 % 5=0 (because it is fully divisible by 5)
Now we should remember one thing, mod means remainder so
4 % 5=4
but why 4?
because 5 X 0 = 0
so 0 is the nearest multiple which is less than 4
hence 4-0=4
modulus is remainders system.
So 7 % 5 = 2.
5 % 7 = 5
3 % 7 = 3
2 % 7 = 2
1 % 7 = 1
When used inside a function to determine the array index. Is it safe programming ? That is a different question. I guess.
Step 1 : 5/7 = 0.71
Step 2 : Take the left side of the decimal , so we take 0 from 0.71 and multiply by 7
0*7 = 0;
Step # : 5-0 = 5 ; Therefore , 5%7 =5
Modulus operator gives you the result in 'reduced residue system'. For example for mod 5 there are 5 integers counted: 0,1,2,3,4. In fact 19=12=5=-2=-9 (mod 7). The main difference that the answer is given by programming languages by 'reduced residue system'.
lets put it in this way:
actually Modulus operator does the same division but it does not care about the answer , it DOES CARE ABOUT reminder for example if you divide 7 to 5 ,
so , lets me take you through a simple example:
think 5 is a block, then for example we going to have 3 blocks in 15 (WITH Nothing Left) , but when that loginc comes to this kinda numbers {1,3,5,7,9,11,...} , here is where the Modulus comes out , so take that logic that i said before and apply it for 7 , so the answer gonna be that we have 1 block of 5 in 7 => with 2 reminds in our hand! that is the modulus!!!
but you were asking about 5 % 7 , right ?
so take the logic that i said , how many 7 blocks do we have in 5 ???? 0
so the modulus returns 0...
that's it ...
A novel way to find out the remainder is given below
Statement : Remainder is always constant
ex : 26 divided by 7 gives R : 5
This can be found out easily by finding the number that completely divides 26 which is closer to the
divisor and taking the difference of the both
13 is the next number after 7 that completely divides 26 because after 7 comes 8, 9, 10, 11, 12 where none of them divides 26 completely and give remainder 0.
So 13 is the closest number to 7 which divides to give remainder 0.
Now take the difference (13 ~ 7) = 5 which is the temainder.
Note: for this to work divisor should be reduced to its simplest form ex: if 14 is the divisor, 7 has to be chosen to find the closest number dividing the dividend.
As you say, the % sign is used to take the modulus (division remainder).
In w3schools' JavaScript Arithmetic page we can read in the Remainder section what I think to be a great explanation
In arithmetic, the division of two integers produces a quotient and a
remainder.
In mathematics, the result of a modulo operation is the
remainder of an arithmetic division.
So, in your specific case, when you try to divide 7 bananas into a group of 5 bananas, you're able to create 1 group of 5 (quotient) and you'll be left with 2 bananas (remainder).
If 5 bananas into a group of 7, you won't be able to and so you're left with again the 5 bananas (remainder).

Circle Summation (30 Points) InterviewStree Puzzle

The following is the problem from Interviewstreet I am not getting any help from their site, so asking a question here. I am not interested in an algorithm/solution, but I did not understand the solution given by them as an example for their second input. Can anybody please help me to understand the second Input and Output as specified in the problem statement.
Circle Summation (30 Points)
There are N children sitting along a circle, numbered 1,2,...,N clockwise. The ith child has a piece of paper with number ai written on it. They play the following game:
In the first round, the child numbered x adds to his number the sum of the numbers of his neighbors.
In the second round, the child next in clockwise order adds to his number the sum of the numbers of his neighbors, and so on.
The game ends after M rounds have been played.
Input:
The first line contains T, the number of test cases. T cases follow. The first line for a test case contains two space seperated integers N and M. The next line contains N integers, the ith number being ai.
Output:
For each test case, output N lines each having N integers. The jth integer on the ith line contains the number that the jth child ends up with if the game starts with child i playing the first round. Output a blank line after each test case except the last one. Since the numbers can be really huge, output them modulo 1000000007.
Constraints:
1 <= T <= 15
3 <= N <= 50
1 <= M <= 10^9
1 <= ai <= 10^9
Sample Input:
2
5 1
10 20 30 40 50
3 4
1 2 1
Sample Output:
80 20 30 40 50
10 60 30 40 50
10 20 90 40 50
10 20 30 120 50
10 20 30 40 100
23 7 12
11 21 6
7 13 24
Here is an explanation of the second test case. I will use a notation (a, b, c) meaning that child one has number a, child two has number b and child three has number c. In the beginning, the position is always (1,2,1).
If the first child is the first to sum its neighbours, the table goes through the following situations (I will put an asterisk in front of the child that just added its two neighbouring numbers):
(1,2,1)->(*4,2,1)->(4,*7,1)->(4,7,*12)->(*23,7,12)
If the second child is the first to move:
(1,2,1)->(1,*4,1)->(1,4,*6)->(*11,4,6)->(11,*21,6)
And last if the third child is first to move:
(1,2,1)->(1,2,*4)->(*7,2,4)->(7,*13,4)->(7,13,*24)
And as you notice the output to the second case are exactly the three triples computed that way.
Hope that helps.