Binary search to solve 'Kth Smallest Element in a Sorted Matrix'. How can one ensure the correctness of the algorithm,

Binary search to solve 'Kth Smallest Element in a Sorted Matrix'. How can one ensure the correctness of the algorithm, - binary-search

I'm referring to the leetcode question: Kth Smallest Element in a Sorted Matrix
There are two well-known solutions to the problem. One using Heap/PriorityQueue and other is using Binary Search. The Binary Search solution goes like this (top post):
public class Solution {
public int kthSmallest(int[][] matrix, int k) {
int lo = matrix[0][0], hi = matrix[matrix.length - 1][matrix[0].length - 1] + 1;//[lo, hi)
while(lo < hi) {
int mid = lo + (hi - lo) / 2;
int count = 0, j = matrix[0].length - 1;
for(int i = 0; i < matrix.length; i++) {
while(j >= 0 && matrix[i][j] > mid) j--;
count += (j + 1);
}
if(count < k) lo = mid + 1;
else hi = mid;
}
return lo;
}
}
While I understand how this works, I have trouble figuring out one issue.
How can we be sure that the returned lo is always in the matrix?
Since the search space is min and max value of the array, the mid need NOT be a value that is in the array. However, the returned lo always is.
Why is this happening?

For the sake of argument, we can move the calculation of count to a separate function like the following:
bool valid(int mid, int[][] matrix, int k) {
int count = 0, m = matrix.length;
for (int i = 0; i < m; i++) {
int j = 0;
while (j < m && matrix[i][j] <= mid) j++;
count += j;
}
return (count < k);
}
This predicate will do exactly same as your specified operation. Here, the loop invariant is that, the range [lo, hi] always contains the kth smallest number of the 2D array.
In other words, lo <= solution <= hi
Now, when the loop terminates, it is evident that lo >= hi
Merging those two properties, we get, lo = solution = hi, since solution is a member of array, it can be said that, lo is always in the array after loop termination and will rightly point to the kth smallest element.

Because We are finding the lower_bound using binary search and there cannot be any number smaller than the number(lo) in the array which could be the kth smallest element.

Related

While loop exit condition

I want to know how this while loop exits? Because i and j are not incremented anywhere except at s.charAt(j++) and s.charAt(i++) which is to find char at j or ith position. Does that also increment j and i? In my opinion it should only give you the character code at j++ or i++ th position isn't it so?
public class Solution {
public int lengthOfLongestSubstring(String s) {
int n = s.length();
Set<Character> set = new HashSet<>();
int ans = 0, i = 0, j = 0;
while (i < n && j < n) {
// try to extend the range [i, j]
if (!set.contains(s.charAt(j))){
set.add(s.charAt(j++));
ans = Math.max(ans, j - i);
}
else {
set.remove(s.charAt(i++));
}
}
return ans;
}
}

Yes using i++ increments the value of i.
If you want to find character at next position than i, then you can simply use another variable and store i+1 value in it and use that to access character. Like this:
z = i+1;
set.add(s.charAt(z));
using post-increment or pre-increment always changes the value of increment variable and only differ as to when they increment the increment variable's value.

Selection sort implementation, I am stuck at calculating time complexity for number of swaps

static int count = 0;
for (int i = 0; i < arr.length; i++) {
for (int j = i + 1; j < arr.length; j++) {
if (arr[i] > arr[j]) {
swap(arr, i, j);
count++;
}
}
}
Is this the correct implementation for selection sort? I am not getting O(n-1) complexity for swaps with this implementation.

Is this the correct implementation for selection sort?
It depends, logically what you are doing is correct. It sort using "find the max/min value in the array". But, in Selection Sort, usually you didn't need more than one swap in one iteration. You just save the max/min value in the array, then at the end you swap it with the i-th element
I am not getting O(n-1) complexity for swaps
did you mean n-1 times of swap? yes, it happen because you swap every times find a larger value not only on the largest value. You can try to rewrite your code like this:
static int count=0;
static int maximum=0;
for(int i=0;i<arr.length-1;i++){
maximum = i;
for(int j=i+1;j<arr.length;j++){
if(arr[j] > arr[maximum]){
maximum = j;
}
}
swap(arr[maximum],arr[i]);
count++;
}
Also, if you want to exact n-1 times swap, your iteration for i should changed too.

When can an algorithm have square root(n) time complexity?

Can someone give me example of an algorithm that has square root(n) time complexity. What does square root time complexity even mean?

Square root time complexity means that the algorithm requires O(N^(1/2)) evaluations where the size of input is N.
As an example for an algorithm which takes O(sqrt(n)) time, Grover's algorithm is one which takes that much time. Grover's algorithm is a quantum algorithm for searching an unsorted database of n entries in O(sqrt(n)) time.
Let us take an example to understand how can we arrive at O(sqrt(N)) runtime complexity, given a problem. This is going to be elaborate, but is interesting to understand. (The following example, in the context for answering this question, is taken from Coding Contest Byte: The Square Root Trick , very interesting problem and interesting trick to arrive at O(sqrt(n)) complexity)
Given A, containing an n elements array, implement a data structure for point updates and range sum queries.
update(i, x)-> A[i] := x (Point Updates Query)
query(lo, hi)-> returns A[lo] + A[lo+1] + .. + A[hi]. (Range Sum Query)
The naive solution uses an array. It takes O(1) time for an update (array-index access) and O(hi - lo) = O(n) for the range sum (iterating from start index to end index and adding up).
A more efficient solution splits the array into length k slices and stores the slice sums in an array S.
The update takes constant time, because we have to update the value for A and the value for the corresponding S. In update(6, 5) we have to change A[6] to 5 which results in changing the value of S1 to keep S up to date.
The range-sum query is interesting. The elements of the first and last slice (partially contained in the queried range) have to be traversed one by one, but for slices completely contained in our range we can use the values in S directly and get a performance boost.
In query(2, 14) we get,
query(2, 14) = A[2] + A[3]+ (A[4] + A[5] + A[6] + A[7]) + (A[8] + A[9] + A[10] + A[11]) + A[12] + A[13] + A[14] ;
query(2, 14) = A[2] + A[3] + S[1] + S[2] + A[12] + A[13] + A[14] ;
query(2, 14) = 0 + 7 + 11 + 9 + 5 + 2 + 0;
query(2, 14) = 34;
The code for update and query is:
def update(S, A, i, k, x):
S[i/k] = S[i/k] - A[i] + x
A[i] = x
def query(S, A, lo, hi, k):
s = 0
i = lo
//Section 1 (Getting sum from Array A itself, starting part)
while (i + 1) % k != 0 and i <= hi:
s += A[i]
i += 1
//Section 2 (Getting sum from Slices directly, intermediary part)
while i + k <= hi:
s += S[i/k]
i += k
//Section 3 (Getting sum from Array A itself, ending part)
while i <= hi:
s += A[i]
i += 1
return s
Let us now determine the complexity.
Each query takes on average
Section 1 takes k/2 time on average. (you might iterate atmost k/2)
Section 2 takes n/k time on average, basically number of slices
Section 3 takes k/2 time on average. (you might iterate atmost k/2)
So, totally, we get k/2 + n/k + k/2 = k + n/k time.
And, this is minimized for k = sqrt(n). sqrt(n) + n/sqrt(n) = 2*sqrt(n)
So we get a O(sqrt(n)) time complexity query.

Prime numbers
As mentioned in some other answers, some basic things related to prime numbers take O(sqrt(n)) time:
Find number of divisors
Find sum of divisors
Find Euler's totient
Below I mention two advanced algorithms which also bear sqrt(n) term in their complexity.
MO's Algorithm
try this problem: Powerful array
My solution:
#include <bits/stdc++.h>
using namespace std;
const int N = 1E6 + 10, k = 500;
struct node {
int l, r, id;
bool operator<(const node &a) {
if(l / k == a.l / k) return r < a.r;
else return l < a.l;
}
} q[N];
long long a[N], cnt[N], ans[N], cur_count;
void add(int pos) {
cur_count += a[pos] * cnt[a[pos]];
++cnt[a[pos]];
cur_count += a[pos] * cnt[a[pos]];
}
void rm(int pos) {
cur_count -= a[pos] * cnt[a[pos]];
--cnt[a[pos]];
cur_count -= a[pos] * cnt[a[pos]];
}
int main() {
int n, t;
cin >> n >> t;
for(int i = 1; i <= n; i++) {
cin >> a[i];
}
for(int i = 0; i < t; i++) {
cin >> q[i].l >> q[i].r;
q[i].id = i;
}
sort(q, q + t);
memset(cnt, 0, sizeof(cnt));
memset(ans, 0, sizeof(ans));
int curl(0), curr(0), l, r;
for(int i = 0; i < t; i++) {
l = q[i].l;
r = q[i].r;
/* This part takes O(n * sqrt(n)) time */
while(curl < l)
rm(curl++);
while(curl > l)
add(--curl);
while(curr > r)
rm(curr--);
while(curr < r)
add(++curr);
ans[q[i].id] = cur_count;
}
for(int i = 0; i < t; i++) {
cout << ans[i] << '\n';
}
return 0;
}
Query Buffering
try this problem: Queries on a Tree
My solution:
#include <bits/stdc++.h>
using namespace std;
const int N = 2e5 + 10, k = 333;
vector<int> t[N], ht;
int tm_, h[N], st[N], nd[N];
inline int hei(int v, int p) {
for(int ch: t[v]) {
if(ch != p) {
h[ch] = h[v] + 1;
hei(ch, v);
}
}
}
inline void tour(int v, int p) {
st[v] = tm_++;
ht.push_back(h[v]);
for(int ch: t[v]) {
if(ch != p) {
tour(ch, v);
}
}
ht.push_back(h[v]);
nd[v] = tm_++;
}
int n, tc[N];
vector<int> loc[N];
long long balance[N];
vector<pair<long long,long long>> buf;
inline long long cbal(int v, int p) {
long long ans = balance[h[v]];
for(int ch: t[v]) {
if(ch != p) {
ans += cbal(ch, v);
}
}
tc[v] += ans;
return ans;
}
inline void bal() {
memset(balance, 0, sizeof(balance));
for(auto arg: buf) {
balance[arg.first] += arg.second;
}
buf.clear();
cbal(1,1);
}
int main() {
int q;
cin >> n >> q;
for(int i = 1; i < n; i++) {
int x, y; cin >> x >> y;
t[x].push_back(y); t[y].push_back(x);
}
hei(1,1);
tour(1,1);
for(int i = 0; i < ht.size(); i++) {
loc[ht[i]].push_back(i);
}
vector<int>::iterator lo, hi;
int x, y, type;
for(int i = 0; i < q; i++) {
cin >> type;
if(type == 1) {
cin >> x >> y;
buf.push_back(make_pair(x,y));
}
else if(type == 2) {
cin >> x;
long long ans(0);
for(auto arg: buf) {
hi = upper_bound(loc[arg.first].begin(), loc[arg.first].end(), nd[x]);
lo = lower_bound(loc[arg.first].begin(), loc[arg.first].end(), st[x]);
ans += arg.second * (hi - lo);
}
cout << tc[x] + ans/2 << '\n';
}
else assert(0);
if(i % k == 0) bal();
}
}

There are many cases.
These are the few problems which can be solved in root(n) complexity [better may be possible also].
Find if a number is prime or not.
Grover's Algorithm: allows search (in quantum context) on unsorted input in time proportional to the square root of the size of the input.link
Factorization of the number.
There are many problems that you will face which will demand use of sqrt(n) complexity algorithm.
As an answer to second part:
sqrt(n) complexity means if the input size to your algorithm is n then there approximately sqrt(n) basic operations ( like **comparison** in case of sorting). Then we can say that the algorithm has sqrt(n) time complexity.
Let's analyze the 3rd problem and it will be clear.
let's n= positive integer. Now there exists 2 positive integer x and y such that
x*y=n;
Now we know that whatever be the value of x and y one of them will be less than sqrt(n). As if both are greater than sqrt(n)
x>sqrt(n) y>sqrt(n) then x*y>sqrt(n)*sqrt(n) => n>n--->contradiction.
So if we check 2 to sqrt(n) then we will have all the factors considered ( 1 and n are trivial factors).
Code snippet:
int n;
cin>>n;
print 1,n;
for(int i=2;i<=sqrt(n);i++) // or for(int i=2;i*i<=n;i++)
if((n%i)==0)
cout<<i<<" ";
Note: You might think that not considering the duplicate we can also achieve the above behaviour by looping from 1 to n. Yes that's possible but who wants to run a program which can run in O(sqrt(n)) in O(n).. We always look for the best one.
Go through the book of Cormen Introduction to Algorithms.
I will also request you to read following stackoverflow question and answers they will clear all the doubts for sure :)
Are there any O(1/n) algorithms?
Plain english explanation Big-O
Which one is better?
How do you calculte big-O complexity?

This link provides a very basic beginner understanding of O() i.e., O(sqrt n) time complexity. It is the last example in the video, but I would suggest that you watch the whole video.
https://www.youtube.com/watch?v=9TlHvipP5yA&list=PLDN4rrl48XKpZkf03iYFl-O29szjTrs_O&index=6
The simplest example of an O() i.e., O(sqrt n) time complexity algorithm in the video is:
p = 0;
for(i = 1; p <= n; i++) {
p = p + i;
}
Mr. Abdul Bari is reknowned for his simple explanations of data structures and algorithms.

Primality test
Solution in JavaScript
const isPrime = n => {
for(let i = 2; i <= Math.sqrt(n); i++) {
if(n % i === 0) return false;
}
return true;
};
Complexity
O(N^1/2) Because, for a given value of n, you only need to find if its divisible by numbers from 2 to its root.

JS Primality Test
O(sqrt(n))
A slightly more performant version, thanks to Samme Bae, for enlightening me with this. 😉
function isPrime(n) {
if (n <= 1)
return false;
if (n <= 3)
return true;
// Skip 4, 6, 8, 9, and 10
if (n % 2 === 0 || n % 3 === 0)
return false;
for (let i = 5; i * i <= n; i += 6) {
if (n % i === 0 || n % (i + 2) === 0)
return false;
}
return true;
}
isPrime(677);

Performance analysis of 3 sum

I have a method that finds 3 numbers in an array that add up to a desired number.
code:
public static void threeSum(int[] arr, int sum) {
quicksort(arr, 0, arr.length - 1);
for (int i = 0; i < arr.length - 2; i++) {
for (int j = 1; j < arr.length - 1; j++) {
for (int k = arr.length - 1; k > j; k--) {
if ((arr[i] + arr[j] + arr[k]) == sum) {
System.out.println(Integer.toString(i) + "+" + Integer.toString(j) + "+" + Integer.toString(k) + "=" + sum);
}
}
}
}
}
I'm not sure about the big O of this method. I have a hard time wrapping my head around this right now. My guess is O(n^2) or O(n^2logn). But these are complete guesses. I can't prove this. Could someone help me wrap my head around this?

You have three runs over the array (the i, j and k loops), in sizes that depend primarily on n, the size of the array. Hence, this is an O(n3) operation.

Even though your quicksort is O(nlogn), it is overshadowed by the fact that you have 3 nested for loops. So the time complexity w.r.t number of elements (n) is O(n^3)

It's an O(n^3) complexity because there are three nested forloops. The inner forloop only runs if k>j so you can think of n^2*(n/2) but in big O notation you can ignore this.

Methodically speaking, the order of growth complexity can be accurately inferred like the following:

Number of possible combinations

How many possible combinations of the variables a,b,c,d,e are possible if I know that:
a+b+c+d+e = 500
and that they are all integers and >= 0, so I know they are finite.

#Torlack, #Jason Cohen: Recursion is a bad idea here, because there are "overlapping subproblems." I.e., If you choose a as 1 and b as 2, then you have 3 variables left that should add up to 497; you arrive at the same subproblem by choosing a as 2 and b as 1. (The number of such coincidences explodes as the numbers grow.)
The traditional way to attack such a problem is dynamic programming: build a table bottom-up of the solutions to the sub-problems (starting with "how many combinations of 1 variable add up to 0?") then building up through iteration (the solution to "how many combinations of n variables add up to k?" is the sum of the solutions to "how many combinations of n-1 variables add up to j?" with 0 <= j <= k).
public static long getCombos( int n, int sum ) {
// tab[i][j] is how many combinations of (i+1) vars add up to j
long[][] tab = new long[n][sum+1];
// # of combos of 1 var for any sum is 1
for( int j=0; j < tab[0].length; ++j ) {
tab[0][j] = 1;
}
for( int i=1; i < tab.length; ++i ) {
for( int j=0; j < tab[i].length; ++j ) {
// # combos of (i+1) vars adding up to j is the sum of the #
// of combos of i vars adding up to k, for all 0 <= k <= j
// (choosing i vars forces the choice of the (i+1)st).
tab[i][j] = 0;
for( int k=0; k <= j; ++k ) {
tab[i][j] += tab[i-1][k];
}
}
}
return tab[n-1][sum];
}
$ time java Combos
2656615626
real 0m0.151s
user 0m0.120s
sys 0m0.012s

The answer to your question is 2656615626.
Here's the code that generates the answer:
public static long getNumCombinations( int summands, int sum )
{
if ( summands <= 1 )
return 1;
long combos = 0;
for ( int a = 0 ; a <= sum ; a++ )
combos += getNumCombinations( summands-1, sum-a );
return combos;
}
In your case, summands is 5 and sum is 500.
Note that this code is slow. If you need speed, cache the results from summand,sum pairs.
I'm assuming you want numbers >=0. If you want >0, replace the loop initialization with a = 1 and the loop condition with a < sum. I'm also assuming you want permutations (e.g. 1+2+3+4+5 plus 2+1+3+4+5 etc). You could change the for-loop if you wanted a >= b >= c >= d >= e.

I solved this problem for my dad a couple months ago...extend for your use. These tend to be one time problems so I didn't go for the most reusable...
a+b+c+d = sum
i = number of combinations
for (a=0;a<=sum;a++)
{
for (b = 0; b <= (sum - a); b++)
{
for (c = 0; c <= (sum - a - b); c++)
{
//d = sum - a - b - c;
i++
}
}
}

This would actually be a good question to ask on an interview as it is simple enough that you could write up on a white board, but complex enough that it might trip someone up if they don't think carefully enough about it. Also, you can also for two different answers which cause the implementation to be quite different.
Order Matters
If the order matters then any solution needs to allow for zero to appear for any of the variables; thus, the most straight forward solution would be as follows:
public class Combos {
public static void main() {
long counter = 0;
for (int a = 0; a <= 500; a++) {
for (int b = 0; b <= (500 - a); b++) {
for (int c = 0; c <= (500 - a - b); c++) {
for (int d = 0; d <= (500 - a - b - c); d++) {
counter++;
}
}
}
}
System.out.println(counter);
}
}
Which returns 2656615626.
Order Does Not Matter
If the order does not matter then the solution is not that much harder as you just need to make sure that zero isn't possible unless sum has already been found.
public class Combos {
public static void main() {
long counter = 0;
for (int a = 1; a <= 500; a++) {
for (int b = (a != 500) ? 1 : 0; b <= (500 - a); b++) {
for (int c = (a + b != 500) ? 1 : 0; c <= (500 - a - b); c++) {
for (int d = (a + b + c != 500) ? 1 : 0; d <= (500 - a - b - c); d++) {
counter++;
}
}
}
}
System.out.println(counter);
}
}
Which returns 2573155876.

One way of looking at the problem is as follows:
First, a can be any value from 0 to 500. Then if follows that b+c+d+e = 500-a. This reduces the problem by one variable. Recurse until done.
For example, if a is 500, then b+c+d+e=0 which means that for the case of a = 500, there is only one combination of values for b,c,d and e.
If a is 300, then b+c+d+e=200, which is in fact the same problem as the original problem, just reduced by one variable.
Note: As Chris points out, this is a horrible way of actually trying to solve the problem.
link text

If they are a real numbers then infinite ... otherwise it is a bit trickier.
(OK, for any computer representation of a real number there would be a finite count ... but it would be big!)

It has general formulae, if
a + b + c + d = N
Then number of non-negative integral solution will be C(N + number_of_variable - 1, N)

#Chris Conway answer is correct. I have tested with a simple code that is suitable for smaller sums.
long counter = 0;
int sum=25;
for (int a = 0; a <= sum; a++) {
for (int b = 0; b <= sum ; b++) {
for (int c = 0; c <= sum; c++) {
for (int d = 0; d <= sum; d++) {
for (int e = 0; e <= sum; e++) {
if ((a+b+c+d+e)==sum) counter=counter+1L;
}
}
}
}
}
System.out.println("counter e "+counter);

The answer in math is 504!/(500! * 4!).
Formally, for x1+x2+...xk=n, the number of combination of nonnegative number x1,...xk is the binomial coefficient: (k-1)-combination out of a set containing (n+k-1) elements.
The intuition is to choose (k-1) points from (n+k-1) points and use the number of points between two chosen points to represent a number in x1,..xk.
Sorry about the poor math edition for my fist time answering Stack Overflow.
Just a test for code block
Just a test for code block
Just a test for code block

Including negatives? Infinite.
Including only positives? In this case they wouldn't be called "integers", but "naturals", instead. In this case... I can't really solve this, I wish I could, but my math is too rusty. There is probably some crazy integral way to solve this. I can give some pointers for the math skilled around.
being x the end result,
the range of a would be from 0 to x,
the range of b would be from 0 to (x - a),
the range of c would be from 0 to (x - a - b),
and so forth until the e.
The answer is the sum of all those possibilities.
I am trying to find some more direct formula on Google, but I am really low on my Google-Fu today...

We Keep Coding

sql objective-c vba vb.net react-native apache vue.js tensorflow api pandas

Binary search to solve 'Kth Smallest Element in a Sorted Matrix'. How can one ensure the correctness of the algorithm, - binary-search

Because We are finding the lower_bound using binary search and there cannot be any number smaller than the number(lo) in the array which could be the kth smallest element.

Related

While loop exit condition

Selection sort implementation, I am stuck at calculating time complexity for number of swaps

When can an algorithm have square root(n) time complexity?

Performance analysis of 3 sum

Number of possible combinations

Categories

Resources