Given an array arr = [a, b, c, ...] and a known modulus M,
find the maximum value of (a * k1 + b * k2 + c * k3 + ...) % M,
where k1, k2, k3, ... are any non-negative integers you choose.
Language- C++
Example
Input
arr=[7, 3, 2].
M=10
Output
9
Explanation
You can choose (7 * 1 + 3 * 0 + 2 * 1) % 10 = 9.
Here k1 = 1, k2 = 0, k3 = 1.
So you get 9 as the answer (the maximum value).
Edit-
I am trying for a C++ Solution.
My attempt:
I know that the answer will range from 0 to M-1, but I don't see how to proceed.
Edit
My attempt
int ans=arr[0];
for(int i=1; i<arr.length(); i++){
ans=max((ans+arr[i])%M,ans);
}
return ans;
Here I traverse the array from left to right going on updating the ans. Is this correct?
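One way to approach this, offered as a sketch rather than a confirmed solution to the original exercise: modulo M, the sums a * k1 + b * k2 + ... with non-negative k's are exactly the multiples of g = gcd(a, b, c, ..., M), so the largest reachable remainder is M - g. The left-to-right loop above does not find this in general; for arr = [2] and M = 10 it returns 2, although k1 = 4 gives 8. The function name maxModValue is hypothetical, the code assumes M > 0 and non-negative array entries, and it uses std::gcd from C++17.

```cpp
#include <numeric>
#include <vector>

// Hypothetical helper: largest value of (sum of arr[i] * k_i) % M over
// non-negative integers k_i. Assumes M > 0 and arr[i] >= 0.
long long maxModValue(const std::vector<long long>& arr, long long M) {
    long long g = M;  // gcd(M, 0) == M, so an empty array yields answer 0
    for (long long a : arr) {
        g = std::gcd(g, a % M);  // fold each element into the running gcd
    }
    // Reachable remainders are 0, g, 2g, ..., M - g.
    return M - g;
}
```

For the example in the question, maxModValue({7, 3, 2}, 10) computes g = 1 and returns 9.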
I have a matrix T (60000x8) on which I want to perform a sorting operation.
In Matlab I can create a sub-matrix containing the rows that have the same value in column 8.
a1 = max(T(:,8)); x = [1.0:1:a1];
for i = 1.0:1:a1
T1 = T(T(:, 8)== x(i), :);
end
This works perfectly and does my job.
But I want to perform a similar operation using Fortran.
I have tried the following:
read(7,*) *(height(i,j),j=1,8)
k=maxval(height(:,8))
a1 = int(height(:,8))
allocate(x(k), T1(k,8))
do i=1,k
x(i) = i
end do
do i = 1, k
T1 = height((a1(i)== x(i)),:)
end do
When compiling this gives me error
Error: Array index at (1) must be of INTEGER type, found LOGICAL
Fortran is not Matlab ;)... Matlab can extract a subarray using a boolean mask; Fortran cannot. However, Fortran can index an array with a vector of integer subscripts.
Your code has many flaws, and I have to assume that height and T1 are real arrays. You can obtain your desired result (at least what I understand you want) with:
integer :: i
integer, allocatable :: idx(:), x(:)
real, allocatable :: T1(:,:)
a1 = nint(height(:,8))
x = [(i,i=1,size(a1))]
idx = pack( x, mask=(a1==x) )
T1 = height(idx(:),:)
Explanation, for instance:
a1 : [4 2 3 1 5]
x : [1 2 3 4 5]
(a1 == x) : [F T T F T]
idx : [ 2 3 5] ! 3 elements
T1 will be made of rows 2, 3, 5 of height
Can somebody help with the time complexity of the following code:
for(i = 0; i <= n; i++)
{
for(j = 0; j <= i; j++)
{
for(k = 2; k <= n; k = k^2)
print("")
}
As I understand it, the first loop runs n times, the second runs (1+2+3+...+n) times in total, and the third runs log log n times,
but I am not sure about the answer.
We start from the inside and work out. Consider the innermost loop:
for(k = 2; k <= n; k = k^2)
print("")
How many iterations of print("") are executed? First note that n is constant. What sequence of values does k assume?
iter | k
--------
1 | 2
2 | 4
3 | 16
4 | 256
We might find a formula for this in several ways. I used guess and prove to get iter = log(log(k)) + 1, with both logarithms taken base 2. Since the loop won't execute the next iteration if the value is already bigger than n, the total number of iterations executed for n >= 2 is floor(log(log(n)) + 1). We can check this with a couple of values to make sure we got this right. For n = 2, we get one iteration, which is correct. For n = 5, we get two. And so on.
The next level does i + 1 iterations, where i varies from 0 to n. We must therefore compute the sum 1 + 2 + ... + (n + 1), which gives the total number of iterations of the outermost and middle loops: this sum is (n + 1)(n + 2) / 2. Multiplying this by the cost of the inner loop gives the total cost of the snippet, (n + 1)(n + 2)(log(log(n)) + 1) / 2. The fastest-growing term in the expansion is n^2 log(log(n)), and so that is what would typically be given as the asymptotic complexity.
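The count above can be checked by brute force. The following is a minimal sketch that reads k = k^2 as squaring (k = k * k), as the table of k values implies; the function names innerIterations and totalIterations are made up for this illustration.

```cpp
#include <cstdint>

// Iterations of: for (k = 2; k <= n; k = k * k)
std::uint64_t innerIterations(std::uint64_t n) {
    std::uint64_t count = 0;
    std::uint64_t k = 2;
    while (k <= n) {
        ++count;
        if (k > n / k) break;  // k * k would exceed n; also avoids overflow
        k *= k;
    }
    return count;
}

// Total number of print("") calls made by the three nested loops.
std::uint64_t totalIterations(std::uint64_t n) {
    std::uint64_t total = 0;
    for (std::uint64_t i = 0; i <= n; ++i) {
        for (std::uint64_t j = 0; j <= i; ++j) {
            total += innerIterations(n);
        }
    }
    return total;
}
```

For n = 5 the inner loop runs twice and the two outer loops contribute (5 + 1)(5 + 2) / 2 = 21 passes, so totalIterations(5) is 42, in line with the formula.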
I would like to know where I can read about algorithms for solving this problem efficiently:
Four directions allowed: up, down, left, right
Cells containing zero can't be visited.
Visiting the same cell twice is illegal.
Moves wrap around the edges:
(first row is connected with last row)
(first col is connected with last col)
Example, 5x5 and 5 steps:
9 1 3 1 9
6 3 2 4 1
0 7 * 7 7
5 4 9 4 9
7 9 1 5 5
Starting point: *
Solution: down,left,down,left,down. That is 9 + 4 + 9 + 7 + 9 = 38
[9] 1 3 1 9
6 3 2 4 1
0 7 * 7 7
5 [4][9] 4 9
[7][9] 1 5 5
This problem is probably not related to:
Finding the maximum sub matrix
Dynamic programming
You specified in comments that you wanted a sub-second way of finding the best value 20-step path out of a 5x5 matrix. I've implemented a basic recursive search tree that does this. Ultimately, the difficulty of this is still O(3^k), but highly saturated cases like yours (21 out of 24 allowed nodes visited) will solve much faster because the problem simplifies to "skip the n*n-z-k-1 lowest value cells" (in this case, n=5, z=1 and k+1 = 21; the winning path skips three 1's).
The problem instance in your question solves in 0.231 seconds on a 3-year-old i5 laptop and in about half a second on ideone.com. I've provided code here http://ideone.com/5kOyxq (note that 'up' and 'down' are reversed because of the way I input the data).
For less saturated problems you may need to add a Bound/Cut method. You can generate a Bound as follows:
First, run over the NxN matrix and collect the K highest-value elements (can be done in O(N² log K)) and sort them in descending order. Then compute the cumulative sums UB[t] = SortedElements[0] + ... + SortedElements[t-1], so that no sequence of t cells can be worth more than UB[t] (the sum of the t largest elements).
At step T, the current branch has K-T steps left and can gain at most UB[K-T] from them. If ValueSoFar[T] + UB[K-T] <= BestPathValue, you can stop that branch.
There may be better ways, but this should be sufficient for reasonably sized matrices and path lengths.
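The bound described above could be wired into a depth-first search roughly as follows. This is a sketch under stated assumptions, not the code behind the ideone link: the name bestPathSum is hypothetical, cell values are assumed non-negative, the start cell is excluded from the sum, and the whole grid is sorted instead of selecting only the top K values.

```cpp
#include <algorithm>
#include <functional>
#include <vector>

// Best sum over all K-step paths from (sr, sc) on a wrap-around grid.
// Returns -1 if no valid K-step path exists.
int bestPathSum(const std::vector<std::vector<int>>& grid, int sr, int sc, int K) {
    const int R = static_cast<int>(grid.size());
    const int C = static_cast<int>(grid[0].size());

    // ub[t] = sum of the t largest cell values (start cell excluded).
    std::vector<int> vals;
    for (int r = 0; r < R; ++r)
        for (int c = 0; c < C; ++c)
            if (r != sr || c != sc) vals.push_back(grid[r][c]);
    std::sort(vals.begin(), vals.end(), std::greater<int>());
    std::vector<int> ub(K + 1, 0);
    for (int t = 1; t <= K; ++t)
        ub[t] = ub[t - 1] + (t - 1 < static_cast<int>(vals.size()) ? vals[t - 1] : 0);

    std::vector<std::vector<bool>> seen(R, std::vector<bool>(C, false));
    seen[sr][sc] = true;
    int best = -1;
    const int dr[] = {-1, 1, 0, 0};
    const int dc[] = {0, 0, -1, 1};

    std::function<void(int, int, int, int)> dfs = [&](int r, int c, int steps, int sum) {
        if (steps == K) { best = std::max(best, sum); return; }
        if (sum + ub[K - steps] <= best) return;  // bound: cannot beat best
        for (int d = 0; d < 4; ++d) {
            const int nr = (r + dr[d] + R) % R;  // wrap rows
            const int nc = (c + dc[d] + C) % C;  // wrap columns
            if (seen[nr][nc] || grid[nr][nc] == 0) continue;  // revisit or blocked
            seen[nr][nc] = true;
            dfs(nr, nc, steps + 1, sum + grid[nr][nc]);
            seen[nr][nc] = false;
        }
    };
    dfs(sr, sc, 0, 0);
    return best;
}
```

On the 5x5 example from the question, with the starting cell stored as 0 at row 2, column 2 (zero-based) and K = 5, this search returns 38, the value of the down, left, down, left, down path.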
Game or puzzle. Given a matrix, number of steps and a sum, find the path.
It would be nice if there were a real-world application for this, but I haven't found one.
Games tend to "burn in" knowledge in young brains, so why not burn in something useful, like addition?
#include <algorithm>
#include <climits>
#include <iostream>
#define R 3
#define C 3
int MAX(int x, int y, int z);
// Maximum cost of a path from (0, 0) to (m, n), where each step moves
// down, right, or diagonally down-right.
int Max_Cost(int cost[R][C], int m, int n)
{
    if (n < 0 || m < 0)
        return INT_MIN; // out of bounds: never selected by MAX
    else if (m == 0 && n == 0)
        return cost[m][n];
    else
        return cost[m][n] + MAX( Max_Cost(cost, m-1, n-1),
                                 Max_Cost(cost, m-1, n),
                                 Max_Cost(cost, m, n-1)
                               );
}
// Largest of three integers.
int MAX(int x, int y, int z)
{
    return std::max(std::max(x, y), z);
}
int main()
{
    int cost[R][C] = { {3, 2, 5},
                       {5, 8, 2},
                       {9, 7, 1}
                     };
    std::cout << Max_Cost(cost, 2, 1);
    return 0;
}
So, here's the question. I want to do a computation in CUDA where I have a large 1D array (which represents a lattice), I partition it into subarrays of length #part, and I want each thread to do a couple of computations on each subarray.
More specifically, let's say that we have a number of threads, #threads, and a number of blocks, #blocks. The array is of size N = 2 * #part * #threads * #blocks. If we number the subarrays from 1 to 2*#blocks*#threads, we want to first use the #threads*#blocks threads to do computation on the subarrays with an even number and then the same number of threads to do computation on the subarrays with an odd number.
I thought that I could have a local index in each thread which would denote where its subarray starts.
So, I used the following index:
localIndex = #part * (2 * threadIdx.x + var) + 2 * #part * #threads * blockIdx.x;
var is either 0 or 1, depending on whether we want the thread to do computation on a subarray with an even or an odd number.
I've tried to run it, and it seems that something goes wrong when I use more than one block. Have I done something wrong with the indexing?
Thanks.
Why is it important that the threads collectively process the even subarrays first and then the odd ones? Block and thread execution is not guaranteed to be in order, so there is no benefit.
Assuming you index only using x-dimension for your kernel dimension setup:
subArrayIndexEven = 2 * (blockIdx.x * blockDim.x + threadIdx.x) * part
subArrayIndexOdd = subArrayIndexEven + part
Check:
BLOCK_SIZE = 3
NUM_OF_BLOCKS = 2
PART = 4
N = 2 * 3 * 2 * 4 = 48
T(threadIdx.x, blockIdx.x)
T(0, 1) -> even = 2 * (1 * 3 + 0) * 4 = 24, odd = 28
T(1, 1) -> even = 2 * (1 * 3 + 1) * 4 = 32, odd = 36
T(2, 1) -> even = 2 * (1 * 3 + 2) * 4 = 40, odd = 44
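The check above covers one block by hand; the same formulas can be verified for every block and thread on the host. Below is a plain C++ sketch (not a CUDA kernel) that emulates blockIdx.x and threadIdx.x with loops and confirms that the even and odd subarrays tile the array without gaps or overlaps. The function name indexingCoversArray and its parameter names are hypothetical.

```cpp
#include <vector>

// Emulates the even/odd start-index formulas for all (block, thread) pairs
// and returns true if every element of the array is touched exactly once.
bool indexingCoversArray(int numBlocks, int blockSize, int part) {
    const int n = 2 * part * blockSize * numBlocks;
    std::vector<int> hits(n, 0);
    for (int block = 0; block < numBlocks; ++block) {      // blockIdx.x
        for (int thread = 0; thread < blockSize; ++thread) {  // threadIdx.x
            const int even = 2 * (block * blockSize + thread) * part;
            const int odd = even + part;
            for (int i = 0; i < part; ++i) {
                ++hits[even + i];
                ++hits[odd + i];
            }
        }
    }
    for (int h : hits) {
        if (h != 1) return false;
    }
    return true;
}
```

With the values from the check (2 blocks, 3 threads per block, part = 4), every one of the 48 elements is covered exactly once.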
idx = threads_per_block*blockIdx.x + threadIdx.x;
int my_even_offset, my_odd_offset, my_even_idx, my_odd_idx;
int my_offset = floor(float(idx)/float(num_part));
my_even_offset = 2*my_offset*num_part;
my_odd_offset = (2*my_offset+1)*num_part;
my_even_idx = idx + my_even_offset;
my_odd_idx = idx + my_odd_offset;
//Do stuff with the indices.
Question: Suppose you have a random number generator randn() that returns a uniformly distributed random number between 0 and n-1. Given any number m, write a random number generator that returns a uniformly distributed random number between 0 and m-1.
My answer:
-(int)randm() {
int k=1;
while (k*n < m) {
++k;
}
int x = 0;
for (int i=0; i<k; ++i) {
x += randn();
}
if (x < m) {
return x;
} else {
return randm();
}
}
Is this correct?
You're close, but your answer is only correct in one case.
If m<n, then this works because the numbers 0,1,...,m-1 appear each with equal probability, and the algorithm terminates almost surely.
This answer does not work in general because there is more than one way to write a number as a sum of two other numbers. For instance, there is only one way to get 0 but there are many ways to get m/2, so the probabilities will not be equal.
Example: n = 2 and m=3
0 = 0+0
1 = 1+0 or 0+1
2 = 1+1
so the probability distribution from your method is
P(0)=1/4
P(1)=1/2
P(2)=1/4
which is not uniform.
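The same counting can be done for any n and any number of draws. Here is a small sketch that tallies how many ways each sum can occur; the name sumCounts is made up for this illustration.

```cpp
#include <cstddef>
#include <vector>

// counts[s] = number of ways to obtain sum s from k draws,
// each draw uniform on 0..n-1.
std::vector<long long> sumCounts(int n, int k) {
    std::vector<long long> counts{1};  // zero draws: sum 0 in exactly one way
    for (int draw = 0; draw < k; ++draw) {
        std::vector<long long> next(counts.size() + n - 1, 0);
        for (std::size_t s = 0; s < counts.size(); ++s) {
            for (int v = 0; v < n; ++v) {
                next[s + v] += counts[s];  // extend each sum by outcome v
            }
        }
        counts = next;
    }
    return counts;
}
```

For n = 2 and two draws this gives the counts 1, 2, 1 out of 4 outcomes, matching P(0)=1/4, P(1)=1/2, P(2)=1/4.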
To fix this, you can use the fact that base-n representations are unique. Write m in base n, keeping track of the largest needed exponent, say e. Then find the largest k such that k*m is at most n^e. Finally, generate e numbers with randn() and take them as the base-n digits of some number x. If x < k*m, return x % m; otherwise try again.
Assuming that n and m are available to the function and that n^e fits in an int, then
int randm() {
    // find the smallest e with n^e >= m, keeping limit = n^e
    // (note that ^ is XOR in C, so the power is built by multiplication)
    int e = 0;
    int limit = 1;
    while (limit < m) {
        limit *= n;
        ++e;
    }
    // largest k such that k*m <= n^e
    int k = limit / m;
    while (1) {
        // generate a random number in base n
        int x = 0;
        for (int i=0; i<e; ++i) {
            x = x*n + randn();
        }
        // if x isn't too large, return x modulo m
        if (x < m*k)
            return (x % m);
    }
}
It is not correct.
You are adding uniform random numbers, which does not produce a uniformly random result. Say n=2 and m = 3, then the possible values for x are 0+0, 0+1, 1+0, 1+1. So you're twice as likely to get 1 as you are to get 0 or 2.
What you need to do is write m in base n, and then generate 'digits' of the base-n representation of the random number. When you have the complete number, you have to check if it is less than m. If it is, then you're done. If it is not, then you need to start over.
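A minimal sketch of this digit-by-digit approach follows. To keep it self-contained, randn is passed in as a callable returning a value in 0..n-1; that parameter and the name randmByDigits are assumptions made for this example, and n >= 2, m >= 1 are assumed.

```cpp
#include <functional>

// Builds a number from base-n digits and retries until it is below m.
int randmByDigits(int n, int m, const std::function<int()>& randn) {
    // Number of base-n digits needed so that n^digits >= m.
    int digits = 0;
    long long limit = 1;
    while (limit < m) {
        limit *= n;
        ++digits;
    }
    for (;;) {
        long long x = 0;
        for (int i = 0; i < digits; ++i) {
            x = x * n + randn();  // append one base-n digit
        }
        if (x < m) return static_cast<int>(x);  // accept
        // otherwise reject and start over
    }
}
```

With n = 2 and m = 3, two digits are drawn per attempt: the digit pair 1, 1 gives x = 3 and is rejected, and a following pair 0, 1 gives x = 1, which is returned.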
The sum of two uniform random number generators is not uniformly generated. For instance, the sum of two dice is more likely to be 7 than 12, because to get 12 you need to throw two sixes, whereas you can get 7 as 1 + 6 or 6 + 1 or 2 + 5 or 5 + 2 or ...
Assuming that randn() returns an integer between 0 and n - 1, n * randn() + randn() is uniformly distributed between 0 and n * n - 1, so you can increase its range. If randn() returns an integer between 0 and k * m + j - 1, then call it repeatedly until you get a number <= k * m - 1, and then divide the result by k (integer division) to get a number uniformly distributed between 0 and m - 1.
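The two steps above (widen the range, then reject and divide) might be combined as in the sketch below. It assumes m <= n * n and that n * n fits in an int; randn is again passed in as a callable, and the name randmByScaling is hypothetical.

```cpp
#include <functional>

// Widens the range to 0..n*n-1, rejects values >= k*m, then divides by k.
int randmByScaling(int n, int m, const std::function<int()>& randn) {
    const int range = n * n;  // n * randn() + randn() is uniform on 0..range-1
    const int k = range / m;  // largest k with k * m <= range
    for (;;) {
        const int high = randn();  // drawn in separate statements so the
        const int low = randn();   // order of the two calls is well defined
        const int x = n * high + low;
        if (x <= k * m - 1) {
            return x / k;  // each result 0..m-1 has exactly k preimages
        }
    }
}
```

With n = 3 and m = 4, k is 2: the draws 2, 2 give x = 8 and are rejected, and the draws 1, 2 give x = 5, which maps to 5 / 2 = 2.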
Assuming both n and m are positive integers, wouldn't the standard algorithm of scaling work?
return (int)((float)randn() * m / n);