Checking containment using integer programming - gurobi

For this question, a region is a subset of Z^d defined by finitely many linear inequalities with integer coefficients, where Z^d is the set of d-tuples of integers. For example, the set of pairs (x, y) of non-negative integers with 2x+3y >= 10 constitutes a region with d=2 (non-negativity just imposes the additional inequalities x>=0 and y>=0).
Question: is there a good way, using integer programming (or something else?), to check if one region is contained in a union of finitely many other regions?
I know one way to check containment, which I describe below, but I'm hoping someone may be able to offer some improvements, as it's not too efficient.
Here's the way I know to check containment. First, integer programming libraries can directly check if a region is empty: in integer programming terminology (as I understand it), emptiness of a region corresponds to infeasibility of a model. I have coded up something using the gurobi library to check emptiness, and it seems to work well in practice for the kind of regions I care about.
Suppose now that we want to check if a region X is contained in another region Y (a special case of the question). Let Z be the intersection of X with the complement of Y. Then X is contained in Y if and only if Z is empty. Now, Z itself is not a region in my sense of the word, but it is a union of regions Z_1, ..., Z_n, where n is the number of inequalities used to define Y. We can check if Z is empty by checking that each of Z_1, ..., Z_n is empty, and we can do this as described above.
The general case can be handled in exactly the same way: if Y is a finite union of regions Y_1, ..., Y_k then Z is still a finite union of regions Z_1, ..., Z_n, and so we just check that each Z_i is empty. If Y_i is defined by m_i inequalities then n = m_1 * m_2 * ... * m_k.
So to summarize, we can reduce the containment problem to the emptiness problem, which the library can solve directly. The issue is that we may have to solve a very large number of emptiness problems to solve containment (e.g., if each Y_i is defined by only two inequalities then n = 2^k grows exponentially with k), and so this may take a lot of time.
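The reduction described above can be sketched as follows. This is only an illustration: the names `negate`, `is_empty` and `contained_in_union` are hypothetical, and the emptiness oracle is a brute-force scan over a bounded box standing in for the ILP feasibility call (so the answer is only valid inside that box). The one detail worth noting is that, over the integers, the complement of a.x >= b is a.x <= b-1, which is again a non-strict inequality.

```python
from itertools import product

# An inequality (coeffs, bound) means sum(c_i * x_i) >= bound.
# A region is a list of such inequalities (their conjunction).

def negate(ineq):
    """Over the integers, not(a.x >= b) is a.x <= b-1, i.e. (-a).x >= 1-b."""
    coeffs, bound = ineq
    return (tuple(-c for c in coeffs), 1 - bound)

def is_empty(region, box):
    """Stand-in emptiness oracle (assumption): brute force over a bounded box.
    In practice this would be replaced by an ILP feasibility check."""
    for point in product(*(range(lo, hi + 1) for lo, hi in box)):
        if all(sum(c * x for c, x in zip(coeffs, point)) >= bound
               for coeffs, bound in region):
            return False
    return True

def contained_in_union(X, Ys, box):
    """X lies in Y_1 u ... u Y_k iff every Z_j is empty, where Z_j is X plus
    one negated inequality chosen from each Y_i (m_1 * ... * m_k choices)."""
    for choice in product(*Ys):
        Z = list(X) + [negate(ineq) for ineq in choice]
        if not is_empty(Z, box):
            return False
    return True

# Region from the question: x >= 0, y >= 0, 2x + 3y >= 10.
X = [((1, 0), 0), ((0, 1), 0), ((2, 3), 10)]
BOX = [(0, 10), (0, 10)]   # assumed search box for the brute-force oracle
Y1 = [((1, 0), 4)]         # x >= 4
Y2 = [((0, 1), 2)]         # y >= 2
print(contained_in_union(X, [Y1, Y2], BOX))
```

The outer loop stops at the first non-empty Z_j, so a failing containment is usually detected early; it is the successful case that pays for all n emptiness checks.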

You can't really expect a simple answer. Suppose that A is defined by all constraints of the form 0 <= x_i <= 1. A can be thought of as the collection of all possible rows of a truth table. Any logical clause such as x or (not y) or z can be expressed as a linear inequality such as x + (1-y) + z >= 1 (along with the 0-1 constraints). Using this approach, any Boolean formula in conjunctive normal form (CNF) can be expressed as a region in Z^n. If A is defined as above and B_1, B_2, ..., B_k is a list of regions corresponding to CNFs, then A is contained in the union of the B_i if and only if the disjunction of those CNFs is a tautology. But tautology-checking is a canonical example of a co-NP-complete problem.
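As a quick sanity check of the clause encoding, the following sketch enumerates all 0-1 assignments and confirms that the clause x or (not y) or z holds exactly when x + (1-y) + z >= 1. The function names are illustrative only.

```python
from itertools import product

def clause_holds(x, y, z):
    # The logical clause: x or (not y) or z.
    return bool(x or (not y) or z)

def inequality_holds(x, y, z):
    # The linear encoding of the same clause over 0-1 variables.
    return x + (1 - y) + z >= 1

# Compare the two on every row of the truth table.
agree = all(clause_holds(*p) == inequality_holds(*p)
            for p in product((0, 1), repeat=3))
print(agree)
```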
None of this is to say that it can't be usefully reduced to ILP (which itself is NP-complete). I don't see any direct way to do so, though I suspect that some of the techniques used to identify redundant constraints would be relevant.


Matrix Inverse in Visual Basic

I'm writing a program to apply the Newton-Raphson method to a system of equations in n variables, using a DataGridView. My problem is computing the inverse of the Jacobian matrix. I've searched the internet for a solution but haven't found one so far, so if someone can help me I would really appreciate it. Thanks in advance.
If you are asking for a recommendation of a library, that is explicitly off topic in Stack Overflow. However below I mention some algorithms that are commonly used; this may help you to find, or write, what you need. I would, though, not recommend writing something, unless you really want to, as it can be tricky to get these algorithms right. If you do decide to write something I'd recommend the QR method, as the easiest to write, though the theory is a little subtle.
First off do you really need to compute the inverse? If, for example, what you need to do is to compute
x = inv(J)*y
then it's faster and more accurate to treat this problem as
solve J*x = y for x
The methods below all factor J into other matrices, for which this solution can be done. A good package that implements the factorisation will also have the code to perform the solution.
If you really do need the inverse, often the best way is to solve, one column at a time,
J*K = I for K, where I is the identity matrix
LU decomposition
This may well be the fastest of the algorithms described here but is also the least accurate. An important point is that the algorithm must include (partial) pivoting, or it will not work on all invertible matrices; for example, without pivoting it fails on the matrix of a rotation through 90 degrees.
What you get is a factorisation of J into:
J = P*L*U
where P is a permutation matrix,
L lower triangular,
U upper triangular
So having factorised, to solve for x we do three steps, each straightforward, and each can be done in place (ie all the x's can be the same variable)
Solve P*x1 = y for x1
Solve L*x2 = x1 for x2
Solve U*x = x2 for x
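For illustration, here is a minimal sketch of the same idea in Python (not Visual Basic): Gaussian elimination with partial pivoting, which carries out the permutation, lower-triangular and upper-triangular steps on copies of J and y without storing P, L and U separately. The function names are hypothetical, and a tested library routine is preferable for real work. The second function builds the inverse one column at a time by solving J*K = I.

```python
def solve_with_pivoting(J, y):
    """Solve J*x = y by Gaussian elimination with partial pivoting."""
    n = len(J)
    A = [[float(v) for v in row] for row in J]   # working copy of J
    b = [float(v) for v in y]                    # working copy of y
    for k in range(n):
        # Partial pivoting: bring the largest entry of column k to the diagonal.
        p = max(range(k, n), key=lambda r: abs(A[r][k]))
        if A[p][k] == 0.0:
            raise ValueError("matrix is singular")
        A[k], A[p] = A[p], A[k]
        b[k], b[p] = b[p], b[k]
        # Eliminate below the pivot (the L step, applied to b as well).
        for r in range(k + 1, n):
            m = A[r][k] / A[k][k]
            for c in range(k, n):
                A[r][c] -= m * A[k][c]
            b[r] -= m * b[k]
    # Back substitution with the upper triangular factor (the U step).
    x = [0.0] * n
    for k in range(n - 1, -1, -1):
        s = sum(A[k][c] * x[c] for c in range(k + 1, n))
        x[k] = (b[k] - s) / A[k][k]
    return x

def inverse(J):
    """Inverse built one column at a time by solving J*K = I."""
    n = len(J)
    cols = [solve_with_pivoting(J, [1.0 if i == j else 0.0 for i in range(n)])
            for j in range(n)]
    return [[cols[j][i] for j in range(n)] for i in range(n)]

# A rotation through 90 degrees: has a zero in the first pivot position.
rotation = [[0, -1], [1, 0]]
print(solve_with_pivoting(rotation, [1, 2]))
```

This version refactors J on every call; a real LU routine would factor once and reuse the factors for each right-hand side.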
QR decomposition
This may be somewhat slower than LU but is more accurate. Conceptually this factorises J into
J = Q*R
Where Q is orthogonal and R upper triangular. However as it is usually implemented you in fact pass y as well as J to the routine, and it returns R (in J) and Q'*y (in the passed y), so to solve for x you just need to solve
R*x = y
which, given that R is upper triangular, is easy.
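A minimal sketch of a QR-based solve, again in Python for illustration. Note the swap: for brevity it uses modified Gram-Schmidt rather than the Householder reflections that library routines typically use, so it is less robust than a production implementation. The function name `qr_solve` is hypothetical. It forms R and Q'*y, then back-substitutes on R*x = Q'*y.

```python
import math

def qr_solve(J, y):
    """Solve J*x = y via QR (modified Gram-Schmidt), for square invertible J."""
    n = len(J)
    q_cols = []                          # orthonormal columns of Q
    R = [[0.0] * n for _ in range(n)]    # upper triangular factor
    for j in range(n):
        v = [float(J[i][j]) for i in range(n)]   # column j of J
        for k, q in enumerate(q_cols):
            # Remove the component of v along each earlier q.
            R[k][j] = sum(q[i] * v[i] for i in range(n))
            v = [v[i] - R[k][j] * q[i] for i in range(n)]
        R[j][j] = math.sqrt(sum(t * t for t in v))
        if R[j][j] == 0.0:
            raise ValueError("matrix is singular")
        q_cols.append([t / R[j][j] for t in v])
    # Q'*y, then back substitution on the triangular system R*x = Q'*y.
    qty = [sum(q[i] * y[i] for i in range(n)) for q in q_cols]
    x = [0.0] * n
    for k in range(n - 1, -1, -1):
        s = sum(R[k][c] * x[c] for c in range(k + 1, n))
        x[k] = (qty[k] - s) / R[k][k]
    return x

print(qr_solve([[2, 1], [1, 3]], [3, 5]))
```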
SVD (Singular value decomposition)
This is the most accurate, but also the slowest. Moreover unlike the others you can make progress even if J is singular (you can compute the 'generalised inverse' applied to y).
I recommend reading up on this, but advise against implementing it yourself.
Briefly you factorise J as
J = U*S*V'
where U and V are orthogonal and S diagonal.
There are, of course, many other ways of solving this problem. For example, if your matrices are very large (dimension in the thousands), it may be faster to use an iterative method, particularly if they are sparse (lots of zeroes).

Relaxation of linear constraints?

When we need to optimize a function on the positive real half-line, and we only have unconstrained optimization routines, we use y = exp(x) or y = x^2 to map from the whole real line, and optimize over the log or the (signed) square root of the variable instead.
Can we do something similar for linear constraints of the form Ax = b, where x is an n-dimensional vector, A is an (N, n)-shaped matrix, and b is a vector of length N defining the constraints?
While, as Ervin Kalvelaglan says, this is not always a good idea, here is one way to do it.
Suppose we take the SVD of A, getting
A = U*S*V'
where if A is n x m
U is nxn orthogonal,
S is nxm, zero off the main diagonal,
V is mxm orthogonal
Computing the SVD is not a trivial computation.
We first zero out the elements of S which we think are non-zero just due to noise -- which can be a slightly delicate thing to do.
Then we can find one solution x~ to
A*x = b
as
x~ = V*pinv(S)*U'*b
(where pinv(S) is the pseudo inverse of S, ie replace the non zero elements of the diagonal by their multiplicative inverses)
Note that x~ is a least squares solution to the constraints, so we need to check that it is close enough to being a real solution, ie that Ax~ is close enough to b -- another somewhat delicate thing. If x~ doesn't satisfy the constraints closely enough you should give up: if the constraints have no solution neither does the optimisation.
Any other solution to the constraints can be written
x = x~ + sum c[i]*V[i]
where the V[i] are the columns of V corresponding to entries of S that are (now) zero. Here the c[i] are arbitrary constants. So we can change variables to using the c[] in the optimisation, and the constraints will be automatically satisfied. However this change of variables could be somewhat irksome!
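To make the change of variables concrete, here is a tiny worked sketch for the single constraint x1 + x2 = 1, i.e. A = [[1, 1]] and b = [1]. The SVD pieces are worked out by hand rather than computed (an assumption of this example): x~ = (0.5, 0.5), and the one column of V with a zero singular value is (1, -1)/sqrt(2). The objective and the one-dimensional ternary search are illustrative stand-ins for whatever unconstrained routine is available.

```python
import math

# Hand-computed SVD pieces for A = [[1, 1]], b = [1] (assumed, not computed here).
s = math.sqrt(2.0)
x_tilde = [0.5, 0.5]          # V*pinv(S)*U'*b, the least squares solution
null_vec = [1 / s, -1 / s]    # column of V whose singular value is zero

def x_of(c):
    """Map the free variable c to a point satisfying the constraint."""
    return [x_tilde[i] + c * null_vec[i] for i in range(2)]

def objective(x):
    # Hypothetical objective: squared distance to the point (3, 1).
    return (x[0] - 3.0) ** 2 + (x[1] - 1.0) ** 2

def minimise_1d(f, lo, hi, iters=200):
    """Ternary search; valid here because the reduced objective is convex in c."""
    for _ in range(iters):
        m1 = lo + (hi - lo) / 3
        m2 = hi - (hi - lo) / 3
        if f(m1) < f(m2):
            hi = m2
        else:
            lo = m1
    return (lo + hi) / 2

# Unconstrained search over c; the constraint holds for every c by construction.
c_best = minimise_1d(lambda c: objective(x_of(c)), -10.0, 10.0)
x_best = x_of(c_best)
print(x_best, sum(x_best))
```

With more constraints there is one free coefficient c[i] per null-space column, and the search becomes an unconstrained problem in those coefficients.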

Generating Matched Pairs for Statistical Analysis

In my study, a person is represented as a pair of real numbers (x, y). x is on [30, 80] and y is on [60, 120]. There are two types of people, A and B. I have ~300 of each type. How can I generate the largest (or even a large) set of pairs of one person from A with one from B: ((xA, yA), (xB, yB)) such that each pair of points is close? Two points are close if abs(x1-x2) < dX and abs(y1 - y2) < dY. Similar constraints are acceptable. (That is, this constraint is roughly a Manhattan metric, but euclidean/etc is ok too.) Not all points need be used, but no point can be reused.
You're looking for the Hungarian Algorithm.
Suggested formulation: A are rows, B are columns, and each cell contains a distance metric between Ai and Bj, e.g. abs(X(Ai)-X(Bj)) + abs(Y(Ai)-Y(Bj)). (You can normalize the X and Y values to [0,1] if you want distances to be proportional to the range of each variable.)
Then use the Hungarian Algorithm to minimize matching weight.
You can filter out matches with distances over your threshold. If you're worried that this filtering might cause the approach to be sub-optimal, you could set distances over your threshold to a very high number.
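A compact sketch of this formulation in Python, using the standard O(n^3) potentials form of the Hungarian algorithm. The names `hungarian` and `match_pairs`, the penalty `BIG`, and the choice to scale each coordinate difference by dX and dY are assumptions made for the illustration; it also assumes len(A) <= len(B).

```python
def hungarian(cost):
    """Minimum-cost assignment of every row to a distinct column.
    Requires len(cost) <= len(cost[0]). Returns (row, col) pairs, 0-based."""
    n, m = len(cost), len(cost[0])
    INF = float("inf")
    u = [0.0] * (n + 1)      # row potentials
    v = [0.0] * (m + 1)      # column potentials
    p = [0] * (m + 1)        # p[j] = row matched to column j (0 = none)
    way = [0] * (m + 1)
    for i in range(1, n + 1):
        p[0] = i
        j0 = 0
        minv = [INF] * (m + 1)
        used = [False] * (m + 1)
        while True:
            used[j0] = True
            i0, delta, j1 = p[j0], INF, 0
            for j in range(1, m + 1):
                if not used[j]:
                    cur = cost[i0 - 1][j - 1] - u[i0] - v[j]
                    if cur < minv[j]:
                        minv[j], way[j] = cur, j0
                    if minv[j] < delta:
                        delta, j1 = minv[j], j
            for j in range(m + 1):
                if used[j]:
                    u[p[j]] += delta
                    v[j] -= delta
                else:
                    minv[j] -= delta
            j0 = j1
            if p[j0] == 0:
                break
        while j0 != 0:       # flip the augmenting path back to column 0
            j1 = way[j0]
            p[j0] = p[j1]
            j0 = j1
    return [(p[j] - 1, j - 1) for j in range(1, m + 1) if p[j] != 0]

def match_pairs(A, B, dX, dY):
    """Pairs (i, j) with A[i] close to B[j]; over-threshold cells get BIG."""
    BIG = 1e9   # assumed to dwarf the sum of all within-threshold costs
    cost = [[abs(xa - xb) / dX + abs(ya - yb) / dY
             if abs(xa - xb) < dX and abs(ya - yb) < dY else BIG
             for (xb, yb) in B] for (xa, ya) in A]
    return sorted((i, j) for i, j in hungarian(cost) if cost[i][j] < BIG)

A = [(30, 60), (50, 100)]
B = [(51, 101), (31, 61), (79, 119)]
print(match_pairs(A, B, dX=5, dY=5))
```

Because every over-threshold cell costs far more than any set of within-threshold cells, minimising the total weight first minimises the number of over-threshold assignments, so the pairs that survive the final filter form a largest possible matching.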
There are many implementations of this algorithm. A short search found one in any conceivable language, including VBA for Excel and some online solvers (not sure about matching 300x300 matrix with them, though)
Hungarian algorithm did it, thanks Etov.
Source code available here: http://www.filedropper.com/stackoverflow1

fmincon : impose vector greater than zero constraint

How do you impose a constraint that all values in a vector you are trying to optimize for are greater than zero, using fmincon()?
According to the documentation, I need some parameters A and b, where A*x ≤ b, but I think if I make A a vector of -1's and b 0, then I will have optimized for the sum of x>0, instead of each value of x greater than 0.
Just in case you need it, here is my code. I am trying to optimize over a vector (x) such that the (componentwise) product of x and a matrix (called multiplierMatrix) makes a matrix for which the sum of the columns is x.
function [sse] = myfun(x) % this is a nested function
bigMatrix = repmat(x,1,120) .* multiplierMatrix;
answer = sum(bigMatrix,1)';
sse = sum((expectedAnswer - answer).^2);
end
xGuess = ones(1:120,1);
[sse xVals] = fmincon(@myfun,xGuess,???);
Let me know if I need to explain my problem better. Thanks for your help in advance!
You can use the lower bound:
xGuess = ones(120,1);
lb = zeros(120,1);
[sse xVals] = fmincon(@myfun,xGuess, [],[],[],[], lb);
Note that xVals and sse should probably be swapped (if their names mean anything), since fmincon returns the solution first and the objective value second.
The lower bound lb means that elements in your decision variable x will never fall below the corresponding element in lb, which is what you are after here.
The empties ([]) indicate you're not using linear constraints (e.g., A,b, Aeq,beq), only the lower bounds lb.
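On the A*x ≤ b form raised in the question: a single row of -1's only constrains the sum, whereas one row per variable (A equal to minus the identity, b all zeros) constrains each element. The following small Python sketch, with hypothetical helper names and no MATLAB calls, checks that difference on a vector whose sum is positive but which has a negative entry.

```python
def satisfies(A, b, x):
    """True when every row of A*x <= b holds."""
    return all(sum(a * xi for a, xi in zip(row, x)) <= bi
               for row, bi in zip(A, b))

n = 3
single_row = [[-1] * n]                      # -(x1 + x2 + x3) <= 0
neg_identity = [[-1 if i == j else 0 for j in range(n)]
                for i in range(n)]           # -x_i <= 0 for each i

x_bad = [5, -1, 2]    # positive sum, but one negative entry

print(satisfies(single_row, [0], x_bad))         # the sum constraint is met
print(satisfies(neg_identity, [0] * n, x_bad))   # the elementwise one is not
```

The lower bound achieves the same elementwise effect more directly. Either way the constraint is x >= 0 rather than a strict x > 0.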
Some advice: fmincon is a pretty advanced function. You'd better memorize the documentation on it, and play with it for a few hours, using many different example problems.

approximating log10[x^k0 + k1]

Greetings. I'm trying to approximate the function
Log10[x^k0 + k1], where .21 < k0 < 21, 0 < k1 < ~2000, and x is integer < 2^14.
k0 & k1 are constant. For practical purposes, you can assume k0 = 2.12, k1 = 2660. The desired accuracy is 5*10^-4 relative error.
This function is virtually identical to Log[x], except near 0, where it differs a lot.
I have already come up with a SIMD implementation that is ~1.15x faster than a simple lookup table, but would like to improve it if possible, which I think is very hard due to the lack of efficient instructions.
My SIMD implementation uses 16bit fixed point arithmetic to evaluate a 3rd degree polynomial (I use least squares fit). The polynomial uses different coefficients for different input ranges. There are 8 ranges, and range i spans (64)2^i to (64)2^(i + 1).
The rationale behind this is that the derivatives of Log[x] drop rapidly with x, meaning a polynomial will fit it more accurately, since polynomials are an exact fit for functions whose derivatives are 0 beyond a certain order.
SIMD table lookups are done very efficiently with a single _mm_shuffle_epi8(). I use SSE's float to int conversion to get the exponent and significand used for the fixed point approximation. I also software pipelined the loop to get ~1.25x speedup, so further code optimizations are probably unlikely.
What I'm asking is if there's a more efficient approximation at a higher level?
For example:
Can this function be decomposed into functions with a limited domain like
log2((2^x) * significand) = x + log2(significand)
hence eliminating the need to deal with different ranges (table lookups). The main problem I think is adding the k1 term kills all those nice log properties that we know and love, making it not possible. Or is it?
Iterative method? don't think so because the Newton method for log[x] is already a complicated expression
Exploiting locality of neighboring pixels? If the 8 inputs all fall in the same approximation range, then I can look up a single coefficient set instead of looking up separate coefficients for each element. Thus, I can use this as a fast common case, and use a slower, general code path when it isn't. But for my data, the range needs to be ~2000 before this property holds 70% of the time, which doesn't seem to make this method competitive.
Please, give me some opinion, especially if you're an applied mathematician, even if you say it can't be done. Thanks.
You should be able to improve on least-squares fitting by using Chebyshev approximation. (The idea is, you're looking for the approximation whose worst-case deviation in a range is least; least-squares instead looks for the one whose summed squared difference is least.) I would guess this doesn't make a huge difference for your problem, but I'm not sure -- hopefully it could reduce the number of ranges you need to split into, somewhat.
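As a rough way to experiment with this, the sketch below interpolates the function at Chebyshev nodes on one of the question's ranges (64 to 128) with a cubic and reports the worst-case relative error. Interpolating at Chebyshev nodes gives a near-minimax polynomial, not the true minimax polynomial that a Remez-style algorithm would find, and everything here is in floating point rather than 16-bit fixed point. The function names are illustrative.

```python
import math

K0, K1 = 2.12, 2660.0   # the constants the question says can be assumed

def f(x):
    return math.log10(x ** K0 + K1)

def chebyshev_nodes(a, b, count):
    """Chebyshev points of the first kind, mapped from [-1, 1] to [a, b]."""
    return [0.5 * (a + b)
            + 0.5 * (b - a) * math.cos((2 * i + 1) * math.pi / (2 * count))
            for i in range(count)]

def lagrange_eval(nodes, values, x):
    """Evaluate the interpolating polynomial through (nodes, values) at x."""
    total = 0.0
    for i, xi in enumerate(nodes):
        term = values[i]
        for j, xj in enumerate(nodes):
            if j != i:
                term *= (x - xj) / (xi - xj)
        total += term
    return total

def max_relative_error(a, b, degree=3):
    """Worst relative error of the interpolant over the integers in [a, b]."""
    nodes = chebyshev_nodes(a, b, degree + 1)
    values = [f(t) for t in nodes]
    return max(abs(lagrange_eval(nodes, values, x) - f(x)) / f(x)
               for x in range(a, b + 1))

print(max_relative_error(64, 128))
```

Calling `max_relative_error` on wider ranges shows how far each range can be stretched before the 5*10^-4 target is missed, which is one way to see whether fewer ranges would do.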
If there's already a fast implementation of log(x), maybe compute P(x) * log(x) where P(x) is a polynomial chosen by Chebyshev approximation. (Instead of trying to do the whole function as a polynomial approx -- to need less range-reduction.)
I'm an amateur here -- just dipping my toe in as there aren't a lot of answers already.
One observation:
You can find an expression for how large x needs to be, as a function of k0 and k1, such that the term x^k0 dominates k1 enough for the approximation x^k0 + k1 ~= x^k0 to hold, allowing you to approximately evaluate the function as k0*Log(x).
This would take care of all x's above some value.
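The gap between the two expressions is exactly log10(1 + k1/x^k0), so the cut-off can also be found numerically. The sketch below scans downward from the top of the input range and returns the point above which k0*log10(x) stays within a given relative tolerance. The function name and the default x_max = 2^14 are assumptions taken from the ranges in the question.

```python
import math

def dominance_threshold(k0, k1, rel_tol, x_max=2 ** 14):
    """Smallest x0 such that k0*log10(x) is within rel_tol (relative error)
    of log10(x**k0 + k1) for every integer x with x0 <= x < x_max."""
    for x in range(x_max - 1, 1, -1):
        exact = math.log10(x ** k0 + k1)
        approx = k0 * math.log10(x)
        if abs(exact - approx) > rel_tol * exact:
            return x + 1     # x is the largest input that misses the tolerance
    return 2

print(dominance_threshold(2.12, 2660.0, 5e-4))
```

For the example constants this lands at a few hundred, so most of the 2^14 input range could use the plain logarithm and only the small inputs would need the polynomial path.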
I recently read how the sRGB model compresses physical tristimulus values into stored RGB values.
It is very similar to the function I am trying to approximate, except that it is defined piecewise:
k0 x, x < 0.0031308
k1 x^0.417 - k2 otherwise
I was told the constant addition in Log[x^k0 + k1] was to make the beginning of the function more linear. But that can easily be achieved with a piecewise approximation. That would make the approximation a lot more "uniform", with only 2 approximation ranges. This should be cheaper to compute, since there is no longer any need to compute an approximation range index (integer log) or to do a SIMD coefficient lookup.
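For reference, here is the sRGB encoding written out with its standard constants (12.92, 1.055, 0.055, and exponent 1/2.4, which is roughly the 0.417 above). This is a plain scalar sketch in Python, not the SIMD fixed-point form, and the function name is illustrative.

```python
def srgb_encode(linear):
    """Standard sRGB transfer function for a linear value in [0, 1]."""
    if linear <= 0.0031308:
        return 12.92 * linear                       # linear segment near zero
    return 1.055 * linear ** (1 / 2.4) - 0.055      # power-law segment

# The two pieces meet (almost exactly) at the breakpoint.
breakpoint_value = 0.0031308
linear_piece = 12.92 * breakpoint_value
power_piece = 1.055 * breakpoint_value ** (1 / 2.4) - 0.055
print(linear_piece, power_piece)
```

The only branch is a single comparison against the breakpoint, which in SIMD code would typically become a compare and blend rather than a per-element table lookup.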
For now, I conclude this will be the best approach, even though it doesn't approximate the function precisely. The hard part will be proposing this change and convincing people to use it.