In an ILP problem, is it possible to constrain/penalize the number of decision variables used? - optimization

I'm trying to set up minimization problems with restrictions on the number of decision variables used.
Is it possible to do this within a linear programming framework? Or am I forced to use a more sophisticated optimization framework?
Suppose all x's are non-negative integers:
x1, x2, x3, x4, x5 >= 0
1) Constraint: Is it possible to set up the problem so that no more than 3 of the x's can be non-zero? e.g. if
x1 = 1, x2 = 2, x3 = 3 then x4 = 0 and x5 = 0
2) Penalty: Suppose there are 3 possible solutions to the problem:
a) x1 = 1, x2 = 2, x3 = 3, x4 = 0, x5 = 0
b) x1 = 2, x2 = 3, x3 = 0, x4 = 0, x5 = 0
c) x1 = 3, x2 = 0, x3 = 0, x4 = 0, x5 = 0
For simplicity's sake, solution (c) is preferred over solution (b), which is preferred over solution (a); i.e. 'using' fewer decision variables is preferable.
In both questions I've simplified the problem down to 5 x's, but in reality I have hundreds of x's to optimise over.
I can see how I might do this in a general optimisation framework using indicator/delta variables, but can't figure out how to do it in linear programming. Any help would be appreciated!

You can build your own indicator variables (and unless your problem has some very special structure, you will need to).
Assuming there is an upper bound ub_i for each of your integer variables x1, x2, ..., xn, introduce binary variables u1, u2, ..., un and post new constraints like:
u1 * ub_1 >= x1
u2 * ub_2 >= x2
...
(the ub_i constants are often called big-M constants; we keep them as small as possible to get tighter LP relaxations)
Then your cardinality-constraint is simply:
sum(u) <= 3
Of course you can also use those u-variables in whatever penalty design you might want, e.g. adding a weighted sum(u) term to the objective to penalize the number of non-zero x's.
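As a concrete sketch, here is how this construction could look with SciPy's MILP interface (scipy.optimize.milp, available since SciPy 1.9). The cost vector, the covering constraint, and the bound ub = 10 are made-up toy data, not part of the question; only the linking and cardinality constraints are the technique described above:

```python
import numpy as np
from scipy.optimize import Bounds, LinearConstraint, milp

n = 5
ub = 10.0                                  # assumed upper bound on each x_i (the big-M)
c = np.array([1.0, 2.0, 3.0, 4.0, 5.0])   # toy objective coefficients

# variable vector: [x1..x5, u1..u5]
cost = np.concatenate([c, np.zeros(n)])

# linking constraints: x_i - ub * u_i <= 0, i.e. x_i > 0 forces u_i = 1
link = LinearConstraint(np.hstack([np.eye(n), -ub * np.eye(n)]), -np.inf, 0)

# cardinality constraint: sum(u) <= 3
card = LinearConstraint(np.concatenate([np.zeros(n), np.ones(n)]).reshape(1, -1),
                        -np.inf, 3)

# a toy covering constraint so the optimum is not trivially all-zero
demand = LinearConstraint(np.concatenate([np.ones(n), np.zeros(n)]).reshape(1, -1),
                          10, np.inf)

bounds = Bounds(np.zeros(2 * n),
                np.concatenate([ub * np.ones(n), np.ones(n)]))

res = milp(cost, constraints=[link, card, demand],
           integrality=np.ones(2 * n), bounds=bounds)
x, u = res.x[:n], res.x[n:]
```

In the optimum at most 3 of the x_i are non-zero. For the penalty variant, drop the cardinality constraint and instead put a positive weight on the u part of the cost vector.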

Related

Numerically stable calculation of invariant mass in particle physics?

In particle physics, we have to compute the invariant mass a lot. For a two-body decay it is given by
M**2 = m1**2 + m2**2 + 2 * (E1 * E2 - p1 · p2),  with  E_i = sqrt(|p_i|**2 + m_i**2)
When the momenta (p1, p2) are sometimes very large (up to a factor of 1000 or more) compared to the masses (m1, m2), there is large cancellation happening between the last two terms when the calculation is carried out with floating-point numbers on a computer.
What kind of numerical tricks can be used to compute this accurately for any inputs?
The question is about suitable numerical tricks to improve the accuracy of the calculation with floating point numbers, so the solution should be language-agnostic. For demonstration purposes, implementations in Python are preferred. Solutions which reformulate the problem and increase the amount of elementary operations are acceptable, but solutions which suggest to use other number types like decimal or multi-precision floating point numbers are not.
Note: The original question presented a simplified 1D problem in the form of a Python expression, but the question is about the general case where the momenta are given in three dimensions. The question was reformulated accordingly.
With a few tricks listed on Stack Overflow and the transformation described by Jakob Stark in his answer, it is possible to rewrite the equation into a form that no longer suffers from catastrophic cancellation.
The original question asked for a solution in 1D, which has a simple solution, but in practice we need the formula in 3D, where the solution is more complicated. See this notebook for a full derivation.
Example implementation of numerically stable calculation in 3D in Python:
import numpy as np

# numerically stable implementation
@np.vectorize  # vectorize so the scalar branch below also works on arrays
def msq2(px1, py1, pz1, px2, py2, pz2, m1, m2):
    p1_sq = px1 ** 2 + py1 ** 2 + pz1 ** 2
    p2_sq = px2 ** 2 + py2 ** 2 + pz2 ** 2
    m1_sq = m1 ** 2
    m2_sq = m2 ** 2
    x1 = m1_sq / p1_sq
    x2 = m2_sq / p2_sq
    x = x1 + x2 + x1 * x2
    a = angle(px1, py1, pz1, px2, py2, pz2)
    cos_a = np.cos(a)
    if cos_a >= 0:
        y1 = (x + np.sin(a) ** 2) / (np.sqrt(x + 1) + cos_a)
    else:
        y1 = -cos_a + np.sqrt(x + 1)
    y2 = 2 * np.sqrt(p1_sq * p2_sq)
    return m1_sq + m2_sq + y1 * y2

# numerically stable calculation of angle
def angle(x1, y1, z1, x2, y2, z2):
    # cross product (only its norm is used, so the sign convention does not matter)
    cx = y1 * z2 - y2 * z1
    cy = x1 * z2 - x2 * z1
    cz = x1 * y2 - x2 * y1
    # norm of cross product
    c = np.sqrt(cx * cx + cy * cy + cz * cz)
    # dot product
    d = x1 * x2 + y1 * y2 + z1 * z2
    return np.arctan2(c, d)
The numerically stable implementation can never produce a negative result, which is a commonly occurring problem with naive implementations, even in double precision.
Let's compare the numerically stable function with a naive implementation.
# naive implementation
def msq1(px1, py1, pz1, px2, py2, pz2, m1, m2):
    p1_sq = px1 ** 2 + py1 ** 2 + pz1 ** 2
    p2_sq = px2 ** 2 + py2 ** 2 + pz2 ** 2
    m1_sq = m1 ** 2
    m2_sq = m2 ** 2
    # energies of particles 1 and 2
    e1 = np.sqrt(p1_sq + m1_sq)
    e2 = np.sqrt(p2_sq + m2_sq)
    # dangerous cancellation in the third term
    return m1_sq + m2_sq + 2 * (e1 * e2 - (px1 * px2 + py1 * py2 + pz1 * pz2))
For the benchmark (plot not reproduced here), the momenta p1 and p2 are randomly drawn from 1 to 1e5, and the masses m1 and m2 from 1e-5 to 1e5. All implementations get the input values in single precision. The reference in both cases is calculated with mpmath, using the naive formula with 100 decimal places.
The naive implementation loses all accuracy for some inputs, while the numerically stable implementation does not.
If you put e.g. m1 = 1e-4, m2 = 1e-4, p1 = 1 and p2 = 1 into the expression, you get about 4e-8 in double precision but 0.0 in single precision. I assume that your question is about how to get the 4e-8 with a single-precision calculation as well.
What you can do is a Taylor expansion (around m1 = 0 and m2 = 0) of the expression above:
e ~ e|(m1=0,m2=0) + de/dm1|(m1=0,m2=0) * m1 + de/dm2|(m1=0,m2=0) * m2 + ...
If I calculated correctly, the zeroth- and first-order terms are 0, and the second-order expansion is
e ~ (p1+p2)/p1 * m1**2 + (p1+p2)/p2 * m2**2
This yields exactly 4e-8 even in a single-precision calculation. You can of course add more terms to the expansion if needed, until you hit the precision limit of single-precision floats.
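The claim is easy to check: the function below is just the second-order formula above, evaluated with every input and operation in single precision.

```python
import numpy as np

def msq_taylor(p1, p2, m1, m2):
    # second-order Taylor expansion in the masses (valid for m_i << p_i)
    return (p1 + p2) / p1 * m1 ** 2 + (p1 + p2) / p2 * m2 ** 2

# all inputs in single precision
val = msq_taylor(np.float32(1), np.float32(1), np.float32(1e-4), np.float32(1e-4))
print(val)  # ~4e-8, even though every operation was done in float32
```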
Edit
If the mi are not always much smaller than the pi, you can further massage the equation into (in the 1D case, with both momenta along the same axis)
e = m1**2 + m2**2 + 2*p1*p2 * [ sqrt(x + 1) - 1 ],  where  x = m1**2/p1**2 + m2**2/p2**2 + (m1**2 * m2**2)/(p1**2 * p2**2)
The complicated part is now the one in the square brackets: it is essentially sqrt(x+1) - 1 evaluated over a wide range of x values. If x is very small, we can use the Taylor expansion of the square root (e.g. like here). If x is larger, the formula works just fine, because the addition and subtraction of 1 no longer discard the value of x due to floating-point precision. So a threshold for x must be chosen, below which one switches to the Taylor expansion.

Mixed Integer Linear Programming for a Ranking Constraint

I am trying to write a mixed-integer linear programming formulation for a constraint related to the rank of a specific variable, as follows:
I have X1, X2, X3, X4 as decision variables.
There is a constraint asking to define i as a rank of X1 (For example, if X1 is the largest number amongst X1, X2, X3, X4, then i=1; if X1 is the second largest number then i=2, if X1 is the 3rd largest number then i=3, else i=4)
How could I express this constraint in a mixed-integer linear program?
Not so easy. Here is an attempt:
First introduce binary variables y(i) for i=2,3,4
Then we can write:
x(1) >= x(i) - (1-y(i))*M i=2,3,4
x(1) <= x(i) + y(i)*M i=2,3,4
rank = 4 - sum(i,y(i))
y(i) ∈ {0,1} i=2,3,4
Here M is a large enough constant (a good choice is the maximum range of the data). If your solver supports indicator constraints, you can simplify things a bit.
A small example illustrates that it works:
---- 36 VARIABLE x.L
i1 6.302, i2 8.478, i3 3.077, i4 6.992
---- 36 VARIABLE y.L
i3 1.000
---- 36 VARIABLE rank.L = 3.000
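With x fixed at the example values (and no ties), the y(i) and the rank the MILP would produce can be read off directly, which makes the logic easy to sanity-check in plain Python. The numbers below are the ones from the listing above:

```python
x = {1: 6.302, 2: 8.478, 3: 3.077, 4: 6.992}

# y(i) = 1 exactly when x(1) >= x(i); with no ties, the big-M
# constraints pin every y(i) down to this value
y = {i: int(x[1] >= x[i]) for i in (2, 3, 4)}
rank = 4 - sum(y.values())
print(y, rank)  # x1 beats only x3, so rank = 3, matching the listing
```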

Variable name to include value of another variable

Let's say I have pre-defined three variables, x1, x2, and x3, each of which is a different coordinate on the screen. I have a whole chunk of code that decides whether another variable, a, will equal 1, 2, or 3. Now I want to include the value of a in a variable name, letting me 'dynamically' switch between x1, x2, and x3.
E.g. a is set to 2. Now I want to move the mouse to xa; since a = 2, xa is x2, which is a predefined variable.
It's probably clear I'm very new to Lua. I've tried googling the issue, but I'm not really sure what I'm looking for, terminology-wise.
Anyhow, is anyone able to help me out?
If you can change the code where x1, x2 and x3 are defined, a cleaner approach is to use arrays (i.e. array-like tables). This is the general approach when you need a sequence of variables indexed by a number.
Therefore, instead of x1, x2 and x3 you could define:
local x = {}
x[1] = 10 -- instead of x1
x[2] = 20 -- instead of x2
x[3] = 30 -- instead of x3
Now instead of using xa you simply use x[a].
If the x1, x2, x3 are global variables, you can access them through the table _G like this:
x1 = 42
x2 = 43
x3 = 44
local a = 2
print(_G['x' .. a])
Output:
43

Formatting a txt file of equations into the same format and then manipulating them for linear algebra calculations in Python

I'm looking for a universal way of transforming equations in Python 3.2. I've only recently begun playing around with it and stumbled upon some of my old MATLAB homework. I'm able to calculate this in MATLAB, but pylab is still a bit of a mystery to me.
So, I have a text file of equations that I'm trying to convert into the form A x = b, and then solve some linear algebra problems associated with them in pylab.
The text file, "equations.txt", contains collections of linear equations in the following format:
-38 y1  +  35 y2  +  31 y3  = -3047
11 y1  + -13 y2  + -34 y3  = 784
34 y1  + -21 y2  +  19 y3  = 2949
etc.
The file contains four sets of equations, each set with a different number of variables. Each set is of the exact form shown (three example lines above), with one empty line between sets.
I want to write a program to read all the sets of equations in the files, convert sets of equations into a matrix equation A x = b, and solve the set of equations for the vector x.
My approach has been very "MATLABy", which is a problem because I want to be able to write a program that will solve for all of the variables.
I've tried reading a single equation as a text line, stripping the carriage return at the end, and splitting the line at the = sign: the second element of the split is the right-hand side of the equation, which goes into the vector b.
The first element of the split is the part from which you get the coefficients that go into the A matrix. If you split this on whitespace ' ', you will get a list like
['-38', 'y1', '+', '35', 'y2', '+', '31', 'y3']
Note that you can now pull every 3rd element (starting with the first) and get the coefficients that go into the matrix A.
Partial answers would be:
y1 = 90; c2 = 28; x4 = 41; z100 = 59
I'm trying to manipulate them to give me the sum of the entries of the solutions y1, ..., y3 for the first block of equations, the sum of the entries of the solutions c1, ..., c6 for the second block of equations, the sum of the entries of the solutions x1, ..., x13 for the third block of equations, and the sum of the entries of the solutions z1, ..., z100 for the fourth block of equations.
Like, I said - I'm able to do this in MATLAB but not in Python so I'm probably approaching this from the wrong way but this is what I have so far:
import pylab

f = open('equations.txt', 'r')
L = f.readlines()
list_final = []
for line in L:
    line_l = line.rstrip()
    list_l = line_l.split(";")
    list_l = filter(None, list_l)
    for expression in list_l:
and ending it with
f.close()
This was just my go at trying to format the equations to all look the same. I realise it's not a lot but I was really hoping someone could get my started because even though I know some python I normally don't use it for math because I have MATLAB for that.
I think this could be useful for many of us who have prior MATLAB experience but not pylab.
How would you go around this? Thank you!
For your example format, it's very easy to process with numpy.loadtxt():
import numpy as np

data = np.loadtxt("equations.txt", dtype=str)[:, ::3].astype(float)
a = data[:, :-1]
b = data[:, -1]
x = np.linalg.solve(a, b)
The steps are:
- read the file as an array of string tokens (loadtxt splits on whitespace),
- keep every 3rd column, which contains the coefficients and the right-hand sides,
- convert to float and split off the last column as the vector b,
- solve the square system with np.linalg.solve().
Note that np.loadtxt() requires every row to have the same number of columns, so this reads one set of equations at a time.
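Since the file actually contains four blocks separated by blank lines, the same trick can be applied block by block. A sketch that also computes the per-block sums the question asks for (shown here on the first block's text instead of the file, for self-containment):

```python
import numpy as np

def solve_blocks(text):
    """Split the file contents on blank lines, solve each block's
    A x = b, and return the sum of each solution vector."""
    sums = []
    for block in text.strip().split('\n\n'):
        rows = [line.split() for line in block.strip().split('\n')]
        # every 3rd token: the coefficients plus the right-hand side
        data = np.array(rows)[:, ::3].astype(float)
        a, b = data[:, :-1], data[:, -1]
        sums.append(np.linalg.solve(a, b).sum())
    return sums

text = """-38 y1  +  35 y2  +  31 y3  = -3047
11 y1  + -13 y2  + -34 y3  = 784
34 y1  + -21 y2  +  19 y3  = 2949"""
sums = solve_blocks(text)
print(sums)  # the first block solves to y = (90, 8, 3), so its sum is 101
```

To run it on the file, pass open('equations.txt').read() instead of the inline string.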
An alternative approach that is possibly more robust to unstructured input is to use a combination of the Python symbolic math package sympy and a few parsing tricks. It also copes with the variables being written in an arbitrary order within each equation.
Although sympy has some tools for parsing (your input is very close in appearance to Mathematica input), it appears that the sympy.parsing.mathematica module can't deal with some of it (particularly leading minus signs).
import sympy
from sympy.parsing.sympy_parser import parse_expr
import re
import numpy

def text_to_equations(text):
    lines = text.split('\n')
    lines = [line.split('=') for line in lines]
    eqns = []
    for lhs, rhs in lines:
        # clobber all the spaces
        lhs = lhs.replace(' ', '')
        # *assume* that a number followed by a letter is an
        # implicit multiplication
        lhs = re.sub(r'(\d)([a-z])', r'\g<1>*\g<2>', lhs)
        eqns.append((parse_expr(lhs), parse_expr(rhs)))
    return eqns

def get_all_symbols(eqns):
    symbs = set()
    for lhs, rhs in eqns:
        for sym in lhs.atoms(sympy.Symbol):
            symbs.add(sym)
    return symbs

def text_to_eqn_matrix(text):
    eqns = text_to_equations(text)
    # sort the symbols so the column order is deterministic
    symbs = sorted(get_all_symbols(eqns), key=str)
    n = len(eqns)   # number of equations (rows)
    m = len(symbs)  # number of symbols (columns)
    A = numpy.zeros((n, m))
    b = numpy.zeros((n, 1))
    for i, (lhs, rhs) in enumerate(eqns):
        d = lhs.as_coefficients_dict()
        b[i] = int(rhs)
        for j, s in enumerate(symbs):
            A[i, j] = d[s]
    x = sympy.Matrix([list(symbs)]).T
    return sympy.Matrix(A), x, sympy.Matrix(b)

s = '''-38 y1 + 35 y2 + 31 y3 = -3047
11 y1 + -13 y2 + -34 y3 = 784
34 y1 + -21 y2 + 19 y3 = 2949'''

A, x, b = text_to_eqn_matrix(s)
print(A)
print(x)
print(b)

Deriving equations for finite domain constraint system

The following system of equations and inequalities is to be solved for x1 and x2 over the integers.
x1 + x2 = l
x1 >= y1
x2 >= y2
x1 <= z1
x2 <= z2
l - z1 <= x2
l - z2 <= x1
l,y1,y2,z1,z2 are arbitrary but fixed and >= 0.
With the example values
l = 8
y1 = 1
y2 = 2
z1 = z2 = 6
I solve the system and get the following equations:
2 <= x1 <= 6
x2 = 8 - x1
When I tell WolframAlpha that it should solve it over the integers, it only outputs all possible values.
My question is whether I can derive equations/ranges for x1 and x2 for any given l,y1,y2,z1,z2 programmatically. This problem is related to constraint programming and I found an old paper about this problem: "Compiling Constraint Solving using Projection" by Harvey et al.
Is this approach used in any modern constraint solving libraries?
The reason I ask is that I need to solve systems like the one above several thousand times with different parameters, and this takes a long time if the whole system is read/optimized/solved from scratch each time. If I could compile my parameterized systems once and then just evaluate the compiled versions, I would expect a massive speed gain.
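For this particular two-variable system the "compiled" form can even be written down by hand: substituting x2 = l - x1 turns every bound on x2 into a bound on x1 (one step of Fourier-Motzkin elimination), which leaves a closed-form interval. A sketch:

```python
def x1_range(l, y1, y2, z1, z2):
    # x2 = l - x1, so x2 >= y2 gives x1 <= l - y2,
    # and x2 <= z2 gives x1 >= l - z2;
    # the constraints l - z1 <= x2 and l - z2 <= x1 are then
    # implied by x1 <= z1 and the bound above, respectively
    lo = max(y1, l - z2)
    hi = min(z1, l - y2)
    return lo, hi  # feasible iff lo <= hi; then x2 = l - x1

print(x1_range(8, 1, 2, 6, 6))  # (2, 6), matching the example 2 <= x1 <= 6
```

Evaluating this closed form thousands of times is essentially free compared to re-running a constraint solver on each parameter set.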