SCIP what is the function for sign? - optimization

I am new to SCIP and have read through some of the example problems and documentation, but am still unsure how to formulate the following problem for the SCIP solver:
argmax(w) sum(sign(Aw) == sign(b))
where A is a nxm matrix, w is a mx1 vector, and b is a nx1 vector. The data type is floats/real numbers, and it is a constraint-free problem.
Values for A and b are also contained row-wise in a .txt file. How can I import that?
Overall - I am new to SCIP and have no idea how to start creating variables (especially the objective function value parameter), importing data, formulate the objective function... It's a bit of a stretch for me to ask this question, but your help is appreciated!

This should work:
where beta(i) = sign(b(i)). The implication can be implemented using indicator constraints. This way we don't need big-M's.
Most likely the >= 0 constraint should be >= 0.0001 (otherwise we can set all w(j)=0).

Related

induction for SCIPopt's setppc

Regarding SCIP's "constraint handler for the set partitioning / packing / covering":
Is it smart enough to deduce all forms that it supports without me having to call the setppc functions directly?
Can it handle/detect forms of sum(x) == y where x is a list of binary variables and y is also a binary variable? Same question for less than or equal?
The docs for it state that it requires a right-hand-side equal to 1. What about RHS=0?
If I understand you correctly you are asking if SCIP will see that a linear constraint is a setppc constraint and automatically upgrade it? Yes.
Yes, it should not matter how you write it.
A sum of binary variables with rhs = 0 will just propagate and fix all variables to 0. (if only lhs is 0 then that is redundant)
If some of the coefficients are -1 instead of +1 SCIP will still try to make it work by negating all negative variables (or all positive ones and multiply by -1 afterwards). SCIP will check for any linear constraint that has only binary variables and +1/-1 coefficients if it can be upgraded in such a way.

How to use PySCIPOpt for feasibility-only problem

I have used CVXPY and some of its LP solvers to determine whether a solution to an A*x <= b problem is feasible, and now I would like to try PySCIPOpt. I could not find an example of this in the docs, and I'm having trouble figuring out the right syntax. With CVXPY the code is simply:
def do_cvxpy(A, b, solver):
x = cvxpy.Variable(A.shape[1])
constraints = [A#x <= b] #The # denotes matrix multiplication in CVXPY
obj = cvxpy.Minimize(0)
prob = cvxpy.Problem(obj, constraints)
prob.solve(solver=solver)
return prob.status
I think with PySCIPOpt one cannot use matrix notation as above, but must treat vectors and matrices as collections of scalar variables, each of which has to be added individually, so I tried this:
def do_scip(A, b):
model = Model("XYZ")
x = {}
for i in range(A.shape[1]):
x[i] = model.addVar(vtype="C", name="x(%s)" % i)
model.setObjective(0) #Is this right for a feasibility-only problem?
model.addCons(A*x <= b) #This is certainly the wrong syntax
model.optimize()
return model.getStatus()
Could anyone please help me out with the correct form for the constraint in addCons() for this kind of problem, and confirm that an acceptable way to ask whether a solution is feasible is to simply pass 0 as the objective?
I'm still not positive about the setObjective(0), but at least I can get the code to run without errors by "unpacking" the A matrix and the b vector and adding each element as a constraint:
for i in range(ncols):
for j in range(nrows):
model.addCons(A[j,i]*x[i] <= b[i])
I also discovered that CVXPY actually has an interface to SCIP, but it gives me an error when I try to use it:
getSolObjVal cannot only be called in stage SOLVING without a valid solution
which seems to suggest that the interface cannot be used for feasibility-only problems.

Quadratic (programming) Optimization : Multiply by scalar

I have two - likely simple - questions that are bothering me, both related to quadratic programming:
1). There are two "standard" forms of the objective function I have found, differing by multiplication of negative 1.
In the R package quadprog, the objective function to be minimized is given as −dTb+12bTDb and in Matlab the objective is given as dTb+12bTDb. How can these be the same? It seems that one has been multiplied through by a negative 1 (which as I understand it would change from a min problem to a max problem.
2). Related to the first question, in the case of using quadprog for minimizing least squares, in order to get the objective function to match the standard form, it is necessary to multiply the objective by a positive 2. Does multiplication by a positive number not change the solution?
EDIT: I had the wrong sign for the Matlab objective function.
Function f(b)=dTb is a linear function thus it is both convex and concave. From optimization standpoint it means you can maximize or minimize it. Nevertheless minimizer of −dTb+12bTDb will be different from dTb+12bTDb, because there is additional quadratic term. Matlab implementation will find the one with plus sign. So if you are using different optimization software you will need to change d→−d to get the same result.
The function −dTb+12bTDb where D is symmetric and convex and thus has unique minimum. In general that is called standard quadratic programming form, but that doesn't really matter. The other function dTb−12bTDb is concave function which has unique maximum. It is easy to show that for, say, bounded function f(x) from above the following holds:
argmaxxf=argminx−f
Using the identity above value b∗1 where −dTb+12bTDb achieves minimum is the same as the value b∗2 which achieves maximum at dTb−12bTDb, that is b∗1=b∗2.
Programmatically it doesn't matter if you are minimizing −dTb+12bTDb or maximizing the other one. These are implementation-dependent details.
No it does not. ∀α>0 if x∗=argmaxxf(x), then x∗=argmaxxαf(x). This can be showed by contradiction.

Errors to fit parameters of scipy.optimize

I use the scipy.optimize.minimize ( https://docs.scipy.org/doc/scipy/reference/tutorial/optimize.html ) function with method='L-BFGS-B.
An example of what it returns is here above:
fun: 32.372210618549758
hess_inv: <6x6 LbfgsInvHessProduct with dtype=float64>
jac: array([ -2.14583906e-04, 4.09272616e-04, -2.55795385e-05,
3.76587650e-05, 1.49213975e-04, -8.38440428e-05])
message: 'CONVERGENCE: REL_REDUCTION_OF_F_<=_FACTR*EPSMCH'
nfev: 420
nit: 51
status: 0
success: True
x: array([ 0.75739412, -0.0927572 , 0.11986434, 1.19911266, 0.27866406,
-0.03825225])
The x value correctly contains the fitted parameters. How do I compute the errors associated to those parameters?
TL;DR: You can actually place an upper bound on how precisely the minimization routine has found the optimal values of your parameters. See the snippet at the end of this answer that shows how to do it directly, without resorting to calling additional minimization routines.
The documentation for this method says
The iteration stops when (f^k - f^{k+1})/max{|f^k|,|f^{k+1}|,1} <= ftol.
Roughly speaking, the minimization stops when the value of the function f that you're minimizing is minimized to within ftol of the optimum. (This is a relative error if f is greater than 1, and absolute otherwise; for simplicity I'll assume it's an absolute error.) In more standard language, you'll probably think of your function f as a chi-squared value. So this roughly suggests that you would expect
Of course, just the fact that you're applying a minimization routine like this assumes that your function is well behaved, in the sense that it's reasonably smooth and the optimum being found is well approximated near the optimum by a quadratic function of the parameters xi:
where Δxi is the difference between the found value of parameter xi and its optimal value, and Hij is the Hessian matrix. A little (surprisingly nontrivial) linear algebra gets you to a pretty standard result for an estimate of the uncertainty in any quantity X that's a function of your parameters xi:
which lets us write
That's the most useful formula in general, but for the specific question here, we just have X = xi, so this simplifies to
Finally, to be totally explicit, let's say you've stored the optimization result in a variable called res. The inverse Hessian is available as res.hess_inv, which is a function that takes a vector and returns the product of the inverse Hessian with that vector. So, for example, we can display the optimized parameters along with the uncertainty estimates with a snippet like this:
ftol = 2.220446049250313e-09
tmp_i = np.zeros(len(res.x))
for i in range(len(res.x)):
tmp_i[i] = 1.0
hess_inv_i = res.hess_inv(tmp_i)[i]
uncertainty_i = np.sqrt(max(1, abs(res.fun)) * ftol * hess_inv_i)
tmp_i[i] = 0.0
print('x^{0} = {1:12.4e} ± {2:.1e}'.format(i, res.x[i], uncertainty_i))
Note that I've incorporated the max behavior from the documentation, assuming that f^k and f^{k+1} are basically just the same as the final output value, res.fun, which really ought to be a good approximation. Also, for small problems, you can just use np.diag(res.hess_inv.todense()) to get the full inverse and extract the diagonal all at once. But for large numbers of variables, I've found that to be a much slower option. Finally, I've added the default value of ftol, but if you change it in an argument to minimize, you would obviously need to change it here.
One approach to this common problem is to use scipy.optimize.leastsq after using minimize with 'L-BFGS-B' starting from the solution found with 'L-BFGS-B'. That is, leastsq will (normally) include and estimate of the 1-sigma errors as well as the solution.
Of course, that approach makes several assumption, including that leastsq can be used and may be appropriate for solving the problem. From a practical view, this requires the objective function return an array of residual values with at least as many elements as variables, not a cost function.
You may find lmfit (https://lmfit.github.io/lmfit-py/) useful here: It supports both 'L-BFGS-B' and 'leastsq' and gives a uniform wrapper around these and other minimization methods, so that you can use the same objective function for both methods (and specify how to convert the residual array into the cost function). In addition, parameter bounds can be used for both methods. This makes it very easy to first do a fit with 'L-BFGS-B' and then with 'leastsq', using the values from 'L-BFGS-B' as starting values.
Lmfit also provides methods to more explicitly explore confidence limits on parameter values in more detail, in case you suspect the simple but fast approach used by leastsq might be insufficient.
It really depends what you mean by "errors". There is no general answer to your question, because it depends on what you're fitting and what assumptions you're making.
The easiest case is one of the most common: when the function you are minimizing is a negative log-likelihood. In that case the inverse of the hessian matrix returned by the fit (hess_inv) is the covariance matrix describing the Gaussian approximation to the maximum likelihood.The parameter errors are the square root of the diagonal elements of the covariance matrix.
Beware that if you are fitting a different kind of function or are making different assumptions, then that doesn't apply.

Usage of scipy.optimize.fmin_slsqp for Integer design variable

I'm trying to use the scipy.optimize.slsqp for an industrial-related constrained optimization. A highly non-linear FE model is used to generate the objective and the constraint functions, and their derivatives/sensitivities.
The objective function is in the form:
obj=a number calculated from the FE model
A series of constraint functions are set, and most of them are in the form:
cons = real number i - real number j (calculated from the FE model)
I would like to try to restrict the design variables to integers as that would be what input into the plant machine.
Another consideration is to have a log file recording what design variable have been tried. if a set of design variable (integer) is already tried for, skip the calculation, perturb the design variable and try again. By limiting the design variable to integers, we are able to limit the number of trials (while leaving the design variable to real, a change in the e.g. 8th decimal point could be regarded as untried values).
I'm using SLSQP as it is one of the SQP method (please correct me if I am wrong), and the it is said to be powerful to deal with nonlinear problems. I understand the SLSQP algorithm is a gradient-based optimizer and there is no way I can implement the restriction of the design variables being integer in the algorithm coded in FORTRAN. So instead, I modified the slsqp.py file to the following (where it calls the python extension built from the FORTRAN algorithm):
slsqp(m, meq, x, xl, xu, fx, c, g, a, acc, majiter, mode, w, jw)
for i in range(len(x)):
x[i]=int(x[i])
The code stops at the 2nd iteration and output the following:
Optimization terminated successfully. (Exit mode 0)
Current function value: -1.286621577077517
Iterations: 7
Function evaluations: 0
Gradient evaluations: 0
However, one of the constraint function is violated (value at about -5.2, while the default convergence criterion of the optimization code = 10^-6).
Questions:
1. Since the FE model is highly nonlinear, I think it's safe to assume the objective and constraint functions will be highly nonlinear too (regardless of their mathematical form). Is that correct?
2. Based on the convergence criterion of the slsqp algorithm(please see below), one of which requires the sum of all constraint violations(absolute values) to be less than a very small value (10^-6), how could the optimization exit with successful termination message?
IF ((ABS(f-f0).LT.acc .OR. dnrm2_(n,s,1).LT.acc).AND. h3.LT.acc)
Any help or advice is appreciated. Thank you.