Practical solver for convex QCQP?

I am working with a convex QCQP of the following form:
min e'Ie
s.t. z'Iz = n
[some linear equalities and inequalities that involve the variables w, z, and e]
w >= 0, z in [0,1]^n
So the problem has only one quadratic constraint besides the objective, and some of the variables are nonnegative. The matrices of both quadratic forms are identity matrices and hence positive definite.
I can move the quadratic constraint into the objective, but it enters with a negative sign, so the problem becomes nonconvex:
min e'Ie - z'Iz
The problem can have up to 10,000 linear constraints, with about 100 nonnegative variables and almost the same number of other variables.
The problem can also be rewritten as an MIQP: the z_i can be declared binary, and the constraint z'Iz = n can then be dropped.
So far I have been working with CPLEX via AIMMS for the MIQP, and it is very slow for this problem. Using the QCQP version of the problem with CPLEX, MINOS, SNOPT, and CONOPT is hopeless: they either cannot find a solution at all, or the solution is not even close to an approximation that I know a priori.
Now I have three questions:
Do you know any method/technique to get rid of the quadratic constraint in this form without resorting to an MIQP?
Is there any "good" solver for this QCQP? By "good" I mean a solver that finds the global optimum in a reasonable time.
Do you think an SDP relaxation could be a solution to this problem? I have never solved an SDP problem in practice, so I do not know how efficient the SDP version would be. Any advice?
Thanks.

Related

Solving an optimization problem bounded by conditional constraints

Basically, I have a dataset that contains 'weights' for some (207) variables; some are more important than others for determining the (binary) class variable, and therefore their weights are bigger. In the end, the weights are summed across all columns so that a cumulative weight is obtained for each observation.
If this weight is higher than some number, then the class variable is 1; otherwise it is 0. I do have true labels for the class variable, so the problem is to minimize false positives.
The thing is, to me this looks like an OR problem, as it is about finding optimal weights. However, I am not sure whether there is an OR method for such a problem; at least I have not heard of one. The question is: does anyone recognize this type of problem and can you send some keywords for me to research?
Another option, of course, is to predict this with machine learning rather than deterministic methods, but I need to do it this way.
Thank you!
Are the variables discrete (integers, etc.) or continuous (floating-point numbers)?
If they are discrete, it sounds like the knapsack problem, which constraint solvers like OptaPlanner (see this training that builds a knapsack solver) excel at.
If they are continuous, look for an LP solver, like CPLEX.
Either way, you'll get much better results than with machine learning approaches, because neural nets et al. are great at pattern-recognition use cases (image/voice recognition, prediction, categorization, ...) but consistently inferior for constraint optimization problems (like this one, I presume).
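As a concrete illustration of the MIP route, here is a minimal sketch (not from the thread; the data matrix X, labels y, threshold t, and the big-M constant are hypothetical placeholders) that minimizes false positives as a big-M mixed-integer program in Python with PuLP:

    # Hypothetical sketch: choose nonnegative weights w so that the number of
    # false positives is minimized; fp[i] = 1 marks observation i as one.
    import pulp

    def fit_weights(X, y, t, big_m=1e4):
        n_obs, n_vars = len(X), len(X[0])
        prob = pulp.LpProblem("min_false_positives", pulp.LpMinimize)
        w = [pulp.LpVariable(f"w{j}", lowBound=0) for j in range(n_vars)]
        fp = [pulp.LpVariable(f"fp{i}", cat="Binary") for i in range(n_obs)]
        prob += pulp.lpSum(fp)  # objective: count of false positives
        for i in range(n_obs):
            score = pulp.lpSum(X[i][j] * w[j] for j in range(n_vars))
            if y[i] == 1:
                prob += score >= t  # true 1s must clear the threshold
            else:
                prob += score <= t - 1e-6 + big_m * fp[i]  # else fp[i] switches on
        prob.solve()
        return [v.value() for v in w]

Note that this sketch forbids false negatives outright; a fuller model would relax those as well, e.g. with a weighted sum of both error types.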

Why does IPOPT converge faster when using path constraints instead of variable bounds?

I am using GPOPS-II (commercial optimisation software, unfortunately) to solve an aircraft trajectory optimisation problem. GPOPS-II transcribes the problem to an NLP that is subsequently solved by IPOPT, an NLP solver.
When trying to solve my problem, I impose a bound on the altitude of the aircraft: an upper limit of 5500 m. I can do this in two ways. First, I can set a direct upper bound of 5500 m on the altitude state variable; doing this, IPOPT requires approximately 1000 iterations and 438 seconds to find an optimal solution.
Second, I can impose a path constraint of 5500 m on the altitude state variable while relaxing the direct bound on that variable to 5750 m. These two formulations are logically equivalent, but apparently not numerically: this time IPOPT takes only 150 iterations and 240 seconds to converge to the exact same optimal solution.
I already found a discussion where someone states that loosening the bounds of an NLP promotes faster convergence because of the nature of interior point methods. This seems logical to me: an interior point solver transforms the problem into a barrier problem in which the constraints are essentially converted into a cost that increases steeply towards the constraint boundaries. As a result, the solver will (initially) avoid the bounds of the problem (because of the growing penalty near the boundaries) and converge at a slower rate.
My questions are the following:
How do the mathematical formulations of bound constraints and path constraints differ in an interior point method?
Why doesn't setting the bound of the path constraint to 5500 m slow down convergence in the same way the variable bound does?
Thanks in advance!
P.S. The optimal solution lies near the altitude constraint boundary of 5500 m; in the optimal solution, the aircraft should reach h = 5500 m at the final time, and as a consequence it flies near this altitude for some time before t_f.
I found the answer to my first question in this post. I thought that IPOPT treated path constraints and bounds on variables equally. It turns out that "The only constraints that Ipopt is guaranteed to satisfy at all intermediate iterations are simple upper and lower bounds on variables. Any other linear or nonlinear equality or inequality constraint will not necessarily be satisfied until the solver has finished converging at the final iteration (if it can get to a point that satisfies termination conditions)."
So setting bounds on the variables gives hard bounds on the decision variables, whereas path constraints only give soft bounds.
This also partly answers my second question, in that the difference in convergence is explicable. However, with this knowledge, I would have expected the variable bounds to give faster convergence, not slower.
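For concreteness, here is a sketch of the standard interior-point reformulation that IPOPT is built on (my notation, not from the thread): a path constraint $g(x) \le g^U$ receives a slack variable and becomes an equality, while variable bounds enter the log-barrier directly,

    \min_{x,\,s} \; f(x) - \mu \sum_i \ln(x_i^U - x_i) - \mu \sum_j \ln(s_j) \quad \text{s.t.} \quad g(x) + s = g^U

The barrier terms (together with the fraction-to-boundary rule) keep every iterate strictly inside the variable bounds, whereas the equality $g(x) + s = g^U$ is only driven to feasibility as the iterates converge, so intermediate iterates may wander across the path constraint.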

Quadratic (programming) Optimization: Multiply by scalar

I have two - likely simple - questions that are bothering me, both related to quadratic programming:
1). There are two "standard" forms of the objective function I have found, differing by multiplication by negative one.
In the R package quadprog, the objective function to be minimized is given as $-d^T b + \frac{1}{2} b^T D b$, and in Matlab the objective is given as $d^T b + \frac{1}{2} b^T D b$. How can these be the same? It seems that one has been multiplied through by negative one (which, as I understand it, would turn a min problem into a max problem).
2). Related to the first question: when using quadprog to minimize least squares, in order to get the objective function to match the standard form it is necessary to multiply the objective by a positive 2. Does multiplication by a positive number not change the solution?
EDIT: I had the wrong sign for the Matlab objective function.
The function $f(b) = d^T b$ is linear, and thus both convex and concave; from an optimization standpoint this means you can either maximize or minimize it. Nevertheless, the minimizer of $-d^T b + \frac{1}{2} b^T D b$ will differ from that of $d^T b + \frac{1}{2} b^T D b$ because of the additional quadratic term. The Matlab implementation finds the one with the plus sign, so if you are using different optimization software you will need to substitute $d \to -d$ to get the same result.
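A quick way to see the $d \to -d$ rule is to compare the closed-form unconstrained minimizers (a toy check, not from the thread): setting the gradient to zero, $-d^T b + \frac{1}{2} b^T D b$ is minimized at $b = D^{-1} d$, while $e^T b + \frac{1}{2} b^T D b$ is minimized at $b = -D^{-1} e$.

    import numpy as np

    D = np.array([[4.0, 1.0], [1.0, 3.0]])  # symmetric positive definite
    d = np.array([1.0, 2.0])

    b_r_style = np.linalg.solve(D, d)  # minimizer of -d'b + (1/2) b'Db

    def plus_convention_minimizer(e):
        return np.linalg.solve(D, -e)  # minimizer of e'b + (1/2) b'Db

    # Feeding -d to the "+" convention reproduces the R-style minimizer:
    assert np.allclose(plus_convention_minimizer(-d), b_r_style)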
The function $-d^T b + \frac{1}{2} b^T D b$, where $D$ is symmetric positive definite, is convex and thus has a unique minimum. (This is generally called the standard quadratic programming form, but that doesn't really matter.) The other function, $d^T b - \frac{1}{2} b^T D b$, is concave and has a unique maximum. It is easy to show that for, say, a function $f(x)$ bounded from above, the following holds:

    \arg\max_x f = \arg\min_x (-f)

Using the identity above, the value $b_1^*$ at which $-d^T b + \frac{1}{2} b^T D b$ achieves its minimum is the same as the value $b_2^*$ at which $d^T b - \frac{1}{2} b^T D b$ achieves its maximum; that is, $b_1^* = b_2^*$.
Programmatically, it doesn't matter whether you are minimizing $-d^T b + \frac{1}{2} b^T D b$ or maximizing the other one; these are implementation-dependent details.
No, it does not. For any $\alpha > 0$, if $x^* = \arg\max_x f(x)$, then $x^* = \arg\max_x \alpha f(x)$. This can be shown by contradiction: if some $y$ satisfied $\alpha f(y) > \alpha f(x^*)$, then dividing by $\alpha > 0$ would give $f(y) > f(x^*)$, contradicting the optimality of $x^*$.

Difference of Convex Functions Optimization

I am looking for a method or idea to solve the following optimization problem:
min f(x)
s.t. g(x_i, y_i) <= f(x), i = 1, ..., n
where x and y are variables in R^n, f(x) is a convex function of x, and the g(x_i, y_i) are convex functions of (x_i, y_i).
This is a difference-of-convex (DC) optimization problem, due to the DC structure of the constraints. Since I am fairly new to DC programming, I would like to know the global optimality conditions of DC programs and the efficient and popular approaches for global optimization.
In my specific problem, it has already been verified that the necessary optimality condition is g(x_i^*, y_i^*) = f(x^*) for i = 1, ..., n.
Any ideas or solution would be appreciated, thanks.
For global methods, I would suggest looking into branch and bound, branch and cut, and cutting-plane methods. These can be notoriously slow, though, depending on the problem size: because the problem is nonconvex, it is difficult to get efficient algorithms for global optimization.
For local methods, look into the convex-concave procedure. Actually, any heuristic might work.
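To make the convex-concave procedure concrete for this problem (a sketch, assuming $f$ is differentiable; $x_k$ denotes the current iterate): write each constraint in DC form as $g(x_i, y_i) - f(x) \le 0$ and replace the concave part $-f(x)$ by its tangent at $x_k$, which yields the convex subproblem

    \min_{x,\,y} \; f(x) \quad \text{s.t.} \quad g(x_i, y_i) - f(x_k) - \nabla f(x_k)^T (x - x_k) \le 0, \quad i = 1, \dots, n

Because a concave function lies below its tangents, the linearized constraints are tighter than the originals, so every subproblem solution is feasible for the original problem; solving, re-linearizing at the new point, and repeating gives a monotone scheme that converges to a local solution only, consistent with the answer above.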

Constrained optimization with hessian in scipy

I want to minimize a function, subject to constraints (the variables are non-negative). I can compute the gradient and Hessian exactly. So I want something like:
result = scipy.optimize.minimize(objective, x0, jac=grad, hess=hess, bounds=bds)
I need to specify a method for the optimization (http://docs.scipy.org/doc/scipy/reference/generated/scipy.optimize.minimize.html). Unfortunately I can't seem to find a method that allows for both user-specified bounds and a Hessian!
This is particularly annoying because the methods "TNC" and "Newton-CG" seem essentially the same; however, TNC estimates the Hessian internally (in C code), while Newton-CG doesn't allow for constraints.
So, how can I do a constrained optimization with user-specified Hessian? Seems like there ought to be an easy option for this in scipy -- am I missing something?
I found a workaround for my problem: transform the constrained optimization into an unconstrained one.
In my case, since I have the constraint x > 0, I decided to optimize over log(x) instead of x. This was easy to do for my problem since I am using automatic differentiation.
Still, this seems like a somewhat unsatisfying solution -- I still think scipy should offer some constrained second-order minimization method.
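For reference, a minimal sketch of that log-transform (the helper names and the toy objective are mine, not from the thread): substitute x = exp(u) so that x > 0 holds automatically, and push the exact gradient and Hessian through the chain rule so Newton-CG can be used unconstrained.

    import numpy as np
    from scipy.optimize import minimize

    def make_unconstrained(f, grad, hess):
        # Chain rule for F(u) = f(exp(u)):
        #   dF/du_i       = f'_i(x) * x_i
        #   d2F/du_i du_j = f''_ij(x) * x_i * x_j + delta_ij * f'_i(x) * x_i
        def f_u(u):
            return f(np.exp(u))
        def grad_u(u):
            x = np.exp(u)
            return grad(x) * x
        def hess_u(u):
            x = np.exp(u)
            return hess(x) * np.outer(x, x) + np.diag(grad(x) * x)
        return f_u, grad_u, hess_u

    # Toy objective f(x) = sum((x - 2)^2), minimized over x > 0:
    f = lambda x: np.sum((x - 2.0) ** 2)
    g = lambda x: 2.0 * (x - 2.0)
    h = lambda x: 2.0 * np.eye(x.size)

    f_u, g_u, h_u = make_unconstrained(f, g, h)
    res = minimize(f_u, np.zeros(3), jac=g_u, hess=h_u, method="Newton-CG")
    print(np.exp(res.x))  # ~ [2, 2, 2]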
I just bumped into exactly this point myself. I think TNC applies an active set to the line search of the CG, not to the direction of the line search; conversely, the Hessian chooses the direction of the line. So one could maybe cut the line search out of Newton-CG and drop it into TNC. The problem is that when you are at the boundary, the Hessian might not take you out of it.
How about using TNC for an extremely sloppy first guess (give it a really large error bound to hit), then using Newton-CG with a small number of iterations, and checking: if on the boundary, go back to TNC; else continue with Newton-CG. Ugh...
Yes, or use log(x). I'm going to follow your lead.