AMPL IPOPT gives wrong optimal solution while solve result is "solved" - optimization

I am trying to solve a very simple optimization problem in AMPL with IPOPT, as follows:
var x1 >= 0 ;
minimize obj: -(x1^2)+x1;
Obviously the problem is unbounded, but IPOPT gives me:
******************************************************************************
This program contains Ipopt, a library for large-scale nonlinear optimization.
Ipopt is released as open source code under the Eclipse Public License (EPL).
For more information visit http://projects.coin-or.org/Ipopt
******************************************************************************
This is Ipopt version 3.12.4, running with linear solver mumps.
NOTE: Other linear solvers might be more efficient (see Ipopt documentation).
Number of nonzeros in equality constraint Jacobian...: 0
Number of nonzeros in inequality constraint Jacobian.: 0
Number of nonzeros in Lagrangian Hessian.............: 1
Total number of variables............................: 1
variables with only lower bounds: 1
variables with lower and upper bounds: 0
variables with only upper bounds: 0
Total number of equality constraints.................: 0
Total number of inequality constraints...............: 0
inequality constraints with only lower bounds: 0
inequality constraints with lower and upper bounds: 0
inequality constraints with only upper bounds: 0
iter objective inf_pr inf_du lg(mu) ||d|| lg(rg) alpha_du alpha_pr ls
0 9.8999902e-03 0.00e+00 2.00e-02 -1.0 0.00e+00 - 0.00e+00 0.00e+00 0
1 1.5346023e-04 0.00e+00 1.50e-09 -3.8 9.85e-03 - 1.00e+00 1.00e+00f 1
2 1.7888952e-06 0.00e+00 1.84e-11 -5.7 1.52e-04 - 1.00e+00 1.00e+00f 1
3 -7.5005506e-09 0.00e+00 2.51e-14 -8.6 1.80e-06 - 1.00e+00 1.00e+00f 1
Number of Iterations....: 3
(scaled) (unscaled)
Objective...............: -7.5005505996934397e-09 -7.5005505996934397e-09
Dual infeasibility......: 2.5091040356528538e-14 2.5091040356528538e-14
Constraint violation....: 0.0000000000000000e+00 0.0000000000000000e+00
Complementarity.........: 2.4994494940593761e-09 2.4994494940593761e-09
Overall NLP error.......: 2.4994494940593761e-09 2.4994494940593761e-09
Number of objective function evaluations = 4
Number of objective gradient evaluations = 4
Number of equality constraint evaluations = 0
Number of inequality constraint evaluations = 0
Number of equality constraint Jacobian evaluations = 0
Number of inequality constraint Jacobian evaluations = 0
Number of Lagrangian Hessian evaluations = 3
Total CPU secs in IPOPT (w/o function evaluations) = 0.001
Total CPU secs in NLP function evaluations = 0.000
EXIT: Optimal Solution Found.
Ipopt 3.12.4: Optimal Solution Found
suffix ipopt_zU_out OUT;
suffix ipopt_zL_out OUT;
ampl: display x1;
x1 = 0
When I change the solver to Gurobi, it gives this message:
Gurobi 6.5.0: unbounded; variable.unbdd returned.
which is what I expected.
I cannot understand why this happens, and now I don't know whether I need to check every problem I am trying to solve to make sure it is not converging to a wrong optimal solution. Since this is a super simple example, it seems a little strange.
I would appreciate it if anybody could help me with this.
Thanks

You've already identified the basic problem, but elaborating a little on why these two solvers give different results:
IPOPT is designed to cover a wide range of optimisation problems, so it uses fairly general numerical optimisation methods. I'm not familiar with the details of IPOPT, but this sort of approach usually relies on picking a starting point, looking at the curvature of the objective function in the neighbourhood of that point, and following it "downhill" until it finds a local optimum. Different starting points can lead to different results. In this case IPOPT is probably defaulting to zero for the starting point (the iteration-0 objective of about 9.9e-03 in your log is consistent with x1 starting near 0.01), so it's right on top of that local minimum. As Erwin suggested, if you specify a different starting point it might find the unboundedness.
Gurobi is designed specifically for quadratic/linear problems, so it uses very different methods which aren't susceptible to local-minimum issues, and it will probably be much more efficient for quadratics. But it doesn't support more general objective functions.
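To make the starting-point dependence concrete, here is a minimal sketch in Python. It uses plain projected gradient descent as a stand-in, which is an assumption for illustration and not IPOPT's interior-point algorithm; the function names, step size and iteration count are hypothetical. It only shows that, for this objective, a start near the bound slides to x1 = 0 while a start beyond the stationary point x1 = 0.5 keeps growing.

```python
def grad(x):
    # Derivative of the objective f(x) = -(x**2) + x
    return -2.0 * x + 1.0


def projected_descent(x0, step=0.1, iters=50, lower=0.0):
    """Projected gradient descent on x >= lower (illustrative, not IPOPT)."""
    x = x0
    for _ in range(iters):
        # Take a descent step, then clip back onto the bound x >= lower
        x = max(lower, x - step * grad(x))
    return x


near_zero = projected_descent(0.01)  # start just inside the bound
far_start = projected_descent(1.0)   # start past the stationary point x = 0.5

print(near_zero)  # stays at the bound: a local minimum
print(far_start)  # keeps increasing: the problem is unbounded in this direction
```

A local method can only report what it sees from where it starts, which is why trying a few different starting points is a cheap sanity check on non-convex problems.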

I think I understand why it happened. The objective function
-(x1^2)+x1;
is not convex, so the solution returned is only a local optimum.

Related

Getting "DUAL_INFEASIBLE" when solving a very simple linear programming problem

I am solving a simple LP problem using Gurobi with dual simplex and presolve. Gurobi reports that the model is unbounded, but I can't see why. Can anyone tell me what is going wrong?
I attached the log and also the content in the .mps file.
Thanks very much in advance.
Kind regards,
Hongyu.
The output log and .mps file:
Link to the .mps file: https://studntnu-my.sharepoint.com/:u:/g/personal/hongyuzh_ntnu_no/EV5CBhH2VshForCL-EtPvBUBiFT8uZZkv-DrPtjSFi8PGA?e=VHktwf
Gurobi Optimizer version 9.5.2 build v9.5.2rc0 (mac64[arm])
Thread count: 8 physical cores, 8 logical processors, using up to 8 threads
Optimize a model with 1 rows, 579 columns and 575 nonzeros
Coefficient statistics:
Matrix range [3e-02, 5e+01]
Objective range [7e-01, 5e+01]
Bounds range [0e+00, 0e+00]
RHS range [7e+03, 7e+03]
Iteration Objective Primal Inf. Dual Inf. Time
0 handle free variables 0s
Solved in 0 iterations and 0.00 seconds (0.00 work units)
Unbounded model
The easiest way to debug this is to put a bound on the objective, so the model is no longer unbounded. Then inspect the solution. This is a super easy trick that somehow few people know about.
When we do this with a bound of 100000, we see:
phi = 100000.0000
gamma[11] = -1887.4290
(the rest zero). Indeed we can make gamma[11] as negative as we want to obey R0. Note that gamma[11] is not in the objective.
More advice: It is also useful to write out the LP file of the model and study that carefully. You probably would have caught the error and that would have prevented this post.

SLSQP in ScipyOptimizeDriver only executes one iteration, takes a very long time, then exits

I'm trying to use SLSQP to optimise the angle of attack of an aerofoil to place the stagnation point in a desired location. This is purely as a test case to check that my method for calculating the partials for the stagnation position is valid.
When run with COBYLA, the optimisation converges to the correct alpha (6.04144912) after 47 iterations. When run with SLSQP, it completes one iteration, then hangs for a very long time (10, 20 minutes or more, I didn't time it exactly), and exits with an incorrect value. The output is:
Driver debug print for iter coord: rank0:ScipyOptimize_SLSQP|0
--------------------------------------------------------------
Design Vars
{'alpha': array([0.5])}
Nonlinear constraints
None
Linear constraints
None
Objectives
{'obj_cmp.obj': array([0.00023868])}
Driver debug print for iter coord: rank0:ScipyOptimize_SLSQP|1
--------------------------------------------------------------
Design Vars
{'alpha': array([0.5])}
Nonlinear constraints
None
Linear constraints
None
Objectives
{'obj_cmp.obj': array([0.00023868])}
Optimization terminated successfully. (Exit mode 0)
Current function value: 0.0002386835700364719
Iterations: 1
Function evaluations: 1
Gradient evaluations: 1
Optimization Complete
-----------------------------------
Finished optimisation
Why might SLSQP be misbehaving like this? As far as I can tell, there are no incorrect analytical derivatives when I look at check_partials().
The code is quite long, so I put it on Pastebin here:
core: https://pastebin.com/fKJpnWHp
inviscid: https://pastebin.com/7Cmac5GF
aerofoil coordinates (NACA64-012): https://pastebin.com/UZHXEsr6
You asked two questions whose answers ended up being unrelated to each other:
Why is the model so slow when you use SLSQP, but fast when you use COBYLA?
Why does SLSQP stop after one iteration?
1) Why is SLSQP so slow?
COBYLA is a gradient-free method. SLSQP uses gradients. So the safe bet was that the slowdown happened when SLSQP asked for the derivatives (which COBYLA never did).
That's where I went to look first. Computing derivatives happens in two steps: a) compute partials for each component and b) solve a linear system with those partials to compute totals. The slowdown has to be in one of those two steps.
Since you can run check_partials without too much trouble, step (a) is not likely to be the culprit. So that means step (b) is probably where we need to speed things up.
I ran the summary utility (openmdao summary core.py) on your model and saw this:
============== Problem Summary ============
Groups: 9
Components: 36
Max tree depth: 4
Design variables: 1 Total size: 1
Nonlinear Constraints: 0 Total size: 0
equality: 0 0
inequality: 0 0
Linear Constraints: 0 Total size: 0
equality: 0 0
inequality: 0 0
Objectives: 1 Total size: 1
Input variables: 87 Total size: 1661820
Output variables: 44 Total size: 1169614
Total connections: 87 Total transfer data size: 1661820
Then I generated an N2 of your model and saw this:
So we have an output vector that is 1169614 elements long, which means your linear system is a matrix of about 1e6 x 1e6. That's pretty big, and you are using a DirectSolver to try to compute and store a factorization of it. That's the source of the slowdown. DirectSolvers are great for smaller models (as a rule of thumb, the output vector should be less than 10000 elements). For larger ones you need to be more careful and use more advanced linear solvers.
In your case we can see from the N2 that there is no coupling anywhere in your model (nothing in the lower triangle of the N2). Purely feed-forward models like this can use a much simpler and faster LinearRunOnce solver (which is the default if you don't set anything else). So I turned off all DirectSolvers in your model, and the derivatives became effectively instant. Make your N2 look like this instead:
The choice of best linear solver is extremely model dependent. One factor to consider is computational cost, another is numerical robustness. This issue is covered in some detail in Section 5.3 of the OpenMDAO paper, and I won't cover everything here. But very briefly here is a summary of the key considerations.
When just starting out with OpenMDAO, using DirectSolver is both the simplest and usually the fastest option. It is simple because it does not require consideration of your model structure, and it's fast because for small models OpenMDAO can assemble the Jacobian into a dense or sparse matrix and provide that for direct factorization. However, for larger models (or models with very large vectors of outputs), the cost of computing the factorization is prohibitively high. In this case, you need to break the solver structure down more intentionally and use other linear solvers (sometimes in conjunction with the direct solver; see Section 5.3 of the OpenMDAO paper and this OpenMDAO doc).
You stated that you wanted to use the DirectSolver to take advantage of the sparse Jacobian storage. That was a good instinct, but the way OpenMDAO is structured this is not a problem either way. We are pretty far down in the weeds now, but since you asked I'll give a short summary explanation. As of OpenMDAO 3.7, only the DirectSolver requires an assembled Jacobian at all (and in fact, it is the linear solver itself that determines this for whatever system it is attached to). All other LinearSolvers work with a DictionaryJacobian (which stores each sub-jac keyed to the [of-var, wrt-var] pair). Each sub-jac can be stored as dense or sparse (depending on how you declared that particular partial derivative). The dictionary Jacobian is effectively a form of a sparse-matrix, though not a traditional one. The key takeaway here is that if you use the LinearRunOnce (or any other solver), then you are getting a memory efficient data storage regardless. It is only the DirectSolver that changes over to a more traditional assembly of an actual matrix object.
Regarding the issue of memory allocation, I borrowed this image from the OpenMDAO docs:
2) Why does SLSQP stop after one iteration?
Gradient-based optimizations are very sensitive to scaling. I plotted your objective function inside your allowed design space and got this:
So we can see that the minimum is at about 6 degrees, but the objective values are TINY (about 1e-4).
As a general rule of thumb, getting your objective to around order of magnitude 1 is a good idea (we have a scaling report feature that helps with this). I added a reference that was about the order of magnitude of your objective:
p.model.add_objective('obj', ref=1e-4)
Then I got a good result:
Optimization terminated successfully (Exit mode 0)
Current function value: [3.02197589e-11]
Iterations: 7
Function evaluations: 9
Gradient evaluations: 7
Optimization Complete
-----------------------------------
Finished optimization
alpha = [6.04143334]
time: 2.1188600063323975 seconds
Unfortunately, scaling is just hard with gradient-based optimization. Starting by scaling your objective/constraints to order 1 is a decent rule of thumb, but it's common to need further adjustments for more complex problems.
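As a rough illustration of why a tiny objective can make a gradient-based optimizer quit immediately, here is a toy sketch in Python. Everything in it is an assumption made for illustration: the quadratic objective, the constants SCALE and REF, and the stopping rule (stop when the squared gradient drops below an absolute tolerance). SLSQP's real termination test is more involved, but the effect of dividing the objective by a reference value is similar in spirit.

```python
def minimize(grad, x0, tol=1e-6, max_iter=200):
    """Unit-step steepest descent that stops when g*g < tol (toy stopping rule)."""
    x = x0
    for it in range(max_iter):
        g = grad(x)
        if g * g < tol:   # absolute test: blind to the objective's magnitude
            return x, it
        x -= g
    return x, max_iter


SCALE = 8e-6  # hypothetical: gives an objective of about 2.4e-4 at alpha = 0.5
REF = 1e-4    # reference value, playing the role of ref=1e-4


def raw_grad(alpha):
    # Gradient of the toy objective SCALE * (alpha - 6)**2
    return 2.0 * SCALE * (alpha - 6.0)


def scaled_grad(alpha):
    # Gradient of the same objective after dividing it by REF
    return raw_grad(alpha) / REF


alpha_raw, iters_raw = minimize(raw_grad, 0.5)
alpha_scaled, iters_scaled = minimize(scaled_grad, 0.5)

print(alpha_raw, iters_raw)        # stops at the starting point
print(alpha_scaled, iters_scaled)  # moves close to alpha = 6
```

In the unscaled run the gradient is already below the tolerance at the first check, so the loop declares success without moving, which mirrors the "Iterations: 1" behaviour in the question.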

How to display optimal variable values of a class-type Pyomo model?

I am a new Pyomo/Python user, and I am just wondering how to display the optimal variable values in a class-type Pyomo model.
I have just tried the standard example from Pyomo example library, the "min-cost-flow model". The code is available in https://github.com/Pyomo/PyomoGallery/wiki/Min-cost-flow
At the bottom of the code, it says:
sp = MinCostFlow('nodes.csv', 'arcs.csv')
sp.solve()
print('\n\n---------------------------')
print('Cost: ', sp.m.OBJ())
and the output is
Academic license - for non-commercial use only
Read LP format model from file.
Reading time = 0.00 seconds
x8: 7 rows, 8 columns, 16 nonzeros
No parameters matching 'mip_tolerances_integrality' found
No parameters matching 'mip_tolerances_mipgap' found
Optimize a model with 7 rows, 8 columns and 16 nonzeros
Coefficient statistics:
Matrix range [1e+00, 1e+00]
Objective range [1e+00, 5e+00]
Bounds range [0e+00, 0e+00]
RHS range [1e+00, 1e+00]
Presolve removed 7 rows and 8 columns
Presolve time: 0.00s
Presolve: All rows and columns removed
Iteration Objective Primal Inf. Dual Inf. Time
0 5.0000000e+00 0.000000e+00 0.000000e+00 0s
Solved in 0 iterations and 0.01 seconds
Optimal objective 5.000000000e+00
I can only get the optimal objective, but how about the optimal variable values? I have also searched the documentation, which told me to use something like:
print("x[2]=",pyo.value(model.x[2])).
But it did not work for class-type models like the min-cost-flow model.
I also tried to modify the function definition in the class:
    def solve(self):
        """Solve the model."""
        solver = pyomo.opt.SolverFactory('gurobi')
        results = solver.solve(self.m, tee=True, keepfiles=False, options_string="mip_tolerances_integrality=1e-9, mip_tolerances_mipgap=0")
        print('\n\n---------------------------')
        print('First Variable: ', self.m.Y[0])
But that did not work either. The output is:
KeyError: "Index '0' is not valid for indexed component 'Y'"
Can you help me with this? Thanks!
Gabriel
The most straightforward way to display the model results after solution is to use the model.display() function. In your case, self.m.display().
The display() function works on Var objects as well, so if you have a variable self.m.x, you could do self.m.x.display().

Linear programming for wait time optimization

I am trying to solve a problem using the simplex method. Although this is a mathematical problem, I need to solve it in a programming language (any language will do). I am stuck at the very first step: I don't know how to handle the modulus terms when coding the matrix Ax=B that the general simplex method works on.
Route   Departure   Runtime   Arrival        Wait time
A-B     x           4         MOD(x+4,24)    MOD(y-MOD(x+4,24),24)
B-C     y           6         MOD(y+6,24)    MOD(z-MOD(y+6,24),24)
C-D     z           8         MOD(z+8,24)    MOD(8-MOD(z+8,24),24)
The objective is to minimize the total wait time
subject to constraints
0<= x,y,z <= 24
Simplex is not specifically required, any method may be used.
edit -
This is part of a much bigger problem, so just assuming z = 0 and starting from there won't help. I need to solve the entire thing. I want to know how to deal with the modulus.
The expression
y = mod(x,24)
is not linear, so we cannot use it in a continuous LP (Linear Programming) model. However, it can be modeled in a Mixed Integer Program as
x = k*24 + y
k : integer variable
0 <= y <= 23.999
You'll need a MIP solver for this.
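To see what the decomposition x = k*24 + y means numerically, here is a small Python check. The helper name split_mod is hypothetical, and this is not a MIP model: it only computes, for a given x, the pair (k, y) that the integer variable k and the bounded variable y would have to take in a feasible solution.

```python
def split_mod(x, period=24):
    """Return (k, y) such that x == k * period + y, k integer, 0 <= y < period."""
    k, y = divmod(x, period)
    return int(k), y


# Example: depart at hour 22 with a 4-hour runtime, so x + 4 = 26
k, y = split_mod(22 + 4)
print(k, y)  # prints: 1 2

# The identity the MIP constraint enforces
assert 22 + 4 == k * 24 + y
```

In the MIP the solver chooses k itself. The bounds on y are what force k to be the number of whole days contained in x.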

Tensorflow: Tuning rules of quantization

Is there a way that the following process:
https://www.tensorflow.org/performance/quantization
And the call:
tf.contrib.quantize.create_eval_graph()
could be tuned in the way the following call can?
https://www.tensorflow.org/versions/master/api_docs/python/tf/quantize
I would like the weights to be scaled to 8 bits with symmetric ranges, an exact 0, and max/min being a power of 2, as in SCALED mode. For example, I would prefer -31 to 31 instead of -10 to 30. Even though -10 to 30 would give better resolution at 8 bits, an accurate 0, symmetry, and a power-of-2 range are more important for DSP devices.
TOCO (tf.lite.TocoConverter) does not so far have an option to control the quantization type, so it cannot give you the symmetric quantization you want in place of the asymmetric approach. However, the real value 0.0 is guaranteed to be represented exactly during quantization: 0.0 is mapped to a uint8 value q without any rounding error.
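The exact-zero property can be illustrated with a generic asymmetric (affine) quantizer in plain Python. This is a sketch under assumptions, not TOCO's actual implementation: the function names are hypothetical, and the scheme simply rounds the zero point to an integer code so that real 0.0 lands exactly on it. It assumes the range contains 0.

```python
def make_quantizer(rmin, rmax, levels=256):
    """Build quantize/dequantize functions for an asymmetric uint8-style scheme."""
    scale = (rmax - rmin) / (levels - 1)
    # Integer code that represents real 0.0; rounding it is what makes 0.0 exact
    zero_point = int(round(-rmin / scale))

    def quantize(x):
        q = int(round(x / scale)) + zero_point
        return max(0, min(levels - 1, q))  # clamp to the valid code range

    def dequantize(q):
        return (q - zero_point) * scale

    return quantize, dequantize, zero_point


# The asymmetric range from the question
quantize, dequantize, zero_point = make_quantizer(-10.0, 30.0)

q0 = quantize(0.0)
print(q0, dequantize(q0))  # the code for 0.0 and its exact round trip
```

With this scheme the representable range is still asymmetric. Only the exactness of 0.0 is preserved, which matches the behaviour described above but not the symmetric power-of-2 range the question asks for.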