Why can the reduced cost of a nonbasic variable be negative?

I am working on column generation with the IP solver CPLEX.
When the master problem was solved to optimality, I output the basis information and found a nonbasic variable that takes the value 1.0 and has a negative reduced cost, even though the status of the CPLEX model is optimal.
I can't really understand this phenomenon. Maybe it is because the solver can attach both a lower and an upper bound to a variable (e.g. [0, 1] in my experiment), so a nonbasic variable can sit at its upper bound and have a negative reduced cost. But I don't know how to prove it.
Any help would be greatly appreciated. Thank you very much!

That depends on whether you are maximizing or minimizing, and on whether the nonbasic variable sits at its lower bound or its upper bound. For a minimization problem, a nonbasic variable at its upper bound must have a non-positive reduced cost at optimality, while one at its lower bound must have a non-negative reduced cost; for maximization the signs are reversed.
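Below is a small NumPy sketch (an illustrative minimization LP invented for this answer, not your column-generation master) that computes reduced costs by hand and shows a variable that is nonbasic at its upper bound with a negative reduced cost at the optimum:

    import numpy as np

    # Minimize c'x subject to A x = b, with bounds 0 <= x1 <= 1, 0 <= x2 <= 1, s >= 0.
    # Variables: x1, x2 and a slack s for the constraint x1 + x2 <= 1.5.
    c = np.array([-1.0, -2.0, 0.0])
    A = np.array([[1.0, 1.0, 1.0]])
    b = np.array([1.5])

    # At the optimum, x1 = 0.5 is basic, x2 = 1 sits at its upper bound, s = 0 at its lower bound.
    basic = [0]
    B = A[:, basic]
    y = np.linalg.solve(B.T, c[basic])   # simplex multipliers (dual values)
    reduced = c - A.T @ y                # reduced costs of all variables

    print(reduced)                       # -> [ 0. -1.  1.]
    # x2 is nonbasic at its upper bound with reduced cost -1, yet the basis is optimal:
    # moving x2 down from its upper bound would increase the minimization objective by
    # 1 per unit, so there is no improving direction.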

Pseudo-inverse via singular value decomposition in numpy.linalg.lstsq

At the risk of using the wrong SE...
My question concerns the rcond parameter in numpy.linalg.lstsq(a, b, rcond). I know it is used to define the cutoff for small singular values in the singular value decomposition when NumPy computes the pseudo-inverse of a.
But why is it advantageous to set "small" singular values (values below the cutoff) to zero rather than just keeping them as small numbers?
PS: Admittedly, I don't know exactly how the pseudo-inverse is computed and exactly what role SVD plays in that.
Thanks!
To determine the rank of a system you need to compare singular values against zero, but as always with floating-point arithmetic you should compare against some finite tolerance instead: hitting exactly 0 essentially never happens when adding and subtracting messy numbers.
The (reciprocal) condition number allows you to specify such a limit in a way that is independent of units and scale in your matrix.
You may also opt out of this feature in lstsq by specifying rcond=None if you know for sure your problem can't run the risk of being under-determined.
The future default sounds like a wise choice from NumPy:
FutureWarning: rcond parameter will change to the default of machine precision times max(M, N) where M and N are the input matrix dimensions
because if you hit this limit, you are just fitting against rounding errors of floating point arithmetic instead of the "true" mathematical system that you hope to model.
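A small, self-contained NumPy experiment (the matrix and noise below are invented for illustration) shows what zeroing the small singular values actually buys you on a nearly rank-deficient problem:

    import numpy as np

    rng = np.random.default_rng(0)
    x = rng.normal(size=50)
    # The second column is almost, but not exactly, a multiple of the first,
    # which produces one tiny singular value.
    A = np.column_stack([x, x * (1 + 1e-10 * rng.normal(size=50)), np.ones(50)])
    b = 2 * x + 1 + 0.01 * rng.normal(size=50)

    for rcond in (None, 1e-8):
        coef, residuals, rank, sv = np.linalg.lstsq(A, b, rcond=rcond)
        print(rcond, rank, coef)

    # With the (future) default rcond the tiny singular value survives the cutoff, the
    # effective rank is 3, and the first two coefficients explode to huge values of
    # opposite sign: the 0.01 noise is being fitted through a ~1e-10 direction.
    # With rcond=1e-8 that singular value is treated as zero, the rank drops to 2, and
    # lstsq returns the well-behaved minimum-norm solution instead.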

Why does IPOPT converge faster when using path constraints instead of variable bounds?

I am using GPOPS-II (commercial optimisation software, unfortunately) to solve an aircraft trajectory optimisation problem. GPOPS-II transcribes the problem to an NLP that is subsequently solved by IPOPT, an NLP solver.
When trying to solve my problem, I impose a bound on the altitude of the aircraft. I am setting an upper limit of 5500 m on the altitude. Now, I can do this in two ways. First of all, I can set a direct upper bound on the state variable altitude of 5500 m. Doing this, IPOPT requires approximately 1000 iterations and 438 seconds until it finds an optimal solution.
Secondly, I can impose a path constraint on the state variable altitude of 5500 m. At the same time, I relax the direct bound on the state variable altitude to 5750 m. These two formulations are logically equivalent, but apparently not numerically: this time IPOPT takes only 150 iterations and 240 seconds to converge to the exact same optimal solution.
I already found a discussion where someone states that loosening the bounds of an NLP promotes faster convergence, because of the nature of interior point methods. This seems logical to me: an interior point solver transforms the problem into a barrier problem, in which the constraints are essentially converted into a cost that blows up as the iterates approach the constraint boundaries. As a result, the interior point solver will (initially) avoid the bounds of the problem and converge at a slower rate.
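As I understand it, for simple bounds the barrier subproblem looks roughly like this (a simplified sketch, not IPOPT's exact internal formulation), with the barrier parameter \mu driven to zero over the iterations:

    \min_x \; f(x) \;-\; \mu \sum_i \left[ \ln\!\left(x_i - x_i^L\right) + \ln\!\left(x_i^U - x_i\right) \right]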
My questions are the following:
How do the mathematical formulations of bound and of path constraints differ in an interior point method?
Why doesn't setting the bound of the path constraint to 5500 m slow down convergence in the same way the variable bound slows down convergence?
Thanks in advance!
P.S. The optimal solution lies near the constraint boundary of the altitude of 5500 m; in the optimal solution, the aircraft should reach h = 5500 m at the final time, and as a consequence it flies near this altitude for some time before t_f.
I found the answer to my first question in this post. I thought that IPOPT treated path constraints and bounds on variables equally. It turns out that "The only constraints that Ipopt is guaranteed to satisfy at all intermediate iterations are simple upper and lower bounds on variables. Any other linear or nonlinear equality or inequality constraint will not necessarily be satisfied until the solver has finished converging at the final iteration (if it can get to a point that satisfies termination conditions)."
So setting bounds on the variables gives a hard bound on the decision variables, whereas the path constraints give soft bounds.
This also partly answers my second question, in that the difference in convergence is explicable. However, with this knowledge, I'd expect setting bounds to result in faster convergence.
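To make the quoted distinction concrete, here is a rough sketch of how an interior-point method like IPOPT sees the two formulations (simplified; I am glossing over details such as the slight relaxation IPOPT applies to bounds):

    x \le x^U     \;\longrightarrow\;  \text{barrier term } -\mu \ln\!\left(x^U - x\right)
    g(x) \le g^U  \;\longrightarrow\;  g(x) + s = g^U,\; s \ge 0,\; \text{barrier term } -\mu \ln(s)

The bound enters the barrier directly, and the fraction-to-the-boundary rule keeps every iterate strictly inside it. For the path constraint only the slack bound s >= 0 is kept strictly feasible; the equality g(x) + s = g^U is driven to feasibility only as the solver converges, so g(x) itself may exceed g^U at intermediate iterates. That matches the hard-versus-soft distinction above.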

In GAMS, how to deal with divisions?

In my GAMS model, I have an objective function that involves a division.
GAMS sets the initial values to zero whenever it solves something...brilliant idea, how could that possibly ever go wrong!....oh wait, now there's division by zero.
What is the approach to handle this? I have tried manually setting lower bounds such that division by zero is avoided, but then GAMS spits out an "infeasible" solution.
That is wrong, since I know the model is feasible. In fact, removing the division term from my model and resolving does produce a solution. This solution ought to be feasible for the original problem as well, since the division only adds a term to the objective.
Here are some common approaches:
Set a lower bound, e.g. for Z =E= X/Y, add Y.LO = 0.0001;
Similarly, write something like Z =E= X/(Y+0.0001)
Set an initial value, e.g. Y.L = 1
Multiply both sides by Y: Z*Y =E= X (see the sketch below)
For any non-linear variable you should really think carefully about bounds and initial values (irrespective of division).
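Not GAMS, but here is the same auxiliary-variable reformulation (approach 4 above) sketched in Pyomo, with IPOPT assumed as the NLP solver and a tiny model invented purely for illustration:

    import pyomo.environ as pyo

    m = pyo.ConcreteModel()
    m.x = pyo.Var(bounds=(0, 10), initialize=1.0)
    m.y = pyo.Var(bounds=(1e-4, 10), initialize=1.0)   # lower bound keeps y away from 0
    m.z = pyo.Var(initialize=1.0)                      # auxiliary variable standing in for x / y

    # Instead of putting x / y into the objective, enforce z * y == x ...
    m.ratio = pyo.Constraint(expr=m.z * m.y == m.x)
    # ... and use z wherever the ratio is needed.
    m.obj = pyo.Objective(expr=m.z + (m.y - 2) ** 2, sense=pyo.minimize)

    # pyo.SolverFactory("ipopt").solve(m)   # requires an NLP solver to be installed

Because the division no longer appears in any expression, a zero starting point can never trigger a division-by-zero evaluation, and the lower bound on y plays the same role as Y.LO does in GAMS.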
Try using the $ sign, which restricts the assignment to records where the condition is nonzero. For example: A(i,j)$C(i,j) = B(i,j) / C(i,j)

Break if Newton's method is not convergent

I'm trying to implement Newton's method for polynomials to find a zero of a function. But I must handle the case where the function has no root. I'm wondering how I can detect the moment when the method becomes divergent and then stop the procedure.
Thank you in advance for any help
Generally, if the root is not found after 10 iterations, then the initial point was bad. To be safe, take 15 or 20 iterations. Or check after 5-10 iterations for quadratic convergence, measured by the function value decreasing from iteration to iteration faster than by a factor of 0.25.
Restart in a bad case with a different point.
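A minimal Python sketch of these safeguards (an iteration cap plus the residual-shrinkage check; the function name and thresholds are my own choices):

    def newton_with_divergence_check(f, df, x0, tol=1e-12, max_iter=20, shrink=0.25):
        x = x0
        fx = f(x)
        for k in range(max_iter):
            if abs(fx) <= tol:
                return x, True                   # converged
            dfx = df(x)
            if dfx == 0:
                return x, False                  # flat spot: Newton step undefined
            x = x - fx / dfx
            fx_new = f(x)
            # After a few warm-up iterations, demand fast (roughly quadratic) progress.
            if k >= 5 and abs(fx_new) > shrink * abs(fx):
                return x, False                  # residual not shrinking: treat as divergent
            fx = fx_new
        return x, False                          # iteration cap reached

    # Example: x^2 + 1 has no real root, so the check reports failure and the caller
    # can restart from a different initial point.
    root, ok = newton_with_divergence_check(lambda x: x * x + 1, lambda x: 2 * x, x0=0.7)
    print(ok)   # -> False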

LDPC behaviour as density of parity-check matrix increases

My assignment is to implement a Loopy Belief Propagation algorithm for a Low-Density Parity-Check (LDPC) code. Such a code uses a parity-check matrix H which is rather sparse (say, a 750-by-1000 binary matrix with an average of about 3 ones per column). The code to generate the parity-check matrix is taken from here
Anyway, one of the subtasks is to check the reliability of the LDPC code as the density of the matrix H increases. So I fix the channel at 0.5 capacity, fix my code rate at 0.35, and begin to increase the density of the matrix. As the average number of ones per column goes from 3 to 7 in steps of 1, disaster happens. With 3 or 4 the code copes perfectly well. With higher density it begins to fail: not only does it sometimes fail to converge, it often converges to the wrong codeword and produces errors.
So my question is: what type of behaviour is expected of an LDPC code as its sparse parity-check matrix becomes denser? Bonus question for skilled mind-readers: in my case (as the code performance degrades) is it more likely because the Loopy Belief Propagation algo has no guarantee on convergence or because I made a mistake implementing it?
After talking to my TA and other students I understand the following:
According to Shannon's theorem, the reliability of the code should increase with the density of the parity check matrix. That is simply because more checks are made.
However, since we use Loopy Belief Propagation, it struggles a lot when there are more and more edges in the Tanner graph, forming more and more short loops. Therefore, the actual performance degrades.
Whether or not I made a mistake in my code based solely on this behaviour cannot be established. However, since my code does work for sparse matrices, it is likely that the implementation is fine.
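To put a number on the "more loops" argument, here is a small NumPy sketch (the random construction below is my own stand-in, not the generator used in the assignment) that counts length-4 cycles in the Tanner graph as the column weight grows:

    import numpy as np

    rng = np.random.default_rng(1)
    m, n = 750, 1000

    def random_H(col_weight):
        """H with exactly col_weight ones per column, rows chosen uniformly at random."""
        H = np.zeros((m, n))
        for j in range(n):
            H[rng.choice(m, size=col_weight, replace=False), j] = 1
        return H

    def count_4_cycles(H):
        """Each pair of columns sharing k >= 2 rows contributes k*(k-1)/2 length-4 cycles."""
        overlap = H.T @ H             # overlap[i, j] = number of rows shared by columns i and j
        iu = np.triu_indices(n, k=1)  # each unordered column pair counted once
        ks = overlap[iu]
        return int(np.sum(ks * (ks - 1) / 2))

    for w in range(3, 8):
        print(w, count_4_cycles(random_H(w)))

Length-4 cycles are the shortest cycles a bipartite Tanner graph can contain, and they are exactly the structures that make the messages in loopy belief propagation strongly correlated. The count grows rapidly with the column weight, which is consistent with the degradation you observe even if the implementation is correct.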