Always enabling numeric instability correction with Gurobi - gurobi

We run large-scale optimization problems on regular-basis using Cvxpy+Gurobi.
Our optimization problem sometimes (~10% of the time) becomes numerically unstable.
Fortunately, the issue automatically resolves after re-running the optimization problem with setting of Gurobi's numeric instability correction parameter, i.e. NumericFocus=3.
We were curious on:
To avoid re-running it the second time, can we always by-default enable the numeric instability correction parameter NumericFocus=3?
Other than slightly higher runtime, is other any other downside also?

If you haven't done so already, please read Guidelines for Numerical Issues in the Gurobi Optimizer Reference Manual. In short, there is far more to numerical issues than just setting a magic parameter. If you are a commercial customer, you can contact Gurobi Support for specific guidance on your models.

Related

GUROBI only uses single core to setup problem with cvxpy (python)

I have a large MILP that I build with cvxpy and want to solve with GUROBI. When I give use the solve() function of cvxpy it take a really really really long time to setup and does not start solving for hours. Whilest doing that only 1 core of my cluster is being used. It is used for 100%. I would like to use multiple cores to build the model so that the process of building the model does not take so long. Running grbprobe also shows that gurobi knows about the other cores and for solving the problem it uses multiple cores.
I have tried to run with different flags i.e. turning presolve off and on or giving the number of Threads to be used (this seemed like i didn't even for the solving.
I also have reduce the number of constraints in the problem and it start solving much faster which means that this is definitively not a problem of the model itself.
The problem in it's normal state should have 2200 constraints i reduce it to 150 and it took a couple of seconds until it started to search for a solution.
The problem is that I don't see anything since it takes so long to get the ""set username parameters"" flag and I don't get any information on what the computer does in the mean time.
Is there a way to tell GUROBI or CVXPY that it can take more cpus for the build-up?
Is there another way to solve this problem?
Sorry. The first part of the solve (cvxpy model generation, setup, presolving, scaling, solving the root, preprocessing) is almost completely serial. The parallel part is when it really starts working on the branch-and-bound tree. For many problems, the parallel part is by far the most expensive, but not for all.
This is not only the case for Gurobi. Other high-end solvers have the same behavior.
There are options to do less presolving and preprocessing. That may get you earlier in the B&B. However, usually, it is better not to touch these options.
Running things with verbose=True may give you more information. If you have more detailed questions, you may want to share the log.

Gurobi resume optimization after model modification

As far as i know Gurobi resumes optimizing where it left after calling Model.Terminate() and then calling Model.Optimize() again. So I can terminate and get the best solution so far and then proceed.Now I want to do the same, but since I want to use parts of the suboptimal solution I need to set some variables to fixed values before I call Model.Optimize() again and optimize the rest of the model. How can i do this so that gurobi does not start all over again?
First, it sounds like you're describing a mixed-integer program (MIP); model modification is different for continuous optimization (linear programming, quadratic programming).
When you modify a MIP model, the tree information is no longer helpful. Instead, you must resolve the continuous (LP) relaxation and create a new branch-and-cut tree. However, the prior solution may still be used as a MIP start, which can reduce the solve time for the second model.
However, your method may be redundant with the RINS algorithm, which is an automatic feature of Gurobi MIP. You can control the behavior of RINS via the parameters RINS, SubMIPNodes and Heuristics.

Column generation is exact or heuristic algorithm?

I know that column generation gives an optimal solution and it can be used with other heuristics. But does that make it an exact algorithm? Thanks in advance.
Traditional CG operates on the relaxed problem. Although it finds the optimal LP solution, this may not translate directly into an optimal MIP solution. For some problems (e.g. 1d cutting stock) there is evidence this gap is small, and we just apply the set of columns found for the relaxed problem to a final MIP knowing this is a good solution but necessarily optimal. So it is a heuristic.
With some effort you can use column generation inside a branch-and-bound algorithm (this is called branch-and-price). This gives proven optimal solutions.
An exact algorithm means that the algorithm can solve the optimization problem globally i.e it has given the global optima.
Column generation technique is conventionally applied to relaxed LP problem and tries to optimize the relaxed LP problem by constantly improving the current solution with the help of dual multipliers. It gives an exact LP solution for the relaxed LP problem. But sometimes in real-world problems, the exact solution of the relaxed Lp problem is not feasible to use, it needs to be translated to an integer solution in order to use it. Now if the problem scale is small, then there are many exact MIP algorithms (such as Branch and Bound) which can solve it exactly and give an integer solution. But if the problem is large-scale, even the exact MIP algorithms can take longer runtimes, hence, we use some special/intelligent heuristics to lower the difficulty of the MIP problem.
Summary: Column generation is an exact technique for solving the relaxed LP problem, not the original IP problem.
First, strictly speaking, all algorithms are heuristic, including Simplex Method.
Second, I think Column generation is a heuristic algorithm, because it solves the LP relaxation of the master problem. It does not guarantee IP optimal. Actually CG does not always converge very well.

Synopsys: Repeated compiles produce different results. How to automate iterated compile?

I'm new to using Design Compiler. In the past, I've done mostly FPGA work. Right now, I'm using Synopsys to determine the minimum are necessary to represent some circuits (using the Nangate 45nm library). I'm not doing P&R right now; I'm just trying to determine transistor area.
My only optimization constraint is to minimize area. I've noticed that if I tell DC to compile more than one time in a row, it produces different (and usually smaller) results each time.
I've looked and looked and failed to see if this is mentioned in a manual or anywhere in any discussion. Is it meant to work this way?
This suggests that optimization is stopping earlier than it could, so it's not REALLY minimizing area. Any idea why?
Is there a way I can tell it to increase the effort and/or tell it to automatically iterate compiles so that it will converge on the smallest design?
I'm guessing that DC is expecting to meet timing constraints, but I've given it a purely combinatorial block and no timing constraint. Did they never consider the usage scenario when all you want to do is work out the minimum gate area for a combinatorial circuit?
On a pure combinatorial circuit you can use a set_max_delay constraint and DC will attempt to meet that.
For reduced area you can use -map_effort high or -map_effort ultra to get it to work harder.
DC is a funny beast, and the algorithms it uses change as processes advance and make certain activities more or less useful. A lot of pre-layout optimization is less useful since the whole situation can change once the gates are actually placed and routed.
I filed a support ticket with Synopsys. I was using a 2010 version of design compiler. Apparently, area optimization has been improved since then, and the 2014 version will minimize area in one compiler pass.

Consistant behaviour of float code with GCC

I do some numerical computing, and I have often had problems with floating points computations when using GCC. For my current purpose, I don't care too much about the real precision of the results, but I want this firm property:
no matter WHERE the SAME code is in my program, when it is run on the SAME inputs, I want it to give the SAME outputs.
How can I force GCC to do this? And specifically, what is the behavior of --fast-math, and the different -O optimizations?
I've heard that GCC might try to be clever, and sometimes load floats in registers, and sometime read them directly from memory, and that this might change the precision of the floats, resulting in a different output. How can I avoid this?
Again, I want :
my computations to be fast
my computations to be reliable (ie. same input -> same result)
I don't care that much about the precision for this particular code, so I can be fine with reduced precision if this brings reliability
could anyone tell me what is the way to go for this problem?
If your targets include x86 processors, using the switch that makes gcc use SSE2 instructions (instead of the historical stack-based ones) will make these run more like the others.
If your targets include PowerPC processors, using the switch that makes gcc not use the fmadd instruction (to replace a multiplication followed by an addition in the source code) will make these run more like the others.
Do not use --fast-math: this allows the compiler to take some shortcuts, and this will cause differences between architectures. Gcc is more standard-compliant, and therefore predictable, without this option.
Including your own math functions (exp, sin, ...) in your application instead of relying on those from the system's library can only help with predictability.
And lastly, even when the compiler does rigorously respect the standard (I mean C99 here), there may be some differences, because C99 allows intermediate results to be computed with a higher precision than required by the type of the expression. If you really want the program always to give the same results, write three-address code. Or, use only the maximum precision available for all computations, which would be double if you can avoid the historical x86 instructions. In any case do not use lower-precision floats in an attempt to improve predictability: the effect would be the opposite, as per the above clause in the standard.
I think that GCC is pretty well documented so I'm not going to reveal my own ignorance by trying to answer the parts of your question about its options and their effects. I would, though, make the general statement that when numeric precision and performance are concerned, it pays big dividends to read the manual. The clever people who work on GCC put a lot of effort into their documentation, reading it is rewarding (OK, it can be a trifle dull, but heck, it's a compiler manual not a bodice-ripper).
If it is important to you that you get identical-to-the-last-bit numeric results you'll have to concern yourself with more than just GCC and how you can control its behaviour. You'll need to lock down the libraries it calls, the hardware it runs on and probably a number of other factors I haven't thought of yet. In the worst (?) case you may even want to, and I've seen this done, write your own implementations of f-p maths to guarantee bit-identity across platforms. This is difficult, and therefore expensive, and leaves you possibly less certain of the correctness of your own code than of the code usd by GCC.
However, you write
I don't care that much about the precision for this particular code, so I can be fine with reduced precision if this brings reliability
which prompts the question to you -- why don't you simply use 5-decimal-digit precision as your standard of (reduced) precision ? It's what an awful lot of us in numerical computing do all the time; we ignore the finer aspects of numerical analysis since they are difficult, and costly in computation time, to circumvent. I'm thinking of things like interval arithmetic and high-precision maths. (OF course, if 5 is not right for you, choose another single-digit number.)
But the good news is that this is entirely justifiable: we're dealing with scientific data which, by its nature, comes with errors attached (of course we generally don't know what the errors are but that's another matter) so it's OK to disregard the last few digits in the decimal representation of, say, a 64-bit f-p number. Go right ahead and ignore a few more of them. Even better, it doesn't matter how many bits your f-p numbers have, you will always lose some precision doing numerical calculations on computers; adding more bits just pushes the errors back, both towards the least-significant-bits and towards the end of long-running computations.
The case you have to watch out for is where you have such a poor algorithm, or a poor implementation of an algorithm, that it loses lots of precision quickly. This usually shows up with any reasonable size of f-p number. Your test suite should have exposed this if it is a real problem for you.
To conclude: you have to deal with loss of precision in some way and it's not necessarily wrong to brush the finer details under the carpet.