Global random seed parameter in SCIP? - scip

How can I set a kind of global random seed on SCIP in order to obtain potentially different behaviors when solving a MIP? I'm looking for something like the Seed parameter in Gurobi or the CPXPARAM_RandomSeed parameter in CPLEX.
Looking at the SCIP documentation, I see the following parameters but they make reference to particular plugins or aspects of the algorithm, and there does not seem to be a "global" random seed:
randomization/permutationseed (for permuting problem)
randomization/lpseed (for simplex)
branching/random/seed (for the random branching rule)
branching/relpscost/startrandseed (for the relpscost branching rule)
heuristics/alns/seed (for bandit algorithms)
separating/zerohalf/initseed (for tie-breaking in cut selection)
I do see the randomization/randomseedshift parameter that is described as "global shift of all random seeds in the plugins and the LP random seed". Could this parameter be used to achieve a global effect?
Thanks!

The short answer is yes. The parameter randomization/randomseedshift affects all solver plugins that use randomization, and the LP.
The longer answer is that randomization of the solving process in SCIP can be achieved in three different basic ways:
changing the randomization/randomseedshift parameter, which affects the seed initialization of all plugins and the LP
changing the randomization/lpseed parameter for changing simplex randomization only
randomization/permutationseed for a permutation of the constraints and variables of the problems.
The permutation of the problem is the classic way of randomizing the solution process, however, it may obfuscate problem structure of the original input model.
SCIP also provides access to the individual seeds such as heuristics/alns/seed to modify only the behavior of a single plugin without affecting the rest.

Related

Algebraic/implicit loops handling by Gekko

I have got a specific question with regards to algebraic / implicit loops handling by Gekko.
I will give examples in the field of Chemical Engineering, as this is how I found the project and its other libraries.
For example, when it comes to multicomponent chemical equilibrium calculations, it is not possible to explicitly work out the equations, because the concentration of one specie may be present in many different equations.
I have been using other paid software in the past and it would automatically propose a resolution procedure based on how the system is solvable (by analyzing dependency and creating automatic algebraic loops).
My question would be:
Does Gekko do that automatically?
It is a little bit tricky because sometimes one needs to add tear variables and iterate from a good starting value.
I know this message may be a little bit abstract, but I am trying to decide which software to use for my work and this is a pragmatic bottle neck that I have happened to find.
Thanks in advance for your valuable insight.
Python Gekko uses a simultaneous solution strategy so that all units are solved together instead of sequentially. Therefore, tear variables are not needed but large flowsheet problems with recycle can be difficult to converge to a feasible solution. Below are three methods that are in Python Gekko to assist in efficient solutions and initialization.
Method 1: Intermediate Variables
Intermediate variables are useful to decrease the complexity of the model. In many models, the temporary variables outnumber the regular variables. This model reduction often aides the solver in finding a solution by reducing the problem size. Intermediate variables are declared with m.Intermediates() in Python Gekko. The intermediate variables may be defined in one section or in multiple declarations throughout the model. Intermediate variables are parsed sequentially, from top to bottom. To avoid inadvertent overwrites, intermediate variable can be defined once. In the case of intermediate variables, the order of declaration is critical. If an intermediate is used before the definition, an error reports that there is an uninitialized value. Here is additional information on Intermediates with an example problem.
Method 2: Lower Block Triangular Decomposition
For large problems that have trouble with initialization, there is a mode that is activated with the option m.options.COLDSTART=2. This mode performs a lower block triangular decomposition to automatically identify independent blocks that are then solved independently and sequentially.
This decomposition method for initialization is discussed in the PhD dissertation (chapter 2) of Mostafa Safdarnejad or also in Safdarnejad, S.M., Hedengren, J.D., Lewis, N.R., Haseltine, E., Initialization Strategies for Optimization of Dynamic Systems, Computers and Chemical Engineering, 2015, Vol. 78, pp. 39-50, DOI: 10.1016/j.compchemeng.2015.04.016.
Method 3: Automatic Model Reduction
Model reduction requires more pre-processing time but can help to significantly reduce the solver time. There is additional documentation on m.options.REDUCE.
Overall Strategy for Initialization
The overall strategy that we use for initializing hard problems, such as flowsheets with recycle, is shown in this flowchart.
Sometimes it does mean breaking recycles to get an initialized solution. Other times, the initialization strategies detailed above work well and no model rearrangement is necessary. The advantage of working with a simultaneous solution strategy is degree of freedom swapping such as downstream variables can be fixed and upstream variables calculated to meet that value.

Pyomo-IPOPT: solver falls into local minima, how to avoid that?

I am trying to solve an optimisation problem consisting in finding the global maximum of a high dimensional (10+) monotonic function (as in monotonic in every direction). The constraints are such that they cut the search space with planes.
I have coded the whole thing in pyomo and I am using the ipopt solver. In most cases, I am confident it converges successfully to the global optimal. But if I play a bit with the constraints I see that it sometimes converges to a local minima.
It looks like a exploration-exploitation trade-off.
I have looked into the options that can be passed to ipopt and the list is so long that I cannot understand which parameters to play with to help with the convergence to the global minima.
edit:
Two hints of a solution:
my variables used to be defined with very infinite bounds, e.g. bounds=(0,None) to move on the infinite half-line. I have enforced two finite bounds on them.
I am now using multiple starts with:
opt = SolverFactory('multistart')
results = opt.solve(self.model, solver='ipopt', strategy='midpoint_guess_and_bound')
So far this has made my happy with the convergence.
Sorry, IPOPT is a local solver. If you really want to find global solutions, you can use a global solver such as Baron, Couenne or Antigone. There is a trade-off: global solvers are slower and may not work for large problems.
Alternatively, you can help local solvers with a good initial point. Be aware that active set methods are often better in this respect than interior point methods. Sometimes multistart algorithms are used to prevent bad local optima: use a bunch of different starting points. Pyomo has some facilities to do this (see the documentation).

idea behind xgboost/lightgbm/catboost in comparison

I'm trying to decide, which one of the following I will use in practice for regression tasks: xgboost, lightgbm or catboost (python 3).
So, what are general idea behind each of them? Why should I choose one, but not another?
I'm not interested in very slight difference in the accuracy score like 0.781 vs 0.782. Result should be tenable, and my tool should be robust, convenient in use. The workhorse.
As I understand about these methods, Just how they are implemented is different, otherwise they have implemented GBM methods.
So you should just try to do some hyper parameter tuning.
Also, its good idea to read this paper:
catboost-vs-light-gbm-vs-xgboost
You cannot determine a priori which Tree algorithm (or any algorithm) will be automatically the best. This is because of the https://en.wikipedia.org/wiki/No_free_lunch_theorem
It's best to try them all out. You should also throw in Random Forest (RF) as another one to try.
I will say that http://CatBoost.ai (CB) does have one advantage over the others: if you have Categorical Variables, CB will most likely beat the others because it can handle categorical variables directly without One-Hot-Encoding.
You might try http://H2O.ai 's grid search which supports several algorithms (RF, XGBoost, GBM, Linear Regression) with Hypertuning of parameters to see which one works best. You can run this overnight. (CB is not included in H2O's grid search)

Gurobi resume optimization after model modification

As far as i know Gurobi resumes optimizing where it left after calling Model.Terminate() and then calling Model.Optimize() again. So I can terminate and get the best solution so far and then proceed.Now I want to do the same, but since I want to use parts of the suboptimal solution I need to set some variables to fixed values before I call Model.Optimize() again and optimize the rest of the model. How can i do this so that gurobi does not start all over again?
First, it sounds like you're describing a mixed-integer program (MIP); model modification is different for continuous optimization (linear programming, quadratic programming).
When you modify a MIP model, the tree information is no longer helpful. Instead, you must resolve the continuous (LP) relaxation and create a new branch-and-cut tree. However, the prior solution may still be used as a MIP start, which can reduce the solve time for the second model.
However, your method may be redundant with the RINS algorithm, which is an automatic feature of Gurobi MIP. You can control the behavior of RINS via the parameters RINS, SubMIPNodes and Heuristics.

Looking for ideas/references/keywords: adaptive-parameter-control of a search algorithm (online-learning)

I'm looking for ideas/experiences/references/keywords regarding an adaptive-parameter-control of search algorithm parameters (online-learning) in combinatorial-optimization.
A bit more detail:
I have a framework, which is responsible for optimizing a hard combinatorial-optimization-problem. This is done with the help of some "small heuristics" which are used in an iterative manner (large-neighborhood-search; ruin-and-recreate-approach). Every algorithm of these "small heuristics" is taking some external parameters, which are controlling the heuristic-logic in some extent (at the moment: just random values; some kind of noise; diversify the search).
Now i want to have a control-framework for choosing these parameters in a convergence-improving way, as general as possible, so that later additions of new heuristics are possible without changing the parameter-control.
There are at least two general decisions to make:
A: Choose the algorithm-pair (one destroy- and one rebuild-algorithm) which is used in the next iteration.
B: Choose the random parameters of the algorithms.
The only feedback is an evaluation-function of the new-found-solution. That leads me to the topic of reinforcement-learning. Is that the right direction?
Not really a learning-like-behavior, but the simplistic ideas at the moment are:
A: A roulette-wheel-selection according to some performance-value collected during the iterations (near past is more valued than older ones).
So if heuristic 1 did find all the new global best solutions -> high probability of choosing this one.
B: No idea yet. Maybe it's possible to use some non-uniform random values in the range (0,1) and i'm collecting some momentum of the changes.
So if heuristic 1 last time used alpha = 0.3 and found no new best solution, then used 0.6 and found a new best solution -> there is a momentum towards 1
-> next random value is likely to be bigger than 0.3. Possible problems: oscillation!
Things to remark:
- The parameters needed for good convergence of one specific algorithm can change dramatically -> maybe more diversify-operations needed at the beginning, more intensify-operations needed at the end.
- There is a possibility of good synergistic-effects in a specific pair of destroy-/rebuild-algorithm (sometimes called: coupled neighborhoods). How would one recognize something like that? Is that still in the reinforcement-learning-area?
- The different algorithms are controlled by a different number of parameters (some taking 1, some taking 3).
Any ideas, experiences, references (papers), keywords (ml-topics)?
If there are ideas regarding the decision of (b) in a offline-learning-manner. Don't hesitate to mention that.
Thanks for all your input.
Sascha
You have a set of parameter variables which you use to control your set of algorithms. Selection of your algorithms is just another variable.
One approach you might like to consider is to evolve your 'parameter space' using a genetic algorithm. In short, GA uses an analogue of the processes of natural selection to successively breed ever better solutions.
You will need to develop an encoding scheme to represent your parameter space as a string, and then create a large population of candidate solutions as your starting generation. The genetic algorithm itself takes the fittest solutions in your set and then applies various genetic operators to them (mutation, reproduction etc.) to breed a better set which then become the next generation.
The most difficult part of this process is developing an appropriate fitness function: something to quantitatively measure the quality of a given parameter space. Your search problem may be too complex to measure for each candidate in the population, so you will need a proxy model function which might be as hard to develop as the ideal solution itself.
Without understanding more of what you've written it's hard to see whether this approach is viable or not. GA is usually well suited to multi-variable optimisation problems like this, but it's not a silver bullet. For a reference start with Wikipedia.
This sounds like hyper heuristics which you're trying to do. Try looking for that keyword.
In Drools Planner (open source, java) I have support for tabu search and simulated annealing out the box.
I haven't implemented the ruin-and-recreate-approach (yet), but that should be easy, although I am not expecting better results. Challenge: Prove me wrong and fork it and add it and beat me in the examples.
Hyper heuristics are on my TODO list.