How to choose best parameters with a differential evolution algorithm - evolutionary-algorithm

For an assignment in class I need to optimize four 10-dimensional functions. When implementing differential evolution I noticed that all the functions needed different parameter settings. By playing around, I found that a high crossover rate combined with an F around 0.5 seemed to work fine.
However, on one function, the 10-dimensional Katsuura function, my differential evolution algorithm seems to fail. I tried a bunch of parameters but keep scoring 0.01 out of 10. Does differential evolution not work for certain objective functions?
I tried implementing PSO for this problem as well, but that failed too, so I suspect this function has certain properties that only certain algorithms can handle?
I based my DE on this article:
With kind regards,
Kees Til

If you look at the function, you will notice that it is pretty tough. It is usual for heuristics like DE and PSO to have problems with such tough functions.


Why does differential evolution work so well?

What is the idea behind the mutation in differential evolution and why should this kind of mutation perform well?
I do not see any good geometric reason behind it.
Could anyone point me to some technical explanation of this?
Like all evolutionary algorithms, DE uses a heuristic, so my explanation is going to be a bit hand-wavy. What DE is trying to do, like all evolutionary algorithms, is to do a random search that's not too random. DE's mutation operator first computes the vector between two random members of the population, then adds that vector to a third random member of the population. This works well because it uses the current population as a way of figuring out how large of a step to take, and in what direction. If the population is widely dispersed, then it's reasonable to take big steps; if it's tightly concentrated, then it's reasonable to take small steps.
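To make the mutation step described above concrete, here is a minimal Python sketch. It assumes the plain "rand/1" variant with a scale factor F; the function name `de_mutant` and its arguments are made up for illustration, and crossover and selection are left out.

```python
import random

def de_mutant(population, target_index, F=0.5, rng=random):
    """Build one mutant vector for the individual at target_index.

    Assumption: this is the basic rand/1 scheme, mutation step only.
    """
    # Pick three distinct members, none of which is the target itself.
    candidates = [i for i in range(len(population)) if i != target_index]
    a, b, c = rng.sample(candidates, 3)
    base, x1, x2 = population[a], population[b], population[c]
    # The difference of two members, scaled by F, is added to the third.
    return [base[d] + F * (x1[d] - x2[d]) for d in range(len(base))]
```

Because the step is a difference between two population members, it shrinks automatically as the population contracts: if every member sits inside a box of side s, each coordinate of the mutant moves at most F * s away from the base vector.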
There are many reasons DE works better than Goldberg's GA, but focusing on the variation operators I'd say that the biggest difference is that DE uses real-coded variables and GA uses binary encoding. When optimizing on a continuous space, binary encoding is not a good choice. This has been known since the early 1990s, and one of the first things to come out of the encounter between the primarily German Evolution Strategy community and the primarily American Genetic Algorithm community was Deb's Simulated Binary Crossover. This operator acts like the GA's crossover operator, but on real-coded variables.

Examples of apache math optimization

I have a simple optimization problem and am looking for java software for that.
The Apache math optimization software looks like just what I want, but I can't find documentation that suits my needs (that is, documentation that is useful to a beginner / non-maths professional!).
Does anyone know of a simple worked example?
In case it helps, the problem is that I want to find the max r where
r1 = s1 * m1
r2 = s2 * m2
and there are some constraints and formulas defining the relationships between the variables. The Excel Solver works fine for this problem. I got LPSolve working great, but this problem requires multiplying s and m, so I understand LPSolve can't help, as this makes the problem nonlinear.
I recently ported the derivative-free non-linear constrained optimization code COBYLA2 to Java. Since it does not explicitly rely on derivatives, the algorithm may require quite a few iterations for larger problems. Nonetheless, you are able to formulate your problem with both a non-linear objective function and (potentially) non-linear constraints.
You can read more about it and download the source code from here.
I am not aware of a simple Java-based NLP solver. (I did find an example of Quadratic programming (QP) in Apache Math Works, but it doesn't qualify since you asked for a non-math professional example.)
I have two suggestions for you to solve your non-linear program:
1. Excel's Solver does have the ability to tackle non-linear problems. (Don't use LPSOLVE.) In fact, NLP is the default mode in Solver.
Here are two links to using Excel to solve NLPs: Example 1 - Step by step Solver walk-through that covers NLP and
Example 2 - A General Neural network example in Excel
Also for Excel, I like Paul Jensen's (utexas) ORMM add-ins.
He has a module called Teach NLP. Chapter 10 of his book deals with NLP and is available from his site.
2. If you are going to be doing even some amount of data analysis, then I recommend investing a few hours to download and learn the basics of R.
R has numerous packages and libraries for optimization. optim() and nlme are relevant for solving non-linear programs.
Just for completeness, I mention SAS, MATLAB and CPLEX as other options. If you have access to any of these, they all do a very good job with solving non-linear programs.
Hope these pointers help.

Optimal optimization order

I am working on a system of optimisation problems. These tasks can be solved by a generic optimization across the whole state space. But some of my equations are independent of the rest of the system (imagine a Jacobian matrix with some blocks full of zeros), and I would like to exploit this: first optimize the joint equations, then take that solution as an input and finish by solving the independent components.
The rules that describe the relations between the tasks can be represented as a directed graph, but this graph contains cycles because of the joint equations, which means that I can't use a topological sort on it.
Does anyone have an idea of how to solve this kind of problem?
There are a couple of types of frameworks you can look into (instead of inventing it yourself), which might solve your problem. The question is a bit too abstract to tell which one suits your needs, so take a look at these:
Use a solver framework to solve this optimization and look through the search space. Take a look at Drools Planner, Gurobi, JGap, OpenTS, ...
Use a rules engine to apply the optimization changes. Take a look at Drools Expert, JESS, ...

Looking for ideas/references/keywords: adaptive-parameter-control of a search algorithm (online-learning)

I'm looking for ideas/experiences/references/keywords regarding an adaptive-parameter-control of search algorithm parameters (online-learning) in combinatorial-optimization.
A bit more detail:
I have a framework which is responsible for optimizing a hard combinatorial-optimization problem. This is done with the help of some "small heuristics" which are used in an iterative manner (large-neighborhood-search; ruin-and-recreate-approach). Each of these "small heuristics" takes some external parameters, which control the heuristic logic to some extent (at the moment: just random values; some kind of noise; diversify the search).
Now I want to have a control framework for choosing these parameters in a convergence-improving way, as general as possible, so that new heuristics can be added later without changing the parameter control.
There are at least two general decisions to make:
A: Choose the algorithm-pair (one destroy- and one rebuild-algorithm) which is used in the next iteration.
B: Choose the random parameters of the algorithms.
The only feedback is an evaluation-function of the new-found-solution. That leads me to the topic of reinforcement-learning. Is that the right direction?
Not really a learning-like-behavior, but the simplistic ideas at the moment are:
A: A roulette-wheel-selection according to some performance-value collected during the iterations (near past is more valued than older ones).
So if heuristic 1 did find all the new global best solutions -> high probability of choosing this one.
B: No idea yet. Maybe it's possible to use some non-uniform random values in the range (0,1) while collecting some momentum of the changes.
So if heuristic 1 last time used alpha = 0.3 and found no new best solution, then used 0.6 and found a new best solution -> there is a momentum towards 1
-> next random value is likely to be bigger than 0.3. Possible problems: oscillation!
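Idea (A) above, a roulette wheel where the near past counts more than older results, could be sketched roughly like this in Python. The class name `RouletteSelector`, the `decay` and `floor` parameters, and the use of an exponential moving average as the "performance-value" are all assumptions made for illustration, not part of the framework described in the question.

```python
import random

class RouletteSelector:
    """Pick an option with probability proportional to a recency-weighted score."""

    def __init__(self, names, decay=0.8, floor=0.05, rng=None):
        self.decay = decay    # assumed knob: how quickly old rewards fade
        self.floor = floor    # assumed knob: keeps every option selectable
        self.scores = {name: 1.0 for name in names}
        self.rng = rng or random.Random()

    def choose(self):
        names = list(self.scores)
        weights = [self.scores[n] + self.floor for n in names]
        # random.choices draws proportionally to the given weights.
        return self.rng.choices(names, weights=weights, k=1)[0]

    def update(self, name, reward):
        # Exponential moving average: recent rewards count more than old ones.
        self.scores[name] = (self.decay * self.scores[name]
                             + (1 - self.decay) * reward)
```

The names can be anything hashable, so a destroy/rebuild pair can be passed as a tuple and scored as one unit. The reward could be 1.0 when an iteration finds a new best solution and 0.0 otherwise.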
Things to remark:
- The parameters needed for good convergence of one specific algorithm can change dramatically -> maybe more diversify-operations needed at the beginning, more intensify-operations needed at the end.
- There is a possibility of good synergistic-effects in a specific pair of destroy-/rebuild-algorithm (sometimes called: coupled neighborhoods). How would one recognize something like that? Is that still in the reinforcement-learning-area?
- The different algorithms are controlled by a different number of parameters (some taking 1, some taking 3).
Any ideas, experiences, references (papers), keywords (ml-topics)?
If there are ideas regarding decision (B) in an offline-learning manner, don't hesitate to mention them.
Thanks for all your input.
You have a set of parameter variables which you use to control your set of algorithms. Selection of your algorithms is just another variable.
One approach you might like to consider is to evolve your 'parameter space' using a genetic algorithm. In short, GA uses an analogue of the processes of natural selection to successively breed ever better solutions.
You will need to develop an encoding scheme to represent your parameter space as a string, and then create a large population of candidate solutions as your starting generation. The genetic algorithm itself takes the fittest solutions in your set and then applies various genetic operators to them (mutation, reproduction etc.) to breed a better set which then become the next generation.
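As a rough illustration of that loop, here is a minimal Python sketch. It assumes the parameters are encoded as a list of real values in [0, 1) rather than as a string, and it uses truncation selection, uniform crossover and random-reset mutation. Those choices, the function name `evolve`, and the `fitness` callback are assumptions for the example, not a recommendation for the actual problem.

```python
import random

def evolve(fitness, n_params, pop_size=30, generations=50,
           mutation_rate=0.1, rng=None):
    """Evolve a list of n_params values in [0, 1) that maximizes fitness."""
    rng = rng or random.Random()
    pop = [[rng.random() for _ in range(n_params)] for _ in range(pop_size)]
    for _ in range(generations):
        # Selection: keep the fitter half as parents.
        pop.sort(key=fitness, reverse=True)
        parents = pop[: pop_size // 2]
        children = []
        while len(parents) + len(children) < pop_size:
            mum, dad = rng.sample(parents, 2)
            # Uniform crossover: each gene comes from one of the two parents.
            child = [rng.choice(pair) for pair in zip(mum, dad)]
            # Mutation: occasionally replace a gene with a fresh random value.
            child = [rng.random() if rng.random() < mutation_rate else g
                     for g in child]
            children.append(child)
        # Parents survive unchanged, so the best solution is never lost.
        pop = parents + children
    return max(pop, key=fitness)
```

For example, `evolve(lambda p: -sum((g - 0.3) ** 2 for g in p), n_params=2)` should return values close to 0.3 for both parameters.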
The most difficult part of this process is developing an appropriate fitness function: something to quantitatively measure the quality of a given parameter space. Your search problem may be too complex to measure for each candidate in the population, so you will need a proxy model function which might be as hard to develop as the ideal solution itself.
Without understanding more of what you've written it's hard to see whether this approach is viable or not. GA is usually well suited to multi-variable optimisation problems like this, but it's not a silver bullet. For a reference start with Wikipedia.
What you're trying to do sounds like hyper-heuristics. Try looking for that keyword.
In Drools Planner (open source, Java) I have support for tabu search and simulated annealing out of the box.
I haven't implemented the ruin-and-recreate-approach (yet), but that should be easy, although I am not expecting better results. Challenge: Prove me wrong and fork it and add it and beat me in the examples.
Hyper heuristics are on my TODO list.

looking for simulated annealing implementation in VB

Is anyone aware of a reasonably well documented example of simulated annealing in Visual Basic that I can examine and adapt?
This project looks pretty well documented: It's C# but contains only one important source file (TravellingSalesmanProblem.cs) so it's pretty easy to run it through a converter. Maybe:
MSDN magazine also had an interesting article on neural networks. As I understand simulated annealing, you can add it to other function estimation methods (like neural nets). So you could add simulated annealing to the MSDN VB code by shrinking the Momentum over time. The network starts 'hot' by backpropagating error with a large Momentum and slowly 'cools' by shrinking the Momentum and thus reducing the effect of output error in backpropagation.
I generally refer to "Numerical Recipes in C/C++" for all the pseudocode and adapt it to my own needs later. That is the best documentation/implementation you could find. Sometimes you could even find better algorithms or an alternative way of solving the problem (in case Newton-Raphson is not the way to go).
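For reference, the core loop of simulated annealing is short enough to port by hand. The sketch below is in Python rather than VB and is meant as readable pseudocode to adapt. The geometric cooling schedule, the parameter names and the `neighbour` callback are assumptions for the example and are not taken from any of the projects mentioned above.

```python
import math
import random

def simulated_annealing(cost, neighbour, start, t_start=1.0, t_end=1e-3,
                        cooling=0.995, rng=None):
    """Minimize cost, starting from start, with a geometric cooling schedule."""
    rng = rng or random.Random()
    current, current_cost = start, cost(start)
    best, best_cost = current, current_cost
    t = t_start
    while t > t_end:
        candidate = neighbour(current, rng)
        candidate_cost = cost(candidate)
        delta = candidate_cost - current_cost
        # Always accept improvements; accept a worse move with
        # probability exp(-delta / t), which shrinks as t cools.
        if delta <= 0 or rng.random() < math.exp(-delta / t):
            current, current_cost = candidate, candidate_cost
            if current_cost < best_cost:
                best, best_cost = current, current_cost
        t *= cooling
    return best, best_cost
```

A toy call would be `simulated_annealing(lambda x: (x - 3) ** 2, lambda x, rng: x + rng.uniform(-0.5, 0.5), 0.0)`, which should end up near x = 3. For a travelling-salesman problem, `neighbour` would instead swap or reverse part of the tour.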