Can I use this fitness function? - optimization

I am working on a project using a genetic algorithm, and I am trying to formulate a fitness function. My questions are:
What is the effect of the choice of fitness formula on a GA?
Is it possible to make the fitness function directly equal the number of violations (in the case of minimisation)?

What is the effect of fitness formula choice on a GA?
The fitness function plays a very important role in guiding a GA.
A good fitness function helps the GA explore the search space effectively and efficiently. A bad fitness function, on the other hand, can easily make the GA get trapped in a local optimum and lose its discovery power.
Unfortunately, every problem has its own fitness function.
For classification tasks, error measures (Euclidean, Manhattan...) are widely adopted. You can also use entropy-based approaches.
For optimization problems, you can use a crude model of the function you are investigating.
Extensive literature is available on the characteristics of fitness functions (e.g. {2}, {3}, {5}).
From an implementation point of view, some additional mechanisms have to be taken into consideration: linear scaling, sigma truncation, power scaling... (see {1}, {2}).
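For instance, here is a minimal sketch of linear scaling in Python (following the formulation in {2}): raw fitnesses are rescaled as f' = a*f + b so that the average fitness is preserved while the best individual gets C times the average (C is typically between 1.2 and 2).

def linear_scale(fitnesses, c_mult=2.0):
    # Preserve the average fitness; stretch the best to c_mult * average.
    f_avg = sum(fitnesses) / len(fitnesses)
    f_max = max(fitnesses)
    if f_max == f_avg:  # flat population: nothing to scale
        return list(fitnesses)
    a = (c_mult - 1.0) * f_avg / (f_max - f_avg)
    b = f_avg * (f_max - c_mult * f_avg) / (f_max - f_avg)
    # Clamp negatives that can appear when low fitnesses are stretched
    # below zero (sigma truncation is the more principled fix).
    return [max(a * f + b, 0.0) for f in fitnesses]

print(linear_scale([1.0, 2.0, 3.0, 10.0]))  # average stays 4.0, max becomes 8.0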
Also, the fitness function can be dynamic: changing during the evolution to help search-space exploration.
Is it possible to make the fitness function directly equal the number of violations (in the case of minimisation)?
Yes, it's possible, but you have to consider that it could be too coarse-grained a fitness function.
If the fitness function is too coarse (*), it doesn't have enough expressiveness to guide the search: the genetic algorithm will get stuck in local minima a lot more often and may never converge on a solution.
Ideally, a good fitness function should be able to tell you the best direction to go from a given point: if the fitness of a point is good, a subset of its neighborhood should be better.
So avoid large plateaus (broad flat regions that don't give a search direction and induce a random walk).
(*) On the other hand, a perfectly smooth fitness function could be a sign that you are using the wrong type of algorithm.
A naive example: you look for parameters a, b, c such that
g(x) = a * x / (b + c * sqrt(x))
is a good approximation of n given data points (x_i, y_i)
You could minimize this fitness function:
E1_i = 0 if g(x_i) == y_i, 1 otherwise
f1(a, b, c) = sum_i E1_i
and it could work, but the search isn't aimed. A better choice is:
E2_i = (y_i - g(x_i)) ^ 2
f2(a, b, c) = sum_i E2_i
Now you have a "search direction" and a greater probability of success.
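To make the contrast concrete, here is a small Python sketch of both fitness functions, with made-up sample data:

from math import sqrt

def g(x, a, b, c):
    return a * x / (b + c * sqrt(x))

def f1(a, b, c, points):
    # Coarse: counts violations, so most small parameter changes
    # don't change the score at all.
    return sum(0 if g(x, a, b, c) == y else 1 for x, y in points)

def f2(a, b, c, points):
    # Smooth: the squared error shrinks as we approach a solution,
    # giving the search a direction.
    return sum((y - g(x, a, b, c)) ** 2 for x, y in points)

points = [(1.0, 0.5), (4.0, 1.2), (9.0, 2.0)]  # made-up data points
print(f1(1.0, 1.0, 1.0, points))  # 2
print(f2(1.0, 1.0, 1.0, points))  # about 0.08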
Further details:
{1} Genetic Algorithms: What Fitness Scaling Is Optimal? by Vladik Kreinovich and Chris Quintana
{2} Genetic Algorithms in Search, Optimization and Machine Learning by David E. Goldberg (Addison-Wesley, 1989)
{3} The Royal Road for Genetic Algorithms: Fitness Landscapes and GA Performance by Melanie Mitchell, Stephanie Forrest, and John H. Holland
{4} Avoiding the Pitfalls of Noisy Fitness Functions with Genetic Algorithms by Fiacc Larkin and Conor Ryan (ISBN: 978-1-60558-325-9)
{5} Essentials of Metaheuristics by Sean Luke

Related

How can I order the basic solutions of a min cost flow problem according to their cost?

I was wondering if, given a min cost flow problem and an integer n, there is an efficient algorithm/package or mathematical method to obtain the set of the n best basic solutions of the min cost flow problem (instead of just the best one).
Not so easy. There were some special LP solvers that could do that (see: Ralph E. Steuer, Multiple Criteria Optimization: Theory, Computation, and Application, Wiley, 1986), but currently available LP solvers can't.
There is a way to encode a basis using binary variables:
b[i] = 1 if variable x[i] is basic, 0 if nonbasic
Using this, we can use "no-good cuts" or "solution pool" technology to get the k best bases. See: https://yetanothermathprogrammingconsultant.blogspot.com/2016/01/finding-all-optimal-lp-solutions.html. Note that not all solution pools can do the k best (CPLEX can't, Gurobi can). The no-good cuts work with any MIP solver.
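Here is a minimal sketch of the no-good-cut loop in Python with PuLP, on a toy 0-1 model (in the real setting the b[i] would be linked to the basic/nonbasic status of the LP variables):

import pulp

n, k_best = 4, 3
prob = pulp.LpProblem("k_best", pulp.LpMinimize)
b = [pulp.LpVariable(f"b{i}", cat="Binary") for i in range(n)]
prob += pulp.lpSum((i + 1) * b[i] for i in range(n))  # toy objective
prob += pulp.lpSum(b) >= 2                            # toy constraint

for k in range(k_best):
    prob.solve(pulp.PULP_CBC_CMD(msg=False))
    if pulp.LpStatus[prob.status] != "Optimal":
        break
    sol = [int(round(v.value())) for v in b]
    print(f"solution {k + 1}: {sol}, obj = {pulp.value(prob.objective)}")
    # No-good cut: forbid this exact 0-1 vector in later solves.
    prob += (pulp.lpSum(1 - b[i] for i in range(n) if sol[i] == 1)
             + pulp.lpSum(b[i] for i in range(n) if sol[i] == 0)) >= 1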
Update: a more recent reference is Craig A. Piercy and Ralph E. Steuer, Reducing wall-clock time for the computation of all efficient extreme points in multiple objective linear programming, European Journal of Operational Research, 2019, https://doi.org/10.1016/j.ejor.2019.02.042

Population size in Fast Messy Genetic Algorithm

I'm trying to implement the fast messy GA using the paper by Goldberg, Deb, Kargupta, and Harik: fmGA - Rapid Accurate Optimization of Difficult Problems using Fast Messy Genetic Algorithms.
I'm stuck with the formula for the initial population size that accounts for the building-block evaluation noise:
The sub-functions here are m = 10 order-3 (k = 3) deceptive functions:
l = 30, l' = 27, and B is the signal-to-noise ratio, i.e. the ratio of the fitness deviation to the difference between the best and second-best fitness values (30 - 28 = 2). The fitness deviation, according to the table above, is sqrt(155).
However, the paper says that using 10 order-3 subfunctions in this equation must give a population size of 3,331, but after substitution I can't reach that number, since I am not sure what the value of c(alpha) is.
Any help will be appreciated. Thank you
I think I've figured out what exactly c(alpha) is. At least a graph of it drawn against alpha looks exactly the same as the one in the paper. It seems that by "the square of the ordinate" they mean the square of the z-score found by the inverse normal distribution using alpha as the right-tail area. At first I was misled into thinking that after finding the z-score it should be substituted into the normal distribution equation to find the height (ordinate).
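In other words (a minimal sketch, assuming SciPy's inverse normal CDF):

from scipy.stats import norm

def c(alpha):
    # Square of the z-score whose right-tail area is alpha.
    return norm.ppf(1.0 - alpha) ** 2

for alpha in (0.1, 0.05, 0.01):
    print(alpha, c(alpha))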
There is an implementation in Lua here https://github.com/xenomeno/GA-Messy for the interested folks. However, the fast messy GA has some problems reproducing the figures from Goldberg's original paper, which I am not sure how to fix, but that is another matter.

Genetic algorithm - find max of minimized subsets

I have a combinatorial optimization problem for which I have a genetic algorithm to approximate the global minimum.
Given X elements, find: min f(X)
Now I want to expand the search over all possible subsets of X and find the one subset whose global minimum is maximal compared to all other subsets.
For X* a subset of X, find: max min f(X*)
The example plot shows all solutions for three subsets (one color for each). The black dot indicates the highest value of the three global minima.
[image: solutions over three subsets]
The main problem is that evaluating the fitness between subsets works against the convergence of the solution within a subset. Furthermore, that solution is actually a local minimum.
How can this problem be described in general terms? I couldn't find a similar problem in the literature so far. For example, is it solvable with a multi-objective genetic algorithm?
Any hint is much appreciated.
While it may not always provide exactly the highest minima (or lowest maxima), a way to maintain local optima with genetic algorithms is to implement a niching method. Niching methods are ways to maintain population diversity.
For example, in Niching Methods for Genetic Algorithms by Samir W. Mahfoud (1995), the following sentence can be found:
Using constructed models of fitness sharing, this study derives lower bounds on the population size required to maintain, with probability gamma, a fixed number of desired niches.
If you know the number of niches and you implement the solution mentioned, you could theoretically end up with the local optima you are looking for.
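For reference, here is a minimal sketch of fitness sharing (one classical niching method), assuming a maximization GA, a user-supplied distance function, and a sharing radius sigma:

def shared_fitness(population, raw_fitness, distance, sigma, alpha=1.0):
    shared = []
    for ind in population:
        # Niche count: how crowded the neighborhood of `ind` is.
        niche = sum(
            1.0 - (distance(ind, other) / sigma) ** alpha
            for other in population
            if distance(ind, other) < sigma
        )
        shared.append(raw_fitness(ind) / niche)  # niche >= 1 (self-distance is 0)
    return shared

# Toy usage: 1-D individuals in two clusters; the sparser cluster
# around 5.0 keeps a higher shared fitness, preserving that niche.
pop = [0.0, 0.1, 0.2, 5.0, 5.1]
print(shared_fitness(pop, raw_fitness=lambda x: 10.0,
                     distance=lambda a, b: abs(a - b), sigma=1.0))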

Implementing the Bayes' theorem in a fitness function

In an evolutionary programming project I'm working on, I thought it could be useful to use the formula from Bayes' theorem, although I'm not totally sure what that would look like.
The programs that are evolving attempt to predict the future state of a time series using past data. Given price data for the past n days, a program will predict buy if it predicts the price will rise, sell if it predicts a fall, or leave if there is too little movement.
From my understanding, after testing the model on the historical data and recording correct and incorrect predictions, I work out the probability of it being accurate with regard to buying with the following algorithm:
prob-b-given-a = correct-buy-predictions / total
prob-a = actual-buy-count / total
prob-b = prediction-buy-count / total
prob-a-given-b = (prob-b-given-a * prob-a) / prob-b
fitness = prob-a-given-b //last step for clarification
Am I interpreting Bayes' theorem correctly, and is this a suitable fitness function?
How would I combine the fitness functions for all predictions? (In my example I only show the predictive probability for the buy prediction.)
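For concreteness, here is the computation from the question as a Python sketch, with hypothetical backtest counts for the buy class only; how to combine the per-class values into one fitness is exactly the open part of the question:

def posterior(correct_predictions, actual_count, prediction_count, total):
    prob_b_given_a = correct_predictions / total
    prob_a = actual_count / total
    prob_b = prediction_count / total
    return (prob_b_given_a * prob_a) / prob_b  # Bayes' theorem, as stated above

# Hypothetical counts: 100 test days, 50 actual rises,
# 40 buy predictions, 30 of them correct.
print(posterior(correct_predictions=30, actual_count=50,
                prediction_count=40, total=100))  # 0.375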

What is the importance of crossover in the Differential Evolution algorithm?

In the Differential Evolution algorithm for optimization problems, there are three evolutionary processes involved: mutation, crossover, and selection.
I am just a beginner, but I have tried removing the crossover step, and there is no significant difference in results compared with the original algorithm.
So what is the importance of crossover in the Differential Evolution algorithm?
If you don't use crossover, your algorithm may just explore the problem search space without exploiting it. In general, an evolutionary algorithm succeeds if it strikes a good balance between the exploration and exploitation rates.
For example, DE/rand/1/Either-Or is a variant of DE that eliminates the crossover operator but uses an effective mutation operator. According to Differential Evolution: A Survey of the State-of-the-Art, in this algorithm trial vectors that are pure mutants occur with probability pF, and those that are pure recombinants occur with probability 1 − pF. This variant is shown to yield competitive results against the classical DE variants rand/1/bin and target-to-best/1/bin (main reference).
X(i,G) is the i-th target (parent) vector of generation G, U(i,G) is its corresponding trial vector, F is the difference-vector scale factor, and k = 0.5*(F + 1) (in the original paper).
In this scheme crossover isn't used, but the mutation is effective enough to be competitive with the original DE algorithm.
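For illustration, here is a minimal Python sketch of the Either-Or trial-vector generation as described above (treat it as an illustration of the scheme, not a verified reimplementation of the paper):

import random

def either_or_trial(population, i, F=0.8, pF=0.5):
    # Pick three distinct random indices different from the target i.
    r1, r2, r3 = random.sample([j for j in range(len(population)) if j != i], 3)
    x1, x2, x3 = population[r1], population[r2], population[r3]
    k = 0.5 * (F + 1.0)  # as in the original paper
    if random.random() < pF:
        # Pure mutation: classic rand/1 difference vector.
        return [a + F * (b - c) for a, b, c in zip(x1, x2, x3)]
    # Pure recombination: arithmetic blend of the three vectors.
    return [a + k * (b + c - 2.0 * a) for a, b, c in zip(x1, x2, x3)]

pop = [[random.uniform(-5, 5) for _ in range(3)] for _ in range(10)]
print(either_or_trial(pop, i=0))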