Does translating the genes in a chromosome for a genetic algorithm for a combinatorial function increase the diversity of candidates? - optimization

I'm new to genetic algorithms and am writing code for the Traveling Salesman problem. I'm using cycle crossover to generate new offspring, and I've found that some of the offspring end up with exactly the same phenotype as one parent even when the two parents are different. Would translating the chromosomes avoid this?
By translating I mean cyclically shifting a chromosome: for example, phenotype ABCDE shifted over by two becomes DEABC. The two would be equivalent answers with equal fitness, but might produce more diverse offspring.
Is this worth it in the long run, or is it just wasting computing time?

Cycle crossover (CX) is based on the assumption that it's important to preserve the absolute position of cities (each city in the offspring inherits its position from one of the parents), so translating the chromosomes beforehand goes against the spirit of CX.
That said, multiple studies (e.g. [1]) have shown that for TSP the key is to preserve the relative position of cities and the edges.
So it could work, but you would have to experiment. Some form of mutation is another possibility.
If the characteristics of CX aren't satisfying, a different crossover operator is probably a better choice: staying with simple operators, one of the most successful is Order Crossover (e.g. [2]).
[1] L. Darrell Whitley, Timothy Starkweather, D'Ann Fuquay - Scheduling problems and traveling salesmen: The genetic edge recombination operator - 1989.
[2] Pablo Moscato - On Genetic Crossover Operators for Relative Order Preservation.
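To make the behaviour discussed in this answer concrete, here is a minimal Python sketch of cycle crossover, the cyclic shift from the question, and a basic order crossover. The function names (rotate, cycle_crossover, order_crossover) and the example tours are my own illustration, not code from the question or from the cited papers. The CX variant shown takes the first cycle from parent 1 and everything else from parent 2, and the OX cut points are passed in explicitly, whereas a GA would draw them at random.

```python
def rotate(tour, k):
    """Cyclically shift a tour k positions to the right (ABCDE -> DEABC for k=2)."""
    k %= len(tour)
    return tour[-k:] + tour[:-k] if k else tour[:]


def cycle_crossover(p1, p2):
    """CX sketch: first cycle copied from p1, remaining positions from p2."""
    n = len(p1)
    child = [None] * n
    pos_in_p1 = {city: i for i, city in enumerate(p1)}
    i = 0
    while child[i] is None:
        child[i] = p1[i]
        i = pos_in_p1[p2[i]]  # follow the cycle via the city p2 holds here
    for j in range(n):
        if child[j] is None:
            child[j] = p2[j]
    return child


def order_crossover(p1, p2, a, b):
    """OX sketch: keep p1[a:b] in place, fill the rest in p2's order from b."""
    n = len(p1)
    child = [None] * n
    child[a:b] = p1[a:b]
    kept = set(p1[a:b])
    fill = [p2[(b + k) % n] for k in range(n) if p2[(b + k) % n] not in kept]
    for k, city in enumerate(fill):
        child[(b + k) % n] = city
    return child


p1 = list("ABCDE")
p2 = list("BDECA")  # a different tour, but it forms a single cycle with p1

print("".join(cycle_crossover(p1, p2)))             # ABCDE (clone of p1)
print("".join(cycle_crossover(p1, rotate(p2, 4))))  # AECDB (new offspring)
print("".join(order_crossover(p1, p2, 1, 3)))       # EBCAD
```

In this particular example a shift of 4 breaks the single cycle and gives a new offspring, while a shift of 2 still returns a copy of p1, which is in line with the "you have to experiment" caveat.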

Related

Genetic algorithm - find max of minimized subsets

I have a combinatorial optimization problem for which I have a genetic algorithm to approximate the global minima.
Given X elements find: min f(X)
Now I want to expand the search over all possible subsets and find the one subset whose global minimum is maximal compared to all other subsets.
X* is a subset of X, find: max min f(X*)
The example plot shows all solutions of three subsets (one for each color). The black dot indicates the highest value of all three global minima.
image: solutions over three subsets
The main problem is that evaluating the fitness between subsets works against the convergence of the solution within a subset. Further, the solution is actually a local minimum.
How can this problem be described in general? I couldn't find a similar problem in the literature so far. For example, is it solvable with a multi-objective genetic algorithm?
Any hint is much appreciated.
While it may not always provide exactly the highest minima (or lowest maxima), one way to maintain local optima with genetic algorithms is to implement a niching method. Niching methods are ways to maintain population diversity.
For example, in Niching Methods for Genetic Algorithms by Samir W. Mahfoud (1995), the following sentence can be found:
"Using constructed models of fitness sharing, this study derives lower bounds on the population size required to maintain, with probability gamma, a fixed number of desired niches."
If you know the number of niches and you implement the solution mentioned, you could theoretically end up with the local optima you are looking for.
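As a rough illustration of the fitness sharing idea mentioned above, here is a minimal Python sketch. It assumes a fitness that is positive and to be maximised (a minimisation problem would need to be transformed first), a triangular sharing function, and a user-supplied distance function. The names sharing, shared_fitness, sigma_share and alpha are my own choices for this sketch and are not taken from the question.

```python
def sharing(d, sigma_share, alpha=1.0):
    """Sharing function: 1 at distance 0, decreasing to 0 at sigma_share."""
    if d >= sigma_share:
        return 0.0
    return 1.0 - (d / sigma_share) ** alpha


def shared_fitness(population, raw_fitness, distance, sigma_share, alpha=1.0):
    """Divide each raw fitness by its niche count (crowded niches are penalised)."""
    shared = []
    for i, x in enumerate(population):
        # The niche count includes the individual itself, so it is at least 1.
        niche_count = sum(
            sharing(distance(x, y), sigma_share, alpha) for y in population
        )
        shared.append(raw_fitness[i] / niche_count)
    return shared


# Two individuals crowd the same niche near 0, one sits alone near 5.
population = [0.0, 0.1, 5.0]
raw = [10.0, 10.0, 10.0]
result = shared_fitness(population, raw, lambda a, b: abs(a - b), sigma_share=1.0)
print(result)
```

With equal raw fitness, the isolated individual keeps its full value while the two crowded ones are scaled down, which is what lets selection keep several niches alive at once.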

What are the advantages and disadvantages of using the crossover genetic operator?

For example, we have this problem:
Maximise the function f(X) = X^2 , with 0 ≤ X ≤ 31
Using binary encoding we can represent individuals using 5 bits. After undergoing a selection method, we get to the genetic operators.
For this problem (or any optimisation problem), what are the advantages and disadvantages of the following:
High or Low crossover rate
Using 1-Point crossover
Using multi-point crossover
Using Uniform crossover
Here's what I came up with so far:
High crossover rates and multi-point crossover can decrease the quality of parents with good fitness, and produce worse offspring
Low crossover rates mean the solution will take longer to converge to some optima
It's hard to give a good answer without more information about what exactly the 5 bits represent, but I gave it a try:
A high crossover rate causes the genomes in the next generation to be more random, as there will be more genomes that are a mix of previous-generation genomes.
A low crossover rate keeps fit genomes from the previous generation, although it decreases the chance that a very fit genome will be produced by the crossover operation.
Uniform crossover will create genomes that are very different from their parents if the parents are not similar. If the parents are similar, the offspring will be similar to its parents.
Using 1-point crossover means that offspring genomes will be less diverse; they will be quite similar to their parents.
Using multi-point crossover is basically a mix between 1-point and uniform, depending on the number of points.
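To see how the three operators differ on the 5-bit encoding from the question, here is a small Python sketch. The function names and the chosen parents (X = 19 and X = 12) are only an illustration. The cut points and the mask are passed in explicitly so the output is reproducible, whereas in an actual GA they would be drawn at random.

```python
def one_point(p1, p2, cut):
    """Swap the tails of two bit strings after position cut."""
    return p1[:cut] + p2[cut:], p2[:cut] + p1[cut:]


def n_point(p1, p2, cuts):
    """Alternate the source parent at every cut point (cuts must be sorted)."""
    c1, c2 = "", ""
    src1, src2 = p1, p2
    prev = 0
    for cut in list(cuts) + [len(p1)]:
        c1 += src1[prev:cut]
        c2 += src2[prev:cut]
        src1, src2 = src2, src1
        prev = cut
    return c1, c2


def uniform(p1, p2, mask):
    """Per bit: mask 1 takes the bit from p1 for child 1, mask 0 from p2."""
    c1 = "".join(a if m else b for a, b, m in zip(p1, p2, mask))
    c2 = "".join(b if m else a for a, b, m in zip(p1, p2, mask))
    return c1, c2


p1 = format(19, "05b")  # '10011'
p2 = format(12, "05b")  # '01100'

for name, kids in [
    ("1-point, cut=2", one_point(p1, p2, 2)),
    ("2-point, cuts=1,3", n_point(p1, p2, [1, 3])),
    ("uniform, mask=10101", uniform(p1, p2, [1, 0, 1, 0, 1])),
]:
    print(name, kids, [int(k, 2) ** 2 for k in kids])
```

For these parents the 1-point children decode to X = 20 and 11, the 2-point children to X = 31 and 0, and the uniform children to X = 25 and 6, so the operators that mix more can produce both much better and much worse offspring than the parents.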

distribution of population in genetic algorithms

My question is whether there are genetic optimization algorithms in which the population remains i.i.d. (independent and identically distributed) during all iterations. The most common ones, like NSGA2 or SPEA2, mix the current population with the previous one, so that the mixed population is no longer i.i.d. But are there algorithms where the distribution of the population changes during optimization while the population still remains i.i.d.?
You can try fitness uniform selection: https://arxiv.org/abs/cs/0103015.
But IMHO the results won't be very good.
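For reference, the basic idea of fitness uniform selection, as I understand it from the linked paper, is to draw a fitness value uniformly between the lowest and highest fitness in the current population and then select the individual whose fitness is nearest to that value. A minimal Python sketch under that assumption (the function names are mine, and the paper should be checked for details such as tie-breaking):

```python
import random


def nearest_by_fitness(population, fitness, target):
    """Return the individual whose fitness is closest to target."""
    return min(population, key=lambda ind: abs(fitness(ind) - target))


def fuss_select(population, fitness, rng=random):
    """Pick a uniformly random fitness level, then the nearest individual."""
    values = [fitness(ind) for ind in population]
    target = rng.uniform(min(values), max(values))
    return nearest_by_fitness(population, fitness, target)


population = [0, 1, 2, 10]


def square(x):
    return x * x  # fitness values 0, 1, 4, 100


print(nearest_by_fitness(population, square, 60))  # 10
print(fuss_select(population, square, random.Random(0)) in population)  # True
```

Because the target is uniform over the fitness range rather than over the individuals, an individual in a sparsely populated fitness region (here the one with fitness 100) is selected far more often than the ones clustered at low fitness.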

what is the best data structure for solving TSP in visual basic

I wrote a program in VB to solve TSP using a genetic algorithm, and I use an ArrayList as the data structure. Is there a better data structure for solving TSP in Visual Basic than the one I used?
I will also write a program in VB to solve TSP using a branch-and-bound algorithm. What is the best data structure to use in that case, or is an array good enough?
thank you
I don't know VB, but the following should be general enough.
If the genotypes are directly the city permutations, the data structures I use are (for N cities):
a distance matrix - an N by N 2-D array where position (i, j) contains the distance from the i-th city to the j-th city
the genotypes, which are then arrays (or lists) of the city indices (i.e. 0..N-1)
The fitness evaluation is then easy and fast, as it is just a single walk through the genotype with constant-time lookups of the distances. If memory is an issue and the problem is BIG (i.e. tens of thousands of cities or more), you might want to store only part of the distance matrix (if the problem type allows it, as in symmetric TSP, where the distance from A to B equals the distance from B to A), or not store it at all and compute the distances on demand.
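The two structures described above can be sketched as follows. This is Python rather than VB, and it assumes the cities are given as 2-D coordinates with Euclidean distances. The names distance_matrix and tour_length are just for illustration.

```python
import math


def distance_matrix(coords):
    """Build the full N by N matrix of Euclidean distances between cities."""
    n = len(coords)
    return [
        [
            math.hypot(coords[i][0] - coords[j][0], coords[i][1] - coords[j][1])
            for j in range(n)
        ]
        for i in range(n)
    ]


def tour_length(tour, dist):
    """Single walk through the genotype, including the edge back to the start."""
    n = len(tour)
    return sum(dist[tour[k]][tour[(k + 1) % n]] for k in range(n))


# Four cities on the corners of a unit square.
coords = [(0, 0), (1, 0), (1, 1), (0, 1)]
dist = distance_matrix(coords)

print(tour_length([0, 1, 2, 3], dist))  # 4.0 (around the perimeter)
print(tour_length([0, 2, 1, 3], dist))  # longer, uses both diagonals
```

The same pattern should translate directly to a 2-D array plus an integer array in VB, since the evaluation only needs index lookups.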
For a branch-and-bound approach you basically need the distance matrix too. If you are going to do some distance-based prioritisation of the order in which the cities are chosen, and your TSP is a Euclidean one (i.e. each city is a point in a 2D plane), you can use a k-d tree for fast lookup of the cities nearest to any point in the plane.

what is the importance of crossing over in Differential Evolution Algorithm?

The Differential Evolution algorithm for optimization problems involves three evolutionary processes: mutation, crossing over and selection.
I am just a beginner, but I have tried removing the crossing over process and the results show no significant difference from the original algorithm.
So what is the importance of crossing over in Differential Evolution Algorithm?
If you don't use crossover, your algorithm may only explore the problem search space and not exploit it. In general, an evolutionary algorithm succeeds if it strikes a good balance between exploration and exploitation.
For example, DE/rand/1/Either-Or is a variant of DE which eliminates the crossover operator but uses an effective mutation operator. According to Differential Evolution: A Survey of the State-of-the-Art, in this algorithm trial vectors that are pure mutants occur with a probability pF and those that are pure recombinants occur with a probability 1 − pF. This variant is shown to yield competitive results against the classical DE variants rand/1/bin and target-to-best/1/bin (Main Reference).
In the survey's notation, X(i,G) is the i-th target (parent) vector of generation G, U(i,G) is its corresponding trial vector, F is the scale factor of the difference vector, and k = 0.5*(F + 1) [in the original paper].
In this scheme crossover isn't used, but the mutation is effective enough to be comparable with the original DE algorithm.
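As a hedged sketch of how an either-or trial vector can be generated: the update formulas below are written from my recollection of how this scheme is usually presented (a pure mutant x0 + F*(x1 - x2) with probability pF, otherwise a pure recombinant x0 + k*(x1 + x2 - 2*x0)), so please check them against the survey before relying on them. Only pF, F and k = 0.5*(F + 1) come from the text above. The function names and the way the three donor vectors are indexed are my own assumptions.

```python
import random


def mutant(x0, x1, x2, F):
    """Pure mutant: base vector plus a scaled difference vector."""
    return [a + F * (b - c) for a, b, c in zip(x0, x1, x2)]


def recombinant(x0, x1, x2, k):
    """Pure recombinant: move the base vector towards the two others."""
    return [a + k * (b + c - 2 * a) for a, b, c in zip(x0, x1, x2)]


def either_or_trial(population, i, F, p_F, rng=random):
    """Build the trial vector for target i; needs at least 4 individuals."""
    candidates = [j for j in range(len(population)) if j != i]
    r0, r1, r2 = rng.sample(candidates, 3)  # three distinct donors, none equal to i
    x0, x1, x2 = population[r0], population[r1], population[r2]
    if rng.random() < p_F:
        return mutant(x0, x1, x2, F)
    return recombinant(x0, x1, x2, 0.5 * (F + 1))


population = [[0.0, 0.0], [1.0, 2.0], [3.0, 1.0], [2.0, 2.0], [4.0, 0.5]]
trial = either_or_trial(population, 0, F=0.5, p_F=0.4, rng=random.Random(1))
print(trial)
```

Note that no per-component crossover with the target vector happens here: each trial vector is entirely a mutant or entirely a recombinant, and selection against the target proceeds as in ordinary DE.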