Why does differential evolution work so well? - evolutionary-algorithm

What is the idea behind the mutation in differential evolution and why should this kind of mutation perform well?
I do not see any good geometric reason behind it.
Could anyone point me to some technical explanation of this?

Like all evolutionary algorithms, DE uses a heuristic, so my explanation is going to be a bit hand-wavy. What DE is trying to do, like all evolutionary algorithms, is to do a random search that's not too random. DE's mutation operator first computes the vector between two random members of the population, then adds that vector to a third random member of the population. This works well because it uses the current population as a way of figuring out how large of a step to take, and in what direction. If the population is widely dispersed, then it's reasonable to take big steps; if it's tightly concentrated, then it's reasonable to take small steps.
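A minimal Python sketch of the mutation described above may make the geometry easier to see. The function name, the `scale` parameter (usually called F in the DE literature) and its default of 0.8 are illustrative assumptions, and DE's crossover and selection steps are left out.

```python
import random

def de_rand_1_mutation(population, target_index, scale=0.8, rng=random):
    """Build one mutant vector: base + scale * (a - b).

    base, a and b are three distinct random members of the population,
    none of them the individual at target_index.
    """
    candidates = [i for i in range(len(population)) if i != target_index]
    base_i, a_i, b_i = rng.sample(candidates, 3)
    base, a, b = population[base_i], population[a_i], population[b_i]
    # The difference (a - b) carries both the direction and the step size,
    # so the step automatically shrinks as the population converges.
    return [x + scale * (y - z) for x, y, z in zip(base, a, b)]
```

Because the step is a difference of two population members, shrinking the whole population by some factor shrinks every possible mutant by the same factor, which is the self-scaling behaviour the answer describes.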
There are many reasons DE works better than Goldberg's GA, but focusing on the variation operators I'd say that the biggest difference is that DE uses real-coded variables and GA uses binary encoding. When optimizing on a continuous space, binary encoding is not a good choice. This has been known since the early 1990s, and one of the first things to come out of the encounter between the primarily German Evolution Strategy community and the primarily American Genetic Algorithm community was Deb's Simulated Binary Crossover. This operator acts like the GA's crossover operator, but on real-coded variables.
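For reference, here is a rough Python sketch of Simulated Binary Crossover in its simplest form. It is a simplification under stated assumptions: every variable is recombined (the usual per-variable probability is omitted), variable bounds are ignored, and the name `sbx_pair` and the default distribution index `eta=15.0` are choices made for this example.

```python
import random

def sbx_pair(parent1, parent2, eta=15.0, rng=random):
    """Simulated Binary Crossover on two real-valued parents (no bounds handling)."""
    child1, child2 = [], []
    for x1, x2 in zip(parent1, parent2):
        u = rng.random()  # u is in [0, 1), so 1 - u is never zero
        if u <= 0.5:
            beta = (2.0 * u) ** (1.0 / (eta + 1.0))
        else:
            beta = (1.0 / (2.0 * (1.0 - u))) ** (1.0 / (eta + 1.0))
        # Children sit symmetrically around the parents' midpoint;
        # a larger eta keeps them closer to the parents.
        child1.append(0.5 * ((1.0 + beta) * x1 + (1.0 - beta) * x2))
        child2.append(0.5 * ((1.0 - beta) * x1 + (1.0 + beta) * x2))
    return child1, child2
```

Like DE's mutation, the spread of the children depends on the distance between the parents, so the operator takes smaller steps as the population converges.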

Related

What is the right way to do crossover when using a GA to find the minimum of a one-variable function, like sin(x)^2?

I am encoding the interval [x:y] to binary codes like 10101111, so for population, it is like [[1,0,1,1],[0,1,0,1]].
I defined the fitness function directly using the value of the function (sin(x)^2).
For selection, I am using tournament selection, and for crossover I simply exchange part of the chromosome, like this: 1(10)0 and 0(01)1 -> 1(01)0 and 0(10)1.
For mutation, I am using bit inversion.
The algorithm kind of works: it sometimes finds the global minimum and sometimes a local one. But I don't see what crossover contributes in this problem, because (I think) the features of 'x' are broken up every time. I don't know why, or whether this is even the right way to code the crossover, or maybe the encoding.
I'm afraid that there isn't a "right way" to crossover.
There are many crossover operators that can be used in a binary-coded genetic algorithm (see e.g. "Comparison of a Crossover Operator in Binary-coded Genetic Algorithms" by Stjepan Picek and Marin Golub), but:
depending on the properties of the problem, one crossover operator or another will give better results;
every crossover operator has its advantages and drawbacks, so choosing one ultimately comes down to your requirements and the experiments you have run;
in many situations uniform and two-point crossover are good choices.
Crossover is the major exploratory mechanism of the genetic algorithm, but the driving force behind GA is the cooperation between selection, crossover and mutation (mutation prevents convergence of the population and introduces variation).
Usually a mutation-only approach doesn't have enough exploration strength to reach the minimum, and any success is largely due to the distribution of solutions in the initial population.
For continuous function optimization you should also check differential evolution.
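As an illustration of the two operators recommended above, here is a small Python sketch for a binary-coded chromosome, together with a decoder and the sin(x)^2 fitness from the question. The interval [0, 2*pi] and all function names are assumptions made for the example, not something taken from the question.

```python
import math
import random

def decode(bits, low, high):
    """Map a list of 0/1 bits to a real number in [low, high]."""
    as_int = int("".join(str(b) for b in bits), 2)
    return low + (high - low) * as_int / (2 ** len(bits) - 1)

def fitness(bits, low=0.0, high=2 * math.pi):
    """Value to minimise: sin(x)^2 at the decoded x."""
    return math.sin(decode(bits, low, high)) ** 2

def two_point_crossover(a, b, rng=random):
    """Swap the segment between two random cut points."""
    i, j = sorted(rng.sample(range(len(a) + 1), 2))
    return a[:i] + b[i:j] + a[j:], b[:i] + a[i:j] + b[j:]

def uniform_crossover(a, b, rng=random):
    """Swap each position independently with probability 0.5."""
    child1, child2 = [], []
    for x, y in zip(a, b):
        if rng.random() < 0.5:
            x, y = y, x
        child1.append(x)
        child2.append(y)
    return child1, child2
```

Note that with a plain binary encoding, swapping high-order bits moves x a long way while swapping low-order bits barely changes it, which is one reason crossover can seem to "break" x in this kind of problem.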

What model best suits optimizing for a real-time strategy game?

An article has been making the rounds lately discussing the use of genetic algorithms to optimize "build orders" in StarCraft II.
http://lbrandy.com/blog/2010/11/using-genetic-algorithms-to-find-starcraft-2-build-orders/
The initial state of a StarCraft match is pre-determined and constant. And like chess, decisions made in this early stage of the match have long-standing consequences to a player's ability to perform in the mid and late game. So the various opening possibilities or "build orders" are under heavy study and scrutiny. Until the circulation of the above article, computer-assisted build order creation probably wasn't as popular as it has been recently.
My question is... Is a genetic algorithm really the best way to model optimizing build orders?
A build order is a sequence of actions. Some actions have prerequisites like, "You need building B before you can create building C, but you can have building A at any time." So a chromosome may look like AABAC.
I'm wondering if a genetic algorithm really is the best way to tackle this problem. Although I'm not too familiar with the field, I'm having a difficult time shoe-horning the concept of genes into a data structure that is a sequence of actions. These aren't independent choices that can be mixed and matched like a head and a foot. So what value is there to things like reproduction and crossing?
I'm thinking whatever chess AIs use would be more appropriate since the array of choices at any given time could be viewed as tree-like in a way.
Although I'm not too familiar with the field, I'm having a difficult time shoe-horning the concept of genes into a data structure that is a sequence of actions. These aren't independent choices that can be mixed and matched like a head and a foot. So what value is there to things like reproduction and crossing?
Hmm, that's a very good question. Perhaps the first few moves in Starcraft can indeed be performed in pretty much any order, since contact with the enemy is not as immediate as it can be in Chess, and therefore it is not as important to remember the order of the first few moves as it is to know which of the many moves are included in those first few. But the link seems to imply otherwise, which means the 'genes' are indeed not all that amenable to being swapped around, unless there's something cunning in the encoding that I'm missing.
On the whole, and looking at the link you supplied, I'd say that genetic algorithms are a poor choice for this situation, which could be accurately mathematically modelled in some parts and the search tree expanded out in others. They may well be better than an exhaustive search of the possibility space, but may not be - especially given that there are multiple populations and poorer ones are just wasting processing time.
However, what I mean by "a poor choice" is that it is inefficient relative to a more appropriate approach; that's not to say that it couldn't still produce 98% optimal results in under a second or whatever. In situations such as this where the brute force of the computer is useful, it is usually more important that you have modelled the search space correctly than to have used the most effective algorithm.
As TaslemGuy pointed out, Genetic Algorithms aren't guaranteed to be optimal, even though they usually give good results.
To get optimal results you would have to search through every possible combination of actions until you find the optimal path through the tree-like representation. However, doing this for StarCraft is difficult, since there are so many different paths to reach a goal. In chess you move a pawn from e2 to e4 and then the opponent moves. In StarCraft you can move a unit at instant x or x+1 or x+10 or ...
A chess engine can look at many different aspects of the board (e.g. how many pieces does it have and how many does the opponent have), to guide its search. It can ignore most of the actions available if it knows that they are strictly worse than others.
For a build-order creator only time really matters. Is it better to build another drone to get minerals faster, or is it faster to start that spawning pool right away? Not as straightforward as with chess.
These kinds of decisions happen pretty early on, so you will have to search each alternative to conclusion before you can decide on the better one, which will take a long time.
If I were to write a build-order optimizer myself, I would probably try to formulate a heuristic that estimates how good (close to the goal state) the current state is, just as chess engines do:
Score = a*(Buildings_and_units_done/Buildings_and_units_required) - b*Time_elapsed - c*Minerals - d*Gas + e*Drone_count - f*Supply_left
This tries to keep the score tied to the completion percentage as well as StarCraft common knowledge (keep your resources low, build drones, don't build more supply than you need). The variables a to f would need tweaking, of course.
After you've got a heuristic that can somewhat estimate the worth of a situation, I would use Best-first search or maybe IDDFS to search through the tree of possibilities.
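To make the idea concrete, here is a rough Python sketch of the score formula plugged into a generic best-first search. Everything about the game model is invented for illustration: the weights a to f are placeholder guesses, and the toy state (minerals, drones, has_pool, time), the costs, the income rate and the `MAX_DRONES` cap have nothing to do with real StarCraft numbers.

```python
import heapq
from itertools import count

def score(done, required, time_elapsed, minerals, gas, drones, supply_left,
          a=100.0, b=1.0, c=0.01, d=0.01, e=0.5, f=0.5):
    """The heuristic from the formula above; weights are untuned guesses."""
    return (a * done / required - b * time_elapsed - c * minerals
            - d * gas + e * drones - f * supply_left)

def best_first_search(start, is_goal, successors, evaluate, max_expansions=10000):
    """Always expand the highest-scoring state; return the action list or None."""
    tie = count()  # tie-breaker so heapq never has to compare states
    frontier = [(-evaluate(start), next(tie), start, [])]
    seen = {start}
    for _ in range(max_expansions):
        if not frontier:
            return None
        _, _, state, path = heapq.heappop(frontier)
        if is_goal(state):
            return path
        for action, nxt in successors(state):
            if nxt not in seen:
                seen.add(nxt)
                heapq.heappush(frontier, (-evaluate(nxt), next(tie), nxt, path + [action]))
    return None

MAX_DRONES = 8  # hypothetical cap, keeps the toy state space small

def toy_successors(state):
    """Toy model: wait one tick, buy a drone (50), or build the pool (200)."""
    minerals, drones, has_pool, t = state
    out = [("wait", (minerals + 5 * drones, drones, has_pool, t + 1))]
    if minerals >= 50 and drones < MAX_DRONES:
        out.append(("drone", (minerals - 50, drones + 1, has_pool, t)))
    if minerals >= 200 and not has_pool:
        out.append(("pool", (minerals - 200, drones, True, t)))
    return out

def toy_evaluate(state):
    minerals, drones, has_pool, t = state
    return score(done=int(has_pool), required=1, time_elapsed=t,
                 minerals=minerals, gas=0, drones=drones, supply_left=0)

build_order = best_first_search((50, 6, False, 0), lambda s: s[2],
                                toy_successors, toy_evaluate)
```

Best-first search with a heuristic like this is greedy, so the build order it returns is not guaranteed to be the fastest one; the quality depends heavily on how well the weights are tuned.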
Edit:
I recently found a paper that actually describes build order optimization in StarCraft, in real time even. The authors use depth-first search with branch and bound and heuristics that estimate the minimum amount of effort required to reach the goal based on the tech tree (e.g. zerglings need a spawning pool) and the time needed to gather the required minerals.
A genetic algorithm may or may not be the best approach. That depends on the complexity of the genetic algorithm, how much mutation there is, the forms of combination, and how the chromosomes of the genetic algorithm are interpreted.
So, depending on how your AI is implemented, Genetic Algorithms can be the best.
You are looking at a SINGLE way to implement genetic algorithms, while forgetting about genetic programming, the use of math, higher-order functions, etc. Genetic algorithms can be EXTREMELY sophisticated, and by using clever combining systems for crossbreeding, extremely intelligent.
For instance, neural networks are optimized by genetic algorithms quite often.
Look up "Genetic Programming." It's similar, but uses tree-structures instead of lines of characters, which allows for more complex interactions that breed better. For more complex stuff, they typically work out better.
There's been some research done using hierarchical reinforcement learning to build a layered ordering of actions that efficiently maximizes a reward. I haven't found much code implementing the idea, but there are a few papers describing MAXQ-based algorithms that have been used to explicitly tackle real-time strategy game domains, such as this and this.
This Genetic algorithm only optimizes the strategy for one very specific part of the game: The order of the first few build actions of the game. And it has a very specific goal as well: To have as many roaches as quickly as possible.
The only aspects influencing this system seem to be (I'm no starcraft player):
build time of the various units and buildings
allowed units and buildings given the available units and buildings
larva regeneration rate
This is a relatively limited, relatively well defined problem with a large search space. As such it is very well suited for genetic algorithms (and quite a few other optimization algorithms at that). A full gene is a specific set of build orders that ends in the 7th roach. From what I understand you can just "play" this specific gene to see how fast it finishes, so you have a very clear fitness test.
You also have a few nice constraints on the build order, so you can combine different genes slightly smarter than just randomly.
A genetic algorithm used in this way is a very good tool to find a more optimal build order for the first stage of a game of starcraft. Due to its random nature it is also good at finding a surprising strategy, which might have been an additional goal of the author.
To use a genetic algorithm as the algorithm in an RTS game you'd have to find a way to encode reactions to situations rather than just plain old build orders. This also involves correctly identifying situations which can be a difficult task in itself. Then you'd have to let these genes play thousands of games of starcraft, against each other and (possibly) against humans, selecting and combining winners (or longer-lasting losers). This is also a good application of genetic algorithms, but it involves solving quite a few very hard problems before you even get to the genetic algorithm part.

Improved Genetic algorithm for multiknapsack problem

Recently I've been improving the traditional genetic algorithm for the multiknapsack problem. In my tests, my improved genetic algorithm works better than the traditional one. (I used the publicly available instances from the OR-Library (http://people.brunel.ac.uk/~mastjjb/jeb/orlib/mknapinfo.html) to test the GAs.) Does anybody know of other improved GAs? I want to compare mine against another improved genetic algorithm. I have searched the internet but couldn't find a good algorithm to compare with.
There should be any number of decent GA methods against which you can compare. However, you should try to first clearly establish exactly which "traditional" GA method you have already tested.
One good method which I can recommend is the NSGA-II algorithm, which was developed for multi-objective optimization.
Take a look at the following for other ideas:
Genetic Algorithm - Wikipedia
Carlos A. Coello Coello (1999). "A Comprehensive Survey of Evolutionary-Based Multiobjective Optimization Techniques", Knowledge and Information Systems, Vol. 1, pp. 269-308.
Carlos A. Coello Coello et al (2005). "Current and Future Research Trends in Evolutionary Multiobjective Optimization", Information Processing with Evolutionary Algorithms, Springer.
You can compare your solution only to problems with the exact same encoding and fitness function (meaning they are equivalent problems). If the problem is different any comparison becomes quickly irrelevant as the problem changes, since the fitness function is almost always ad-hoc for whatever you're trying to solve. In fact the fitness function is the only thing you need to code if you use a Genetic Algorithms toolkit, as everything else usually comes out of the box.
On the other end, if the fitness function is the same, then it makes sense to compare results given different parameters, such as different mutation rate, different implementations of crossover, or even completely different evolutionary paradigms, such as coevolution, gene expression, compared to standard GAs, and so on.
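For a fair comparison, both algorithms would have to share something like the following fitness function. This is only a sketch of one common choice (a penalty for violated constraints); the function name, the argument layout and the default penalty are assumptions for the example, and the asker's actual fitness function is not known.

```python
def mkp_fitness(bits, values, weights, capacities, penalty=None):
    """Penalised fitness for a multi-constraint knapsack.

    bits[j] is 1 if item j is packed; weights[i][j] is how much of
    resource i item j consumes; capacities[i] is the limit for resource i.
    """
    total_value = sum(v for v, b in zip(values, bits) if b)
    overload = 0
    for row, cap in zip(weights, capacities):
        used = sum(w for w, b in zip(row, bits) if b)
        overload += max(0, used - cap)
    if penalty is None:
        # Assumed default: with integer weights, any infeasible solution
        # then scores no better than the empty knapsack.
        penalty = sum(values)
    return total_value - penalty * overload

values = [10, 6, 4]
weights = [[5, 4, 3], [2, 3, 4]]   # two resource constraints, three items
capacities = [8, 6]
feasible = mkp_fitness([1, 0, 1], values, weights, capacities)
infeasible = mkp_fitness([1, 1, 0], values, weights, capacities)
```

With the fitness function fixed like this, differences in results can be attributed to the operators and parameters rather than to the problem formulation.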
Are you trying to improve the state-of-the-art in multiknapsack solvers by the use of genetic algorithms? Or are you trying to advance the genetic algorithm technique by using multiknapsack as a test platform? (Can you clarify?)
Depending on which one is your goal, the answer to your question is entirely different. Since others have addressed the latter question, I'll assume the former.
There have been few major leaps over the basic genetic algorithm. The best improvement in solving multiknapsack with genetic algorithms would be to improve how you encode your mutation and crossover operators, which can make orders of magnitude of difference in the resulting performance and blow out of the water any tweaks to the fundamental genetic algorithm. There is a lot you can do to tailor your mutation and crossover operators to multiknapsack.
I would first survey the literature on multiknapsack to see what are the different kinds of search spaces and solution techniques people have used on multiknapsack. In their optimal or suboptimal methods (independent of genetic algorithms), what kinds of search operators do they use? What do they encode as variables and what do they encode as values? What heuristic evaluation functions are used? What constraints do they check for? Then you would adapt their encodings to your mutation and crossover operators, and see how well they perform in your genetic algorithms.
It is highly likely that an efficient search space encoding or an accurate heuristic evaluation function of the multiknapsack problem can translate into highly effective mutation and crossover operators. Since multiknapsack is a very well studied problem with a large corpus of research literature, it should be a gold mine for you.

Genetic/Evolutionary algorithms and local minima/maxima

I have run across several posts and articles that suggest using things like simulated annealing to avoid the local minima/maxima problem.
I don't understand why this would be necessary if you started out with a sufficiently large random population.
Is it just another check to ensure that the initial population was, in fact, sufficiently large and random? Or are those techniques just an alternative to producing a "good" initial population?
Simulated annealing is a probabilistic optimization technique -- it's not supposed to give you more precise answers, it's supposed to give you approximations faster.
Simulated annealing is a probabilistic technique in which the chance of getting trapped in a local minimum/maximum depends on the temperature schedule, and the right schedule differs from one type of problem to another. An evolutionary algorithm is much more robust and less likely to get trapped in local minima/maxima. SA is probabilistic; an EA, on the other hand, uses mutation, which introduces a random walk in the search space, and that is why an EA has a higher probability of reaching the global optimum.
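To show where the temperature schedule enters, here is a bare-bones simulated annealing sketch in Python for a one-dimensional function. The geometric cooling schedule, the step size and all default values are assumptions picked for the example; other schedules are equally legitimate.

```python
import math
import random

def simulated_annealing(f, x0, t_start=1.0, t_end=1e-3, cooling=0.995,
                        step=0.5, seed=0):
    """Minimise f starting from x0; returns (best_x, best_value)."""
    rng = random.Random(seed)
    x, fx = x0, f(x0)
    best_x, best_fx = x, fx
    t = t_start
    while t > t_end:
        cand = x + rng.uniform(-step, step)
        fc = f(cand)
        # Improvements are always accepted; worse moves are accepted with
        # probability exp(-delta / t), which falls as the temperature drops.
        if fc <= fx or rng.random() < math.exp(-(fc - fx) / t):
            x, fx = cand, fc
            if fx < best_fx:
                best_x, best_fx = x, fx
        t *= cooling  # geometric cooling schedule (an assumed choice)
    return best_x, best_fx
```

Cooling too quickly turns this into plain hill climbing that can get stuck in the first basin it finds; cooling slowly gives it more chances to escape, at the cost of more evaluations.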
First of all, simulated annealing is a last resort method. There are far better, more efficient, and more effective methods of discovering where the local minima are found.
A better check would be to use a statistical method to uncover information about your data set such as variance or standard deviation.

What are the typical use cases of Genetic Programming?

Today I read this blog entry by Roger Alsing about how to paint a replica of the Mona Lisa using only 50 semi transparent polygons.
I'm fascinated with the results for that particular case, so I was wondering (and this is my question): how does genetic programming work and what other problems could be solved by genetic programming?
There is some debate as to whether Roger's Mona Lisa program is Genetic Programming at all. It seems to be closer to a (1 + 1) Evolution Strategy. Both techniques are examples of the broader field of Evolutionary Computation, which also includes Genetic Algorithms.
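For comparison, a (1 + 1) Evolution Strategy is small enough to sketch in a few lines of Python: one parent, one mutated child per step, and the better of the two survives. The step-size adaptation below is a simple variant of the 1/5th success rule, and the constants and function name are assumptions for the example rather than anything taken from Roger's program.

```python
import random

def one_plus_one_es(f, x0, sigma=1.0, iterations=500, seed=0):
    """Minimise f with a (1 + 1)-ES; returns (best_vector, best_value)."""
    rng = random.Random(seed)
    parent, f_parent = list(x0), f(x0)
    for _ in range(iterations):
        # Mutation: add Gaussian noise to every coordinate of the parent.
        child = [xi + rng.gauss(0.0, sigma) for xi in parent]
        f_child = f(child)
        if f_child <= f_parent:
            parent, f_parent = child, f_child
            sigma *= 1.5            # success: try larger steps
        else:
            sigma *= 1.5 ** -0.25   # failure: shrink the step a little
    return parent, f_parent
```

There is no population and no crossover here, which is the main reason it is usually not classed as a genetic algorithm or as genetic programming.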
Genetic Programming (GP) is the process of evolving computer programs (usually in the form of trees - often Lisp programs). If you are asking specifically about GP, John Koza is widely regarded as the leading expert. His website includes lots of links to more information. GP is typically very computationally intensive (for non-trivial problems it often involves a large grid of machines).
If you are asking more generally, evolutionary algorithms (EAs) are typically used to provide good approximate solutions to problems that cannot be solved easily using other techniques (such as NP-hard problems). Many optimisation problems fall into this category. It may be too computationally-intensive to find an exact solution but sometimes a near-optimal solution is sufficient. In these situations evolutionary techniques can be effective. Due to their random nature, evolutionary algorithms are never guaranteed to find an optimal solution for any problem, but they will often find a good solution if one exists.
Evolutionary algorithms can also be used to tackle problems that humans don't really know how to solve. An EA, free of any human preconceptions or biases, can generate surprising solutions that are comparable to, or better than, the best human-generated efforts. It is merely necessary that we can recognise a good solution if it were presented to us, even if we don't know how to create a good solution. In other words, we need to be able to formulate an effective fitness function.
Some Examples
Travelling Salesman
Sudoku
EDIT: The freely-available book, A Field Guide to Genetic Programming, contains examples of where GP has produced human-competitive results.
Interestingly enough, the company behind the dynamic character animation used in games like Grand Theft Auto IV and the latest Star Wars game (The Force Unleashed) used genetic programming to develop movement algorithms. The company's website is here and the videos are very impressive:
http://www.naturalmotion.com/euphoria.htm
I believe they simulated the nervous system of the character, then randomised the connections to some extent. They then combined the 'genes' of the models that walked furthest to create more and more able 'children' in successive generations. Really fascinating simulation work.
I've also seen genetic algorithms used in path finding automata, with food-seeking ants being the classic example.
Genetic algorithms can be used to solve most any optimization problem. However, in a lot of cases, there are better, more direct methods to solve them. It is in the class of metaheuristic algorithms, which means that it is able to adapt to pretty much anything you can throw at it, given that you can generate a method of encoding a potential solution, combining/mutating solutions, and deciding which solutions are better than others. GA has an advantage over other metaheuristic algorithms in that it can handle local maxima better than a pure hill-climbing algorithm, like simulated annealing.
I used genetic programming in my thesis to simulate evolution of species based on terrain, but that is of course the A-life application of genetic algorithms.
The problems GAs are good at are hill-climbing problems. The catch is that most of these problems are normally easier to solve by hand, unless the factors that define the problem are unknown and you can't obtain that knowledge some other way (say, things related to societies and communities), or you have a good algorithm but need to fine-tune its parameters. There GAs are very useful.
A situation of fine tuning I've done was to fine tune several Othello AI players based on the same algorithms, giving each different play styles, thus making each opponent unique and with its own quirks, then I had them compete to cull out the top 16 AI's that I used in my game. The advantage was they were all very good players of more or less equal skill, so it was interesting for the human opponent because they couldn't guess the AI as easily.
http://en.wikipedia.org/wiki/Genetic_algorithm#Problem_domains
You should ask yourself : "Can I (a priori) define a function to determine how good a particular solution is relative to other solutions?"
In the Mona Lisa example, you can easily determine if the new painting looks more like the source image than the previous painting, so Genetic Programming can be "easily" applied.
I have some projects using Genetic Algorithms. GAs are ideal for optimization problems where you cannot develop a fully sequential, exact algorithm to solve the problem. For example: what's the best combination of a car's characteristics to make it faster and at the same time more economical?
At the moment I'm developing a simple GA to put together playlists. My GA has to find the best combinations of albums/songs that are similar (this similarity will be "calculated" with the help of last.fm) and suggest playlists for me.
There's an emerging field in robotics called Evolutionary Robotics (w:Evolutionary Robotics), which uses genetic algorithms (GA) heavily.
See w:Genetic Algorithm:
Simple generational genetic algorithm pseudocode
    Choose initial population
    Evaluate the fitness of each individual in the population
    Repeat until termination (time limit or sufficient fitness achieved):
        Select best-ranking individuals to reproduce
        Breed new generation through crossover and/or mutation (genetic operations) and give birth to offspring
        Evaluate the individual fitnesses of the offspring
        Replace worst-ranked part of population with offspring
The key is the reproduction part, which could happen sexually or asexually, using genetic operators Crossover and Mutation.
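The pseudocode above translates fairly directly into Python. The following is one possible sketch on bit strings, not a reference implementation: the truncation selection, one-point crossover, bit-flip mutation, a fixed number of generations as the termination test, and all default parameter values are assumptions chosen to keep the example short.

```python
import random

def run_ga(fitness, n_bits=20, pop_size=30, n_parents=10, generations=50,
           mutation_rate=0.05, seed=0):
    """Maximise fitness over bit strings; returns the best individual found."""
    rng = random.Random(seed)
    # Choose the initial population and rank it by fitness.
    population = [[rng.randint(0, 1) for _ in range(n_bits)]
                  for _ in range(pop_size)]
    population.sort(key=fitness, reverse=True)
    for _ in range(generations):
        # Select the best-ranking individuals to reproduce.
        parents = population[:n_parents]
        offspring = []
        while len(offspring) < n_parents:
            mum, dad = rng.sample(parents, 2)
            cut = rng.randrange(1, n_bits)       # one-point crossover
            child = mum[:cut] + dad[cut:]
            # Bit-flip mutation, applied independently to every gene.
            child = [1 - g if rng.random() < mutation_rate else g
                     for g in child]
            offspring.append(child)
        # Replace the worst-ranked part of the population with the
        # offspring, then re-evaluate and re-rank everyone.
        population = population[:-n_parents] + offspring
        population.sort(key=fitness, reverse=True)
    return population[0]

# Usage: maximise the number of 1-bits (the "OneMax" toy problem).
best = run_ga(sum)
```

Because only the worst-ranked individuals are replaced, the best individual found so far always survives, so the best fitness never decreases from one generation to the next.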