ACO Pheromone update - ant-colony

I'm working on ACO and a little confused about the probability of choosing the next city. I have read some papers and books but still the idea of choosing is unclear. I am looking for a simple explanation how this path building works.
Also how does the heuristics and pheromone come into this decision making?
Because we have same pheromone values at every edge in the beginning and the heuristics (closeness) values remains constant, so how will different ants make different decisions based on these values?

Maybe is too late to answer to question but lately I have been working with ACO and it was also a bit confusing for me. I didn't find much information on stackoverflow about ACO when I needed it, so I have decided to answer this question since maybe this info is usefull for people working on ACO now or in the future.
Swarm Intelligence Algorithms are a set of techniques based on emerging social and cooperative behavior of organisms grouped in colonies, swarms, etc.
The collective intelligence algorithm Ant Colony Optimization (ACO) is an optimization algorithm inspired by ant colonies. In nature, ants of some species initially wander randomly until they find a food source and return to their colony laying down a pheromone trail. If other ants find this trail, they are more likely not to keep travelling at random, but instead to follow the trail, and reinforcing it if they eventually find food.
In the Ant Colony Optimization algorithm the agents (ants) are placed on different nodes (usually it is used a number of ants equal to the number of nodes). And, the probability of choosing the next node (city) is based on agents chose the next node using the equation known as the transition rule, that represents the probability for ant π‘˜ to go from city 𝑖 to city 𝑗 on the 𝑑th tour.
In the equation, πœπ‘–π‘— represents the pheromone trail and πœ‚π‘–π‘— the visibility between the two cities, while 𝛼 and 𝛽 are adjustable parameters that control the relative weight of trail intensity and visibility.
At the beginning there is the same value of pheromone in all the edges. Based on the transition rule, which, in turn, is based on the pheromone and the visibility (distance between nodes), some paths will be more likely to be chosen than others.
When the algorithm starts to run each agent (ant) performs a tour (visits each node), the best tour found until the moment will be updated with a new quantity of pheromone, which will make that tour more probably to be chosen next time by the ants.
You can found more information about ACO in the following links:
http://dataworldblog.blogspot.com.es/2017/06/ant-colony-optimization-part-1.html
http://dataworldblog.blogspot.com.es/2017/06/graph-optimization-using-ant-colony.html
http://hdl.handle.net/10609/64285
Regards,
Ester

Related

Simulated Annealing - How to set up an appropriate neighbourhood configuration for a complex problem?

I am working on a project that attempts to use simulated annealing to organise various items into designated sections while maintaining some stability requirements.
For this problem the neighbourhood configuration should remove two or more of these items and switch their places (there are exactly as many items as there are places for them), but I am unsure of how calculated or random this switch should be.
When testing if the stability requirements are fulfilled, I am able to tell vaguely how items should be repositioned in order to improve the stability, but should the neighbourhood configuration take this into account? Or will that make the algorithm too "greedy"?
Any help/direction is much appreciated!
I have some experience with simulated annealing problems, but I don't know what you mean by 'neighbourhood configuration'. My scenario used the following rules: 1) if making the simulation lower energy, always accept the exchange 2) if making the simulation higher energy, calculate a probability of exchange using the Boltzmann factor and randomise a yes/no for acceptance. "Energy" is abstracted as counting all the non-same-type nearest neighbour interactions and subtracting all the same-type interactions, so that segregating into regions of similar neighbours is energetically favourable.

What optimization algorithm is more suitable for timetable rescheduling?

I'm working on the project where university course is represented as a to-do list, where:
course owner (teacher of the course) can add tasks (containing the URL to the resource needs to be learned and two datetime fields - when to start and when to complete the task)
course subscriber (student) can mark tasks as complete or not complete and their marks are saved individually for each account.
If student marks task as complete - his account + element he marked are shown in the course activity tab for teacher where he can:
initiate a conversation in JavaScript-based chat with him
evaluate the result of the conversation
What optimization algorithm you could recommend me to use for timetable rescheduling (changing datetime fields for to-do element if student procrastinates) here?
Actually, we can use the student activity on the resource + fact that he marked the task as complete + if he clicked or not on the URL placed on the to-do element leading to the external learning material (for example Google Book).
For example, are genetic algorithms suitable for this model and what pitfalls do they have: https://medium.com/#vijinimallawaarachchi/time-table-scheduling-2207ca593b4d ?
I'm not sure I completely understand your problem but it sounds like you have a feasible timetable to begin with and you just need to improve it.
If so genetic algorithms will work very well, but I think representing everything as binary 'chromosomes' like in the link might not be practical.
There are many other ways you can represent a timetable, such as in a 2D array, or giving an event a slot number.
You could look into algorithms such as Tabu search, Simulated Annealing and Great Deluge and Hill Climbing. They are all based on similar ideas but some work better with some problems than others. For example if you have a very rough search space simulated annealing won't be the best and Hill Climbing usually only finds a local optimum.
The general architecture of the algorithms mentioned above and many other genetic algorithms and Metaheuristics is: select a neighbouring solution using a move operator (e.g. swapping the time of one or two or three events or swapping the rooms of two events etc...), check the move doesn't violate any hard constraints, use an acceptance strategy such as, simulated annealing or Great Deluge, to determine if the move is accepted. If it is keep the solution and repeat the steps until the termination criterion is met. This can be max time, number of iterations reached or improving move hasn't been found in x number of iterations.
Whilst this is running keep a log of the 'best' solution so when the algorithm is terminated you have the best solution found. You can determine what is considered 'best' based on how many soft constraints the timetable violates
Hope this helps!

First Fit Decreasing Algorithm on CVRP in Optaplanner

I am using the FFD Algorithm in Optaplanner as a construction heuristic for my CVRP problem. I thought I understood the FFD-Alg from bin picking, but I don't understand the logic behind it when applied in OP on CVRP. So my thought was, it focuses on the demands (Sort cities in decreasing order, starting with the highest demand). To proof my assumption, I fixed the city coordinates to only one location, so the distance to the depot of all cities is the same. Then I changed the demands from big to small. But it doesn't take the cities in decrasing order in the result file.
The Input is: City 1: Demand 16, City 2: Demand 12, City 3: Demand 8, City 4: Demand 4,
City 5: Demand 2.
3 Vehicles with a capacity of 40 per vehicle.
What I thougt: V1<-[C1,C2,C3,C4], V2<-[C5]
What happened: V1<-[C5,C4,C3,C2], V2<-[C1]
Could anyone please explain me the theory of this? Also, I would like to know what happens the other way around, same capacities, but different locations per customer. I tried this too, but it also doesn't sort the cities beginning with the farthest one.
Thank you!
(braindump)
Unlike with non-VRP problems, the choice of "difficulty comparison" to determine the "Decreasing Difficulty" of "First Fit Decreasing", isn't always clear. I've done experiments with several forms - such as based on distance to depot, angle to depot, latitude, etc. You can find all those forms of difficulty comperators in the examples, usually TSP.
One common pitfall is to tweak the comperator before enabling Nearby Selection: tweak Nearby Selection first. If you're dealing with large datasets, an "angle to depo" comparator will behave much better, just because Nearby Selection and/or Paritioned Search aren't active yet. In any case, as always: you need to be using optaplanner-benchmark for this sort of work.
This being said, on a pure TSP use case, the First Fit Decreasing algorithm has worse results than the Nearest Neighbor algorithm (which is another construction heuristics for which we have limited support atm). Both require Local Search to improve further, of course. However, translating the Nearest Neighbor algorithm to VRP is difficult/ambiguous (to say the least): I was/am working on such a translation and I am calling it the Bouy algorithm (look for a class in the vrp example that starts with Bouy). It works, but it doesn't combine well with Nearby Selection yet IIRC.
Oh, and there's also the Clarke and Wright savings algorithm, which is great on small pure CVRP cases. But suffers from BigO (= scaling) problems on bigger datasets and it too becomes difficult/ambiguous when other constraints are added (such as time windows, skill req, lunch breaks, ...).
Long story short: the jury's still out on what's the best construction heuristic for real-world, advanced VRP cases. optaplanner-benchmark will help us there. This despite all the academic papers talking about their perfect CH for a simple form of VRP on small datasets...

Strategy to tackle knapsack binded with travelling salesman

I have been assigned the following problem as a research topic for summer. However, I have not been able to successfully find related problem, except that it seems to be a combination of travelling salesman with the knapsack, even though I'm not sure if that is the case. The statement is:
You are a truck driver who earned a big contract and now you must
deliver a lot of packages. There are N packages to be delivered, each
one has to be delivered at a certain address (x,y) on the city.
Additionally each package i has a weight Wi.
For simplicity, suppose the distribution area is rectangular and you
always start at the point (0,90).
You only have one truck, with limited capacity of 1000 (excluding
truck weight). The truck base weight is 10.
The distances to be travelled are far away, so the distance shall be
computer using Haversine distance.
The company who contracted you will provide you with enough fuel, so
you can make unlimited ammount of travels.
However, you must be very careful while delivering the packages since
you must deliver every single one of them and, if you choose to pick
up a package during a trip, you must deliver it no matter what, you
can't leave them in the middle of your trip.
As you are a bit miserly, you agree on the conditions, but you know
that if you don't take a close to optimal strategy, your truck can
wear too much, so you could end leaving the contract incomplete,
which will get you sued and leave you without truck and money.
So, due to your experience, you know that to maximize the chances of survival
of your truck, you must minimize the following function:
http://goo.gl/jQxXzN (sorry, I can't post images because I have not enough reputation).
where m is the number of tripss, n is the number of packages for each trip j, wij is the weight of the i-th gift during the j-th trip, Dist represents Haversine distance between two points, Loc(i) is the location of the i-th gift, Loc(0) and Loc(n) are the (0,90) point (starting point), and wnj (last weight of the trip) is always the truck weight (base weight).
So, basically, those are all the restictions of the research topic I got.
I have been thinking maybe some metaheuristics such as ant collony or genetic algorithms could be of help, but I would have to get to know the problem a bit better. Any idea or information (paper, book, etc) is welcomed.
This sounds like a variation of the Capacitated Vehicle Routing Problem (CRVP), specifically with with single-vehicle and single-depot (however with non-uniform packages). For some reference on this problem, see e.g.:
On the Capacitated Vehicle Routing Problem (T.K. Ralphs, L. Kopman, W.R. Pulleyblank, and L.E. Trotter, Jr)
Modeling and Solving the Capacitated Vehicle Routing Problem on Trees (B. Chandran and S. Raghavan)
I think your idea of metaheuristics---ant colony optimisation (ACO) in particular---would be a wise approach. Note that for TSP related problems, generally ACO is to prefer over genetic algorithms (GA).
Possibly the following somewhat well known article (1) can get you started to studying the possible benefits of ACO approach. (2) extends the result from (1). Note that this covers the regular VHP (no capacitated), but it should prove valuable as a starting point, the least giving inspiration.
SavingsAnts for the Vehicle Routing Problem (K. Doerner et al.)
An improved ant colony optimization for vehicle routing problem (Yu Bin et al.)
There also seems to exists literature specifically one the subject of ACO and CVRP, but I cannot comment to the quality of these, but I'll list them for you to inspect by yourself. Note that (3) is an extension from the authors of (1).
Parallel Ant Systems for the Capacitated Vehicle Routing Problem (K. Doerner et al.)
Ant Colony Optimization for Capacitated Vehicle Routing Problem (W.F. Tan et al.)
Sound like an interesting research topic, good luck!

What model best suits optimizing for a real-time strategy game?

An article has been making the rounds lately discussing the use of genetic algorithms to optimize "build orders" in StarCraft II.
http://lbrandy.com/blog/2010/11/using-genetic-algorithms-to-find-starcraft-2-build-orders/
The initial state of a StarCraft match is pre-determined and constant. And like chess, decisions made in this early stage of the match have long-standing consequences to a player's ability to perform in the mid and late game. So the various opening possibilities or "build orders" are under heavy study and scrutiny. Until the circulation of the above article, computer-assisted build order creation probably wasn't as popularity as it has been recently.
My question is... Is a genetic algorithm really the best way to model optimizing build orders?
A build order is a sequence of actions. Some actions have prerequisites like, "You need building B before you can create building C, but you can have building A at any time." So a chromosome may look like AABAC.
I'm wondering if a genetic algorithm really is the best way to tackle this problem. Although I'm not too familiar with the field, I'm having a difficult time shoe-horning the concept of genes into a data structure that is a sequence of actions. These aren't independent choices that can be mixed and matched like a head and a foot. So what value is there to things like reproduction and crossing?
I'm thinking whatever chess AIs use would be more appropriate since the array of choices at any given time could be viewed as tree-like in a way.
Although I'm not too familiar with the field, I'm having a difficult time shoe-horning the concept of genes into a data structure that is a sequence of actions. These aren't independent choices that can be mixed and matched like a head and a foot. So what value is there to things like reproduction and crossing?
Hmm, that's a very good question. Perhaps the first few moves in Starcraft can indeed be performed in pretty much any order, since contact with the enemy is not as immediate as it can be in Chess, and therefore it is not as important to remember the order of the first few moves as it is to know which of the many moves are included in those first few. But the link seems to imply otherwise, which means the 'genes' are indeed not all that amenable to being swapped around, unless there's something cunning in the encoding that I'm missing.
On the whole, and looking at the link you supplied, I'd say that genetic algorithms are a poor choice for this situation, which could be accurately mathematically modelled in some parts and the search tree expanded out in others. They may well be better than an exhaustive search of the possibility space, but may not be - especially given that there are multiple populations and poorer ones are just wasting processing time.
However, what I mean by "a poor choice" is that it is inefficient relative to a more appropriate approach; that's not to say that it couldn't still produce 98% optimal results in under a second or whatever. In situations such as this where the brute force of the computer is useful, it is usually more important that you have modelled the search space correctly than to have used the most effective algorithm.
As TaslemGuy pointed out, Genetic Algorithms aren't guaranteed to be optimal, even though they usually give good results.
To get optimal results you would have to search through every possible combination of actions until you find the optimal path through the tree-like representation. However, doing this for StarCraft is difficult, since there are so many different paths to reach a goal. In chess you move a pawn from e2 to e4 and then the opponent moves. In StarCraft you can move a unit at instant x or x+1 or x+10 or ...
A chess engine can look at many different aspects of the board (e.g. how many pieces does it have and how many does the opponent have), to guide it's search. It can ignore most of the actions available if it knows that they are strictly worse than others.
For a build-order creator only time really matters. Is it better to build another drone to get minerals faster, or is it faster to start that spawning pool right away? Not as straightforward as with chess.
These kinds of decisions happen pretty early on, so you will have to search each alternative to conclusion before you can decide on the better one, which will take a long time.
If I were to write a build-order optimizer myself, I would probably try to formulate a heuristic that estimates how good (close the to the goal state) the current state is, just as chess engines do:
Score = a*(Buildings_and_units_done/Buildings_and_units_required) - b*Time_elapsed - c*Minerals - d*Gas + e*Drone_count - f*Supply_left
This tries to keep the score tied to the completion percentage as well as StarCraft common knowledge (keep your ressources low, build drones, don't build more supply than you need). The variables a to f would need tweaking, of course.
After you've got a heuristic that can somewhat estimate the worth of a situation, I would use Best-first search or maybe IDDFS to search through the tree of possibilities.
Edit:
I recently found a paper that actually describes build order optimization in StarCraft, in real time even. The authors use depth-first search with branch and bound and heuristics that estimate the minimum amount of effort required to reach the goal based on the tech tree (e.g. zerglings need a spawning pool) and the time needed to gather the required minerals.
Genetic Algorithm can be, or can sometimes not be, the optimal or non-optimal solution. Based on the complexity of the Genetic Algorithm, how much mutation there is, the forms of combinations, and how the chromosomes of the genetic algorithm is interpreted.
So, depending on how your AI is implemented, Genetic Algorithms can be the best.
You are looking at a SINGLE way to implement genetic algorithms, while forgetting about genetic programming, the use of math, higher-order functions, etc. Genetic algorithms can be EXTREMELY sophisticated, and by using clever combining systems for crossbreeding, extremely intelligent.
For instance, neural networks are optimized by genetic algorithms quite often.
Look up "Genetic Programming." It's similar, but uses tree-structures instead of lines of characters, which allows for more complex interactions that breed better. For more complex stuff, they typically work out better.
There's been some research done using hierarchical reinforcement learning to build a layered ordering of actions that efficiently maximizes a reward. I haven't found much code implementing the idea, but there are a few papers describing MAXQ-based algorithms that have been used to explicitly tackle real-time strategy game domains, such as this and this.
This Genetic algorithm only optimizes the strategy for one very specific part of the game: The order of the first few build actions of the game. And it has a very specific goal as well: To have as many roaches as quickly as possible.
The only aspects influencing this system seem to be (I'm no starcraft player):
build time of the various units and
buildings
allowed units and buildings given the available units and buildings
Larva regeneration rate.
This is a relatively limited, relatively well defined problem with a large search space. As such it is very well suited for genetic algorithms (and quite a few other optimization algorithm at that). A full gene is a specific set of build orders that ends in the 7th roach. From what I understand you can just "play" this specific gene to see how fast it finishes, so you have a very clear fitness test.
You also have a few nice constraints on the build order, so you can combine different genes slightly smarter than just randomly.
A genetic algorithm used in this way is a very good tool to find a more optimal build order for the first stage of a game of starcraft. Due to its random nature it is also good at finding a surprising strategy, which might have been an additional goal of the author.
To use a genetic algorithm as the algorithm in an RTS game you'd have to find a way to encode reactions to situations rather than just plain old build orders. This also involves correctly identifying situations which can be a difficult task in itself. Then you'd have to let these genes play thousands of games of starcraft, against each other and (possibly) against humans, selecting and combining winners (or longer-lasting losers). This is also a good application of genetic algorithms, but it involves solving quite a few very hard problems before you even get to the genetic algorithm part.