Hill Climbing and Tabu Search in OptaPlanner - optaplanner

I'm using OptaPlanner to solve some plannig problems. I read the documentation and I'm not quite sure how does exactly Hill Climbing and Tabu Search algorithms work. What I'm unsure of is:
does hill climbing pick only moves with THE BEST score that is BETTER than current one or does it allow to pick moves with THE BEST score that is NOT WORSE than the current one?
does tabu search allow picking moves that have WORSE score than the current one if there is no move leading to a solution with better or equal score to the current one?

For Hill Climbing, see HillClimbingAcceptor#isAccepted(...). It accepts any move that has a score that is better or equal to the lastest step score. And looking at the default forager config for hill climing (in LocalSearchPhaseConfig, which says foragerConfig.setAcceptedCountLimit(1);), as soon as 1 move is accepted, it is the winning move.
For Tabu Search, it will select moves that have a worse score, if:
in the number of moves it selects per step (usually acceptedCountLimit is configured to 1000 or so), no better move is seen
OR all the moves that do lead to a better score are in the tabu list (they are "taboo to use"). For solutionTabu, this means that there's a guarantee that they won't lead to a new best solution (but solutionTabu is useless). For entityTabu there is no such 100% guarantee, but you will get a better results in about 99.999999999999999999999999999999999999999999999999999999999999999% of the cases if you have more than let's say 50+ variables (and more if you have a 1000+ variables).
PS: Hill Climbing sucks. There's never a good reason to not use Late Acceptance or Tabu Search instead.
PPS: Use the Benchmarker, do let HC, LA, TS, ... fight against each other. It will give you a lot of insight.

Related

Iterative deepening in minimax - sorting all legal moves, or just finding the PV-move then using MVV-LVA?

After reading the chessprogramming wiki and other sources, I've been confused about what the exact purpose of iterative deepening. My original understanding was the following:
It consisted of minimax search performed at depth=1, depth=2, etc. until reaching the desired depth. After a minimax search of each depth, sort the root-node moves according to the results from that search, to make for optimal move ordering in the next search with depth+1, so in the next deeper search,the PV-move is searched, then the next best move, then the next best move after that, and so on.
Is this correct? Doubts emerged when I read about MVV-LVA ordering, specifically about ordering captures, and additionally, using hash tables and such. For example, this page recommends a move ordering of:
PV-move of the principal variation from the previous iteration of an iterative deepening framework for the leftmost path, often implicitly done by 2.
Hash move from hash tables
Winning captures/promotions
Equal captures/promotions
Killer moves (non capture), often with mate killers first
Non-captures sorted by history heuristic and that like
Losing captures
If so, then what's the point of sorting the minimax from each depth, if only the PV-move is needed? On the other hand, if the whole point of ID is the PV-move, won't it be a waste to search from every single minimax depth up till desired depth just to calculate the PV-move of each depth?
What is the concrete purpose of ID, and how much computation does it save?
Correct me if I am wrong, but I think you are mixing 2 different concepts here.
Iterative deepening is mainly used to set a maximum search time for each move. The AI will go deeper and deeper, and then when the decided time is up it returns the move from the latest depth it finished searching. Since each increase in depth leads to exponentially longer search times, searching each depth from e.g. 1 to 12 take almost the same time as only searching with depth 12.
Sorting the moves is done to maximize the effect of alpha-beta pruning. If you want an optimal alpha-beta pruning you look at the best move first. Which is of course impossible to know beforehand, but the points you stated above is a good guess. Just make sure that the sorting algorithm doens't slow down your recursive function, and by that removing the effect from the alhpa-beta.
Hope this helps and that I understood your question correctly.

First Fit Decreasing Algorithm on CVRP in Optaplanner

I am using the FFD Algorithm in Optaplanner as a construction heuristic for my CVRP problem. I thought I understood the FFD-Alg from bin picking, but I don't understand the logic behind it when applied in OP on CVRP. So my thought was, it focuses on the demands (Sort cities in decreasing order, starting with the highest demand). To proof my assumption, I fixed the city coordinates to only one location, so the distance to the depot of all cities is the same. Then I changed the demands from big to small. But it doesn't take the cities in decrasing order in the result file.
The Input is: City 1: Demand 16, City 2: Demand 12, City 3: Demand 8, City 4: Demand 4,
City 5: Demand 2.
3 Vehicles with a capacity of 40 per vehicle.
What I thougt: V1<-[C1,C2,C3,C4], V2<-[C5]
What happened: V1<-[C5,C4,C3,C2], V2<-[C1]
Could anyone please explain me the theory of this? Also, I would like to know what happens the other way around, same capacities, but different locations per customer. I tried this too, but it also doesn't sort the cities beginning with the farthest one.
Thank you!
(braindump)
Unlike with non-VRP problems, the choice of "difficulty comparison" to determine the "Decreasing Difficulty" of "First Fit Decreasing", isn't always clear. I've done experiments with several forms - such as based on distance to depot, angle to depot, latitude, etc. You can find all those forms of difficulty comperators in the examples, usually TSP.
One common pitfall is to tweak the comperator before enabling Nearby Selection: tweak Nearby Selection first. If you're dealing with large datasets, an "angle to depo" comparator will behave much better, just because Nearby Selection and/or Paritioned Search aren't active yet. In any case, as always: you need to be using optaplanner-benchmark for this sort of work.
This being said, on a pure TSP use case, the First Fit Decreasing algorithm has worse results than the Nearest Neighbor algorithm (which is another construction heuristics for which we have limited support atm). Both require Local Search to improve further, of course. However, translating the Nearest Neighbor algorithm to VRP is difficult/ambiguous (to say the least): I was/am working on such a translation and I am calling it the Bouy algorithm (look for a class in the vrp example that starts with Bouy). It works, but it doesn't combine well with Nearby Selection yet IIRC.
Oh, and there's also the Clarke and Wright savings algorithm, which is great on small pure CVRP cases. But suffers from BigO (= scaling) problems on bigger datasets and it too becomes difficult/ambiguous when other constraints are added (such as time windows, skill req, lunch breaks, ...).
Long story short: the jury's still out on what's the best construction heuristic for real-world, advanced VRP cases. optaplanner-benchmark will help us there. This despite all the academic papers talking about their perfect CH for a simple form of VRP on small datasets...

Optaplanner take fastest path

How can we optimize Optaplanner to select the fastest route? See the highlighted point in the below image. It is taking the long route.
Note: Vehicles does not need to come back depot. I think i cannot use CVRPTW as arrivalAfterDueTimeAtDepot is a build-in hard constraint (and besides i do not have any time constraints).
How can we write a constraint to select the less capacity vehicle?
For example, A customer needs only 3 items and we have two vehicles with 4 and 9 capacities. Seems like Optaplanner is selecting the first vehicle from the order of input by default.
I presume it's taking the blue vehicle for the center of Bengaluru because the green in is already at full capacity.
Check what the score is (calculated through Solver.getScoreDirectorFactory()) if you manually put that location in the green trip and swap the vehicles of the green and blue trip. If it's worse (or breaks a hard constraint), then it's normal that OptaPlanner selects the other solution. In that case, either your score function has bug (or you realize don't want that solution at all). But if it has indeed a better score, OptaPlanner's <localSearch> (such as Late Acceptance) should find it (especially when scaling out because ironically local optima are a bigger problem when scaling down). You can try to add <subchainSwapMoveSelector> etc to escape local optima faster.
If you want to guide the search more (which is often not a good idea), you can define a planning value strength comparator to sort small vehicles before big vehicles and use the Construction Heuristic WEAKEST_FIT(_DECREASING).

Iterative deepening with a time limit

I'm working on implementing iterative deepening with principal variation for alpha-beta search for a computer chess program, and I was hoping to include a time limit for the search. I was wondering about the consequences of the time limit being reached in the middle of, say, a search at a depth of 5. If this incomplete search has found a new principal variation, would that be guaranteed to be at least as good as the principal variation found by the complete search at a depth of 4? Otherwise, it seems like I should throw out anything found by the incomplete search at a depth of 5.
If you stop in the middle of an iteration you can use the best move found so far backed up to the root on that iteration. It's not guaranteed to be at least as good as the best move found by the previous iteration but it is ordered above it by the current iteration. The best scoring move will be missed on the current iteration only if it's ordered below the stopping move.

I am looking for a radio advertising scheduling algorithm / example / experience

Tried doing a bit of research on the following with no luck. Thought I'd ask here in case someone has come across it before.
I help a volunteer-run radio station with their technology needs. One of the main things that have come up is they would like to schedule their advertising programmatically.
There are a lot of neat and complex rule engines out there for advertising, but all we need is something pretty simple (along with any experience that's worth thinking about).
I would like to write something in SQL if possible to deal with these entities. Ideally if someone has written something like this for other advertising mediums (web, etc.,) it would be really helpful.
Entities:
Ads (consisting of a category, # of plays per day, start date, end date or permanent play)
Ad Category (Restaurant, Health, Food store, etc.)
To over-simplify the problem, this will be a elegant sql statement. Getting there... :)
I would like to be able to generate a playlist per day using the above two entities where:
No two ads in the same category are played within x number of ads of each other.
(nice to have) high promotion ads can be pushed
At this time, there are no "ad slots" to fill. There is no "time of day" considerations.
We queue up the ads for the day and go through them between songs/shows, etc. We know how many per hour we have to fill, etc.
Any thoughts/ideas/links/examples? I'm going to keep on looking and hopefully come across something instead of learning it the long way.
Very interesting question, SMO. Right now it looks like a constraint programming problem because you aren't looking for an optimal solution, just one that satisfies all the constraints you have specified. In response to those who wanted to close the question, I'd say they need to check out constraint programming a bit. It's far closer to stackoverflow that any operations research sites.
Look into constraint programming and scheduling - I'll bet you'll find an analogous problem toot sweet !
Keep us posted on your progress, please.
Ignoring the T-SQL request for the moment since that's unlikely to be the best language to write this in ...
One of my favorites approaches to tough 'layout' problems like this is Simulated Annealing. It's a good approach because you don't need to think HOW to solve the actual problem: all you define is a measure of how good the current layout is (a score if you will) and then you allow random changes that either increase or decrease that score. Over many iterations you gradually reduce the probability of moving to a worse score. This 'simulated annealing' approach reduces the probability of getting stuck in a local minimum.
So in your case the scoring function for a given layout might be based on the distance to the next advert in the same category and the distance to another advert of the same series. If you later have time of day considerations you can easily add them to the score function.
Initially you allocate the adverts sequentially, evenly or randomly within their time window (doesn't really matter which). Now you pick two slots and consider what happens to the score when you switch the contents of those two slots. If either advert moves out of its allowed range you can reject the change immediately. If both are still in range, does it move you to a better overall score? Initially you take changes randomly even if they make it worse but over time you reduce the probability of that happening so that by the end you are moving monotonically towards a better score.
Easy to implement, easy to add new 'rules' that affect score, can easily adjust run-time to accept a 'good enough' answer, ...
Another approach would be to use a genetic algorithm, see this similar question: Best Fit Scheduling Algorithm this is likely harder to program but will probably converge more quickly on a good answer.