Is there a function for aborting routing calculation in OptaPlanner?

I want to have a function where, if the calculation time gets too long, we abort the routing calculation and submit the best solution found up to that point in time. Is there such a function in OptaPlanner?

For example, in a GUI application you would start solving on a background (worker) thread. In this scenario you can stop the solver asynchronously by calling solver.terminateEarly() from another thread, typically the UI thread, when you click a stop button.
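A minimal sketch of that pattern (the ExecutorService, MySolution and problem are placeholders, not part of the OptaPlanner API):

ExecutorService executor = Executors.newSingleThreadExecutor();
// solve() runs on the worker thread and blocks until the solver terminates.
Future<MySolution> futureBestSolution = executor.submit(() -> solver.solve(problem));

// Later, e.g. in the stop button's click handler on the UI thread:
solver.terminateEarly();                          // ask the solver to stop as soon as possible
MySolution bestSoFar = futureBestSolution.get();  // solve() returns the best solution found so far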
If this is not what you're looking for, read on.
Provided that by calculation you actually mean the time spent solving, you have several options for stopping the solver. Besides the asynchronous termination described in the first paragraph, you can use synchronous termination:
Use time spent termination if you know beforehand how much time you want to dedicate to solving.
Use unimproved time spent termination if you want to stop solving if the solution doesn't improve for a specified amount of time.
Use best score termination if you want to stop solving after a certain score has been reached.
Synchronous termination is defined before starting the solver, either in the XML solver configuration or through the SolverConfig API. See the OptaPlanner documentation for other termination conditions.
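For illustration, a rough sketch using the SolverConfig API (MySolution and the XML resource path are placeholders; the same limits can also be set in the <termination> element of the XML configuration):

SolverConfig solverConfig = SolverConfig.createFromXmlResource("org/acme/solverConfig.xml");
TerminationConfig terminationConfig = new TerminationConfig();
terminationConfig.setSecondsSpentLimit(30L);            // stop after 30 seconds of solving
terminationConfig.setUnimprovedSecondsSpentLimit(10L);  // or when nothing improves for 10 seconds
solverConfig.setTerminationConfig(terminationConfig);

Solver<MySolution> solver = SolverFactory.<MySolution>create(solverConfig).buildSolver();
MySolution bestSolution = solver.solve(problem);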
Lastly, if you're talking about score calculation and it takes too long to calculate the score for a single move (solution change), then you're most certainly doing something wrong. For OptaPlanner to be able to search the solution space effectively, score calculation must be fast (at least 1000 calculations per second).
For example, in a vehicle routing problem, driving times or road distances must be known at the time you start solving. You shouldn't slow down score calculation with heavy computation that can be done beforehand.
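As a hedged illustration (roadRouter, Location and the lookup at the end are hypothetical), you would compute the expensive distances once, before solving, and only look them up during score calculation:

// Precompute road distances once, before the solver starts.
Map<Location, Map<Location, Long>> distanceMatrix = new HashMap<>();
for (Location from : locations) {
    Map<Location, Long> row = new HashMap<>();
    for (Location to : locations) {
        row.put(to, roadRouter.travelTimeMillis(from, to)); // expensive call, done only once
    }
    distanceMatrix.put(from, row);
}
// During score calculation only a cheap map lookup remains:
long travelTime = distanceMatrix.get(visit.getPreviousLocation()).get(visit.getLocation());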

Related

How to handle huge amount of changes in real time planning?

If real-time planning and daemon mode are enabled, a problem fact change must be invoked whenever a planning entity is to be updated or added.
So let's say the average rate of change is 1 per second; then a problem fact change must be invoked every second, which results in restarting the solver every second.
Do we just invoke or schedule a problem fact change every second (restarting the solver every second), or, if we know there will be a huge amount of changes, should we stop the solver first, apply the changes, and then start the solver again?
In the scenario you describe, the solver will likely be restarted every time. It's not a complete restart, as if you had just called Solver.solve() again with the updated last known solution, but the ScoreDirector, the component responsible for score calculation, is restarted each time a problem change is applied.
If problem changes come faster, they might be processed in a batch. The solver checks for problem changes between the evaluation of individual moves, so if multiple changes arrive before the solver finishes evaluating the current move, they are all applied and the solver restarts just once. In the opposite case, when changes arrive only rarely, the restart doesn't matter much, as there is enough time for the solver to improve the solution.
But the rate of 1 change/sec will likely lead to frequent solver restarts and will affect its ability to produce better solutions.
The solver does not know whether a larger number of changes is going to arrive in the next second. The current behavior might be improved by processing problem changes periodically, at a predefined time interval, rather than between move evaluations.
Of course, the periodic grouping of problem changes can be done outside the solver as well.
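A hedged sketch of that grouping done outside the solver (the queue, the one-second interval, MySolution and the body of the queued change are all placeholders):

BlockingQueue<ProblemFactChange<MySolution>> pendingChanges = new LinkedBlockingQueue<>();

// Producer side: every external update is queued instead of being sent to the solver immediately.
pendingChanges.add(scoreDirector -> {
    // apply one external update to the working solution here
});

// Consumer side: once per interval, submit everything queued so far as a single batch,
// so the solver restarts once per interval instead of once per change.
ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();
scheduler.scheduleAtFixedRate(() -> {
    List<ProblemFactChange<MySolution>> batch = new ArrayList<>();
    pendingChanges.drainTo(batch);
    if (!batch.isEmpty()) {
        solver.addProblemFactChanges(batch);
    }
}, 1, 1, TimeUnit.SECONDS);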

Recommended way of measuring execution time in Tensorflow Federated

I would like to know whether there is a recommended way of measuring execution time in TensorFlow Federated. To be more specific, if one would like to extract the execution time for each client in a certain round, e.g., for each client involved in a FedAvg round, by saving a time stamp before the local training starts and a time stamp just before sending back the updates, what is the best (or just correct) strategy to do this? Furthermore, since the clients' code runs in parallel, are such time stamps misleading (especially considering the hypothesis that different clients may be using differently sized models for local training)?
To be very practical: is it appropriate to call tf.timestamp() at the beginning and at the end of the @tf.function client_update(model, dataset, server_message, client_optimizer) -- this is probably a simplified signature -- and then subtract the two time stamps?
I have the feeling that this is not the right way to do it, given that clients run in parallel on the same machine.
Thanks to anyone who can help me with that.
There are multiple potential places to measure execution time, so a first step might be to define very specifically what the intended measurement is.
Measuring the training time of each client as proposed is a great way to get a sense of the variability among clients. This could help identify whether rounds frequently have stragglers. Using tf.timestamp() at the beginning and end of the client_update function seems reasonable. The question correctly notes that this happens in parallel; summing all of these times would be akin to CPU time.
Measuring the time it takes to complete all client training in a round would generally be the maximum of the values above. This might not hold when simulating FL in TFF, as TFF may decide to run some clients sequentially due to system resource constraints; in practice, all of these clients would run in parallel.
Measuring the time it takes to complete a full round (the maximum time it takes to run a client, plus the time it takes for the server to update) could be done by moving the tf.timestamp calls to the outer training loop. That would mean wrapping the call to trainer.next() in the snippet on https://www.tensorflow.org/federated, and it would be the measurement most similar to elapsed real time (wall-clock time).

Optaplanner - soft scoring rule not working as expected

I built an application that implements functionality similar to task assignment. I thought it worked well until recently, when I noticed the solutions are not optimal. In detail, there is a score table with a score for each possible machine-task pair, and usually the number of machines is much smaller than the number of tasks. I used hard/medium/soft rules, where the soft score is accumulated from the score of each assignment in the score table.
However, when I reviewed the results after a 1-2 hour run, I found that among the unassigned tasks there are many better choices (which would achieve a higher soft score if assigned) than the current assignments. The benchmark reports indicate that the total soft score reached a plateau within an hour and then got stuck at that level.
I checked the logic of the rules - if the soft rule were working perfectly, the solver should eventually find an allocation that achieves the highest overall soft score while still satisfying the other hard/medium rules, shouldn't it?
I've been trying various things such as tuning algorithm parameters, scaling the score table, etc., but none of them delivers the optimal solution.
One possibility is that you're facing a score trap (see the docs). In that case, make your constraint score more fine-grained to deal with it.
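As a sketch of what "more fine-grained" means, here are two versions of a hypothetical capacity constraint written with the ConstraintStreams API (the Machine getters are made up; the same idea applies to DRL score rules). The coarse version penalizes every overloaded machine by the same fixed amount, so the solver can't tell a slightly better move from a much better one; the fine-grained version scales the penalty with the size of the violation.

Constraint machineCapacityCoarse(ConstraintFactory factory) {
    // Score trap: a flat penalty, identical for small and huge overloads.
    return factory.forEach(Machine.class)
            .filter(machine -> machine.getUsedCapacity() > machine.getCapacity())
            .penalize("Machine capacity (coarse)", HardMediumSoftScore.ONE_HARD);
}

Constraint machineCapacityFineGrained(ConstraintFactory factory) {
    // Fine-grained: the penalty grows with the excess, giving the solver a gradient to follow.
    return factory.forEach(Machine.class)
            .filter(machine -> machine.getUsedCapacity() > machine.getCapacity())
            .penalize("Machine capacity (fine grained)", HardMediumSoftScore.ONE_HARD,
                    machine -> machine.getUsedCapacity() - machine.getCapacity());
}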
If that's not the case and you're stuck in a local optimum, then I wouldn't play too much with the algorithm parameters - tuning will probably fix this particular dataset, but you'll be overfitting on it.
Instead, figure out the smallest possible move that gets you out of that local optimum and a step closer to the global optimum. Add that kind of move as a custom move. For example, if a normal swap move can't help, but you see a way of getting there by doing a 3-swap move, then implement that move (see the sketch below).
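For illustration, a minimal sketch of such a 3-swap (rotate) move, assuming a hypothetical Task entity with a planning variable "machine" and a solution class TaskSolution; the real move depends entirely on your domain model:

import java.util.Arrays;
import java.util.Collection;
import java.util.Objects;
import org.optaplanner.core.api.score.director.ScoreDirector;
import org.optaplanner.core.impl.heuristic.move.AbstractMove;

public class ThreeTaskRotateMove extends AbstractMove<TaskSolution> {

    private final Task first;
    private final Task second;
    private final Task third;

    public ThreeTaskRotateMove(Task first, Task second, Task third) {
        this.first = first;
        this.second = second;
        this.third = third;
    }

    @Override
    public boolean isMoveDoable(ScoreDirector<TaskSolution> scoreDirector) {
        // Pointless if all three tasks already sit on the same machine.
        return !(Objects.equals(first.getMachine(), second.getMachine())
                && Objects.equals(second.getMachine(), third.getMachine()));
    }

    @Override
    protected ThreeTaskRotateMove createUndoMove(ScoreDirector<TaskSolution> scoreDirector) {
        // Rotating in the opposite direction restores the original assignment.
        return new ThreeTaskRotateMove(third, second, first);
    }

    @Override
    protected void doMoveOnGenuineVariables(ScoreDirector<TaskSolution> scoreDirector) {
        Machine machineOfFirst = first.getMachine();
        Machine machineOfSecond = second.getMachine();
        Machine machineOfThird = third.getMachine();
        changeMachine(scoreDirector, first, machineOfSecond);
        changeMachine(scoreDirector, second, machineOfThird);
        changeMachine(scoreDirector, third, machineOfFirst);
    }

    private void changeMachine(ScoreDirector<TaskSolution> scoreDirector, Task task, Machine machine) {
        scoreDirector.beforeVariableChanged(task, "machine");
        task.setMachine(machine);
        scoreDirector.afterVariableChanged(task, "machine");
    }

    @Override
    public Collection<?> getPlanningEntities() {
        return Arrays.asList(first, second, third);
    }

    @Override
    public Collection<?> getPlanningValues() {
        return Arrays.asList(first.getMachine(), second.getMachine(), third.getMachine());
    }
}

To make the solver actually generate these moves, register them through a MoveListFactory or MoveIteratorFactory in the local search configuration, typically next to the standard change and swap move selectors.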

Optaplanner - Real-time planning doesn't find good solution

I am trying to use real time planning using a custom SolverThread implementing SolverEventListener and the daemon mode.
I am not interested in inserting or deleting entities. I am just interested in "updating" them, for example, changing the "priority" for a particular entity in my PlanningEntityCollectionProperty collection.
At the moment, I am using:
scoreDirector.beforeProblemPropertyChanged(entity);
entity.setPriority(newPriority);
scoreDirector.afterProblemPropertyChanged(entity);
It seems that the solver is executed and it manages to improve the actual solution, but it only spends a few ms on it:
org.optaplanner.core.impl.solver.DefaultSolver: Real-time problem fact changes done: step total (1), new best score (0hard/-100medium/-15soft).
org.optaplanner.core.impl.solver.DefaultSolver: Solving restarted: time spent (152), best score (0hard/-100medium/-15soft), environment mode (REPRODUCIBLE),
Therefore, the solver stops really soon, considering that my solver has a 10-second UnimprovedSecondsSpentLimit. So, the first time the solver is executed it stops after 10 seconds, but the following times it stops after a few ms and doesn't manage to get a good solution.
I am not sure I need to use "beforeProblemPropertyChanged" when the planning entity changes, but I can't find any alternative, because "beforeVariableChanged" is used when the planning variable changes, right? Maybe OptaPlanner just doesn't support updates to the entities and I need to delete the old one using beforeEntityRemoved and insert it again using beforeEntityAdded?
I was using BRANCH_AND_BOUND; however, I have changed to local search TABU_SEARCH and it seems that the solver now uses the 10 seconds. However, it seems stuck in a local optimum, because it doesn't manage to improve the solution even with a really small collection (10 entities).
Anyone with experience with real time planning?
Thanks
The "Solving restarted" always follows very shortly after "Real-time problem fact changes done", because real-time problem facts effectively "stop & restart" the solver. The 10 sec unimproved termination only starts counting again after the restart.
DEBUG logging (or even TRACE) will show you what's really going on.
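For reference, a property update like the one in the question is usually applied from inside a ProblemFactChange submitted to the solver, rather than by calling the score director directly from another thread. A minimal sketch, assuming a hypothetical MyEntity with a @PlanningId and a priority field:

solver.addProblemFactChange(scoreDirector -> {
    // Look up the solver's internal working copy of the entity (requires a @PlanningId).
    MyEntity workingEntity = scoreDirector.lookUpWorkingObject(entity);
    scoreDirector.beforeProblemPropertyChanged(workingEntity);
    workingEntity.setPriority(newPriority);
    scoreDirector.afterProblemPropertyChanged(workingEntity);
    scoreDirector.triggerVariableListeners();
});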

Optaplanner termination strategy

In the OptaPlanner configuration, there is a provision to specify the termination timeout.
Is there a better way to handle the termination timeout strategy? For example, my problem size is small and I have set the termination timeout to 10 seconds.
But I can see from the logs that the best score is reached well within 2-3 seconds. Is there any way to exit once the best score is reached?
Or should the program always run until the timeout is reached and then output the best score?
Take a look at the Termination chapter in the OptaPlanner documentation.
What you are referring to is called BestScoreTermination but it might not be what you actually want -- do note that OptaPlanner has no way of knowing if the score is "the optimal score"... unless you configure Exhaustive Search (which doesn't scale well).
Therefore, if you misjudge your problem and set the BestScoreTermination to something "better" than the optimal value, OptaPlanner will run until it has tried out all combinations (which might take effectively forever on big problems). If you're looking for a compromise, take a look at termination composition.
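For example, a hedged sketch that combines a best-score limit with time-based fallbacks (the score string and the limits are arbitrary; when several terminations are configured, the solver stops as soon as the first one applies):

TerminationConfig terminationConfig = new TerminationConfig();
terminationConfig.setBestScoreLimit("0hard/0soft");    // stop immediately once this score is reached
terminationConfig.setUnimprovedSecondsSpentLimit(3L);  // or when nothing improves for 3 seconds
terminationConfig.setSecondsSpentLimit(10L);           // or after 10 seconds in any case
solverConfig.setTerminationConfig(terminationConfig);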