OptaPlanner - Why does switching from FULL_ASSERT mode to another give totally different results?

I am developing a solver using Drools to find the most likely date (start date = planning variable) on which a medical analysis (planning entity) will start. From that date I calculate other dates (the end date and other intermediate dates) whose calculation does not depend on the OptaPlanner mechanism. For my problem, I need each rule to fire whenever the planning variable's value changes.
When using FULL_ASSERT mode everything works, but when I change to another mode the results are a mess: almost no rule is respected, or the solution gives null values, and I don't understand why.
Is FULL_ASSERT the only mode that guarantees all rules fire each time the planning variable's value changes?

Try running NON_INTRUSIVE_FULL_ASSERT; it doesn't trigger extra fireAllRules() calls.
Either way, switching on FULL_ASSERT etc. shouldn't change the behavior (unless you're explicitly using Simulated Annealing with wall clock time, because it is time-gradient sensitive). If it does change behavior, it's probably due to some sort of score corruption. All the more reason to run NON_INTRUSIVE_FULL_ASSERT and figure out where.
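For reference, a minimal sketch of switching the mode programmatically, assuming the SolverConfig/SolverFactory API (the mode can equally be set via an environmentMode element in the solver XML); MySolution is a placeholder for your planning solution class:

```java
import org.optaplanner.core.api.solver.Solver;
import org.optaplanner.core.api.solver.SolverFactory;
import org.optaplanner.core.config.solver.EnvironmentMode;
import org.optaplanner.core.config.solver.SolverConfig;

public class EnvironmentModeExample {
    public static void main(String[] args) {
        // Load the existing solver configuration, then override the mode
        // to hunt for score corruption without extra fireAllRules() calls.
        SolverConfig config = SolverConfig.createFromXmlResource("solverConfig.xml");
        config.setEnvironmentMode(EnvironmentMode.NON_INTRUSIVE_FULL_ASSERT);

        SolverFactory<MySolution> factory = SolverFactory.create(config);
        Solver<MySolution> solver = factory.buildSolver();
        // solver.solve(problem) will now fail fast when it detects score corruption.
    }
}
```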

Related

How to handle a huge amount of changes in real-time planning?

If real-time planning and daemon mode are enabled, an update or addition of a planning entity must be applied through a problem fact change.
So let's say the average rate of change is 1/sec; a problem fact change must then be invoked every second, restarting the solver every second.
Should we just invoke or schedule a problem fact change every second (restarting the solver every second), or, if we know a huge amount of changes is coming, stop the solver first, apply the changes, then start the solver again?
In the scenario you describe, the solver will likely be restarted every time. It's not a complete restart, as if you had just called Solver.solve() with the updated last known solution, but the ScoreDirector, the component responsible for score calculation, is restarted each time a problem change is applied.
If problem changes come faster, they might be processed in a batch. The solver checks for problem changes between the evaluations of individual moves, so if multiple changes arrive before the solver finishes evaluating the current move, they are all applied and the solver restarts just once. In the opposite case, when changes arrive only rarely, the restart doesn't matter much, as there is enough time for the solver to improve the solution.
But the rate of 1 change/sec will likely lead to frequent solver restarts and will affect its ability to produce better solutions.
The solver does not know whether a larger batch of changes is coming in the next second. The current behavior may be improved by processing problem changes periodically, at a predefined time interval, rather than between move evaluations.
Of course, the periodic grouping of problem changes can be done outside the solver as well.
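To illustrate that last point, here's a rough sketch of grouping changes outside the solver. It assumes the newer ProblemChange API (Solver.addProblemChanges); the batcher class itself and the flush interval are hypothetical:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.Executors;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

import org.optaplanner.core.api.solver.Solver;
import org.optaplanner.core.api.solver.change.ProblemChange;

public class ProblemChangeBatcher<Solution_> {

    private final Solver<Solution_> solver;
    private final BlockingQueue<ProblemChange<Solution_>> queue = new LinkedBlockingQueue<>();
    private final ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();

    public ProblemChangeBatcher(Solver<Solution_> solver, long flushPeriodSeconds) {
        this.solver = solver;
        // Flush all accumulated changes at a fixed interval, so the solver
        // restarts its ScoreDirector once per batch instead of once per change.
        scheduler.scheduleAtFixedRate(this::flush, flushPeriodSeconds, flushPeriodSeconds, TimeUnit.SECONDS);
    }

    public void submit(ProblemChange<Solution_> change) {
        queue.add(change);
    }

    private void flush() {
        List<ProblemChange<Solution_>> batch = new ArrayList<>();
        queue.drainTo(batch);
        if (!batch.isEmpty()) {
            solver.addProblemChanges(batch); // applied together between move evaluations
        }
    }
}
```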

How to solve a scheduling problem with random durations? (undo moves will result in different scores)

"A shadow variable is in essence the result of a formula/algo based on at least 1 planning variable (and maybe some problem properties). The same planning variables state should always deliver the exact same shadow variable state." However, for project scheduling problems with random durations, even the start time of a job remains the same as before move or undomove, the end time of the job will be different, because the duration is a random variable. Both the start time and the end time of a job are shadow variables. Then the score after undomove and beforemove will be different. How to deal with this situation?
When you say:
The score after the undo move will differ from the score before the move.
That is the root of your problem. Assume a solution X and a move M. Move M transforms solution X into solution Y. The undo move of M must then transform solution Y back into solution X. (X and Y do not need to be, and are not going to be, the same instance; they just need to represent the exact same state of the problem.)
Where you fall short is in modelling random duration of tasks. Every time the duration of a task changes, that is a change to the problem. When task duration changes, you are no longer solving the same problem - and you need to tell the solver that.
There are two ways of doing that:
Externally via ProblemChange. This will effectively restart the solver.
During a custom move, using the ScoreDirector's before...() and after...() methods, as sketched below. But if you do that, then your undo moves must restore the solution back to its original state: they must reset the duration to what it was before.
There really is no way around this. Undo moves restore the solution to the exact same state as there was before the original move.
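To illustrate option 2, here's a minimal sketch of such a custom move. Task, MySolution, and the duration accessors are hypothetical names for your own classes; equals(), hashCode(), and rebase() are omitted for brevity:

```java
import org.optaplanner.core.api.score.director.ScoreDirector;
import org.optaplanner.core.impl.heuristic.move.AbstractMove;

public class ChangeDurationMove extends AbstractMove<MySolution> {

    private final Task task;
    private final int newDuration;

    public ChangeDurationMove(Task task, int newDuration) {
        this.task = task;
        this.newDuration = newDuration;
    }

    @Override
    public boolean isMoveDoable(ScoreDirector<MySolution> scoreDirector) {
        return task.getDuration() != newDuration;
    }

    @Override
    protected AbstractMove<MySolution> createUndoMove(ScoreDirector<MySolution> scoreDirector) {
        // Capture the current duration so the undo move restores the exact prior state.
        return new ChangeDurationMove(task, task.getDuration());
    }

    @Override
    protected void doMoveOnGenuineVariables(ScoreDirector<MySolution> scoreDirector) {
        // Notify the score director before and after mutating the problem property,
        // so the score calculation stays in sync with the solution state.
        scoreDirector.beforeProblemPropertyChanged(task);
        task.setDuration(newDuration);
        scoreDirector.afterProblemPropertyChanged(task);
    }
}
```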
That said, I honestly do not understand how you implement randomness in your planning entities. If you share your code, maybe I will be able to give a more targeted answer.

What exactly does A/B test mean in this scenario?

When multiple developers are continually applying changes to databases within an organization, there may be no structured change log. Then it's difficult to find out what change caused the database performance to slow down. In a call center, where results must be instantaneous for customers waiting on the phone, an adverse change can be detrimental to the business. The DEA tests a change that might have an impact, and compares it to the unchanged database. You apply the A/B testing method to analyze one change at a time, helping you to clearly know its effect. The change might mean an improvement to performance, degradation, or maybe no performance difference, which is equally acceptable.
I am looking for an example to understand the following statement from the above quote: The DEA tests a change that might have an impact, and compares it to the unchanged database. You apply the A/B testing method to analyze one change at a time, helping you to clearly know its effect.
For example - suppose a setting currently has the value 0 and I want to set it to 1. How exactly will the above A/B testing help me?
You have two environments, and the only difference between them is the setting you specified. You then split the traffic to the DB in two (e.g. 50% vs. 50%): part of it goes to the first environment and part goes to the other. You monitor the metrics (performance or something else), and if the difference is statistically significant, you can have high confidence that the setting is the change that moved the metric.
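As a toy sketch of that traffic split (all class and method names here are hypothetical, not part of any real tool):

```java
import java.util.concurrent.ThreadLocalRandom;

// Toy sketch of a 50/50 A/B split: route each request to the control
// environment (setting = 0) or the treatment environment (setting = 1)
// and record the latency per variant, so the two samples can later be
// compared for statistical significance.
public class AbTrafficSplitter {

    interface Database { void runQuery(String sql); }

    interface MetricsRecorder { void record(String variant, long latencyNanos); }

    private final Database control;    // unchanged database (setting = 0)
    private final Database treatment;  // database with the setting flipped to 1
    private final MetricsRecorder metrics;

    public AbTrafficSplitter(Database control, Database treatment, MetricsRecorder metrics) {
        this.control = control;
        this.treatment = treatment;
        this.metrics = metrics;
    }

    public void handle(String sql) {
        boolean useTreatment = ThreadLocalRandom.current().nextBoolean(); // 50/50 split
        Database target = useTreatment ? treatment : control;
        long start = System.nanoTime();
        target.runQuery(sql);
        metrics.record(useTreatment ? "B" : "A", System.nanoTime() - start);
    }
}
```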

OptaPlanner: Reproducible solution

I am trying to solve a problem similar to employee rostering. The problem I am facing is that every time I run the solver, it generates a different assignment. This makes it harder to debug why a particular case was picked over another. Why is this the case?
P.S. My assignment has many hard constraints, and not all of them may be satisfied (in most cases I still see some negative hard score). So my termination strategy is based on unimprovedSecondsSpentLimit. Could this be the reason?
Yes, it's likely the termination. OptaPlanner's default environmentMode guarantees the exact same solution at the exact same step (*). But CPU cycles differ a lot from run to run, so that means you get more or less steps per run. Use DEBUG logging to see that.
Use stepCountLimit or unimprovedStepCountLimit termination.
(*) Unless specified otherwise in the docs. Simulated Annealing, for example, will differ even at the exact same step if used with time-bound terminations.
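A minimal sketch of switching to a deterministic, step-based termination, assuming the TerminationConfig API (the limit of 500 is just an example value):

```java
import org.optaplanner.core.config.solver.SolverConfig;
import org.optaplanner.core.config.solver.termination.TerminationConfig;

public class DeterministicTerminationExample {
    public static void main(String[] args) {
        SolverConfig config = SolverConfig.createFromXmlResource("solverConfig.xml");
        // Counting steps is reproducible across runs, unlike wall clock time:
        // the solver stops after 500 steps without improvement, regardless of CPU speed.
        TerminationConfig termination = new TerminationConfig();
        termination.setUnimprovedStepCountLimit(500);
        config.setTerminationConfig(termination);
    }
}
```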

OptaPlanner - soft scoring rule not working as expected

I built an application which implements a function similar to task assignment. I thought it worked well until recently, when I noticed the solutions are not optimal. In detail, there is a score table for each possible pair of machine and task, and usually the number of machines is much smaller than the number of tasks. I used hard/medium/soft rules, where the soft rule incrementally sums, for each assignment, its score from the score table.
However, when I reviewed the results after a 1-2 hour run, I found that among the unassigned tasks there are many better choices (which would achieve a higher soft score if assigned) than the current assignments. The benchmark reports indicate that the total soft score reached a plateau within an hour and then got stuck at that level.
I checked the logic of the rules: if the soft rule were working perfectly, it should eventually find an allocation which achieves the highest overall soft score while meeting the other hard/medium rules, shouldn't it?
I've been trying various things, such as tuning algorithm parameters and scaling the score table, but none of them delivers the optimal solution.
One problem is that you might be facing a score trap (see the docs). In that case, make your constraint score more fine-grained to deal with it.
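For example, a score-trap fix might look like this sketch, assuming the recent Constraint Streams API (Machine, getUsedCapacity(), and getCapacity() are hypothetical; the same idea applies to a Drools rule): penalize by the size of the violation instead of a flat amount, so the solver can "see" partial improvements.

```java
import org.optaplanner.core.api.score.buildin.hardmediumsoft.HardMediumSoftScore;
import org.optaplanner.core.api.score.stream.Constraint;
import org.optaplanner.core.api.score.stream.ConstraintFactory;

public class MachineConstraints {

    Constraint machineCapacity(ConstraintFactory factory) {
        return factory.forEach(Machine.class)
                .filter(machine -> machine.getUsedCapacity() > machine.getCapacity())
                // Penalize proportionally to the overload, not with a flat -1hard,
                // so moves that reduce (but don't yet remove) the overload
                // still improve the score and guide the search out of the trap.
                .penalize(HardMediumSoftScore.ONE_HARD,
                        machine -> machine.getUsedCapacity() - machine.getCapacity())
                .asConstraint("machineCapacity");
    }
}
```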
If that's not the case, and you're stuck in a local optimum, then I wouldn't play too much with the algorithm parameters - they will probably fix it, but you'll be overfitting on that dataset.
Instead, figure out the smallest possible move that gets you out of that local optimum and a step closer to the global optimum. Add that kind of move as a custom move. For example, if a normal swap move can't help, but you see a way of getting there by doing a 3-swap move, then implement that move.