Optaplanner - local phase 0 step in total in 1 minute - optaplanner

I have set up a solver with two local search phases and it works fine. However, there was a time that the 2nd phase didn't make any move in about 1 minute, as the log shows below:
...
2016-05-07 14:14:55,847 [main] DEBUG LS step (10069), time spent (593822), score (0hard/-81medium/5395020soft), best score (0hard/-80medium/5393781soft), accepted/selected move count (5/48), picked move (CL [cID=1147576, id=27246 => SL [id=49, sID=E942648]] <=> CL [cID=1133912, id=14716 => SL [ id=7, sID=E942592]]).
2016-05-07 14:14:55,858 [main] DEBUG LS step (10070), time spent (593833), score (0hard/-81medium/5395390soft), best score (0hard/-80medium/5393781soft), accepted/selected move count (5/18), picked move (CL [cID=1142322, id=22533 => SL [ id=51, sID=E943251]] <=> CL [cID=1134362, id=14118 => SL [ id=49, sID=E942648]]).
2016-05-07 14:14:55,858 [main] INFO Local Search phase (1) ended: step total (10071), time spent (593833), best score (0hard/-80medium/5393781soft).
2016-05-07 14:16:05,042 [main] INFO Local Search phase (2) ended: step total (0), time spent (663017), best score (0hard/-80medium/5393781soft).
2016-05-07 14:16:05,042 [main] INFO Solving ended: time spent (663017), best score (0hard/-80medium/5393781soft), average calculate count per second (2771).
Before phase 1 ended, there wasn't any improvement in the last couple of steps. And phase 2 started but made 0 step in a minute. The solver then ended because it has reached the maximum time allowed.
I'm a bit surprised that phase 2 made no step at all. Is it simply because it didn't manage to find any better score?

If you don't see any moves in TRACE log (as suggested in comments), it could be because you're using a custom move list factory and that's taken too long to initialize.

Related

Local Search phase needs to start from an initialized solution - But how?

I am writing a modified version of the Task Assignment example with my own domain model.
In my model each Task can have a NextTask and a PreviousTask and an Assignee. All 3 are configured as PlanningVariables:
...
/** PreviousTask is a calculated task that the Resource will complete before this one. */
#PlanningVariable(valueRangeProviderRefs = { "tasksRange" }, graphType = PlanningVariableGraphType.CHAINED)
public Task PreviousTask;
#InverseRelationShadowVariable(sourceVariableName = "PreviousTask")
public Task NextTask;
#AnchorShadowVariable(sourceVariableName = "PreviousTask")
public Resource Assignee;
I have been stuck on the Local Search Phase step for some time now as it appears to require an initialized state of my planning variables (Task.PreviousTask in this case).
Error log:
2020-07-16 15:00:15.341 INFO 4616 --- [pool-1-thread-1] o.o.core.impl.solver.DefaultSolver : Solving started: time spent (65), best score (-27init/[0]hard/[0/0/0/0]soft), environment mode (REPRODUCIBLE), random (JDK with seed 0).
2020-07-16 15:00:15.356 INFO 4616 --- [pool-1-thread-1] .c.i.c.DefaultConstructionHeuristicPhase : Construction Heuristic phase (0) ended: time spent (81), best score (-27init/[0]hard/[0/0/0/0]soft), score calculation speed (90/sec), step total (0).
2020-07-16 15:00:15.376 ERROR 4616 --- [pool-1-thread-1] o.o.c.impl.solver.DefaultSolverManager : Solving failed for problemId (5e433f57-8c75-4756-8a9c-4c4ca4a83d6d).
java.lang.IllegalStateException: Local Search phase (1) needs to start from an initialized solution, but the planning variable (Task.PreviousTask) is uninitialized for the entity (com.redhat.optaplannersbs.domain.Task#697ea710).
Maybe there is no Construction Heuristic configured before this phase to initialize the solution.
I've been pouring over the documentation and trying to figure out what I've missed or broken between it and the source example but I cannot work it out. I have no Construction Heuristic configured (Same as the example), and I could not see where in the example does it ever set the previousTaskOrEmployee variable before solving.
I could supply a random PreviousTask in the initial solution model, but surely at least 1 of the tasks would have no previous task?
May I ask if you have <localSearch/> in your solverConfig.xml? If so, the <constructionHeuristic/> has to be there as well.
If none of these phases (CH nor LS) is configured, both Construction Heuristic and Local Search is added with default parameters. But once the appears in the solverConfig.xml, it's considered an override of the defaults, and a user is supposed to take care of initializing the solution (most often by providing the Construction Heuristic configuration).

How to apply multithread function in optaplanner

I want to use multithread function in optaplaner.
I edited cloudBalancingSolverConfig.xml
and set multithread count 4
but I think it does not work.
All thread name is the same.
19:37:12.980 [l-4-thread-1] DEBUG CH step (1533), time spent (5839), score (-3266init/0hard/-909220soft), selected move count (1276), picked move (CloudProcess-1262 {null -> CloudComputer-1112}).
19:37:12.982 [l-4-thread-1] DEBUG CH step (1534), time spent (5841), score (-3265init/0hard/-909710soft), selected move count (1600), picked move (CloudProcess-1053 {null -> CloudComputer-908}).
19:37:12.984 [l-4-thread-1] DEBUG CH step (1535), time spent (5843), score (-3264init/0hard/-909710soft), selected move count (1277), picked move (CloudProcess-1049 {null -> CloudComputer-1138}).
19:37:12.987 [l-4-thread-1] DEBUG CH step (1536), time spent (5846), score (-3263init/0hard/-912110soft), selected move count (1600), picked move (CloudProcess-1003 {null -> CloudComputer-1231}).
19:37:12.990 [l-4-thread-1] DEBUG CH step (1537), time spent (5849), score (-3262init/0hard/-912110soft), selected move count (1278), picked move (CloudProcess-941 {null -> CloudComputer-1231}).
19:37:12.992 [l-4-thread-1] DEBUG CH step (1538), time spent (5851), score (-3261init/0hard/-912110soft), selected move count (1278), picked move (CloudProcess-893 {null -> CloudComputer-1231}).
19:37:12.994 [l-4-thread-1] DEBUG CH step (1539), time spent (5853), score (-3260init/0hard/-914510soft), selected move count (1600), picked move (CloudProcess-840 {null -> CloudComputer-1286}).
19:37:12.996 [l-4-thread-1] DEBUG CH step (1540), time spent (5855), score (-3259init/0hard/-916910soft), selected move count (1600), picked move (CloudProcess-820 {null -> CloudComputer-1311}).
19:37:12.998 [l-4-thread-1] DEBUG CH step (1541), time spent (5857), score (-3258init/0hard/-916910soft), selected move count (1278), picked move (CloudProcess-779 {null -> CloudComputer-1231}).
19:37:13.000 [l-4-thread-1] DEBUG CH step (1542), time spent (5859), score (-3257init/0hard/-916910soft), selected move count (1279), picked move (CloudProcess-768 {null -> CloudComputer-1286}).
19:37:13.003 [l-4-thread-1] DEBUG CH step (1543), time spent (5862), score (-3256init/0hard/-917470soft), selected move count (1600), picked move (CloudProcess-739 {null -> CloudComputer-750}).
19:37:13.005 [l-4-thread-1] DEBUG CH step (1544), time spent (5864), score (-3255init/0hard/-917470soft), selected move count (1278), picked move (CloudProcess-728 {null -> CloudComputer-1231}).
19:37:13.007 [l-4-thread-1] DEBUG CH step (1545), time spent (5866), score (-3254init/0hard/-917470soft), selected move count (1278), picked move (CloudProcess-671 {null -> CloudComputer-1231}).
19:37:13.009 [l-4-thread-1] DEBUG CH step (1546), time spent (5868), score (-3253init/0hard/-917470soft), selected move count (1279), picked move (CloudProcess-615 {null -> CloudComputer-1286}).
I checked rebased implementation in cloudChangeMove, couldSwapMove.
It was already implemented like belows.
#Override
public CloudComputerChangeMove rebase(ScoreDirector<CloudBalance> destinationScoreDirector) {
return new CloudComputerChangeMove(destinationScoreDirector.lookUpWorkingObject(cloudProcess),
destinationScoreDirector.lookUpWorkingObject(toCloudComputer));
}
Did I forget something?
The version of optaplaner is 7.23.
But it was the same previous version(7.22)
Please let me know my problem.
Turn on TRACE logging. The steps are done by the solver thread, the move evaluations (only visible in trace logging) are done by the move threads.

AWS Cloudwatch alarm set to NonBreaching (or notBreaching) is not triggering, based on a log filter

With the following Metric and Alarm combination
Metric
Comes from a Cloudwatch log filter (when a match is found on the log)
Metric value: "1"
Default value: None
Unit: Count
Alarm
Statistic: Sum
Period: 1 minute
Treat missing data as: notBreaching
Threshold: [Metric] > 0 for 1 datapoints within 1 minute
The alarm goes to:
State changed to OK at 2018/12/17.
Reason: Threshold Crossed: no datapoints were received for 1 period and 1 missing datapoint was treated as [NonBreaching].
And then it doesn't trigger, even though I force the metric > 0
Why is the alarm stuck in OK? How can the alarm become triggered again?
Solution
Remove the "Unit" property from the stack template Alarm config.
The source of the problem was actually the "Unit" property. This being set to "Count" actually made the alarm become stuck :(
Ensure the stack is producing the same result as a manual alarm setup by checking with the describe-alarms API.

Optaplanner result is not reproducible

I have migrated from Optaplanner version 7.5 to 7.10. While migration I saw bulk(Output) is not the same. While running for another time I am not getting the same output. Can you share which are necessary steps I need to consider while migration? Can you help me out?
The Job schedule is different too. Screenshot attached below:
The primary difference is for Run 1 executes LS 11040 but Run 2 for the same time gets to reach LS 11600 plus with environment and other run parameters maintained without being changed.
Job Schedule diffference
log1_run
11:35:06.421 [main ] INFO Solving started: time spent (225), best score (-1722init/0hard/0medium/0soft), environment mode (REPRODUCIBLE), random (JDK with seed 0).
11:35:06.682 [main ] DEBUG CH step (0), time spent (487), score (-1721init/0hard/0medium/-900000soft), selected move count (543), picked move (TimeWindowedCustomer-1 {null -> Vehicle-393}).
11:35:06.756 [main ] DEBUG CH step (1), time spent (561), score (-1720init/0hard/0medium/-1560000soft), selected move count (544), picked move (TimeWindowedCustomer-2 {null -> TimeWindowedCustomer-7}).
11:35:07.229 [main ] DEBUG CH step (2), time spent (1034), score (-1719init/0hard/0medium/-2520000soft), selected move count (545), picked move (TimeWindowedCustomer-3 {null -> Vehicle-396}).
11:35:07.266 [main ] DEBUG CH step (3), time spent (1071), score (-1718init/0hard/0medium/-3360000soft), selected move count (546), picked move (TimeWindowedCustomer-4 {null -> Vehicle-328}).
--------------------------------------------------
--------------------------------------------------
--------------------------------------------------
--------------------------------------------------
11:37:06.194 [main ] DEBUG LS step (11039), time spent (119999), score (0hard/-45187medium/-1556640000soft), best score (0hard/-45154medium/-1549560000soft), accepted/selected move count (1/3), picked move (TimeWindowedCustomer-1 {Vehicle-220} <-tailChainSwap-> TimeWindowedCustomer-8 {Vehicle-78}).
11:37:06.200 [main ] DEBUG LS step (11040), time spent (120005), score (-54512774442hard/-30651medium/-1659540000soft), best score (0hard/-45154medium/-1549560000soft), accepted/selected move count (0/1), picked move (TimeWindowedCustomer-12 {TimeWindowedCustomer-4} <-tailChainSwap-> null {TimeWindowedCustomer-16}).
11:37:06.203 [main ] INFO Local Search phase (1) ended: time spent (120008), best score (0hard/-45154medium/-1549560000soft), score calculation speed (5349/sec), step total (11041).
11:37:06.203 [main ] INFO Solving ended: time spent (120008), best score (0hard/-45154medium/-1549560000soft), score calculation speed (22645/sec), phase total (2), environment mode (REPRODUCIBLE).
log2 _run
11:38:29.684 [main ] INFO Solving started: time spent (213), best score (-1722init/0hard/0medium/0soft), environment mode (REPRODUCIBLE), random (JDK with seed 0).
11:38:29.955 [main ] DEBUG CH step (0), time spent (485), score (-1721init/0hard/0medium/-900000soft), selected move count (543), picked move (TimeWindowedCustomer-97041025 {null -> Vehicle-393}).
11:38:30.027 [main ] DEBUG CH step (1), time spent (557), score (-1720init/0hard/0medium/-1560000soft), selected move count (544), picked move (TimeWindowedCustomer-1 {null -> TimeWindowedCustomer-6}).
11:38:30.075 [main ] DEBUG CH step (2), time spent (605), score (-1719init/0hard/0medium/-2520000soft), selected move count (545), picked move (TimeWindowedCustomer-2 {null -> Vehicle-396}).
11:38:30.114 [main ] DEBUG CH step (3), time spent (644), score (-1718init/0hard/0medium/-3360000soft), selected move count (546), picked move (TimeWindowedCustomer-3 {null -> Vehicle-328}).
-------------------------------------------
--------------------------------------------
11:40:25.467 [main ] DEBUG LS step (11039), time spent (116004), score (0hard/-45187medium/-1556640000soft), best score (0hard/-45154medium/-1549560000soft), accepted/selected move count (1/3), picked move (TimeWindowedCustomer-1 {Vehicle-220} <-tailChainSwap-> TimeWindowedCustomer-8 {Vehicle-78}).
11:40:25.470 [main ] DEBUG LS step (11040), time spent (116011), score (0hard/-45187medium/-1556640000soft), best score (0hard/-45154medium/-1549560000soft), accepted/selected move count (0/1), picked move (TimeWindowedCustomer-12 {TimeWindowedCustomer-4} <-tailChainSwap-> null {TimeWindowedCustomer-16}).
--------------------------------------------
--------------------------------------------
11:40:29.474 [main ] DEBUG LS step (11674), time spent (120004), score (-4200029hard/-45093medium/-1554780000soft), best score (0hard/-45064medium/-1558380000soft), accepted/selected move count (0/1), picked move (TimeWindowedCustomer-9 {TimeWindowedCustomer-5 -> TimeWindowedCustomer-2}).
11:40:29.477 [main ] INFO Local Search phase (1) ended: time spent (120007), best score (0hard/-45064medium/-1558380000soft), score calculation speed (5451/sec), step total (11675).
11:40:29.477 [main ] INFO Solving ended: time spent (120007), best score (0hard/-45064medium/-1558380000soft), score calculation speed (22762/sec), phase total (2), environment mode (REPRODUCIBLE).
Reproducibility is only guaranteed on each step index, not on time (although they are correlated). When using time spent termination, for example 2 minutes, there is no guarantee how much CPU cycles OptaPlanner will get and how and when the JVM will hotspot optimize. In some situations, it might actually get almost no CPU cycles. In other situations it gets plenty - in any case OptaPlanner uses what it gets, nothing goes to waste.
Looking at the step index level, the two runs are exactly the same (except for the time spent):
First run:
11:37:06.194 ...LS step (11039), time spent (119999), score (0hard/-45187medium/-1556640000soft), best score (0hard/-45154medium/-1549560000soft), accepted/selected move count (1/3), picked move (TimeWindowedCustomer-1 {Vehicle-220} <-tailChainSwap-> TimeWindowedCustomer-8 {Vehicle-78}).
Second run:
11:40:25.467 ...LS step (11039), time spent (116004), score (0hard/-45187medium/-1556640000soft), best score (0hard/-45154medium/-1549560000soft), accepted/selected move count (1/3), picked move (TimeWindowedCustomer-1 {Vehicle-220} <-tailChainSwap-> TimeWindowedCustomer-8 {Vehicle-78}).
To get the exact same result between 2 results, use a step limit termination instead.

Optaplanner - Local search phase terminated without picking a nextStep

I ran into a situation where only two valid moves that is allowed happens during the construction phase. This results in a situation where there is no valid moves to perform during local search phase. All the moves tried during the local search phase is not doable.
This results in NullPointerException at the end of local search phase.
Exception in thread "main" java.lang.NullPointerException
at org.optaplanner.core.impl.solver.scope.DefaultSolverScope.getBestScoreWithUninitializedPrefix(DefaultSolverScope.java:205)
at org.optaplanner.core.impl.solver.DefaultSolver.outerSolvingEnded(DefaultSolver.java:216)
at org.optaplanner.core.impl.solver.DefaultSolver.solve(DefaultSolver.java:161)
at com.example.jobplanner.Application.main(Application.java:328)
Here is the trace of the two valid moves that happens during the construction phase.
CH step (0), time spent (151), score (0hard/0soft), selected move count (1), picked move (com.example.jobplanner.domain.ServiceJobChip#981500 => com.example.jobplanner.domain.ShopFloorBay#9da471).
Update [fact 0:5:4216052:4216052:10:DEFAULT:NON_TRAIT:com.example.jobplanner.domain.ServiceJobChip#4054f4]
CH step (1), time spent (158), score (0hard/0soft), selected move count (1), picked move (com.example.jobplanner.domain.ServiceJobChip#4054f4 => com.example.jobplanner.domain.ShopFloorBay#15ac5d5).
Construction Heuristic phase (0) ended: step total (2), time spent (159), best score (null).
The model of the domain I am trying to optimize would be something like:
Each bay has --> set of tasks that can be performed
Job -> collection of tasks to be performed
Task --> can be scheduled to bays which can perform the task