Non-greedy configurations across steps - optaplanner

I'm using Late Acceptance as the local search algorithm, and here is how it appears to pick moves:
If my forager's limit is 5, it collects 5 moves and then applies 1 of them, chosen at random, for every step.
At every step it only picks moves that increase the score, i.e. it picks greedily across steps.
Forager.pickMove()
public LocalSearchMoveScope pickMove(LocalSearchStepScope stepScope) {
    stepScope.setSelectedMoveCount(selectedMoveCount);
    stepScope.setAcceptedMoveCount(acceptedMoveCount);
    if (earlyPickedMoveScope != null) {
        return earlyPickedMoveScope;
    }
    List<LocalSearchMoveScope> finalistList = finalistPodium.getFinalistList();
    if (finalistList.isEmpty()) {
        return null;
    }
    if (finalistList.size() == 1 || !breakTieRandomly) {
        return finalistList.get(0);
    }
    int randomIndex = stepScope.getWorkingRandom().nextInt(finalistList.size()); // should have checked for best here
    return finalistList.get(randomIndex);
}
I have two questions:
First, can we make the forager pick the best of the 5 moves instead of picking 1 at random?
Second, can we allow it to pick a move that degrades the score now but might increase it later (there is no way to know this in advance)?

For the first question: look up acceptedCountLimit and selectedCountLimit in the docs. Those do exactly that.
For the second question: that's already the case (especially with Late Acceptance and Simulated Annealing). In the DEBUG log, just compare the step score with the best score. Or ask for the step score statistic in optaplanner-benchmark.
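To make both points concrete, here is a minimal standalone sketch of the two decisions, assuming plain int scores where higher is better. It is not OptaPlanner's code; the class and method names are made up for illustration. The first method is the usual Late Acceptance criterion (a move is accepted if it is at least as good as the current step score or as the step score from some steps ago), which is why a step score can end up worse than the best score. The second shows a forager that keeps only the highest-scoring accepted moves and uses randomness only to break ties between them.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

// Standalone illustration only; none of these names are OptaPlanner classes.
class LateAcceptanceSketch {

    // Late Acceptance criterion: accept a move whose score is at least as good as
    // the current step score, or as the step score recorded some steps ago.
    static boolean isAccepted(int moveScore, int currentStepScore, int lateStepScore) {
        return moveScore >= currentStepScore || moveScore >= lateStepScore;
    }

    // Forager: returns the index of the picked move among the accepted ones,
    // or -1 if nothing was accepted. Only top-scoring moves are finalists.
    static int pickMove(List<Integer> acceptedMoveScores, Random random) {
        if (acceptedMoveScores.isEmpty()) {
            return -1;
        }
        int best = Integer.MIN_VALUE;
        List<Integer> finalists = new ArrayList<>();
        for (int i = 0; i < acceptedMoveScores.size(); i++) {
            int score = acceptedMoveScores.get(i);
            if (score > best) {
                best = score;
                finalists.clear(); // a strictly better move replaces all earlier finalists
            }
            if (score == best) {
                finalists.add(i);
            }
        }
        // Randomness only breaks ties between equally good finalists.
        return finalists.get(random.nextInt(finalists.size()));
    }
}
```

With a late step score of -8 and a current step score of -3, a move scoring -5 is accepted even though it makes the solution worse than the current step.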


OptaPlanner - Explain score of non-optimal solutions

We have a use-case where we want to present the user with some human-readable message with why an "assignment" was rejected based on the score of the constraints.
For example, in the CloudBalancing problem with 3 computers (Computer-1, Computer-2, Computer-3) and 1 process (Process-1), we ended up with the result below:
Computer-1 broke a hard constraint (requiredCpu)
Computer-2 lost due to a soft constraint (min cost)
Computer-3 assigned to Process-1 --> (Optimal solution)
We implemented a BestSolutionChanged listener, in which we used solution.explainScore() to get some information, and enabled DEBUG logging, which gave us OptaPlanner's internal logs for intermediate moves and their scores. But the requirement is to provide custom human-readable information on why all the non-optimal solutions (Computer-1, Computer-2) were rejected, even if they were infeasible (basically an explanation of the scores of these two solutions).
How can we achieve this?
We did not want to rely on listening to the BestSolutionChanged event, as it might not be triggered for other solutions if the LS/CH phase starts with a solution that is already the "best solution" (Computer-3). Is this a valid assumption?
The DEBUG logs do provide this information, but building a custom message from the log does not seem like a good idea, so we were wondering whether there is another listener or OptaPlanner concept that can be used to achieve this.
By "all the non-optimal solutions", do you instead mean a particular non-optimal solution? The search space can get very large very quickly, and OptaPlanner itself probably won't evaluate the majority of those solutions (simply because the search space is so large).
You are correct that the BestSolutionChanged event will not fire again if the problem/solution given to the Solver is already the optimal solution (since by definition, there are no solutions better than it).
Of particular interest is ScoreManager, which allows you to calculate and explain the score of any problem/solution:
(Examples taken from https://www.optaplanner.org/docs/optaplanner/latest/score-calculation/score-calculation.html#usingScoreCalculationOutsideTheSolver)
To create it and get a ScoreExplanation do:
ScoreManager<CloudBalance, HardSoftScore> scoreManager = ScoreManager.create(solverFactory);
ScoreExplanation<CloudBalance, HardSoftScore> scoreExplanation = scoreManager.explainScore(cloudBalance);
Where cloudBalance is the problem/solution you want to explain. With the score explanation you can:
Get the score
HardSoftScore score = scoreExplanation.getScore();
Break down the score by constraint
Collection<ConstraintMatchTotal<HardSoftScore>> constraintMatchTotals = scoreExplanation.getConstraintMatchTotalMap().values();
for (ConstraintMatchTotal<HardSoftScore> constraintMatchTotal : constraintMatchTotals) {
    String constraintName = constraintMatchTotal.getConstraintName();
    // The score impact of that constraint
    HardSoftScore totalScore = constraintMatchTotal.getScore();
    for (ConstraintMatch<HardSoftScore> constraintMatch : constraintMatchTotal.getConstraintMatchSet()) {
        List<Object> justificationList = constraintMatch.getJustificationList();
        HardSoftScore score = constraintMatch.getScore();
        ...
    }
}
and get the impact of individual entities and problem facts:
Map<Object, Indictment<HardSoftScore>> indictmentMap = scoreExplanation.getIndictmentMap();
for (CloudProcess process : cloudBalance.getProcessList()) {
    Indictment<HardSoftScore> indictment = indictmentMap.get(process);
    if (indictment == null) {
        continue;
    }
    // The score impact of that planning entity
    HardSoftScore totalScore = indictment.getScore();
    for (ConstraintMatch<HardSoftScore> constraintMatch : indictment.getConstraintMatchSet()) {
        String constraintName = constraintMatch.getConstraintName();
        HardSoftScore score = constraintMatch.getScore();
        ...
    }
}
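For the human-readable message itself, one possible approach is to explain the score of each candidate assignment in turn (Process-1 on Computer-1, then Computer-2, and so on) and turn the per-constraint breakdown into a sentence. The sketch below only covers that last formatting step and is deliberately independent of OptaPlanner: the class, the method, and the map of constraint name to {hard impact, soft impact} are hypothetical stand-ins for values you would read from the constraint match totals shown above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical helper; the map stands in for data read from a score explanation.
class RejectionMessageSketch {

    // impactByConstraint: constraint name -> {hard impact, soft impact}, negative = penalty.
    static String explain(String candidateName, Map<String, int[]> impactByConstraint) {
        List<String> brokenHard = new ArrayList<>();
        List<String> brokenSoft = new ArrayList<>();
        for (Map.Entry<String, int[]> entry : impactByConstraint.entrySet()) {
            int hard = entry.getValue()[0];
            int soft = entry.getValue()[1];
            if (hard < 0) {
                brokenHard.add(entry.getKey());
            } else if (soft < 0) {
                brokenSoft.add(entry.getKey());
            }
        }
        // Hard constraints take precedence: an infeasible candidate is reported as such.
        if (!brokenHard.isEmpty()) {
            return candidateName + " is infeasible, broke hard constraint(s): "
                    + String.join(", ", brokenHard);
        }
        if (!brokenSoft.isEmpty()) {
            return candidateName + " is feasible but penalized by soft constraint(s): "
                    + String.join(", ", brokenSoft);
        }
        return candidateName + " breaks no constraints";
    }
}
```

Calling explain("Computer-1", ...) with a single entry requiredCpu -> {-10, 0} would give "Computer-1 is infeasible, broke hard constraint(s): requiredCpu". The -10 is an invented number for the example.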

How to get just the most recent of all documents

In Sanity Studio you get a nice list of the most recent version of all your documents. If there is a draft you get that; if not, you get the published one.
I need the same list for a few filters and scripts. The following GROQ query does the job, but it is not very fast and does not work in the new API (v2021-03-25).
*[
  _type == $type &&
  !defined(*[_id == "drafts." + ^._id])
]._id
A way around the breaking changes in the API is to use length() = 0 in place of !defined() but that makes an already slow query 10-20 X slower.
Does anyone know a way of making filters that consider only the latest version?
Edit: An example where I need this is if I want to see all documents without any categories. Regardless whether it is the published document or the draft that has no categories it shows up in a normal filter. So if you add categories but don't immediately want to publish it will be confusing in the no-categories-list. ,'-)
100 X improvement on API v2021-03-25 🥳
The only way I was able to solve this with speed was to first make a projection of the sub-query so it doesn't run once for every non-draft. Then I thought, why not project both sets and then figure out the overlap, and that was even faster! It runs more than 10 x faster than possible on API v1 and 100 x faster than any suggestions for new API.
{
  'drafts': *[ _type == $type && _id in path("drafts.**") ]._id,
  'published': *[ _type == $type && !(_id in path("drafts.**"))]._id,
}
{
  'current': published[ !("drafts." + @ in ^.drafts) ] + drafts
}
First I get both drafts and non-drafts and "store" it in this projection, like a variable-😉-ish
Then I start with my non-drafts - published
And filter out any that has a counterpart in my drafts "variable"
Lastly, I add all drafts to my list of filtered non-drafts
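The same set logic, written out as a small standalone Java sketch in case it helps to see the steps outside GROQ. The class and method names are made up, and it works on a plain list of _id strings rather than on a Sanity dataset.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Illustration of the drafts/published overlap logic on plain id strings.
class CurrentDocumentsSketch {

    static List<String> currentIds(List<String> allIds) {
        // Step 1: split the ids into drafts and published (the two projected sets).
        List<String> drafts = new ArrayList<>();
        List<String> published = new ArrayList<>();
        for (String id : allIds) {
            if (id.startsWith("drafts.")) {
                drafts.add(id);
            } else {
                published.add(id);
            }
        }
        // Step 2: keep only published ids that have no draft counterpart.
        Set<String> draftSet = new HashSet<>(drafts);
        List<String> current = new ArrayList<>();
        for (String id : published) {
            if (!draftSet.contains("drafts." + id)) {
                current.add(id);
            }
        }
        // Step 3: add all drafts, since a draft is always the latest version.
        current.addAll(drafts);
        return current;
    }
}
```

The point of the approach is that each set is built once and the overlap is a lookup, instead of one sub-query per published document.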
Overall I think you're on the right track. Some ideas to help you out:
Drafts are always fresher and newer than published documents, so if a given doc's id in path("drafts.**"), that's already the last updated one.
Knowing the above allows you to skip the defined(*[_id == ...]) part of the query for drafts, speeding up your execution
As drafts are already included, we can exclude published documents with a draft (defined(*[_id == "drafts." + ^._id][0]))
Notice I added a [0] to the end of the query to pick only the first element that matches. This will improve performance slightly.
For getting only documents that have no categories, use count(categoriesField) < 1
Order documents with | order(_updatedAt desc) to get the freshest documents first
And paginate your request to reduce the payload and speed things up.
Here's a sample query applying these principles (I haven't run it, so you may have to make some adjustments):
*[
  _type == $type &&
  // Assuming you only want those without categories:
  count(categories) < 1 &&
  (
    // Is either a draft -> drafts are always fresher
    _id in path("drafts.**") ||
    // Or a published document with no draft
    !defined(*[_id == "drafts." + ^._id][0])
    // 👆 with the check above we're ensuring only
    // published documents run the expensive defined query
  )
]
// Order by last updated
| order(_updatedAt desc)
// Paginate for faster queries
[$paginationStart..$paginationEnd]
// Get only the _id, assuming that's what you want
._id
Hope this helps 🙌

Opta planner incorrect best score

I am trying out OptaPlanner for a shift assignment problem.
It is a many-to-many relationship, since one shift can have many employees.
In the trial run, I have two employees and three shifts.
One of the shifts needs two employees.
So I have created a new ShiftAssignment class to handle the many-to-many relationship. ShiftAssignment is the planning entity and employee is the planning variable.
I pass the two employees and four ShiftAssignment instances (because one shift needs two employees) to the planning solution.
I have only one hard rule in the score calculator: the employee must have the skill needed for the shift.
When I run the solver, I print the score in my code below (I don't have any soft constraints, so I have hard-coded the soft score to zero):
public HardSoftScore calculateScore(AuditAllocationSolution auditAllocationSolution) {
    int hardScore = 0;
    for (Auditor auditor : auditAllocationSolution.getAuditors()) {
        for (AuditAssignment auditAssignment : auditAllocationSolution.getAuditAssignments()) {
            if (auditor.equals(auditAssignment.getAuditor())) {
                List<String> auditorSkils = auditor.getQualifications().stream().map(skill -> skill.getSkillName())
                        .collect(Collectors.toList());
                String requiredSkillForThisAuditInstance = auditAssignment.getRequiredSkill().getSkillName();
                if (!auditorSkils.contains(requiredSkillForThisAuditInstance)) {
                    // increment hard score since the skill match constraint is violated
                    hardScore = hardScore + 1;
                }
            }
        }
    }
    System.out.println(" hardScore " + hardScore);
    return HardSoftScore.valueOf(hardScore, 0);
}
When I print the values of the solution class in the score calculator, I can see that there are a few solutions where the hard score is zero. Such a solution satisfies the rules and matches the expected results, but according to the logs it is not accepted:
08:16:35.549 [main] TRACE o.o.c.i.l.decider.LocalSearchDecider - Move index (0), score (0hard/0soft), accepted (false), move (AuditAssignment-2 {Auditor-1} <-> AuditAssignment-3 {Auditor-0}).
There is one more observation from the logs that I want to clarify.
I understand that every new solution, which is the outcome of each step, is passed to the score calculator. But sometimes I see that for a single step the score calculator is invoked more than once with different solutions. Assuming this is single-threaded and the log sequencing is correct, why does that happen?
The final output is incorrect. The best score that is selected is one with a high hard score, and the solutions with the best score are not accepted.
I also see the line below in the logs, which I am not able to comprehend. Is there anything wrong in my configuration?
23:53:01.242 [main] DEBUG o.o.c.i.l.DefaultLocalSearchPhase - LS step (26), time spent (121), score (2hard/0soft), best score (4hard/0soft), accepted/selected move count (1/1), picked move (AuditAssignment-2 {Auditor-1} <-> AuditAssignment-0 {Auditor-0}).
This is a small problem size and I feel I have not set it up right. Kindly suggest.
The hard score has to be decremented when a constraint is violated. In the code above I had incremented it, which probably led to the erroneous result.
It worked as expected once I fixed that.
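For anyone hitting the same thing, here is a minimal sketch of the corrected counting, assuming the solver treats higher scores as better, so each violation has to push the hard score below zero. It is a simplified standalone model, not the original calculator: the class name and the list-based inputs are made up. In the real calculator the result would be wrapped as in the question, with HardSoftScore.valueOf(hardScore, 0).

```java
import java.util.List;

// Simplified stand-in for the score calculator above.
class SkillScoreSketch {

    // auditorSkills.get(i) holds the skills of the auditor assigned to assignment i,
    // requiredSkills.get(i) is the skill that assignment needs.
    static int hardScore(List<List<String>> auditorSkills, List<String> requiredSkills) {
        int hardScore = 0;
        for (int i = 0; i < requiredSkills.size(); i++) {
            if (!auditorSkills.get(i).contains(requiredSkills.get(i))) {
                // Decrement: a violated hard constraint makes the score worse (more negative).
                hardScore--;
            }
        }
        return hardScore;
    }
}
```

With this sign convention a solution with no violations scores 0, which is the best possible hard score, and the "0hard/0soft" moves from the log above would count as improvements rather than regressions.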

twitter4j when to STOP when no more tweets available?

So, I've figured out how to be able to get more than 100 tweets, thanks to How to retrieve more than 100 results using Twitter4j
However, how do I make the script stop, and print that it stopped, once no more results are available? For example, I set
int numberOfTweets = 512;
And, it finds just 82 tweets matching my query.
However, because of:
while (tweets.size () < numberOfTweets)
it keeps querying over and over until I max out my rate limit of 180 requests per 15 minutes.
I'm really a novice at Java, so I would really appreciate it if you could show me how to resolve this by modifying the first answer's script at How to retrieve more than 100 results using Twitter4j
Thanks in advance!
You only need to modify things in the try{} block. One solution is to check whether the ID of the last tweet you found on the previous pass of the while loop (previousLastID) is the same as the ID of the last tweet (lastID) in the new batch collected (newTweets). If it is, the new batch's elements already exist in the previous array, and we have reached the end of the available tweets for this hashtag.
try {
    QueryResult result = twitter.search(query);
    List<Status> newTweets = result.getTweets();
    long previousLastID = lastID;
    for (Status t: newTweets)
        if (t.getId() < lastID) lastID = t.getId();
    if (previousLastID == lastID) {
        println("Last batch (" + tweets.size() + " tweets) was the same as first. Stopping the Gathering process");
        break;
    }
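Since the fragment above only shows the changed part, here is a rough standalone sketch of the whole stopping idea with the Twitter call replaced by a stub. Everything in it is hypothetical: fetchUpTo stands in for running the search with the query's max ID set to the given value, and tweets are reduced to their IDs.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.LongFunction;

// Stand-in for the paging loop; fetchUpTo replaces the real search call.
class PagedSearchSketch {

    static List<Long> collect(LongFunction<List<Long>> fetchUpTo, int numberOfTweets) {
        List<Long> tweetIds = new ArrayList<>();
        long lastId = Long.MAX_VALUE;
        while (tweetIds.size() < numberOfTweets) {
            List<Long> newTweets = fetchUpTo.apply(lastId - 1);
            long previousLastId = lastId;
            for (long id : newTweets) {
                if (id < lastId) {
                    lastId = id; // track the oldest tweet seen so far
                }
            }
            if (previousLastId == lastId) {
                // The batch brought nothing older: no more tweets are available.
                break;
            }
            tweetIds.addAll(newTweets);
        }
        return tweetIds;
    }
}
```

With a source that only has 82 matching tweets, the loop ends after the first batch that yields no older ID instead of spinning until the rate limit is hit. Note that the result can exceed numberOfTweets by part of a batch, as in the original script.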

Miniprofiler isn't profiling assignment statements / fast steps

I'm using the StackExchange Miniprofiler with ASP.NET MVC 4. I'm currently trying to profile an assignment to a member variable of a class with an expensive expression that generates the value to be assigned. Miniprofiler doesn't seem to want to profile assignment statements. I've simplified my code to highlight the error:
public ActionResult TestProfiling()
{
    var profiler = MiniProfiler.Current;
    using (profiler.Step("Test 1"))
        Thread.Sleep(50);
    int sue;
    using (profiler.Step("Test 2"))
    {
        sue = 1;
    }
    if (sue == 1)
        sue = 2;
    using (profiler.Step("Test 3"))
    {
        Thread.Sleep(50);
        int bob;
        using (profiler.Step("Inner Test"))
        {
            bob = 1;
        }
        if (bob == 1)
            bob = 2;
    }
    return View();
}
N.B. the if statements are simply to avoid compiler warnings.
Test 1 and Test 3 get displayed in the Miniprofiler section on the resulting page. Test 2 and Inner Test do not. However if I replace the contents of either Test 2 or Inner Test with a sleep statement they get output to the resulting page.
What is going on here? Even if I replace the simple assignment statement inside one of the non appearing tests i.e.
using (profiler.Step("Test 2"))
{
    ViewModel.ComplexData = MyAmazingService.LongRunningMethodToGenerateComplexData();
}
with a more complex one, the Test 2 step still doesn't get output to the rendered Miniprofiler section. Why isn't Miniprofiler profiling assignment statements?
Edit: code example now corresponds to text.
Edit2: After further digging around it seems that the problem isn't with assignment statements. It seems that whether something gets displayed in the output results is dependent on how long it takes to execute. i.e.
using (profiler.Step("Test 2"))
{
    sue = 1;
    Thread.Sleep(0);
}
Using the above code, Test 2 is not displayed in the Miniprofiler results.
using (profiler.Step("Test 2"))
{
    sue = 1;
    Thread.Sleep(10);
}
Using the above code Test 2 is now displayed in the Miniprofiler results.
So it seems my LongRunningMethodToGenerateComplexData turns out to be quite quick... but is it expected behaviour for Miniprofiler not to show steps that take a really small amount of time?
Just click on "show trivial" at the bottom right of the profiler results.
This should also show the actions that took less time.
It seems the problem was that Miniprofiler isn't displaying results for steps where the execution time is less than 3ms.
Edit: From the Miniprofiler documentation.
TrivialDurationThresholdMilliseconds: Any Timing step with a duration less than or equal to this will be hidden by default in the UI; defaults to 2.0 ms.
http://community.miniprofiler.com/permalinks/20/various-miniprofiler-settings