Break distribution scheduling problem using time grains - OptaPlanner

I have the following time-scheduling optimisation problem:
There are n breaks to be scheduled. A break takes up k time grains of 15 minutes each. The total horizon I am looking at is m time grains. Each time grain has a desired number of breaks to optimise for. The range of possible start grains is defined per break; you cannot pick the range freely.
To put it more generally: the goal is a desired distribution of breaks over time. I need to output a result that aligns with this desired distribution as closely as possible. I am allowed to move each break within certain boundaries, e.g. a 1-hour window.
I had a look at the TimeGrain pattern as a starting point, which is described here: https://www.optaplanner.org/blog/2015/12/01/TimeSchedulingDesignPatterns.html and in this video: https://youtu.be/wLK2-4IGtWY. I am trying to use Constraint Streams for incremental optimisation.
My approach so far is the following:
Break.scala:

case class Break(vehicleId: String, durationInGrains: Int)

TimeGrain.scala:

@PlanningEntity
case class TimeGrain(desiredBreaks: Int,
                     instant: Instant,
                     @CustomShadowVariable(...) // Dummy annotation, I want to use the entity in a constraint stream
                     var breaks: Set[Break])

BreakAssignment.scala:

@PlanningEntity
case class BreakAssignment(
    break: Break,
    @PlanningVariable(valueRangeProviderRefs = Array("timeGrainRange"))
    var startingTimeGrain: TimeGrain,
    @ValueRangeProvider(id = "timeGrainRange")
    @ProblemFactCollectionProperty @field
    timeGrainRange: java.util.List[TimeGrain],
    @CustomShadowVariable(
      variableListenerClass = classOf[StartingTimeGrainVariableListener],
      sources = Array(new PlanningVariableReference(variableName = "startingTimeGrain"))
    )
    var timeGrains: util.Set[TimeGrain]
)
object BreakAssignment {
  class StartingTimeGrainVariableListener extends VariableListener[Solution, BreakAssignment] {
    override def afterVariableChanged(scoreDirector: ScoreDirector[Solution], entity: BreakAssignment): Unit = {
      val end = entity.startingTimeGrain.instant
        .plusSeconds((entity.break.durationInGrains * TimeGrain.grainLength).toSeconds)
      scoreDirector.getWorkingSolution.timeGrains.asScala
        .filter(
          timeGrain =>
            timeGrain.instant == entity.startingTimeGrain.instant ||
              (entity.startingTimeGrain.instant.isBefore(timeGrain.instant) && end.isAfter(timeGrain.instant))
        )
        .foreach { timeGrain =>
          scoreDirector.beforeVariableChanged(timeGrain, "breaks")
          timeGrain.breaks = timeGrain.breaks + entity.break
          scoreDirector.afterVariableChanged(timeGrain, "breaks")
        }
    }
  }
}
Constraints.scala:

private def constraint(constraintFactory: ConstraintFactory) =
  constraintFactory
    .from(classOf[TimeGrain])
    .filter(timeGrain => timeGrain.breaks.nonEmpty)
    .penalize(
      "Constraint",
      HardSoftScore.ONE_SOFT,
      (timeGrain: TimeGrain) => math.abs(timeGrain.desiredBreaks - timeGrain.breaks.size)
    )
As you can see, I need to iterate over all the grains in order to find out which ones need to be updated to hold the break that was just moved in time. This somewhat negates the idea of Constraint Streams.
A different way to look at the issue: I cannot come up with an approach to link the BreakAssignment planning entity to its respective TimeGrains via e.g. shadow variables. A break assignment spans multiple time grains, and a time grain in turn contains multiple break assignments. For the soft constraint I need to group all the assignments within the same grain, accessing the desired target break count of that grain, and do this for every grain of my time horizon. My approach is therefore to make each grain a planning entity, so I can store the information about all its breaks on each change of an assignment's starting time grain. What I end up with is basically a many-to-many relationship between assignments and time grains. From my understanding this does not fit the inverse-relation shadow variable mechanism, since that requires a one-to-many relationship.
Am I going in the wrong direction while trying to come up with the correct model?

If I understand properly what you mean, then conceptually, in the TimeGrain class, I would keep a (custom) shadow variable holding (only) the count of Break instances that overlap that TimeGrain instance. Let me call it breakCount for simplicity. Let me call x the number of TimeGrains a Break spans.
So, upon the solver assigning a Break instance to a TimeGrain instance, I would increment that TimeGrain instance's breakCount. Not only thát TimeGrain instance's breakCount, but also the breakCount of the next (x-1) TimeGrain instances. Be sure to wrap each of those increments in a scoreDirector.beforeVariableChanged()/scoreDirector.afterVariableChanged() bracket.
The score calculation would do the rest. But do note that I myself would also square the difference between a TimeGrain's ideal breakCount and its "real" breakCount (i.e. the shadow variable), as explained in OptaPlanner's documentation, in order to enforce more "fairness".
Edit: of course, also decrement a TimeGrain's breakCount upon removing a Break instance from a TimeGrain instance...
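To make that concrete, here is a minimal sketch of such a listener (Kotlin for brevity; the grainIndex field, the ordered timeGrains list on the solution, and the breakCount shadow field are assumptions about the model, not your exact classes). The old span is decremented before the change, the new span incremented after it:

import org.optaplanner.core.api.domain.variable.VariableListener
import org.optaplanner.core.api.score.director.ScoreDirector

class BreakCountVariableListener : VariableListener<Solution, BreakAssignment> {

    // The break is about to leave its old span of grains: decrement them.
    override fun beforeVariableChanged(sd: ScoreDirector<Solution>, entity: BreakAssignment) {
        adjust(sd, entity, -1)
    }

    // The break now covers a new span of grains: increment them.
    override fun afterVariableChanged(sd: ScoreDirector<Solution>, entity: BreakAssignment) {
        adjust(sd, entity, +1)
    }

    private fun adjust(sd: ScoreDirector<Solution>, entity: BreakAssignment, delta: Int) {
        val start = entity.startingTimeGrain ?: return
        val grains = sd.workingSolution.timeGrains // assumed ordered by time
        for (i in start.grainIndex until start.grainIndex + entity.`break`.durationInGrains) {
            if (i >= grains.size) break
            val grain = grains[i]
            sd.beforeVariableChanged(grain, "breakCount")
            grain.breakCount += delta
            sd.afterVariableChanged(grain, "breakCount")
        }
    }

    override fun beforeEntityAdded(sd: ScoreDirector<Solution>, entity: BreakAssignment) {}
    override fun afterEntityAdded(sd: ScoreDirector<Solution>, entity: BreakAssignment) = adjust(sd, entity, +1)
    override fun beforeEntityRemoved(sd: ScoreDirector<Solution>, entity: BreakAssignment) = adjust(sd, entity, -1)
    override fun afterEntityRemoved(sd: ScoreDirector<Solution>, entity: BreakAssignment) {}
}

The squared-difference soft constraint over the shadow variable would then be along these lines:

import org.optaplanner.core.api.score.buildin.hardsoft.HardSoftScore
import org.optaplanner.core.api.score.stream.Constraint
import org.optaplanner.core.api.score.stream.ConstraintFactory

fun breakDistribution(cf: ConstraintFactory): Constraint =
    cf.from(TimeGrain::class.java)
        .penalize("Break distribution", HardSoftScore.ONE_SOFT) { grain ->
            val diff = grain.desiredBreaks - grain.breakCount
            diff * diff
        }

Note that no iteration over all grains is needed any more: the listener only touches the x grains the break actually covers.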

Related

What is the most efficient way to join one list to another in kotlin?

I start with a list of integers from 1 to 1000 in listOfRandoms.
I would like to left join on random from the createDatabase list.
I am currently using a find{} statement within a loop to do this but feel like this is too heavy. Is there not a better (quicker) way to achieve same result?
Pseudo code
data class DatabaseRow(
    val refKey: Int,
    val random: Int
)

fun main() {
    val createDatabase = (1..1000).map { i -> DatabaseRow(i, Random()) }
    val listOfRandoms = (1..1000).map { j ->
        val lookup = createDatabase.find { it.refKey == j }
        lookup.random
    }
}
As mentioned in comments, the question seems to be mixing up database and programming ideas, which isn't helping.
And it's not entirely clear which parts of the code are needed, and which can be replaced. I'm assuming that you already have the createDatabase list, but that listOfRandoms is open to improvement.
The ‘pseudo’ code compiles fine except that:
You don't give an import for Random(), but none of the likely ones return an Int. I'm going to assume that should be kotlin.random.Random.nextInt().
And because lookup is nullable, you can't simply call lookup.random; a quick fix is lookup!!.random, but it would be safer to handle the null case properly with e.g. lookup?.random ?: -1. (That's irrelevant, though, given the assumption above.)
I think the general solution is to create a Map. This can be done very easily from createDatabase, by calling associate():
val map = createDatabase.associate { it.refKey to it.random }
That should take time roughly proportional to the size of the list. Looking up values in the map is then very efficient (approx. constant time):
map[someKey]
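So the original loop could be rewritten as, e.g. (a sketch, reusing the -1 fallback suggested above):

val map = createDatabase.associate { it.refKey to it.random }
val listOfRandoms = (1..1000).map { j -> map[j] ?: -1 }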
In this case, that takes rather more memory than needed, because both keys and values are integers and will be boxed (stored as separate objects on the heap). Also, most maps use a hash table, which takes some memory.
Since the key is (according to comments) “an ascending list starting from a random number, like 18123..19123”, in this particular case it can instead be stored in an IntArray without any boxing. As you say, array indexes start from 0, so using the key directly would need a huge array and use only the last few cells — but if you know the start key, you could simply subtract that from the array index each time.
Creating such an array would be a bit more complex, for example:
val minKey = createDatabase.minOf { it.refKey }
val maxKey = createDatabase.maxOf { it.refKey }
val array = IntArray(maxKey - minKey + 1)
for (row in createDatabase)
    array[row.refKey - minKey] = row.random
You'd then access values with:
array[someKey - minKey]
…which is also constant-time.
Some caveats with this approach:
If createDatabase is empty, then minOf() will throw a NoSuchElementException.
If it has 'holes', omitting some keys inside that range, then the array will hold its default value of 0 for those; you can change that by using the alternative IntArray constructor, which also takes a lambda giving the initial value.
Trying to look up a value outside that range will give an ArrayIndexOutOfBoundsException.
Whether it's worth the extra complexity to save a bit of memory will depend on things like the size of the ‘database’, and how long it's in memory for; I wouldn't add that complexity unless you have good reason to think memory usage will be an issue.
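For example, filling the holes via that lambda-taking constructor might look like this (a sketch, reusing minKey/maxKey from above; -1 as a 'missing' marker is an assumption):

val byKey = createDatabase.associate { it.refKey to it.random }
val array = IntArray(maxKey - minKey + 1) { i -> byKey[i + minKey] ?: -1 }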

Solver.solve() returns a seemingly random solution and score, not the best

What I Have
Using OptaPlanner 7.45.0.Final on Java 8.
I am working on a proof-of-concept for scheduling of shifts, where I have created a CrewAssignment object as a @PlanningEntity.
I have created my own EasyScoreCalculator implementation (from the new non-deprecated EasyScoreCalculator).
The score function returns a HardMediumSoftScore, with Hard representing physical possibility (can't be in 2 places at once, can't take on an activity that starts in a different place than the previous activity ended), Medium representing legal issues (maximum work day, etc.), and Soft roughly representing financial concerns.
The behavior I am getting
The schedule seemed to work well for small datasets where there are many good possible Solutions. However, when I moved to larger datasets, it had a harder time returning good final schedules. There would be scores like -60hard/-390060medium/-1280457soft, which seems really bad. Moreover, if I increased the time available, the score would sometimes get worse!
Things I have tried
I put a print in my score function for the score being calculated. It would often give scores such as 0hard/0medium/-2250411soft, which in comparison to the final scores is great! However, the final result would still be a bad score.
I added a SolverEventListener to the Solver, and it is only called once, at the very end of solving.
I thought maybe there was a problem with cloning of the Solution, so I created my own SolutionCloner. It is called once at the beginning and twice at the end. Since it is called so infrequently, I suspect the solver keeps the first Solution clone as the best solution and copies values from the current working Solution into it; the fact that the SolverEventListener is only called once may likewise indicate that it is not recognizing new best solutions.
I tried simplifying to a HardSoftScore by combining Hard and Medium values, but the behavior is the same.
I tried calling solverConfig.setEnvironmentMode(EnvironmentMode.NON_INTRUSIVE_FULL_ASSERT) (yes, I'm programmatically creating the SolverConfig), but the behavior does not change.
Relevant Code Snippets
Subset of the CrewAssignment @PlanningEntity:

@PlanningEntity
public final class CrewAssignment
{
    private Long crewMemberId;

    @PlanningVariable(valueRangeProviderRefs = {"crewMemberId"})
    public Long getCrewMemberId() { return crewMemberId; }
    public void setCrewMemberId(Long value) { crewMemberId = value; }

    @Override
    protected CrewAssignment clone()
    {
        CrewAssignment newAssignment = new CrewAssignment(getActivities());
        newAssignment.setCrewMemberId(crewMemberId);
        return newAssignment;
    }
}
Subset of the Solution:

@PlanningSolution(solutionCloner = CrewSchedSolutionCloner.class)
public final class CrewSchedSolution
{
    private final List<Long> crewMemberIds;

    @ValueRangeProvider(id = "crewMemberId")
    @ProblemFactCollectionProperty
    public List<Long> getCrewMemberIds() { return crewMemberIds; }

    // assignments
    private List<CrewAssignment> crewAssignments = new ArrayList<>();

    @PlanningEntityCollectionProperty
    public List<CrewAssignment> getCrewAssignments() { return crewAssignments; }

    HardSoftScore score;

    @PlanningScore
    public HardSoftScore getScore() { return score; }
    public void setScore(HardSoftScore value) { score = value; }

    public CrewSchedSolution cloneSolution()
    {
        List<CrewAssignment> newCrewAssignments = new ArrayList<>(crewAssignments);
        for (int i = 0; i < newCrewAssignments.size(); i++)
        {
            CrewAssignment existingAssignment = newCrewAssignments.get(i);
            CrewAssignment newAssignment = existingAssignment.clone();
            newCrewAssignments.set(i, newAssignment);
        }
        return new CrewSchedSolution(/* Various additional data */,
                                     crewMemberIds,
                                     newCrewAssignments, score);
    }
}
The SolutionCloner:

public final class CrewSchedSolutionCloner
    implements SolutionCloner<CrewSchedSolution>
{
    @Override
    public CrewSchedSolution cloneSolution(CrewSchedSolution originalSolution)
    {
        return originalSolution.cloneSolution();
    }
}
Summary
I have run out of obvious (to me) ways to debug this further. It kind of feels like the good values are getting overwritten, but I can't prove that yet. I might be able to come up with a way of saving the values set in the CrewAssignment @PlanningEntity just to get it to work, but this seems like something OptaPlanner was designed to handle.
The problem is too large to post a SSCCE. I have posted what I think might be relevant. Let me know if there is any other part of the code you would need to see. Thanks in advance!
P.S. I realize this may be a duplicate of OptaPlanner return the best score but not its associated solution, but the original poster never followed up with the posted answer.
"if I increased the time available, the score would sometimes get worse!" That is impossible (if it gets more steps in the second run, which it should, and it is a reproducible run, which it is).
Does the DEBUG log show it's running more "LS step" lines? If it does, you might have score corruption, but NON_INTRUSIVE_FULL_ASSERT or FULL_ASSERT would detect that if they run long enough (much longer than without because they are slower).
Normally, OptaPlanner runs are 100% reproducible given the same number of steps (~ the same amount of time, give or take a few steps). Check your DEBUG log to see if that's the case. Simulated Annealing isn't reproducible, but it's off by default.
OptaPlanner Benchmark is your best friend. Especially the BEST_SCORE graph and the score calculation speed number. Run it 4 times longer, so you can see how the BEST_SCORE graph behaves if given more time. Also, subSingleCount might be an interesting thing to turn on here, to see how sensitive the optimization is to a good random seed (it shouldn't be).
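Getting the benchmarker running is only a few lines (a sketch against the 7.x API, in Kotlin for brevity; the XML resource name is an assumption):

// "benchmarkConfig.xml" is an assumed classpath resource containing the <plannerBenchmark> setup.
val benchmarkFactory = PlannerBenchmarkFactory.createFromXmlResource("benchmarkConfig.xml")
benchmarkFactory.buildPlannerBenchmark().benchmark() // writes an HTML report incl. the BEST_SCORE graph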
Ow, wait. You wrote your own solution cloner? That's probably it, because solution cloners are extremely difficult to write correctly. Disable that and see what happens.
OptaPlanner was (of course) working correctly. I had a conceptual misunderstanding.
I was focusing on the Scores I was calculating, and not on the scores OptaPlanner was actually using. OptaPlanner can augment the Scores returned from the scoring function with an "init" value representing the number of uninitialized @PlanningVariables. Scores with 0 hard and medium parts may have looked reasonable, but could still belong to uninitialized solutions, and the init part takes precedence over the hard/medium/soft values.
Once the duration of the run was set sufficiently high, all @PlanningVariables were being set. No uninitialized scores were produced after that, and completely optimized solutions (whether infeasible or feasible) were computed.
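If you want to catch this in code, the init part can be inspected on the returned best solution's score (a sketch, Kotlin for brevity; solver and problem are assumed to be set up as usual):

val best = solver.solve(problem)
// getInitScore() is negative while planning variables remain unset.
if (!best.score.isSolutionInitialized) {
    println("Best solution is still uninitialized: init=" + best.score.initScore)
}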
See the steps Geoffrey outlined for debugging. If you are using JDK logging, you may have to add slf4j-api and slf4j-jdk14 jars to your path.
See the documentation, section 2.3.6 "Gather the domain objects in a planning solution", where it describes the difference between an "uninitialized", "infeasible", and "feasible" solution.
Note: It appears that the solver does not notify SolverEventListeners when the best solution so far is uninitialized.

Kotlin: Why is Sequence more performant in this example?

Currently, I am looking into Kotlin and have a question about Sequences vs. Collections.
I read a blog post about this topic, and there you can find these code snippets:
List implementation:
val list = generateSequence(1) { it + 1 }
    .take(50_000_000)
    .toList()

measure {
    list
        .filter { it % 3 == 0 }
        .average()
}
// 8644 ms
Sequence implementation:
val sequence = generateSequence(1) { it + 1 }
    .take(50_000_000)

measure {
    sequence
        .filter { it % 3 == 0 }
        .average()
}
// 822 ms
The point here is that the Sequence implementation is about 10x faster.
However, I do not really understand WHY that is. I know that with a Sequence you get "lazy evaluation", but I cannot find any reason why that helps reduce the processing in this example.
However, here I know why a Sequence is generally faster:
val result = sequenceOf("a", "b", "c")
    .map {
        println("map: $it")
        it.toUpperCase()
    }
    .any {
        println("any: $it")
        it.startsWith("B")
    }
Because with a Sequence you process the data "vertically": when the first element starts with "B", you don't have to map the rest of the elements. It makes sense here.
So, why is it also faster in the first example?
Let's look at what those two implementations are actually doing:
The List implementation first creates a List in memory with 50 million elements.  This will take a bare minimum of 200MB, since an integer takes 4 bytes.
(In fact, it's probably far more than that.  As Alexey Romanov pointed out, since it's a generic List implementation and not an IntList, it won't be storing the integers directly, but will be ‘boxing’ them — storing references to Int objects.  On the JVM, each reference could be 8 or 16 bytes, and each Int could take 16, giving 1–2GB.  Also, depending how the List gets created, it might start with a small array and keep creating larger and larger ones as the list grows, copying all the values across each time, using more memory still.)
Then it has to read all the values back from the list, filter them, and create another list in memory.
Finally, it has to read all those values back in again, to calculate the average.
The Sequence implementation, on the other hand, doesn't have to store anything!  It simply generates the values in order, and as it does each one it checks whether it's divisible by 3 and if so includes it in the average.
(That's pretty much how you'd do it if you were implementing it ‘by hand’.)
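For instance, a hand-rolled equivalent of the Sequence pipeline would be something like this sketch:

// Generate, filter, and average in a single pass, storing nothing:
var sum = 0L
var count = 0
for (i in 1..50_000_000) {
    if (i % 3 == 0) {
        sum += i
        count++
    }
}
val average = if (count == 0) Double.NaN else sum.toDouble() / count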
You can see that in addition to the divisibility checking and average calculation, the List implementation is doing a massive amount of memory access, which will take a lot of time.  That's the main reason it's far slower than the Sequence version, which doesn't!
Seeing this, you might ask why we don't use Sequences everywhere…  But this is a fairly extreme example.  Setting up and then iterating the Sequence has some overhead of its own, and for smallish lists that can outweigh the memory overhead.  So Sequences only have a clear advantage in cases when the lists are very large, are processed strictly in order, there are several intermediate steps, and/or many items are filtered out along the way (especially if the Sequence is infinite!).
In my experience, those conditions don't occur very often.  But this question shows how important it is to recognise them when they do!
Leveraging lazy-evaluation allows avoiding the creation of intermediate objects that are irrelevant from the point of the end goal.
Also, the benchmarking method used in the mentioned article is not super accurate. Try to repeat the experiment with JMH.
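A JMH harness for this comparison could look roughly like this (a sketch; note the class must be open, because JMH generates subclasses of @State classes):

import org.openjdk.jmh.annotations.Benchmark
import org.openjdk.jmh.annotations.Scope
import org.openjdk.jmh.annotations.Setup
import org.openjdk.jmh.annotations.State

@State(Scope.Benchmark)
open class AverageBenchmark {

    private lateinit var list: List<Int>

    @Setup
    fun setUp() {
        // Build the list once, outside the measured methods.
        list = generateSequence(1) { it + 1 }.take(50_000_000).toList()
    }

    @Benchmark
    fun listAverage(): Double =
        list.filter { it % 3 == 0 }.average()

    @Benchmark
    fun sequenceAverage(): Double =
        generateSequence(1) { it + 1 }
            .take(50_000_000)
            .filter { it % 3 == 0 }
            .average()
}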
Initial code produces a list containing 50_000_000 objects:
val list = generateSequence(1) { it + 1 }
.take(50_000_000)
.toList()
then iterates through it and creates another list containing a subset of its elements:
.filter { it % 3 == 0 }
... and then proceeds with calculating the average:
.average()
Using sequences allows you to avoid all those intermediate steps. The code below doesn't produce 50_000_000 elements; it's just a representation of the 1..50_000_000 sequence:
val sequence = generateSequence(1) { it + 1 }
.take(50_000_000)
Adding a filter to it doesn't trigger the calculation either, but derives a new sequence from the existing one (3, 6, 9, ...):
.filter { it % 3 == 0 }
and eventually, a terminal operation is called that triggers the evaluation of the sequence and the actual calculation:
.average()
Some relevant reading:
Kotlin: Beware of Java Stream API Habits
Kotlin Collections API Performance Antipatterns

Best Pattern Applicable to Sudoku?

I'm trying to make a Sudoku game, and I gathered the following validations for each number inserted:
Number must be between 1 and 9;
Number must be unique in the line;
Number must be unique in the column;
Number must be unique in the sub-matrix.
As I was repeating the "Number must be unique in..." rule too much, I made the following design:
There are 3 kinds of groups, ColumnGroup, LineGroup, and SubMatrixGroup (all of them implement the GroupInterface);
GroupInterface has a method public boolean validate(Integer number);
Each cell is related to 3 groups, and the number must be unique within each of them; if any group doesn't validate, the number isn't allowed;
Each cell is an observable, making the group an observer, that reacts to one Cell change attempt.
And that s*cks.
I can't find what's wrong with my design. I just got stuck with it.
Any ideas of how I can make it work?
Where is it over-objectified? I can feel it too; maybe there is another solution that would be simpler than that...
Instead of having 3 validator classes, an abstract GroupInterface, an observable, etc., you can do it with a single function.
Pseudocode ahead:
bool setCell(int cellX, int cellY, int cellValue)
{
    m_cells[cellX][cellY] = cellValue;
    if (!isRowValid(cellY) || !isColumnValid(cellX) || !isSubMatrixValid(cellX, cellY))
    {
        m_cells[cellX][cellY] = null; // or 0 or however you represent an empty cell
        return false;
    }
    return true;
}
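For completeness, here is a sketch of what those helpers might look like (in Kotlin; a 9x9 grid of nullable Ints with null meaning "empty" is an assumption):

// Hypothetical helpers for the setCell sketch above.
fun isRowValid(cells: Array<Array<Int?>>, y: Int): Boolean {
    val seen = HashSet<Int>()
    for (x in 0..8) {
        val v = cells[x][y] ?: continue
        if (!seen.add(v)) return false // duplicate in this row
    }
    return true
}

fun isColumnValid(cells: Array<Array<Int?>>, x: Int): Boolean {
    val seen = HashSet<Int>()
    for (y in 0..8) {
        val v = cells[x][y] ?: continue
        if (!seen.add(v)) return false // duplicate in this column
    }
    return true
}

fun isSubMatrixValid(cells: Array<Array<Int?>>, x: Int, y: Int): Boolean {
    val x0 = (x / 3) * 3 // top-left corner of the 3x3 box
    val y0 = (y / 3) * 3
    val seen = HashSet<Int>()
    for (i in x0..x0 + 2) for (j in y0..y0 + 2) {
        val v = cells[i][j] ?: continue
        if (!seen.add(v)) return false // duplicate in this 3x3 box
    }
    return true
}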
What is the difference between a ColumnGroup, LineGroup and SubMatrixGroup? IMO, these three should simply be instances of a generic "Group" type, as the type of the group changes nothing - it doesn't even need to be noted.
It sounds like you want to create a checker ("user attempted to write number X"), not a solver. For this, your observable pattern sounds OK (with the change mentioned above).
Here (link) is an example of a simple sudoku solver using the above-mentioned "group" approach.

Game Design: Data structures for Stackable Attributes (DFP, HP, MP, etc) in an RPG (VB.Net)

I'm wrangling with issues regarding how character equipment and attributes are stored within my game.
My characters and equippable items have 22 different attributes in total (HP, MP, ATP, DFP, etc.). A character has their base statistics and can equip up to four items at one time.
For example:
BASE ATP: 55
WEAPON ATP: 45
ARMOR1 ATP: -10
ARMOR2 ATP: -5
ARMOR3 ATP: 3
Final character ATP in this case would be 88.
I'm looking for a way to implement this in my code. Well, I already have it implemented, but it isn't elegant: right now, I have ClassA, which stores these attributes in an array, and ClassB, which takes five ClassA instances and adds up the values to return the final attribute.
I need to emphasize that this isn't an elegant solution.
There has to be some better way to access and manipulate these attributes, including data structures that I haven't thought of using. Any suggestions?
EDIT: I should note that there are some restrictions that need to be put in place on these attributes, with the base stats acting as baselines.
For instance, a character's own HP and MP cannot be more than the baseline and cannot be less than 0, whereas ATP and MST can be. I also currently cannot enforce these constraints without hacking what I have :(
Make an enum called CharacterAttributes to hold each of STR, DEX, etc.
Make an Equipment class to represent any equippable item. This class will have a Dictionary holding the stats modified by this equipment and the size of each modification. For a sword that gives +10 damage, use Dictionary[CharacterAttributes.Damage] = 10. Magic items might influence more than one stat, so just add as many entries as you like.
The equipment class might also have an enum representing which inventory it slots to (Boots, Weapon, Helm).
Your Character class will have a List of Equipment to represent current gear. It will also have a dictionary of CharacterAttributes, just like the Equipment class, which represents the character's base stats.
To calculate final stats, make a method in your Character class something like this:
int GetFinalAttribute(CharacterAttributes attribute)
{
    int x = baseStats[attribute];
    foreach (Equipment e in equipment)
    {
        // TryGetValue avoids a KeyNotFoundException for stats the item doesn't modify
        if (e.StatModifiers.TryGetValue(attribute, out int modifier))
        {
            x += modifier;
        }
    }
    // do bounds checking here, e.g. ensure non-negative numbers, max and min
    return x;
}
I know this is C# and your post was tagged VB.NET, but it should be easy to understand the method. I haven't tested this code so apologies if there's a syntax error or something.
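For comparison, here is the same idea as a runnable Kotlin sketch (Kotlin purely for illustration), with the asker's clamping rules added as an assumed bounds check (HP/MP clamped to 0..baseline, other stats unrestricted):

enum class CharacterAttribute { HP, MP, ATP, DFP } // stand-in for the full set of 22

class Equipment(val statModifiers: Map<CharacterAttribute, Int>)

class Character(
    private val baseStats: Map<CharacterAttribute, Int>,
    private val equipment: List<Equipment>
) {
    fun finalAttribute(attribute: CharacterAttribute): Int {
        val base = baseStats[attribute] ?: 0
        val total = base + equipment.sumOf { it.statModifiers[attribute] ?: 0 }
        // Assumed rule from the question's EDIT: HP and MP stay within 0..baseline;
        // attributes like ATP may exceed the baseline or go negative.
        return when (attribute) {
            CharacterAttribute.HP, CharacterAttribute.MP -> total.coerceIn(0, base)
            else -> total
        }
    }
}

// The ATP example from the question: 55 + 45 - 10 - 5 + 3 = 88
val character = Character(
    baseStats = mapOf(CharacterAttribute.ATP to 55),
    equipment = listOf(
        Equipment(mapOf(CharacterAttribute.ATP to 45)),
        Equipment(mapOf(CharacterAttribute.ATP to -10)),
        Equipment(mapOf(CharacterAttribute.ATP to -5)),
        Equipment(mapOf(CharacterAttribute.ATP to 3))
    )
)
// character.finalAttribute(CharacterAttribute.ATP) == 88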