I built an application which implements a similar function as task assignment. I thought it works well until recently I noticed the solutions are not optimal. In details, there is a score table for each possible pair between machines and tasks, and usually the number of machines is much less than the number of tasks. I used hard/medium/soft rules, where the soft rule is incremental based on the score of each assignment from the score table.
However, when I reviewed the results after 1-2 hours run, I found out of the unassigned tasks there are many better choices (would achieve higher soft score if assigned) than current assignments. The benchmark reports indicate that the total soft score reached plateau within a hour and then stuck at that score level.
I checked the logic of rules - if the soft rule working perfectly, it should eventually find a way of allocation which achieves the highest overall soft score, whereas meeting the other hard/medium rules, isn't it?
I've been trying various things such as tuning algorithm parameters, scaling the score table, etc. but none delivers the optimal solution.
One problem is that you might be facing a score trap (see docs). In that case, make your constraint score more fine grained to deal with that.
If that's not the case, and you're stuck in a local optima, then I wouldn't play too much with the algorithm parameters - they will probably fix it, but you'll be overfitting on that dataset.
Instead, figure out the smallest possible move that gets you of that local optima and a step closer to the global optimum. Add that kind of moves as a custom move. For example if a normal swap move can't help, but you see a way of getting there by doing a 3-swap move, then implement that move.
Related
story/problem
Imagine there are N party guests and a bouncer guarding the only door. Before the party starts, all the guests are outside (that's for sure). However, once the party starts, people come and go. Each time such an event occurs, the bouncer makes a note for each potential guest of the likelihood that it could have been him or her. One could call this score the bouncer's classification confidence. For each event, this is a list of N candidates that adds up to one. All in all, T events have been observed until the next morning.
Unfortunately, some valuables were stolen that night. To narrow down the group of suspects, the host checked the bouncers notes. However, he soon found contradictory and therefore unreliable data: For example, according to the data, the same person entered the place of high confidences twice in a row, which we know is impossible. Therefore, he attempts by cleaning the data from contradictions to improve the quality of the classifications.
where I am stuck/what I tried
First I solved this issue by formulating a Linear Program and solved this in Python. However, as the number of guests increase, this soon becomes computationally infeasible. Therefore want use Bayes' theorem to compute the probability of the guests presence.
Since we have prior beliefs of the guests attending or not and repeated information updates, I felt that a bayesian approach would be the natural way to tackle this problem.
Even though this approach seemed intuitive to me, I faced some problems:
(1) The problem of being 100% sure of the first state
Being 100% sure of a person being outside at time t=0 seems to "break" the Bayesian formular. Let A be person A and data be the information available by the bouncer.
If the prior P(A present) is either 0 or 1 the formular collapses to 0 or 1 respectively. Am I wrong here?
(2) The problem of using all available information
I did not find a way not only to use data that happened before time step t but also information gathered at a later time. E.g. given three candidates the bouncer makes three observations with confidence ct:
cout->in1=(0.5, 0.4, 0.1)T,
cout->in2=(0.4, 0.4, 0.2)T,
cout->in3=(0.9, 0.05, 0.05)T.
Using only information that was available until t, the most plausible solution would be: (A entered, B entered, C entered) in. However, using all the information available this is a better solution: (B entered, C entered, A entered).
(3) The problem of noisy observations.
I was thinking of using a Bernoulli distribution as prior, however, I do not observe binary events but confidences.
Since I'm stuck, I'm looking forward to your help on Bayesian reasoning and ultimately finding the thief.
I have a domain model where I penalize a score at multiple levels within a single rule. Consider a cloud scheduling problem where we have to assign processes to computers, but each process can be split amongst several computers. Each process has a threshold (e.g. 75%), and we can only "win" the process if we can schedule up to its threshold. We get some small additional benefit from scheduling the remaining 25% of the process, but our solver is geared to "winning" as many processes as possible, so we should be scheduling as many processes as possible to their threshold before scheduling the remainder of the process.
Our hard rule counts hard constraints (we can't schedule more processes on a computer than it can handle)
Our medium rule is rewarded for how many processes have been scheduled up to the threshold (no additional reward for going above 75%).
Our soft rule is rewarded for how many processes have been scheduled total (here we do get additional reward for going above 75%).
This scoring implementation means that it is more important to schedule all processes up to their threshold than to waste precious computer space scheduling 100% of a process.
When we used a drools implementation, we had a rule which rewarded the medium and soft levels simultaneously.
when
$process : Process()
$percentAllocated : calculatePercentProcessAllocated($process) //uses an accumulator over all computers
then
mediumReward;
if ($percentAllocated > $process.getThreshold()) {
mediumReward = $process.getThreshold();
}
else {
mediumReward = $percentAllocated;
}
softReward = $percentAllocated;
scoreHolder.addMultiConstraintMatch(0, mediumReward, softReward);
The above pseudo-drools is heavily simplified, just want to show how we were rewarded two levels at once.
The problem is that I don't see any good way to apply multi constraint matches using constraint streams. All the examples I see automatically add a terminator after applying a penalize or reward method, so that no further score modifications can be made. The only way I see to implement my rules is to make two rules which are identical outside of their reward calls.
I would very much like to avoid running the same constraint twice if possible, so is there a way to penalize the score at multiple levels at once?
Also to anticipate a possible answer to this question, it is not possible to split our domain model so that each process is two processes (one process from 0% to the threshold, and another from the threshold to 100%). Part of the accumulation that I have glossed over involves linking the two parts and would be too expensive to perform if they were separate objects.
There is currently no way in the constraint streams API to do that. Constraint weights are assumed to be constant.
If you must do this, you can take advantage of the fact that there really is no "medium" score. The medium part in HardMediumSoftScore is just another level of soft. Therefore 0hard/1medium/2soft would in practice behave the same as 0hard/1000002soft. If you pick a sufficiently high constant to multiply the medium part with, you can have HardSoftScore work just like HardMediumSoftScore and implement your use case at the same time.
Is it a hack? Yes. Do I like it? No. But it does solve your issue.
Another way to do that would be to take advantage of node sharing. I'll use an example that will show it better than a thousand words:
UniConstraintStream<Person> stream = constraintFactory.forEach(Person.class);
Constraint one = stream.penalize(HardSoftScore.ONE).asConstraint("Constraint 1")
Constraint two = stream.penalize(HardSoftScore.TEN).asConstraint("Constraint 2")
This looks like two constraints executing twice. And in CS-Drools, it is exactly that. But CS-Bavet can actually optimize this and will only execute the stream once, applying two different penalties at the end.
This is not a very nice programming model, you lose the fluency of the API and you need to switch to the non-default CS implementation. But again - if you absolutely need to do what you want to do, this would be another way.
We are using the "auto delay until last" pattern from the docs, https://www.optaplanner.org/docs/optaplanner/latest/design-patterns/design-patterns.html
The loop detection computation is extremely expensive, and we would like to minimize the number of times it is called. Right now, it is being called in the #AfterVariableChanged method of the arrival time shadow variable listener.
The only information I have available in that method is the stop that got a new previous stop and the score director. A move may change several planning variables, so I will be doing the loop detection once for every planning variable that changed, when I should only have to do it once per move (and possibly undo move, unless I can be really clever and cache the loop detection result across moves).
Is there a way for me, from the score director, to figure out what move is being executed right now? I have no need to know exactly what move or what kind of move is being performed, only whether I am still in the same move as before.
I tried using scoreDirector.toString(), which has an incrementing number in it, but that number appears to be the same for a move and the corresponding undo move.
No, there is no access to a move from scoring code. That is by design - scoring needs to be independent of moves executed. Any solution has a score, and if two solutions represent the same state of the problem, their scores must equal - therefore the only thing that matters for the purposes of scoring is the state of the solution, not the state of the solver or any other external factor.
I'm developing a solver for a VRPTW problem using the OptaPlanner and I have faced a problem when large number of customers need to be serviced. By the large number I mean up to 10,000 customers. I have tried running a solver for about 48 hours but no feasible solution was ever reached.
I use a highly customized VRPTW domain model that introduces additional planning entity so-called "Workbreak". Workbreaks are like customers but they can have a location that is actually another planning value - because every day a worker can return home or go to the hotel. Workbreaks have fixed time of departure (usually next day morning), and a variable time of arrival (because it depends on the previous entity within a chain). A hard constraint cares about not allowing to "arrive" to the Workbreak after certain point of time. There are other hard constraints too, like:
multiple service time windows per customer
every week the last customer in chain must be a special customer "storage space visit" (workers need to gather materials before the next week)
long jobs management (when a customer needs to be serviced longer than specified time it should be serviced before specific hour of a day)
max number of jobs per workday
max total job duration per workday (as worker cannot work longer than specified time)
a workbreak cannot have a location of a hotel that is too close to worker's home.
jobs can not be serviced on Sundays
... and many more - there is a total number of 19 hard constrains that have to be applied. There are 3 soft constraints too.
All the aforementioned constraints were initially written as Drools rules, but because of many accumulation-based constraints (max jobs per day, max hours per day, overtime hours per week) the overall speed of the solver (benchmarks) was about 400 step/sec.
At first I thought that solver's speed is too slow to reach a feasible solution in a reasonable time, so I have rewritten all rules into easy score calculator, and it had a decent speed - about 4600 steps/sec. I knew that is will only perform best for a really small number of customers, but I wanted to know if the Drools was the cause of that poor performance. Then I have rewritten all these rules into incremental score calculator (and survived the pain of corrupted score bugs until all of them were successfully fixed). Surprisingly incremental score calculation is a bit slower for a small number of customers, comparing to easy score calculator, but it is not an issue, because overall speed is about 4000 steps/sec - no matter how many entities I have.
The thing that bugs me the most is that above a certain number of customers (problems start at 1000 customers) the solver cannot reach feasible solution. Currently I'm using Late Acceptance and Step Counting algorithms, because they perform really good for this kind of a problem (at least for a less number of customers). I used Simulated Annealing too, but without success, mostly because I could not find good values for algorithm specific parameters.
I have implemented some custom moves too:
Composite move that changes workbreak's location when sibling entities are changed using other moves like change/swap moves (it helps escaping many score traps, as improving step usually needs at least two moves to be performed in a single step)
Move factory for better long jobs assignment (it generates moves that tries to put customers with longer service time in the front of a workday chain)
Workbreak assignment move factory (it generates moves that helps putting workbreaks in proper sequence)
Now I'm scratching my head, and wondering what I should do to diagnose the source of my problem. I suspected that maybe it was hitting a score trap, but I have modified the solver so it saves snapshots of best score each minute. After reading these snapshots I realized that the score was still decreasing. Can the number of hard constraints play the role? I suspect that many moves need to be performed to find out a move that improves the score. The fact is that maybe 48 hours isn't that much for this kind of a problem, and it should make computations a whole week? Unfortunately I have nothing to compare with.
I would like to know how to find out if it is solely a performance problem, or a solver (algorithm, custom moves, hard/soft score) configuration problem.
I really apologize for my bad English.
TL;DR but FWIW:
To scale above 1k locations you need to use NearBy selection.
To scale above 10k locations, add Partitioned Search too.
Tried doing a bit of research on the following with no luck. Thought I'd ask here in case someone has come across it before.
I help a volunteer-run radio station with their technology needs. One of the main things that have come up is they would like to schedule their advertising programmatically.
There are a lot of neat and complex rule engines out there for advertising, but all we need is something pretty simple (along with any experience that's worth thinking about).
I would like to write something in SQL if possible to deal with these entities. Ideally if someone has written something like this for other advertising mediums (web, etc.,) it would be really helpful.
Entities:
Ads (consisting of a category, # of plays per day, start date, end date or permanent play)
Ad Category (Restaurant, Health, Food store, etc.)
To over-simplify the problem, this will be a elegant sql statement. Getting there... :)
I would like to be able to generate a playlist per day using the above two entities where:
No two ads in the same category are played within x number of ads of each other.
(nice to have) high promotion ads can be pushed
At this time, there are no "ad slots" to fill. There is no "time of day" considerations.
We queue up the ads for the day and go through them between songs/shows, etc. We know how many per hour we have to fill, etc.
Any thoughts/ideas/links/examples? I'm going to keep on looking and hopefully come across something instead of learning it the long way.
Very interesting question, SMO. Right now it looks like a constraint programming problem because you aren't looking for an optimal solution, just one that satisfies all the constraints you have specified. In response to those who wanted to close the question, I'd say they need to check out constraint programming a bit. It's far closer to stackoverflow that any operations research sites.
Look into constraint programming and scheduling - I'll bet you'll find an analogous problem toot sweet !
Keep us posted on your progress, please.
Ignoring the T-SQL request for the moment since that's unlikely to be the best language to write this in ...
One of my favorites approaches to tough 'layout' problems like this is Simulated Annealing. It's a good approach because you don't need to think HOW to solve the actual problem: all you define is a measure of how good the current layout is (a score if you will) and then you allow random changes that either increase or decrease that score. Over many iterations you gradually reduce the probability of moving to a worse score. This 'simulated annealing' approach reduces the probability of getting stuck in a local minimum.
So in your case the scoring function for a given layout might be based on the distance to the next advert in the same category and the distance to another advert of the same series. If you later have time of day considerations you can easily add them to the score function.
Initially you allocate the adverts sequentially, evenly or randomly within their time window (doesn't really matter which). Now you pick two slots and consider what happens to the score when you switch the contents of those two slots. If either advert moves out of its allowed range you can reject the change immediately. If both are still in range, does it move you to a better overall score? Initially you take changes randomly even if they make it worse but over time you reduce the probability of that happening so that by the end you are moving monotonically towards a better score.
Easy to implement, easy to add new 'rules' that affect score, can easily adjust run-time to accept a 'good enough' answer, ...
Another approach would be to use a genetic algorithm, see this similar question: Best Fit Scheduling Algorithm this is likely harder to program but will probably converge more quickly on a good answer.