What is the recommended way to handle a hierarchical constraint in OptaPlanner?

I'm trying to learn OptaPlanner by building a playlist generator. My constraints look roughly like:
1. The total time must be ~60 minutes.
2. All selected songs must be unique.
3. Each ~15-minute block may contain only songs from a single artist.
4. The four 15-minute blocks must have different artists.
So by hierarchical, I mean you could choose the artist and then attempt to fill the block.
My current implementation expresses these as constraints on song selection. It's able to solve the problem, but I feel like it's spending a lot of time trying to align the artist constraint.
Reading through docs, it seems like there are some features which might be helpful:
Partitioned search
Chained variables
Custom move selectors (e.g. changing all the songs to have consistent artist)
Different weighting on constraints
What's the recommended way to handle this type of relationship?

These sound like just 4 constraints on 4 different score levels (see the docs chapter on score calculation). In that case, you'll need a BendableScore.
In practice though, I would be surprised if constraints 3 and 4 are truly hierarchical: 3 might just have a much heavier score weight than 4. In that case a HardMediumSoftScore suffices.
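To make the difference between score levels and score weights concrete, here is a minimal Python sketch. It is not OptaPlanner code: `LevelledScore` and `weighted_soft` are hypothetical names, and the weights 100 and 1 are made up. It only relies on the fact that a multi-level score is compared level by level, which is the same way Python compares tuples.

```python
from typing import NamedTuple


class LevelledScore(NamedTuple):
    """Hypothetical stand-in for a three-level score, compared level by level."""
    hard: int
    medium: int
    soft: int


# Separate levels: no amount of soft penalty can outweigh one medium penalty.
one_medium_break = LevelledScore(hard=0, medium=-1, soft=0)
many_soft_breaks = LevelledScore(hard=0, medium=0, soft=-500)
levels_prefer_soft_breaks = many_soft_breaks > one_medium_break


def weighted_soft(block_violations, distinct_violations,
                  block_weight=100, distinct_weight=1):
    """Same level, different weights (weights are assumptions for illustration)."""
    return -(block_weight * block_violations
             + distinct_weight * distinct_violations)


# Same level: enough light penalties eventually outweigh one heavy penalty.
weights_prefer_block_break = weighted_soft(1, 0) > weighted_soft(0, 150)
```

With levels, constraint 3 always dominates constraint 4. With weights, it only dominates until enough violations of 4 pile up, which is often closer to what you actually want.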

Related

Is there a way for optaplanner to continue the last best solution to the next calculation

I am using OptaPlanner 8.4.1 and the Constraint Streams API.
Does OptaPlanner have a way to first find the best solution under simple constraints, and then feed that solution into a run with complex constraints to find the final best solution? Would this reduce the time needed to reach the best solution?
My constraints are complex and slow, especially when they use the groupBy and toList methods. My current logic involves evaluating consecutive subsequences.
For example: there are 10 positions and 10 balls with serial numbers. Each ball is white, black or red. The 10 balls must be lined up in the 10 positions. The white balls must be piled together, and the white balls have a range of 4 to 6. The position is the planning entity and the ball is the planning variable. I need to count how many consecutive white balls there are. Constraint Streams currently only support a cardinality of up to 4, so this can only be evaluated with groupBy and toList.
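For reference, the check being described can be sketched outside of OptaPlanner in a few lines of Python. This is only my reading of the requirement (all white balls in one contiguous run, with 4 to 6 white balls in total); the function names and the penalty formula are assumptions, not anything from the Constraint Streams API.

```python
from itertools import groupby


def white_runs(colors):
    """Lengths of each maximal run of consecutive white balls, left to right."""
    return [len(list(group)) for color, group in groupby(colors) if color == "white"]


def white_block_penalty(colors, min_len=4, max_len=6):
    """0 when all white balls form one run of min_len..max_len balls."""
    runs = white_runs(colors)
    split_penalty = max(0, len(runs) - 1)  # every extra run breaks the pile
    total = sum(runs)
    range_penalty = max(0, min_len - total) + max(0, total - max_len)
    return split_penalty + range_penalty


line = ["red", "white", "white", "white", "white",
        "black", "red", "black", "red", "black"]
penalty = white_block_penalty(line)  # one run of 4 white balls
```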
You can't do that with a single solver currently, but you should be able to do it by running 2 Solvers (each with a different SolverConfig) one after another. For your example, you could have a BasicConstraintProvider with the hard constraints and a FullConstraintProvider that extends the basic one and adds the soft constraints.
That being said, I'd first invest time in improving the score calculation speed (see the info log or a benchmark report), to avoid such workarounds altogether. That number should be above 10 000 per second.
Also, FIRST_FEASIBLE_FIT might be interesting to look at.
I don't know the details of your problem, but here is what I do:
Run the solver for some time (maybe even with fewer constraints), then serialize the solution it provides.
Run the solver a second time (possibly with different settings), providing the previous solution as the problem; it will keep looking for a better one.
This works for me in version 8.30.
Maybe it will help somebody else.

What optimization algorithm is more suitable for timetable rescheduling?

I'm working on a project where a university course is represented as a to-do list, where:
the course owner (the teacher of the course) can add tasks (each containing the URL of the resource to be studied and two datetime fields: when to start and when to complete the task)
the course subscriber (a student) can mark tasks as complete or not complete, and these marks are saved individually for each account.
If a student marks a task as complete, their account and the element they marked are shown in the course activity tab for the teacher, who can then:
initiate a conversation with the student in a JavaScript-based chat
evaluate the result of the conversation
What optimization algorithm would you recommend for timetable rescheduling here (changing the datetime fields of a to-do element if the student procrastinates)?
As input, we can use the student's activity on the resource, the fact that they marked the task as complete, and whether they clicked the URL on the to-do element that leads to the external learning material (for example, a Google Book).
For example, are genetic algorithms suitable for this model, and what pitfalls do they have: https://medium.com/#vijinimallawaarachchi/time-table-scheduling-2207ca593b4d ?
I'm not sure I completely understand your problem, but it sounds like you have a feasible timetable to begin with and you just need to improve it.
If so, genetic algorithms will work very well, but I think representing everything as binary 'chromosomes' as in the link might not be practical.
There are many other ways to represent a timetable, such as a 2D array, or giving each event a slot number.
You could look into algorithms such as Tabu Search, Simulated Annealing, Great Deluge and Hill Climbing. They are all based on similar ideas, but some work better on some problems than others. For example, if you have a very rough search space, Simulated Annealing won't be the best choice, and Hill Climbing usually only finds a local optimum.
The general architecture of the algorithms mentioned above, and of many genetic algorithms and other metaheuristics, is: select a neighbouring solution using a move operator (e.g. changing the time of one, two or three events, or swapping the rooms of two events), check that the move doesn't violate any hard constraints, then use an acceptance strategy, such as Simulated Annealing or Great Deluge, to determine whether the move is accepted. If it is, keep the solution. Repeat these steps until the termination criterion is met. This can be a maximum time, a maximum number of iterations, or no improving move having been found in x iterations.
While this is running, keep a record of the 'best' solution, so that when the algorithm terminates you have the best solution found. You can determine what counts as 'best' based on how many soft constraints the timetable violates.
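The loop described above can be sketched roughly as follows. This is a toy illustration under my own assumptions, not a tuned implementation: the function names, the linear cooling schedule, the default parameters and the example problem (five events, ten slots, no two events in one slot, earlier slots preferred) are all made up for the sketch.

```python
import math
import random


def hill_climbing(delta, progress, rng):
    """Accept only strictly improving moves."""
    return delta < 0


def simulated_annealing(delta, progress, rng, start_temp=5.0):
    """Accept worse moves with a probability that shrinks as the run progresses."""
    temperature = start_temp * (1.0 - progress)
    return delta <= 0 or (temperature > 0 and rng.random() < math.exp(-delta / temperature))


def local_search(initial, neighbour, is_feasible, cost, accept,
                 max_iterations=20_000, max_idle=5_000, seed=0):
    rng = random.Random(seed)
    current, current_cost = initial, cost(initial)
    best, best_cost = current, current_cost      # record of the best solution so far
    idle = 0                                     # iterations since the best improved
    for step in range(max_iterations):
        if idle >= max_idle:                     # termination: no improvement in x iterations
            break
        idle += 1
        candidate = neighbour(current, rng)      # move operator
        if not is_feasible(candidate):           # hard constraints
            continue
        candidate_cost = cost(candidate)         # soft constraints
        if accept(candidate_cost - current_cost, step / max_iterations, rng):
            current, current_cost = candidate, candidate_cost
            if current_cost < best_cost:
                best, best_cost = current, current_cost
                idle = 0
    return best, best_cost


# Toy timetable: slots[i] is the slot number of event i.
def move_one_event(slots, rng):
    changed = list(slots)
    changed[rng.randrange(len(changed))] = rng.randrange(10)
    return changed


def no_clashes(slots):
    return len(set(slots)) == len(slots)


def total_lateness(slots):
    return sum(slots)


best, best_cost = local_search([5, 6, 7, 8, 9], move_one_event, no_clashes,
                               total_lateness, simulated_annealing)
```

Swapping `simulated_annealing` for `hill_climbing` changes only the acceptance strategy; the rest of the loop stays the same, which is the point of the architecture.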
Hope this helps!

Optaplanner take fastest path

How can we configure OptaPlanner to select the fastest route? See the highlighted point in the image below. It is taking the long route.
Note: Vehicles do not need to come back to the depot. I think I cannot use CVRPTW, as arrivalAfterDueTimeAtDepot is a built-in hard constraint (and besides, I do not have any time constraints).
How can we write a constraint to select the vehicle with less capacity?
For example, a customer needs only 3 items and we have two vehicles with capacities 4 and 9. OptaPlanner seems to select the first vehicle in the input order by default.
I presume it's taking the blue vehicle for the center of Bengaluru because the green one is already at full capacity.
Check what the score is (calculated through Solver.getScoreDirectorFactory()) if you manually put that location in the green trip and swap the vehicles of the green and blue trips. If it's worse (or breaks a hard constraint), then it's normal that OptaPlanner selects the other solution. In that case, either your score function has a bug or you realize you don't want that solution at all. But if it does have a better score, OptaPlanner's <localSearch> (such as Late Acceptance) should find it (especially when scaling out, because ironically local optima are a bigger problem when scaling down). You can try adding <subchainSwapMoveSelector> etc. to escape local optima faster.
If you want to guide the search more (which is often not a good idea), you can define a planning value strength comparator to sort small vehicles before big vehicles and use the Construction Heuristic WEAKEST_FIT(_DECREASING).
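The idea behind sorting small vehicles before big ones can be shown with a plain Python sketch. This is not OptaPlanner's API or its actual construction heuristic, just the "try the weakest value that still fits first" intuition; `weakest_fit` and the vehicle names are hypothetical.

```python
def weakest_fit(demand, capacities):
    """Return the smallest-capacity vehicle that can still carry the demand."""
    for vehicle, capacity in sorted(capacities.items(), key=lambda item: item[1]):
        if capacity >= demand:
            return vehicle
    return None  # no vehicle is large enough


vehicles = {"large": 9, "small": 4}
chosen = weakest_fit(3, vehicles)  # the customer from the question needs 3 items
```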

Using Redis for "trending now" functionality

I'm working on a very high-throughput site with many items, and am looking into implementing "trending now" functionality that would let users quickly get a prioritized list of the top N items that have recently been viewed by many people, with items gradually fading away as they get fewer views.
One idea about how to do this is to give more weight to recent views of an item: something like a weight of 16 for every view of an item in the past 15 minutes, a weight of 8 for every view in the past 1 hour, a weight of 4 for views in the past 4 hours, etc., but I do not know if this is the right way to approach it.
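To pin down what that weighting would compute, here is a small Python sketch. It makes two assumptions that the description leaves open: each view counts once, in the narrowest window it falls into, and views older than 4 hours contribute nothing. The names `WINDOWS`, `view_weight` and `trending_score` are just for illustration.

```python
# (maximum age in seconds, weight per view), narrowest window first
WINDOWS = [
    (15 * 60, 16),
    (60 * 60, 8),
    (4 * 60 * 60, 4),
]


def view_weight(age_seconds):
    """Weight of one view, based on how long ago it happened."""
    for max_age, weight in WINDOWS:
        if age_seconds <= max_age:
            return weight
    return 0  # assumption: views older than the last window are ignored


def trending_score(view_times, now):
    """Sum of the weights of all views of one item."""
    return sum(view_weight(now - viewed_at) for viewed_at in view_times)


now = 100_000
views = [now - 300, now - 1_800, now - 7_200, now - 18_000]  # 5 min, 30 min, 2 h, 5 h ago
score = trending_score(views, now)  # 16 + 8 + 4 + 0 = 28
```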
I'd like to do this in Redis; we've had good success with Redis in the past for other projects.
What is the best way to do this, both in terms of technology and in determining what is trending?
The first answer hints at a solution but I'm looking for more detail -- starting a bounty.
These are both decent ideas, but not quite detailed enough. One got half the bounty but leaving the question open.
So, I would start with a basic time ordering (zset of item_id scored by timestamp, for example), and then float things up based on interactions. So you might decide that a single interaction is worth 10 minutes of 'freshness', so each interaction adds that much time to the score of the relevant item. If all interactions are valued equally, you can do this with one zset and just increment the scores as interactions occur.
If you want to have some kind of back-off, say, scoring by the square root of the interaction count instead of the interaction count directly, you could build a second zset with your score for interactions, and use zunionstore to combine this with your timestamp index. For this, you'll probably want to pull out the existing score, do some math on it and put a new score over it (zadd will let you overwrite a score).
The zunionstore is potentially expensive, and for sufficiently large sets even the zadd/zincrby gets expensive. To this end, you might want to keep only the N highest scoring items, for N=10,000 say, depending on your application needs.
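A rough sketch of the single-zset variant, written as plain Python so it runs without a server. The dict stands in for one Redis sorted set, and the comments name the Redis command each step would map to. The class name, the key name `trending` and the 10-minute constant are assumptions for illustration.

```python
FRESHNESS_SECONDS = 10 * 60  # one interaction is worth 10 minutes of freshness


class TrendingIndex:
    """In-memory stand-in for one Redis sorted set (member -> score)."""

    def __init__(self, max_items=10_000):
        self.max_items = max_items
        self.scores = {}

    def add_item(self, item_id, created_at):
        # Redis: ZADD trending <created_at> <item_id>
        self.scores[item_id] = float(created_at)
        self._trim()

    def record_interaction(self, item_id):
        # Redis: ZINCRBY trending 600 <item_id>
        self.scores[item_id] = self.scores.get(item_id, 0.0) + FRESHNESS_SECONDS
        self._trim()

    def top(self, n):
        # Redis: ZREVRANGE trending 0 <n-1>
        ranked = sorted(self.scores, key=self.scores.get, reverse=True)
        return ranked[:n]

    def _trim(self):
        # Redis: ZREMRANGEBYRANK trending 0 -(max_items + 1)
        for item_id in self.top(len(self.scores))[self.max_items:]:
            del self.scores[item_id]


index = TrendingIndex(max_items=2)
index.add_item("a", created_at=1000)
index.add_item("b", created_at=1100)
index.add_item("c", created_at=1200)   # "a" is trimmed: only the top 2 are kept
index.record_interaction("b")
index.record_interaction("b")          # "b" is now at 1100 + 2 * 600 = 2300
trending = index.top(2)                # ['b', 'c']
```

In a real deployment the trim would probably run periodically rather than on every write, since it is the expensive part.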
These two links are very helpful:
http://stdout.heyzap.com/2013/04/08/surfacing-interesting-content/
http://word.bitly.com/post/41284219720/forget-table
The Reddit ranking algorithm does a pretty good job of what you describe. Here is a good write-up that talks through how it works:
https://medium.com/hacking-and-gonzo/how-reddit-ranking-algorithms-work-ef111e33d0d9
Consider a sorted set with the number of views as the scores. Whenever an item is accessed, increment its score (http://redis.io/commands/zincrby). This way you can get the top items out of the set ordered by score.
You will need to "fade" the items too, maybe with an external process that decrements the scores.
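One pass of such a fading process might look like the sketch below, again with a plain dict standing in for the sorted set. The fixed decrement and the choice to drop items once they reach zero are assumptions; in Redis this would be a ZINCRBY with a negative amount per member, followed by a ZREMRANGEBYSCORE to remove the items that have faded out.

```python
def fade(scores, amount=1.0):
    """Return a copy of the scores with `amount` subtracted, dropping items at or below 0."""
    faded = {}
    for item_id, score in scores.items():
        remaining = score - amount
        if remaining > 0:
            faded[item_id] = remaining
    return faded


view_counts = {"a": 3, "b": 1, "c": 10}
after_one_pass = fade(view_counts)
```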

I am looking for a radio advertising scheduling algorithm / example / experience

Tried doing a bit of research on the following with no luck. Thought I'd ask here in case someone has come across it before.
I help a volunteer-run radio station with their technology needs. One of the main things that have come up is they would like to schedule their advertising programmatically.
There are a lot of neat and complex rule engines out there for advertising, but all we need is something pretty simple (along with any experience that's worth thinking about).
I would like to write something in SQL if possible to deal with these entities. Ideally if someone has written something like this for other advertising mediums (web, etc.,) it would be really helpful.
Entities:
Ads (consisting of a category, # of plays per day, start date, end date or permanent play)
Ad Category (Restaurant, Health, Food store, etc.)
To over-simplify the problem, this will be an elegant SQL statement. Getting there... :)
I would like to be able to generate a playlist per day using the above two entities where:
No two ads in the same category are played within x number of ads of each other.
(nice to have) high promotion ads can be pushed
At this time, there are no "ad slots" to fill. There are no "time of day" considerations.
We queue up the ads for the day and go through them between songs/shows, etc. We know how many per hour we have to fill, etc.
Any thoughts/ideas/links/examples? I'm going to keep on looking and hopefully come across something instead of learning it the long way.
Very interesting question, SMO. Right now it looks like a constraint programming problem because you aren't looking for an optimal solution, just one that satisfies all the constraints you have specified. In response to those who wanted to close the question, I'd say they need to check out constraint programming a bit. It's far closer to Stack Overflow than to any operations research site.
Look into constraint programming and scheduling - I'll bet you'll find an analogous problem toot sweet !
Keep us posted on your progress, please.
Ignoring the T-SQL request for the moment since that's unlikely to be the best language to write this in ...
One of my favorite approaches to tough 'layout' problems like this is Simulated Annealing. It's a good approach because you don't need to think about HOW to solve the actual problem: all you define is a measure of how good the current layout is (a score, if you will), and then you allow random changes that either increase or decrease that score. Over many iterations you gradually reduce the probability of moving to a worse score. This 'simulated annealing' approach reduces the probability of getting stuck in a local minimum.
So in your case the scoring function for a given layout might be based on the distance to the next advert in the same category and the distance to another advert of the same series. If you later have time of day considerations you can easily add them to the score function.
Initially you allocate the adverts sequentially, evenly or randomly within their time window (doesn't really matter which). Now you pick two slots and consider what happens to the score when you switch the contents of those two slots. If either advert moves out of its allowed range you can reject the change immediately. If both are still in range, does it move you to a better overall score? Initially you take changes randomly even if they make it worse but over time you reduce the probability of that happening so that by the end you are moving monotonically towards a better score.
Easy to implement, easy to add new 'rules' that affect score, can easily adjust run-time to accept a 'good enough' answer, ...
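As a rough illustration of the approach for the ad playlist, here is a minimal Python sketch. It only scores the "same category within x ads" rule and has no time windows, so there is no out-of-range rejection step. The function names, the linear cooling schedule, the default parameters and the sample ads are all assumptions made for the example.

```python
import math
import random


def spacing_penalty(playlist, categories, window):
    """Count pairs of same-category ads that are within `window` positions of each other."""
    penalty = 0
    for i, ad in enumerate(playlist):
        for j in range(i + 1, min(i + window + 1, len(playlist))):
            if categories[playlist[j]] == categories[ad]:
                penalty += 1
    return penalty


def anneal_playlist(playlist, categories, window,
                    iterations=5_000, start_temp=2.0, seed=0):
    rng = random.Random(seed)
    current = list(playlist)
    current_penalty = spacing_penalty(current, categories, window)
    best, best_penalty = list(current), current_penalty
    for step in range(iterations):
        if best_penalty == 0:                       # good enough, stop early
            break
        temperature = start_temp * (1 - step / iterations)
        i, j = rng.sample(range(len(current)), 2)   # pick two slots
        current[i], current[j] = current[j], current[i]
        new_penalty = spacing_penalty(current, categories, window)
        delta = new_penalty - current_penalty
        if delta <= 0 or rng.random() < math.exp(-delta / temperature):
            current_penalty = new_penalty           # keep the swap
            if new_penalty < best_penalty:
                best, best_penalty = list(current), new_penalty
        else:
            current[i], current[j] = current[j], current[i]  # undo the swap
    return best, best_penalty


categories = {"r1": "restaurant", "r2": "restaurant", "r3": "restaurant",
              "h1": "health", "h2": "health", "h3": "health"}
ads = ["r1", "r2", "r3", "h1", "h2", "h3"]
schedule, remaining = anneal_playlist(ads, categories, window=1)
```

Adding a new rule (say, pushing high-promotion ads earlier) means adding a term to `spacing_penalty`; the search loop itself does not change.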
Another approach would be to use a genetic algorithm; see this similar question: Best Fit Scheduling Algorithm. This is likely harder to program but will probably converge more quickly on a good answer.