OptaPlanner VRPPD with Traffic and Time Windows

Is there currently a way to incorporate traffic patterns into OptaPlanner with the package and delivery VRP problem?
E.g. let's say I need to optimize 500 pickups and deliveries today and tomorrow amongst 30 vehicles, where each pickup has a 1-4 hour time window. I want to avoid busy areas of the city during rush hours when possible.
New pickups can also be added (or cancelled) in the meantime.
I'm sure this is a common problem. Does a decent solution exist for this in OptaPlanner?
Thanks!

Users often do this, but there is no out-of-the-box example of it.
There are several ways to do it, but one way is to add a third dimension to the distanceMatrix, indicating the departureTime. Typically that uses a granularity of 15 minutes, 30 minutes or 1 hour.
There are 2 scaling concerns here:
Memory: a 15-minute granularity means 24 * 4 = 96 time buckets per day. Given that with 2 dimensions a 10k-location distanceMatrix already uses almost 2 GB of RAM, memory can clearly become a concern.
Pre-calculation time: calculating the distance matrix can be time consuming. "Bulk algorithms" can help here. For example, the GraphHopper community edition doesn't support bulk distance calculations, but their enterprise version does, as does OSRM (which is free). Getting a 3-dimensional matrix from the remote Google Maps API, or the remote enterprise GraphHopper API, can also raise bandwidth concerns (see above: the distance matrix can become several GB in size, especially in non-binary formats such as JSON or CSV).
In any case, once that 3-dimensional matrix is there, it's just a matter of adjusting the OptaPlanner example's ArrivalTimeUpdateListener to use getDistance(from, to, departureTime).
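As a rough illustration, a time-bucketed lookup could look like the sketch below. The class and method names are hypothetical (not OptaPlanner API); only getDistance(from, to, departureTime) mirrors the signature suggested above.

```java
// Hypothetical sketch of a 3-dimensional distance matrix bucketed by departure time.
public class TimeDependentDistanceMatrix {

    private final int bucketLengthInMinutes; // granularity: e.g. 15, 30 or 60
    // travelTime[fromIndex][toIndex][bucketIndex], e.g. in seconds
    private final int[][][] travelTime;

    public TimeDependentDistanceMatrix(int bucketLengthInMinutes, int[][][] travelTime) {
        this.bucketLengthInMinutes = bucketLengthInMinutes;
        this.travelTime = travelTime;
    }

    /** departureTime is expressed in minutes since midnight. */
    public int getDistance(int fromIndex, int toIndex, int departureTime) {
        int bucketsPerDay = travelTime[0][0].length; // 96 buckets at 15-minute granularity
        int bucket = (departureTime / bucketLengthInMinutes) % bucketsPerDay;
        return travelTime[fromIndex][toIndex][bucket];
    }
}
```

The example's arrival-time shadow variable listener would then feed each visit's departure time into this lookup instead of into a 2-dimensional one.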

Related

Is it possible to model the Universe in an object oriented manner from the subatomic level upwards?

While I'm certain this must have been tried before, I can't seem to find any examples of this concept being done.
What I'm describing builds on the idea that you could effectively model all "things" that exist as objects. From there you can make objects which use other objects. An example would be starting with the fundamental particles in physics, combining them to get particles like protons, neutrons and electrons, then atoms, then working your way up to the rest of chemistry, and so on.
Has this been attempted before and is it possible? How would I even go about it?
If what you mean by "the Universe" is the entire actual universe, the answer to "Is it possible?" is a resounding "Hell no!!!"
Consider a single mole of H2O, good old water. By definition a mole contains ~6×10^23 atoms, and knowing the atomic weights involved yields the mass. The density of water is well known. Pulling all the pieces together, we end up with 1 mole being about 18 mL of water. To put that in perspective, the cough syrup dose cup in my medicine cabinet is 20 mL.

If you could represent the state of each atom using a single byte (I doubt it!), you'd require 10^11 terabytes of storage just to represent a snapshot of that mass, and you'd need to update that volume of data every delta-t for the duration you wish to simulate. Additionally, the number of 2-way interactions between N entities grows as O(N^2), i.e., on the order of 10^46 calculations would be involved, again at every delta-t. To put that into perspective, if you had access to the world's fastest current distributed computer with exaflop capability, it would take you O(10^28) seconds (on the order of 10^20 years) to perform the calculations for a single simulated delta-t update!

You might be able to improve that by playing games with locality, but given the speed of light and the small distances involved you'd have to make a convincing case that heat transfer via thermal radiation couldn't cause state-altering interactions between any pair of atoms within the volume. To sum it up, the storage and calculation requirements are both infeasible for as little as a single mole of mass.
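For the curious, the order-of-magnitude arithmetic above fits in a few lines (a throwaway sketch; the constants follow the rounding used in the text):

```java
// Back-of-envelope reproduction of the estimates above (order of magnitude only).
public class MoleEstimate {
    public static void main(String[] args) {
        double atoms = 6e23;                        // ~1 mole of particles
        double storageTerabytes = atoms * 1 / 1e12; // 1 byte of state per atom
        double interactions = 1e46;                 // O(N^2), rounded down as in the text
        double exaflops = 1e18;                     // world's fastest machine, FLOP/s
        double secondsPerStep = interactions / exaflops;
        double yearsPerStep = secondsPerStep / 3.15e7; // ~seconds per year
        System.out.printf("storage: ~%.0e TB%n", storageTerabytes);    // ~6e11 TB
        System.out.printf("one delta-t: ~%.0e s = ~%.0e years%n",
                secondsPerStep, yearsPerStep);                         // ~1e28 s
    }
}
```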
I know from a conversation at a conference a couple of years ago that there are some advanced physics labs that have worked on this approach to get an idea of what happens with a few thousand atoms. However, I can't give specific references since I haven't seen the papers and only heard about it over a beer.

Using OptaPlanner to solve VRPTW with a large number of customers and sophisticated constraints

I'm developing a solver for a VRPTW problem using OptaPlanner, and I have run into a problem when a large number of customers needs to be serviced. By a large number I mean up to 10,000 customers. I have tried running the solver for about 48 hours, but no feasible solution was ever reached.
I use a highly customized VRPTW domain model that introduces an additional planning entity, the so-called "Workbreak". Workbreaks are like customers, but they can have a location that is actually another planning value, because every day a worker can return home or go to a hotel. Workbreaks have a fixed time of departure (usually the next morning) and a variable time of arrival (because it depends on the previous entity within the chain). A hard constraint forbids "arriving" at a Workbreak after a certain point in time. There are other hard constraints too, like:
multiple service time windows per customer
every week the last customer in chain must be a special customer "storage space visit" (workers need to gather materials before the next week)
long jobs management (when a customer needs to be serviced for longer than a specified time, it should be serviced before a specific hour of the day)
max number of jobs per workday
max total job duration per workday (as worker cannot work longer than specified time)
a workbreak cannot have a hotel location that is too close to the worker's home
jobs cannot be serviced on Sundays
... and many more - there are a total of 19 hard constraints that have to be applied. There are 3 soft constraints too.
All the aforementioned constraints were initially written as Drools rules, but because of the many accumulation-based constraints (max jobs per day, max hours per day, overtime hours per week), the overall speed of the solver (in benchmarks) was only about 400 steps/sec.
At first I thought the solver's speed was simply too slow to reach a feasible solution in a reasonable time, so I rewrote all the rules as an easy score calculator, which had decent speed: about 4,600 steps/sec. I knew it would only perform well for a really small number of customers, but I wanted to know whether Drools was the cause of the poor performance. Then I rewrote all these rules as an incremental score calculator (and survived the pain of corrupted-score bugs until all of them were fixed). Surprisingly, incremental score calculation is a bit slower than the easy score calculator for a small number of customers, but that is not an issue, because the overall speed stays at about 4,000 steps/sec no matter how many entities I have.
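(For reference, an incremental calculator for a problem like this has roughly the shape below. This is a bare skeleton: MySolution and the accumulator bookkeeping are placeholders, and the exact interface package and generics depend on the OptaPlanner version.)

```java
// Bare skeleton of an incremental score calculator (MySolution is a placeholder;
// check your OptaPlanner version's docs for the exact interface package and generics).
import org.optaplanner.core.api.score.buildin.hardsoft.HardSoftScore;
import org.optaplanner.core.impl.score.director.incremental.IncrementalScoreCalculator;

public class VrptwIncrementalScoreCalculator implements IncrementalScoreCalculator<MySolution> {

    private int hardScore;
    private int softScore;

    @Override
    public void resetWorkingSolution(MySolution solution) {
        hardScore = softScore = 0;
        // Rebuild every accumulator (jobs per day, hours per day, ...) from scratch.
    }

    @Override
    public void beforeVariableChanged(Object entity, String variableName) {
        // Retract this entity's current contribution from the accumulators.
    }

    @Override
    public void afterVariableChanged(Object entity, String variableName) {
        // Insert this entity's new contribution into the accumulators.
    }

    @Override
    public void beforeEntityAdded(Object entity) {}
    @Override
    public void afterEntityAdded(Object entity) { /* insert contribution */ }
    @Override
    public void beforeEntityRemoved(Object entity) { /* retract contribution */ }
    @Override
    public void afterEntityRemoved(Object entity) {}

    @Override
    public HardSoftScore calculateScore() {
        return HardSoftScore.of(hardScore, softScore);
    }
}
```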
The thing that bugs me the most is that above a certain number of customers (problems start at around 1,000 customers) the solver cannot reach a feasible solution. Currently I'm using the Late Acceptance and Step Counting algorithms, because they perform really well for this kind of problem (at least for smaller numbers of customers). I used Simulated Annealing too, but without success, mostly because I could not find good values for its algorithm-specific parameters.
I have implemented some custom moves too:
A composite move that changes a workbreak's location when sibling entities are changed by other moves like change/swap moves (it helps escape many score traps, as an improving step usually requires at least two moves to be performed at once)
A move factory for better long-job assignment (it generates moves that try to put customers with longer service times at the front of a workday chain)
A workbreak assignment move factory (it generates moves that help put workbreaks in the proper sequence)
Now I'm scratching my head, wondering what I should do to diagnose the source of my problem. I suspected it might be hitting a score trap, but I modified the solver so it saves a snapshot of the best score each minute. After reading these snapshots I realized that the score was still improving. Can the number of hard constraints play a role? I suspect that many moves need to be evaluated to find one that improves the score. Maybe 48 hours isn't that much for this kind of problem, and it should compute for a whole week? Unfortunately I have nothing to compare with.
I would like to know how to find out whether this is purely a performance problem, or a configuration problem (algorithm, custom moves, hard/soft score).
I really apologize for my bad English.
TL;DR but FWIW:
To scale above 1k locations you need to use Nearby Selection.
To scale above 10k locations, add Partitioned Search too.
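For reference, enabling Nearby Selection is a solver-config change along these lines (a minimal sketch based on the OptaPlanner docs; com.example.CustomerNearbyDistanceMeter and the distribution size are placeholders to adapt):

```xml
<!-- Sketch: Nearby Selection on a change move selector (values are examples). -->
<localSearch>
  <unionMoveSelector>
    <changeMoveSelector>
      <entitySelector id="entitySelector1"/>
      <valueSelector>
        <nearbySelection>
          <originEntitySelector mimicSelectorRef="entitySelector1"/>
          <nearbyDistanceMeterClass>com.example.CustomerNearbyDistanceMeter</nearbyDistanceMeterClass>
          <parabolicDistributionSizeMaximum>40</parabolicDistributionSizeMaximum>
        </nearbySelection>
      </valueSelector>
    </changeMoveSelector>
  </unionMoveSelector>
</localSearch>
```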

Using OptaPlanner to solve large Vehicle Routing case

I have 4 people to visit 22,000 places, and I need to minimize the total time of the visits.
I have the spatial location of the places, and I'm thinking of computing the distances between them using either Euclidean distance or the Google Maps API.
Is it possible to solve this problem using OptaPlanner?
I'm thinking of modeling it as a Vehicle Routing problem. Is this the best option? Would OptaPlanner support this amount of input data?
OptaPlanner has handled cases like this, but you'll need to enable "nearby selection" explicitly because it's above 1k locations.
Because it's above 10k locations, it might be interesting to benchmark (using the benchmarker) with Partitioned Search too. For example, to speed up the Construction Heuristic, you might want to wrap it in a Partitioned Search. You probably can't wrap everything, because there are only 4 people.
As for using the Google Maps API, first read this blog. Then: 10k locations takes almost 2 GB of RAM IIRC to store the distance matrix in its most efficient form (a two-dimensional array of 32-bit values) - this has nothing to do with OptaPlanner. I suspect 22k locations will bring you near 10 GB of RAM just to load that into memory.
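The "nearby selection" mentioned above needs a distance meter. A minimal sketch could look like the following (Customer and Standstill follow the OptaPlanner VRP example's class names; the import path may differ per OptaPlanner version):

```java
// Sketch of a NearbyDistanceMeter for nearby selection (class names follow the
// OptaPlanner VRP example; adapt to your own domain model).
import org.optaplanner.core.impl.heuristic.selector.common.nearby.NearbyDistanceMeter;

public class CustomerNearbyDistanceMeter implements NearbyDistanceMeter<Customer, Standstill> {

    @Override
    public double getNearbyDistance(Customer origin, Standstill destination) {
        // Plain travel distance; a road-matrix lookup or Euclidean fallback goes here.
        return origin.getLocation().getDistanceTo(destination.getLocation());
    }
}
```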

OptaPlanner CVRPTW example clarification

I am trying to understand the OptaPlanner CVRPTW example and have the following questions:
Does every node require both the distance and the travel time to every other node, or just one of them? The example data set does not contain both. I think it uses the Euclidean formula to calculate the distance, but how does it automatically calculate the travel time?
Is it possible to use real data (precalculated road distances)?
That depends on whether the dataset uses AirLocation or RoadLocation. See the docs on vehicle routing, chapter 3.
Yes, if you can hold all the data in memory. At 10k+ locations this becomes a problem, because (10k)² ints require almost 2 GB of RAM. The goal of SegmentedRoadLocation is to scale up to 100k locations without using a lot of RAM, but generating good segmented road locations has proven to be difficult.
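To make the second answer concrete: with RoadLocation, each location carries a precalculated row of the distance matrix, roughly like this sketch (field and method names approximate the example code; check the example sources for the exact version):

```java
// Approximate shape of the example's RoadLocation: each location holds its
// precalculated travel distances to all other locations.
import java.util.Map;

public class RoadLocation extends Location {

    // Row of the precalculated distance matrix for this location.
    private Map<RoadLocation, Double> travelDistanceMap;

    @Override
    public long getDistanceTo(Location location) {
        double distance = travelDistanceMap.get((RoadLocation) location);
        // Scaled to a long so score calculation stays integer-based.
        return (long) (distance * 1000.0 + 0.5);
    }

    public void setTravelDistanceMap(Map<RoadLocation, Double> travelDistanceMap) {
        this.travelDistanceMap = travelDistanceMap;
    }
}
```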

How to calculate ETA without a map

Need a little help from someone who knows a little about logistics;
I am currently working with an application known as Framework. The application is not really something I am familiar with, but regardless, I can figure out how it works. One of the tabs in the application is for expected orders (shipping trucks). Within that, I am able to see an outbound truck's current location as well as its destination. I am trying to add functionality that would show an estimated time of arrival at its current destination, plus the drive back to my location. This seems simple enough, but I'm trying to figure out the best way to calculate it. I looked into the Google Distance Matrix API, but I have no need to display a map in the application; all I want is the ETA. I am pretty inexperienced with this kind of thing, so I was hoping someone could point me in the right direction.
Thanks guys.
This may not be the best forum for this question...
It looks like the Google Distance Matrix API requires you to display a map. An alternative is the open source OSRM project. Natively it's a C++ routing engine that outputs directions and the total route information, so any map display is up to you.
There is a demo and an HTTP API hosted on the project site, but you will need to check whether it's suitable for your usage level.
Just an idea, but depending on the size of your delivery area and how accurate you want the estimated time to be, you may be able to keep it all in a database.
Let's assume your delivery area is 10 miles x 10 miles.
So that's 100 square miles. We'll use each square mile as a point.
Do a one-time calculation of how long it will take to get from each point to all the others. You can use the Google Distance Matrix API for this, since you're only doing it once.
This will give you 10,000 records containing every point-to-point time.
So, if your truck is at point 25 and has to get to point 64, you do a lookup and see that it should take about 10 minutes. The drive from point 64 back to the warehouse (point 10) is another 8 minutes. So you know the truck should be back in about 18 minutes.
It's not super accurate, but it might be close enough for your needs. I would be curious if you do implement this method.
Btw, if your delivery area is 100 miles x 100 miles, then at 1 square mile per point that's 10,000 points, which means 100,000,000 point-to-point records. If that's too much, increasing your point size to 2 miles x 2 miles (4 square miles) brings it down to about 2,500 points and roughly 6,000,000 records.
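In code, the lookup-table idea boils down to something like this sketch (all names hypothetical; the travel times come from the one-time Distance Matrix run described above):

```java
// Hypothetical grid-based ETA lookup: a one-time matrix of travel minutes between
// grid points, after which an ETA is just two lookups and an addition.
import java.util.HashMap;
import java.util.Map;

public class EtaLookup {

    // Key packs (fromPoint, toPoint); works while point ids stay below 100,000.
    private final Map<Long, Integer> travelMinutes = new HashMap<>();

    public void put(int fromPoint, int toPoint, int minutes) {
        travelMinutes.put(fromPoint * 100_000L + toPoint, minutes);
    }

    /** Minutes until the truck reaches 'destination' and then returns to 'home'. */
    public int etaBackMinutes(int currentPoint, int destination, int home) {
        return travelMinutes.get(currentPoint * 100_000L + destination)
                + travelMinutes.get(destination * 100_000L + home);
    }
}
```

With the numbers above, etaBackMinutes(25, 64, 10) returns 10 + 8 = 18 minutes.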