Strategy to tackle knapsack binded with travelling salesman - optimization

I have been assigned the following problem as a research topic for summer. However, I have not been able to successfully find related problem, except that it seems to be a combination of travelling salesman with the knapsack, even though I'm not sure if that is the case. The statement is:
You are a truck driver who earned a big contract and now you must
deliver a lot of packages. There are N packages to be delivered, each
one has to be delivered at a certain address (x,y) on the city.
Additionally each package i has a weight Wi.
For simplicity, suppose the distribution area is rectangular and you
always start at the point (0,90).
You only have one truck, with limited capacity of 1000 (excluding
truck weight). The truck base weight is 10.
The distances to be travelled are far away, so the distance shall be
computer using Haversine distance.
The company who contracted you will provide you with enough fuel, so
you can make unlimited ammount of travels.
However, you must be very careful while delivering the packages since
you must deliver every single one of them and, if you choose to pick
up a package during a trip, you must deliver it no matter what, you
can't leave them in the middle of your trip.
As you are a bit miserly, you agree on the conditions, but you know
that if you don't take a close to optimal strategy, your truck can
wear too much, so you could end leaving the contract incomplete,
which will get you sued and leave you without truck and money.
So, due to your experience, you know that to maximize the chances of survival
of your truck, you must minimize the following function:
http://goo.gl/jQxXzN (sorry, I can't post images because I have not enough reputation).
where m is the number of tripss, n is the number of packages for each trip j, wij is the weight of the i-th gift during the j-th trip, Dist represents Haversine distance between two points, Loc(i) is the location of the i-th gift, Loc(0) and Loc(n) are the (0,90) point (starting point), and wnj (last weight of the trip) is always the truck weight (base weight).
So, basically, those are all the restictions of the research topic I got.
I have been thinking maybe some metaheuristics such as ant collony or genetic algorithms could be of help, but I would have to get to know the problem a bit better. Any idea or information (paper, book, etc) is welcomed.

This sounds like a variation of the Capacitated Vehicle Routing Problem (CRVP), specifically with with single-vehicle and single-depot (however with non-uniform packages). For some reference on this problem, see e.g.:
On the Capacitated Vehicle Routing Problem (T.K. Ralphs, L. Kopman, W.R. Pulleyblank, and L.E. Trotter, Jr)
Modeling and Solving the Capacitated Vehicle Routing Problem on Trees (B. Chandran and S. Raghavan)
I think your idea of metaheuristics---ant colony optimisation (ACO) in particular---would be a wise approach. Note that for TSP related problems, generally ACO is to prefer over genetic algorithms (GA).
Possibly the following somewhat well known article (1) can get you started to studying the possible benefits of ACO approach. (2) extends the result from (1). Note that this covers the regular VHP (no capacitated), but it should prove valuable as a starting point, the least giving inspiration.
SavingsAnts for the Vehicle Routing Problem (K. Doerner et al.)
An improved ant colony optimization for vehicle routing problem (Yu Bin et al.)
There also seems to exists literature specifically one the subject of ACO and CVRP, but I cannot comment to the quality of these, but I'll list them for you to inspect by yourself. Note that (3) is an extension from the authors of (1).
Parallel Ant Systems for the Capacitated Vehicle Routing Problem (K. Doerner et al.)
Ant Colony Optimization for Capacitated Vehicle Routing Problem (W.F. Tan et al.)
Sound like an interesting research topic, good luck!

Related

Is it possible to model the Universe in an object oriented manner from the subatomic level upwards?

While I'm certain this must have been tried before, I cant seem to find any examples of this concept being done myself.
What I'm describing goes off of the idea that effectively you could model all "things" which are as objects. From their you can make objects which use other objects. An example would be starting at the fundamental particles in physics combine them to get certain particles like protons neutrons and electrons - then atoms - work your way up to the rest of chemistry etc....
Has this been attempted before and is it possible? How would I even go about it?
If what you mean by "the Universe," is the entire actual universe, the answer to "Is it possible?" is a resounding "Hell no!!!"
Consider a single mole of H2O, good old water. By definition a mole contains ~6*1023 atoms, and knowing the atomic weights involved yields the mass. The density of water is well known. Pulling all the pieces together, we end up with 1 mole is about 18 mL of water. To put that in perspective, the cough syrup dose cup in my medicine cabinet is 20mL. If you could represent the state of each atom using a single byteβ€”I doubt it!β€”you'd require 1011 terabytes of storage just to represent a snapshot of that mass, and you'd need to update that volume of data every delta-t for the duration you wish to simulate. Additionally, the number of 2-way interactions between N entities grows as O(N2), i.e., on the order of 1046 calculations would be involved, again at every delta-t. To put that into perspective, if you had access to the world's fastest current distributed computer with exaflop capability, it would take you O(1028) seconds (on the order of 1020 years) to perform the calculations for a single simulated delta-t update! You might be able to improve that by playing games with locality, but given the speed of light and the small distances involved you'd have to make a convincing case that heat transfer via thermal radiation couldn't cause state-altering interactions between any pair of atoms within the volume. To sum it up, the storage and calculation requirements are both infeasible for as little as a single mole of mass.
I know from a conversation at a conference a couple of years ago that there are some advanced physics labs that have worked on this approach to get an idea of what happens with a few thousand atoms. However, I can't give specific references since I haven't seen the papers and only heard about it over a beer.

First Fit Decreasing Algorithm on CVRP in Optaplanner

I am using the FFD Algorithm in Optaplanner as a construction heuristic for my CVRP problem. I thought I understood the FFD-Alg from bin picking, but I don't understand the logic behind it when applied in OP on CVRP. So my thought was, it focuses on the demands (Sort cities in decreasing order, starting with the highest demand). To proof my assumption, I fixed the city coordinates to only one location, so the distance to the depot of all cities is the same. Then I changed the demands from big to small. But it doesn't take the cities in decrasing order in the result file.
The Input is: City 1: Demand 16, City 2: Demand 12, City 3: Demand 8, City 4: Demand 4,
City 5: Demand 2.
3 Vehicles with a capacity of 40 per vehicle.
What I thougt: V1<-[C1,C2,C3,C4], V2<-[C5]
What happened: V1<-[C5,C4,C3,C2], V2<-[C1]
Could anyone please explain me the theory of this? Also, I would like to know what happens the other way around, same capacities, but different locations per customer. I tried this too, but it also doesn't sort the cities beginning with the farthest one.
Thank you!
(braindump)
Unlike with non-VRP problems, the choice of "difficulty comparison" to determine the "Decreasing Difficulty" of "First Fit Decreasing", isn't always clear. I've done experiments with several forms - such as based on distance to depot, angle to depot, latitude, etc. You can find all those forms of difficulty comperators in the examples, usually TSP.
One common pitfall is to tweak the comperator before enabling Nearby Selection: tweak Nearby Selection first. If you're dealing with large datasets, an "angle to depo" comparator will behave much better, just because Nearby Selection and/or Paritioned Search aren't active yet. In any case, as always: you need to be using optaplanner-benchmark for this sort of work.
This being said, on a pure TSP use case, the First Fit Decreasing algorithm has worse results than the Nearest Neighbor algorithm (which is another construction heuristics for which we have limited support atm). Both require Local Search to improve further, of course. However, translating the Nearest Neighbor algorithm to VRP is difficult/ambiguous (to say the least): I was/am working on such a translation and I am calling it the Bouy algorithm (look for a class in the vrp example that starts with Bouy). It works, but it doesn't combine well with Nearby Selection yet IIRC.
Oh, and there's also the Clarke and Wright savings algorithm, which is great on small pure CVRP cases. But suffers from BigO (= scaling) problems on bigger datasets and it too becomes difficult/ambiguous when other constraints are added (such as time windows, skill req, lunch breaks, ...).
Long story short: the jury's still out on what's the best construction heuristic for real-world, advanced VRP cases. optaplanner-benchmark will help us there. This despite all the academic papers talking about their perfect CH for a simple form of VRP on small datasets...

Confusion about NP-hard and NP-Complete in Traveling Salesman problems

Traveling Salesman Optimization(TSP-OPT) is a NP-hard problem and Traveling Salesman Search(TSP) is NP-complete. However, TSP-OPT can be reduced to TSP since if TSP can be solved in polynomial time, then so can TSP-OPT(1). I thought for A to be reduced to B, B has to be as hard if not harder than A. As I can see in the below references, TSP-OPT can be reduced to TSP. TSP-OPT is supposed to be harder than TSP. I am confused...
References: (1)Algorithm, Dasgupta, Papadimitriou, Vazirani Exercise 8.1 http://algorithmics.lsi.upc.edu/docs/Dasgupta-Papadimitriou-Vazirani.pdf https://cseweb.ucsd.edu/classes/sp08/cse101/hw/hw6soln.pdf
http://cs170.org/assets/disc/dis10-sol.pdf
I took a quick look at the references you gave, and I must admit there's one thing I really dislike in your textbook (1st pdf) : they address NP-completeness while barely mentioning decision problems. The provided definition of an NP-complete problem also somewhat deviates from what I'd expect from a textbook. I assume that was a conscious decision to make the introduction more appealing...
I'll provide a short answer, followed by a more detailed explanation about related notions.
Short version
Intuitively (and informally), a problem is in NP if it is easy to verify its solutions.
On the other hand, a problem is NP-hard if it is difficult to solve, or find a solution.
Now, a problem is NP-complete if it is both in NP, and NP-hard. Therefore you have two key, intuitive properties to NP-completeness. Easy to verify, but hard to find solutions.
Although they may seem similar, verifying and finding solutions are two different things. When you use reduction arguments, you're looking at whether you can find a solution. In that regard, both TSP and TSP-OPT are NP-hard, and it is difficult to find their solutions. In fact, the third pdf provides a solution to excercise 8.1 of your textbook, which directly shows that TSP and TSP-OPT are equivalently hard to solve.
Now, one major distinction between TSP and TSP-OPT is that the former is (what your textbook call) a search problem, whereas the latter is an optimization problem. The notion of verifying the solution of a search problem makes sense, and in the case of TSP, it is easy to do, therefore it is NP-complete. For optimization problems, the notion of verifying a solution becomes weird, because I can't think of any way to do that without first computing the size of an optimal solution, which is roughly equivalent to solving the problem itself. Since we do not know an efficient way to verify a solution for TSP-OPT, we cannot say that it is in NP, thus we cannot say that it is NP-complete. (More on this topic in the detailed explanation.)
The tl;dr is that for TSP-OPT, it is both hard to verify and hard to find solutions, while for TSP it is easy to verify and hard to find solutions.
Reductions arguments only help when it comes to finding solutions, so you need other arguments to distinguish them when it comes to verifying solutions.
More details
One thing your textbook is very brief about is what a decision problem is.
Formally, the notion of NP-completeness, NP-hardness, NP, P, etc, were developed in the context of decision problems, and not optimization or search problems.
Here's a quick example of the differences between these different types of problems.
A decision problem is a problem whose answer is either YES or NO.
TSP decision problem
Input: a graph G, a budget b
Output: Does G admit a tour of weight at most b ? (YES/NO)
Here is the search version
TSP search problem
Input: a graph G, a budget b
Output: Find a tour of G of weight at most b, if it exists.
And now the optimization version
TSP optimization problem
Input: a graph G
Output: Find a tour of G with minimum weight.
Out of context, the TSP problem could refer to any of these. What I personally refer to as the TSP is the decision version. Here your textbook use TSP for the search version, and TSP-OPT for the optimization version.
The problem here is that those various problems are strictly distinct. The decision version only ask for existence, while the search version asks for more, it needs one example of such a solution. In practice, we often want to have the actual solution, so more practical approaches may omit to mention decision problems.
Now what about it? The definition of an NP-complete problem was meant for decision problems, so it technically does not apply directly to search or optimization problems. But because the theory behind it is well defined and useful, it is handy to still apply the term NP-complete/NP-hard to search/optimization problem, so that you have an idea of how hard these problems are to solve. So when someone says the travelling salesman problem is NP-complete, formally it should be the decision problem version of the problem.
Obviously, many notions can be extended so that they also cover search problems, and that is how it is presented in your textbook. The differences between decision, search, and optimization, are precisely what the exercises 8.1 and 8.2 try to cover in your textbook. Those exercises are probably meant to get you interested in the relationship between these different types of problems, and how they relate to one another.
Short Version
The decision problem is NP-complete because you can both have a polynomial time verifier for the solution, as well as the fact that the hamiltonian cycle problem is reducible to TSP_DECIDE in polynomial time.
However, the optimization problem is strictly NP-hard, because even though TSP_OPTIMIZE is reducible from the hamiltonian (HAM) cycle problem in polynomial time, you don't have a poly time verifier for a claimed hamiltonian cycle C, whether it is the shortest or not, because you simply have to enumerate all possibilities (which consumes the factorial order space & time).
What the given reference define is, bottleneck TSP
The Bottleneck traveling salesman problem (bottleneck TSP) is a problem in discrete or combinatorial optimization. The problem is to find the Hamiltonian cycle in a weighted graph which minimizes the weight of the most weighty edge of the cycle.
The problem is known to be NP-hard. The decision problem version of this, "for a given length x is there a Hamiltonian cycle in a graph G with no edge longer than x?", is NP-complete. NP-completeness follows immediately by a reduction from the problem of finding a Hamiltonian cycle.
This problem can be solved by performing a binary search or sequential search for the smallest x such that the subgraph of edges of weight at most x has a Hamiltonian cycle. This method leads to solutions whose running time is only a logarithmic factor larger than the time to find a Hamiltonian cycle.
Long Version
The mistake is to say that the TSP is NP complete. Truth is that TSP is NP hard. Let me explain a bit:
The TSP is a problem defined by a set of cities and the distances
between each city pair. The problem is to find a circuit that goes
through each city once and that ends where it starts. This in itself
isn't difficult. What makes the problem interesting is to find the
shortest circuit among all those that are possible.
Solving this problem is quite simple. One merely need to compute the length of all possible circuits, then keep the shortest one. Issue is that the number of such circuits grows very quickly with the number of cities. If there are n cities then this number is factorial of n-1 = (n-1)(n-2)...3.2.
A problem is NP if one can easily (in polynomial time) check that a proposed solution is indeed a solution.
Here is the trick.
In order to check that a proposed tour is a solution of the TSP we need to check two things, namely
That each city is is visited only once
That there is no shorter tour than the one we are checking
We didn't check the second condition! The second condition is what makes the problem difficult to solve. As of today, no one has found a way to check condition 2 in polynomial time. It means that the TSP isn't in NP, as far as we know.
Therefore, TSP isn't NP complete as far as we know. We can only say that TSP is NP hard.
When they write that TSP is NP complete, they mean that the following decision problem (yes/no question) is NP complete:
TSP_DECISION : Given a number L, a set of cities, and distance between all city pairs, is there a tour visiting each city exactly once of length less than L?
This problem is indeed NP complete, as it is easy (polynomial time) to check that a given tour leads to a yes answer to TSPDECISION.

ACO Pheromone update

I'm working on ACO and a little confused about the probability of choosing the next city. I have read some papers and books but still the idea of choosing is unclear. I am looking for a simple explanation how this path building works.
Also how does the heuristics and pheromone come into this decision making?
Because we have same pheromone values at every edge in the beginning and the heuristics (closeness) values remains constant, so how will different ants make different decisions based on these values?
Maybe is too late to answer to question but lately I have been working with ACO and it was also a bit confusing for me. I didn't find much information on stackoverflow about ACO when I needed it, so I have decided to answer this question since maybe this info is usefull for people working on ACO now or in the future.
Swarm Intelligence Algorithms are a set of techniques based on emerging social and cooperative behavior of organisms grouped in colonies, swarms, etc.
The collective intelligence algorithm Ant Colony Optimization (ACO) is an optimization algorithm inspired by ant colonies. In nature, ants of some species initially wander randomly until they find a food source and return to their colony laying down a pheromone trail. If other ants find this trail, they are more likely not to keep travelling at random, but instead to follow the trail, and reinforcing it if they eventually find food.
In the Ant Colony Optimization algorithm the agents (ants) are placed on different nodes (usually it is used a number of ants equal to the number of nodes). And, the probability of choosing the next node (city) is based on agents chose the next node using the equation known as the transition rule, that represents the probability for ant π‘˜ to go from city 𝑖 to city 𝑗 on the 𝑑th tour.
In the equation, πœπ‘–π‘— represents the pheromone trail and πœ‚π‘–π‘— the visibility between the two cities, while 𝛼 and 𝛽 are adjustable parameters that control the relative weight of trail intensity and visibility.
At the beginning there is the same value of pheromone in all the edges. Based on the transition rule, which, in turn, is based on the pheromone and the visibility (distance between nodes), some paths will be more likely to be chosen than others.
When the algorithm starts to run each agent (ant) performs a tour (visits each node), the best tour found until the moment will be updated with a new quantity of pheromone, which will make that tour more probably to be chosen next time by the ants.
You can found more information about ACO in the following links:
http://dataworldblog.blogspot.com.es/2017/06/ant-colony-optimization-part-1.html
http://dataworldblog.blogspot.com.es/2017/06/graph-optimization-using-ant-colony.html
http://hdl.handle.net/10609/64285
Regards,
Ester

Optimizing assignments based on many variables

I was recently talking with someone in Resource Management and we discussed the problem of assigning developers to projects when there are many variables to consider (of possibly different weights), e.g.:
The developer's skills & the technology/domain of the project
The developer's travel preferences & the location of the project
The developer's interests and the nature of the project
The basic problem the RM person had to deal with on a regular basis was this: given X developers where each developers has a unique set of attributes/preferences, assign them to Y projects where each project has its own set of unique attributes/requirements.
It seems clear to me that this is a very mathematical problem; it reminds me of old optimization problems from algebra and/or calculus (I don't remember which) back in high school: you know, find the optimal dimensions for a container to hold the maximum volume given this amount of materialβ€”that sort of thing.
My question isn't about the math, but rather whether there are any software projects/libraries out there designed to address this kind of problem. Does anyone know of any?
My question isn't about the math, but rather whether there are any software projects/libraries out there designed to address this kind of problem. Does anyone know of any?
In my humble opinion, I think that this is putting the cart before the horse. You first need to figure out what problem you want to solve. Then, you can look for solutions.
For example, if you formulate the problem by assigning some kind of numerical compatibility score to every developer/project pair with the goal of maximizing the total sum of compatibility scores, then you have a maximum-weight matching problem which can be solved with the Hungarian algorithm. Conveniently, this algorithm is implemented as part of Google's or-tools library.
On the other hand, let's say that you find that computing compatibility scores to be infeasible or unreasonable. Instead, let's say that each developer ranks all the projects from best to worst (e.g.: in terms of preference) and, similarly, each project ranks each developer from best to worst (e.g.: in terms of suitability to the project). In this case, you have an instance of the Stable Marriage problem, which is solved by the Gale-Shapley algorithm. I don't have a pointer to an established library for G-S, but it's simple enough that it seems that lots of people just code their own.
Yes, there are mathematical methods for solving a type of problem which this problem can be shoehorned into. It is the natural consequence of thinking of developers as "resources", like machine parts, largely interchangeable, their individuality easily reduced to simple numerical parameters. You can make up rules such as
The fitness value is equal to the subject skill parameter multiplied by the square root of the reliability index.
and never worry about them again. The same rules can be applied to different developers, different subjects, different scales of projects (with a SLOC scaling factor of, say, 1.5). No insight or real leadership is needed, the equations make everything precise and "assured". The best thing about this approach is that when the resources fail to perform the way your equations say they should, you can just reduce their performance scores to make them fit. And if someone has already written the tool, then you don't even have to worry about the math.
(It is interesting to note that Resource Management people always seem to impose such metrics on others in an organization -- thereby making their own jobs easier-- and never on themselves...)