When to switch from Dynamic Programming (2D table) to Branch & Bound algorithm? - optimization

I'm doing a knapsack optimization problem involving dynamic programming and branch & bound. I noticed that when the capacity and the number of items get large, filling up the 2D table for the dynamic programming algorithm gets dramatically slower. At some point, am I supposed to switch algorithms depending on the size of the problem (since the lecture has given two types of optimization)?
I've tried to google at what point (what size) I should switch from dynamic programming to branch & bound, but I couldn't find the answer I wanted.
Or, is there another way of looking at the knapsack problem in which I can combine dynamic programming and branch & bound into one algorithm, instead of switching algorithms depending on the size of the problem?
Thanks.

Often when you have several algorithms that solve a problem but whose runtimes have different characteristics, you determine (empirically, not theoretically) when one algorithm becomes faster than the other. This is highly implementation- and machine-dependent. So measure both the DP algorithm and the B&B algorithm and figure out which one is better when.
A couple of hints:
You know that DP's runtime is proportional to the number of objects times the size of the knapsack (see the sketch after these hints).
You know that B&B's runtime can be as bad as 2^(number of objects), but it's typically much better. Try to figure out the worst case.
Caches and stuff are going to matter for DP, so you'll want to estimate its runtime piecewise. Figure out where the breakpoints are.
DP takes up exorbitant amounts of memory. If it's going to take up too much, you don't really have a choice between DP and B&B.
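For concreteness, here is a minimal sketch of the 2D-table DP the first hint refers to (my own illustrative code and names, not from the lecture); the nested loops make the O(number of objects x capacity) runtime and memory visible:

    # Minimal 0/1-knapsack DP sketch; table[i][w] = best value using the
    # first i items with capacity w. Runtime and memory are O(n * W).
    def knapsack_dp(values, weights, capacity):
        n = len(values)
        table = [[0] * (capacity + 1) for _ in range(n + 1)]
        for i in range(1, n + 1):
            for w in range(capacity + 1):
                table[i][w] = table[i - 1][w]          # skip item i-1
                if weights[i - 1] <= w:                # or take it, if it fits
                    table[i][w] = max(table[i][w],
                                      table[i - 1][w - weights[i - 1]]
                                      + values[i - 1])
        return table[n][capacity]

    print(knapsack_dp([60, 100, 120], [10, 20, 30], 50))   # -> 220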


Information about CGAL and alternatives

I'm working on a problem that will eventually run on an embedded microcontroller (ESP8266). I need to perform some fairly simple operations on linear equations. I don't need much, but I do need to be able to work with points and linear equations to:
Define equations for lines, either from two known points or from one point and a gradient
Calculate a new x,y point on an equation line that is a specific distance from another point on that equation line
Drop a perpendicular onto an equation line from a point
Perform variations of cosine-rule calculations on points and triangle sides defined as equations
I roughed up some code for this a while ago based on high-school "y = mx + c" concepts, but it's flawed (it fails with infinities when lines are vertical), and it's currently in Scala. Since I suspect I'm reinventing a wheel, and this isn't my primary goal, I'd like to use someone else's work for this!
I've come across CGAL, and it seems very likely that it's capable of all this and more, but I have two questions about it (given that it seems to take ages to understand a huge library like this well enough to answer even simple questions!):
It seems to assert some kind of mathematical perfection in its calculations, but that's not important to me, and my system will be severely memory-constrained. Does it use/offer memory-efficient approximations?
Is it possible (and hopefully easy) to separate out just a limited subset of features, or am I going to find the entire library (or even a very large subset) heading into my memory limited machine?
And, I suppose the inevitable follow up: are there more suitable libraries I'm unaware of?
TIA!
The problems you are mentioning sound fairly simple indeed, so I'm wondering if you really need any library at all. Maybe if you post your original code we could help you fix it; it sounds like you need to redo a calculation in a way that avoids a division by zero.
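A hedged sketch of the usual fix (my own code, not the poster's Scala): represent a line as a*x + b*y = c instead of y = mx + c, so vertical lines (b == 0) need no special case:

    import math

    def line_through(p, q):
        """Line a*x + b*y = c through points p and q, with (a, b) normalized."""
        (x1, y1), (x2, y2) = p, q
        a, b = y2 - y1, x1 - x2          # normal vector of the segment
        norm = math.hypot(a, b)
        a, b = a / norm, b / norm
        return a, b, a * x1 + b * y1

    def foot_of_perpendicular(line, p):
        """Project point p onto the line (drop a perpendicular)."""
        a, b, c = line
        x, y = p
        d = a * x + b * y - c            # signed distance, since (a, b) is unit
        return x - d * a, y - d * b

    vertical = line_through((2, 0), (2, 5))         # no infinities here
    print(foot_of_perpendicular(vertical, (7, 3)))  # -> (2.0, 3.0)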
As for your point (2) about separating a limited number of features from CGAL: given the size and the coding style of that project, in my experience that will be significantly more complicated (if at all possible) than fixing your own code.
In case you want to try a simpler library than CGAL, you could try Boost.Geometry.
Regards,

Determining hopeless branches early in branch-and-bound algorithms

I have to design a branch-and-bound algorithm that solves for the optimal tour of a graph on the Cartesian plane every time. I have been given the hint that identifying hopeless branches earlier in the runtime will compound into a program that runs "a hundred times faster". I had the idea of assuming that the shortest edge connected to the starting/ending node will be either the first or last edge in the tour, but a thin diamond-shaped graph proves otherwise. Does anyone have ideas for how to eliminate these hopeless branches, or a reference that talks about this?
Basically, is there a better way to branch into subsets of solutions than just lexicographically? E.g., the first branch includes and excludes edge a-b, the second branch includes and excludes edge a-c.
So somewhere in your branch-and-bound algorithm, you look at possible places to go, and then somehow keep track of them to do later.
To make this more efficient, you can do a couple things:
Write a better bound calculator. In other words, come up with an algorithm that determines the bound more accurately. This will result in less time spent on paths that turn out to be poor.
Instead of using a stack to keep track of things to do, use a queue; better yet, use a priority queue (heap) ordered by bound, so the subproblems that seem best sit at the top of the heap and the ones that seem bad sink to the bottom, as in the sketch below.
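As a concrete illustration (my own sketch, using 0/1 knapsack rather than the asker's TSP, and a fractional-relaxation bound of my choosing), best-first branch and bound with a heap looks like this:

    import heapq

    def fractional_bound(values, weights, capacity, i, value, weight):
        """Optimistic bound: fill remaining capacity greedily, allowing fractions."""
        room = capacity - weight
        bound = value
        for v, w in zip(values[i:], weights[i:]):
            if w <= room:
                room -= w
                bound += v
            else:
                bound += v * room / w
                break
        return bound

    def knapsack_bb(values, weights, capacity):
        # Sort by value density so the fractional bound is valid.
        order = sorted(range(len(values)), key=lambda j: -values[j] / weights[j])
        values = [values[j] for j in order]
        weights = [weights[j] for j in order]
        best = 0
        # Heap entries: (-bound, next item index, value so far, weight so far).
        heap = [(-fractional_bound(values, weights, capacity, 0, 0, 0), 0, 0, 0)]
        while heap:
            neg_bound, i, value, weight = heapq.heappop(heap)
            if -neg_bound <= best or i == len(values):
                continue  # hopeless branch: its bound can't beat the incumbent
            # Branch 1: take item i (if it fits).
            if weight + weights[i] <= capacity:
                v2, w2 = value + values[i], weight + weights[i]
                best = max(best, v2)
                heapq.heappush(heap, (-fractional_bound(values, weights, capacity,
                                                        i + 1, v2, w2), i + 1, v2, w2))
            # Branch 2: skip item i.
            heapq.heappush(heap, (-fractional_bound(values, weights, capacity,
                                                    i + 1, value, weight),
                                  i + 1, value, weight))
        return best

    print(knapsack_bb([60, 100, 120], [10, 20, 30], 50))   # -> 220

A tighter bound function is exactly hint (1): the better it is, the more branches the `-neg_bound <= best` test prunes.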
Nearest-neighbor is a simple algorithm. Branch-and-bound is just an optimizing loop; additionally, you need a sub-problem solver. I think nearest-neighbor is itself a branch-and-bound algorithm. Instead, I would look into the simplex algorithm, which is a linear programming algorithm, and also the cutting-plane algorithm for solving TSP.

Optimization algorithm question

This may be a simple question for those in the know, but I cannot figure it out by myself.
Suppose there is a large number of objects, and I need to select some of them. Each object has two known variables: cost and benefit. I have a budget, say $1000. How can I find out which objects I should buy to maximize the total benefit within the given budget? I want a numeric optimization solution. Thanks!
Your problem is called the "knapsack problem". You can read more on the wikipedia page. Translating the nomenclature from your original question into that of the wikipedia article, your problem's "cost" is the knapsack problem's "weight". Your problem's "benefit" is the knapsack problem's "value".
Finding an exact solution is an NP-complete problem, so be prepared for slow results if you have a lot of objects to choose from!
You might also look into Linear Programming. From MathWorld:
Simplistically, linear programming is the optimization of an outcome based on some set of constraints using a linear mathematical model.
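For reference, here is the budget problem written as the kind of (integer) linear program the quote alludes to, with x_i = 1 meaning "buy object i" (a sketch of the formulation only, not tied to any particular solver):

    maximize    sum over i of  benefit_i * x_i
    subject to  sum over i of  cost_i * x_i <= 1000
                x_i in {0, 1} for every object i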
Yes, as stated before, this is the knapsack problem, and I would choose to use dynamic programming.
The key to this problem is storing data so that you do not need to recompute things more than once (if enough memory is available). There are two general ways to go about dynamic programming: top-down and bottom-up. This one is a bottom-up problem.
(In general) Find base-case values: what is the best object to select for a small budget? Then build on this. If we allow ourselves to spend a little more money, what is the best combination of objects for that small increment? Possibilities include taking more of what you previously had, taking one new object and replacing an old one, taking another small object that still keeps you under your budget, etc.
Like I said, the main idea is to not recompute values. If you follow this pattern, you will work up to a high budget and find that, in order to buy X dollars' worth of goods, the best solution combines what you had for smaller cases; a minimal sketch follows.
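A minimal sketch of that bottom-up pattern (my own code and names), where best[b] is the maximum benefit achievable with budget b, built from smaller budgets so nothing is recomputed:

    def max_benefit(costs, benefits, budget):
        best = [0] * (budget + 1)
        for cost, benefit in zip(costs, benefits):   # 0/1: each object bought once
            for b in range(budget, cost - 1, -1):    # descending avoids reusing an object
                best[b] = max(best[b], best[b - cost] + benefit)
        return best[budget]

    print(max_benefit([300, 400, 500], [60, 80, 120], 1000))   # -> 200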

Looking for ideas/references/keywords: adaptive-parameter-control of a search algorithm (online-learning)

I'm looking for ideas/experiences/references/keywords regarding adaptive parameter control of search-algorithm parameters (online learning) in combinatorial optimization.
A bit more detail:
I have a framework which is responsible for optimizing a hard combinatorial optimization problem. This is done with the help of some "small heuristics" which are used in an iterative manner (large neighborhood search; ruin-and-recreate approach). Each of these "small heuristics" takes some external parameters which control the heuristic's logic to some extent (at the moment: just random values; some kind of noise; to diversify the search).
Now I want to have a control framework for choosing these parameters in a convergence-improving way, as general as possible, so that new heuristics can later be added without changing the parameter control.
There are at least two general decisions to make:
A: Choose the algorithm-pair (one destroy- and one rebuild-algorithm) which is used in the next iteration.
B: Choose the random parameters of the algorithms.
The only feedback is an evaluation-function of the new-found-solution. That leads me to the topic of reinforcement-learning. Is that the right direction?
My simplistic ideas at the moment (not really learning-like behavior) are:
A: Roulette-wheel selection according to some performance value collected during the iterations (the near past is valued more than older iterations); see the sketch after this list.
So if heuristic 1 found all the new global best solutions -> high probability of choosing this one.
B: No idea yet. Maybe it's possible to use some non-uniform random values in the range (0,1) while collecting some momentum of the changes.
So if heuristic 1 last used alpha = 0.3 and found no new best solution, then used 0.6 and found a new best solution -> there is momentum towards 1
-> the next random value is likely to be bigger than 0.3. Possible problem: oscillation!
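For idea (A), a minimal sketch (my own class and names, purely illustrative) of roulette-wheel selection with exponentially decayed scores, so the near past counts more than older iterations:

    import random

    class RouletteSelector:
        def __init__(self, n_heuristics, decay=0.9):
            self.scores = [1.0] * n_heuristics   # optimistic start: all equal
            self.decay = decay

        def pick(self):
            total = sum(self.scores)
            r = random.uniform(0, total)
            acc = 0.0
            for i, s in enumerate(self.scores):
                acc += s
                if r <= acc:
                    return i
            return len(self.scores) - 1

        def feedback(self, i, reward):
            # Decay everything, then credit the heuristic that just ran.
            self.scores = [s * self.decay for s in self.scores]
            self.scores[i] += reward

    sel = RouletteSelector(3)
    for _ in range(100):
        h = sel.pick()
        reward = 1.0 if h == 1 else 0.1   # pretend heuristic 1 works best
        sel.feedback(h, reward)
    print(sel.scores)   # heuristic 1 should dominate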
Things to note:
- The parameters needed for good convergence of one specific algorithm can change dramatically -> maybe more diversifying operations are needed at the beginning and more intensifying operations at the end.
- There is a possibility of good synergistic effects in a specific pair of destroy/rebuild algorithms (sometimes called coupled neighborhoods). How would one recognize something like that? Is that still in the reinforcement-learning area?
- The different algorithms are controlled by different numbers of parameters (some taking 1, some taking 3).
Any ideas, experiences, references (papers), keywords (ml-topics)?
If you have ideas regarding decision (B) in an offline-learning manner, don't hesitate to mention them.
Thanks for all your input.
Sascha
You have a set of parameter variables which you use to control your set of algorithms. Selection of your algorithms is just another variable.
One approach you might like to consider is to evolve your 'parameter space' using a genetic algorithm. In short, GA uses an analogue of the processes of natural selection to successively breed ever better solutions.
You will need to develop an encoding scheme to represent your parameter space as a string, and then create a large population of candidate solutions as your starting generation. The genetic algorithm itself takes the fittest solutions in your set and then applies various genetic operators to them (mutation, reproduction etc.) to breed a better set which then become the next generation.
The most difficult part of this process is developing an appropriate fitness function: something to quantitatively measure the quality of a given parameter space. Your search problem may be too complex to measure for each candidate in the population, so you will need a proxy model function which might be as hard to develop as the ideal solution itself.
Without understanding more of what you've written it's hard to see whether this approach is viable or not. GA is usually well suited to multi-variable optimisation problems like this, but it's not a silver bullet. For a reference start with Wikipedia.
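To make that loop concrete, here is a minimal, hedged sketch (my own toy encoding: a real-valued vector in [0,1]^dim; the fitness function is a stand-in you would replace with your evaluation):

    import random

    def evolve(fitness, dim, pop_size=30, generations=50):
        pop = [[random.random() for _ in range(dim)] for _ in range(pop_size)]
        for _ in range(generations):
            scored = sorted(pop, key=fitness, reverse=True)
            elite = scored[: pop_size // 2]           # keep the fittest half
            children = []
            while len(elite) + len(children) < pop_size:
                a, b = random.sample(elite, 2)
                child = [(x + y) / 2 for x, y in zip(a, b)]        # crossover
                child = [min(1, max(0, g + random.gauss(0, 0.1)))  # mutation
                         for g in child]
                children.append(child)
            pop = elite + children
        return max(pop, key=fitness)

    # Toy fitness: the closer the parameters are to 0.7, the better.
    best = evolve(lambda p: -sum((g - 0.7) ** 2 for g in p), dim=3)
    print(best)   # should be near [0.7, 0.7, 0.7]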
What you're trying to do sounds like hyper-heuristics. Try looking for that keyword.
In Drools Planner (open source, Java) I have support for tabu search and simulated annealing out of the box.
I haven't implemented the ruin-and-recreate approach (yet), but that should be easy, although I am not expecting better results. Challenge: prove me wrong, fork it, add it, and beat me in the examples.
Hyper heuristics are on my TODO list.

What statistics concepts are useful for profiling?

I've been meaning to do a little bit of brushing up on my knowledge of statistics. One area where it seems like statistics would be helpful is in profiling code. I say this because it seems like profiling almost always involves me trying to pull some information from a large amount of data.
Are there any subjects in statistics that I could brush up on to get a better understanding of profiler output? Bonus points if you can point me to a book or other resource that will help me understand these subjects better.
I'm not sure books on statistics are that useful when it comes to profiling. Running a profiler should give you a list of functions and the percentage of time spent in each. You then look at the one that took the most, percentage-wise, and see if you can optimise it in any way. Repeat until your code is fast enough. Not much scope for standard deviation or chi-squared there, I feel.
All I know about profiling is what I just read in Wikipedia :-) but I do know a fair bit about statistics. The profiling article mentioned sampling and statistical analysis of sampled data. Clearly statistical analysis will be able to use those samples to develop some statistical statements on performance. Let's say you have some measure of performance, m, and you sample that measure 1000 times. Let's also say you know something about the underlying processes that created that value of m. For instance, if m is the SUM of a bunch of random variates, the distribution of m is probably normal. If m is the PRODUCT of a bunch of random variates, the distribution is probably lognormal. And so on...
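A quick empirical sketch (my own, not from the answer) of that sum-vs-product claim: sums of random variates come out roughly normal (near-zero skew), while products come out lognormal (right-skewed, but symmetric again after taking logs):

    import math
    import random

    def skew(data):
        m = sum(data) / len(data)
        s2 = sum((x - m) ** 2 for x in data) / len(data)
        s3 = sum((x - m) ** 3 for x in data) / len(data)
        return s3 / s2 ** 1.5

    terms = [[random.uniform(0.5, 1.5) for _ in range(50)] for _ in range(10_000)]
    sums = [sum(t) for t in terms]
    prods = [math.prod(t) for t in terms]

    print("skew of sums:         ", round(skew(sums), 2))    # ~ 0 (normal-ish)
    print("skew of products:     ", round(skew(prods), 2))   # clearly positive
    print("skew of log(products):", round(skew([math.log(p) for p in prods]), 2))  # ~ 0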
If you don't know the underlying distribution and you want to make some statement about comparing performance, you may need what are called non-parametric statistics.
Overall, I'd suggest any standard text on statistical inference (DeGroot), a text that covers different probability distributions and where they're applicable (Hastings & Peacock), and a book on non-parametric statistics (Conover). Hope this helps.
Statistics is fun and interesting, but for performance tuning, you don't need it. Here's an explanation why, but a simple analogy might give the idea.
A performance problem is like an object (which may actually be multiple connected objects) buried under an acre of snow, and you are trying to find it by probing randomly with a stick. If your stick hits it a couple of times, you've found it; its exact size is not so important. (If you really want a better estimate of how big it is, take more probes, but that won't change its size.) The number of times you have to probe the snow before you find it depends on how much of the area of the snow it is under.
Once you find it, you can pull it out. Now there is less snow, but there might be more objects under the snow that remains. So with more probing, you can find and remove those as well. In this way, you can keep going until you can't find anything more that you can remove.
In software, the snow is time, and probing is taking random-time samples of the call stack. In this way, it is possible to find and remove multiple problems, resulting in large speedup factors.
And statistics has nothing to do with it.
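A minimal sketch of that sampling idea (my own toy code, relying on CPython's sys._current_frames(); real tools do this more carefully): a background thread grabs the main thread's call stack at random times, and the functions that show up in many samples are the "objects under the snow":

    import collections
    import random
    import sys
    import threading
    import time
    import traceback

    samples = collections.Counter()

    def sampler(main_id, n=50):
        for _ in range(n):
            time.sleep(random.uniform(0.001, 0.02))
            frame = sys._current_frames()[main_id]
            stack = traceback.extract_stack(frame)
            samples[tuple(f.name for f in stack)] += 1

    def slow():          # the "object under the snow"
        time.sleep(0.3)

    def work():
        slow()
        sum(range(10**6))

    t = threading.Thread(target=sampler, args=(threading.main_thread().ident,))
    t.start()
    work()
    t.join()
    for stack, count in samples.most_common(3):
        print(count, " -> ".join(stack))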
Zed Shaw, as usual, has some thoughts on the subject of statistics and programming, but he puts them much more eloquently than I could.
I think that the most important statistical concept to understand in this context is Amdahl's law. Although commonly referred to in contexts of parallelization, Amdahl's law has a more general interpretation. Here's an excerpt from the Wikipedia page:
More technically, the law is concerned with the speedup achievable from an improvement to a computation that affects a proportion P of that computation, where the improvement has a speedup of S. (For example, if an improvement can speed up 30% of the computation, P will be 0.3; if the improvement makes the portion affected twice as fast, S will be 2.) Amdahl's law states that the overall speedup of applying the improvement will be 1 / ((1 - P) + P / S).
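In code, with the quote's own numbers (P = 0.3, S = 2):

    # Amdahl's law: overall speedup when a proportion p of the computation
    # is sped up by a factor s.
    def amdahl(p, s):
        return 1 / ((1 - p) + p / s)

    print(amdahl(0.3, 2))   # -> ~1.176: 30% of the work made twice as fast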
I think one concept related to both statistics and profiling (your original question) that is very useful, and that you see advised from time to time, applies when doing "micro-profiling". A lot of programmers will rally and yell "you can't micro-profile; micro-profiling simply doesn't work; too many things can influence your computation".
Yet simply run your measurement n times and keep only x% of your observations, the ones around the median, because the median is a "robust statistic" (contrary to the mean) that is not influenced by outliers, and outliers are precisely the values you want to leave out of such profiling.
This is definitely a very useful statistical technique for programmers who want to micro-profile their code; a sketch follows.
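A minimal sketch of that recipe (my own helper, illustrative only): time the snippet n times, sort, and average only the middle x% of observations around the median:

    import time

    def micro_profile(fn, n=101, keep=0.5):
        times = []
        for _ in range(n):
            start = time.perf_counter()
            fn()
            times.append(time.perf_counter() - start)
        times.sort()
        lo = int(n * (1 - keep) / 2)      # trim (1 - keep)/2 on each side
        middle = times[lo : n - lo]       # the observations around the median
        return sum(middle) / len(middle)

    print(micro_profile(lambda: sum(range(100_000))))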
If you apply the MVC programming pattern with PHP, this is what you would need to profile:
Application:
- Controller setup time
- Model setup time
- View setup time
Database:
- Query time
Cookies:
- Name / value
Sessions:
- Name / value