Linear Genetic Programming - Error Landscape Issue

I'm exploring the world of linear genetic programming and I find myself stuck on one issue. It seems to me that the error landscape of even the simplest problem is extremely non-smooth. In particular, the error landscape always seems to contain huge gaps of constant error (gaps where the fitness of a solution is just zero). This degrades the evolutionary algorithm to a random search over the space of programs and makes a solution almost impossible to discover. Does anyone out there have an explanation for how people get around this? What am I missing?

The key is not to choose too high a selection pressure. Too high a selection pressure leads to a loss of diversity, which makes it much harder to find a hard-to-reach global optimum. Under weak pressure, unfit individuals also have a chance of creating offspring, which can lead to the discovery of new optima.
Another influence is the mutation step width. If you have a high selection pressure, you should at least ensure that wide mutation steps are also possible, even if they are less likely to occur.
Some even suggest giving the mutation operator the power to reach every part of the search space within a single step: http://www.lehmanns.de/shop/nocategory/3400811-9783826597008-anwendungsorientierter-entwurf-evolutionaerer-algorithmen
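The two knobs mentioned above can be sketched roughly as follows. This is only an illustration under assumptions: tournament selection is one common way to tune selection pressure (it is not named in the answer), and the function names, the `p_stop` parameter and the list-of-instructions program encoding are all hypothetical.

```python
import random


def tournament_select(population, fitness, k, rng=random):
    """Pick the fittest of k randomly drawn individuals.

    A small k means weak selection pressure (k=1 is uniform random choice);
    a large k means strong pressure (k=len(population) always returns the best).
    """
    contestants = rng.sample(population, k)
    return max(contestants, key=fitness)


def mutation_width(max_width, p_stop=0.5, rng=random):
    """Draw how many instructions to change.

    Most draws are small, but every width up to max_width has a nonzero
    probability, so wide steps stay possible even though they are rarer.
    """
    width = 1
    while width < max_width and rng.random() > p_stop:
        width += 1
    return width


def mutate(program, instruction_set, rng=random):
    """Replace a randomly sized subset of instructions in a linear program."""
    child = list(program)
    if not child:
        return child
    width = mutation_width(len(child), rng=rng)
    for i in rng.sample(range(len(child)), width):
        child[i] = rng.choice(instruction_set)
    return child
```

Because `mutation_width` can return the full program length, a single call to `mutate` can in principle rewrite every instruction, which is one way to let the operator reach any program of the same length in one step.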

Information about CGAL and alternatives

I'm working on a problem that will eventually run in an embedded microcontroller (ESP8266). I need to perform some fairly simple operations on linear equations. I don't need much, but I do need to be able to work with points and linear equations to:
Define an equation for a line either from two known points, or from one point and a gradient
Calculate a new x,y point on a line that is a specific distance from another point on that line
Drop a perpendicular onto a line from a point
Perform variations of cosine-rule calculations on points and triangle sides defined as equations
I roughed out some code for this a while ago based on high school "y = mx + c" concepts, but it's flawed (it fails with infinities when lines are vertical), and it's currently in Scala. Since I suspect I'm reinventing a wheel that's not my primary goal, I'd like to use someone else's work for this!
I've come across CGAL, and it seems very likely it's capable of all this and more, but I have two questions about it (given that it seems to take ages to get enough understanding of this kind of huge library to actually be able to answer simple questions!)
It seems to assert some kind of mathematical perfection in its calculations, but that's not important to me, and my system will be severely memory constrained. Does it use/offer memory-efficient approximations?
Is it possible (and hopefully easy) to separate out just a limited subset of features, or am I going to find the entire library (or even a very large subset) heading into my memory-limited machine?
And, I suppose the inevitable follow up: are there more suitable libraries I'm unaware of?
TIA!
The problems that you are mentioning sound fairly simple indeed, so I'm wondering if you really need any library at all. Maybe if you post your original code we could help you fix it--your problem sounds like you need to redo a calculation avoiding a division by zero.
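As a rough sketch of how the division by zero can be avoided, one option is to store a line in the implicit form a*x + b*y + c = 0 with a unit normal (a, b) instead of "y = mx + c". Vertical lines then need no special case. The names below are made up for illustration, and a direction vector stands in for the gradient (a gradient m corresponds to the direction (1, m), a vertical line to (0, 1)).

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Line:
    """Line a*x + b*y + c = 0, with (a, b) normalised to unit length."""
    a: float
    b: float
    c: float


def line_from_points(p, q):
    """Line through two distinct points p and q."""
    dx, dy = q[0] - p[0], q[1] - p[1]
    n = math.hypot(dx, dy)
    if n == 0.0:
        raise ValueError("the two points must be distinct")
    a, b = -dy / n, dx / n  # unit normal, perpendicular to the direction
    return Line(a, b, -(a * p[0] + b * p[1]))


def line_from_point_direction(p, d):
    """Line through p along direction d = (dx, dy)."""
    return line_from_points(p, (p[0] + d[0], p[1] + d[1]))


def foot_of_perpendicular(line, p):
    """Point on the line closest to p."""
    signed_dist = line.a * p[0] + line.b * p[1] + line.c
    return (p[0] - signed_dist * line.a, p[1] - signed_dist * line.b)


def point_at_distance(line, p, dist):
    """Point on the line at signed distance dist from p (p assumed on the line).

    The unit direction of the line is (b, -a); a negative dist walks the other way.
    """
    return (p[0] + dist * line.b, p[1] - dist * line.a)
```

The same few multiplications and additions translate directly to C or C++ for the ESP8266; no division by a gradient is involved anywhere except the one-off normalisation.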
As for your point (2) about separating a limited number of features from CGAL: given the size and the coding style of that project, in my experience that will be significantly more complicated (if at all possible) than fixing your own code.
In case you want to try a simpler library than CGAL, maybe you could try Boost.Geometry.
Regards,

Correcting SLAM drift error using GPS measurements

I'm trying to figure out how to correct drift errors introduced by a SLAM method using GPS measurements. I have two point sets in Euclidean 3D space taken at fixed moments in time:
The red dataset comes from GPS and contains no drift errors, while the blue dataset is based on a SLAM algorithm and drifts over time.
The idea is that SLAM is accurate over short distances but eventually drifts, while GPS is accurate over long distances and inaccurate over short ones. So I would like to figure out how to fuse the SLAM data with GPS in a way that takes the best accuracy of both measurements. Or at least, how should I approach this problem?
Since your GPS looks like it is very locally biased, I'm assuming it is low-cost and doesn't use any correction techniques, e.g. that it is not differential. As you probably are aware, GPS errors are not Gaussian. The authors of this paper show that a good way to model GPS noise is as v+eps, where v is a locally constant "bias" vector (it is usually constant for a few meters, and then changes more or less smoothly or abruptly) and eps is Gaussian noise.
Given this information, one option would be to use Kalman-based fusion, e.g. you add the GPS noise and bias to the state vector, define your transition equations appropriately and proceed as you would with an ordinary EKF. Note that if we ignore the prediction step of the Kalman filter, this is roughly equivalent to minimizing an error function of the form
measurement_constraints + some_weight * GPS_constraints
and that gives you a more straightforward, second option. For example, if your SLAM is visual, you can just use the sum of squared reprojection errors (i.e. the bundle adjustment error) as the measurement constraints, and define your GPS constraints as ||x - x_{gps}||, where the x are 2D or 3D GPS positions (you might want to ignore the altitude with low-cost GPS).
If your SLAM is visual and feature-point based (you didn't really say what type of SLAM you were using, so I assume the most widespread type), then fusion with either of the methods above can lead to "inlier loss": you make a sudden, violent correction, which increases the reprojection errors, so you lose inliers in SLAM's tracking and have to re-triangulate points, and so on. Also note that even though the paper I linked to above presents a model of the GPS errors, it is not a very accurate model, and assuming that the distribution of GPS errors is unimodal (necessary for the EKF) seems a bit adventurous to me.
So I think a good option is to use barrier-term optimization. The idea is this: since you don't really know how to model GPS errors, assume that you have more confidence in SLAM locally, and minimize a function S(x) that captures the quality of your SLAM reconstruction. Denote by x_opt the minimizer of S. Then fuse with the GPS data as long as this does not degrade S beyond a given threshold above S(x_opt). Mathematically, you'd want to minimize
some_coef/(thresh - S(x)) + ||x - x_{gps}||
and you'd initialize the minimization with x_opt. A good choice for S is the bundle adjustment error, since by not degrading it, you prevent inlier loss. There are other choices of S in the literature, but they are usually meant to reduce computational time and add little in terms of accuracy.
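To make the barrier-term idea above concrete, here is a toy sketch. Everything in it is an assumption made for illustration: a real S would be the bundle adjustment error over many poses, whereas the test case suggested below uses a simple quadratic; the finite-difference gradient and backtracking step are just one simple way to minimize the cost, not the method of the paper mentioned in this answer.

```python
import math


def barrier_cost(x, S, x_gps, thresh, coef):
    """coef / (thresh - S(x)) + ||x - x_gps||, infinite outside the barrier."""
    s = S(x)
    if s >= thresh:
        return math.inf
    return coef / (thresh - s) + math.dist(x, x_gps)


def numerical_gradient(f, x, h=1e-6):
    """Central finite-difference gradient of f at x."""
    grad = []
    for i in range(len(x)):
        xp, xm = list(x), list(x)
        xp[i] += h
        xm[i] -= h
        grad.append((f(xp) - f(xm)) / (2.0 * h))
    return grad


def minimize_with_barrier(S, x_opt, x_gps, thresh, coef=1e-3, iters=200):
    """Gradient descent with backtracking, started at the SLAM optimum x_opt.

    A step is only accepted if it lowers the cost, so the iterate can never
    cross the barrier S(x) = thresh.
    """
    def f(x):
        return barrier_cost(x, S, x_gps, thresh, coef)

    x = list(x_opt)
    for _ in range(iters):
        fx = f(x)
        g = numerical_gradient(f, x)
        step, improved = 1.0, False
        while step > 1e-12:
            cand = [xi - step * gi for xi, gi in zip(x, g)]
            if f(cand) < fx:
                x, improved = cand, True
                break
            step *= 0.5
        if not improved:
            break
    return x
```

With a hypothetical quadratic S centred on the SLAM estimate, the result moves from x_opt towards the GPS position and stops short of the point where S would exceed the threshold.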
This, unlike the EKF, does not have a nice probabilistic interpretation, but produces very nice results in practice (I have used it for fusion with other things than GPS too, and it works well). You can for example see this excellent paper that explains how to implement this thoroughly, how to set the threshold, etc.
Hope this helps. Please don't hesitate to tell me if you find inaccuracies/errors in my answer.

Finding best path through strongly connected component

I have a directed graph which is strongly connected, and every node has a price (positive or negative). I would like to find the best (highest-score) path from node A to node B. My solution is a kind of brute force, so it takes ages to find that path. Is there an algorithm for this, or any idea how I can do it?
Have you tried the A* algorithm?
It's a fairly popular pathfinding algorithm.
The algorithm itself is not too difficult to implement, and there are plenty of implementations available online.
Dijkstra's algorithm is a special case of A* (in which the heuristic function h(x) = 0).
There are other algorithms that can outperform it, but they usually require graph pre-processing. If the problem is not too complex and you're looking for a quick solution, give it a try.
EDIT:
For graphs containing negative edges, there's the Bellman–Ford algorithm. Detecting negative cycles comes at the cost of performance, though (it is slower than A*). But it may still be better than what you're currently using.
EDIT 2:
User #templatetypedef is right when he says the Bellman-Ford algorithm may not work here.
Bellman-Ford works with graphs that have edges with negative weight. However, the algorithm stops upon finding a negative cycle. I believe that is useful behavior: optimizing the shortest path in a graph that contains a cycle of negative weight would be like going down a Penrose staircase.
What should happen if there's the possibility of reaching a path with "minus infinity cost" depends on the problem.
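For reference, a minimal sketch of Bellman-Ford with the negative-cycle check described above. The edge-list format and function name are assumptions. To apply it to the "highest score with node prices" question, one possible encoding (also an assumption) is to give every edge entering node v the weight -price(v); a cycle with positive total price then shows up as a negative cycle, which is exactly the case where the function refuses to answer.

```python
import math


def bellman_ford(num_nodes, edges, source):
    """Shortest distances from source, or ValueError on a reachable negative cycle.

    edges is a list of (u, v, weight) tuples with nodes numbered 0..num_nodes-1.
    """
    dist = [math.inf] * num_nodes
    dist[source] = 0.0

    # Relax all edges up to num_nodes - 1 times; stop early if nothing changes.
    for _ in range(num_nodes - 1):
        changed = False
        for u, v, w in edges:
            if dist[u] + w < dist[v]:
                dist[v] = dist[u] + w
                changed = True
        if not changed:
            break

    # If any edge can still be relaxed, a negative cycle is reachable.
    for u, v, w in edges:
        if dist[u] + w < dist[v]:
            raise ValueError("negative cycle reachable from source")
    return dist
```

The running time is O(V * E), which is the performance cost mentioned above compared with Dijkstra or A*.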

Stages in genetic programming

When evolving a genetic program, how is the required time distributed between different stages in development? I mean: Is 90 percent of the time devoted to becoming a little bit better than random programs, after which improving the program to the final version is not very computation-intensive?
Most metaheuristics (including genetic algorithms, I think) progress like the green and red lines in this image. They try to reach the best score as fast as possible, and it gets harder and harder to find a better score.
However, some (like simulated annealing, the blue line) can be told the amount of time they'll be given and behave differently based on that. In that case you can get a more linear curve.
Generally progress is quicker earlier, with progress slowing in later generations. But it does depend on the nature of the problem. Why not test it out on a few different problems and plot the progress?
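A quick way to try this is to record the best fitness per generation and look at the curve. The sketch below is a stand-in, not genetic programming proper: it assumes a very simple evolutionary loop (one parent, one mutated child per generation) on the toy OneMax problem, and all names are hypothetical.

```python
import random


def one_max(bits):
    """Toy fitness: the number of 1-bits in the string."""
    return sum(bits)


def run_and_record(n_bits=50, generations=200, seed=0):
    """Run a one-parent, one-child evolutionary loop and log fitness per generation."""
    rng = random.Random(seed)
    parent = [rng.randint(0, 1) for _ in range(n_bits)]
    history = [one_max(parent)]
    for _ in range(generations):
        # Flip each bit independently with probability 1 / n_bits.
        child = [b ^ 1 if rng.random() < 1.0 / n_bits else b for b in parent]
        if one_max(child) >= one_max(parent):
            parent = child
        history.append(one_max(parent))
    return history
```

Plotting `history` against the generation index (with any plotting tool) shows how the gains are distributed over the run for this particular problem; swapping in your own fitness function and variation operators gives the corresponding curve for your setup.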
An approximate indication of this can be the size of the program. If the program size becomes stable but you notice that fitness is still improving, then most likely all random programs have been weeded out. The fitness improvement can then be attributed to minor numerical changes in, say, the coefficients.

Genetic/Evolutionary algorithms and local minima/maxima

I have run across several posts and articles that suggest using things like simulated annealing to avoid the local minima/maxima problem.
I don't understand why this would be necessary if you started out with a sufficiently large random population.
Is it just another check to ensure that the initial population was, in fact, sufficiently large and random? Or are those techniques just an alternative to producing a "good" initial population?
Simulated annealing is a probabilistic optimization technique -- it's not supposed to give you more precise answers, it's supposed to give you approximations faster.
Simulated annealing is a probabilistic technique in which the chance of getting trapped in a local minimum/maximum depends on the temperature schedule, and a suitable schedule differs from problem to problem. An evolutionary algorithm is much more robust and less likely to get trapped in local minima/maxima: it uses mutation, which introduces a random walk in the search space, and that is why an EA has a higher probability of reaching the global optimum.
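To show where the temperature schedule enters, here is a minimal simulated annealing sketch for minimization. The geometric cooling schedule, the parameter names and the defaults are assumptions picked for illustration, not a recommended setting.

```python
import math
import random


def simulated_annealing(cost, x0, neighbour, t_start=1.0, t_end=1e-3,
                        alpha=0.95, steps_per_temp=50, seed=0):
    """Minimise cost, starting from x0, and return the best point seen.

    neighbour(x, rng) proposes a candidate near x. A worse candidate is
    accepted with probability exp(-delta / t), so a high temperature t lets
    the search climb out of local minima and a low t makes it greedy.
    """
    rng = random.Random(seed)
    x = best = x0
    t = t_start
    while t > t_end:
        for _ in range(steps_per_temp):
            cand = neighbour(x, rng)
            delta = cost(cand) - cost(x)
            if delta <= 0 or rng.random() < math.exp(-delta / t):
                x = cand
                if cost(x) < cost(best):
                    best = x
        t *= alpha  # geometric cooling: the "temperature schedule"
    return best
```

Cooling too fast (a small `alpha`) makes the loop behave like plain hill climbing, which is where the risk of getting trapped comes from; cooling slowly costs more evaluations.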
First of all, simulated annealing is a last resort method. There are far better, more efficient, and more effective methods of discovering where the local minima are found.
A better check would be to use a statistical method to uncover information about your data set such as variance or standard deviation.