Is it possible to approximate missing position data by imputation? - gps

I would like to increase the density of my AIS or GPS data in order to carry out more precise analyses afterwards. During my research I came across different approaches like interpolation, filtering or imputation. With the first two approaches, there is no doubt that these can be used to approximate the points between two collected data points.
In the case of imputation (e.g. MICE), however, I have not yet found an approach in the literature for determining position data.
That's why I wanted to ask if anyone knew a paper dealing with this subject and whether it makes sense at all to determine further position data approximately by imputation.

The problem you are describing there is trajectory reconstruction for AIS/GPS data. There's a number of papers for general trajectory reconstruction (see this for example), but AIS data are quite specific.
The irregularity of AIS data is a well know problem with no standard approach to deal with, as far as I know.
However, there is a handful of publications which try to deal with this issue. The problem of reconstruction is connected to the trajectory prediction problem, since both of these two shares some of the methods (the latter is more popular in the scientific community, I think).
Traditionally, AIS trajectory reconstruction is done using some physical models, which take into account the curvature of the earth and other factors, such as data noise (see examples here, here, and here).
More recent approach tries to use LSTM neural networks.
I don't know much about GPS data, but I think the methods are very similar to the ones mentioned above (especially taking into account the fact that you probably want to deal with maritime data).

Related

Correcting SLAM drift error using GPS measurements

I'm trying to figure out how to correct drift errors introduced by a SLAM method using GPS measurements, I have two point sets in euclidian 3d space taken at fixed moments in time:
The red dataset is introduced by GPS and contains no drift errors, while blue dataset is based on SLAM algorithm, it drifts over time.
The idea is that SLAM is accurate on short distances but eventually drifts, while GPS is accurate on long distances and inaccurate on short ones. So I would like to figure out how to fuse SLAM data with GPS in such way that will take best accuracy of both measurements. At least how to approach this problem?
Since your GPS looks like it is very locally biased, I'm assuming it is low-cost and doesn't use any correction techniques, e.g. that it is not differential. As you probably are aware, GPS errors are not Gaussian. The guys in this paper show that a good way to model GPS noise is as v+eps where v is a locally constant "bias" vector (it is usually constant for a few metters, and then changes more or less smoothly or abruptly) and eps is Gaussian noise.
Given this information, one option would be to use Kalman-based fusion, e.g. you add the GPS noise and bias to the state vector, and define your transition equations appropriately and proceed as you would with an ordinary EKF. Note that if we ignore the prediction step of the Kalman, this is roughly equivalent to minimizing an error function of the form
measurement_constraints + some_weight * GPS_constraints
and that gives you a more straigh-forward, second option. For example, if your SLAM is visual, you can just use the sum of squared reprojection errors (i.e. the bundle adjustment error) as the measurment constraints, and define your GPS constraints as ||x- x_{gps}|| where the x are 2d or 3d GPS positions (you might want to ignore the altitude with low-cost GPS).
If your SLAM is visual and feature-point based (you didn't really say what type of SLAM you were using so I assume the most widespread type), then fusion with any of the methods above can lead to "inlier loss". You make a sudden, violent correction, and augment the reprojection errors. This means that you lose inliers in SLAM's tracking. So you have to re-triangulate points, and so on. Plus, note that even though the paper I linked to above presents a model of the GPS errors, it is not a very accurate model, and assuming that the distribution of GPS errors is unimodal (necessary for the EKF) seems a bit adventurous to me.
So, I think a good option is to use barrier-term optimization. Basically, the idea is this: since you don't really know how to model GPS errors, assume that you have more confidance in SLAM locally, and minimize a function S(x) that captures the quality of your SLAM reconstruction. Note x_opt the minimizer of S. Then, fuse with GPS data as long as it does not deteriorate S(x_opt) more than a given threshold. Mathematically, you'd want to minimize
some_coef/(thresh - S(X)) + ||x-x_{gps}||
and you'd initialize the minimization with x_opt. A good choice for S is the bundle adjustment error, since by not degrading it, you prevent inlier loss. There are other choices of S in the litterature, but they are usually meant to reduce computational time and add little in terms of accuracy.
This, unlike the EKF, does not have a nice probabilistic interpretation, but produces very nice results in practice (I have used it for fusion with other things than GPS too, and it works well). You can for example see this excellent paper that explains how to implement this thoroughly, how to set the threshold, etc.
Hope this helps. Please don't hesitate to tell me if you find inaccuracies/errors in my answer.

Encoding invariance for deep neural network

I have a set of data, 2D matrix (like Grey pictures).
And use CNN for classifier.
Would like to know if there is any study/experience on the accuracy impact
if we change the encoding from traditionnal encoding.
I suppose yes, question is rather which transformation of the encoding make the accuracy invariant, which one deteriorates....
To clarify, this concerns mainly the quantization process of the raw data into input data.
EDIT:
Quantize the raw data into input data is already a pre-processing of the data, adding or removing some features (even minor). It seems not very clear the impact in term of accuracy on this quantization process on real dnn computation.
Maybe, some research available.
I'm not aware of any research specifically dealing with quantization of input data, but you may want to check out some related work on quantization of CNN parameters: http://arxiv.org/pdf/1512.06473v2.pdf. Depending on what your end goal is, the "Q-CNN" approach may be useful for you.
My own experience with using various quantizations of the input data for CNNs has been that there's a heavy dependency between the degree of quantization and the model itself. For example, I've played around with using various interpolation methods to reduce image sizes and reducing the color palette size, and in the end, I discovered that each variant required a different tuning of hyper-parameters to achieve optimal results. Generally, I found that minor quantization of data had a negligible impact, but there was a knee in the curve where throwing away additional information dramatically impacted the achievable accuracy. Unfortunately, I'm not aware of any way to determine what degree of quantization will be optimal without experimentation, and even deciding what's optimal involves a trade-off between efficiency and accuracy which doesn't necessarily have a one-size-fits-all answer.
On a theoretical note, keep in mind that CNNs need to be able to find useful, spatially-local features, so it's probably reasonable to assume that any encoding that disrupts the basic "structure" of the input would have a significantly detrimental effect on the accuracy achievable.
In usual practice -- a discrete classification task in classic implementation -- it will have no effect. However, the critical point is in the initial computations for back-propagation. The classic definition depends only on strict equality of the predicted and "base truth" classes: a simple right/wrong evaluation. Changing the class coding has no effect on whether or not a prediction is equal to the training class.
However, this function can be altered. If you change the code to have something other than a right/wrong scoring, something that depends on the encoding choice, then encoding changes can most definitely have an effect. For instance, if you're rating movies on a 1-5 scale, you likely want 1 vs 5 to contribute a higher loss than 4 vs 5.
Does this reasonably deal with your concerns?
I see now. My answer above is useful ... but not for what you're asking. I had my eye on the classification encoding; you're wondering about the input.
Please note that asking for off-site resources is a classic off-topic question category. I am unaware of any such research -- for what little that is worth.
Obviously, there should be some effect, as you're altering the input data. The effect would be dependent on the particular quantization transformation, as well as the individual application.
I do have some limited-scope observations from general big-data analytics.
In our typical environment, where the data were scattered with some inherent organization within their natural space (F dimensions, where F is the number of features), we often use two simple quantization steps: (1) Scale all feature values to a convenient integer range, such as 0-100; (2) Identify natural micro-clusters, and represent all clustered values (typically no more than 1% of the input) by the cluster's centroid.
This speeds up analytic processing somewhat. Given the fine-grained clustering, it has little effect on the classification output. In fact, it sometimes improves the accuracy minutely, as the clustering provides wider gaps among the data points.
Take with a grain of salt, as this is not the main thrust of our efforts.

Is there a common/standard/accepted way to model GPS entities (waypoints, tracks)?

This question somewhat overlaps knowledge on geospatial information systems, but I think it belongs here rather than GIS.StackExchange
There are a lot of applications around that deal with GPS data with very similar objects, most of them defined by the GPX standard. These objects would be collections of routes, tracks, waypoints, and so on. Some important programs, like GoogleMaps, serialize more or less the same entities in KML format. There are a lot of other mapping applications online (ridewithgps, strava, runkeeper, to name a few) which treat this kind of data in a different way, yet allow for more or less equivalent "operations" with the data. Examples of these operations are:
Direct manipulation of tracks/trackpoints with the mouse (including drawing over a map);
Merging and splitting based on time and/or distance;
Replacing GPS-collected elevation with DEM/SRTM elevation;
Calculating properties of part of a track (total ascent, average speed, distance, time elapsed);
There are some small libraries (like GpxPy) that try to model these objects AND THEIR METHODS, in a way that would ideally allow for an encapsulated, possibly language-independent Library/API.
The fact is: this problem is around long enough to allow for a "common accepted standard" to emerge, isn't it? In the other hand, most GIS software is very professionally oriented towards geospatial analyses, topographic and cartographic applications, while the typical trip-logging and trip-planning applications seem to be more consumer-hobbyist oriented, which might explain the quite disperse way the different projects/apps treat and model the problem.
Thus considering everything said, the question is: Is there, at present or being planned, a standard way to model canonicaly, in an Object-Oriented way, the most used GPS/Tracklog entities and their canonical attributes and methods?
There is the GPX schema and it is very close to what I imagine, but it only contains objects and attributes, not methods.
Any information will be very much appreciated, thanks!!
As far as I know, there is no standard library, interface, or even set of established best practices when it comes to storing/manipulating/processing "route" data. We have put a lot of effort into these problems at Ride with GPS and I know the same could be said by the other sites that solve related problems. I wish there was a standard, and would love to work with someone on one.
GPX is OK and appears to be a sort-of standard... at least until you start processing GPX files and discover everyone has simultaneously added their own custom extensions to the format to deal with data like heart rate, cadence, power, etc. Also, there isn't a standard way of associating a route point with a track point. Your "bread crumb trail" of the route is represented as a series of trkpt elements, and course points (e.g. "turn left onto 4th street") are represented in a separate series of rtept elements. Ideally you want to associate a given course point with a specific track point, rather than just giving the course point a latitude and longitude. If your path does several loops over the same streets, it can introduce some ambiguity in where the course points should be attached along the route.
KML and Garmin's TCX format are similar to GPX, with their own pros and cons. In the end these formats really only serve the purpose of transferring the data between programs. They do not address the issue of how to represent the data in your program, or what type of operations can be performed on the data.
We store our track data as an array of objects, with keys corresponding to different attributes such as latitude, longitude, elevation, time from start, distance from start, speed, heart rate, etc. Additionally we store some metadata along the route to specify details about each section. When parsing our array of track points, we use this metadata to split a Route into a series of Segments. Segments can be split, joined, removed, attached, reversed, etc. They also encapsulate the method of trackpoint generation, whether that is by interpolating points along a straight line, or requesting a path representing directions between the endpoints. These methods allow a reasonably straightforward implementation of drag/drop editing and other common manipulations. The Route object can be used to handle operations involving multiple segments. One example is if you have a route composed of segments - some driving directions, straight lines, walking directions, whatever - and want to reverse the route. You can ask each segment to reverse itself, maintaining its settings in the process. At a higher level we use a Map class to wire up the interface, dispatch commands to the Route(s), and keep a series of snapshots or transition functions updated properly for sensible undo/redo support.
Route manipulation and generation is one of the goals. The others are aggregating summary statistics are structuring the data for efficient visualization/interaction. These problems have been solved to some degree by any system that will take in data and produce a line graph. Not exactly new territory here. One interesting characteristic of route data is that you will often have two variables to choose from for your x-axis: time from start, and distance from start. Both are monotonically increasing, and both offer useful but different interpretations of the data. Looking at the a graph of elevation with an x-axis of distance will show a bike ride going up and down a hill as symmetrical. Using an x-axis of time, the uphill portion is considerably wider. This isn't just about visualizing the data on a graph, it also translates to decisions you make when processing the data into summary statistics. Some weighted averages make sense to base off of time, some off of distance. The operations you end up wanting are min, max, weighted (based on your choice of independent var) average, the ability to filter points and perform a filtered min/max/avg (only use points where you were moving, ignore outliers, etc), different smoothing functions (to aid in calculating total elevation gain for example), a basic concept of map/reduce functionality (how much time did I spend between 20-30mph, etc), and fixed window moving averages that involve some interpolation. The latter is necessary if you want to identify your fastest 10 minutes, or 10 minutes of highest average heartrate, etc. Lastly, you're going to want an easy and efficient way to perform whatever calculations you're running on subsets of your trackpoints.
You can see an example of all of this in action here if you're interested: http://ridewithgps.com/trips/964148
The graph at the bottom can be moused over, drag-select to zoom in. The x-axis has a link to switch between distance/time. On the left sidebar at the bottom you'll see best 30 and 60 second efforts - those are done with fixed window moving averages with interpolation. On the right sidebar, click the "Metrics" tab. Drag-select to zoom in on a section on the graph, and you will see all of the metrics update to reflect your selection.
Happy to answer any questions, or work with anyone on some sort of standard or open implementation of some of these ideas.
This probably isn't quite the answer you were looking for but figured I would offer up some details about how we do things at Ride with GPS since we are not aware of any real standards like you seem to be looking for.
Thanks!
After some deeper research, I feel obligated, for the record and for the help of future people looking for this, to mention the pretty much exhaustive work on the subject done by two entities, sometimes working in conjunction: ISO and OGC.
From ISO (International Standards Organization), the "TC 211 - Geographic information/Geomatics" section pretty much contains it all.
From OGS (Open Geospatial Consortium), their Abstract Specifications are very extensive, being at the same time redundant and complimentary to ISO's.
I'm not sure it contains object methods related to the proposed application (gps track and waypoint analysis and manipulation), but for sure the core concepts contained in these documents is rather solid. UML is their schema representation of choice.
ISO 6709 "[...] specifies the representation of coordinates, including latitude and longitude, to be used in data interchange. It additionally specifies representation of horizontal point location using coordinate types other than latitude and longitude. It also specifies the representation of height and depth that can be associated with horizontal coordinates. Representation includes units of measure and coordinate order."
ISO 19107 "specifies conceptual schemas for describing the spatial characteristics of geographic features, and a set of spatial operations consistent with these schemas. It treats vector geometry and topology up to three dimensions. It defines standard spatial operations for use in access, query, management, processing, and data exchange of geographic information for spatial (geometric and topological) objects of up to three topological dimensions embedded in coordinate spaces of up to three axes."
If I find something new, I'll come back to edit this, including links when available.

Approximate and Interpolate GPS Trajectory

I have a sequence of gps values each containing: timestamp, latitude, longitude, n_sats, gps_speed, gps_direction, ... (some subset of NMEA data). I'm not sure of what quality the direction and speed values are. Further, I cannot expect the sequence to be evenly spaced w.r.t. the timestamp. I want to get a smooth trajectory at an even time step.
I've read the Kalman Filter is the tool of choice for such tasks. Is this indeed the case?
I've found some implementations of the Kalman Filter for Python:
http://www.scipy.org/Cookbook/KalmanFiltering
http://ascratchpad.blogspot.de/2010/03/kalman-filter-in-python.html
These however appear to assume regularly spaced data, i.e. iterations.
What would it take to integrate support of irregularly spaced observations?
One thing I could imagine is to repeat/adapt the prediction step to a time-based model. Can you recommend such a model for this application? Would it need to take into account the NMEA speed values?
Having looked all over for an understandable resource on Kalman filters, I'd highly recommend this one: https://github.com/rlabbe/Kalman-and-Bayesian-Filters-in-Python
To your particular question regarding irregularly spaced observations: Look at Chapter 8 in the reference above, and under the heading "Nonstationary Processes". To summarize, you'll need to use a different state transition function and process noise covariance for each iteration. Those are the only things you'll need to change at each iteration, since they're the only components dependent on delta t.
You could also try kinematic interpolation to see if the results fit to what you expect.
Here's a Python implementation of one of these algorithms: https://gist.github.com/talespaiva/128980e3608f9bc5083b

Travelling Salesman and Map/Reduce: Abandon Channel

This is an academic rather than practical question. In the Traveling Salesman Problem, or any other which involves finding a minimum optimization ... if one were using a map/reduce approach it seems like there would be some value to having some means for the current minimum result to be broadcast to all of the computational nodes in some manner that allows them to abandon computations which exceed that.
In other words if we map the problem out we'd like each node to know when to give up on a given partial result before it's complete but when it's already exceeded some other solution.
One approach that comes immediately to mind would be if the reducer had a means to provide feedback to the mapper. Consider if we had 100 nodes, and millions of paths being fed to them by the mapper. If the reducer feeds the best result to the mapper than that value could be including as an argument along with each new path (problem subset). In this approach the granularity is fairly rough ... the 100 nodes will each keep grinding away on their partition of the problem to completion and only get the new minimum with their next request from the mapper. (For a small number of nodes and a huge number of problem partitions/subsets to work across this granularity would be inconsequential; also it's likely that one could apply heuristics to the sequence in which the possible routes or problem subsets are fed to the nodes to get a rapid convergence towards the optimum and thus minimize the amount of "wasted" computation performed by the nodes).
Another approach that comes to mind would be for the nodes to be actively subscribed to some sort of channel, or multicast or even broadcast from which they could glean new minimums from their computational loop. In that case they could immediately abandon a bad computation when notified of a better solution (by one of their peers).
So, my questions are:
Is this concept covered by any terms of art in relation to existing map/reduce discussions
Do any of the current map/reduce frameworks provide features to support this sort of dynamic feedback?
Is there some flaw with this idea ... some reason why it's stupid?
that's a cool theme, that doesn't have that much literature, that was done on it before. So this is pretty much a brainstorming post, rather than an answer to all your problems ;)
So every TSP can be expressed as a graph, that looks possibly like this one: (taken it from the german Wikipedia)
Now you can run a graph algorithm on it. MapReduce can be used for graph processing quite well, although it has much overhead.
You need a paradigm that is called "Message Passing". It was described in this paper here: Paper.
And I blog'd about it in terms of graph exploration, it tells quite simple how it works. My Blogpost
This is the way how you can tell the mapper what is the current minimum result (maybe just for the vertex itself).
With all the knowledge in the back of the mind, it should be pretty standard to think of a branch and bound algorithm (that you described) to get to the goal. Like having a random start vertex and branching to every adjacent vertex. This causes a message to be send to each of this adjacents with the cost it can be reached from the start vertex (Map Step). The vertex itself only updates its cost if it is lower than the currently stored cost (Reduce Step). Initially this should be set to infinity.
You're doing this over and over again until you've reached the start vertex again (obviously after you visited every other one). So you have to somehow keep track of the currently best way to reach a vertex, this can be stored in the vertex itself, too. And every now and then you have to bound this branching and cut off branches that are too costly, this can be done in the reduce step after reading the messages.
Basically this is just a mix of graph algorithms in MapReduce and a kind of shortest paths.
Note that this won't yield to the optimal way between the nodes, it is still a heuristic thing. And you're just parallizing the NP-hard problem.
BUT a little self-advertising again, maybe you've read it already in the blog post I've linked, there exists an abstraction to MapReduce, that has way less overhead in this kind of graph processing. It is called BSP (Bulk synchonous parallel). It is more freely in the communication and it's computing model. So I'm sure that this can be a lot better implemented with BSP than MapReduce. You can realize these channels you've spoken about better with it.
I'm currently involved in an Summer of Code project which targets these SSSP problems with BSP. Maybe you want to visit if you're interested. This could then be a part solution, it is described very well in my blog, too. SSSP's in my blog
I'm excited to hear some feedback ;)
It seems that Storm implements what I was thinking of. It's essentially a computational topology (think of how each compute node might be routing results based on a key/hashing function to the specific reducers).
This is not exactly what I described, but might be useful if one had a sufficiently low-latency way to propagate current bounding (i.e. local optimum information) which each node in the topology could update/receive in order to know which results to discard.