Grouping GPS data into routes

I've built an embedded GPS tracker that logs coordinates to a database, and I'd like to analyse this data and extract travelled routes and idle periods from it.
The way I think it should work:
Order the coordinates by UTC timestamp, find coordinates that lie close together, and mark them as a 'route start'. Proximity could be found by simply comparing the decimals, as I think GPS can be considered linear at a scale of ~3 m. This raises the problem of selecting the 'centre' of the idling: due to GPS inaccuracy, these decimals may drift around a little, potentially beyond the threshold for being grouped with the first fix.
The output I'm after would look something like this:
10:15 to 15:12 - Idle
15:12 to 16:38 - [coordinates]
16:38 to 17:43 - Idle
etc.
How can I group coordinates in this fashion, and what complexities am I overlooking?

The first thing I would do is work with some experimental data. Are you sure there isn't some distance D such that, if someone has moved more than D from their previous location, they must be on the move?
Now, assume that there isn't. Say GPS precision is usually within a reasonable accuracy, say 10 m, but sometimes you just get a point 100 m away. What I would do is: as long as the user hasn't moved more than, say, 20 m, keep calculating a centre for the "idle location" (just the average of the x and y values of all the coordinates so far). When the user moves beyond 20 m from that centre, don't immediately assume they have started moving. Instead, check the next one or more values; only if you see several of them in a row at more than 20 m from the centre, infer that the user started moving.
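In Python, a minimal sketch of that idea might look like the following. It assumes fixes are already projected to planar metres, and the 20 m radius and the three-fix confirmation count are illustrative values, not tuned ones:

import math

IDLE_RADIUS_M = 20.0   # max distance from the running idle centre
CONFIRM = 3            # consecutive far fixes needed before we say "moving"

def departure_flags(points):
    # points: iterable of (x, y) in metres; yields True once the user is
    # judged to have left the idle spot. Only the idle -> moving transition
    # described above is sketched; the reverse transition would mirror it.
    cx = cy = None
    n = 0          # number of fixes folded into the centre
    outside = 0    # consecutive fixes beyond IDLE_RADIUS_M
    for x, y in points:
        if n == 0:
            cx, cy, n = x, y, 1
            yield False
            continue
        if math.hypot(x - cx, y - cy) <= IDLE_RADIUS_M:
            # still near the centre: refine it and reset the counter
            cx = (cx * n + x) / (n + 1)
            cy = (cy * n + y) / (n + 1)
            n += 1
            outside = 0
            yield False
        else:
            outside += 1
            yield outside >= CONFIRM

Tuning IDLE_RADIUS_M and CONFIRM against your real logs is exactly the experiment suggested above.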
I imagine this simple solution should be good enough for most applications, though it depends on what exactly the user is doing (how stable they are during idle time, how fast they move when they do) and on the quality of the GPS data. I would start with something like that and see how it works. If it fails, I would analyse the cases in which it fails and proceed from there.

Related

Using GPS, how can I detect whether a user is on a road or inside a building?

I am currently developing a technique to help users find a spot to park.
But I face a little problem:
a user may indicate that he is parking in a free spot right now, when in fact he is lying and is at home.
How can I detect from GPS whether he is inside a building or alongside the road?
Thanks
You'll need map data (OpenStreetMap is free), and figure out whether the user is somewhere on that map or not. You do that by comparing GPS data to the map data.
What I do in such situations is measure the distance between the fix's lat/lon and each road, and compare the GPS heading to the bearing of each road line. The more context information you use, the more accurate your results will be:
If the speed is 60km/h, you're probably not in a building. You're probably not on a 30km/h road either.
If you're standing still for more than 2 minutes, you're probably not in a car.
If you know the buildings, and there are only a few of them, you could check if you see a certain wifi router or not.
Basically you'll calculate a score for each road, and then pick the road with the highest score to know where you are.
Score = DistScore*DistWeight + AngleScore*AngleWeight, etc.
Also, from iOS and Android you get an accuracy in meters. You can also calculate that yourself if you can access raw GPS data. Using that, you set the area that you need to scan. For example, for a high accuracy (3m), you probably don't have many roads to scan. If the accuracy is 50m, you should probably match roads that are farther away.
If accuracy is important, you should look at series of GPS data, and test if the followed route is a logical path or not.
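As a rough illustration of that scoring idea, here is a sketch. The weights, the representation of roads as pre-sampled (x, y, bearing) points in planar metres, and the linear score functions are all assumptions made for the example, not a prescribed method:

import math

DIST_WEIGHT, ANGLE_WEIGHT = 0.7, 0.3   # illustrative weights

def match_road(px, py, heading_deg, accuracy_m, roads):
    # roads: iterable of (x, y, bearing_deg) samples in planar metres.
    # Only roads inside the reported accuracy radius are scanned, as the
    # answer suggests; the best weighted score wins.
    best, best_score = None, -1.0
    for rx, ry, bearing in roads:
        dist = math.hypot(px - rx, py - ry)
        if dist > accuracy_m:
            continue                               # outside the scan area
        dist_score = 1.0 - dist / accuracy_m       # 1 = right on the road
        dangle = abs((heading_deg - bearing + 180.0) % 360.0 - 180.0)
        angle_score = 1.0 - dangle / 180.0         # 1 = same direction
        score = dist_score * DIST_WEIGHT + angle_score * ANGLE_WEIGHT
        if score > best_score:
            best, best_score = (rx, ry), score
    return best, best_score   # best is None if nothing was within the radius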

Algorithm for ETA (estimated time of arrival)

The question is simple and direct. Please help.
I am working on a bus ETA project where all the data I get are:
1. Real-time GPS locations from the bus
2. The distance between each pair of stops
Do we need more data than these?
My big question is:
How do I make use of these data to calculate the ETA for customers?
My thoughts:
ETA is all about distance/speed, so at first I tried to simply get the distance between 2 GPS coordinates to calculate a speed, and to use the stop's distance to calculate the ETA,
i.e.
while (true) {
    eta = stopDis / twoPtSpeed;    // speed between the last two GPS fixes
    stopDis = stopDis - twoPtDis;  // subtract the distance just covered
    if (stopDis < 0) {
        advanceToNextStop();       // update the target stop once we pass it
    }
}
However, the problem is that the GPS data jump around quite wildly, and hence the calculated speed is really messy.
Broken-down question: how do I smooth the GPS data? A Kalman filter? Averaging? I have heard of those but don't really know where to start, especially with the Kalman filter!
Because of traffic and not-straight lines, distances and speed alone are not reliable indicators.
I think that your best bet is to use the route's history: compare the current distance with the average distance at that point in the run, and deduce a deviation from the average ETA.
I don't think it's that simple. What if there are lots of traffic lights between 2 stops, or different speed limits?
And you have the GPS coordinates, so you can calculate a distance, but that isn't the distance over the road. So you need some way to get the real distance the bus has to travel (perhaps the Google Maps API can help).
So to answer the "Do we need more data than these?" question: yes, I think you do.
Just an idea: the bus schedule already contains information about how much time is needed between stops on average. If you can just tell from the GPS position whether the bus is at a stop or between two stops, you should be able to make a fairly accurate prediction; a sketch follows below.
I'm guessing you are talking about buses in a city, where GPS signals are weak and bus stops are not far apart.
Edit: this could be further improved with data about current traffic.
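A minimal sketch of that schedule idea, assuming we can already tell which stop the bus last passed and how late it was there (the data shapes, the minutes-late shift, and the traffic factor are illustrative):

# schedule: scheduled arrival offsets per stop, in minutes from departure
schedule = [0, 4, 9, 15, 22]

def eta_minutes(last_stop, target_stop, minutes_late=0.0, traffic_factor=1.0):
    # remaining scheduled time, stretched by current traffic and shifted
    # by the delay observed at the last stop the bus passed
    remaining = schedule[target_stop] - schedule[last_stop]
    return remaining * traffic_factor + minutes_late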
Collect travel-time data from stop to stop on a continuous basis. Derive an expected travel time between stops from this historical data to use as a baseline; this can be continuously updated. The initial ETA at a stop is the time of the last stop plus the most recent travel-time average. If you have continuous GPS data, you can adjust in real time from that.
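One way to sketch that baseline; the exponential moving average is my choice here, any running average would do:

from collections import defaultdict

seg_avg = defaultdict(lambda: None)   # (from_stop, to_stop) -> seconds

def record_run(a, b, seconds, alpha=0.1):
    # exponential moving average so the baseline keeps adapting
    prev = seg_avg[(a, b)]
    seg_avg[(a, b)] = seconds if prev is None else (1 - alpha) * prev + alpha * seconds

def eta_seconds(stops_ahead):
    # stops_ahead: stop ids still to visit, starting at the last stop
    # the bus passed; unseen segments contribute nothing (yet)
    return sum(seg_avg[(a, b)] or 0.0 for a, b in zip(stops_ahead, stops_ahead[1:]))

An exponential average keeps old runs from dominating while still smoothing out single noisy trips.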

Algorithm for reducing GPS track data to discard redundant data?

We're building a GIS interface to display GPS track data, e.g. imagine the raw data set from a guy wandering around a neighborhood on a bike for an hour. A data set like this, with perhaps a new point recorded every 5 seconds, will be large, and displaying it in a browser or on a handheld device will be challenging. Also, displaying every single point is usually not necessary, since a user can't visually resolve that much data anyway.
So for performance reasons we are looking for algorithms that are good at 'reducing' data like this so that the number of points being displayed is reduced significantly but in such a way that it doesn't risk data mis-interpretation. For example, if our fictional bike rider stops for a drink, we certainly don't want to draw 100 lat/lon points in a cluster around the 7-Eleven.
We are aware of clustering, which is good for looking at a bunch of disconnected points; however, what we need is something that applies to tracks, as described above. Thanks.
A more scientific, and perhaps more math-heavy, solution is to use the Ramer-Douglas-Peucker algorithm to generalize your path. I used it when I studied for my Master of Surveying, so it's a proven thing. :-)
Given your path and the maximum distance you can tolerate deviating from it, it simplifies the path by reducing the number of points.
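A standard recursive formulation, sketched in Python for planar coordinates; epsilon is the distance tolerance, in the same units as the points:

import math

def rdp(points, epsilon):
    # Ramer-Douglas-Peucker: drop points that deviate less than epsilon
    # from the chord between the segment's endpoints.
    if len(points) < 3:
        return points
    (x1, y1), (x2, y2) = points[0], points[-1]
    dx, dy = x2 - x1, y2 - y1
    chord = math.hypot(dx, dy) or 1e-12
    far_i, far_d = 0, -1.0
    for i in range(1, len(points) - 1):
        px, py = points[i]
        d = abs(dy * (px - x1) - dx * (py - y1)) / chord   # perpendicular distance
        if d > far_d:
            far_i, far_d = i, d
    if far_d <= epsilon:
        return [points[0], points[-1]]   # everything in between is redundant
    # keep the farthest point and recurse on both halves
    left = rdp(points[:far_i + 1], epsilon)
    right = rdp(points[far_i:], epsilon)
    return left[:-1] + right

Feeding a track and an epsilon of a few metres typically removes the bulk of near-collinear points, including the cluster around the 7-Eleven.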
Typically the best way of doing that is (a sketch follows this list):
1. Determine the minimum number of screen pixels you want between displayed GPS points.
2. Determine the distance represented by each pixel at the current zoom level.
3. Multiply answer 1 by answer 2 to get the minimum distance you want between displayed coordinates.
4. Starting from the first coordinate in the journey path, read each subsequent coordinate until you've reached the required minimum distance from the current point; keep that coordinate and repeat.
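A sketch of those four steps; metres_per_pixel would come from your map library at the current zoom, and dist_m is whatever planar or haversine distance function you already use:

def thin_track(points, min_pixels, metres_per_pixel, dist_m):
    # steps 1-3: the minimum ground distance between displayed points
    min_dist = min_pixels * metres_per_pixel
    kept = [points[0]]
    for p in points[1:]:
        if dist_m(kept[-1], p) >= min_dist:   # step 4
            kept.append(p)
    if kept[-1] != points[-1]:
        kept.append(points[-1])               # always keep the journey's end
    return kept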

Accelerometer measuring negative peaks of velocity

I'm writing an application for the iPhone 4, and I'm taking values from the accelerometer to compute the current movement from a known initial position.
I've noticed some very strange behaviour: often, when I walk holding the phone for a few metres and then stop, I register a negative peak of overall velocity while the handset decelerates. How is that possible if I keep moving in the same direction?
To compute the variation in velocity I just do this:
delta_v = (acc_previous + acc_now)/2 * (1/(updating_frequency))
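For context, that update rule is trapezoidal integration of the acceleration; a minimal sketch of the loop it would sit in (Python, names illustrative):

def integrate_velocity(acc_samples, updating_frequency):
    # acc_samples: accelerometer readings along one axis, in m/s/s
    dt = 1.0 / updating_frequency
    v, acc_previous = 0.0, 0.0
    for acc_now in acc_samples:
        v += (acc_previous + acc_now) / 2.0 * dt   # delta_v from the formula
        acc_previous = acc_now
        yield v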
Say you are moving at a constant 10 m/s. Your acceleration is zero. Let's say, for the sake of simplicity, you sample every 1 second.
If you decelerate smoothly over a period of 0.1 seconds, you might catch a reading of -100 m/s/s, or you might get no reading at all, since the deceleration might fall between two sampling windows. Your formula most likely will not detect any deceleration, or, if it does, you'll get two velocity deltas of -50 m/s: (0 + (-100))/2 × 1 s and then ((-100) + 0)/2 × 1 s, a total change of -100 m/s when the true change was only -10 m/s. Either way you'll get the wrong final velocity.
Something similar could happen at almost any scale. All you need is a short period of high acceleration or deceleration that you happen to sample, and your figures are screwed.
Numerical integration is hard. Naive numerical integration of a noisy signal will essentially always produce significant errors and drift (like what you're seeing). People have come up with all sorts of clever ways to deal with this problem, most of which require having some source of reference information other than the accelerometer (think of a Wii controller, which has not only an accelerometer, but also the thingy on top of the TV).
Note that any MEMS accelerometer is necessarily limited to reporting only a certain band of accelerations; if acceleration goes outside of that band, then you will absolutely get significant drift unless you have some way to compensate for it. On top of that, there is the fact that the acceleration is reported as a discrete quantity, so there is necessarily some approximation error as well as noise even if you do not go outside of the window. When you add all of those factors together, some amount of drift is inevitable.
Well if you move any object in one direction, there's a force involved which accelerates the object.
To make the object come to a halt again, the same force is needed in the exact opposite direction; to be more precise, the acceleration vector of the event that happened before needs to be multiplied by -1. That's your negative peak.
Not strictly a programming answer, but then again, your question is not strictly a programming question :)
If it's moving at a constant thousand miles per hour, its acceleration is 0. If it's speeding up, the acceleration is positive. If it's slowing down, the acceleration is negative.
You can take the absolute value of the velocity to strip the negative sign, if that's what you need:
fabs(delta_v); // use abs for ints

GPS signal cleaning & road network matching

I'm using GPS units and mobile computers to track individual pedestrians' travels. I'd like to "clean" the incoming GPS signal in real time to improve its accuracy. Also, after the fact and not necessarily in real time, I would like to "lock" individuals' GPS fixes to positions along a road network. Does anyone have techniques, resources, algorithms, or existing software to suggest on either front?
A few things I am already considering in terms of signal cleaning:
- drop fixes for which num. of satellites = 0
- drop fixes for which speed is unnaturally high (say, 600 mph)
And in terms of "locking" to the street network (which I hear is called "map matching"):
- lock to the nearest network edge based on root mean squared error
- when fixes are far away from road network, highlight those points and allow user to use a GUI (OpenLayers in a Web browser, say) to drag, snap, and drop on to the road network
Thanks for your ideas!
I assume you want to "clean" your data to remove erroneous spikes caused by dodgy readings. This is a basic DSP process. There are several approaches you could take to this; it depends how clever you want it to be.
At a basic level, yes, you can just look for really large figures, but what is a really large figure? 600 mph is fast, but not if you're on Concorde. While you are looking for a value which is "out of the ordinary", you are effectively hard-coding "ordinary". A better approach is to examine past data to determine what "ordinary" is, and then look for deviations. For example, you could calculate the variance of the data over a small moving window and then check whether the z-score of the current point exceeds some threshold; if so, exclude it. A sketch follows.
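A sketch of that windowed z-score test; the window length and threshold are illustrative:

import statistics

def is_spike(window, value, z_threshold=3.0):
    # window: the last N accepted samples (e.g. speeds), newest last
    if len(window) < 5:
        return False                 # too little history to judge
    mean = statistics.fmean(window)
    sd = statistics.stdev(window)
    if sd == 0:
        return value != mean         # any deviation from a flat signal
    return abs(value - mean) / sd > z_threshold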
One note: you should require a minimum of 3 satellites, not 0. A GPS receiver needs at least three satellites to calculate a horizontal position. Every GPS I have used includes a status flag in the data stream; fewer than 3 satellites is reported as "bad" data in some way.
You should also consider "stationary" data. How will you handle the pedestrian standing still for some period of time? Perhaps waiting at a crosswalk or interacting with a street vendor?
Depending on what you plan to do with the data, you may need to suppress those extra data points or average them into a single point or location.
You mention this is for pedestrian tracking, but you also mention a road network. Pedestrians can travel a lot of places where a car cannot, and, indeed, which probably are not going to be on any map you find of a "road network". Most road maps don't have things like walking paths in parks, hiking trails, and so forth. Don't assume that "off the road network" means the GPS isn't getting an accurate fix.
In addition to Andrew's comments, you may also want to consider interference factors such as multipath and how they show up in your incoming GPS data stream, e.g. the HDOP values in the GSA line of NMEA 0183. In my own GPS controller software, I allow user-specified rejection criteria against a range of QA-related parameters.
I also tend to work on a moving window principle in this regard, where you can consider rejecting data that represents a spike based on surrounding data in the same window.
Read the position-fix indicator to see if the signal is valid (the fix-quality field in the $GPGGA sentence if you parse raw NMEA strings). If it's 0, ignore the message.
Besides that you could look at the combination of HDOP and the number of satellites if you really need to be sure that the signal is very accurate, but in normal situations that shouldn't be necessary.
Of course it doesn't hurt to do some sanity checks on GPS signals (a sketch follows this list):
latitude between -90..90;
longitude between -180..180 (or E..W, N..S, 0..90 and 0..180 if you're reading raw NMEA strings);
speed between 0 and 255 (for normal cars);
distance to the previous measurement (based on lat/lon) roughly matches the indicated speed;
time difference with the system time not larger than x (unless the system clock cannot be trusted or relies on GPS synchronisation :-) );
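A sketch of some of those checks in Python. Field positions follow the $GPGGA layout of NMEA 0183; dist_km stands in for whatever distance function you already have, and the 3-satellite minimum echoes the earlier answer:

def gga_valid(sentence):
    # $GPGGA fields: [6] = fix quality (0 means invalid), [7] = satellites
    f = sentence.split(',')
    if not f[0].endswith('GGA') or f[6] == '0':
        return False
    return int(f[7] or 0) >= 3

def plausible(lat, lon, speed_kmh, prev_fix, dt_s, dist_km):
    if not (-90.0 <= lat <= 90.0 and -180.0 <= lon <= 180.0):
        return False
    if not (0.0 <= speed_kmh <= 255.0):   # the "normal cars" bound above
        return False
    # the distance implied by consecutive fixes should roughly match the
    # reported speed; the 50 km/h slack is illustrative
    implied_kmh = dist_km(prev_fix, (lat, lon)) / (dt_s / 3600.0)
    return abs(implied_kmh - speed_kmh) < 50.0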
To do map matching, you basically iterate through your road segments and check which segment is the most likely one for your current position, direction, speed, and possibly previous GPS measurements and matches.
If you're not doing a real-time application, or if a delay in feedback is acceptable, you can even look into the 'future' to see which segment is the most likely.
Doing all that properly is an art by itself, and this space here is too short to go into it deeply.
It's often difficult to decide with 100% confidence on which road segment somebody resides. For example, if there are 2 parallel roads that are equally close to the current position it's a matter of creative heuristics.
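Stripped down to the distance part only, the segment loop might look like this sketch; heading, speed, and history would be extra score terms, and they are what break ties between those parallel roads:

import math

def point_segment_dist(px, py, ax, ay, bx, by):
    # clamped perpendicular distance from the fix to segment a-b,
    # planar coordinates in metres
    dx, dy = bx - ax, by - ay
    L2 = dx * dx + dy * dy
    t = 0.0 if L2 == 0 else max(0.0, min(1.0, ((px - ax) * dx + (py - ay) * dy) / L2))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))

def nearest_segment(px, py, segments):
    # segments: iterable of (ax, ay, bx, by) tuples; equally distant
    # candidates are where the "creative heuristics" above come in
    return min(segments, key=lambda s: point_segment_dist(px, py, *s))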