From a set of waypoints, each with a lat/lon and a timestamp, how do you find the start and end points of a trip, as opposed to staying at a place over time?
I'm playing with Google Latitude data; unfortunately it only provides a stack of locations, with no metadata. I tried calculating the average velocity between waypoints, but because of the nature of Latitude the location data sometimes gets slightly inaccurate, and it then looks like I stopped somewhere even though I was still on a trip.
I guess staying at a place over time is just a matter of degree, isn't it?
Given a bunch of lat/long pairs with timestamps, you can do the following:
Figure out the first and last points
Figure out what order the points were visited in
Figure out the difference between successive waypoints
But if this is all you have, this is all you have. Some questions you don't know the answer to:
Are any waypoints missing?
Are the waypoints entered by the system/user at consistent times?
Are the first/last points the user's primary residence?
Are the waypoints accurate? (Did the user make a transcription error somewhere?)
If you have a set of waypoints that may be several trips, I suppose you can take some time cutoff to bin your trips (a gap of a few weeks probably means separate trips).
Otherwise, the earliest and latest points are probably your best bets.
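To make the time-cutoff idea concrete, here is a rough Python sketch (not from the original answer; the 12-hour, 200-metre and 30-minute thresholds are arbitrary placeholders you would tune to your data). It splits the waypoint stream into trips wherever there is a large time gap, and then flags a "stay" whenever the points remain within a small radius of an anchor point for a minimum dwell time, which tolerates jittery Latitude fixes better than thresholding instantaneous velocity:

import math
from datetime import timedelta

def haversine_m(lat1, lon1, lat2, lon2):
    # Great-circle distance in metres on a spherical Earth.
    r = 6371000.0
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))

def split_trips(waypoints, max_gap=timedelta(hours=12)):
    # waypoints: list of (timestamp, lat, lon) tuples, in any order.
    # A gap longer than max_gap starts a new trip.
    pts = sorted(waypoints, key=lambda p: p[0])
    if not pts:
        return []
    trips, current = [], [pts[0]]
    for prev, cur in zip(pts, pts[1:]):
        if cur[0] - prev[0] > max_gap:
            trips.append(current)
            current = []
        current.append(cur)
    trips.append(current)
    return trips

def find_stays(trip, radius_m=200.0, min_dwell=timedelta(minutes=30)):
    # A "stay" is a run of points that remains within radius_m of an anchor
    # point for at least min_dwell; small GPS jitter stays inside the radius.
    stays, i = [], 0
    while i < len(trip):
        j = i
        while (j + 1 < len(trip) and
               haversine_m(trip[i][1], trip[i][2], trip[j + 1][1], trip[j + 1][2]) <= radius_m):
            j += 1
        if trip[j][0] - trip[i][0] >= min_dwell:
            stays.append((trip[i][0], trip[j][0], trip[i][1], trip[i][2]))
            i = j + 1
        else:
            i += 1
    return stays

The start and end points of each trip then fall out naturally as the first and last waypoints between detected stays.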
I'm looking at the freely available Solar potential dataset on Google BigQuery that may be found here: https://bigquery.cloud.google.com/table/bigquery-public-data:sunroof_solar.solar_potential_by_censustract?pli=1&tab=schema
Each record in the table has the following border definitions:
lat_max - maximum latitude for that region
lat_min - minimum latitude for that region
lng_max - maximum longitude for that region
lng_min - minimum longitude for that region
Now I have a coordinate (lat/lng pair) and I would like to query to see whether or not that coordinate is within the above range. How do I do that with BQ Standard SQL?
I've seen the Geo Functions here: https://cloud.google.com/bigquery/docs/reference/standard-sql/geography_functions
But I'm still not sure how to write this query.
Thanks!
Assuming the points are just latitude and longitude as numbers, why can't you just do a standard numerical comparison?
Note: The first link doesn't work without a Google account, so I can't see the data.
But if you want to go the spatial route, I'd suggest you're going to need to take the border coordinates that you have and turn them into a polygon using one of ST_MAKEPOLYGON, ST_GEOGFROMGEOJSON, or ST_GEOGFROMTEXT. Then create a point from the coordinates you wish to test using ST_GEOGPOINT.
Now that you have two geographies, you can compare them using ST_INTERSECTS or ST_DISJOINT, depending on what outcome you want.
If you want to get fancy and see how far away from the border you are, you can use ST_DISTANCE.
Agree with Jonathan: just checking whether each of the lat/lon values is within the bounds is the simplest way to achieve this (unless there are any issues around the antimeridian, but most likely you can just ignore them).
If you do want to use Geography objects for that, you can construct Geography objects for these rectangles, using
ST_MakePolygon(ST_MakeLine(
[ST_GeogPoint(lon_min, lat_min), ST_GeogPoint(lon_max, lat_min),
ST_GeogPoint(lon_max, lat_max), ST_GeogPoint(lon_min, lat_max),
ST_GeogPoint(lon_min, lat_min)]))
And then check whether the point is within a particular rectangle using
ST_Intersects(ST_GeogPoint(lon, lat), <polygon-above>)
But it will likely be slower and would not provide any benefit for this particular case.
The lack of high schools in remote areas is a problem for students in developing countries. Students in some locations are better than those in others, so I have to find those locations. Now, the main problem is defining "BETTER". I have made some rules that define the profile of a location.
Right now, I am concerned with the good students.
So, what I have done is:
1. Used some inferential statistics and made some rules to come to the conclusion that locations A, B, C, etc. are the most promising locations in which to put high schools, because according to my rules these locations contain quality students.
I did all of the above to label the data: I needed to define "BETTER" and label the data so that I can now use a machine learning algorithm to learn the factors that make a location a potential location, so that if I give the model a data point from the test data, it will instantly tell whether the location is better or not.
Overview of the method:
For each location, I have these 4 pieces of information:
total_students_staying_for_high_school_education(A),
total_students_leaving_for_high_school_education_in_another_place(B),
mean_grade_point_of_students_of_type_B,
ratio (calculated as B/A),
For the locations whose ratio > 1:
I applied a chi-squared significance test to get a statistic telling me whether students are leaving that place in significantly greater numbers than they are staying. I used ANOVA and then a Tukey test to compare mean grade points and to find pairs of locations whose means differ significantly, and which of the two is greater.
I then wrote a Python program with a custom comparator that first checks whether the mean grades of two locations differ significantly and, if so, returns the one with the greater mean. If the means don't differ, the comparator returns the location whose chi-squared value is greater.
This is how the whole process comes up with a few suggested locations, and I call those locations "BETTER".
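To make the decision rule explicit, here is a small Python sketch of that kind of comparator (this is not the asker's actual code; the numbers, field names, and the significant_pairs set are placeholders standing in for the outputs of the chi-squared and Tukey tests):

from functools import cmp_to_key

# Placeholder per-location summaries produced by the earlier tests.
locations = {
    "A": {"mean_grade": 3.4, "chi2": 12.1},
    "B": {"mean_grade": 3.1, "chi2": 18.7},
    "C": {"mean_grade": 3.4, "chi2": 9.3},
}
# Pairs whose mean grades differ significantly according to the Tukey test.
significant_pairs = {frozenset({"A", "B"})}

def compare(loc1, loc2):
    # Negative return value means loc1 ranks ahead of loc2.
    a, b = locations[loc1], locations[loc2]
    if frozenset({loc1, loc2}) in significant_pairs:
        key = "mean_grade"   # means differ significantly: higher mean wins
    else:
        key = "chi2"         # otherwise: stronger "students leaving" signal wins
    if a[key] == b[key]:
        return 0
    return -1 if a[key] > b[key] else 1

ranking = sorted(locations, key=cmp_to_key(compare))
print(ranking)  # best candidates first, e.g. ['A', 'B', 'C']

Sorting with this comparator gives the ranked list from which the top locations are labelled "BETTER".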
What I am concerned about is:
1. How do I verify that my rules are valid? Or do I even need to verify them?
2. Most importantly, is mingling statistics with machine learning as described above an appropriate approach? Is there any major leakage in the method? Can anyone suggest a more general method?
OK so I don't have an issue here but I'm just wondering if there's a more standardized way to handle what I'm doing.
Essentially I have a DB table full of locations, including longitude and latitude; there could potentially be thousands of locations. I also have some functionality to search by your postcode, and you can then see, from the stored locations, the closest x to you.
I've read about using the Google Maps API to do this, but I don't really want to send thousands of requests to the Google Maps API.
So here's what I'm doing. I have a stored procedure to which I pass the user's long and lat. I then use these to form a column called Distance, by which I order the data. I work out the distance column using the logic below:
SQRT(SQUARE((CAST(USERSLAT AS decimal(9,6))) - Latitude) + SQUARE((CAST(USERSLONG AS decimal(9,6)))-(Longitude))) AS Distance
Essentially what this is doing is the classic a^2 = b^2 + c^2 to find the distance between two coords, and then using these results I can theoretically see the closest locations to the user. Once I have this data I can use the Google Maps API to find the exact distances. Is this an OK way to do things? I have this nagging feeling in the back of my head that I'm missing something.
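As a quick aside (not from the original post): a small Python sketch with made-up coordinates shows why ordering by "Pythagoras on raw degrees" can differ from ordering by real-world distance, and the usual cheap fix of scaling the longitude difference by cos(latitude):

import math

def degree_pythagoras(lat1, lon1, lat2, lon2):
    # The approach from the stored procedure: a^2 = b^2 + c^2 directly on degrees.
    return math.sqrt((lat1 - lat2) ** 2 + (lon1 - lon2) ** 2)

def equirectangular_km(lat1, lon1, lat2, lon2):
    # Same idea, but longitude degrees are scaled by cos(latitude) and the
    # result is converted to kilometres (1 degree of latitude is roughly 111.32 km).
    mean_lat = math.radians((lat1 + lat2) / 2)
    dx = (lon1 - lon2) * math.cos(mean_lat)
    dy = lat1 - lat2
    return 111.32 * math.sqrt(dx * dx + dy * dy)

# Hypothetical user and two candidate locations at UK-ish latitudes.
user = (53.0, -1.0)
a = (53.0, -2.2)   # 1.2 degrees west of the user
b = (54.0, -1.0)   # 1.0 degree north of the user

for name, loc in [("a", a), ("b", b)]:
    print(name,
          round(degree_pythagoras(*user, *loc), 3),
          round(equirectangular_km(*user, *loc), 1))
# The degree-space distance says 'b' is closer (1.0 vs 1.2), but in kilometres
# 'a' is closer (~80 km vs ~111 km), because longitude degrees shrink with latitude.

For a rough pre-filter that only feeds candidates to the Google Maps API this is often good enough, but applying the cos(latitude) correction (or a proper great-circle formula) keeps the ordering honest at higher latitudes.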
I have been given a file of 1200 airports that lists airport code, latitude, longitude, city, and state.
EX: ANB 33.58 85.85 Anniston AL
Eventually I will be writing methods: 'Distance' to return the distance and information for two input airports, 'Closest' to return the code and distance of the closest airport to an input airport, and 'Shortest' to find the shortest trip that begins at an input airport and travels to n airports.
For now my question is, what is the best way to read in this data that will eventually make it easier for me to write/calculate distances later?
For example, would I read in the file and then put it into a HashMap or TreeSet in one method, and how would this be done? Or would I wait and use a HashMap/TreeSet in the other methods?
Sorry I don't have any code yet, but I'm stuck on this for now and you guys always help me out tremendously, so I'm just looking for direction at this point.
It sounds like the simplest approach would be to create an object to store the information for one airport and then store all of those objects in one array. I say this because you're probably going to be doing a lot of iteration over the entire array in order to build your other methods, and since you only have 1200 objects, any fancy sorting isn't going to speed up your program that much.
I suppose you could also divide your set of airports into geographic regions and override hashCode() in order to group nearby airports together, but that doesn't buy you much speed, and it's not particularly helpful for airports near the edge of a region. Similarly, you could implement a GeoHash, but these also have issues with certain edge conditions that may or may not matter with your set of airports. (There are also open source Java implementations of GeoHashes out there if you do a search.)
Whatever you do, don't set up a group of HashMaps to map airport name to each of your other pieces of data. That is a common beginner approach, but it is also the slowest approach. Creating an object is much better.
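To make the "one object per airport in a plain array" suggestion concrete, here is a rough sketch, in Python purely to show the shape of the data (the same structure maps directly onto a small Java class stored in an array or ArrayList; the file format is assumed from the example line in the question):

import math
from dataclasses import dataclass

@dataclass
class Airport:
    code: str
    lat: float
    lon: float
    city: str
    state: str

def load_airports(path):
    # Parse lines like 'ANB 33.58 85.85 Anniston AL' into Airport objects.
    airports = []
    with open(path) as f:
        for line in f:
            parts = line.split()
            if len(parts) < 5:
                continue  # skip blank or malformed lines
            code, lat, lon = parts[0], float(parts[1]), float(parts[2])
            state, city = parts[-1], " ".join(parts[3:-1])  # city may contain spaces
            airports.append(Airport(code, lat, lon, city, state))
    return airports

def distance_km(a, b):
    # Great-circle distance between two airports (spherical Earth).
    phi1, phi2 = math.radians(a.lat), math.radians(b.lat)
    dphi, dlmb = math.radians(b.lat - a.lat), math.radians(b.lon - a.lon)
    h = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * 6371.0 * math.asin(math.sqrt(h))

def closest(airports, code):
    # A linear scan over ~1200 airports is plenty fast for 'Closest'.
    origin = next(a for a in airports if a.code == code)
    others = (a for a in airports if a.code != code)
    return min(others, key=lambda a: distance_km(origin, a))

'Distance' and 'Shortest' then become simple loops over the same list, which stays cheap at 1200 entries.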
I have a requirement to calculate the lat and long values of the user's current position. However, I can't use GPS/Network. I know a previous lat/long location of the user; this previous location was obtained from the GPS provider. After this initial location is found, GPS is no longer available. The user then travels a certain distance from this point in a certain direction. Both of these values, the distance and the direction of travel (as an angle), are known. Is there any way I can arrive at the new lat/long coordinates based on this available information (previous lat/long coordinates, plus the distance and direction travelled from the previous position)?
This method of navigation is known as dead reckoning: given an initial position, the task is to deduce the current position from known information, e.g. heading, time travelled, and speed (or heading and distance).
You may find some formulas to compute the new location here.
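For reference, the kind of formula that page gives is the standard spherical "destination point" calculation; here is a minimal Python sketch of it (assuming a spherical Earth and a bearing measured in degrees clockwise from north; the sample start point and distance are made up):

import math

EARTH_RADIUS_M = 6371000.0  # mean Earth radius in metres (spherical model)

def dead_reckon(lat_deg, lon_deg, bearing_deg, distance_m):
    # New lat/lon from a start point, a bearing and a distance travelled.
    phi1 = math.radians(lat_deg)
    lmb1 = math.radians(lon_deg)
    theta = math.radians(bearing_deg)
    delta = distance_m / EARTH_RADIUS_M  # angular distance

    phi2 = math.asin(math.sin(phi1) * math.cos(delta) +
                     math.cos(phi1) * math.sin(delta) * math.cos(theta))
    lmb2 = lmb1 + math.atan2(math.sin(theta) * math.sin(delta) * math.cos(phi1),
                             math.cos(delta) - math.sin(phi1) * math.sin(phi2))
    lon2 = (math.degrees(lmb2) + 540) % 360 - 180  # normalise to [-180, 180)
    return math.degrees(phi2), lon2

# e.g. start at 12.9716 N, 77.5946 E and travel 500 m on a bearing of 45 degrees
print(dead_reckon(12.9716, 77.5946, 45.0, 500.0))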
Stefan gives you a good place to start. However, depending on what you want to do with that information, you might be better off using a Kalman filter, which would allow you to account for error in the starting position, the distance travelled, and the direction travelled. This is especially true if the user isn't moving just once, but several times.
This link GeoTools - How to do Dead Reckoning and course calculations using GeoTools classes answers my question. Given an angle, distance and starting geographic point, GeoTools (Java) can be used to do the calculation. Refer to the SO link to see the sample.