Kinect normalize depth - kinect

I have some Kinect data of somebody standing (reasonably) still and performing sets of punches. I am given it in the format of an x,y,z co-ordinate for each joint of which they are 20, so I have 60 data points per frame.
I'm trying to perform a classification task on the punches however I'm having some problems normalising my data. As you can see from the graph there are sections with much higher 'amplitude' than the others, my belief is that this is due to how close that person was to the kinect sensor when the readings were taken. (The graph is actually the first principal coefficient obtained by PCA for each frame, multiple sequences of the same punch are strung together in this graph)
Looking back at the data files it looks like those that are 'out' have a z co-ordinate (depth from sensor) of ~2.7 where as the others tent to hover around 3.3-3.6.
How can I perform a normalization with the depth values to make them closer to each other for each sequence? I've already tried differentiation to get the velocity, although it helps to normalise the output actually ends up too similar and makes it very hard to classify.
Edit: I should mention I am already using a normalization method by subtracting the hip position from each joint in an attempt to make the co-ordinates relative.

The Kinect can output some strange values when the person that is tracked is standing near the edges of the view of the Kinect. I would either completly ignore these data or just replace the data with an average of the previous 2 and next 2.
For example:
1,2,1,12,1,2,3
Replace 12 with (2 + 1 + 1 + 2) / 4 = 1.5
You can basically do this with the whole array of values you have, this way you have a more normalised line/graph.
You can also use the clippedEdges value to determine if one or more joints is outside the view.

Related

finding pivot point of two 3D transformations

I need to find out what the degrees of freedom are between two arbitrary geometries that may be linked to eachother. for instance a hinge consisting of two parts. I can simulate the motion of the two parts, and I figured that if I fix one of the parts in place, i can deduce what the axes and point of rotation is for the second moving part is from the transformation in each timestep.
I run into some difficulties calculating this (my vector algebra is ok, my (numpy) math skills less so)
How I see it is I have two 4x4 transformation matrices for each timestep, the previous position/orientation of the moving part (A) and the current position/orientation (A')
then the point of rotation can be found by by calculating the transformation matrix B that transforms A into A' which is I believe
B = inverse(A) * A'
and then find the point that does not change under transformation by B:
x = Bx
Is my thinking correct and if so, how do I solve this equation?

Creating an MLP that learns based on GPS coordinates

I have some data that tells me the amount of hours water is available for particular towns.
You can see it here
I want to use train a Multilayer Perceptron based on that data, to take a set of coordinates and indicate the approximate number of hours for which that coordinate will have water.
Does this make sense?
If so, am I correct in saying, there has to be two input layers? One for lat and one for long. And the output layer should be the number of hours.
Would love some guidance.
I would solve that differently:
Just create an ArrayList of WaterInfo:
WaterInfo contains lat,lon, waterHours.
Then for a given coordinate search the closest WaterInfo in the list.
Since you have not many elements, just do a brute force search, to find the closest.
You further can optimize, to find the three closest WaterInfo points, and calculate the weithted average of WaterHours. As weight you use the air distance from current position to Waterinfo position.
To answer your question:
"Does this makes sense"?
From the goal to get a working solution: NO!
Ask yourself, why do you want to use MLP for this task.
Further i doubt that using two layers for lat / long makes sense.
A coordinate (lat/lon) is one point on the world, so that should be one layer in the model. You can convert the lat/lon coord to a cell identifier: Span a grid over Brazil; with cell width 10 or 50km; now convert a lat/long coordinate to a cellId: Like E4 on a chess board, you will calculate one integer value representing the cell. (There are other solutions to get an unique number, too, choose one you like)
Now you have a modell geoCellID -> waterHours, which better represents the real world situation.

Search optimization problem

Suppose you have a list of 2D points with an orientation assigned to them. Let the set S be defined as:
S={ (x,y,a) | (x,y) is a 2D point, a is an orientation (an angle) }.
Given an element s of S, we will indicate with s_p the point part and with s_a the angle part. I would like to know if there exist an efficient data structure such that, given a query point q, is able to return all the elements s in S such that
(dist(q_p, s_p) < threshold_1) AND (angle_diff(q_a, s_a) < threshold_2) (1)
where dist(p1,p2), with p1,p2 2D points, is the euclidean distance, and angle_diff(a1,a2), with a1,a2 angles, is the difference between angles (taken to be the smallest one). The data structure should be efficient w.r.t. insertion/deletion of elements and the search as defined above. The number of vectors can grow up to 10.000 and more, but take this with a grain of salt.
Now suppose to change the above requirement: instead of using the condition (1), let's request all the elements of S such that, given a distance function d, we want all elements of S such that d(q,s) < threshold. If i remember well, this last setup is called range-search. I don't know if the first case can be transformed in the second.
For the distance search I believe the accepted best method is a Binary Space Partition tree. This can be stored as a series of bits. Each two bits (for a 2D tree) or three bits (for a 3D tree) subdivides the space one more level, increasing resolution.
Using a BSP, locating a set of objects to compare distances with is pretty easy. Just find the smallest set of squares or cubes which contain the edges of your distance box.
For the angle, I don't know of anything. I suppose that you could store each object in a second list or tree sorted by its angle. Then you would find every object at the proper distance using the BSP, every object at the proper angles using the angle tree, then do a set intersection.
You have effectively described a "three dimensional cyclindrical space", ie. a space that is locally three dimensional but where one dimension is topologically cyclic. In other words, it is locally flat and may be modeled as the boundary of a four-dimensional object C4 in (x, y, z, w) defined by
z^2 + w^2 = 1
where
a = arctan(w/z)
With this model, the space defined by your constraints is a 2-dimensional cylinder wrapped "lengthwise" around a cross section wedge, where the wedge wraps around the 4-d cylindrical space with an angle of 2 * threshold_2. This can be modeled using a "modified k-d tree" approach (modified 3-d tree), where the data structure is not a tree but actually a graph (it has cycles). You can still partition this space into cells with hyperplane separation, but traveling along the curve defined by (z, w) in the positive direction may encounter a point encountered in the negative direction. The tree should be modified to actually lead to these nodes from both directions, so that the edges are bidirectional (in the z-w curve direction - the others are obviously still unidirectional).
These cycles do not change the effectiveness of the data structure in locating nearby points or allowing your constraint search. In fact, for the most part, those algorithms are only slightly modified (the simplest approach being to hold a visited node data structure to prevent cycles in the search - you test the next neighbors about to be searched).
This will work especially well for your criteria, since the region you define is effectively bounded by these axis-defined hyperplane-bounded cells of a k-d tree, and so the search termination will leave a region on average populated around pi / 4 percent of the area.

Simplification / optimization of GPS track

I've got a GPS track produced by gpxlogger(1) (supplied as a client for gpsd). GPS receiver updates its coordinates every 1 second, gpxlogger's logic is very simple, it writes down location (lat, lon, ele) and a timestamp (time) received from GPS every n seconds (n = 3 in my case).
After writing down a several hours worth of track, gpxlogger saves several megabyte long GPX file that includes several thousands of points. Afterwards, I try to plot this track on a map and use it with OpenLayers. It works, but several thousands of points make using the map a sloppy and slow experience.
I understand that having several thousands of points of suboptimal. There are myriads of points that can be deleted without losing almost anything: when there are several points making up roughly the straight line and we're moving with the same constant speed between them, we can just leave the first and the last point and throw away anything else.
I thought of using gpsbabel for such track simplification / optimization job, but, alas, it's simplification filter works only with routes, i.e. analyzing only geometrical shape of path, without timestamps (i.e. not checking that the speed was roughly constant).
Is there some ready-made utility / library / algorithm available to optimize tracks? Or may be I'm missing some clever option with gpsbabel?
Yes, as mentioned before, the Douglas-Peucker algorithm is a straightforward way to simplify 2D connected paths. But as you have pointed out, you will need to extend it to the 3D case to properly simplify a GPS track with an inherent time dimension associated with every point. I have done so for a web application of my own using a PHP implementation of Douglas-Peucker.
It's easy to extend the algorithm to the 3D case with a little understanding of how the algorithm works. Say you have input path consisting of 26 points labeled A to Z. The simplest version of this path has two points, A and Z, so we start there. Imagine a line segment between A and Z. Now scan through all remaining points B through Y to find the point furthest away from the line segment AZ. Say that point furthest away is J. Then, you scan the points between B and I to find the furthest point from line segment AJ and scan points K through Y to find the point furthest from segment JZ, and so on, until the remaining points all lie within some desired distance threshold.
This will require some simple vector operations. Logically, it's the same process in 3D as in 2D. If you find a Douglas-Peucker algorithm implemented in your language, it might have some 2D vector math implemented, and you'll need to extend those to use 3 dimensions.
You can find a 3D C++ implementation here: 3D Douglas-Peucker in C++
Your x and y coordinates will probably be in degrees of latitude/longitude, and the z (time) coordinate might be in seconds since the unix epoch. You can resolve this discrepancy by deciding on an appropriate spatial-temporal relationship; let's say you want to view one day of activity over a map area of 1 square mile. Imagining this relationship as a cube of 1 mile by 1 mile by 1 day, you must prescale the time variable. Conversion from degrees to surface distance is non-trivial, but for this case we simplify and say one degree is 60 miles; then one mile is .0167 degrees. One day is 86400 seconds; then to make the units equivalent, our prescale factor for your timestamp is .0167/86400, or about 1/5,000,000.
If, say, you want to view the GPS activity within the same 1 square mile map area over 2 days instead, time resolution becomes half as important, so scale it down twice further, to 1/10,000,000. Have fun.
Have a look at Ramer-Douglas-Peucker algorithm for smoothening complex polygons, also Douglas-Peucker line simplification algorithm can help you reduce your points.
OpenSource GeoKarambola java library (no Android dependencies but can be used in Android) that includes a GpxPathManipulator class that does both route & track simplification/reduction (3D/elevation aware).
If the points have timestamp information that will not be discarded.
https://sourceforge.net/projects/geokarambola/
This is the algorith in action, interactively
https://lh3.googleusercontent.com/-hvHFyZfcY58/Vsye7nVrmiI/AAAAAAAAHdg/2-NFVfofbd4ShZcvtyCDpi2vXoYkZVFlQ/w360-h640-no/movie360x640_05_82_05.gif
This algorithm is based on reducing the number of points by eliminating those that have the greatest XTD (cross track distance) error until a tolerated error is satisfied or the maximum number of points is reached (both parameters of the function), wichever comes first.
An alternative algorithm, for on-the-run stream like track simplification (I call it "streamplification") is:
you keep a small buffer of the points the GPS sensor gives you, each time a GPS point is added to the buffer (elevation included) you calculate the max XTD (cross track distance) of all the points in the buffer to the line segment that unites the first point with the (newly added) last point of the buffer. If the point with the greatest XTD violates your max tolerated XTD error (25m has given me great results) then you cut the buffer at that point, register it as a selected point to be appended to the streamplified track, trim the trailing part of the buffer up to that cut point, and keep going. At the end of the track the last point of the buffer is also added/flushed to the solution.
This algorithm is lightweight enough that it runs on an AndroidWear smartwatch and gives optimal output regardless of if you move slow or fast, or stand idle at the same place for a long time. The ONLY thing that maters is the SHAPE of your track. You can go for many minutes/kilometers and, as long as you are moving in a straight line (a corridor within +/- tolerated XTD error deviations) the streamplify algorithm will only output 2 points: those of the exit form last curve and entry on next curve.
I ran in to a similar issue. The rate at which the gps unit takes points is much larger that needed. Many of the points are not geographically far away from each other. The approach that I took is to calculate the distance between the points using the haversine formula. If the distance was not larger than my threshold (0.1 miles in my case) I threw away the point. This quickly gets the number of points down to a manageable size.
I don't know what language you are looking for. Here is a C# project that I was working on. At the bottom you will find the haversine code.
http://blog.bobcravens.com/2010/09/gps-using-the-netduino/
Hope this gets you going.
Bob
This is probably NP-hard. Suppose you have points A, B, C, D, E.
Let's try a simple deterministic algorithm. Suppose you calculate the distance from point B to line A-C and it's smaller than your threshold (1 meter). So you delete B. Then you try the same for C to line A-D, but it's bigger and D for C-E, which is also bigger.
But it turns out that the optimal solution is A, B, E, because point C and D are close to the line B-E, yet on opposite sides.
If you delete 1 point, you cannot be sure that it should be a point that you should keep, unless you try every single possible solution (which can be n^n in size, so on n=80 that's more than the minimum number of atoms in the known universe).
Next step: try a brute force or branch and bound algorithm. Doesn't scale, doesn't work for real-world size. You can safely skip this step :)
Next step: First do a determinstic algorithm and improve upon that with a metaheuristic algorithm (tabu search, simulated annealing, genetic algorithms). In java there are a couple of open source implementations, such as Drools Planner.
All in all, you 'll probably have a workable solution (although not optimal) with the first simple deterministic algorithm, because you only have 1 constraint.
A far cousin of this problem is probably the Traveling Salesman Problem variant in which the salesman cannot visit all cities but has to select a few.
You want to throw away uninteresting points. So you need a function that computes how interesting a point is, then you can compute how interesting all the points are and throw away the N least interesting points, where you choose N to slim the data set sufficiently. It sounds like your definition of interesting corresponds to high acceleration (deviation from straight-line motion), which is easy to compute.
Try this, it's free and opensource online Service:
https://opengeo.tech/maps/gpx-simplify-optimizer/
I guess you need to keep points where you change direction. If you split your track into the set of intervals of constant direction, you can leave only boundary points of these intervals.
And, as Raedwald pointed out, you'll want to leave points where your acceleration is not zero.
Not sure how well this will work, but how about taking your list of points, working out the distance between them and therefore the total distance of the route and then deciding on a resolution distance and then just linear interpolating the position based on each step of x meters. ie for each fix you have a "distance from start" measure and you just interpolate where n*x is for your entire route. (you could decide how many points you want and divide the total distance by this to get your resolution distance). On top of this you could add a windowing function taking maybe the current point +/- z points and applying a weighting like exp(-k* dist^2/accuracy^2) to get the weighted average of a set of points where dist is the distance from the raw interpolated point and accuracy is the supposed accuracy of the gps position.
One really simple method is to repeatedly remove the point that creates the largest angle (in the range of 0° to 180° where 180° means it's on a straight line between its neighbors) between its neighbors until you have few enough points. That will start off removing all points that are perfectly in line with their neighbors and will go from there.
You can do that in Ο(n log(n)) by making a list of each index and its angle, sorting that list in descending order of angle, keeping how many you need from the front of the list, sorting that shorter list in descending order of index, and removing the indexes from the list of points.
def simplify_points(points, how_many_points_to_remove)
angle_map = Array.new
(2..points.length - 1).each { |next_index|
removal_list.add([next_index - 1, angle_between(points[next_index - 2], points[next_index - 1], points[next_index])])
}
removal_list = removal_list.sort_by { |index, angle| angle }.reverse
removal_list = removal_list.first(how_many_points_to_remove)
removal_list = removal_list.sort_by { |index, angle| index }.reverse
removal_list.each { |index| points.delete_at(index) }
return points
end

Continuous collision detection between two moving tetrahedra

My question is fairly simple. I have two tetrahedra, each with a current position, a linear speed in space, an angular velocity and a center of mass (center of rotation, actually).
Having this data, I am trying to find a (fast) algorithm which would precisely determine (1) whether they would collide at some point in time, and if it is the case, (2) after how much time they collided and (3) the point of collision.
Most people would solve this by doing triangle-triangle collision detection, but this would waste a few CPU cycles on redundant operations such as checking the same edge of one tetrahedron against the same edge of the other tetrahedron upon checking up different triangles. This only means I'll optimize things a bit. Nothing to worry about.
The problem is that I am not aware of any public CCD (continuous collision detection) triangle-triangle algorithm which takes self-rotation in account.
Therefore, I need an algorithm which would be inputted the following data:
vertex data for three triangles
position and center of rotation/mass
linear velocity and angular velocity
And would output the following:
Whether there is a collision
After how much time the collision occurred
In which point in space the collision occurred
Thanks in advance for your help.
The commonly used discrete collision detection would check the triangles of each shape for collision, over successive discrete points in time. While straightforward to compute, it could miss a fast moving object hitting another one, due to the collision happening between discrete points in time tested.
Continuous collision detection would first compute the volumes traced by each triangle over an infinity of time. For a triangle moving at constant speed and without rotation, this volume could look like a triangular prism. CCD would then check for collision between the volumes, and finally trace back if and at what time the triangles actually shared the same space.
When angular velocity is introduced, the volume traced by each triangle no longer looks like a prism. It might look more like the shape of a screw, like a strand of DNA, or some other non-trivial shapes you might get by rotating a triangle around some arbitrary axis while dragging it linearly. Computing the shape of such volume is no easy feat.
One approach might first compute the sphere that contains an entire tetrahedron when it is rotating at the given angular velocity vector, if it was not moving linearly. You can compute a rotation circle for each vertex, and derive the sphere from that. Given a sphere, we can now approximate the extruded CCD volume as a cylinder with the radius of the sphere and progressing along the linear velocity vector. Finding collisions of such cylinders gets us a first approximation for an area to search for collisions in.
A second, complementary approach might attempt to approximate the actual volume traced by each triangle by breaking it down into small, almost-prismatic sub-volumes. It would take the triangle positions at two increments of time, and add surfaces generated by tracing the triangle vertices at those moments. It's an approximation because it connects a straight line rather than an actual curve. For the approximation to avoid gross errors, the duration between each successive moments needs to be short enough such that the triangle only completes a small fraction of a rotation. The duration can be derived from the angular velocity.
The second approach creates many more polygons! You can use the first approach to limit the search volume, and then use the second to get higher precision.
If you're solving this for a game engine, you might find the precision of above sufficient (I would still shudder at the computational cost). If, rather, you're writing a CAD program or working on your thesis, you might find it less than satisfying. In the latter case, you might want to refine the second approach, perhaps by a better geometric description of the volume occupied by a turning, moving triangle -- when limited to a small turn angle.
I have spent quite a lot of time wondering about geometry problems like this one, and it seems like accurate solutions, despite their simple statements, are way too complicated to be practical, even for analogous 2D cases.
But intuitively I see that such solutions do exist when you consider linear translation velocities and linear angular velocities. Don't think you'll find the answer on the web or in any book because what we're talking about here are special, yet complex, cases. An iterative solution is probably what you want anyway -- the rest of the world is satisfied with those, so why shouldn't you be?
If you were trying to collide non-rotating tetrahedra, I'd suggest a taking the Minkowski sum and performing a ray check, but that won't work with rotation.
The best I can come up with is to perform swept-sphere collision using their bounding spheres to give you a range of times to check using bisection or what-have-you.
Here's an outline of a closed-form mathematical approach. Each element of this will be easy to express individually, and the final combination of these would be a closed form expression if one could ever write it out:
1) The equation of motion for each point of the tetrahedra is fairly simple in it's own coordinate system. The motion of the center of mass (CM) will just move smoothly along a straight line and the corner points will rotate around an axis through the CM, assumed to be the z-axis here, so the equation for each corner point (parameterized by time, t) is p = vt + x + r(sin(wt+s)i + cos(wt + s)j ), where v is the vector velocity of the center of mass; r is the radius of the projection onto the x-y plane; i, j, and k are the x, y and z unit vectors; and x and s account for the starting position and phase of rotation at t=0.
2) Note that each object has it's own coordinate system to easily represent the motion, but to compare them you'll need to rotate each into a common coordinate system, which may as well be the coordinate system of the screen. (Note though that the different coordinate systems are fixed in space and not traveling with the tetrahedra.) So determine the rotation matrices and apply them to each trajectory (i.e. the points and CM of each of the tetrahedra).
3) Now you have an equation for each trajectory all within the same coordinate system and you need to find the times of the intersections. This can be found by testing whether any of the line segments from the points to the CM of a tetrahedron intersects the any of the triangles of another. This also has a closed-form expression, as can be found here.
Layering these steps will make for terribly ugly equations, but it wouldn't be hard to solve them computationally (although with the rotation of the tetrahedra you need to be sure not to get stuck in a local minimum). Another option might be to plug it into something like Mathematica to do the cranking for you. (Not all problems have easy answers.)
Sorry I'm not a math boff and have no idea what the correct terminology is. Hope my poor terms don't hide my meaning too much.
Pick some arbitrary timestep.
Compute the bounds of each shape in two dimensions perpendicular to the axis it is moving on for the timestep.
For a timestep:
If the shaft of those bounds for any two objects intersect, half timestep and start recurse in.
A kind of binary search of increasingly fine precision to discover the point at which a finite intersection occurs.
Your problem can be cast into a linear programming problem and solved exactly.
First, suppose (p0,p1,p2,p3) are the vertexes at time t0, and (q0,q1,q2,q3) are the vertexes at time t1 for the first tetrahedron, then in 4d space-time, they fill the following 4d closed volume
V = { (r,t) | (r,t) = a0 (p0,t0) + … + a3 (p3,t0) + b0 (q0,t1) + … + b3 (q3,t1) }
Here the a0...a3 and b0…b3 parameters are in the interval [0,1] and sum to 1:
a0+a1+a2+a3+b0+b1+b2+b3=1
The second tetrahedron is similarly a convex polygon (add a ‘ to everything above to define V’ the 4d volume for that moving tetrahedron.
Now the intersection of two convex polygon is a convex polygon. The first time this happens would satisfy the following linear programming problem:
If (p0,p1,p2,p3) moves to (q0,q1,q2,q3)
and (p0’,p1’,p2’,p3’) moves to (q0’,q1’,q2’,q3’)
then the first time of intersection happens at points/times (r,t):
Minimize t0*(a0+a1+a2+a3)+t1*(b0+b1+b2+b3) subject to
0 <= ak <=1, 0<=bk <=1, 0 <= ak’ <=1, 0<=bk’ <=1, k=0..4
a0*(p0,t0) + … + a3*(p3,t0) + b0*(q0,t1) + … + b3*(q3,t1)
= a0’*(p0’,t0) + … + a3’*(p3’,t0) + b0’*(q0’,t1) + … + b3’*(q3’,t1)
The last is actually 4 equations, one for each dimension of (r,t).
This is a total of 20 linear constraints of the 16 values ak,bk,ak', and bk'.
If there is a solution, then
(r,t)= a0*(p0,t0) + … + a3*(p3,t0) + b0*(q0,t1) + … + b3*(q3,t1)
Is a point of first intersection. Otherwise they do not intersect.
Thought about this in the past but lost interest... The best way to go about solving it would be to abstract out one object.
Make a coordinate system where the first tetrahedron is the center (barycentric coords or a skewed system with one point as the origin) and abstract out the rotation by making the other tetrahedron rotate around the center. This should give you parametric equations if you make the rotation times time.
Add the movement of the center of mass towards the first and its spin and you have a set of equations for movement relative to the first (distance).
Solve for t where the distance equals zero.
Obviously with this method the more effects you add (like wind resistance) the messier the equations get buts its still probably the simplest (almost every other collision technique uses this method of abstraction). The biggest problem is if you add any effects that have feedback with no analytical solution the whole equation becomes unsolvable.
Note: If you go the route of of a skewed system watch out for pitfalls with distance. You must be in the right octant! This method favors vectors and quaternions though, while the barycentric coords favors matrices. So pick whichever your system uses most effectively.