How to find the transformation between two IMU measurements? - quaternions

I am looking at the Clubs Dataset https://clubs.github.io/ for some research into multiway registration of point clouds. I am initially trying to use the ICP registration method sequentially, adding one point cloud at a time.
The dataset provides pose parameters for the various RGB views of the object:
1525694104,-0.0913515162993169,-0.16815189159691318,0.4504956847817425,0.4591556084159897,-0.7817619820248951,-0.3682132612946675,0.20601769355692517
1525694157,-0.22510740390250225,-0.32514596548025265,0.45221561140129063,0.2388698281592328,-0.8750788198918591,-0.4081451880445991,0.10293936130543749
1525694174,-0.4179094161019803,-0.39403349319958664,0.4522321523188167,-0.004021371419719342,0.9070013543525104,0.4210736433584828,0.005455788205342825
The columns are: filename, translation along the three axes, and a quaternion for the rotation.
I converted the images into point clouds, and to align them I need a good estimate of the transformation. Given the measurements at two poses, how would I find the transformation between them? I would use that as the initial estimate for my ICP registration algorithm.
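One way to build that initial estimate is to compose the two absolute poses into a relative one. Below is a minimal sketch using NumPy and SciPy; it assumes the four quaternion components are stored scalar-first (w, x, y, z) and that each row gives the camera-to-world pose. Both assumptions should be checked against the dataset's documentation.

```python
# Minimal sketch: relative transform between two dataset rows of the form
# (filename, tx, ty, tz, qw, qx, qy, qz). The scalar-first quaternion order
# and the camera-to-world interpretation are assumptions -- verify them
# against the Clubs dataset documentation.
import numpy as np
from scipy.spatial.transform import Rotation as R

def pose_to_matrix(t, q_wxyz):
    """4x4 homogeneous transform from translation + scalar-first quaternion."""
    qw, qx, qy, qz = q_wxyz
    T = np.eye(4)
    T[:3, :3] = R.from_quat([qx, qy, qz, qw]).as_matrix()  # SciPy wants xyzw
    T[:3, 3] = t
    return T

# Two rows from the question (values truncated for readability).
T_a = pose_to_matrix([-0.09135, -0.16815, 0.45050],
                     [0.45916, -0.78176, -0.36821, 0.20602])
T_b = pose_to_matrix([-0.22511, -0.32515, 0.45222],
                     [0.23887, -0.87508, -0.40815, 0.10294])

# Transform that maps points from pose A's camera frame into pose B's
# camera frame; use it as the initial guess when registering cloud A to B.
T_ab = np.linalg.inv(T_b) @ T_a
print(T_ab)
```

If ICP diverges from this guess, it is worth checking whether the stored poses are actually world-to-camera, in which case the composition is inverted.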

Related

How can I get an approximate upper body volume from the Kinect skeleton?

I want to build a fitting room app using the Kinect. Since I need to estimate the player's clothing size (S, M, L, XL), I must get an approximation of the player's upper-body mass using only the skeleton (not the depth data). I don't need a very precise calculation.
Examine the lengths of the bones by calculating the distances between the relevant upper-body joints:
SHOULDER_LEFT
SHOULDER_CENTER
SHOULDER_RIGHT
SPINE
HIP_LEFT
HIP_CENTER
HIP_RIGHT
For example, two of the most relevant features to calculate are likely related to the user's height - the distance between SHOULDER_CENTER and SPINE joints, and the distance between SPINE and HIP_CENTER joints.
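A minimal sketch of such distance features, assuming you already have the joint positions as (x, y, z) coordinates in meters from the skeleton stream (the dictionary keys below simply mirror the joint list above and are not an SDK API):

```python
# Sketch: deriving simple size features from Kinect skeleton joints.
# Assumes `joints` maps joint names to (x, y, z) positions in meters.
import math

def dist(a, b):
    """Euclidean distance between two 3D joint positions."""
    return math.sqrt(sum((ai - bi) ** 2 for ai, bi in zip(a, b)))

def torso_features(joints):
    """A few length features that correlate with upper-body size."""
    return {
        "shoulder_width": dist(joints["SHOULDER_LEFT"], joints["SHOULDER_RIGHT"]),
        "upper_torso":    dist(joints["SHOULDER_CENTER"], joints["SPINE"]),
        "lower_torso":    dist(joints["SPINE"], joints["HIP_CENTER"]),
        "hip_width":      dist(joints["HIP_LEFT"], joints["HIP_RIGHT"]),
    }

# Dummy example so the sketch runs stand-alone.
example = {
    "SHOULDER_LEFT": (-0.20, 0.45, 2.0), "SHOULDER_RIGHT": (0.20, 0.45, 2.0),
    "SHOULDER_CENTER": (0.0, 0.45, 2.0), "SPINE": (0.0, 0.20, 2.0),
    "HIP_LEFT": (-0.15, 0.0, 2.0), "HIP_CENTER": (0.0, 0.0, 2.0),
    "HIP_RIGHT": (0.15, 0.0, 2.0),
}
print(torso_features(example))
```

These features could then feed a simple classifier (e.g. nearest neighbour over the labelled Kinect Studio recordings) to predict S/M/L/XL.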
I suggest using Kinect Studio to store recordings of users and classify each recording according to the user's clothing size. With this data, you should be able to iterate on an algorithm (assuming it's feasible to approximate this accurately enough using only the skeleton data).
(As a side note, to do this more accurately, you'll probably need the depth data and 3D scanning. For example, there is an existing company called Styku that has a related Kinect product that does 3D body scanning.)

how to reconstruct scene from different views' point clouds

I am facing a problem in 3D reconstruction, as I am new to this field. I have depth maps (point clouds) from several different views, and I want to use them to reconstruct the scene, similar to the result produced by KinectFusion. Is there any paper or source code that addresses this problem, or any ideas on how to approach it?
PS: the point clouds are stored as files with (x, y, z) coordinates; you can check here to get the data.
Thank you very much.
As you have stated that you are new to this field, I shall attempt to keep this high level. Please do comment if there is something that is not clear.
The pipeline you refer to has three key stages:
Integration
Rendering
Pose Estimation
The Integration stage takes the unprojected points from a Depth Map (Kinect image) under the current pose and "integrates" them into a spatial data structure (a Voxel Volume such as a Signed Distance Function or a hierarchical structure like an Octree), often by maintaining per Voxel running averages.
The Rendering stage takes the inverse pose for the current frame and produces an image of the visible parts of the model currently in view. For the common volumetric representations this is achieved by Raycasting. The output of this stage provides the points of the model to which the next live frame is registered (the next stage).
The Pose Estimation stage registers the previously extracted model points to those of the live frame. This is commonly achieved by the Iterative Closest Point algorithm.
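To make the per-voxel running average of the Integration stage concrete, here is a toy NumPy sketch of the TSDF update rule. The grid size and truncation distance are arbitrary assumptions, and real systems (KinectFusion, voxel hashing) run this on the GPU for every pixel of every depth map.

```python
# Toy sketch of the Integration stage: per-voxel running average of a
# truncated signed distance function (TSDF).
import numpy as np

N = 128                      # voxels per side (assumed resolution)
trunc = 0.05                 # truncation distance in meters (assumed)
tsdf = np.zeros((N, N, N), dtype=np.float32)     # running average of SDF
weight = np.zeros((N, N, N), dtype=np.float32)   # accumulated weights

def integrate(voxel_idx, signed_dist, w_new=1.0):
    """Fold one observed signed distance into a voxel's running average."""
    d = np.clip(signed_dist, -trunc, trunc)           # truncate the SDF
    w_old = weight[voxel_idx]
    tsdf[voxel_idx] = (tsdf[voxel_idx] * w_old + d * w_new) / (w_old + w_new)
    weight[voxel_idx] = w_old + w_new

# In a full pipeline, each depth-map pixel is unprojected with the current
# pose estimate, the voxels near that point are found, and integrate() is
# called with the distance between voxel centre and observed surface.
integrate((64, 64, 64), 0.01)
integrate((64, 64, 64), 0.03)
print(tsdf[64, 64, 64], weight[64, 64, 64])   # -> averaged SDF, weight 2
```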
With regards to pertinent literature, I would advise the following papers as a starting point.
KinectFusion: Real-Time Dense Surface Mapping and Tracking
Real-time 3D Reconstruction at Scale using Voxel Hashing
Very High Frame Rate Volumetric Integration of Depth Images on Mobile Devices

Mesh to mesh. Mesh fitting (averaging). Mesh comparison.

I have three point clouds that represent one surface. I want to use these point clouds to construct triangular meshes and then use a mesh to represent the surface. Each point cloud is collected in a different way, so their representations of the surface differ; for example, some represent the surface with a smaller "error". My questions are:
(1) What's the best way to evaluate such mesh-to-surface "error"?
(2) Is there a mature/reliable way to convert a point cloud to a triangular mesh? I found some software that does this, but most of it requires extensive manual adjustment.
(3) After the conversion I get three meshes. I want to fit a fourth mesh, namely Mesh4, to the three meshes and obtain an "average" mesh of the three. Then I can use Mesh4 as a representation of the underlying surface. What is this "mesh to mesh" fitting called, and is it a mature technique?
Thank you very much for your time!
Please find below my answers to points 1 and 2:
As a metric for mesh-to-surface error you can use the Hausdorff distance; for example, you could use libigl to compare two meshes (a point-sample approximation is sketched below).
To obtain a mesh from a point cloud, have a look at PCL and its surface reconstruction algorithms (e.g. Poisson reconstruction or greedy projection triangulation).
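For point 1, here is a minimal sketch of a sampled approximation of the symmetric Hausdorff distance using SciPy KD-trees. It assumes both surfaces are represented by dense point samples (mesh vertices or points sampled on the surface); libigl provides a proper mesh-based Hausdorff distance routine if you need the exact value.

```python
# Sketch: approximate symmetric Hausdorff distance between two surfaces
# represented by dense point samples. This is only an approximation of the
# mesh-based quantity, for illustration.
import numpy as np
from scipy.spatial import cKDTree

def hausdorff_approx(points_a, points_b):
    """Max over both directions of nearest-neighbour distances."""
    d_ab, _ = cKDTree(points_b).query(points_a)   # A -> B distances
    d_ba, _ = cKDTree(points_a).query(points_b)   # B -> A distances
    return max(d_ab.max(), d_ba.max())

# Example with random samples of two nearby planes.
rng = np.random.default_rng(0)
a = np.c_[rng.uniform(size=(1000, 2)), np.zeros(1000)]        # z = 0
b = np.c_[rng.uniform(size=(1000, 2)), np.full(1000, 0.01)]   # z = 0.01
print(hausdorff_approx(a, b))   # roughly 0.01 plus sampling error
```

The denser the sampling, the closer this gets to the true surface-to-surface Hausdorff distance.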

Registration of 3D surface mesh to 3D image volume

I have an accurate surface mesh model of an implant that I'd like to rigidly align, as accurately as possible, to a computed tomography scan (scalar volume) containing the exact same object. I've tried detecting edges in the image volume with a Canny filter and running an iterative closest point alignment between the edges and the vertices of the mesh, but it isn't working. I also tried voxelizing the mesh and using image volume alignment methods (Mattes mutual information), which yields very inconsistent results.
Any other suggestions?
Thank you.
Generally, a mesh and a volume are two different data structures; you have to either convert the mesh to a volume or the volume to a mesh.
I would recommend doing a segmentation of the volume data first, to segment out the structures you want to register. A Canny filter alone might not be enough to delineate the border clearly. I would recommend the level-set method and active contour models; both are frequently used in medical image processing, and Professor Chunming Li's work is a good reference for these two topics.
After you have segmented the volume data, you should be able to reconstruct a mesh model of that volume with marching cubes. The vertices of the two meshes could then be registered with a simple ICP algorithm (see the sketch below).
However, this is a workaround rather than true registration, and the segmentation step usually takes a lot of time.
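As a rough illustration of that workflow, here is a hedged sketch using scikit-image's marching cubes and Open3D's ICP. The function names, the iso-value, and the correspondence distance are placeholder assumptions, and the simple iso-surface extraction stands in for a proper level-set or active-contour segmentation.

```python
# Sketch of the suggested workaround: extract a surface from the CT volume,
# then rigidly register the implant mesh vertices to it with ICP.
import numpy as np
from skimage import measure
import open3d as o3d

def volume_to_points(ct_volume, iso_value, voxel_spacing):
    """Extract an iso-surface from the volume, returned as Nx3 points."""
    verts, faces, normals, _ = measure.marching_cubes(
        ct_volume, level=iso_value, spacing=voxel_spacing)
    return verts  # in physical coordinates (e.g. mm)

def register_mesh_to_volume(mesh_vertices, volume_points, init=None):
    """Rigid ICP between the implant mesh vertices and the CT iso-surface."""
    if init is None:
        init = np.eye(4)
    src = o3d.geometry.PointCloud()
    src.points = o3d.utility.Vector3dVector(np.asarray(mesh_vertices, float))
    dst = o3d.geometry.PointCloud()
    dst.points = o3d.utility.Vector3dVector(np.asarray(volume_points, float))
    result = o3d.pipelines.registration.registration_icp(
        src, dst, 5.0, init,   # 5.0 mm correspondence distance (assumed)
        o3d.pipelines.registration.TransformationEstimationPointToPoint())
    return result.transformation
```

A reasonable initial guess (e.g. from aligning centroids and principal axes) still matters; ICP alone will not recover a large misalignment.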

Is it possible to recognize all objects in a room with Microsoft Kinect?

I have a project where I have to recognize an entire room so that I can calculate the distances between objects (mainly big ones, e.g. a bed, a table, etc.) and a person in that room. Is something like that possible using Microsoft Kinect?
Thank you!
Kinect provides you with the following:
Depth Stream
Color Stream
Skeleton information
It's up to you how you use this data.
To answer your question: the official Microsoft Kinect SDK doesn't provide shape detection out of the box. But it does provide skeleton data and face tracking, with which you can detect the distance of a user from the Kinect.
Also, by mapping the color stream to the depth stream, you can detect how far a particular pixel is from the Kinect (a rough sketch of this is given after this answer). In your implementation, if the different objects have unique characteristics such as color, shape, and size, you can probably detect them and also measure their distances.
OpenCV is one of the libraries I use for computer vision tasks.
Again, it's up to you how you use this data.
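As a small illustration of the "how far is a pixel" part, here is a sketch that back-projects a depth pixel into a 3D point and takes its norm. The intrinsic parameters are placeholder values, not official Kinect constants; in practice you would use the SDK's coordinate mapper or calibration data.

```python
# Sketch: turning a depth-map pixel into a metric distance, given the
# depth value (in meters) and approximate depth-camera intrinsics.
import math

FX, FY = 571.0, 571.0     # assumed focal lengths in pixels
CX, CY = 320.0, 240.0     # assumed principal point for a 640x480 depth map

def unproject(u, v, depth_m):
    """Back-project pixel (u, v) with depth into a 3D camera-frame point."""
    x = (u - CX) * depth_m / FX
    y = (v - CY) * depth_m / FY
    return (x, y, depth_m)

def distance_from_camera(u, v, depth_m):
    """Euclidean distance of the pixel's 3D point from the camera origin."""
    x, y, z = unproject(u, v, depth_m)
    return math.sqrt(x * x + y * y + z * z)

print(distance_from_camera(400, 300, 2.5))   # e.g. a pixel 2.5 m deep
```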
The Kinect camera provides depth and, consequently, 3D information (a point cloud) about matte objects in the range of 0.5-10 meters. With this information it is possible to segment out the floor of the room (by fitting a plane) and possibly the walls and the ceiling. This step is important since these surfaces often connect separate objects, making them one big object.
The remaining parts of the point cloud can be segmented by depth if they don't physically touch each other. Using color, one can separate the objects even further. Note that we implicitly define an object as a spatially dense and color-consistent entity, while other definitions are also possible.
As soon as you have your objects segmented, you can measure the distances between your segments, analyse their shape, recognize artifacts or humans, etc. To the best of my knowledge, however, the Skeleton library can only recognize humans after they have moved for a few seconds. As an example, a simple depth map can be broken into a few segments using depth alone, without color information.
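A hedged sketch of that pipeline with Open3D might look like the following; `segment_room` and all threshold values are illustrative assumptions rather than a reference implementation.

```python
# Sketch: fit a plane to remove the floor, cluster what remains by spatial
# proximity, then measure distances between cluster centroids.
import numpy as np
import open3d as o3d

def segment_room(pcd):
    # 1. Fit the dominant plane (the floor) with RANSAC and remove it.
    plane_model, floor_idx = pcd.segment_plane(distance_threshold=0.03,
                                               ransac_n=3,
                                               num_iterations=1000)
    objects = pcd.select_by_index(floor_idx, invert=True)

    # 2. Cluster the remaining points into objects that don't touch.
    labels = np.array(objects.cluster_dbscan(eps=0.1, min_points=50))

    # 3. One centroid per cluster; distances between centroids give a rough
    #    object-to-object (or object-to-person) distance.
    pts = np.asarray(objects.points)
    return [pts[labels == k].mean(axis=0) for k in range(labels.max() + 1)]

# centroids = segment_room(pcd)
# d = np.linalg.norm(centroids[0] - centroids[1])   # distance in meters
```

Walls and the ceiling can be removed the same way by repeatedly fitting and discarding large planes before clustering.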