Using MediaPipe (Face Mesh) to adjust a 3D model from face landmarks

For months we have been researching how to generate a 3D mesh from a face picture. We tried many options before we discovered MediaPipe, and we are very happy with it.
We have not actually tried it yet, as our devs mostly work in C++ and need to brush up on Python, but we are getting there.
Anyway, from what I have read, MediaPipe can extract the depth and face landmarks from a face image, which is what we need.
We are also planning to prepare a 3D mesh that will adjust or "morph" based on the landmarks extracted by MediaPipe. However, we are confused about what this 3D mesh should initially look like: where should its vertices start from, and what should their coordinates be based on? For example, if MediaPipe gives us an X, Y and Z coordinate for the tip of the nose, what is the origin and reference frame of those coordinates? We need to know which 3D space the landmarks live in so we can match it with the initial pose of the 3D mesh we need to prepare.
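A note on the coordinate space, as far as the MediaPipe Face Mesh documentation describes it: x and y are normalized to [0.0, 1.0] by the image width and height, and z is a relative depth whose origin is the center of the head, with smaller values closer to the camera and a magnitude on roughly the same scale as x. The sketch below only illustrates that convention; the Landmark type and the function name are hypothetical stand-ins, not MediaPipe's own classes.

```python
from typing import NamedTuple, Tuple


class Landmark(NamedTuple):
    """Hypothetical stand-in for one face landmark (not MediaPipe's type)."""
    x: float  # normalized by image width
    y: float  # normalized by image height
    z: float  # relative depth, roughly on the same scale as x


def landmark_to_pixels(lm: Landmark, image_width: int,
                       image_height: int) -> Tuple[float, float, float]:
    """Scale a normalized landmark into pixel units.

    Assumption: z is scaled by the image width because it is documented
    as using roughly the same scale as x.
    """
    return (lm.x * image_width, lm.y * image_height, lm.z * image_width)


# Example with made-up values for a nose-tip landmark on a 640x480 image.
nose_tip = Landmark(x=0.5, y=0.4, z=-0.05)
print(landmark_to_pixels(nose_tip, 640, 480))
```

Under that convention, a base mesh would have to be brought into the same image-aligned frame (or the landmarks into the mesh's frame) before any morphing.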

Related

2D shape detection in 3D pointcloud

I'm working on a project to detect the position and orientation of a paper plane.
To collect the data, I'm using an Intel Realsense D435, which gives me accurate, clean depth data to work with.
Now I arrived at the problem of detecting the 2D paper plane silhouette from the 3D point cloud data.
Here is an example of the data (I put the plane on a stick for testing, this will not be in the final implementation):
https://i.stack.imgur.com/EHaEr.gif
Basically, I have:
A 3D point cloud with points on the plane
A 2D shape of the plane
I would like to calculate what rotations/translations are needed to align the 2D shape to the 3D point cloud as accurately as possible.
I've searched online, but couldn't find a good way to do it. One way would be to use Iterative Closest Point (ICP) to first take a calibration pointcloud of the plane in a known orientation, and align it with the current orientation. But from what I've heard, ICP doesn't perform well if the pointclouds aren't kind of already closely aligned at the start.
Any help is appreciated! Coding language doesn't matter.
Does your 3d point cloud have outliers? How many in what way?
How did you use ICP exactly?
One way would be using ICP, with a hand-crafted initial guess using
pcl::transformPointCloud (*cloud_in, *cloud_icp, transformation_matrix);
(to mitigate the problem that ICP needs to be close to work.)
What you actually want is the plane model that describes the position and orientation of your point cloud, right?
A good estimator of your underlying function can be found with: pcl::ransac
pcl::ransac model consensus
You can then get the computedModel coefficients.
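To make the idea concrete, here is a minimal pure-Python sketch of RANSAC plane fitting. It is not PCL's implementation; the function names, the threshold, the iteration count and the synthetic data are all assumptions chosen for illustration.

```python
import math
import random


def plane_from_points(p, q, r):
    """Plane (a, b, c, d) with unit normal through three points, or None if collinear."""
    u = [q[i] - p[i] for i in range(3)]
    v = [r[i] - p[i] for i in range(3)]
    n = [u[1] * v[2] - u[2] * v[1],
         u[2] * v[0] - u[0] * v[2],
         u[0] * v[1] - u[1] * v[0]]
    norm = math.sqrt(sum(c * c for c in n))
    if norm < 1e-12:
        return None
    n = [c / norm for c in n]
    d = -sum(n[i] * p[i] for i in range(3))
    return (n[0], n[1], n[2], d)


def ransac_plane(points, threshold=0.01, iterations=200, seed=0):
    """Return the plane with the most inliers among randomly sampled candidates."""
    rng = random.Random(seed)
    best_model, best_inliers = None, []
    for _ in range(iterations):
        model = plane_from_points(*rng.sample(points, 3))
        if model is None:
            continue
        a, b, c, d = model
        # A point is an inlier if its distance to the plane is below the threshold.
        inliers = [pt for pt in points
                   if abs(a * pt[0] + b * pt[1] + c * pt[2] + d) < threshold]
        if len(inliers) > len(best_inliers):
            best_model, best_inliers = model, inliers
    return best_model, best_inliers


# Synthetic data: a 10x10 grid on the plane z = 0.5 plus three outliers.
points = [(0.1 * i, 0.1 * j, 0.5) for i in range(10) for j in range(10)]
points += [(0.3, 0.3, 2.0), (0.7, 0.1, -1.0), (0.2, 0.9, 1.4)]
model, inliers = ransac_plane(points)
print(model, len(inliers))
```

The returned coefficients (a, b, c, d) describe the plane a*x + b*y + c*z + d = 0, so (a, b, c) is the plane normal.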
Now finding the correct transformation is just: How to calculate transformation matrix from one plane to another?
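As a rough sketch of the rotational part of that transformation, the rotation that takes one plane normal onto another can be built with Rodrigues' formula. The helper names are hypothetical. Note that the normals alone fix neither the translation nor the rotation within the plane, so those would still have to come from something else (for example the centroids and a subsequent ICP refinement).

```python
import math


def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def normalize(v):
    length = math.sqrt(dot(v, v))
    return tuple(x / length for x in v)


def rotation_between(a, b):
    """3x3 rotation matrix R such that R applied to unit(a) gives unit(b)."""
    a, b = normalize(a), normalize(b)
    v = cross(a, b)
    c = dot(a, b)
    if c < -1.0 + 1e-12:
        # Opposite normals: rotate 180 degrees about any axis perpendicular to a.
        helper = (1.0, 0.0, 0.0) if abs(a[0]) < 0.9 else (0.0, 1.0, 0.0)
        axis = normalize(cross(a, helper))
        return [[2.0 * axis[i] * axis[j] - (1.0 if i == j else 0.0)
                 for j in range(3)] for i in range(3)]
    k = 1.0 / (1.0 + c)
    vx = [[0.0, -v[2], v[1]],
          [v[2], 0.0, -v[0]],
          [-v[1], v[0], 0.0]]
    # Rodrigues: R = I + [v]x + [v]x^2 / (1 + c)
    return [[(1.0 if i == j else 0.0) + vx[i][j]
             + k * sum(vx[i][m] * vx[m][j] for m in range(3))
             for j in range(3)] for i in range(3)]


def apply(R, v):
    return tuple(sum(R[i][j] * v[j] for j in range(3)) for i in range(3))


# Example: rotate the normal of the z = const plane onto the y axis.
R = rotation_between((0.0, 0.0, 1.0), (0.0, 1.0, 0.0))
print(apply(R, (0.0, 0.0, 1.0)))
```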

Compute road plane normal with an embedded camera

I am developing some computer vision algorithms for vehicle applications.
I have run into a problem and some help would be appreciated.
Let's say we have a calibrated camera attached to a vehicle, which captures a frame of the road ahead of the vehicle:
Initial frame
We apply a first filter to keep only the road markers and return a binary image:
Filtered image
Once the road lanes are separated, we can approximate the lanes with linear expressions and detect the vanishing point:
Objective
But what I am looking to recover is the equation of the normal n in the image, without any prior knowledge of the rotation matrix and the translation vector. I do assume, however, that L1, L2 and L3 lie on the same plane.
In 3D space the problem is quite simple. In the 2D image plane it is more complex, since the camera's projective transformation does not preserve angles. I am not able to find a way to work out the equation of the normal.
Do you have any idea about how I could compute the normal?
Thanks,
Pm
No can do, you need a minimum of two independent vanishing points (i.e. vanishing points representing the images of the points at infinity of two different pencils of parallel lines).
If you have them, the answer is trivial: express the image positions of said vanishing points in homogeneous coordinates. Then their cross product is equal (up to scale) to the normal vector of the 3D plane said pencils define, decomposed in camera coordinates.
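A small sketch of this computation, under the assumption of a pinhole camera with known intrinsics (fx, fy, cx, cy): my reading is that "homogeneous coordinates" here means calibrated ones, so each vanishing point is first back-projected to a 3D direction through the inverse calibration matrix. The function names and the numeric values are made up for illustration.

```python
import math


def backproject(pixel, fx, fy, cx, cy):
    """Direction in camera coordinates of the ray through a pixel (K^-1 applied to (u, v, 1))."""
    u, v = pixel
    return ((u - cx) / fx, (v - cy) / fy, 1.0)


def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def plane_normal_from_vanishing_points(vp1, vp2, fx, fy, cx, cy):
    """Unit normal (up to sign) of the plane containing both line directions."""
    d1 = backproject(vp1, fx, fy, cx, cy)
    d2 = backproject(vp2, fx, fy, cx, cy)
    n = cross(d1, d2)
    length = math.sqrt(sum(c * c for c in n))
    return tuple(c / length for c in n)


# Hypothetical intrinsics and two vanishing points lying on the same image row.
fx = fy = 800.0
cx, cy = 320.0, 240.0
normal = plane_normal_from_vanishing_points((320.0, 240.0), (1120.0, 240.0),
                                            fx, fy, cx, cy)
print(normal)
```

In this example both vanishing points lie on the row through the principal point, so the recovered normal points along the camera's y axis, which is what one would expect for a level camera above a flat road.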
Your information is insufficient, as the others have stated. If your data comes from a video, a common way to get the road ground plane is to take two or more images, compute the associated homography, then decompose the homography matrix into the surface normal and the relative camera motion. You can do the decomposition with OpenCV's decomposeHomographyMat method. You can compute the homography from four or more point correspondences using OpenCV's findHomography method. If these correspondences are hard to determine, it is also possible to use a combination of point and line correspondences (paper); however, this is not implemented in OpenCV.
You do not have sufficient information in the example you provide.
If you are wondering "which way is up", one thing you might be able to do is detect the horizon line l. If K is the calibration matrix, then K^T l will give you the plane normal in 3D relative to your camera. (The general equation for the backprojection of a line l in the image to a plane E through the center of projection is E = P^T l, with a 3x4 projection matrix P.)
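The horizon-line formula can be sketched in a few lines. The intrinsics and the horizon line below are hypothetical values; the line is given in homogeneous form (a, b, c), meaning a*u + b*v + c = 0 in pixel coordinates.

```python
import math


def normal_from_horizon(line, K):
    """Plane normal n = K^T l (normalized), for a horizon line l and calibration matrix K."""
    # (K^T l)[c] = sum over rows r of K[r][c] * l[r]
    n = [sum(K[r][c] * line[r] for r in range(3)) for c in range(3)]
    length = math.sqrt(sum(x * x for x in n))
    return tuple(x / length for x in n)


# Hypothetical calibration matrix and a horizontal horizon at image row v = 240.
K = [[800.0, 0.0, 320.0],
     [0.0, 800.0, 240.0],
     [0.0, 0.0, 1.0]]
horizon = (0.0, 1.0, -240.0)
print(normal_from_horizon(horizon, K))
```

With the horizon passing through the principal point row, the normal comes out along the camera's y axis; a tilted or shifted horizon would give a correspondingly tilted normal.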
A better alternative might be to establish a homography to rectify the ground-plane. To do so, however, you need at least four non-collinear points with known coordinates - or four lines, no three of which may be parallel.

Vertex Color to UV Map -> Remesh -> Texture mapping

I have a polygon mesh of a room in high resolution, and I want to extract vertices color information and map them as a UV map, so I can generate a texture atlas of the room.
After that, I want to remesh the model in order to reduce the number of polygons and map the hi-res texture onto the new mesh in lower resolution.
So far I've found this link to do it in Blender, but I would like to do it programmatically. Do you know of any library/code that could help me with my task?
I guess first of all I have to segment the model (normals criterion could be helpful) and then cut each mesh segment, so only then I am able to parameterize it. About parameterization, LSCM seems to provide good results for simple models. Once having available the texture atlas, I think the problem becomes a simple task of texture mapping.
My main problem is segmentation and mesh cutting. I'm using CGAL library for that purpose, but the algorithm is too simple to cut complex shapes. Any hint about a better segmentation/cutting algorithm that performs well for room-sized models?
EDIT:
The mesh consists of a room reconstructed with an RGB-D camera, with 2.5 million vertices and 4.7 million faces. The point is to extract a high-resolution texture, remesh the model to reduce the number of polygons, and then remap the texture onto it. It's not a closed mesh, and there are holes due to the reconstruction, so I'm wondering whether my task is possible at all.
I attach a capture of the mesh.
I would suggest using the following 4-steps procedure:
Step 1: remesh
For this type of mesh that comes from computer vision, you need a remesher that is robust to holes, overlaps, skinny triangles etc... You can use my GEOGRAM software [1]. Use the following command:
vorpalite my_input.obj my_output.obj pre=false post=false pts=30000
where 30000 is the number of desired points (adapt it to the complexity of your input). Note: I am deactivating pre- and post-processing (pre=false post=false), which may remove too many parts of the mesh for this type of mesh.
Step 2: segment the remesh
My favourite method is "Variational Shape Approximation" [3]. I like it because it is simple to implement and gives reasonable results in most cases.
Step 3: parameterize
Besides my LSCM method, you may use ABF++ [4], which we developed afterwards and which gives much better results in most cases. You may also try ARAP [5].
Step 4: bake the texture
Once the simplified mesh is parameterized, you need to copy the colors from the original mesh onto the new one. This means determining for each pixel of the texture where it goes in 3D, and finding the nearest point in the original 3D mesh.
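A minimal sketch of the baking step for a single texel, assuming the simplified mesh is given as triangles with both UV and 3D coordinates. As a simplification, it looks up the nearest original vertex by brute force instead of the nearest point on the original surface; the function names and the tiny data set are hypothetical.

```python
def barycentric(p, a, b, c):
    """Barycentric coordinates of the 2D point p in the triangle (a, b, c)."""
    det = (b[1] - c[1]) * (a[0] - c[0]) + (c[0] - b[0]) * (a[1] - c[1])
    w0 = ((b[1] - c[1]) * (p[0] - c[0]) + (c[0] - b[0]) * (p[1] - c[1])) / det
    w1 = ((c[1] - a[1]) * (p[0] - c[0]) + (a[0] - c[0]) * (p[1] - c[1])) / det
    return w0, w1, 1.0 - w0 - w1


def bake_texel(uv, tri_uv, tri_xyz, src_points, src_colors):
    """Color for one texel, or None if the texel lies outside the UV triangle."""
    w = barycentric(uv, *tri_uv)
    if min(w) < 0.0:
        return None
    # Where the texel goes in 3D on the simplified mesh.
    pos = [sum(w[k] * tri_xyz[k][i] for k in range(3)) for i in range(3)]
    # Nearest vertex of the original mesh (brute force; a k-d tree would scale better).
    nearest = min(range(len(src_points)),
                  key=lambda j: sum((src_points[j][i] - pos[i]) ** 2
                                    for i in range(3)))
    return src_colors[nearest]


# One simplified triangle and three colored vertices of the original mesh.
tri_uv = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
tri_xyz = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]
src_points = [(0.1, 0.1, 0.02), (0.9, 0.1, 0.0), (0.1, 0.9, 0.0)]
src_colors = ["red", "green", "blue"]
print(bake_texel((0.8, 0.1), tri_uv, tri_xyz, src_points, src_colors))
```

For a full atlas this would be repeated for every texel covered by some triangle, which is why a spatial search structure matters for a mesh with millions of vertices.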
Segmentation, parameterization and baking are implemented in my Graphite software [2] (use the old version 2.x, the newer version 3.x does not have all the texturing functionalities).
[1] geogram: http://alice.loria.fr/software/geogram/doc/html/index.html
[2] graphite: http://alice.loria.fr/software/graphite/doc/html/
[3] Variational Shape Approximation (Cohen-Steiner, Alliez, Desbrun, SIGGRAPH 2004): http://www.geometry.caltech.edu/pubs/CAD04.pdf
[4] ABF++: http://alice.loria.fr/index.php/publications.html?redirect=1&Paper=ABF_plus_plus#2004
[5] ARAP: cs.harvard.edu/~sjg/papers/arap.pdf
For reducing the number of polygons, I prefer using mesh decimation. My recommended workflow (input: a high-resolution mesh, mesh0, with vertex colors):
Compute UV coordinates for mesh0.
Generate a texture image (textureImage) from the vertex colors. You then have a textured mesh (mesh0 with UV coordinates, plus textureImage).
Apply mesh decimation to mesh0; the decimation should take the UV coordinates into consideration.
I have an example of this workflow on my site; the example image: Decimation of texture mesh.
Or you can refer to my site for details.

Registration of 3D surface mesh to 3D image volume

I have an accurate mesh surface model of an implant that I'd like to optimally rigidly align to a computed tomography scan (scalar volume) that contains the exact same object. I've tried detecting edges in the image volume with a Canny filter and doing an iterative closest point alignment between the edges and the vertices of the mesh, but it's not working. I also tried voxelizing the mesh and using image volume alignment methods (Mattes mutual information), which yields very inconsistent results.
Any other suggestions?
Thank you.
Generally, mesh and volume are two different data structures. You have to either convert mesh to volume or convert volume to mesh.
I would recommend segmenting the volume data first, to segment out the tissues you want to register. A Canny filter alone might not be enough to segment the border clearly. I would recommend the level-set method and the active contour model instead. These two are frequently used in medical image processing. For these two topics, I would recommend Professor Chunming Li's work.
After you have segmented the volume data, you might be able to reconstruct a mesh model of that volume with marching cubes. The vertices of the two meshes could then be registered with a simple ICP algorithm.
However, this is just a workaround instead of real registration, it always takes too much time to do the segmentation.

Create mesh from point cloud on a 2D grid

I am using the new Kinect v2 and I am getting the depth map of the Kinect.
After I get the depth map I convert the depth data from Depth Space to Camera Space.
As far as I understand, this is done by converting the X,Y coordinates of each pixel to Camera Space and adding the depth value as the Z coordinate (the Kinect gives the depth value in millimetres, so it is also converted to metres).
Because of this, the point cloud actually lies on a 2D grid extended with the depth value. The visualization also confirms this, since it is easy to notice that the points are ordered in a grid due to the above conversion.
For visualization I am using OpenGL the old-fashioned way (glBegin(...) and glEnd()).
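The conversion I have in mind can be sketched with a plain pinhole model. This is only my understanding, not the Kinect SDK's own mapping; the intrinsics (fx, fy, cx, cy) are hypothetical values, and the axis sign conventions of the real Camera Space may differ.

```python
import math


def depth_pixel_to_camera(u, v, depth_mm, fx, fy, cx, cy):
    """Back-project one depth pixel to a 3D point in metres, or None if the depth is invalid."""
    if depth_mm is None or math.isnan(depth_mm) or depth_mm <= 0:
        return None
    z = depth_mm / 1000.0          # millimetres to metres
    x = (u - cx) * z / fx          # pinhole model: offset from the principal point, scaled by depth
    y = (v - cy) * z / fy
    return (x, y, z)


# Hypothetical intrinsics for a 512x424 depth image.
fx, fy, cx, cy = 365.0, 365.0, 256.0, 212.0
print(depth_pixel_to_camera(256, 212, 1500.0, fx, fy, cx, cy))
print(depth_pixel_to_camera(100, 50, float("nan"), fx, fy, cx, cy))
```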
I want to create a mesh out of the points. I kind of managed to do it with GL_TRIANGLES, but then I have lot of duplicated vertices and edges. So I thought I should create a better triangulation with GL_TRIANGLE_STRIP, but I am stuck here because I can't come up with a good algorithm which can go through my 2D grid in a way that I can feed it to the GL_TRIANGLE_STRIP so it creates a nice surface.
The problems:
For each triangle's vertices I am checking the Z coordinate. If it exceeds a certain threshold I disregard the triangle => this might create holes in my 2D grid.
Some depth values are NaN, because the Kinect can't "see" anything there (for example, an object is too far or too close) => this also creates holes in the 2D grid.
Does anybody have a suggestion for the best method to solve this issue?
If you're able to use the point cloud library, you could use the
class pcl::OrganizedFastMesh< PointInT >.
http://docs.pointclouds.org/trunk/classpcl_1_1_organized_fast_mesh.html
I use it to triangulate complete depth frames.
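If PCL is not an option, the general idea of triangulating an organized grid can be sketched by hand. This is not PCL's algorithm, just an illustration: each 2x2 cell of the grid yields up to two triangles, stored as indices into the vertex array so no vertex is duplicated, and a triangle is skipped when it touches a NaN or spans too large a depth jump. The max_jump threshold is an assumed parameter.

```python
import math


def triangulate_grid(depth, width, height, max_jump=0.05):
    """Index triples for a row-major depth grid, skipping invalid or stretched triangles."""
    def ok(a, b, c):
        zs = (depth[a], depth[b], depth[c])
        if any(math.isnan(z) for z in zs):
            return False
        return max(zs) - min(zs) <= max_jump

    triangles = []
    for row in range(height - 1):
        for col in range(width - 1):
            tl = row * width + col   # top-left corner of the cell
            tr = tl + 1              # top-right
            bl = tl + width          # bottom-left
            br = bl + 1              # bottom-right
            if ok(tl, bl, tr):
                triangles.append((tl, bl, tr))
            if ok(tr, bl, br):
                triangles.append((tr, bl, br))
    return triangles


# 3x3 grid of depths in metres with one invalid sample in the last corner.
nan = float("nan")
depth = [1.0, 1.0, 1.0,
         1.0, 1.0, 1.0,
         1.0, 1.0, nan]
print(triangulate_grid(depth, 3, 3))
```

The resulting index list is the kind of data an indexed draw call (for example glDrawElements with GL_TRIANGLES) consumes, which avoids both the duplicated vertices and the need to find an unbroken path for GL_TRIANGLE_STRIP across the holes.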
You can also try a Delaunay triangulation in 3D and look for the tetrahedra on the exterior. An easy algorithm is Bowyer-Watson with tetrahedra and circumspheres. CGAL is a good example.