Phase Measurement 3d visualize when unwrapped phase function - matplotlib

Recently I have tried the phase-shifting-profilometry method to get a 3D surface.
Input Images
Object's Phase Function
Everything went smoothly until I found out that it becomes a Diagonal plane/ Diagonal surface when visualizing the surface due to the unwrapped phase algorithm.
3D object Visualize
I want to ask whether there is any method to make the surface horizontal (like the XY plane).
Sorry that I can not post images here because "I need at least 10 reputation to post images", so Images will be on the link below.
1: https://i.stack.imgur.com/R8BMt.gif
2: https://i.stack.imgur.com/QueKS.png
3: https://i.stack.imgur.com/6jysU.png
Thank you very much!

Ju Lee,
what you are actually plotting is the phase map which can be used to calculate the 3D object. The easiest way (which is only an approximation) to get a depth map is to calculate the phase difference to the phase map of a reference plane (without an object on it):
depth(x,y) ~ a*(phase_map(x,y) - phase_map_reference(x,y)),
where a is a scaling factor you have to determine experimentally. This "easy" procedure is roughly adapted from Takeda's famous 1983 paper: https://doi.org/10.1364/AO.22.003977.
See formula 22 therein for small phase differences.
A more accurate procedure giving the full 3D pointcloud directly from the phase can be found in Zhang's 2006 paper https://doi.org/10.1364/OE.14.009120.
But therefore, you have to calibrate the camera-projector system and calculate an "absolute" phase_map. This typically needs a lot of work, but references therefore are linked in Zhang's paper.
Have fun!

Related

2D shape detection in 3D pointcloud

I'm working on a project to detect the position and orientation of a paper plane.
To collect the data, I'm using an Intel Realsense D435, which gives me accurate, clean depth data to work with.
Now I arrived at the problem of detecting the 2D paper plane silhouette from the 3D point cloud data.
Here is an example of the data (I put the plane on a stick for testing, this will not be in the final implementation):
https://i.stack.imgur.com/EHaEr.gif
Basically, I have:
A 3D point cloud with points on the plane
A 2D shape of the plane
I would like to calculate what rotations/translations are needed to align the 2D shape to the 3D point cloud as accurate as possible.
I've searched online, but couldn't find a good way to do it. One way would be to use Iterative Closest Point (ICP) to first take a calibration pointcloud of the plane in a known orientation, and align it with the current orientation. But from what I've heard, ICP doesn't perform well if the pointclouds aren't kind of already closely aligned at the start.
Any help is appreciated! Coding language doesn't matter.
Does your 3d point cloud have outliers? How many in what way?
How did you use ICP exactly?
One way would be using ICP, with a hand-crafted initial guess using
pcl::transformPointCloud (*cloud_in, *cloud_icp, transformation_matrix);
(to mitigate the problem that ICP needs to be close to work.)
What you actually want is the plane-model that describes the position and orientation of your point-cloud right?
A good estimator of your underlying function can be found with: pcl::ransac
pcl::ransace model consensus
You can then get the computedModel coefficents.
Now finding the correct transformation is just: How to calculate transformation matrix from one plane to another?

how to reconstruct scene from different views' point clouds

I am facing a problem on 3D reconstruction since I am a new to this filed. I have some different views' depth map(point clouds), I want to use them to reconstruct the scene to get the effect like using the kinect fusion. Is there any paper of source code to settle this problem. Or any ideas on this problem.
PS:the point cloud is stored as a file with (x,y,z), you can check here to get the data.
Thank you very much.
As you have stated that you are new to this field, I shall attempt to keep this high level. Please do comment if there is something that is not clear.
The pipeline you refer to has three key stages:
Integration
Rendering
Pose Estimation
The Integration stage takes the unprojected points from a Depth Map (Kinect image) under the current pose and "integrates" them into a spatial data structure (a Voxel Volume such as a Signed Distance Function or a hierarchical structure like an Octree), often by maintaining per Voxel running averages.
The Rendering stage takes the inverse pose for the current frame and produces an image of the visible parts of the model currently in view. For the common volumetric representations this is achieved by Raycasting. The output of this stage provides the points of the model to which the next live frame is registered (the next stage).
The Pose Estimation stage registers the previously extracted model points to those of the live frame. This is commonly achieved by the Iterative Closest Point algorithm.
With regards to pertinent literature, I would advise the following papers as a starting point.
KinectFusion: Real-Time Dense Surface Mapping and Tracking
Real-time 3D Reconstruction at Scale using Voxel Hashing
Very High Frame Rate Volumetric Integration of Depth Images on
Mobile Devices

Should deep learning classification be used to classify details such as liquid level in bottle

Can deep learning classification be used to precisely label/classify both the object and one of its features. For example to identify the bottle (like Grants Whiskey) and liquid level in the bottle (in 10 percent steps - like 50% full). Is this the problem that can be best solved utilizing some of deep learning frameworks (Tensorflow etc) or some other approach is more effective?
Well, this should be well possible if the liquor is well colored. If not (e.g. gin, vodka), I'd say you have no chance with today's technologies when observing the object from a natural view angle and distance.
For colored liquor, I'd train two detectors. One for detecting the bottle, and a second one to detect the liquor given the bottle. The ratio between the two will be your percentage.
Some of the proven state-of-the-art deep learning-based object detectors (just Google them):
Multibox
YOLO
Faster RCNN
Or non-deep-learning-based:
Deformable part model
EDIT:
I was ask to elaborate more. Here is an example:
The box detector e.g. draws a box in the image at [0.1, 0.2, 0.5, 0.6] (min_height, min_width, max_height, max_width) which is the relative location of your bottle.
Now you crop the bottle from the original image and feed it to the second detector. The second detector draws e.g. [0.2, 0.3, 0.7, 0.8] in your cropped bottle image, the location indicates the fluid it has detected. Now (0.7 - 0.2) * (0.8 - 0.3) = 0.25 is the relative area of the fluid with respect to the area of the bottle, which is what OP is asking for.
EDIT 2:
I entered this reply assuming OP wants to use deep learning. I'd agree other methods should be considered if OP is still unsure with deep learning. For bottle detection, deep learning-based methods have shown to outperform traditional methods by a large margin. Bottle detection happens to be one of the classes in the PASCAL VOC challenge. See results comparison here: http://rodrigob.github.io/are_we_there_yet/build/detection_datasets_results.html#50617363616c20564f43203230313020636f6d7034
For the liquid detection however, deep learning might be slightly overkill. E.g. if you know what color you are looking for, even a simple color filter will give you "something"....
The rule of thumb for deep learning is, if it is visible in the image, hence a expert can tell you the answer solely based on the image then the chances are very high that you can learn this with deep learning, given enough annotated data.
However you are quite unlikely to have the required data needed for such a task, therefore I would ask myself the question if i can simplify the problem. For example you could take gin, vodka and so on and use SIFT to find the bottle again in a new scene. Then RANSAC for bottle detection and cut the bottle out of the image.
Then I would try gradient features to find the edge with the liquid level. Finally you can calculate the percentage with (liquid edge - bottom) / (top bottle - bottom bottle).
Identifying the bottle label should not be hard to do - it's even available "on tap" for cheap (these guys actually use it to identify wine bottle labels on their website): https://aws.amazon.com/blogs/aws/amazon-rekognition-image-detection-and-recognition-powered-by-deep-learning/
Regarding the liquid level, this might be a problem AWS wouldn't be able to solve straight away - but I'm sure it's possible to make a custom CNN for it. Alternatively, you could use good old humans on Amazon Mechanical Turk to do the work for you.
(Note: I do not work for Amazon)

Tweaking Heightmap Generation For Hexagon Grids

Currently I'm working on a little project just for a bit of fun. It is a C++, WinAPI application using OpenGL.
I hope it will turn into a RTS Game played on a hexagon grid and when I get the basic game engine done, I have plans to expand it further.
At the moment my application consists of a VBO that holds vertex and heightmap information. The heightmap is generated using a midpoint displacement algorithm (diamond-square).
In order to implement a hexagon grid I went with the idea explained here. It shifts down odd rows of a normal grid to allow relatively easy rendering of hexagons without too many further complications (I hope).
After a few days it is beginning to come together and I've added mouse picking, which is implemented by rendering each hex in the grid in a unique colour, and then sampling a given mouse position within this FBO to identify the ID of the selected cell (visible in the top right of the screenshot below).
In the next stage of my project I would like to look at generating more 'playable' terrains. To me this means that the shape of each hexagon should be more regular than those seen in the image above.
So finally coming to my point, is there:
A way of smoothing or adjusting the vertices in my current method
that would bring all point of a hexagon onto one plane (coplanar).
EDIT:
For anyone looking for information on how to make points coplanar here is a great explination.
A better approach to procedural terrain generation that would allow
for better control of this sort of thing.
A way to represent my vertex information in a different way that allows for this.
To be clear, I am not trying to achieve a flat hex grid with raised edges or platforms (as seen below).
)
I would like all the geometry to join and lead into the next bit.
I'm hope to achieve something similar to what I have now (relatively nice undulating hills & terrain) but with more controllable plateaus. This gives me the flexibility of cording off areas (unplayable tiles) later on, where I can add higher detail meshes if needed.
Any feedback is welcome, I'm using this as a learning exercise so please - all comments welcome!
It depends on what you actually want and what you mean by "more controlled".
Do you want to be able to say "there will be a mountain on coordinates [11, -127] with radius 20"? Complexity of this this depends on how far you want to go. If you want just mountains, then radial gradients are enough (just add the gradient values to the noise values). But if you want some more complex shapes, you are in for a treat.
I explore this idea to great depth in my project (please consider that the published version is just a prototype, which is currently undergoing major redesign, it is completely usable a map generator though).
Another way is to make the generation much more procedural - you just specify a sequence of mathematical functions, which you apply on the terrain. Even a simple value transformation can get you very far.
All of these methods should work just fine for hex grid. If artefacts occur because of the odd-row shift, then you could interpolate the odd rows instead (just calculate the height value for the vertex from the two vertices between which it is located with simple linear interpolation formula).
Consider a function, which maps the purple line into the blue curve - it emphasizes lower located heights as well as very high located heights, but makes the transition between them steeper (this example is just a cosine function, making the curve less smooth would make the transformation more prominent).
You could also only use bottom half of the curve, making peaks sharper and lower located areas flatter (thus more playable).
"sharpness" of the curve can be easily modulated with power (making the effect much more dramatic) or square root (decreasing the effect).
Implementation of this is actually extremely simple (especially if you use the cosine function) - just apply the function on each pixel in the map. If the function isn't so mathematically trivial, lookup tables work just fine (with cubic interpolation between the table values, linear interpolation creates artefacts).
Several more simple methods of "gamification" of random noise terrain can be found in this paper: "Realtime Synthesis of Eroded Fractal Terrain for Use in Computer Games".
Good luck with your project

Voxel Engine and Optimization

Recently I've started developing voxel engine. What I need is only colorful voxels without texture, but at very large amount (much smaller than minecraft) - and the question is how to draw the scene very fast? I'm using c#/xna but this is in my opinion not very important in this case, let's talk about general cases. Look at these two games:
http://www.youtube.com/watch?v=EKdRri5jSMs
http://www.youtube.com/watch?v=in0bavLJ8KQ
Especially I think video number 2 represents great optimization methods (my gfx card starts choking just at 192 x 192 x 64) How they achieve this?
What i would to have in the engine:
colorful voxels without texture, but shaded
many, many voxels, say minimum 512 x 512 x 128 to achieve something like video #2
shadows (smooth shadows will be great but this is not necessary)
optional: dynamic lighting (for example from fireballs flying, which light up near voxel structures)
framerate minimum 40 FPS
camera have 3 ways of freedom (move in x-axis, move in y-axis, move in z-axis), no camera rotation is needed
finally optional feature may be Depth of Field (it will be sweet ^^ )
What optimization I have already know:
remove unseen voxels that resides inside voxel structure (covered
from six directions by other voxels)
remove unseen faces of voxels - because camera have no rotation and always look aslant forward like in TPP games, so if we divide screen
by vertical cut, left voxels and right voxels will show only 3 faces
keep voxels in Dictionary instead of 3-dimensional array - jumping through array of size 512 x 512 x 128 takes miliseconds which is
unacceptable - but dictionary int:color where int describes packed
3D position is much much faster
use instancing where applciable
occluding? (how to do this?)
space dividing / octtree (is it good idea?)
I'll be very thankful if someone give me a tip how to improve existing optimizations listed above or can share ideas of new improvements. Thanks
1) Voxatron uses a software renderer rather than the GPU. You can read some details about it if you read the comments in this blog post:
http://www.lexaloffle.com/bbs/?tid=201
I haven't looked in detail myself so can't tell you much more than that.
2) I've never played 3D Dot Game Heroes but I don't have any reason to believe it uses voxels at all. I mean, I don't see any cubes being added or deleted. Most likely it is just a static polygon mesh with a nice texture applied.
As for implementing it yourself, do not try to draw the world by rendering cubes as this is very slow. Instead you should process the volume and generate meshes lying on the intersection of solid voxels and empty ones. Break the volume into suitable sized regions (e.g. 32x32x32) and generate a mesh for each.
I have written a book article about this which you might find useful. It's actually about smooth voxel terain but a lot of the priciples stll apply.
You can read it on Google books here: http://books.google.com/books?id=WNfD2u8nIlIC&lpg=PR1&dq=game%20engine%20gems&pg=PA39#v=onepage&q&f=false
And you can find the associated source code here: http://www.thermite3d.org
Since you are using XNA, you can just use instancing to get the desired effect: http://www.float4x4.net/index.php/2010/06/hardware-instancing-in-xna/
http://roecode.wordpress.com/2008/03/17/xna-framework-gameengine-development-part-19-hardware-instancing-pc-only/
The underlying concept is instancing: this feature lets you specify some amount of repeating data and some amount of varying data in a single DrawIndexedPrimitive call. In your case, the instance stream would be a single solid box, and the other stream would be the transform and color information.