Obtaining 3D location of an object being looked at by a camera with known position and orientation

I am building an augmented reality application and I have the yaw, pitch, and roll for the camera. I want to start placing objects in the 3D environment. I want to make it so that when the user clicks, a 3D point pops up right where the camera is pointed (center of the 2D screen) and when the user moves, the point moves accordingly in 3D space. The camera does not change position, only orientation. Is there a proper way to recover the 3D location of this point? We can assume that all points are equidistant from the camera location.
I am able to accomplish this independently for two axes (OpenGL default orientation). This works for changes in the vertical axis:
x = -sin(pitch)
y = cos(pitch)
z = 0
This also works for changes in the horizontal axis:
x = 0
y = -sin(yaw)
z = cos(yaw)
I was thinking that I could just combine these into:
x = -sin(pitch)
y = sin(yaw) * cos(pitch)
z = cos(yaw)
and that seems to be close, but not exactly correct. Any suggestions would be greatly appreciated!

It sounds like you just want to convert from a rotation vector (pitch, yaw, roll) to a rotation matrix. The conversion can be seen in the Wikipedia article on rotation matrices. The idea is that once you have constructed your matrix, transforming any point is simply:
final_pos = rot_mat * initial_pos
where final_pos and initial_pos are 3x1 vectors and rot_mat is a 3x3 matrix.
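As a rough sketch of that idea (my own assumptions here: yaw is a rotation about the Y axis, pitch is a rotation about the X axis, and the camera's initial view direction is OpenGL's default -Z; swap axes and signs to match your conventions):

#include <cmath>

struct Vec3 { double x, y, z; };

// Direction the camera looks after applying pitch (about X) and then yaw (about Y).
// Scale this unit vector by your fixed distance and add the camera position
// to get the 3D point to place.
Vec3 lookDirection(double yaw, double pitch)
{
    Vec3 v = { 0.0, 0.0, -1.0 };   // OpenGL default: looking down -Z

    // Pitch: rotate about the X axis.
    double y1 = v.y * std::cos(pitch) - v.z * std::sin(pitch);
    double z1 = v.y * std::sin(pitch) + v.z * std::cos(pitch);
    v.y = y1; v.z = z1;

    // Yaw: rotate about the Y axis.
    double x2 =  v.x * std::cos(yaw) + v.z * std::sin(yaw);
    double z2 = -v.x * std::sin(yaw) + v.z * std::cos(yaw);
    v.x = x2; v.z = z2;

    return v;
}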

Related

Convert grid of dots in XY plane from camera coordinates to real world coordinates

I am writing a program. I have, say, a grid of dots on a piece of paper. I fix one end and bend the paper toward the screen, giving me a trapezoidal shape from the camera's point of view. I have the (x,y) camera coordinate of each dot. Is there a simple way I can change these (x,y) to real life (x,y) which should give me a rectangle? I have the camera/real (x,y) of the original flat sheet of paper pre-bend if that helps.
I have looked at 3D Camera coordinates to world coordinates (change of basis?) and Transforming screen coordinates from security camera to real world coordinates.
Look up "homography". The transformation from a plane in 3D space to its image as captured by an ideal pinhole camera is a homography. It can be represented as a 3x3 matrix H that transforms the 3D coordinates X of points in the world to their corresponding homogeneous image coordinates x:
x = H * X
where X is a 3x1 vector of the world point coordinates, and x = [u, v, w]^T is the image point in homogeneous coordinates.
Given a minimum of 4 matches between world and image points (e.g. the corners of a rectangle) you can estimate the parameters of the matrix H. For details, look up "DLT algorithm". In OpenCV the routine to use is findHomography.
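A minimal OpenCV sketch of that pipeline (the function name rectifyDots and the point lists are placeholders for your own data):

#include <opencv2/calib3d.hpp>
#include <opencv2/core.hpp>
#include <vector>

// worldPts: the known flat-sheet (x, y) positions of some reference dots.
// imagePts: where the camera sees those same dots.
// allImageDots: every measured dot you want mapped back to the flat plane.
std::vector<cv::Point2f> rectifyDots(const std::vector<cv::Point2f>& worldPts,
                                     const std::vector<cv::Point2f>& imagePts,
                                     const std::vector<cv::Point2f>& allImageDots)
{
    // Estimate the 3x3 homography from the image plane to the world plane
    // (DLT plus least squares under the hood); needs at least 4 correspondences.
    cv::Mat H = cv::findHomography(imagePts, worldPts);

    // Map every measured dot back onto the flat "real world" plane.
    std::vector<cv::Point2f> rectified;
    cv::perspectiveTransform(allImageDots, rectified, H);
    return rectified;
}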

Pose estimation: determine whether the rotation and translation matrices are right

Recently I've been struggling with a pose estimation problem with a single camera. I have some 3D points and the corresponding 2D points on the image. Then I use solvePnP to get the rotation and translation vectors. The problem is, how can I determine whether the vectors are correct?
Now I use an indirect way to do this:
I use the rotation matrix, the translation vector and the world 3D coordinates of a certain point to obtain the coordinates of that point in the camera system. Then all I have to do is determine whether those coordinates are reasonable. I think I know the directions of the x, y and z axes of the camera system.
Is the camera center the origin of the camera coordinate system?
Now consider the x component of that point. Is x equivalent to the distance between the camera and the point in world space along the camera's x-axis direction (with the sign determined by which side of the camera the point is on)?
The figure below shows the setup in world space, while the axes depicted are those of the camera system.
========= How the camera and the point are placed in world space =========

Camera --------------------------> Z axis
  |                  } Xw?
  |                  P(Xw, Yw, Zw)
  |
  v
X axis
My rvec and tvec results seem partly right and partly wrong. For a given point, the z value seems reasonable: if this point is about one meter away from the camera in the z direction, then the z value is about 1. But for x and y, judging by the location of the point, I think x and y should be positive, yet they are negative. What's more, the pattern detected in the original image looks like this:
But using the point coordinates calculated in the camera system together with the camera intrinsic parameters, I get an image like this:
The target keeps its pattern, but it has moved from the bottom right to the top left. I cannot understand why.
Yes, the camera center is the origin of the camera coordinate system, which also seems to be confirmed by this post.
In camera pose estimation, your "value seems reasonable" check can be made precise as the backprojection error. That's a measure of how well your resulting rotation and translation map the 3D points onto the 2D pixels. Unfortunately, solvePnP does not return a residual error measure, so one has to compute it:
cv::solvePnP(worldPoints, pixelPoints, camIntrinsics, camDistortion, rVec, tVec);
// Use the computed solution to project the 3D pattern back into the image
std::vector<cv::Point2f> projectedPattern;
cv::projectPoints(worldPoints, rVec, tVec, camIntrinsics, camDistortion, projectedPattern);
// Compute the error of each 2D-3D correspondence
std::vector<float> errors;
for (size_t i = 0; i < pixelPoints.size(); ++i)
{
    float dx = pixelPoints[i].x - projectedPattern[i].x;
    float dy = pixelPoints[i].y - projectedPattern[i].y;
    // Euclidean distance between the projected and the measured pixel
    float err = std::sqrt(dx * dx + dy * dy);
    errors.push_back(err);
}
// Here, compute the max or average of your "errors"
An average backprojection error for a calibrated camera is usually in the range of 0-2 pixels. Judging from your two pictures, yours would be far larger. To me it looks like a scaling problem. If I am right, you compute the projection yourself; maybe you can try cv::projectPoints() once and compare.
When it comes to transformations, I have learned not to rely on my imagination :) The first thing I usually do with the returned rVec and tVec is create a 4x4 rigid transformation matrix out of them (I once posted code for this here). This makes things even less intuitive, but it is compact and handy.
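For reference, a small sketch of that construction (assuming rVec and tVec come back from solvePnP as CV_64F, its default type):

#include <opencv2/calib3d.hpp>
#include <opencv2/core.hpp>

// Build a 4x4 rigid transformation [R | t; 0 0 0 1] from solvePnP's output.
cv::Mat rigidTransform(const cv::Mat& rVec, const cv::Mat& tVec)
{
    cv::Mat R;
    cv::Rodrigues(rVec, R);                               // 3x1 rotation vector -> 3x3 matrix

    cv::Mat T = cv::Mat::eye(4, 4, CV_64F);               // start from identity
    R.copyTo(T(cv::Rect(0, 0, 3, 3)));                    // top-left 3x3 block = R
    tVec.reshape(1, 3).copyTo(T(cv::Rect(3, 0, 1, 3)));   // last column = t
    return T;                                             // maps world points into the camera frame
}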
Now I know the answers.
Yes, the camera center is the origin of the camera coordinate system.
Consider that the coordinates of the point in the camera system are calculated as (xc, yc, zc). Then xc should be the distance between the camera and the point in the real world along the camera's x direction.
Next, how to determine whether the output matrices are right?
1. As #eidelen points out, the backprojection error is one indicative measure.
2. Transform the points' world coordinates into the camera system using the matrices and check whether the resulting coordinates are reasonable.
So why did I get a wrong result (the pattern remained, but moved to a different region of the image)?
The cameraMatrix parameter of solvePnP() supplies the camera's intrinsic parameters. In the camera matrix, you should use width/2 and height/2 for cx and cy, whereas I had used the full width and height of the image. I think that caused the error. After I corrected that and re-calibrated the camera, everything works fine.
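For illustration, a sketch of building such an intrinsic matrix (fx and fy are placeholder focal lengths in pixels; use the values from your own calibration):

#include <opencv2/core.hpp>

// Pinhole intrinsic matrix whose principal point (cx, cy) sits at the image centre.
cv::Mat makeCameraMatrix(int imageWidth, int imageHeight, double fx, double fy)
{
    double cx = imageWidth / 2.0;    // half the width,  not the full width
    double cy = imageHeight / 2.0;   // half the height, not the full height
    return (cv::Mat_<double>(3, 3) << fx,  0.0, cx,
                                      0.0, fy,  cy,
                                      0.0, 0.0, 1.0);
}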

Need deep explanation for viewport/perspective/frustum calculations

I have a lot of tutorials & books, but I'm unable to understand how my viewport, my near & far distances etc. are used to calculate the perspective/frustum matrix.
I have the learningwebgl lessons, but I don't understand what viewport & 3D-space adjustments are made. What is my initial window projection size? Why do I see the triangle & square placed at z = -7?
Another thing I don't understand: does a near plane of 0.001 create the window projection just in front of my nose? So what are my projection window dimensions?
I need deeper, more basic help.
Can anybody help me? Some really useful links? I need graphical examples showing & teaching how the frustum is calculated.
Thanks
There's this
http://games.greggman.com/game/webgl-3d-perspective/
Imagine you're in 2D. You have a canvas that's 200x100 pixels. If you draw at x = 201 it will be off the canvas. Similarly at x = -1 it will be off the canvas.
In WebGL it works in a 3D space that goes from -1 to +1 in x, y and z. The perspective / frustum matrix is the matrix that takes your 3D scene and converts it to this -1 / +1 space. The near and far values define what range in world space gets converted to the -1 / +1 "clipspace". Anything outside that range will be clipped, just like in the 2D example. If you set near to 10 and far to 100, then something at Z = 9 will be clipped because it's too near, and something at 101 will also be clipped as too far. More specifically, the near and far settings form a matrix such that a point at Z = near becomes -1 when multiplied by the matrix, and a point at Z = far becomes +1 when multiplied by the matrix.
The viewport setting tells WebGL how to convert from the -1 to +1 space back into pixels.
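To make that concrete, here is a sketch of the usual perspective matrix (written in C++ for consistency with the rest of this page, but in the same column-major form that a typical WebGL math library builds). A point at eye-space z = -zNear ends up at clip-space -1 and a point at z = -zFar at +1; everything outside that range is clipped.

#include <cmath>

// Column-major 4x4 perspective matrix. fovY is the vertical field of view in radians.
void perspective(float out[16], float fovY, float aspect, float zNear, float zFar)
{
    float f = 1.0f / std::tan(fovY / 2.0f);
    float rangeInv = 1.0f / (zNear - zFar);

    float m[16] = {
        f / aspect, 0.0f, 0.0f,                            0.0f,
        0.0f,       f,    0.0f,                            0.0f,
        0.0f,       0.0f, (zNear + zFar) * rangeInv,      -1.0f,
        0.0f,       0.0f, 2.0f * zNear * zFar * rangeInv,  0.0f,
    };
    for (int i = 0; i < 16; ++i) out[i] = m[i];   // copy into the caller's buffer
}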

Draw a scatterplot matrix using glut, opengl

I am new to GLUT and OpenGL. I need to draw a scatterplot matrix for an n-dimensional array.
I have saved the data from a CSV file into a vector of vectors, where each inner vector corresponds to a row. I have plotted just one scatterplot and used GL_LINES to draw the grid. My question:
1. How do I draw points in a particular grid cell? Using GL_POINTS I can only draw points across the entire window.
Please let me know if you need any further info to answer this question.
Thanks
What you need to do is be able to transform your data's (x,y) coordinates into screen coordinates. The most straightforward way to do it actually does not rely on OpenGL or GLUT. All you have to do is use a little math. Determine the screen (x,y) coordinates of the place where you want the datapoint (0,0) to appear on the screen, and then determine how many pixels apart you want one data increment to be on the screen. Simply take your original data points, scale them, and then apply the offset to get your screen coordinates, which you then pass into glVertex2f() (or whatever function you are using to specify points in your API).
For instance, you might decide you want point (0,0) in your data to be at location (200,0) on your screen, and the distance between 0 and 1 in your data to be 30 pixels on the screen. This operation will look like this:
int x = 0, y = 0; // Original data point
int scaleX = 30, scaleY = 30; // Scaling values (pixels per data unit) for each component
int offsetX = 200, offsetY = 0; // Where you want the origin of your graph to be on screen
// Apply the scaling values and offsets:
int screenX = x * scaleX + offsetX;
int screenY = y * scaleY + offsetY;
// Calls to your drawing functions using screenX and screenY as your coordinates
You will have to determine values that make sense for the scaling and offsets. You can also have your program use different values for different sets of data, so you can display multiple graphs on the same screen. But this is a simple way to do it.
There are also other ways you can go about this. OpenGL has very powerful coordinate transformation functions and matrix math capabilities. Those may become more useful as you develop increasingly elaborate programs. They're most useful if you're going to be moving things around the screen in real time, or operating on very large data sets, as they let you perform these calculations very quickly on your graphics hardware (which can do them much faster than the CPU). However, for simple calculations like these, done once or infrequently on limited sets of data, the CPU is easily fast enough on today's computers.
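As one example of letting OpenGL do the mapping for you (a sketch only; drawCell and its parameters are made-up names, and it uses the old fixed-function pipeline that GLUT programs typically rely on), you can restrict drawing to one cell of the scatterplot matrix with glViewport and map your data range onto it with gluOrtho2D:

#include <GL/glut.h>
#include <utility>
#include <vector>

// Draw one scatterplot cell: cellX/cellY/cellW/cellH are in window pixels,
// dataMin*/dataMax* are the data-space bounds you want shown in that cell.
void drawCell(int cellX, int cellY, int cellW, int cellH,
              double dataMinX, double dataMaxX, double dataMinY, double dataMaxY,
              const std::vector<std::pair<double, double>>& points)
{
    glViewport(cellX, cellY, cellW, cellH);   // restrict drawing to this grid cell

    glMatrixMode(GL_PROJECTION);
    glLoadIdentity();
    gluOrtho2D(dataMinX, dataMaxX, dataMinY, dataMaxY);   // map data range onto the cell
    glMatrixMode(GL_MODELVIEW);
    glLoadIdentity();

    glBegin(GL_POINTS);                       // raw data coordinates can now be passed directly
    for (const auto& p : points)
        glVertex2d(p.first, p.second);
    glEnd();
}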

How Do I Switch the Y and Z Axes from Blender? (So Y is Up)

I've been having a bit of a problem with making the Y axis my up axis when exporting mesh and scenes from Blender. Both Blender and my export target use right handed transformation matrices. Z is the up axis in Blender while Y is the up axis in my target. The problem exists in 2 places though. The scene's transformations can't just be shifted on the X axis to fix this, because I also need to do the Y/Z switch for the vertices in the mesh (export as vertex.x, vertex.z, vertex.y). I need to have the actual Y and Z rotations switched, so that if the Y and Z rotations are the same, no change will occur (ie. identity). Thanks for your help in advance. Feel free to ask questions if I was not thorough enough.
Blender does two things differently from the rest of the known world!
1. It uses the Z axis for vertical (should be Y), the Y axis for horizontal (should be X), and the X axis for in and out (should be Z).
Very weird! Every high-school graph since the beginning of time uses X for horizontal and Y for vertical.
2. It uses the right mouse button for selections.
You can change the selection button in Preferences, but not the crazy axis arrangement!
No, you do this:
y = z
z = -y
No rotation of 90 degrees can take you from a left-handed to a right-handed system (and none is needed here, since both systems are right-handed).
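In code, that swap needs to go through a temporary or a newly constructed vertex, otherwise the second assignment would read the already overwritten y. A minimal sketch:

struct Vec3 { float x, y, z; };

// Blender (Z-up, right-handed) -> Y-up, right-handed target
Vec3 blenderToYUp(const Vec3& v)
{
    return { v.x, v.z, -v.y };   // new y = old z, new z = -old y
}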
I ran into a similar issue when working with Cinema 4D and Blender. In Cinema 4D, Y is the up axis and rotations are heading, pitch and bank.
Blender's system looks like a right-handed system, but rotated by 90 degrees on the X axis.
I did the same thing for coordinates (exported as vertex.x, vertex.z, vertex.y). For rotations, I think you should add 90 degrees (math.pi * 0.5) to rotations on the X axis, and the rest should be fine.
HTH
Have you tried just using Select All (the 'a' key) and then r x 90 to rotate everything 90 degrees around the X axis about the pivot point? (Your pivot point is selectable in the menu bar of the 3D view if you need to control that.)
You could do that, export, and then undo.
Just download Wings3D. Export from Blender as .3ds and then import this file into Wings3D.
Now you can export it from Wings3D, again to .3ds. But instead of clicking directly on .3ds, click on the small icon to the right of the ".3ds" menu entry. Now you can uncheck the "Swap Y and Z Axes" box and import the .3ds into another program.
There is no way to do that. The coordinate system was hard-coded into the Blender source, and no explicit option has been added to Blender to switch it. Changing it would also affect much of the hard-coded functionality that was written assuming that coordinate system.
However, in theory it would be possible to go into the Blender source code and rebuild Blender to use whatever coordinate system we would like, although we would need to carefully swap everything related to the coordinate system.
I too wish that a left-handed coordinate system (as in Unity3D) were the industry standard, and that Blender at least had a version that works in left-handed coordinates. People should graduate from table coordinates to screen coordinates already.
In Blender, you could add an Empty (Plain Axes) that will correct your orientation when exporting to Unity, or try exporting as an FBX file and changing the orientation in the export options.