I am working on a geometry editor tool and am trying to work out how to get a manipulator's vector coordinates on the screen/camera plane, so I can use them for mouse dragging. I have access to the vector's world-coordinate matrix (or any object's matrix), the projection matrix, and the camera direction, position, etc.
Suppose I created a sphere mesh, UV-unwrapped it, and created 1000 texture maps around those unwrapped coordinates. Now I realize that I want some parts of the sphere to be "untextured", with the option to texture them later with another random texture. How would I remove the UV coordinates from the sphere so they don't get textured, or at least move them to another UV map, without changing the position of the unwrapped coordinates?
I am trying to use MATLAB's camera calibrator to calibrate an infrared camera. I was able to get the intrinsic matrix by just feeding around 100 images to the calibrator. But I'm struggling with how to get the extrinsic matrix [R|t].
Since the extrinsic matrix maps the world frame to the camera frame, in theory there will be many extrinsic matrices when the camera (or object) is moving.
In the picture below, if the intrinsic matrix is determined using 50 images, then there are 50 extrinsic matrices, one corresponding to each image. Am I correct?
You are right. Usually, a by-product of an intrinsic calibration is the extrinsic matrix for each pattern observed; this is mostly used to draw the patterns with respect to the camera as in the picture you posted.
What you usually do afterwards is define some external reference frame that makes sense for your application, also known as the 'world' reference frame, and compute the pose of the camera with respect to it. That's the extrinsic matrix you always hear about.
For this, you:
Define the reference frame and take some points with known 3D coordinates on it; this can be a grid drawn on the floor, for example.
Take a picture of the 3D points with the calibrated camera and get a list of the corresponding 2D (image) coordinates of the points.
Use a pose-estimation function that takes the camera intrinsic parameters, the 3D points, and the corresponding 2D image points. I am more familiar with OpenCV, but the MATLAB function that seems to do the job is: https://www.mathworks.com/help/vision/ref/estimateworldcamerapose.html
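For illustration, here is a minimal sketch of this pose-estimation step using OpenCV's solvePnP (since OpenCV was mentioned); the intrinsics and point correspondences below are hypothetical placeholders:

```python
# Minimal sketch of the pose-estimation step using OpenCV's solvePnP.
# All numeric values below are hypothetical placeholders.
import numpy as np
import cv2

# Intrinsic matrix and distortion coefficients from the calibration step.
K = np.array([[800.0,   0.0, 320.0],
              [  0.0, 800.0, 240.0],
              [  0.0,   0.0,   1.0]])
dist = np.zeros(5)  # distortion assumed negligible in this sketch

# 3D points with known coordinates in the chosen world frame
# (e.g. corners of a grid drawn on the floor) ...
object_points = np.array([[0.0, 0.0, 0.0],
                          [1.0, 0.0, 0.0],
                          [1.0, 1.0, 0.0],
                          [0.0, 1.0, 0.0]])
# ... and the corresponding 2D image coordinates measured in the picture.
image_points = np.array([[322.0, 242.0],
                         [411.0, 240.0],
                         [409.0, 330.0],
                         [320.0, 331.0]])

ok, rvec, tvec = cv2.solvePnP(object_points, image_points, K, dist)
R, _ = cv2.Rodrigues(rvec)        # rotation vector -> 3x3 rotation matrix
extrinsic = np.hstack([R, tvec])  # [R|t]: maps world frame -> camera frame
print(extrinsic)
```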
In the pinhole camera model, is it possible to determine the rotation required to turn the optical/principal axis (the axis that pierces the image plane) so that it passes through a given pixel coordinate (u, v)?
I have an image in which I am detecting a marker in space, and I have the intrinsic and extrinsic camera parameters available. I am using the extrinsic parameters to cast a 2D ray into a separately constructed map (which is overhead and 2D), but I would like the ray angle to change depending on whether the detected marker is to the left or right within the image.
My first thought was to use arctan with the focal length and the u coordinate (the x-axis on the image plane, measured from the center of the image) to determine an angle; however, I don't think the units of measurement cooperate: one should be in real-world meters while the other is in arbitrary pixels.
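One remark that may resolve the unit mismatch: the focal length produced by a standard calibration is already expressed in pixels, so the arctan idea is unit-consistent as-is. A minimal sketch, where fx and cx are the usual intrinsic-matrix entries (hypothetical values):

```python
import math

# Intrinsics in pixel units, as produced by a standard calibration
# (hypothetical values).
fx = 800.0  # focal length along x, in pixels
cx = 320.0  # principal point x-coordinate, in pixels

def ray_angle(u):
    """Horizontal angle between the principal axis and the ray through pixel u.

    Positive means the pixel lies to the right of the image center.
    """
    return math.atan2(u - cx, fx)

print(math.degrees(ray_angle(480.0)))  # e.g. a marker right of center
```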
Given a 3D point coordinate and the internal and external camera parameters in a calibrated stereo camera setup, is there a HALCON method that gives me the 2D pixel coordinates in each camera?
Regards,
MSK
You can project the 3D coordinates into the 2D image, if the camera parameters are known, using the following:
* Project 3D points into image.
project_3d_point(X, Y, Z, CameraParam, Row, Column)
Please refer to the operator's description in the HALCON documentation.
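For reference, the underlying pinhole projection that such an operator performs looks roughly like the following numpy sketch (undistorted pinhole model assumed; the calibration values are hypothetical, and in a stereo setup you would call it once per camera with that camera's parameters):

```python
import numpy as np

def project_point(X_world, K, R, t):
    """Project a 3D world point to 2D pixel coordinates (pinhole model).

    K is the 3x3 intrinsic matrix; R and t are the extrinsics that map
    the world frame into the camera frame. Lens distortion is ignored.
    """
    X_cam = R @ X_world + t  # world frame -> camera frame
    uvw = K @ X_cam          # camera frame -> homogeneous pixel coordinates
    return uvw[:2] / uvw[2]  # perspective divide -> (x, y) in pixels

# Hypothetical calibration values for one camera of the stereo pair.
K = np.array([[800.0,   0.0, 320.0],
              [  0.0, 800.0, 240.0],
              [  0.0,   0.0,   1.0]])
R = np.eye(3)
t = np.array([0.0, 0.0, 0.0])
print(project_point(np.array([0.1, 0.2, 2.0]), K, R, t))
```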
I'm using a 3D engine and need to translate between 3D world space and 2D screen space using perspective projection, so I can place 2D text labels on items in 3D space.
I've seen a few posts with various answers to this problem, but they seem to use components I don't have.
I have a Camera object and can only set its current position and look-at position; it cannot roll. The camera is moving along a path, and a certain target object may appear in its view and then disappear.
I have only the following values:
lookat position
position
vertical FOV
Z far
Z near
and obviously the position of the target object.
Can anyone please give me an algorithm that will do this using just these components?
Many thanks.
All graphics engines use matrices to transform between different coordinate systems; indeed, OpenGL and DirectX use them, because they are the standard way.
Cameras usually construct the matrices using the parameters you have:
View matrix (transforms the world so that you are looking at it from the camera position); it uses the look-at position and the camera position (plus the up vector, which is usually (0, 1, 0)).
Projection matrix (transforms from 3D coordinates to 2D coordinates); it uses the FOV, near, far, and aspect ratio.
You can find information on how to construct these matrices online by searching for the OpenGL functions that create them:
gluLookAt: creates a view matrix
gluPerspective: creates the projection matrix
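For illustration, here is a minimal numpy sketch of the math behind those two functions (column-vector convention, OpenGL-style clip space; this is not any particular engine's API):

```python
import numpy as np

def look_at(eye, target, up=(0.0, 1.0, 0.0)):
    """View matrix: world coordinates -> camera coordinates
    (the same math gluLookAt performs)."""
    eye, target, up = (np.asarray(v, dtype=float) for v in (eye, target, up))
    f = target - eye
    f /= np.linalg.norm(f)             # forward axis
    s = np.cross(f, up)
    s /= np.linalg.norm(s)             # right axis
    u = np.cross(s, f)                 # true up axis
    view = np.eye(4)
    view[0, :3], view[1, :3], view[2, :3] = s, u, -f
    view[:3, 3] = -view[:3, :3] @ eye  # move the eye to the origin
    return view

def perspective(fovy_deg, aspect, near, far):
    """Projection matrix: camera coordinates -> clip coordinates
    (the same math gluPerspective performs)."""
    f = 1.0 / np.tan(np.radians(fovy_deg) / 2.0)
    proj = np.zeros((4, 4))
    proj[0, 0] = f / aspect
    proj[1, 1] = f
    proj[2, 2] = (far + near) / (near - far)
    proj[2, 3] = 2.0 * far * near / (near - far)
    proj[3, 2] = -1.0
    return proj
```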
But I can't imagine an engine that doesn't let you access these matrices; I can assure you they are somewhere, since the engine itself is using them.
Once you have those matrices, you multiply them to get the view-projection matrix. This matrix transforms from world coordinates to screen coordinates, so just multiply it with the position you want to transform (as a 4-component vector, with the 4th component set to 1.0).
But wait: the result will be in homogeneous coordinates, so you need to divide the X, Y, Z of the resulting vector by W. Then you have the position in normalized screen coordinates (0 means the center, 1 means right, -1 means left, etc.).
From here it is easy to get pixel coordinates by multiplying by the viewport width and height.
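Putting those steps together, a minimal sketch of the full world-to-screen transform (it reuses the look_at and perspective helpers sketched above; the camera values in the usage lines are hypothetical):

```python
import numpy as np

def world_to_screen(point, view, proj, width, height):
    """World-space point -> pixel coordinates, or None if behind the camera."""
    p = np.append(np.asarray(point, dtype=float), 1.0)  # homogeneous (w = 1.0)
    clip = proj @ view @ p                              # view-projection transform
    if clip[3] <= 0.0:
        return None                                     # behind the camera
    ndc = clip[:3] / clip[3]         # perspective divide -> NDC in [-1, 1]
    x = (ndc[0] * 0.5 + 0.5) * width
    y = (1.0 - (ndc[1] * 0.5 + 0.5)) * height  # flip y: screen y grows downward
    return x, y

# Usage with hypothetical camera values:
view = look_at(eye=(0.0, 2.0, 5.0), target=(0.0, 0.0, 0.0))
proj = perspective(60.0, 800.0 / 600.0, 0.1, 100.0)
print(world_to_screen((0.0, 0.0, 0.0), view, proj, 800, 600))  # label anchor in pixels
```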
I have some slides explaining all this here: https://docs.google.com/presentation/d/13crrSCPonJcxAjGaS5HJOat3MpE0lmEtqxeVr4tVLDs/present?slide=id.i0
Good luck :)
P.S.: when you work with 3D, it is really important to understand the three matrices (model, view, and projection); otherwise you will stumble every time.
"so I can place 2D text labels on items in 3D space"
Have you looked up "billboard" techniques? Sometimes just knowing the right term to search under is all you need. This refers to polygons (typically rectangles) that always face the camera, regardless of camera position or orientation.
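For illustration, here is a minimal numpy sketch of a camera-facing ("spherical") billboard basis; the function name and conventions are assumptions, not any particular engine's API:

```python
import numpy as np

def billboard_matrix(quad_pos, camera_pos, world_up=(0.0, 1.0, 0.0)):
    """Model matrix that orients a quad at quad_pos so it faces camera_pos.

    The quad's local +Z axis points at the camera ("spherical" billboarding).
    Degenerate if the view direction is parallel to world_up.
    """
    quad_pos, camera_pos, world_up = (np.asarray(v, dtype=float)
                                      for v in (quad_pos, camera_pos, world_up))
    z = camera_pos - quad_pos
    z /= np.linalg.norm(z)   # axis pointing toward the camera
    x = np.cross(world_up, z)
    x /= np.linalg.norm(x)   # billboard right axis
    y = np.cross(z, x)       # billboard up axis
    m = np.eye(4)
    m[:3, 0], m[:3, 1], m[:3, 2] = x, y, z
    m[:3, 3] = quad_pos      # place the quad in the world
    return m
```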