Is there a way to get the screen coordinates of the objects in the 3d world?
Implement gluProject().
EDIT: Mesa/SGI has some code.
EDIT2: Or grab a copy of GLU for iOS.
Just implement the transformations of the OpenGL pipeline: modelview, projection, perspective devide, viewport. You get query the current matrices with glGetDoublev and the viewort with glGetIntegerv.
Then you have to compute projection matrix times modelview matrix = MVP.
now for each vertex v compute MVP*v.
then compute v /= v.w;
So you got coordinates in range [-1,1]x[-1,1], the last thing is scaling and translating this into [x,x+w]x[y,y+h] (which are the values of the viewport).
You can also take a look info pages of the OpenGL reference for glFrustum, glViewport, to see how all these transformations are done.
Related
I have been learning DirectX 11, and in the book I am reading, it states that the Rasterizer outputs Fragments. It is my understanding, that these Fragments are the output of the Rasterizer(which inputs geometric primitives), and in-fact are just 2D Positions(your 2D Render Target View)
Here is what I think I understand, please correct me.
The Rasterizer takes Geometric Primitives(spheres, cubes or boxes, toroids
cylinders, pyramids, triangle meshes or polygon meshes) (https://en.wikipedia.org/wiki/Geometric_primitive). It then translates these primitives into pixels(or dots) that are mapped to your Render Target View(that is 2D). This is what a Fragment is. For each Fragment, it executes the Pixel Shader, to determine its color.
However, I am only assuming because there is no simple explanation of what it is (That I can find).
So my questions are ...
1: What is a Rasterizer? What are the inputs, and what is the output?
2: What is a fragment, in relation to Rasterizer output.
3: Why is a fragment a float 4 value (SV_Position)? If it just 2D Screen Space for the Render Target View?
4: How does it correlate to the Render Target Output (the 2D Screen Texture)?
5: Is this why we clear the Render Target View(to whatever color) because the Razterizer, and Pixel Shader will not execute on all X,Y locations of the Render Target View?
Thank you!
I do not use DirectXI but OpenGL instead but the terminology should bi similar if not the same. My understanding is this:
(scene geometry) -> [Vertex shader] -> (per vertex data)
(per vertex data) -> [Geometry&Teseletaion shader] -> (per primitive data)
(per primitive data) -> [rasterizer] -> (per fragment data)
(per fragment data) -> [Fragment shader] -> (fragment)
(fragment) -> [depth/stencil/alpha/blend...]-> (pixels)
So in Vertex shader you can perform any per vertex operations like transform of coordinate systems, pre-computation of needed parameters etc.
In geometry and teselation you can compute normals from geometry, emit/convert primitives and much much more.
The Rasterizer then convert geometry (primitive) into fragments. This is done by interpolation. It basically divide the viewed part of any primitive into fragments see convex polygon rasterizer.
Fragments are not pixels nor super pixels but they are close to it. The difference is that they may or may not be outputted depending on the circumstances and pipeline configuration (Pixels are visible outputs). You can think of them as a possible super-pixels.
Fragment shader convert per fragment data into final fragments. Here you are computing per fragment/pixel lighting,shading, doing all the texture stuff, compute colors etc. The output is also fragment which is basically pixel + some additional info so it does not have just position and color but can have other properties as well (like more colors, depth, alpha, stencil, etc).
This goes into final combiner which provides the depth test and any other enabled tests or functionality like Blending. And only that output goes into framebuffer as pixel.
I think that answered #1,#2,#4.
Now #3 (I may be wrong here due to my lack of knowledge about DirectX) in per fragment data you often need 3D position of fragments for proper lighting or what ever computations and as homogenuous coordinates are used we need 4D (x,y,z,w) vector for it. The fragment itself has 2D coordinates but the 3D position is its interpolated value from geometry passed from Vertex shader. So it may not contain the screen position but world coordinates instead (or any other).
#5 Yes the scene may not cover whole screen and or you need to preset the buffers like Depth, Stencil, Alpha so the rendering works as should and is not invalidated by previous frame results. So we need to clear framebuffers usually at start of frame. Some techniques require multiple clearings per frame others (like glow effect) clears once per multiple frames ...
I've been struggling for some time to find a way in Meshlab to include or transfer UV’s onto a poisson model from source meshes. I will try to explain more of what I’m trying to accomplish below.
My source meshes have uv’s along with texture data. I need to build a fused model and include the texture data. It is for facial expression scan data reconstruction for a production pipeline which ultimately builds a facial rig for animation. Our source scan data includes marker information which we use to register, build a fused scan model which is used to generate a retopologized mesh for blendshapes.
Previously, we were using David3D. http://www.david-3d.com/en/support/downloads
David 3D used poisson surface reconstruction to create a fused model. The fused model it created brought along the uvs and optimized the source textures into 1 uv tile. I'll post a picture of the result below that I'm looking to recreate in MeshLab.
My need to find this solution in meshlab is to build tools to help automate this process. David3D version 5 does not have an development kit to program around.
Is it possible in Meshlab to apply the uvs from the regions used from the source mesh onto the poison model? Could I use a filter to transfer them? Reproject them?
Or is there another reconstruction method/ process from within Meshlab that will keep the uv’s?
Here is an image of what the resulting uv parameter looks like from David. The uvs are white on the left half of the image.
Thank You,David3D UV Layout Result
Dan
No, in MeshLab there is no direct way to transfer UV mapping between two layers.
This is because UV transfer is not, in the general case, a trivial task. It is not simply a matter of assigning to the new surface the "closest" UV of the original mesh: this would not work on UV discontinuities, which are present in the example you linked. Additionally, the two meshes should be almost coincident, otherwise you would also have problems also in defining the "closest" UV.
There are a couple ways to do it, but require manual work and a re-sampling of the texture:
create a UV mapping of the re-meshed model using whatever tool you may have, then resample the existing texture on the new parametrization using "transfer: vertex attributes to Texture (1 or 2 meshes)", using texture color as source
load the original mesh, and using the screenshot function, create "virtual" photos of the model (turn off illumination and do NOT use ortho views), adding them as raster layers, until the model surface has been fully covered. Load the new model, that should be in the same space, and texture-map it using the "parametrization + texturing " using those registered images
In MeshLab it is also possible to create a new texture from the original images, if you have a way to import the registered cameras...
TL;DR: UV coords to color channels → Vertex Attribute Transfer → Color channels back to UV coords
I have had very good results kludging it through the color channels, like this (say you are transfering from layer A to layer B):
Make sure A and B are roughly aligned with eachother (you can use the ICP filter if needed).
Select layer A, then:
Texture → Convert Per Wedge UV to Per Vertex UV (if you've got wedge coords)
Color Creation → Per Vertex Color Function, and transfer the tex coords to the color channels (assuming UV range 0-1, you'll want to tweak these if your range is larger):
func r = 255.0 * vtu
func g = 255.0 * vtv
func b = 0
Sampling → Vertex Attribute Transfer, and use this to transfer the vertex colors (which now hold texture coordinates) from layer A to layer B.
source mesh = layer A
target mesh = layer B
check Transfer Color
set distance large enough to not miss any spots
Now select layer B, which contains the mapped vertex colors, and do the opposite that you did for A:
Texture → Per Vertex Texture Function
func u = r / 255.0
func v = g / 255.0
Texture → Convert Per Vertex UV to Per Wedge UV
And that's it.
The results aren't going to be perfect, but in practice I often find them sufficient. In particular:
If the texture is not continuously mapped to layer A (e.g. maybe you've got patches of image mapped to certain areas, etc.), it's very possible for the attribute transfer to B (especially when upsampling) to have some vertices be interpolated across patch boundaries, which will probably lead to visual artifacts along patch boundaries.
UV coords may be quantized by conversion to a color channel and back. (You could maybe eliminate this by stretching U out over all three color channels, then transferring U, then repeating for V -- never tried it though.)
That said, there's a lot of cases it works in.
I may or may not add images / video to this post another day.
PS Meshlab is pretty straightforward to build from source; it might be possible to add a UV coordinate option to the Vertex Attribute Transfer filter. But, to make it more useful, you'd want to make sure that you didn't interpolate across boundary edges in the mapped UV projection. Definitely a project I'd like to work on some day... in theory. If that ever happens I'll post a link here.
I currently have 16 tiles, with individual images that make up 1 big map. I pan by transforming right at the beginning before any actual drawing with this:
GL.Translate(G_.Pan(0), G_.Pan(1), 0)
Then I zoom by doing this:
GL.Ortho(-G_.Size * 1.5 ^ G_.ZoomFactor, G_.Size * 1.5 ^ G_.ZoomFactor, G_.Size * 1.5 ^ G_.ZoomFactor, -G_.Size * 1.5 ^ G_.ZoomFactor, -1, 1)
G_.Size is a constant that only varies on startup depending on parameters, zoom factor ranges from -1 to -13
What I want to be able to do is check if 1 of the 16 tiles is within the visible area, so then I stop them drawing when they are not on screen. I had found some quite complex methods for doing it, but it was 3D and seemed like a lot of work for something that should be simple. I would of thought it would of been something like just checking if a point is within the bounds of visible area, but I have no idea on how to get the visible area.
Andon M Coleman already suggested you to implement projection volume culling (a generalized form of frustum culling). This is however outside the scope of OpenGL. You must understand that OpenGL is not a "magical" scene graph that does scene management and the likes. It's mere drawing API; what it does is putting shaded, textured points, lines or triangles on the screen and that's it. The rest is up to you, or the libraries you choose to implement it.
In the case of projection volume culling you're testing if a given piece of geometry intersects with the volume defined by the planes that form the borders of the volume. Your projection matrix defines such planes, specifically it transform the view space vertex position volume into the range [-1;1]×[-1;1]×[0;1] of perspective divided clip space. So by inverting the projection matrix and unprojection the corners of the [-1;1]×[-1;1]×[0;1] cube through that you determine the limiting planes of the projection volume in view space.
You then use that information to intersect your quads with the volume to see if they cross it, i.e. are in any way visible.
How can I do a basic face alignment on a 2-dimensional image with the assumption that I have the position/coordinates of the mouth and eyes.
Is there any algorithm that I could implement to correct the face alignment on images?
Face (or image) alignment refers to aligning one image (or face in your case) with respect to another (or a reference image/face). It is also referred to as image registration. You can do that using either appearance (intensity-based registration) or key-point locations (feature-based registration). The second category stems from image motion models where one image is considered a displaced version of the other.
In your case the landmark locations (3 points for eyes and nose?) provide a good reference set for straightforward feature-based registration. Assuming you have the location of a set of points in both of the 2D images, x_1 and x_2 you can estimate a similarity transform (rotation, translation, scaling), i.e. a planar 2D transform S that maps x_1 to x_2. You can additionally add reflection to that, though for faces this will most-likely be unnecessary.
Estimation can be done by forming the normal equations and solving a linear least-squares (LS) problem for the x_1 = Sx_2 system using linear regression. For the 5 unknown parameters (2 rotation, 2 translation, 1 scaling) you will need 3 points (2.5 to be precise) for solving 5 equations. Solution to the above LS can be obtained through Direct Linear Transform (e.g. by applying SVD or a matrix pseudo-inverse). For cases of a sufficiently large number of reference points (i.e. automatically detected) a RANSAC-type method for point filtering and uncertainty removal (though this is not your case here).
After estimating S, apply image warping on the second image to get the transformed grid (pixel) coordinates of the entire image 2. The transform will change pixel locations but not their appearance. Unavoidably some of the transformed regions of image 2 will lie outside the grid of image 1, and you can decide on the values for those null locations (e.g. 0, NaN etc.).
For more details: R. Szeliski, "Image Alignment and Stitching: A Tutorial" (Section 4.3 "Geometric Registration")
In OpenCV see: Geometric Image Transformations, e.g. cv::getRotationMatrix2D cv::getAffineTransform and cv::warpAffine. Note though that you should estimate and apply a similarity transform (special case of an affine) in order to preserve angles and shapes.
For the face there is lot of variability in feature points. So it won't be possible to do a perfect fit of all feature points by just affine transforms. The only way to align all the points perfectly is to warp the image given the points. Basically you can do a triangulation of image given the points and do a affine warp of each triangle to get the warped image where all the points are aligned.
Face detection could be handled based on the just eye positions.
Herein, OpenCV, Dlib and MTCNN offers to detect faces and eyes. Besides, it is a python based framework but deepface wraps those methods and offers an out-of-the box detection and alignment function.
detectFace function applies detection and alignment in the background respectively.
#!pip install deepface
from deepface import DeepFace
backends = ['opencv', 'ssd', 'dlib', 'mtcnn']
DeepFace.detectFace("img.jpg", detector_backend = backends[0])
Besides, you can apply detection and alignment manually.
from deepface.commons import functions
img = functions.load_image("img.jpg")
backends = ['opencv', 'ssd', 'dlib', 'mtcnn']
detected_face = functions.detect_face(img = img, detector_backend = backends[3])
plt.imshow(detected_face)
aligned_face = functions.align_face(img = img, detector_backend = backends[3])
plt.imshow(aligned_face)
processed_img = functions.detect_face(img = aligned_face, detector_backend = backends[3])
plt.imshow(processed_img)
There's a section Aligning Face Images in OpenCV's Face Recognition guide:
http://docs.opencv.org/trunk/modules/contrib/doc/facerec/facerec_tutorial.html#aligning-face-images
The script aligns given images at the eyes. It's written in Python, but should be easy to translate to other languages. I know of a C# implementation by Sorin Miron:
http://code.google.com/p/stereo-face-recognition/
Using shader model 5/D3D11/HLSL.
I'd like to treat a 2D array of texels as a 2D matrix of Vectors.
u
v (1,4,3,9) (7, 5.5, 4.9, 2.1)
(Each texel is a 4-component vector). I need to access specific ranges of the data in the texture, for different shaders. So, the ranges to access in the texture naturally should be indexed as u,v components.
How would I do that in HLSL? I'm thinking the following:
Create the texture as per normal
Load your vector values into the texture (1 vector per texel)
Turn off all linear interpolation for texture sampling ("nearest neighbour")
In the shader, look up vectors you need using texture coordinates
The only thing I feel is shaky is whether there will be strange errors introduced when I index the texture using floating point u's and v's.
If the texture is 1024x1024 texels, and I'm trying to index (3,2)->(3,7), that would be u=(3/1024,2/1024)->(3/1024,7/1024) which feels a bit shaky. Is there a way to index the texture by int components, perhaps? Or will it just work out fine?
Texture2DArray
Not desiring to use a GPGPU framework just for this (so no CUDA suggestions pls :).
You can do it using operator[] in hlsl 5.0
See here