World space to screen space (perspective projection) - camera

I'm using a 3d engine and need to translate between 3d world space and 2d screen space using perspective projection, so I can place 2d text labels on items in 3d space.
I've seen a few posts of various answers to this problem but they seem to use components I don't have.
I have a Camera object, and can only set it's current position and lookat position, it cannot roll. The camera is moving along a path and certain target object may appear in it's view then disappear.
I have only the following values
lookat position
position
vertical FOV
Z far
Z near
and obviously the position of the target object.
Can anyone please give me an algorithm that will do this using just these components?
Many thanks.

all graphics engines use matrices to transform between different coordinats systems. Indeed OpenGL and DirectX uses them, because they are the standard way.
Cameras usually construct the matrices using the parameters you have:
view matrix (transform the world to position in a way you look at it from the camera position), it uses lookat position and camera position (also the up vector which usually is 0,1,0)
projection matrix (transforms from 3D coordinates to 2D Coordinates), it uses the fov, near, far and aspect.
You could find information of how to construct the matrices in internet searching for the opengl functions that create them:
gluLookat creates a viewmatrix
gluPerspective: creates the projection matrix
But I cant imagine an engine that doesnt allow you to get these matrices, because I can ensure you they are somewhere, the engine is using it.
Once you have those matrices, you multiply them, to get the viewprojeciton matrix. This matrix transform from World coordinates to Screen Coordinates. So just multiply the matrix with the position you want to know (in vector 4 format, being the 4ยบ component 1.0).
But wait, the result will be in homogeneous coordinates, you need to divide X,Y,Z of the resulting vector by W, and then you have the position in Normalized screen coordinates (0 means the center, 1 means right, -1 means left, etc).
From here it is easy to transform multiplying by width and height.
I have some slides explaining all this here: https://docs.google.com/presentation/d/13crrSCPonJcxAjGaS5HJOat3MpE0lmEtqxeVr4tVLDs/present?slide=id.i0
Good luck :)
P.S: when you work with 3D it is really important to understand the three matrices (model, view and projection), otherwise you will stumble every time.

so I can place 2d text labels on items
in 3d space
Have you looked up "billboard" techniques? Sometimes just knowing the right term to search under is all you need. This refers to polygons (typically rectangles) that always face the camera, regardless of camera position or orientation.

Related

Extracting a plane from an image taken by a camera

I have a camera at a known fixed location and orientation.
I also have a plane at a known location whose z position changes.
I want to turn the image from the camera into a top down view of the plane.
I can do this without knowing any positions by using the 4 points of the plane for a homography matrix and warping the image but each time the plane moves in Z I have to repeat this process.
After searching around online most methods seem to center on finding features of the image (using SIFT or something like it) then computing a homography matrix.
With the problem so constrained I thought there may be a simple linear algebra based approach.

Camera Frustum Planes in Unity 3D

I understand that CalculateFrustumPlanes() in Unity3D returns an array of Plane objects, each representing a different frustum plane, but I can't find any documentation to suggest which element is which?
for example
[0] = Front
[1] = Back
etc.
I need to calculate whether a point in space (like the centre point of a Bounding volume) is in the camera frustum, for a Quad tree system.
What is exactly the order of the Planes in the returned array is not documented (and I don't know it).
Anyway I think you can figure it out without much effort: you just need to put the camera in a well know orientation and check the normal value's of each Plane.
I need to calculate whether a point in space (like the centre point of
a Bounding volume) is in the camera frustum, for a Quad tree system.
For a Quad Tree system, I think the intersection between the frustum and a GameObject's AABB is enough, so you don't even need to know exactly the order of the Plane's in the array to compute it. You can just use GeometryUtility.TestPlanesAABB.
Order: left, right, bottom, top, near, far.

Creating seamless worldmaps with Fractal Brownian Motion

I'm creating heightmaps using Fractal Brownian Motion. I'm then coloring it based on the heights and mapping it to a sphere. My problem is that the heightmap doesn't wrap seamlessly. I've used the Diamond Square algorithm and it's pretty easy to make things seamless using it, but I can't seem to figure out how to do it with fBm and I seem to be having trouble finding an explanation for it on the web.
To clarify, by "seamless", I mean that when I map it to a sphere, it creates a seamless map on the sphere.
Instead of calculating the heightmap per pixel on the heightmap, calculate the heightmap in 3D space based on each point on the sphere and then map that to an image pixel. You're going to have trouble wrapping a 2D, rectangular heightmap like that onto a sphere without getting ugly results at the poles unless you start your calculations from the sphere.
fBM generalizes to 3 dimensions, so given a point on the sphere you can get the height at that point, and then you can do the math to map that value to where it should be stored in the heightmap image.
Or you could use one of the traditional map projections. A cylindrical projection (x, y)->(x, sin y) would give you a seam of just one meridian, which you could rotate to the back. Or you could "antialias" the edge by one or another means.
With a stereographic projection (x,y,z)->(x/(z+1),y/(z+1)), there's only one sour point (the projection point itself).

Simple algorithm for tracking a rectangular blob

I have created an experimental fast rectangular object tracking system; it will be used for headtracking and controllling objects in 3D engine (Ogre3D).
For now I am able to show to the webcam any kind of bright colored rectangle (text markers are good objects) and system registers basic properties of this object (hue/value/lightness and initial width and height in 0 degrees rotation).
After I have registered the trackable object, I do some simple frame processing to create grayscale probabilty map.
So now I have 2 known things:
1) 4 corners for the last object position (it's always a rectangle but it may be rotated)
2) a pretty rectangular (but still far from perfect) blob which is the brightest in the frame. I can get coordinates of any point of the blob without problems, point detection is stable enough.
I can find a bounding rectangle of the object without problems, but I have a problem with detecting the object corners themselves.
I need the simplest possible (quick&dirty would be great) algorithm to scan the image starting with some known coordinates (a point inside the blob) and detect new 4 x,y coordinates of a "blobish" rectangle corners (not corners of a bounding box but corners of the rectangular blob itself).
Ready-to-use C++ function would be awesome, but somehow google doesn't like me today :(
I think that it would be overkill to use some complicated function form OpenCV library just to extract 4 points of a single rectanglular blob. But if you know a quick and efficient way how to do it using OpenCV (it must be real-time and light on CPU because I'll run the 3D engine at the same time) then I would be really grateful.
You can apply Hough transform on segmented image to detect lines. Using detected lines you can calculate their intersection to find the corner coordinates of the blob.

How to render a 2d side-scroller game

I do not really understand the way I'm suppose to render a side-scroller? How do I know what to render when my character move? What kind of positionning should I use for the characters?
I hope my question is clear
The easiest way i've found to do it is have a characterX and characterY variable [integer or float, whatever you want] Then have a cameraX and cameraY variable. Every object in the scene is drawn at theObjectX-cameraX, theObjectY-cameraY...
CameraX/CameraY are tweened by a similar-to-midpoint formula so eventually they'll reach playerx/playery[Cx = (Cx*99+Px)/100] ... yeah
By doing this, every object moves in the stage's space, and is transformed only on render [saving you from headaches]
Use a matrix to define a camera reference frame.
Use space partitioning to split up your level into screens/windows.
Think of your player sprite as any other entity, like enemies and interactive objects.
Now what you want is the abstraction of a camera. You can define a camera as a 3x3 matrix with this layout:
[rotX_X, rotY_X, 0]
[rotX_Y, rotY_Y, 0]
[transX, transY, 1]
The 2x2 sub-matrix in the top-left corner is a rotation matrix. transX and transY defines the translation part, i.e the origin. You also get scaling for free. Just simply scale the rotation part with a scalar, and you have yourself a zoom.
For this to work properly with rotation, your sprites need to be polygons/primitives, say like triangles or quads; you can't just apply the matrix to the positions of the sprites when drawing. If you don't need rotation, just transforming the center point will work fine.
If you want the camera to follow the player, use the player's position as the camera origin. That is the translation vector [transX, transY]
So how do you apply the matrix to entity positions and model vertices? You do a vector-matrix multiplication.
v' = vM^-1, where v' is the new vector, v is the old vector, and M^-1 is the matrix inverse. A camera needs to be an inverse transform because it defines a local coordinate system. An analogy could be: If you are in front of me and I turn left from my reference frame, I am turning your right. This applies to all affine and linear transformations, like scaling, rotation and translation.
Split up your level into sub-parts so you can cull objects and scenery which does not need to be rendered. Your viewport is of a certain size/resolution. Only render scenery and entities which intersect with your viewport. Instead of checking each and every entity against the viewport bounds, assign each entity to a certain sub-screen and test the bounds of the sub-screen against the viewport and camera bounds. If your divide your levels into parts which are the same size as your viewport, then the maximum number of screens visible
at any particular time is:
2 if your camera only scrolls left and right.
4 if your camera scrolls left, right, up and down.
4 if your camera scrolls in any direction, and additionally can be rotated.
A screen-change is an event you can use to activate entities belonging to that screen. That could be enemies, background animations, doors or whatever you like.
If this is your first foray into writing a side-scroller, I'd suggest considering using an already existing game engine (like Construct or Gamemaker or XNA or whatever fits your experience level) so you don't have to worry about what order to render things and how to make it all work. Mess with that a bit--probably exploring a few of them--to get a feel for how they do things then venture out to your own once you've gotten used to it.
Not that there's anything wrong with baptism by fire but it can get pretty overwhelming in my opinion.