Relation between horizontal, vertical and diagonal Field-of-View - camera

Is there a mathematical relation between those values? If I know hFOV and vFOV can I calculate the diagonal FOV without involving other values like focal lengths etc?
My first thought was to use Pythagorean theorem but maybe it's wrong.

The physical quantities of interest are the sensor size and the focal length. The latter, in the pinhole camera model, is the distance between the camera center and the image plane. Therefore, if you denote by f the focal length (in mm), and by W and H respectively the image sensor width and height (in mm), and assume the focal axis is orthogonal to the image plane, simple trigonometry gives:
FOV_Horizontal = 2 * atan(W/2/f) = 2 * atan2(W/2, f) radians
FOV_Vertical = 2 * atan(H/2/f) = 2 * atan2(H/2, f) radians
FOV_Diagonal = 2 * atan2(sqrt(W^2 + H^2)/2, f) radians
Note that, if you have the sensor size and the horizontal or vertical FOV, you can solve one of the first two equations for f and plug it into the third one to get the diagonal FOV.
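For the original question (getting the diagonal FOV from hFOV and vFOV alone), that substitution gives tan(FOV_Diagonal/2) = sqrt(tan^2(FOV_Horizontal/2) + tan^2(FOV_Vertical/2)). A minimal Kotlin sketch of this, assuming the pinhole model above with a centered principal point (all angles in radians):

import kotlin.math.atan
import kotlin.math.sqrt
import kotlin.math.tan

// Diagonal FOV from horizontal and vertical FOV only.
fun diagonalFov(hFov: Double, vFov: Double): Double {
    val th = tan(hFov / 2)   // = (W/2) / f
    val tv = tan(vFov / 2)   // = (H/2) / f
    return 2 * atan(sqrt(th * th + tv * tv))
}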
When, as is usual, the focal length is estimated through camera calibration, and is expressed in pixels, the above expressions need some adapting.
Denote with K the 3x3 camera matrix, with the camera frame having its origin at the camera center (focal point), X axis oriented left-to-right, Y axis top-to-bottom and Z axis toward the scene. Let Wp and Hp respectively be the width and height of the image in pixels.
In the simplest case the focal axis is orthogonal to the image plane (K12 = 0), the pixels are square (K11 = K22), and the principal point is at the image center (K13 = Wp/2; K23 = Hp/2). Then the same equations as above apply, replacing W with Wp, H with Hp and f with K11.
A slightly more complex case is the one just as above, but with the principal point off-center. Then one simply adds the two sides of each FOV angle. So, for example:
FOV_Horizontal = atan2(K13, K11) + atan2(Wp - K13, K11)
If the pixels are not square the same expressions apply for FOV_vertical, but using K22 and Hp, etc. The diagonal is a tad trickier, since you need to "convert" the image height into the same units as the width. Use the "pixel aspect ratio" PAR=K22/K11 for this purpose, so that:
FOV_Diagonal = 2 * atan2(sqrt(Wp^2 + (Hp/PAR)^2) / 2, K11)
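Putting the calibrated-case formulas together, here is a small Kotlin sketch, assuming zero skew; fx, fy, cx, cy stand for K11, K22, K13, K23, and the diagonal uses the centered-principal-point expression above:

import kotlin.math.atan2
import kotlin.math.sqrt

// FOV angles (in radians) from an estimated camera matrix and the image size in pixels.
fun fovFromCameraMatrix(fx: Double, fy: Double, cx: Double, cy: Double,
                        wp: Double, hp: Double): Triple<Double, Double, Double> {
    val hFov = atan2(cx, fx) + atan2(wp - cx, fx)   // two sides of the horizontal angle
    val vFov = atan2(cy, fy) + atan2(hp - cy, fy)
    val par = fy / fx                               // pixel aspect ratio K22/K11
    val dFov = 2 * atan2(sqrt(wp * wp + (hp / par) * (hp / par)) / 2, fx)
    return Triple(hFov, vFov, dFov)
}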


How can I implement degrees round the drawn compass

I am developing a GPS waypoint application. I have started by drawing my compass, but I am finding it difficult to implement the degree text around the circle. Can anyone help me with a solution? The first image shows the circle of the compass I have drawn.
The second image shows what I want to achieve, that is, the degree text placed around the compass.
Assuming you're doing this in a custom view, you need to use one of the drawText methods on the Canvas passed in to onDraw.
You'll have to do a little trigonometry to get the x, y position of the text - basically if there's a circle with radius r you're placing the text origins on (i.e. how far out from the centre they are), and you're placing one at angle θ:
x = r * cosθ
y = r * sinθ
The sin and cos functions take a value in radians, so you'll have to convert that if you're using degrees:
val radians = (degrees.toDouble() / 360.0) * (2.0 * Math.PI)
and 0 degrees is at 3 o'clock on the circle, not 12, so you'll have to subtract 90 degrees from your usual compass positions (e.g. 90 degrees on the compass is 0 degrees in the local coordinates). The negative values you get are fine, -90 is the same as 270. If you're trying to replicate the image you posted (where the numbers and everything else are rotating while the needle stays at the top) you'll have to apply an angle offset anyway!
These x and y values are distances from the centre of the circle, which probably needs to be the centre of your view (which you've probably already calculated to draw your circle). You'll also need to account for the extra space needed to draw those labels, scaling everything so it all fits in the View.
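Putting those pieces together, here is a minimal sketch of a custom view that draws a label every 30 degrees; the paint settings, the 30-degree step and the 0.8 radius factor are just illustrative choices:

import android.content.Context
import android.graphics.Canvas
import android.graphics.Paint
import android.view.View
import kotlin.math.cos
import kotlin.math.sin

class CompassView(context: Context) : View(context) {
    private val textPaint = Paint(Paint.ANTI_ALIAS_FLAG).apply {
        textSize = 32f
        textAlign = Paint.Align.CENTER
    }

    override fun onDraw(canvas: Canvas) {
        super.onDraw(canvas)
        val cx = width / 2f
        val cy = height / 2f
        val r = minOf(cx, cy) * 0.8f                    // leave room for the labels
        for (deg in 0 until 360 step 30) {
            // shift by -90 so 0 degrees lands at 12 o'clock instead of 3 o'clock
            val radians = Math.toRadians((deg - 90).toDouble())
            val x = cx + r * cos(radians).toFloat()
            val y = cy + r * sin(radians).toFloat()
            canvas.drawText(deg.toString(), x, y, textPaint)
        }
    }
}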

Convert grid of dots in XY plane from camera coordinates to real world coordinates

I am writing a program. I have, say, a grid of dots on a piece of paper. I fix one end and bend the paper toward the screen, giving me a trapezoidal shape from the camera's point of view. I have the (x,y) camera coordinate of each dot. Is there a simple way I can change these (x,y) to real life (x,y) which should give me a rectangle? I have the camera/real (x,y) of the original flat sheet of paper pre-bend if that helps.
I have looked at 3D Camera coordinates to world coordinates (change of basis?) and Transforming screen coordinates from security camera to real world coordinates.
Look up "homography". The transformation from a plane in 3D space to its image as captured by an ideal pinhole camera is a homography. It can be represented as a 3x3 matrix H that transforms the 3D coordinates X of points in the world to their corresponding homogeneous image coordinates x:
x = H * X
where X is a 3x1 vector holding the point's coordinates in the world plane, and x = [u, v, w]^T is the image point, both in homogeneous coordinates.
Given a minimum of 4 matches between world and image points (e.g. the corners of a rectangle) you can estimate the parameters of the matrix H. For details, look up "DLT algorithm". In OpenCV the routine to use is findHomography.
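Once H has been estimated (here it would map image coordinates back to the flattened paper plane, so estimate it from image-to-world correspondences), applying it is just a matrix-vector product followed by dehomogenisation. A minimal Kotlin sketch with H as a plain 3x3 array; the function name is just illustrative:

// Apply a 3x3 homography to an image point (u, v) and dehomogenise,
// giving real-world (x, y) coordinates on the plane.
fun applyHomography(h: Array<DoubleArray>, u: Double, v: Double): Pair<Double, Double> {
    val x = h[0][0] * u + h[0][1] * v + h[0][2]
    val y = h[1][0] * u + h[1][1] * v + h[1][2]
    val w = h[2][0] * u + h[2][1] * v + h[2][2]
    return Pair(x / w, y / w)
}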

Initial velocity vector for circular orbit

I'm trying to create a solar system simulation, and I'm having problems trying to figure out initial velocity vectors for random objects I've placed into the simulation.
Assume:
- I'm using Gaussian grav constant, so all my units are AU/Solar Masses/Day
- Using x,y,z for coordinates
- One star, which is fixed at 0,0,0. Quasi-random mass is determined for it
- I place a planet at a random x,y,z coordinate, with its own quasi-random mass.
Before I start the nbody loop (using RK4), I would like the initial velocity of the planet to be such that it has a circular orbit around the star. Other placed planets will, of course, pull on it once the simulation starts, but I want to give it the chance to have a stable orbit...
So, in the end, I need to have an initial velocity vector (x,y,z) for the planet that means it would have a circular orbit around the star after 1 timestep.
Help? I've been beating my head against this for weeks and I don't believe I have any reasonable solution yet...
It is quite simple if you assume that the mass of the star M is much bigger than the total mass of all planets sum(m[i]). This simplifies the problem as it allows you to pin the star to the centre of the coordinate system. Also it is much easier to assume that the motion of all planets is coplanar, which further reduces the dimensionality of the problem to 2D.
1. First determine the magnitude of the circular orbit velocity given the magnitude of the radius vector r[i] (the radius of the orbit). It only depends on the mass of the star, because of the above mentioned assumption: v[i] = sqrt(mu / r[i]), where mu is the standard gravitational parameter of the star, mu = G * M.
2. Pick a random orbital phase phi[i] by sampling uniformly from [0, 2*pi). The initial position of the planet in Cartesian coordinates is then:
x[i] = r[i] * cos(phi[i])
y[i] = r[i] * sin(phi[i])
3. With circular orbits the velocity vector is always perpendicular to the radius vector, i.e. its direction is phi[i] +/- pi/2 (+pi/2 for counter-clockwise (CCW) rotation and -pi/2 for clockwise rotation). Taking CCW rotation as an example, the Cartesian components of the planet's velocity are:
vx[i] = v[i] * cos(phi[i] + pi/2) = -v[i] * sin(phi[i])
vy[i] = v[i] * sin(phi[i] + pi/2) = v[i] * cos(phi[i])
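A minimal Kotlin sketch of these three steps, only illustrative: mu must already be expressed in your unit system (with the Gaussian gravitational constant k and the star mass M in solar masses, mu = k^2 * M in AU^3/day^2):

import kotlin.math.PI
import kotlin.math.cos
import kotlin.math.sin
import kotlin.math.sqrt
import kotlin.random.Random

data class State2D(val x: Double, val y: Double, val vx: Double, val vy: Double)

fun circularOrbit2D(mu: Double, r: Double, rng: Random = Random.Default): State2D {
    val v = sqrt(mu / r)                   // circular-orbit speed
    val phi = rng.nextDouble(0.0, 2 * PI)  // random orbital phase
    return State2D(
        x = r * cos(phi),
        y = r * sin(phi),
        vx = -v * sin(phi),                // velocity perpendicular to the radius (CCW)
        vy = v * cos(phi)
    )
}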
This easily extends to coplanar 3D motion by adding z[i] = 0 and vz[i] = 0, but it makes no sense, since there are no forces in the Z direction and hence z[i] and vz[i] would forever stay equal to 0 (i.e. you will be solving for a 2D subspace problem of the full 3D space).
With a full 3D simulation where each planet moves in a randomly inclined initial orbit, one can proceed as follows:
1. Determine the magnitude of the orbital velocity exactly as in step 1 of the 2D case: v[i] = sqrt(mu / r[i]).
2. Pick an initial position uniformly at random on the surface of the unit sphere (see here for examples of how to do that), then scale the unit-sphere coordinates by the magnitude of r[i].
3. In the 3D case, instead of two possible perpendicular vectors, there is a whole tangential plane in which the planet's velocity must lie. The tangential plane has its normal vector collinear with the radius vector, so dot(r[i], v[i]) = x[i]*vx[i] + y[i]*vy[i] + z[i]*vz[i] = 0. One can pick any vector perpendicular to r[i], for example e1[i] = (-y[i], x[i], 0). This becomes the null vector at the poles, so there pick e1[i] = (0, -z[i], y[i]) instead. Another perpendicular vector is then the cross product of r[i] and e1[i]:
e2[i] = r[i] x e1[i] = (r[i].y*e1[i].z - r[i].z*e1[i].y, r[i].z*e1[i].x - r[i].x*e1[i].z, r[i].x*e1[i].y - r[i].y*e1[i].x)
Now e1[i] and e2[i] can be normalised by dividing them by their norms:
n1[i] = e1[i] / ||e1[i]||
n2[i] = e2[i] / ||e2[i]||
where ||a|| = sqrt(dot(a, a)) = sqrt(a.x^2 + a.y^2 + a.z^2). Now that you have an orthonormal basis in the tangential plane, pick one random angle omega in [0, 2*pi) and compute the velocity vector as v[i] * (cos(omega) * n1[i] + sin(omega) * n2[i]), or in Cartesian components:
vx[i] = v[i] * (cos(omega) * n1[i].x + sin(omega) * n2[i].x)
vy[i] = v[i] * (cos(omega) * n1[i].y + sin(omega) * n2[i].y)
vz[i] = v[i] * (cos(omega) * n1[i].z + sin(omega) * n2[i].z)
Note that by construction the basis in step 3 depends on the radius vector, but this does not matter since a random direction (omega) is added.
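Here is a Kotlin sketch of that 3D procedure as well, again only illustrative: a uniform random point on a sphere of radius r, then a random tangential direction scaled to the circular-orbit speed:

import kotlin.math.PI
import kotlin.math.abs
import kotlin.math.cos
import kotlin.math.sin
import kotlin.math.sqrt
import kotlin.random.Random

data class Vec3(val x: Double, val y: Double, val z: Double) {
    operator fun plus(o: Vec3) = Vec3(x + o.x, y + o.y, z + o.z)
    operator fun times(s: Double) = Vec3(x * s, y * s, z * s)
    fun cross(o: Vec3) = Vec3(y * o.z - z * o.y, z * o.x - x * o.z, x * o.y - y * o.x)
    fun norm() = sqrt(x * x + y * y + z * z)
    fun normalized() = times(1.0 / norm())
}

fun circularOrbit3D(mu: Double, r: Double, rng: Random = Random.Default): Pair<Vec3, Vec3> {
    // uniform random point on the unit sphere (uniform in cos(theta) and azimuth), scaled by r
    val u = rng.nextDouble(-1.0, 1.0)
    val theta = rng.nextDouble(0.0, 2 * PI)
    val s = sqrt(1 - u * u)
    val pos = Vec3(s * cos(theta), s * sin(theta), u) * r

    // orthonormal basis of the tangential plane (switch formula near the poles)
    val e1 = if (abs(pos.z) < abs(pos.x)) Vec3(-pos.y, pos.x, 0.0) else Vec3(0.0, -pos.z, pos.y)
    val n1 = e1.normalized()
    val n2 = pos.cross(e1).normalized()

    // random direction in the tangential plane, scaled to the circular-orbit speed
    val omega = rng.nextDouble(0.0, 2 * PI)
    val v = sqrt(mu / r)
    val vel = (n1 * cos(omega) + n2 * sin(omega)) * v
    return Pair(pos, vel)    // initial position and velocity
}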
As to the choice of units, in simulation science we always tend to keep things in natural units, i.e. units where all computed quantities are dimensionless and kept in [0, 1] or at least within 1-2 orders of magnitude and so the full resolution of the limited floating-point representation could be used. If you take the star mass to be in units of Solar mass, distances to be in AUs and time to be in years, then for an Earth-like planet at 1 AU around a Sun-like star, the magnitude of the orbital velocity would be 2*pi (AU/yr) and the magnitude of the radius vector would be 1 (AU).
Just let centripetal acceleration equal gravitational acceleration.
m1 * v^2 / r = G * m1 * m2 / r^2
v = sqrt( G m2 / r )
Of course the star mass m2 must be much greater than the planet mass m1 or you don't really have a one-body problem.
Units are a pain in the butt when setting up physics problems; I've spent days resolving errors that came down to seconds vs. timestep units. I'd recommend switching from AU/Solar Masses/Day to plain SI (mks) units before anything else.
And, keep in mind that computers have inherently limited precision. An nbody simulation accumulates integration error, so after a million or a billion steps you will certainly not have a circle, regardless of the step duration. I don't know much about that math, but I think stable n-body systems keep themselves stable by resonances which absorb minor variations, whether introduced by nearby stars or by the FPU. So the setup might work fine for a stable, 5-body problem but still fail for a 1-body problem.
As Ed suggested, I would use the mks units, rather than some other set of units.
For the initial velocity, I would agree with part of what Ed said, but I would use the vector form of the centripetal acceleration:
m1 * v^2 / r * r_hat = G * m1 * m2 / r^2 * r_hat
Set z to 0, and convert from polar coordinates to cartesian coordinates (x,y). Then, you can assign either y or x an initial velocity, and compute what the other variable is to satisfy the circular orbit criteria. This should give you an initial (Vx,Vy) that you can start your nbody problem from. There should also be quite a bit of literature out there on numerical recipes for nbody central force problems.

Extract transform and rotation matrices from homography?

I have 2 consecutive images from a camera and I want to estimate the change in camera pose:
I calculate the optical flow:
Const MAXFEATURES As Integer = 100
imgA = New Image(Of [Structure].Bgr, Byte)("pic1.bmp")
imgB = New Image(Of [Structure].Bgr, Byte)("pic2.bmp")
grayA = imgA.Convert(Of Gray, Byte)()
grayB = imgB.Convert(Of Gray, Byte)()
imagesize = cvGetSize(grayA)
pyrBufferA = New Emgu.CV.Image(Of Emgu.CV.Structure.Gray, Byte) _
(imagesize.Width + 8, imagesize.Height / 3)
pyrBufferB = New Emgu.CV.Image(Of Emgu.CV.Structure.Gray, Byte) _
(imagesize.Width + 8, imagesize.Height / 3)
features = MAXFEATURES
featuresA = grayA.GoodFeaturesToTrack(features, 0.01, 25, 3)
grayA.FindCornerSubPix(featuresA, New System.Drawing.Size(10, 10),
New System.Drawing.Size(-1, -1),
New Emgu.CV.Structure.MCvTermCriteria(20, 0.03))
features = featuresA(0).Length
Emgu.CV.OpticalFlow.PyrLK(grayA, grayB, pyrBufferA, pyrBufferB, _
featuresA(0), New Size(25, 25), 3, _
New Emgu.CV.Structure.MCvTermCriteria(20, 0.03D),
flags, featuresB(0), status, errors)
pointsA = New Matrix(Of Single)(features, 2)
pointsB = New Matrix(Of Single)(features, 2)
For i As Integer = 0 To features - 1
pointsA(i, 0) = featuresA(0)(i).X
pointsA(i, 1) = featuresA(0)(i).Y
pointsB(i, 0) = featuresB(0)(i).X
pointsB(i, 1) = featuresB(0)(i).Y
Next
Dim Homography As New Matrix(Of Double)(3, 3)
cvFindHomography(pointsA.Ptr, pointsB.Ptr, Homography, HOMOGRAPHY_METHOD.RANSAC, 1, 0)
and it looks right, the camera moved leftwards and upwards:
Now I want to find out how much the camera moved and rotated. If I declare my camera position and what it's looking at:
' Create camera location at origin and lookat (straight ahead, 1 in the Z axis)
Location = New Matrix(Of Double)(2, 3)
location(0, 0) = 0 ' X location
location(0, 1) = 0 ' Y location
location(0, 2) = 0 ' Z location
location(1, 0) = 0 ' X lookat
location(1, 1) = 0 ' Y lookat
location(1, 2) = 1 ' Z lookat
How do I calculate the new position and lookat?
If I'm doing this all wrong or if there's a better method, any suggestions would be very welcome, thanks!
For pure camera rotation, R = A^-1 * H * A. To prove this, consider the plane-to-image homographies H1 = A and H2 = A * R, where A is the camera intrinsic matrix. Then the homography between the two images is H12 = H2 * H1^-1 = A * R * A^-1, from which you can obtain R.
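As a rough illustration of that pure-rotation formula, here is a small Kotlin sketch with plain 3x3 arrays (independent of the Emgu types above); it assumes a zero-skew intrinsic matrix and that H has already been scaled so det(H) = 1, since cvFindHomography returns it only up to scale:

fun matMul3(a: Array<DoubleArray>, b: Array<DoubleArray>): Array<DoubleArray> =
    Array(3) { i -> DoubleArray(3) { j -> (0..2).sumOf { k -> a[i][k] * b[k][j] } } }

// R = A^-1 * H * A, with A = [[fx, 0, cx], [0, fy, cy], [0, 0, 1]]
fun rotationFromHomography(h: Array<DoubleArray>,
                           fx: Double, fy: Double,
                           cx: Double, cy: Double): Array<DoubleArray> {
    val a = arrayOf(
        doubleArrayOf(fx, 0.0, cx),
        doubleArrayOf(0.0, fy, cy),
        doubleArrayOf(0.0, 0.0, 1.0))
    // closed-form inverse of a zero-skew intrinsic matrix
    val aInv = arrayOf(
        doubleArrayOf(1.0 / fx, 0.0, -cx / fx),
        doubleArrayOf(0.0, 1.0 / fy, -cy / fy),
        doubleArrayOf(0.0, 0.0, 1.0))
    return matMul3(matMul3(aInv, h), a)
}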
Camera translation is harder to estimate. If the camera translates you have to find a fundamental matrix first (not a homography): x'^T * F * x = 0. Then convert it into an essential matrix E = A^T * F * A, and decompose E into rotation and translation, E = [t]x * R, where [t]x is the skew-symmetric cross-product matrix of the translation vector. The decomposition is not obvious, see this.
The rotation you get will be exact, while the translation vector can be found only up to scale. Intuitively this scale ambiguity means that from the two images alone you cannot tell whether the objects are close and small or far away and large. To disambiguate you can use objects of known size, a known distance between two points, etc.
Finally, note that the human visual system has a similar problem: though we "know" the distance between our eyes, when they are converged on an object the disparity is always zero, and from disparity alone we cannot tell what the distance is. Human vision relies on triangulation from the eyes' vergence signal to figure out absolute distance.
Well, what you're looking at is, in simple terms, a Pythagorean theorem problem, a^2 + b^2 = c^2. However, when it comes to camera-based applications, things are not easy to determine accurately. You have found half of the detail you need for "a"; however, finding "b" or "c" is much harder.
The Short Answer
Basically it can't be done with a single camera. But it can be done with two cameras.
The Long Winded Answer (Thought I'd explain in more depth, no pun intended)
I'll try and explain. Say we select two points within our image and move the camera left. We know the distance from the camera to each point: B1 is 20mm away and B2 is 40mm away. Now let's assume that we process the image and our measurements are A1 = (0,2) and A2 = (0,4), related to B1 and B2 respectively. A1 and A2 are not physical measurements; they are pixels of movement.
What we now have to do is multiply the change in A1 and A2 by a calculated constant, which will be the real-world distance at B1 and B2. NOTE: each of these constants is different, depending on the distance B*. This all relates to the angle of view, more commonly called the field of view in photography, at different distances. You can calculate the constant accurately if you know the size of each pixel on the camera CCD and the focal length of the lens inside the camera.
I would expect this isn't the case, so at several different distances you have to place an object of known length and see how many pixels it takes up (close up you can use a ruler to make things easier). From these measurements you form a curve with a line of best fit, where the X-axis is the distance of the object and the Y-axis is the pixel-to-distance constant that you must multiply your movement by.
So how do we apply this curve? Well, it's guesswork. In theory, the larger the measured movement A*, the closer the object is to the camera. In our example, say the ratios for A1 and A2 are 5mm and 3mm per pixel respectively; we would then know that point B1 has moved 10mm (2 x 5mm) and B2 has moved 12mm (4 x 3mm). But let's face it - we will never know B, and we will never be able to tell whether 20 pixels of movement is a close object that barely moved or a far-away object that moved a much greater distance. This is why things like the Xbox Kinect use additional sensors to get depth information that can be tied to the objects within the image.
What you are attempting could be done with two cameras: since the distance between the cameras is known, the movement can be calculated much more accurately (effectively without using a depth sensor). The maths behind this is extremely complex and I would suggest looking up some journal papers on the subject. If you would like me to explain the theory, I can attempt to.
All my experience comes from designing high-speed video acquisition and image processing for my PhD, so trust me, it can't be done with one camera, sorry. I hope some of this helps.
Cheers
Chris
[EDIT]
I was going to add a comment but this is easier due to the bulk of information:
Since it is the Kinect, I will assume you have some relevant depth information associated with each point; if not, you will need to figure out how to get this.
The equation you will need to start of with is for the Field of View (FOV):
o/d = i/f
Where:
f is the focal length of the lens, usually given in mm (e.g. 18, 28, 30, 50 are standard examples)
d is the object distance from the lens, gathered from the Kinect data
o is the object dimension (or "field of view" perpendicular to and bisected by the optical axis).
i is the image dimension (or "field stop" perpendicular to and bisected by the optical axis).
We need to calculate i (which here is a diagonal measurement), since o is our unknown. For this we need the size of a pixel on the CCD, which will be in micrometres (µm); you will need to find this information out. For now we will take it as 14µm, which is typical for a midrange area-scan camera.
So first we need to work out the horizontal dimension of i (ih), which is the number of pixels across the width of the sensor multiplied by the size of a CCD pixel (we will use a 640 x 320 resolution):
so: ih = 640*14um = 8960um
= 8960/1000 = 8.96mm
Now we need the vertical dimension of i (iv), same process but with the height:
so: iv = (320 * 14um) / 1000 = 4.48mm
Now i is found with the Pythagorean theorem, a^2 + b^2 = c^2:
so: i = sqrt(ih^2 + iv^2)
= 10.02 mm
Now we will assume we have a 28mm lens. Again, the exact value will have to be found out. So our equation, rearranged to give us o, is:
o = (i * d) / f
Remember o will be diagonal (we will assume the object or point is 50mm away):
o = (10.02mm * 50mm) / 28mm
  = 17.89mm
Now we need to work out the horizontal dimension of o (oh) and the vertical dimension of o (ov), as these will give us the real-world distance per pixel that the object has moved. Since i is directly proportional to o, we first work out a ratio k:
k = i/o
= 10.02 / 17.89
= 0.56
so:
o horizontal dimension (oh):
oh = ih / k
   = 8.96mm / 0.56 = 16mm across the full image width, i.e. 16mm / 640 = 0.025mm per pixel
o vertical dimension (ov):
ov = iv / k
   = 4.48mm / 0.56 = 8mm across the full image height, i.e. 8mm / 320 = 0.025mm per pixel
Now we have the constant we require, let's use it in an example. If our object at 50mm moves from position (0,0) to (2,4) in pixels, then the movement in real life is:
(2 * 0.025mm, 4 * 0.025mm) = (0.05mm, 0.1mm)
Again, the Pythagorean theorem, a^2 + b^2 = c^2:
Total distance = sqrt(0.05^2 + 0.1^2)
               = 0.112mm
Complicated, I know, but once you have this in a program it's easier. For every point you will have to repeat at least half the process, as d, and therefore o, will change for every point you're examining.
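A small Kotlin sketch of the same arithmetic; it collapses the ratio above into a single per-pixel scale (pixel size * distance / focal length), which for 14µm pixels, a 28mm lens and a 50mm distance gives the same 0.025mm per pixel. Names and default values are just illustrative:

import kotlin.math.sqrt

// Size of one pixel projected onto the object plane at the given depth.
fun mmPerPixel(distanceMm: Double,
               focalLengthMm: Double = 28.0,
               pixelSizeMm: Double = 0.014): Double =
    pixelSizeMm * distanceMm / focalLengthMm

// Real-world movement (in mm) of a point that moved (dxPixels, dyPixels) in the image.
fun realWorldMovement(dxPixels: Double, dyPixels: Double, distanceMm: Double): Double {
    val scale = mmPerPixel(distanceMm)
    val dx = dxPixels * scale
    val dy = dyPixels * scale
    return sqrt(dx * dx + dy * dy)
}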
Hope this gets you on your way,
Cheers
Chris

kinect object measuring

I am currently trying to figure out a way to calculate the size of a given object with the Kinect,
since I have the following data:
- angular field of view of the lens
- distance to the object
- width in pixels at an 800*600 resolution
I believe this should be possible to calculate. Does anyone have the math skills to give me a little help?
With some trigonometry, it should be possible to approximate.
If you draw a right triangle ABC with the camera at vertex A and the object along the far side BC, with the right angle at C, then the height of the object is the length of side BC. The distance to the pixel may be the length of AC or of AB; the Kinect sensor specifications will determine which. If you get the distance to the centre of a pixel, it will be AC; if you get distances to pixel corners, it will be AB.
With A representing the angle at the camera that the pixel takes up, d the length of the hypotenuse AB, and y the length of the far side BC:
sin(A) = y / d
y = d sin(A)
y is the length of the pixel projected into the object plane. You calculate it by multiplying the sine of the angle by the distance to the object.
Here I confess I do not know the API of the kinect, and what level of detail it provides. You say you have the angle of the field of vision. You might assume each pixel of your 800x600 pixel grid takes up an equal angle of your camera's field of vision. If you do, then you can break up that field of vision into equal pieces to measure the linear size of your object in each pixel.
You also mentioned that you have the distance to the object. I was assuming that you have a distance map for each pixel of the 800x600 grid. If this is incorrect, some calculations can be done to approximate a distance grid for the pixels involving the object of interest if you make some assumptions about the object being measured.
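Putting that together, a rough Kotlin sketch under the equal-angle-per-pixel approximation; the function name, the default 800-pixel width and the use of the horizontal FOV are assumptions, not values from the Kinect API:

import kotlin.math.sin

// Approximate real-world width of an object spanning widthPx pixels at the given depth,
// assuming each pixel subtends an equal slice of the horizontal field of view.
fun approxObjectWidth(widthPx: Int, distance: Double,
                      hFovDeg: Double, imageWidthPx: Int = 800): Double {
    val anglePerPixelDeg = hFovDeg / imageWidthPx
    val objectAngleRad = Math.toRadians(anglePerPixelDeg * widthPx)
    return distance * sin(objectAngleRad)   // y = d * sin(A), as above
}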