What is "chroma shift" in vpx_image_t struct of libvpx? - libvpx

libvpx codec operations use the vpx_image_t structure for exchanging uncompressed frame data.
I managed to work out what the majority of the members mean, but I'm stuck on x_chroma_shift and y_chroma_shift. The only explanation provided in the documentation is that it's the "sub-sampling order". I'm a newbie to YUV image formats; I believe I understand what chroma sub-sampling is, but I can't quite figure out what its order means.

Consider a (w, h) YUV image (w and h are even numbers). The Y plane size is also (w, h), but the U/V plane size is (w >> x_chroma_shift, h >> y_chroma_shift), which is equivalent to (w / (1 << x_chroma_shift), h / (1 << y_chroma_shift)). Different chroma shift combinations define different YUV sub-samplings:
 YUV  | x_chroma_shift | y_chroma_shift
======+================+===============
4:2:0 |       1        |       1
4:2:2 |       1        |       0
4:4:4 |       0        |       0
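As a quick sanity check, the plane-size relationship can be sketched in a few lines of Python (an illustration of the shift arithmetic, not libvpx API):

```python
def chroma_plane_size(w, h, x_chroma_shift, y_chroma_shift):
    """Return (width, height) of the U/V planes for a luma plane of (w, h)."""
    return (w >> x_chroma_shift, h >> y_chroma_shift)

# 4:2:0 halves chroma resolution in both directions
print(chroma_plane_size(1920, 1080, 1, 1))  # (960, 540)
# 4:2:2 halves it horizontally only
print(chroma_plane_size(1920, 1080, 1, 0))  # (960, 1080)
# 4:4:4 keeps full chroma resolution
print(chroma_plane_size(1920, 1080, 0, 0))  # (1920, 1080)
```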


How to extend the polygon to a certain distance?
I create a convex hull around the multipoint, but I need to extend its range by several kilometers. At least in theory.
http://img.radiokot.ru/files/21274/1oykzc5pez.png
Assuming that you're able to get a convex hull (which maybe you're using ConvexHullAggregate!), STBuffer() should do what you want.
declare @hull geography = «your value here»;
select @hull.STBuffer(10000); -- 10 km buffer
NB: the 10000 may need to change based on the SRID that you're using, since each SRID inherently has a unit of distance baked into it. But SRID 4326 is what's used in the docs most often, and the native unit for that SRID is meters. So 10 km → 10000 m.
Build the outer bisector vector at every vertex (as the sum of the normalized normals na and nb of the two neighboring edges) and normalize it:
bis = na + nb
bis = bis / Length(bis)
Make the length of the bisector provide the needed distance:
l = d / Cos(fi/2)
where d is the offset, and fi is the angle between vectors na and nb:
fi = atan2(crossproduct(na,nb), dotproduct(na,nb))
or without trigonometric functions:
l = d / Sqrt((1 + dotproduct(na,nb))/2)
And find the offset polygon vertex:
P' = P + l * bis
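The steps above can be sketched in Python (hypothetical helper names; assumes the edge normals na and nb are already the outward unit normals of the two edges meeting at P):

```python
import math

def normalize(v):
    length = math.hypot(v[0], v[1])
    return (v[0] / length, v[1] / length)

def offset_vertex(P, na, nb, d):
    """Offset vertex P outward by distance d, given the outward
    normals na, nb of the two edges meeting at P."""
    na, nb = normalize(na), normalize(nb)
    bis = normalize((na[0] + nb[0], na[1] + nb[1]))   # outer bisector
    dot = na[0] * nb[0] + na[1] * nb[1]
    l = d / math.sqrt((1 + dot) / 2)                  # l = d / cos(fi/2)
    return (P[0] + l * bis[0], P[1] + l * bis[1])

# Right-angle corner of an axis-aligned square: normals (1,0) and (0,1).
# The offset vertex moves d*sqrt(2) along the diagonal: (1,1) -> (2,2).
print(offset_vertex((1.0, 1.0), (1, 0), (0, 1), 1.0))
```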

Understanding Quaternion equivalence to matrix transform

A quaternion is obviously equivalent to a rotation matrix, but a 4x4 matrix does more than just rotation. It also does translation and scaling.
A matrix for affine transforms can actually be represented with 12 elements because the last row is constant:
a b c d
e f g h
i j k l
0 0 0 1
x' = a*x + b*y + c*z + d
y' = e*x + f*y + g*z + h
z' = i*x + j*y + k*z + l
A full transform therefore takes 9 multiplies and 9 adds.
For the three affine transforms: rotation, scale, and translation I would like to know if a quaternion-based system is competitive. I have looked all over and have not found one anywhere.
Given a quaternion p = (w,x,y,z)
For rotation, q' = pqp'. I could add a translation vector: t=(tx,ty,tz)
q' = pqp' + t
That is just 7 elements as compared to 12 with matrices, though it is slightly more operations.
That still does not support scaling though. Is there a complete equivalent?
Note: If the only answer is to convert the rotation to a matrix, then that is not really an answer. The question is whether a quaternion system can perform affine transform without matrices.
If there is an equivalence, can anyone point me to a java or c++ class so I can see how this works?
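For the rotation-plus-translation part, the q' = pqp' + t scheme from the question can be sketched in plain Python (hypothetical helper names, not a library class; scaling is deliberately absent, since that is exactly what is being asked about):

```python
import math

def quat_mul(a, b):
    """Hamilton product of quaternions (w, x, y, z)."""
    aw, ax, ay, az = a
    bw, bx, by, bz = b
    return (aw*bw - ax*bx - ay*by - az*bz,
            aw*bx + ax*bw + ay*bz - az*by,
            aw*by - ax*bz + ay*bw + az*bx,
            aw*bz + ax*by - ay*bx + az*bw)

def quat_conj(p):
    w, x, y, z = p
    return (w, -x, -y, -z)

def transform(p, t, v):
    """Rotate point v by unit quaternion p, then translate by t."""
    q = (0.0, v[0], v[1], v[2])          # embed the point as a pure quaternion
    w, x, y, z = quat_mul(quat_mul(p, q), quat_conj(p))
    return (x + t[0], y + t[1], z + t[2])

# 90-degree rotation about the z axis, then translate by (1, 0, 0):
p = (math.cos(math.pi/4), 0.0, 0.0, math.sin(math.pi/4))
print(transform(p, (1.0, 0.0, 0.0), (1.0, 0.0, 0.0)))  # ~(1, 1, 0)
```

One known partial answer on scaling: if p is not a unit quaternion, pqp' scales vectors by |p|^2, which gives uniform but still not non-uniform scaling.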

Explanation of Processing Image to Byte Array

Can someone explain to me how an image is converted to a byte array?
I need the theory.
I want to use the image for AES encryption (VB.Net), so after I use the OpenFileDialog, my app will load the image and then process it into a byte array, but I need an explanation of that process (how pixels turn into a byte array).
Thanks for the answer, and sorry for the beginner question.
A reference link is accepted :)
When you read the bytes from the image file via File.ReadAllBytes(), their meaning depends on the image's file format.
The image file format (e.g. Bitmap, PNG, JPEG2000) defines how pixel values are converted to bytes, and conversely, how you get pixel values back from bytes.
The PNG and JPEG formats are compressed formats, so it would be difficult for you to write code to do that. For Bitmaps, it would be rather easy because it's a simple format. (See Wikipedia.)
But it's much simpler. You can just use .NET's Bitmap class to load any common image file into memory and then use Bitmap.GetPixel() to access pixels via their x,y coordinates.
Bitmap.GetPixel() is slow for larger images, though. To speed this up, you'll want to access the raw representation of the pixels directly in memory. No matter what kind of image you load with the Bitmap class, it always creates a bitmap representation for it in memory. Its exact layout depends on Bitmap.PixelFormat. You can access it using a pattern like this. The workflow would be:
Copy memory bitmap to byte array using Bitmap.LockBits() and Marshal.Copy().
Extract the R, G, B values from the byte array, e.g. with this formula in the case of PixelFormat.Format24bppRgb (Marshal.Copy starts copying at bitmapData.Scan0, so the array index is just the offset within the bitmap):
// Access pixel at (x,y)
B = bytes[y * bitmapData.Stride + x * 3 + 0]
G = bytes[y * bitmapData.Stride + x * 3 + 1]
R = bytes[y * bitmapData.Stride + x * 3 + 2]
Or for PixelFormat.Format32bppArgb:
// Access pixel at (x,y)
B = bytes[y * bitmapData.Stride + x * 4 + 0]
G = bytes[y * bitmapData.Stride + x * 4 + 1]
R = bytes[y * bitmapData.Stride + x * 4 + 2]
A = bytes[y * bitmapData.Stride + x * 4 + 3]
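The indexing scheme above (x times bytes-per-pixel plus y times stride) is language-neutral; this Python sketch builds a tiny 2x2 24-bpp BGR buffer with a padded stride and reads a pixel back (all names and byte values here are made up for illustration):

```python
def get_pixel_bgr24(buf, stride, x, y):
    """Read (B, G, R) of pixel (x, y) from a raw 24-bpp buffer.
    stride is the number of bytes per row, which may include padding."""
    i = y * stride + x * 3
    return buf[i], buf[i + 1], buf[i + 2]

# A 2x2 image, 3 bytes per pixel, rows padded to 8 bytes (stride = 8).
stride = 8
buf = bytes([10, 20, 30,  40, 50, 60,  0, 0,    # row 0 + 2 padding bytes
             70, 80, 90,  11, 12, 13,  0, 0])   # row 1 + 2 padding bytes
print(get_pixel_bgr24(buf, stride, 1, 1))  # (11, 12, 13)
```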
Each pixel is made of 3 or 4 bytes, depending on the pixel format. Some formats use 3 bytes per pixel (Red, Green and Blue); others require 4 (an Alpha channel plus R, G and B).
You may use something like:
Dim NewByteArray As Byte() = File.ReadAllBytes("c:\folder\image")
NewByteArray will be filled with every byte of the image file, and you can process them with AES regardless of their position or meaning.

Finding out Force from Torque and Distance

I have a solid object that is spinning with a torque W, and I want to calculate the force F applied at a certain point that's D units away from the center of the object. All these values are represented in Vector3 format (x, y, z).
I know so far that W = D x F, where x is the cross product, so by expanding this I get:
Wx = Dy*Fz - Dz*Fy
Wy = Dz*Fx - Dx*Fz
Wz = Dx*Fy - Dy*Fx
So I have this equation, and I need to find (Fx, Fy, Fz), and I'm thinking of using the Simplex method to solve it.
Since the F vector can also have negative values, I split each F variable into 2 (F = G-H), so the new equation looks like this:
Wx = Dy*Gz - Dy*Hz - Dz*Gy + Dz*Hy
Wy = Dz*Gx - Dz*Hx - Dx*Gz + Dx*Hz
Wz = Dx*Gy - Dx*Hy - Dy*Gx + Dy*Hx
Next, I define the simplex table (we need <= inequalities, so I duplicate each equation and multiply the copy by -1).
Also, I define the objective function as: minimize (Gx - Hx + Gy - Hy + Gz - Hz).
The table looks like this:
  Gx   Hx   Gy   Hy   Gz   Hz  <=  RHS
============================================================
   0    0  -Dz   Dz   Dy  -Dy  <=   Wx   = Gx
   0    0   Dz  -Dz  -Dy   Dy  <=  -Wx   = Hx
  Dz  -Dz    0    0   Dx  -Dx  <=   Wy   = Gy
 -Dz   Dz    0    0  -Dx   Dx  <=  -Wy   = Hy
 -Dy   Dy   Dx  -Dx    0    0  <=   Wz   = Gz
  Dy  -Dy  -Dx   Dx    0    0  <=  -Wz   = Hz
============================================================
   1   -1    1   -1    1   -1        0   = Z
The problem is that when I run it through an online solver I get Unbounded solution.
Can anyone please point me to what I'm doing wrong?
Thanks in advance.
edit: I'm sure I messed up some signs somewhere (for example, Z should be defined as a max), but I suspect the real mistake is in something more fundamental.
There exists no unique solution to the problem as posed. You can only solve for the tangential projection of the force. This comes from the properties of the vector (cross) product - it is zero for collinear vectors and in particular for the vector product of a vector by itself. Therefore, if F is a solution of W = r x F, then F' = F + kr is also a solution for any k:
r x F' = r x (F + kr) = r x F + k (r x r) = r x F
since the r x r term is zero by the definition of vector product. Therefore, there is not a single solution but rather a whole linear space of vectors that are solutions.
If you restrict the solution to forces that have zero projection in the direction of r, then you could simply take the vector product of W and r:
W x r = (r x F) x r = -[r x (r x F)] = -[(r . F)r - (r . r)F] = |r|^2 F
with the first term of the expansion being zero because the projection of F onto r is zero (the dot denotes scalar (inner) product). Therefore:
F = (W x r) / |r|^2
If you are also given the magnitude of F, i.e. |F|, then you can compute the radial component (if any) but there are still two possible solutions with radial components in opposing directions.
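Under the zero-radial-projection assumption, the closed form F = (W x r) / |r|^2 is easy to check numerically (a minimal Python sketch):

```python
def cross(a, b):
    """Cross product of two 3-vectors."""
    return (a[1]*b[2] - a[2]*b[1],
            a[2]*b[0] - a[0]*b[2],
            a[0]*b[1] - a[1]*b[0])

def force_from_torque(W, r):
    """Tangential force F satisfying W = r x F with F perpendicular to r."""
    r2 = r[0]**2 + r[1]**2 + r[2]**2
    WxR = cross(W, r)
    return (WxR[0] / r2, WxR[1] / r2, WxR[2] / r2)

r = (1.0, 0.0, 0.0)
F = (0.0, 2.0, 0.0)             # perpendicular to r
W = cross(r, F)                 # (0, 0, 2)
print(force_from_torque(W, r))  # recovers (0, 2, 0)
```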
Quick dirty derivation...
Given D and F, you get W perpendicular to them. That's what a cross product does.
But you have W and D and need to find F. This is a bad assumption, but let's assume F was perpendicular to D. Call it Fp, since it's not necessarily the same as F. Ignoring magnitudes, WxD should give you the direction of Fp.
That ignores magnitudes, so fix it with a little arithmetic. Starting with W=DxF applied to Fp:
mag(W) = mag(D)*mag(Fp) (ignoring geometry; using Fp perp to D)
mag(Fp) = mag(W)/mag(D)
Combining the cross product bit for direction with this stuff for magnitude,
Fp = WxD / mag(WxD) * mag(Fp)
Fp = WxD /mag(W) /mag(D) *mag(W) /mag(D)
= WxD / mag(D)^2.
Note that given any solution Fp to W=DxF, you can add any vector proportional to D to Fp to obtain another solution F. That is a totally free parameter to choose as you like.
Note also that if the torque applies to some sort of axle or object constrained to rotate about some axis, and F is applied to some oddball lever sticking out at a funny angle, then vector D points in some funny direction. You want to replace D with just the part perpendicular to the axle/axis, otherwise the "/mag(D)" part will be wrong.
So from your comment it is clear that all rotations spin around the center of gravity.
In that case:
F = M / r
F force [N]
M torque [N·m]
r scalar distance from the point to the center of rotation [m]
This way you know the scalar size of your force.
Now you need the direction:
it is perpendicular to the rotation axis
and it is the tangent of the rotation at that point
dir = r x axis
F = F * dir / |dir|
bolds are vectors, the rest are scalars
x is the cross product
dir is the force direction
axis is the rotation axis direction
Now just flip the direction according to the rotation direction (the signum of the actual omega),
and also depending on your coordinate system setup,
so either negate F or not.
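The scalar-magnitude-plus-tangent recipe can be sketched as follows (Python; as noted, the sign of the result still depends on the rotation direction and the handedness convention):

```python
import math

def cross(a, b):
    """Cross product of two 3-vectors."""
    return (a[1]*b[2] - a[2]*b[1],
            a[2]*b[0] - a[0]*b[2],
            a[0]*b[1] - a[1]*b[0])

def tangential_force(M, axis, r_vec):
    """|F| = |M| / |r|, directed along the tangent r x axis."""
    r = math.sqrt(sum(c*c for c in r_vec))
    mag_F = math.sqrt(sum(c*c for c in M)) / r
    d = cross(r_vec, axis)                      # tangent direction
    length = math.sqrt(sum(c*c for c in d))
    return tuple(mag_F * c / length for c in d)

# Torque of 2 N.m about the z axis, point 2 m out along x: |F| = 1 N.
print(tangential_force((0.0, 0.0, 2.0), (0.0, 0.0, 1.0), (2.0, 0.0, 0.0)))
```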
But in free 3D rotation this is a very improbable scenario:
the object has to be symmetric from the mass point of view,
or the initial driving forces had to be applied in a manner that achieves this.
Also beware that after the first hit with any interacting force this will no longer be true!
So if you just want to compute the force generated at a certain point when a collision occurs, this is fine,
but immediately after that your spin will change,
and for non-symmetric objects the spin will most likely move off the center of gravity!
If your object disintegrates then you do not need to worry;
if not, then you have to apply rotation and movement dynamics.
Rotation Dynamics
M = alpha * I
M torque [N·m]
alpha angular acceleration [rad/s^2]
I moment of inertia for the actual rotation axis [kg·m^2]
epsilon'' = omega' = alpha
' means derivative with respect to time
omega angular speed
epsilon angle

Tweaking MIT's bitcount algorithm to count words in parallel?

I want to use a version of the well known MIT bitcount algorithm to count neighbors in Conway's game of life using SSE2 instructions.
Here's the MIT bitcount in c, extended to count bitcounts > 63 bits.
int bitCount(unsigned long long n)
{
    unsigned long long uCount;
    uCount = n - ((n >> 1) & 0x7777777777777777)
               - ((n >> 2) & 0x3333333333333333)
               - ((n >> 3) & 0x1111111111111111);
    return ((uCount + (uCount >> 4))
            & 0x0F0F0F0F0F0F0F0F) % 255;
}
Here's a version in Pascal
function bitcount(n: uint64): cardinal;
var ucount: uint64;
begin
  ucount := n - ((n shr 1) and $7777777777777777)
              - ((n shr 2) and $3333333333333333)
              - ((n shr 3) and $1111111111111111);
  Result := ((ucount + (ucount shr 4))
            and $0F0F0F0F0F0F0F0F) mod 255;
end;
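Both versions implement the same arithmetic, which is easy to sanity-check against a naive bit count (a Python sketch of the same masks and the final mod-255 fold):

```python
def mit_bitcount(n):
    """64-bit HAKMEM-style bit count, same masks as the C/Pascal versions."""
    u = (n
         - ((n >> 1) & 0x7777777777777777)   # per-nibble partial counts
         - ((n >> 2) & 0x3333333333333333)
         - ((n >> 3) & 0x1111111111111111))
    # Sum nibble pairs into bytes, then sum the bytes via mod 255.
    return ((u + (u >> 4)) & 0x0F0F0F0F0F0F0F0F) % 255

for n in (0, 1, 0xDEADBEEF, 0xFFFFFFFFFFFFFFFF):
    assert mit_bitcount(n) == bin(n).count("1")
print("ok")
```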
I'm looking to count the bits in this structure in parallel: a 32-bit word where the pixels are laid out as follows.
lo-byte lo-byte neighbor
0 4 8 C 048C 0 4 8 C
+---------------+
1|5 9 D 159D 1|5 9 D
| |
2|6 A E 26AE 2|6 A E
+---------------+
3 7 B F 37BF 3 7 B F
|-------------| << slice A
|---------------| << slice B
|---------------| << slice C
Notice how this structure has 16 bits in the middle that need to be looked up.
I want to calculate neighbor counts for each of the 16 bits in the middle using SSE2.
In order to do this I put slice A in the XMM0 low dword, slice B in XMM0 dword 1, etc.
I copy XMM0 to XMM1 and I mask off bits 012-456-89A for bit 5 in the low word of XMM0, do the same for word1 of XMM0, etc. using different slices and masks to make sure each word in XMM0 and XMM1 holds the neighbors for a different pixel.
Question
How do I tweak the MIT-bitcount to end up with a bitcount per word/pixel in each XMM word?
Remarks
I don't want to use a lookup table, because I already have that approach and I want to
test to see if SSE2 will speed up the process by not requiring memory accesses to the lookup table.
An answer using SSE assembly would be optimal, because I'm programming this in Delphi and I'm thus using x86+SSE2 assembly code.
The MIT algorithm would be tough to implement in SSE2, since there is no integer modulus instruction which could be used for the final ... % 255 expression. Of the various popcnt methods out there, the one that lends itself to SSE most easily and efficiently is probably the first one in Chapter 5 of "Hacker's Delight" by Henry S. Warren, which I have implemented here in C using SSE intrinsics:
#include <stdio.h>
#include <emmintrin.h>
__m128i _mm_popcnt_epi16(__m128i v)
{
v = _mm_add_epi16(_mm_and_si128(v, _mm_set1_epi16(0x5555)), _mm_and_si128(_mm_srli_epi16(v, 1), _mm_set1_epi16(0x5555)));
v = _mm_add_epi16(_mm_and_si128(v, _mm_set1_epi16(0x3333)), _mm_and_si128(_mm_srli_epi16(v, 2), _mm_set1_epi16(0x3333)));
v = _mm_add_epi16(_mm_and_si128(v, _mm_set1_epi16(0x0f0f)), _mm_and_si128(_mm_srli_epi16(v, 4), _mm_set1_epi16(0x0f0f)));
v = _mm_add_epi16(_mm_and_si128(v, _mm_set1_epi16(0x00ff)), _mm_and_si128(_mm_srli_epi16(v, 8), _mm_set1_epi16(0x00ff)));
return v;
}
int main(void)
{
__m128i v0 = _mm_set_epi16(7, 6, 5, 4, 3, 2, 1, 0);
__m128i v1;
v1 = _mm_popcnt_epi16(v0);
printf("v0 = %vhd\n", v0);
printf("v1 = %vhd\n", v1);
return 0;
}
Compile and test as follows:
$ gcc -Wall -msse2 _mm_popcnt_epi16.c -o _mm_popcnt_epi16
$ ./_mm_popcnt_epi16
v0 = 0 1 2 3 4 5 6 7
v1 = 0 1 1 2 1 2 2 3
$
It looks like around 16 arithmetic/logical instructions so it should run at around 16 / 8 = 2 clocks per point.
You can easily convert this to raw assembler if you need to - each intrinsic maps to a single instruction.
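For reference, the per-element reduction can also be prototyped in plain Python before writing the assembler; each line mirrors one of the intrinsic calls above, operating on a single 16-bit lane:

```python
def popcnt16(v):
    """Per-16-bit-word bit count, mirroring the SSE2 mask-and-add steps."""
    v = (v & 0x5555) + ((v >> 1) & 0x5555)   # sums of adjacent bits
    v = (v & 0x3333) + ((v >> 2) & 0x3333)   # sums of 2-bit fields
    v = (v & 0x0F0F) + ((v >> 4) & 0x0F0F)   # sums of 4-bit fields
    v = (v & 0x00FF) + ((v >> 8) & 0x00FF)   # final 16-bit sum
    return v

# Same test vector as the C main() above:
print([popcnt16(v) for v in range(8)])  # [0, 1, 1, 2, 1, 2, 2, 3]
```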