Converting meters into pixel units - Kinect

I am trying to convert distances from meters to pixels in a ROS node, using the PCL library and a Kinect (Xbox). I was using the code below to access the Euclidean coordinates of every point from the Kinect inside the ROS node, which are in meters. But I want these measurements in pixel units. What should I do?
void cloud_cb (const sensor_msgs::PointCloud2ConstPtr& input)
{
    pcl::PointCloud<pcl::PointXYZRGB> output;
    pcl::fromROSMsg(*input, output);
    for (int i = 0; i <= 400; i++)
    {
        for (int j = 0; j <= 400; j++)
        {
            p[i][j] = output.at(i, j);
            ROS_INFO("\n p.z = %f \t p.x = %f \t p.y = %f", p[i][j].z, p[i][j].x, p[i][j].y);
        }
    }
    sensor_msgs::PointCloud2 cloud;
    pcl::toROSMsg(output, cloud);
    pub.publish(cloud);
}
Here p[row][col] is a Point structure containing the x, y, z coordinate values in meters, which I want to convert into pixel units. As I understand it, the physical size of a pixel is not constant, so I can't just use a value found on Google.
I found a similar question here: Kinect depth conversion from mm to pixels, but it has no solution.

There's a problem with trying to convert meters to pixels: a pixel is not a standard unit. The physical size of one pixel varies from device to device, depending on the screen's resolution and its physical size.
Even if you know the resolution of the screen, the conversion is still non-trivial.
const int L = 1920; // screen width
const int H = 1280; // screen height
for (int i = 0; i < L; i++) {
    for (int j = 0; j < H; j++) {
        // scale the screen coordinate down to the 400x400 region of the cloud
        p[i][j] = output.at(i * 400 / L, j * 400 / H);
    }
}
Thus, for every screen pixel you'll have a depth value taken from the corresponding point in the cloud. This will still need some integer-conversion care and further improvement.
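As a rough illustration of that "int conversion and improvement", here is a minimal sketch. It assumes the 400x400 region of the organised cloud sampled in the question and the screen size used above; the fillScreenDepth helper and its return buffer are hypothetical, not part of the original node.

#include <algorithm>
#include <cmath>
#include <vector>

#include <pcl/point_cloud.h>
#include <pcl/point_types.h>

const int SCREEN_W = 1920;   // screen width, as in the snippet above
const int SCREEN_H = 1280;   // screen height, as in the snippet above
const int CLOUD_W  = 400;    // region of the organised cloud sampled in the question
const int CLOUD_H  = 400;

// Returns one depth value (in meters) per screen pixel, row-major.
// Hypothetical helper, shown only to illustrate the index conversion.
std::vector<float> fillScreenDepth(const pcl::PointCloud<pcl::PointXYZRGB>& output)
{
    std::vector<float> screenDepth(SCREEN_W * SCREEN_H, 0.0f);
    for (int i = 0; i < SCREEN_W; ++i)
    {
        for (int j = 0; j < SCREEN_H; ++j)
        {
            // Round to the nearest cloud index instead of truncating, and clamp
            // so we never read outside the 400x400 region sampled above.
            int u = std::min(CLOUD_W - 1, (int)std::lround((double)i * CLOUD_W / SCREEN_W));
            int v = std::min(CLOUD_H - 1, (int)std::lround((double)j * CLOUD_H / SCREEN_H));
            screenDepth[j * SCREEN_W + i] = output.at(u, v).z;   // depth in meters
        }
    }
    return screenDepth;
}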

Related

Skeletal animation bug with Assimp in DirectX 12

I am using Assimp to load an FBX model with animation (created in Blender) into my DirectX 12 game, but I'm experiencing a very frustrating bug with the animation rendered by the game application.
The test model is a simple 'flagpole' containing four bones like so:
Bone0 -> Bone1 -> Bone2 -> Bone3
The model renders correctly in its rest pose when the keyframe animation is bypassed.
The model also renders and animates properly when the animation rotates the model only by the root bone (Bone0).
However, when importing a model that rotates at the first joint (i.e. at Bone1), the vertices clustered around each joint seem 'stuck' in their original positions, while the vertices surrounding the 'bones' proper appear to follow through with the correct animation.
The result is a crappy zigzag of stretched geometry like so:
Instead the model should resemble an 'allen-key' shape at the end of its animation pose, as shown by the same model rendered in the AssimpViewer utility tool:
Since the model is rendering correctly in AssimpViewer, it's reasonable to assume there are no issues with the FBX file exported by Blender. I then checked and confirmed that the vertices 'stuck' around the joints did indeed have their vertex weights correctly assigned by the game loading code.
The C++ model loading and animation code is based on the popular OGLDev tutorial: https://ogldev.org/www/tutorial38/tutorial38.html
Now the infuriating thing is, since the AssimpViewer tool was correctly rendering the model animation, I also copied in the SceneAnimator and AnimEvaluator classes from that tool to generate the final bone transforms via that code branch as well... only to end up with exactly the same zigzag bug in the game!
I'm reasonably confident there aren't any issues with finding the bone hierarchy structure at initialization, so here are the key functions that traverse the hierarchy and interpolate key frames each frame.
VOID Mesh::ReadNodeHeirarchy(FLOAT animationTime, CONST aiNode* pNode, CONST aiAnimation* pAnim, CONST aiMatrix4x4 parentTransform)
{
    std::string nodeName(pNode->mName.data);

    // nodeTransform is a relative transform to parent node space
    aiMatrix4x4 nodeTransform = pNode->mTransformation;
    CONST aiNodeAnim* pNodeAnim = FindNodeAnim(pAnim, nodeName);

    if (pNodeAnim)
    {
        // Interpolate scaling and generate scaling transformation matrix
        aiVector3D scaling(1.f, 1.f, 1.f);
        CalcInterpolatedScaling(scaling, animationTime, pNodeAnim);

        // Interpolate rotation and generate rotation transformation matrix
        aiQuaternion rotationQ(1.f, 0.f, 0.f, 0.f);
        CalcInterpolatedRotation(rotationQ, animationTime, pNodeAnim);

        // Interpolate translation and generate translation transformation matrix
        aiVector3D translat(0.f, 0.f, 0.f);
        CalcInterpolatedPosition(translat, animationTime, pNodeAnim);

        // build the SRT transform matrix
        nodeTransform = aiMatrix4x4(rotationQ.GetMatrix());
        nodeTransform.a1 *= scaling.x; nodeTransform.b1 *= scaling.x; nodeTransform.c1 *= scaling.x;
        nodeTransform.a2 *= scaling.y; nodeTransform.b2 *= scaling.y; nodeTransform.c2 *= scaling.y;
        nodeTransform.a3 *= scaling.z; nodeTransform.b3 *= scaling.z; nodeTransform.c3 *= scaling.z;
        nodeTransform.a4 = translat.x; nodeTransform.b4 = translat.y; nodeTransform.c4 = translat.z;
    }

    aiMatrix4x4 globalTransform = parentTransform * nodeTransform;

    if (m_boneMapping.find(nodeName) != m_boneMapping.end())
    {
        UINT boneIndex = m_boneMapping[nodeName];
        // the global inverse transform returns us to mesh space!!!
        m_boneInfo[boneIndex].FinalTransform = m_globalInverseTransform * globalTransform * m_boneInfo[boneIndex].BoneOffset;
        //m_boneInfo[boneIndex].FinalTransform = m_boneInfo[boneIndex].BoneOffset * globalTransform * m_globalInverseTransform;
        m_shaderTransforms[boneIndex] = aiMatrixToSimpleMatrix(m_boneInfo[boneIndex].FinalTransform);
    }

    for (UINT i = 0u; i < pNode->mNumChildren; i++)
    {
        ReadNodeHeirarchy(animationTime, pNode->mChildren[i], pAnim, globalTransform);
    }
}
VOID Mesh::CalcInterpolatedRotation(aiQuaternion& out, FLOAT animationTime, CONST aiNodeAnim* pNodeAnim)
{
    UINT rotationKeys = pNodeAnim->mNumRotationKeys;

    // we need at least two values to interpolate...
    if (rotationKeys == 1u)
    {
        CONST aiQuaternion& key = pNodeAnim->mRotationKeys[0u].mValue;
        out = key;
        return;
    }

    UINT rotationIndex = FindRotation(animationTime, pNodeAnim);
    UINT nextRotationIndex = (rotationIndex + 1u) % rotationKeys;
    assert(nextRotationIndex < rotationKeys);

    CONST aiQuatKey& key = pNodeAnim->mRotationKeys[rotationIndex];
    CONST aiQuatKey& nextKey = pNodeAnim->mRotationKeys[nextRotationIndex];
    FLOAT deltaTime = FLOAT(nextKey.mTime) - FLOAT(key.mTime);
    FLOAT factor = (animationTime - FLOAT(key.mTime)) / deltaTime;
    assert(factor >= 0.f && factor <= 1.f);

    aiQuaternion::Interpolate(out, key.mValue, nextKey.mValue, factor);
}
I've just included the rotation interpolation here, since the scaling and translation functions are identical. For those unaware, Assimp's aiMatrix4x4 type follows a column-vector math convention, so I haven't messed with the original matrix multiplication order.
About the only deviation between my code and the two Assimp-based code branches I've adopted is the requirement to convert the final transforms from aiMatrix4x4 types into a DirectXTK SimpleMath Matrix (really an XMMATRIX) with this conversion function:
Matrix Mesh::aiMatrixToSimpleMatrix(CONST aiMatrix4x4 m)
{
    return Matrix(
        m.a1, m.a2, m.a3, m.a4,
        m.b1, m.b2, m.b3, m.b4,
        m.c1, m.c2, m.c3, m.c4,
        m.d1, m.d2, m.d3, m.d4);
}
Because of the column-vector orientation of Assimp's aiMatrix4x4 matrices, the final bone transforms are not transposed for HLSL consumption. The array of final bone transforms is passed to the skinning vertex shader's constant buffer as follows.
commandList->SetPipelineState(m_psoForwardSkinned.Get()); // set PSO

// Update vertex shader with current bone transforms
CONST std::vector<Matrix> transforms = m_assimpModel.GetShaderTransforms();
VSBonePassConstants vsBoneConstants{};
for (UINT i = 0; i < m_assimpModel.GetNumBones(); i++)
{
    // We do not transpose bone matrices for HLSL because the original
    // Assimp matrices are column-vector matrices.
    vsBoneConstants.boneTransforms[i] = transforms[i];
    //vsBoneConstants.boneTransforms[i] = transforms[i].Transpose();
    //vsBoneConstants.boneTransforms[i] = Matrix::Identity;
}
GraphicsResource vsBoneCB = m_graphicsMemory->AllocateConstant(vsBoneConstants);

vsPerObjects.gWorld = m_assimp_world.Transpose(); // vertex shader per object constant
vsPerObjectCB = m_graphicsMemory->AllocateConstant(vsPerObjects);

commandList->SetGraphicsRootConstantBufferView(RootParameterIndex::VSBoneConstantBuffer, vsBoneCB.GpuAddress());
commandList->SetGraphicsRootConstantBufferView(RootParameterIndex::VSPerObjConstBuffer, vsPerObjectCB.GpuAddress());
//commandList->SetGraphicsRootDescriptorTable(RootParameterIndex::ObjectSRV, m_shaderTextureHeap->GetGpuHandle(ShaderTexDescriptors::SuzanneDiffuse));
commandList->SetGraphicsRootDescriptorTable(RootParameterIndex::ObjectSRV, m_shaderTextureHeap->GetGpuHandle(ShaderTexDescriptors::DefaultDiffuse));

for (UINT i = 0; i < m_assimpModel.GetMeshSize(); i++)
{
    commandList->IASetVertexBuffers(0u, 1u, &m_assimpModel.meshEntries[i].GetVertexBufferView());
    commandList->IASetIndexBuffer(&m_assimpModel.meshEntries[i].GetIndexBufferView());
    commandList->IASetPrimitiveTopology(D3D_PRIMITIVE_TOPOLOGY_TRIANGLELIST);
    commandList->DrawIndexedInstanced(m_assimpModel.meshEntries[i].GetIndexCount(), 1u, 0u, 0u, 0u);
}
Please note I am using the Graphics Resource memory management helper object found in the DirectXTK12 library in the code above. Finally, here's the skinning vertex shader I'm using.
// Luna (2016) lighting model adapted from Moller
#define MAX_BONES 4

// vertex shader constant data that varies per object
cbuffer cbVSPerObject : register(b3)
{
    float4x4 gWorld;
    //float4x4 gTexTransform;
}

// vertex shader constant data that varies per frame
cbuffer cbVSPerFrame : register(b5)
{
    float4x4 gViewProj;
    float4x4 gShadowTransform;
}

// bone matrix constant data that varies per object
cbuffer cbVSBonesPerObject : register(b9)
{
    float4x4 gBoneTransforms[MAX_BONES];
}

struct VertexIn
{
    float3 posL        : SV_POSITION;
    float3 normalL     : NORMAL;
    float2 texCoord    : TEXCOORD0;
    float3 tangentU    : TANGENT;
    float4 boneWeights : BONEWEIGHT;
    uint4  boneIndices : BONEINDEX;
};

struct VertexOut
{
    float4 posH       : SV_POSITION;
    //float3 posW : POSITION;
    float4 shadowPosH : POSITION0;
    float3 posW       : POSITION1;
    float3 normalW    : NORMAL;
    float2 texCoord   : TEXCOORD0;
    float3 tangentW   : TANGENT;
};

VertexOut VS_main(VertexIn vin)
{
    VertexOut vout = (VertexOut)0.f;

    // Perform vertex skinning.
    // Ignore BoneWeights.w and instead calculate the last weight value
    // to ensure all bone weights sum to unity.
    float4 weights = vin.boneWeights;
    //weights.w = 1.f - dot(weights.xyz, float3(1.f, 1.f, 1.f));
    //float4 weights = { 0.f, 0.f, 0.f, 0.f };
    //weights.x = vin.boneWeights.x;
    //weights.y = vin.boneWeights.y;
    //weights.z = vin.boneWeights.z;
    weights.w = 1.f - (weights.x + weights.y + weights.z);

    float4 localPos = float4(vin.posL, 1.f);
    float3 localNrm = vin.normalL;
    float3 localTan = vin.tangentU;

    float3 objPos = mul(localPos, (float4x3)gBoneTransforms[vin.boneIndices.x]).xyz * weights.x;
    objPos += mul(localPos, (float4x3)gBoneTransforms[vin.boneIndices.y]).xyz * weights.y;
    objPos += mul(localPos, (float4x3)gBoneTransforms[vin.boneIndices.z]).xyz * weights.z;
    objPos += mul(localPos, (float4x3)gBoneTransforms[vin.boneIndices.w]).xyz * weights.w;

    float3 objNrm = mul(localNrm, (float3x3)gBoneTransforms[vin.boneIndices.x]) * weights.x;
    objNrm += mul(localNrm, (float3x3)gBoneTransforms[vin.boneIndices.y]) * weights.y;
    objNrm += mul(localNrm, (float3x3)gBoneTransforms[vin.boneIndices.z]) * weights.z;
    objNrm += mul(localNrm, (float3x3)gBoneTransforms[vin.boneIndices.w]) * weights.w;

    float3 objTan = mul(localTan, (float3x3)gBoneTransforms[vin.boneIndices.x]) * weights.x;
    objTan += mul(localTan, (float3x3)gBoneTransforms[vin.boneIndices.y]) * weights.y;
    objTan += mul(localTan, (float3x3)gBoneTransforms[vin.boneIndices.z]) * weights.z;
    objTan += mul(localTan, (float3x3)gBoneTransforms[vin.boneIndices.w]) * weights.w;

    vin.posL = objPos;
    vin.normalL = objNrm;
    vin.tangentU.xyz = objTan;
    //vin.posL = posL;
    //vin.normalL = normalL;
    //vin.tangentU.xyz = tangentL;
    // End vertex skinning

    // transform to world space
    float4 posW = mul(float4(vin.posL, 1.f), gWorld);
    vout.posW = posW.xyz;

    // assumes nonuniform scaling, otherwise needs inverse-transpose of world matrix
    vout.normalW = mul(vin.normalL, (float3x3)gWorld);
    vout.tangentW = mul(vin.tangentU, (float3x3)gWorld);

    // transform to homogenous clip space
    vout.posH = mul(posW, gViewProj);

    // pass texcoords to pixel shader
    vout.texCoord = vin.texCoord;
    //float4 texC = mul(float4(vin.TexC, 0.0f, 1.0f), gTexTransform);
    //vout.TexC = mul(texC, gMatTransform).xy;

    // generate projective tex-coords to project shadow map onto scene
    vout.shadowPosH = mul(posW, gShadowTransform);

    return vout;
}
Some last tests I tried before posting:
I tested the code with a Collada (DAE) model exported from Blender, only to observe the same distorted zigzagging in the Win32 desktop application.
I also confirmed the aiScene object for the loaded model returns an identity matrix for the global root transform (also verified in AssimpViewer).
I have stared at this code for about a week and am going out of my mind! Really hoping someone can spot what I have missed. If you need more code or info, please ask!
This seems to be a bug in the published code in the tutorials / documentation. It would be great if you could open an issue report on the Assimp project page on GitHub.
It's taken almost another two weeks of pain, but I finally found the bug. It was in my own code, and it was self-inflicted. Before I show the solution, I should explain the further troubleshooting I did to get there.
After losing faith with Assimp (even though the AssimpViewer tool was animating my model correctly), I turned to the FBX SDK. The FBX ViewScene command line utility tool that's available as part of the SDK was also showing and animating my model properly, so I had hope...
So after a few days reviewing the FBX SDK tutorials, and taking another week to write an FBX importer for my Windows desktop game, I loaded my model and... saw exactly the same zig-zag animation anomaly as the version loaded by Assimp!
This frustrating outcome meant I could at least eliminate Assimp and the FBX SDK as the source of the problem, and focus again on the vertex shader. The shader I'm using for vertex skinning was adopted from the 'Character Animation' chapter of Frank Luna's text. It was identical in every way, which led me to recheck the C++ vertex structure declared on the application side...
Here's the C++ vertex declaration for skinned vertices:
struct Vertex
{
    // added constructors
    Vertex() = default;
    Vertex(FLOAT x, FLOAT y, FLOAT z,
           FLOAT nx, FLOAT ny, FLOAT nz,
           FLOAT u, FLOAT v,
           FLOAT tx, FLOAT ty, FLOAT tz) :
        Pos(x, y, z),
        Normal(nx, ny, nz),
        TexC(u, v),
        Tangent(tx, ty, tz) {}
    Vertex(DirectX::SimpleMath::Vector3 pos,
           DirectX::SimpleMath::Vector3 normal,
           DirectX::SimpleMath::Vector2 texC,
           DirectX::SimpleMath::Vector3 tangent) :
        Pos(pos), Normal(normal), TexC(texC), Tangent(tangent) {}

    DirectX::SimpleMath::Vector3 Pos;
    DirectX::SimpleMath::Vector3 Normal;
    DirectX::SimpleMath::Vector2 TexC;
    DirectX::SimpleMath::Vector3 Tangent;
    FLOAT BoneWeights[4];
    BYTE  BoneIndices[4];
    //UINT BoneIndices[4]; <--- YOU HAVE CAUSED ME A MONTH OF PAIN
};
Quite early on, being confused by Luna's use of BYTE to store the array of bone indices, I changed this structure element to UINT, figuring this still matched the input declaration shown here:
static CONST D3D12_INPUT_ELEMENT_DESC inputElementDescSkinned[] =
{
    { "SV_POSITION", 0u, DXGI_FORMAT_R32G32B32_FLOAT, 0u, D3D12_APPEND_ALIGNED_ELEMENT, D3D12_INPUT_CLASSIFICATION_PER_VERTEX_DATA, 0u },
    { "NORMAL", 0u, DXGI_FORMAT_R32G32B32_FLOAT, 0u, D3D12_APPEND_ALIGNED_ELEMENT, D3D12_INPUT_CLASSIFICATION_PER_VERTEX_DATA, 0u },
    { "TEXCOORD", 0u, DXGI_FORMAT_R32G32_FLOAT, 0u, D3D12_APPEND_ALIGNED_ELEMENT, D3D12_INPUT_CLASSIFICATION_PER_VERTEX_DATA, 0u },
    { "TANGENT", 0u, DXGI_FORMAT_R32G32B32_FLOAT, 0u, D3D12_APPEND_ALIGNED_ELEMENT, D3D12_INPUT_CLASSIFICATION_PER_VERTEX_DATA, 0u },
    //{ "BINORMAL", 0, DXGI_FORMAT_R32G32B32_FLOAT, 0, D3D12_APPEND_ALIGNED_ELEMENT, D3D12_INPUT_CLASSIFICATION_PER_VERTEX_DATA, 0 },
    { "BONEWEIGHT", 0u, DXGI_FORMAT_R32G32B32A32_FLOAT, 0u, D3D12_APPEND_ALIGNED_ELEMENT, D3D12_INPUT_CLASSIFICATION_PER_VERTEX_DATA, 0u },
    { "BONEINDEX", 0u, DXGI_FORMAT_R8G8B8A8_UINT, 0u, D3D12_APPEND_ALIGNED_ELEMENT, D3D12_INPUT_CLASSIFICATION_PER_VERTEX_DATA, 0u },
};
Here was the bug. By declaring UINT in the vertex structure for bone indices, four bytes were being assigned to store each bone index. But in the vertex input declaration, the DXGI_FORMAT_R8G8B8A8_UINT format specified for the "BONEINDEX" was assigning one byte per index. I suspect this data type and format size mismatch was resulting in only one valid bone index being able to fit in the BONEINDEX element, and so only one index value was passed to the vertex shader each frame, instead of the whole array of four indices for correct bone transform lookups.
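To make the size mismatch concrete, here is a small stand-alone sketch. The struct names are illustrative only (not the actual project code) and the fields are pared down to just the bone data; the byte sizes themselves are the point:

#include <cstdint>

// Pared-down skinned vertex using BYTE-sized indices, illustrative only.
struct SkinnedVertexBytes
{
    float    BoneWeights[4];   // matches DXGI_FORMAT_R32G32B32A32_FLOAT (16 bytes)
    uint8_t  BoneIndices[4];   // matches DXGI_FORMAT_R8G8B8A8_UINT      (4 bytes)
};

// The same vertex with UINT-sized indices, as in the buggy version.
struct SkinnedVertexUints
{
    float    BoneWeights[4];   // still 16 bytes
    uint32_t BoneIndices[4];   // 16 bytes, but the input layout still reads only 4
};

// The BONEINDEX element declared as DXGI_FORMAT_R8G8B8A8_UINT consumes exactly
// 4 bytes per vertex, so only the BYTE-based layout lines up with it.
static_assert(sizeof(SkinnedVertexBytes::BoneIndices) == 4,
              "BYTE[4] matches DXGI_FORMAT_R8G8B8A8_UINT");
static_assert(sizeof(SkinnedVertexUints::BoneIndices) == 16,
              "UINT[4] is four times wider than the 4 bytes the input layout reads");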
So now I've learned... the hard way... why Luna had declared an array of BYTE for bone indices in the original C++ vertex structure.
I hope this experience will be of value to someone else, and always be careful changing code from your original learning sources.

How can I read individual pixels from a CVPixelBuffer

AVDepthData gives me a CVPixelBuffer of depth data. But I can't find a way to easily access the depth information in this CVPixelBuffer. Is there a simple recipe in Objective-C to do so?
You have to use the CVPixelBuffer APIs to get the right format to access the data via unsafe pointer manipulations. Here is the basic way:
CVPixelBufferRef pixelBuffer = _lastDepthData.depthDataMap;

CVPixelBufferLockBaseAddress(pixelBuffer, 0);

size_t cols = CVPixelBufferGetWidth(pixelBuffer);
size_t rows = CVPixelBufferGetHeight(pixelBuffer);
Float32 *baseAddress = (Float32 *)CVPixelBufferGetBaseAddress(pixelBuffer);

// This next step is not necessary, but I include it here for illustration.
// You can get the pixel format type, which is associated with a kCVPixelFormatType
// constant; this tells you what type of data it is, e.g. in this case Float32.
OSType type = CVPixelBufferGetPixelFormatType(pixelBuffer);
if (type != kCVPixelFormatType_DepthFloat32) {
    NSLog(@"Wrong type");
}

// Arbitrary values of x and y to sample
int x = 20; // must be lower than cols
int y = 30; // must be lower than rows

// Get the pixel. You could iterate here of course to get multiple pixels!
int baseAddressIndex = y * (int)cols + x;
const Float32 pixel = baseAddress[baseAddressIndex];

CVPixelBufferUnlockBaseAddress(pixelBuffer, 0);
Note that the first thing you need to determine is what type of data is in the CVPixelBuffer; if you don't know, you can use CVPixelBufferGetPixelFormatType() to find out. In this case I am getting depth data as Float32; if you were using another type, e.g. Float16, then you would need to replace all occurrences of Float32 with that type.
Note that it's important to lock and unlock the base address using CVPixelBufferLockBaseAddress and CVPixelBufferUnlockBaseAddress.
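One caveat worth keeping in mind (my own addition, not something stated in the answer above): the linear index y * cols + x only works when the buffer has no per-row padding. CVPixelBufferGetBytesPerRow reports the real row stride, so a safer way to walk every depth value, continuing from the same pixelBuffer, is a sketch along these lines:

CVPixelBufferLockBaseAddress(pixelBuffer, kCVPixelBufferLock_ReadOnly);

size_t width = CVPixelBufferGetWidth(pixelBuffer);
size_t height = CVPixelBufferGetHeight(pixelBuffer);
size_t bytesPerRow = CVPixelBufferGetBytesPerRow(pixelBuffer); // may exceed width * sizeof(Float32)
uint8_t *base = (uint8_t *)CVPixelBufferGetBaseAddress(pixelBuffer);

for (size_t row = 0; row < height; row++) {
    // Step by the real row stride, then reinterpret the row as Float32 values.
    const Float32 *rowPtr = (const Float32 *)(base + row * bytesPerRow);
    for (size_t col = 0; col < width; col++) {
        Float32 depthInMeters = rowPtr[col];
        // ... use depthInMeters ...
    }
}

CVPixelBufferUnlockBaseAddress(pixelBuffer, kCVPixelBufferLock_ReadOnly);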

In Kinect SDK v2.0 how do I map a pixel in the color image to a voxel in the depth image?

I know how to go the other way around. What I am looking for is, given a (x,y) coordinate in the pixel space (of the 1920x1080 image), how do I get the corresponding (if available) (x,y,z) (in meters) of the depth image. I realize that there are more pixels than voxels and it could be possible not to find any, but Microsoft's SDK has a CoordinateMapper class. This exposes the MapColorFrameToCameraSpace function. If I use that, I can get an array of points in the camera space (x,y,z) but I am unable to figure out how to extract the mapping for a specific pixel.
You probably need to use
CoordinateMapper.MapDepthFrameToColorSpace
to find the color space coordinates of all depth points.
Then compare your pixel coordinate (x, y) with these color space coordinates. My solution is to find the closest point (there might be a better way), because the mapped coordinates are floats.
If you use C#, here is the code. Hope it helps!
private ushort GetDepthValueFromPixelPoint(KinectSensor kinectSensor, ushort[] depthData, float PixelX, float PixelY)
{
    ushort depthValue = 0;
    if (null != depthData)
    {
        ColorSpacePoint[] depP = new ColorSpacePoint[512 * 424];
        kinectSensor.CoordinateMapper.MapDepthFrameToColorSpace(depthData, depP);

        int depthIndex = FindClosestIndex(depP, PixelX, PixelY);
        if (depthIndex < 0)
            Console.WriteLine("-1");
        else
        {
            depthValue = depthData[depthIndex];
            Console.WriteLine(depthValue);
        }
    }
    return depthValue;
}

private int FindClosestIndex(ColorSpacePoint[] depP, float PixelX, float PixelY)
{
    int depthIndex = -1;
    float closestPoint = float.MaxValue;
    for (int j = 0; j < depP.Length; ++j)
    {
        float dis = DistanceOfTwoPoints(depP[j], PixelX, PixelY);
        if (dis < closestPoint)
        {
            closestPoint = dis;
            depthIndex = j;
        }
    }
    return depthIndex;
}

private float DistanceOfTwoPoints(ColorSpacePoint colorSpacePoint, float PixelX, float PixelY)
{
    float x = colorSpacePoint.X - PixelX;
    float y = colorSpacePoint.Y - PixelY;
    return (float)Math.Sqrt(x * x + y * y);
}

Explain code in Kinect SDK

I am working with the Kinect and reading the DepthWithColor-D3D example. It has some code I don't understand yet.
// loop over each row and column of the color
for (LONG y = 0; y < m_colorHeight; ++y)
{
    LONG* pDest = (LONG*)((BYTE*)msT.pData + msT.RowPitch * y);
    for (LONG x = 0; x < m_colorWidth; ++x)
    {
        // calculate index into depth array
        int depthIndex = x/m_colorToDepthDivisor + y/m_colorToDepthDivisor * m_depthWidth;

        // retrieve the depth to color mapping for the current depth pixel
        LONG colorInDepthX = m_colorCoordinates[depthIndex * 2];
        LONG colorInDepthY = m_colorCoordinates[depthIndex * 2 + 1];
How are the values of colorInDepthX and colorInDepthY in the code above calculated?
colorInDepthX and colorInDepthY are a mapping between the depth and color images so that they align. Because the Kinect's cameras are slightly offset from each other, their fields of view do not line up perfectly.
m_colorCoordinates is defined at the top of the file as such:
m_colorCoordinates = new LONG[m_depthWidth*m_depthHeight*2];
This is a single-dimensional array representing a 2-dimensional image. It is populated just above the code block you posted in your question:
// Get the x, y coordinates for color in depth space
// This will allow us to later compensate for the differences in location, angle, etc. between the depth and color cameras
m_pNuiSensor->NuiImageGetColorPixelCoordinateFrameFromDepthPixelFrameAtResolution(
    cColorResolution,
    cDepthResolution,
    m_depthWidth*m_depthHeight,
    m_depthD16,
    m_depthWidth*m_depthHeight*2,
    m_colorCoordinates
    );
As described in the comment, this runs a calculation provided by the SDK to map the color and depth coordinates onto each other. The result is placed in m_colorCoordinates.
colorInDepthX and colorInDepthY are simply values within the m_colorCoordinates array that are being acted upon in the current cycle of the loop. They are not "calculated", per se, but just point to what already exists in m_colorCoordinates.
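To make the indexing concrete, here is a minimal sketch of how one depth pixel's entry is looked up in that array. The variable names follow the sample code shown above; the bounds check is my own addition, not part of the sample:

// For a given depth pixel (depthX, depthY):
int depthIndex = depthX + depthY * m_depthWidth;

// Each depth pixel owns two consecutive LONGs in m_colorCoordinates:
// its x and y position in the color image.
LONG colorInDepthX = m_colorCoordinates[depthIndex * 2];
LONG colorInDepthY = m_colorCoordinates[depthIndex * 2 + 1];

// Not every depth pixel maps to a valid color pixel, so check bounds
// before sampling the color image (my addition, not in the sample).
if (colorInDepthX >= 0 && colorInDepthX < m_colorWidth &&
    colorInDepthY >= 0 && colorInDepthY < m_colorHeight)
{
    // safe to read the color pixel at (colorInDepthX, colorInDepthY)
}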
The function that handles the mapping between color and depth images is explained in the Kinect SDK at MSDN. Here is a direct link:
http://msdn.microsoft.com/en-us/library/jj663856.aspx

Microsoft Kinect SDK depth data to real world coordinates

I'm using the Microsoft Kinect SDK to get the depth and color information from a Kinect and then convert that information into a point cloud. I need the depth information to be in real world coordinates with the centre of the camera as the origin.
I've seen a number of conversion functions, but these are apparently for OpenNI and non-Microsoft drivers. I've read that the depth information coming from the Kinect is already in millimetres, and is contained in 11 bits... or something.
How do I convert this bit information into real world coordinates that I can use?
Thanks in advance!
This is catered for within the Kinect for Windows library using the Microsoft.Research.Kinect.Nui.SkeletonEngine class, and the following method:
public Vector DepthImageToSkeleton(
    float depthX,
    float depthY,
    short depthValue
)
This method maps a point in the depth image produced by the Kinect into skeleton space, a vector space based on real-world measurements.
From there (when I've created a mesh in the past), after enumerating the byte array in the bitmap created by the Kinect depth image, you create a new list of Vector points similar to the following:
var width = image.Image.Width;
var height = image.Image.Height;
var greyIndex = 0;

var points = new List<Vector>();

for (var y = 0; y < height; y++)
{
    for (var x = 0; x < width; x++)
    {
        short depth;
        switch (image.Type)
        {
            case ImageType.DepthAndPlayerIndex:
                depth = (short)((image.Image.Bits[greyIndex] >> 3) | (image.Image.Bits[greyIndex + 1] << 5));
                if (depth <= maximumDepth)
                {
                    points.Add(nui.SkeletonEngine.DepthImageToSkeleton(((float)x / image.Image.Width), ((float)y / image.Image.Height), (short)(depth << 3)));
                }
                break;
            case ImageType.Depth: // depth comes back mirrored
                depth = (short)((image.Image.Bits[greyIndex] | image.Image.Bits[greyIndex + 1] << 8));
                if (depth <= maximumDepth)
                {
                    points.Add(nui.SkeletonEngine.DepthImageToSkeleton(((float)(width - x - 1) / image.Image.Width), ((float)y / image.Image.Height), (short)(depth << 3)));
                }
                break;
        }
        greyIndex += 2;
    }
}
By doing so, the end result is a list of vectors in skeleton space, expressed in meters; if you want centimeters, multiply by 100 (and so on).