Shift in Point Cloud acquired using Kinect v2 API - kinect

I am acquiring Point Cloud using Kinect v2 API in Windows 10 64 Bit OS. Below is the code snippet-
depthFrame = multiSourceFrame.DepthFrameReference.AcquireFrame();
colorFrame = multiSourceFrame.ColorFrameReference.AcquireFrame();
if (depthFrame == null || colorFrame == null) return;
depthFrame.CopyFrameDataToArray(depthData);
coordinateMapper.MapDepthFrameToCameraSpace(depthData, cameraSpacePoints);
coordinateMapper.MapDepthFrameToColorSpace(depthData, colorSpacePoints);
colorFrame.CopyConvertedFrameDataToArray(pixels, ColorImageFormat.Rgba);
for (var index = 0; index < depthData.Length; index++)
{
int u = (int)Math.Floor(colorSpacePoints[index].X);
int v = (int)Math.Floor(colorSpacePoints[index].Y);
if (u < 0 || u >= COLOR_FRAME_WIDTH || v < 0 || v >= COLOR_FRAME_HEIGHT) continue;
int pixelsBaseIndex = (v * COLOR_FRAME_WIDTH + u) * COLOR_BYTES_PER_PIXEL;
float x = cameraSpacePoints[index].X;
float y = cameraSpacePoints[index].Y;
float z = cameraSpacePoints[index].Z;
byte red = pixels[pixelsBaseIndex + 0];
byte green = pixels[pixelsBaseIndex + 1];
byte blue = pixels[pixelsBaseIndex + 2];
byte alpha = pixels[pixelsBaseIndex + 3];
PointXYZRGB point = new PointXYZRGB(); // Color point in 3D
point.position(x, y, z);
point.color(red, green, blue, alpha);
}
Please see below a screenshot of the point cloud-
Please look around the orange-colored ball in the above picture. Upon close inspection, it is visible that there is a shift in the point cloud.
I am wondering why such a shift exists and how to remove or minimize it. Is there any workaround?

The shift between the color overlay and the depth map can be due to a number of reasons:
First, the depth and color frames are not acquired at the same instant (that is how the _reader_MultiSourceFrameArrived handler in the Kinect SDK works). The timestamps of the two cameras are slightly different, hence the slight shift. This is more prominent if the object in view is moving.
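If you want to see how large that timing gap is, you can compare the RelativeTime stamps of the two frames inside the handler. A minimal sketch (depthFrame and colorFrame are the frames already acquired above):
// Compare the acquisition timestamps of the depth and color frames.
TimeSpan depthTime = depthFrame.RelativeTime;
TimeSpan colorTime = colorFrame.RelativeTime;
TimeSpan gap = depthTime > colorTime ? depthTime - colorTime : colorTime - depthTime;
// A gap of a few milliseconds plus motion in the scene is enough to cause a visible shift.
Console.WriteLine("Depth/color timestamp gap: " + gap.TotalMilliseconds + " ms");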
Second, the coordinateMapper in the SDK maps the color frame and the depth frame using the camera calibration parameters. The default calibration parameters are hard-coded in the SDK, but every device differs slightly. You could recalibrate the Kinect cameras and use the updated calibration parameters to get a correct overlay of the color and depth maps. Note, however, that simply replacing the camera calibration parameters in the Kinect Fusion code and recompiling does not work, because the parameters are read from the closed-source Kinect Fusion DLL, so you will have to write your own code to remap each frame at runtime.
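If you do recalibrate, one workaround is to bypass MapDepthFrameToColorSpace and project each camera-space point into the color image yourself with your own parameters. A rough sketch, where fx, fy, cx, cy (color-camera intrinsics) and R, t (rotation and translation from the depth camera to the color camera) are placeholders for values from your own calibration:
// Transform the camera-space point into the color camera's coordinate frame.
CameraSpacePoint p = cameraSpacePoints[index];
double X = R[0, 0] * p.X + R[0, 1] * p.Y + R[0, 2] * p.Z + t[0];
double Y = R[1, 0] * p.X + R[1, 1] * p.Y + R[1, 2] * p.Z + t[1];
double Z = R[2, 0] * p.X + R[2, 1] * p.Y + R[2, 2] * p.Z + t[2];
// Pinhole projection into color-image pixel coordinates.
int u = (int)Math.Floor(fx * X / Z + cx);
int v = (int)Math.Floor(fy * Y / Z + cy);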
Hope this helps.

Related

converting meter into pixel unit

I am trying to convert distance from meters to pixels in a ROS node, with the PCL library and a Kinect (Xbox). I was using the code below to access the Euclidean coordinates of every point from the Kinect inside the ROS node, which are in meters. But I want these measurements in pixel units. What should I do?
void
cloud_cb (const sensor_msgs::PointCloud2ConstPtr& input)
{
pcl::PointCloud<pcl::PointXYZRGB> output;
pcl::fromROSMsg(*input,output );
for(int i=0;i<=400;i++)
{
for(int j=0;j<=400;j++)
{
p[i][j] = output.at(i,j);
ROS_INFO("\n p.z = %f \t p.x = %f \t p.y = %f",p[i][j].z,p[i][j].x,p[i][j].y);
}
}
sensor_msgs::PointCloud2 cloud;
pcl::toROSMsg(output,cloud);
pub.publish (cloud);
}
Here p[row][col] is a point structure which contains the x, y, z coordinate values in meters, which I want to convert to pixel units. As I see it, the size of a pixel is not constant, so I can't use any fixed value found on Google.
I found a similar question here: Kinect depth conversion from mm to pixels, but it has no solution.
There's a problem with trying to convert meters to pixels. Pixels aren't a standard unit. The physical size of 1 pixel varies on different devices depending on screen resolution and size of a screen.
Even if you know the resolution of the screen, the conversion is still non-trivial.
// Nearest-neighbour lookup: map each screen pixel (i, j) back to the 400x400 block of cloud points sampled above.
const int L = 1920; // screen width in pixels
const int H = 1280; // screen height in pixels
for (int i = 0; i < L; i++) {
for (int j = 0; j < H; j++) {
p[i][j] = output.at(i * 400 / L, j * 400 / H);
}
}
Thus for every screen pixel you will have a depth value corresponding to the depth value in the map. This will still need some integer-conversion care and general improvement.
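If by "pixel" you actually mean the pixel coordinates in the Kinect image (as in the question linked above) rather than screen pixels, the conversion is a pinhole projection using the camera intrinsics. A sketch (written in C# here, but the arithmetic is identical in C++); the focal lengths and principal point below are rough defaults for the Kinect's 640x480 streams and are assumptions you should replace with calibrated values:
// Approximate intrinsics for the Kinect (Xbox 360) 640x480 image; replace with your own calibration.
const double fx = 525.0, fy = 525.0; // focal lengths in pixels
const double cx = 319.5, cy = 239.5; // principal point in pixels
// Project a camera-space point (x, y, z) in meters to image pixel coordinates (u, v).
int u = (int)Math.Round(fx * x / z + cx);
int v = (int)Math.Round(fy * y / z + cy);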

BabylonJS - Remove smooth animation on the camera

I'm using BabylonJS to make a little game.
I'm using this code to build a camera :
this.cam = new BABYLON.FreeCamera("playerCamera", new BABYLON.Vector3(x, y, z), s);
this.cam.checkCollisions = true;
this.cam.applyGravity = false;
this.cam.keysUp = [90]; // Z
this.cam.keysDown = [83]; // S
this.cam.keysLeft = [81]; // Q
this.cam.keysRight = [68]; // D
this.cam.speed = v;
this.cam.ellipsoid = new BABYLON.Vector3(1, h, 1);
this.cam.angularSensibility = a;
And it works: I have a camera and I can move around, etc.
But here is my problem: by default there is a smooth animation when I move and when I change the orientation of the camera.
Let me explain: when I move with my arrow keys (approximately 20 pixels to the left), it actually goes 25 pixels (20 pixels + 5 extra "smoothing" pixels).
I don't want that :/ Do you know how to disable it? (For both moving and changing the orientation of the camera.)
This is due to the inertia defined in the free camera.
To remove these "smooth" movements, simply disable inertia:
this.cam.inertia = 0;

Kinect V2 how to extract player/user from background with original resolution 1920x1080

In Kinect V2, as we know, the depth and color resolutions are different. With the mapping API available in the SDK, it is easy to get the color value from the color frame and put it on the depth frame, as shown in many posts on the internet. That gives a final image of size 512x424.
But I want to extract the player/user in the original color frame, so that the final image has a resolution of 1920x1080.
I can think of using the mapping API to mark the color frame with the user/player pixels, then applying some heuristic to expose the RGB values of neighboring pixels and complete the user/player image.
Does anyone have a better suggestion on how best to do that?
Instead of mapping from depth to color, try mapping your color frame to depth space and set your output image's size equal to the color image's size.
// depthPoints has one DepthSpacePoint per color pixel (1920x1080 entries).
coordinateMapper.MapColorFrameToDepthSpace(depthData, depthPoints);
for (int colorIndex = 0; colorIndex < depthPoints.Length; ++colorIndex)
{
DepthSpacePoint depthPoint = depthPoints[colorIndex];
if (!float.IsNegativeInfinity(depthPoint.X) && !float.IsNegativeInfinity(depthPoint.Y))
{
int depthX = (int)(depthPoint.X + 0.5f);
int depthY = (int)(depthPoint.Y + 0.5f);
if ((depthX >= 0) && (depthX < depthWidth) && (depthY >= 0) && (depthY < depthHeight))
{
int depthIndex = (depthY * depthWidth) + depthX;
byte player = bodyData[depthIndex];
if (player != 0xff)
{
int sourceIndex = colorIndex * 4;
// Copy the three color channels for this pixel and force full alpha.
OutImage[sourceIndex + 0] = _colorData[sourceIndex + 0];
OutImage[sourceIndex + 1] = _colorData[sourceIndex + 1];
OutImage[sourceIndex + 2] = _colorData[sourceIndex + 2];
OutImage[sourceIndex + 3] = 0xff;
}
}
}
}
Initialization of the output image:
OutImage = new byte[colorWidth * colorHeight * 4]; // 1920x1080x4
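For completeness, the buffers used above (depthData, bodyData, _colorData) would typically be filled in the multi-source frame handler roughly like this; this is only a sketch, assuming a MultiSourceFrameReader opened with the Depth, Color and BodyIndex sources, and Bgra chosen to match a typical Bgra32 output bitmap:
using (DepthFrame depthFrame = multiSourceFrame.DepthFrameReference.AcquireFrame())
using (ColorFrame colorFrame = multiSourceFrame.ColorFrameReference.AcquireFrame())
using (BodyIndexFrame bodyIndexFrame = multiSourceFrame.BodyIndexFrameReference.AcquireFrame())
{
if (depthFrame == null || colorFrame == null || bodyIndexFrame == null) return;
depthFrame.CopyFrameDataToArray(depthData); // ushort[512 * 424]
bodyIndexFrame.CopyFrameDataToArray(bodyData); // byte[512 * 424], 0xff = no player
colorFrame.CopyConvertedFrameDataToArray(_colorData, ColorImageFormat.Bgra); // byte[1920 * 1080 * 4]
}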

Explain code in Kinect SDK

I am working with the Kinect and reading the DepthWithColor-D3D example. It has some code that I don't understand yet.
// loop over each row and column of the color
for (LONG y = 0; y < m_colorHeight; ++y)
{
LONG* pDest = (LONG*)((BYTE*)msT.pData + msT.RowPitch * y);
for (LONG x = 0; x < m_colorWidth; ++x)
{
// calculate index into depth array
int depthIndex = x/m_colorToDepthDivisor + y/m_colorToDepthDivisor * m_depthWidth;
// retrieve the depth to color mapping for the current depth pixel
LONG colorInDepthX = m_colorCoordinates[depthIndex * 2];
LONG colorInDepthY = m_colorCoordinates[depthIndex * 2 + 1];
How are the values of colorInDepthX and colorInDepthY calculated in the above code?
colorInDepthX and colorInDepthY are a mapping between the depth and color images so that they will align. Because the Kinect's cameras are slightly offset from each other, their fields of view are not lined up perfectly.
m_colorCoordinates is defined at the top of the file as such:
m_colorCoordinates = new LONG[m_depthWidth*m_depthHeight*2];
This is a single-dimension array representing a 2-dimensional image; it is populated just above the code block you posted in your question:
// Get of x, y coordinates for color in depth space
// This will allow us to later compensate for the differences in location, angle, etc between the depth and color cameras
m_pNuiSensor->NuiImageGetColorPixelCoordinateFrameFromDepthPixelFrameAtResolution(
cColorResolution,
cDepthResolution,
m_depthWidth*m_depthHeight,
m_depthD16,
m_depthWidth*m_depthHeight*2,
m_colorCoordinates
);
As described in the comment, this runs a calculation provided by the SDK to map the color and depth coordinates onto each other. The result is placed inside m_colorCoordinates.
colorInDepthX and colorInDepthY are simply values within the m_colorCoordinates array that are being acted upon in the current cycle of the loop. They are not "calculated", per se, but just point to what already exists in m_colorCoordinates.
The function that handles the mapping between color and depth images is explained in the Kinect SDK at MSDN. Here is a direct link:
http://msdn.microsoft.com/en-us/library/jj663856.aspx

Microsoft Kinect SDK depth data to real world coordinates

I'm using the Microsoft Kinect SDK to get the depth and color information from a Kinect and then convert that information into a point cloud. I need the depth information to be in real world coordinates with the centre of the camera as the origin.
I've seen a number of conversion functions, but these are apparently for OpenNI and non-Microsoft drivers. I've read that the depth information coming from the Kinect is already in millimetres, and is contained in 11 bits... or something.
How do I convert this bit information into real world coordinates that I can use?
Thanks in advance!
This is catered for within the Kinect for Windows library using the Microsoft.Research.Kinect.Nui.SkeletonEngine class, and the following method:
public Vector DepthImageToSkeleton (
float depthX,
float depthY,
short depthValue
)
This method maps a point of the depth image produced by the Kinect (given its normalized x, y coordinates and its depth value) into a skeleton-space vector based on real-world measurements.
From there (when I've created a mesh in the past), after enumerating the byte array in the bitmap created by the Kinect depth image, you create a new list of Vector points similar to the following:
var width = image.Image.Width;
var height = image.Image.Height;
var greyIndex = 0;
var points = new List<Vector>();
for (var y = 0; y < height; y++)
{
for (var x = 0; x < width; x++)
{
short depth;
switch (image.Type)
{
case ImageType.DepthAndPlayerIndex:
depth = (short)((image.Image.Bits[greyIndex] >> 3) | (image.Image.Bits[greyIndex + 1] << 5));
if (depth <= maximumDepth)
{
points.Add(nui.SkeletonEngine.DepthImageToSkeleton(((float)x / image.Image.Width), ((float)y / image.Image.Height), (short)(depth << 3)));
}
break;
case ImageType.Depth: // depth comes back mirrored
depth = (short)((image.Image.Bits[greyIndex] | image.Image.Bits[greyIndex + 1] << 8));
if (depth <= maximumDepth)
{
points.Add(nui.SkeletonEngine.DepthImageToSkeleton(((float)(width - x - 1) / image.Image.Width), ((float)y / image.Image.Height), (short)(depth << 3)));
}
break;
}
greyIndex += 2;
}
}
By doing so, the end result is a list of vectors in skeleton space, with coordinates in meters; if you want centimeters, multiply by 100 (etc.).