Explain code in Kinect SDK - kinect

I am working with Kinect and reading the DepthWithColor-D3D example. It contains some code that I don't understand yet.
// loop over each row and column of the color
for (LONG y = 0; y < m_colorHeight; ++y)
{
    LONG* pDest = (LONG*)((BYTE*)msT.pData + msT.RowPitch * y);
    for (LONG x = 0; x < m_colorWidth; ++x)
    {
        // calculate index into depth array
        int depthIndex = x/m_colorToDepthDivisor + y/m_colorToDepthDivisor * m_depthWidth;
        // retrieve the depth to color mapping for the current depth pixel
        LONG colorInDepthX = m_colorCoordinates[depthIndex * 2];
        LONG colorInDepthY = m_colorCoordinates[depthIndex * 2 + 1];
        // ...
    }
}
How are the values of colorInDepthX and colorInDepthY calculated in the code above?

colorInDepthX and colorInDepthY are a mapping between the depth and color images so that they will align. Because the Kinect's cameras are slightly offset from each other, their fields of view are not lined up perfectly.
m_colorCoordinates is allocated at the top of the file as follows:
m_colorCoordinates = new LONG[m_depthWidth*m_depthHeight*2];
This is a one-dimensional array representing a two-dimensional image. It is populated just above the code block you posted in your question:
// Get of x, y coordinates for color in depth space
// This will allow us to later compensate for the differences in location, angle, etc between the depth and color cameras
As described in the comment, this runs a calculation provided by the SDK to map the color and depth coordinates onto each other. The result is placed inside m_colorCoordinates.
colorInDepthX and colorInDepthY are simply values within the m_colorCoordinates array that are being acted upon in the current cycle of the loop. They are not "calculated", per se, but just point to what already exists in m_colorCoordinates.
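The lookup described above can be sketched in a few lines. This is only an illustration in Python, not the SDK sample itself: the function name, the toy table, and its values are made up, and the table is assumed to hold one interleaved (x, y) pair per depth pixel, as the allocation size suggests.

```python
def color_coords_for_pixel(x, y, color_coordinates, depth_width, color_to_depth_divisor=1):
    """Return the (x, y) pair stored for the depth pixel under color pixel (x, y)."""
    # Same index arithmetic as in the question: integer division, row-major layout
    depth_index = x // color_to_depth_divisor + (y // color_to_depth_divisor) * depth_width
    # Two consecutive entries per depth pixel: first x, then y
    return color_coordinates[depth_index * 2], color_coordinates[depth_index * 2 + 1]


# Toy 3x2 depth image with a flat table of made-up mapping values (not real SDK output)
depth_width, depth_height = 3, 2
table = []
for dy in range(depth_height):
    for dx in range(depth_width):
        table += [dx + 10, dy + 20]

print(color_coords_for_pixel(2, 1, table, depth_width))     # color and depth at the same resolution
print(color_coords_for_pixel(5, 3, table, depth_width, 2))  # color at twice the depth resolution
```

Both calls land on the same depth pixel, so they read the same pair from the table; nothing is computed beyond the index.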
The function that handles the mapping between color and depth images is documented in the Kinect SDK reference on MSDN.


How to apply a virtual aperture to a 4D-STEM dataset in an EFFICIENT way?

I would like to define an arbitrary bit mask as a virtual aperture and apply it to a 4D-STEM dataset in an EFFICIENT way.
I did it using the SliceN function, applying the mask pixel by pixel, which is very slow for large datasets. How can I optimize it to run faster?
Image 4DSTEM := GetFrontImage() // dimension [ScanX, ScanY, Dx, Dy]
Image mask: = iradius // just an arbitrary mask (aperture)
Image out // dimension [ScanX, ScanY]
for (number i=0; i<ScanX; i++)
{ for (number j=0; j<ScanY; j++)
    {
        Diff2D = 4DSTEM.SliceN(4,2,i,j,0,0,2,Dx,1,3,Dy,1)
        out.setpixel(i,j, sum(diff2D*mask))
    }
}
For a [100,100,512,512] dataset this took a few minutes to finish. When I have to repeat the operation several times, that is way too slow compared to a matrix operation, but I don't know how to do it efficiently.
You're hitting the limitations of scripting languages here. Using SliceN is already pretty much the optimum you can get to, unfortunately. Everything else in speed optimization requires parallelized, compiled code (i.e. you could write C++ code and use the SDK to compile your own plugin).
However, there is a bit of room for improvement over your example.
First of all, your example above doesn't run :c) But that is quickly fixed.
Point #1:
Try to avoid number type casting. DM script only knows number, but internally there is a difference between the proper number types (integer, floating point, signed/unsigned, byte size). The scripting language uses real-4-byte as the default unless explicitly told otherwise, and some methods will return real-4-byte by default. For this reason, the processing will be fastest if both data and mask use real-4-byte data as well.
In my testing, the time difference between running with uint16 data plus uint8 mask and real4 data plus real4 mask was significant: nearly 30%.
Point #2:
Don't copy your sliced image! Use := not = for your Diff2D.
The SliceN command returns an expression directly addressing the required memory. You can use it directly in any other expression (like I do below) or you can assign an image variable to it using := to give it a name.
The speed increase is not huge, but it's one copy-operation less per loop iteration.
Point #3:
Use additional knowledge: For arbitrary masks there is not much you can do, but most often masks are zero-valued over large stretches and it is possible to define a smaller ROI containing all non-zero points. If this is the case, you can limit your math operations to that region.
That is, instead of multiplying the whole DP with the same-sized mask, just use a smaller mask and the corresponding sub-section of the DP.
This can actually make a big difference, but it will depend on your mask.
Of course you need to "find" this ROI first. In my script below I have a helper method to do that, utilizing the comparatively fast max() command and image rotation as a trick for speed-up.
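The idea behind Point #3 can be shown independently of DM script. The following is a minimal pure-Python sketch on a toy 6x6 pattern; the function names and data are invented for illustration, and it says nothing about DM performance. It only demonstrates that summing inside the bounding box of the non-zero mask pixels gives the same result as summing over the full frame.

```python
def nonzero_bounds(mask):
    """Return (top, left, bottom, right) of the smallest box holding all non-zero mask pixels."""
    rows = [y for y, row in enumerate(mask) if any(row)]
    cols = [x for x in range(len(mask[0])) if any(row[x] for row in mask)]
    if not rows:
        raise ValueError("mask is all zero")
    # bottom/right are exclusive, so they can be used directly in range()
    return rows[0], cols[0], rows[-1] + 1, cols[-1] + 1


def masked_sum(pattern, mask, bounds=None):
    """Sum pattern*mask, optionally only inside bounds=(top, left, bottom, right)."""
    top, left, bottom, right = bounds or (0, 0, len(mask), len(mask[0]))
    return sum(pattern[y][x] * mask[y][x]
               for y in range(top, bottom) for x in range(left, right))


# Toy diffraction pattern and a mask that is non-zero only in a 2x2 block
pattern = [[y * 6 + x for x in range(6)] for y in range(6)]
mask = [[1 if 2 <= y < 4 and 3 <= x < 5 else 0 for x in range(6)] for y in range(6)]

bounds = nonzero_bounds(mask)
full = masked_sum(pattern, mask)             # touches all 36 pixels
cropped = masked_sum(pattern, mask, bounds)  # touches only 4 pixels
print(bounds, full, cropped)
```

The cropped sum visits 4 pixels instead of 36 here; the saving scales with how small the non-zero region is relative to the full pattern.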
Point #4:
...would be to get rid of the double-for loop and replace it with image-expressions. Unfortunately, DigitalMicrograph does currently (GMS 3.3) not support this for 4D or 5D data.
The script below executed on a [53 x 52 x 512 x 512] STEM DI (of real-4 byte data) gave me the following timings:
Original: 12.80910 sec
Test 1 : 10.77700 sec
Test 2 : 1.83017 sec
// Helper class for timing
class CTimer{
    number s
    string n
    ~CTimer(object self){result("\n"+n+": "+ (GetHighResTickCount()-s)/GetHighResTicksPerSecond()+" sec");}
    object Start(object self, string n_) { n=n_; s=GetHighResTickCount(); return self;}
}
// Helper method to find best non-zero containing ROI
void GetNonZeroArea( image src, number &t, number &l, number &b, number &r )
{
    image work = !!src // Make a binary image which is 0 only where src==0
    number d
    max(work,d,t) // get "first" non-zero pixel coordinate, this is y = dist from TOP
    rotateRight(work) // rotate image right
    max(work,d,l) // get "first" non-zero pixel coordinate, this is y = dist from LEFT
    rotateRight(work) // rotate image right
    max(work,d,b) // get "first" non-zero pixel coordinate, this is y = dist from BOTTOM
    b = work.ImageGetDimensionSize(1) - b // Opposite side!
    rotateRight(work) // rotate image right
    max(work,d,r) // get "first" non-zero pixel coordinate
    r = work.ImageGetDimensionSize(1) - r // Opposite side!
}
// The original proposed script (plus fixes to make it actually run)
image Original(image STEM4D, image mask)
{
    Number ScanX = STEM4D.ImageGetDimensionSize(0)
    Number ScanY = STEM4D.ImageGetDimensionSize(1)
    Number Dx = STEM4D.ImageGetDimensionSize(2)
    Number Dy = STEM4D.ImageGetDimensionSize(3)
    Image out := RealImage("Test1",4,ScanX,ScanY)
    for (number i=0; i<ScanX; i++)
    {
        for (number j=0; j<ScanY; j++)
        {
            image Diff2D = STEM4D.SliceN(4,2,i,j,0,0,2,Dx,1,3,Dy,1)
            out.setpixel(i,j, sum(Diff2D*mask))
        }
    }
    return out
}
// Remove copying the slice, just reference it
image Test1(image STEM4D, image mask)
{
    Number ScanX = STEM4D.ImageGetDimensionSize(0)
    Number ScanY = STEM4D.ImageGetDimensionSize(1)
    Number Dx = STEM4D.ImageGetDimensionSize(2)
    Number Dy = STEM4D.ImageGetDimensionSize(3)
    Image out := RealImage("Test1",4,ScanX,ScanY)
    for (number i=0; i<ScanX; i++)
    {
        for (number j=0; j<ScanY; j++)
        {
            image Diff2D := STEM4D.SliceN(4,2,i,j,0,0,2,Dx,1,3,Dy,1)
            out.setpixel(i,j, sum(Diff2D*mask))
        }
    }
    return out
}
// Limit mask size to what is needed!
image Test2(image STEM4D, image mask )
{
    Number ScanX = STEM4D.ImageGetDimensionSize(0)
    Number ScanY = STEM4D.ImageGetDimensionSize(1)
    Number Dx = STEM4D.ImageGetDimensionSize(2)
    Number Dy = STEM4D.ImageGetDimensionSize(3)
    Image out := RealImage("Test1",4,ScanX,ScanY)
    Number t,l,b,r
    GetNonZeroArea( mask, t, l, b, r ) // find the smallest ROI containing all non-zero mask pixels
    Number w = r - l
    Number h = b - t
    image subMask := mask.slice2(l,t,0, 0,w,1, 1,h,1 )
    for (number i=0; i<ScanX; i++)
        for (number j=0; j<ScanY; j++)
            out.setpixel(i,j, sum(STEM4D.SliceN(4,2,i,j,l,t,2,w,1,3,h,1)*subMask))
    return out
}
Image src := GetFrontImage() // dimension [ScanX, ScanY, Dx, Dy]
Number ScanX = src.ImageGetDimensionSize(0)
Number ScanY = src.ImageGetDimensionSize(1)
Number Dx = src.ImageGetDimensionSize(2)
Number Dy = src.ImageGetDimensionSize(3)
Number r = 50 // mask radius
Image maskImg := RealImage("Mask",4,Dx,Dy)
maskImg = iradius < r ? 1 : 0 // just an aperture mask
image resultImg
// Each timer lives in its own scope; it prints the elapsed time when the scope ends
{
    object timer = Alloc(CTimer).Start("Original")
    resultImg := Original(src,maskImg)
}
{
    object timer = Alloc(CTimer).Start("Test 1")
    resultImg := Test1(src,maskImg)
}
resultImg.SetName("Test 1")
{
    object timer = Alloc(CTimer).Start("Test 2")
    resultImg := Test2(src,maskImg)
}
resultImg.SetName("Test 2")
Compiled code comparison:
Now, it should be added that the above script is still rather slow, because it iterates in the scripting language. The fully compiled C++ code of DigitalMicrograph is much faster. So if you have the licensed packages giving you the SI menu, then you want to use the SI/Map/Signal command. This is near-instantaneous for the example STEM DI I've mentioned above. My other answer shows how one could utilize this functionality by script.
As mentioned in my other answer, a real speed-win comes when compiled, parallelized code is used. DigitalMicrograph does this, after all, in the available SI "signal" map functionality. This feature is not available in the free version, but if you have Spectrum-Imaging acquisition, you most likely have the appropriate license as well.
The answer below utilizes this functionality by accessing the UI with the command ChooseMenuItem() and applying a few more tricks. The script is a bit lengthy, but its parts also show some other nice tricks worthwhile knowing:
TestSignalIntegrationInSI is the main script demoing how things can work.
CreatePickerByScript shows how one can create picker-spectra on SIs. This is used to open a 'Picker Diffraction Pattern' image from the STEM DI.
AddTestMasksToDP_ROIs programmatically adds ROIs to the diffraction pattern to be used as mask
AddTestMasksToDP_Threshold programmatically adds an intensity-threshold mask to be used as mask.
AddTestMasksToDP_DPMasks programmatically adds the various types of diffraction-masks to be used as mask
GetIntegratedSignalViaSIMenu is the central step of the script. With a picker-DP and required 'masks' on it front-most, the menu command is called to perform the signal-extraction (as fast as possible.) Then the displayed result-image is returned.
GetNewestImage is just a utility method showing how one can access the most recently created image in memory.
Here is the script:
image GetNewestImage()
{
    // New images get the next higher imageID.
    // This can be used to identify the "latest" created image.
    if ( 0 == CountImages() ) Throw( "No image in memory!" )
    // We create a temp. image to get the upper limit
    number lastID = RealImage("Dummy",4,1).ImageGetID()
    // Then we search for the next lower existing one
    image lastImg
    for( number ID = lastID - 1; ID>0; ID-- )
    {
        lastImg := FindImageByID(ID)
        if ( lastImg.ImageIsValid() ) break
    }
    return lastImg
}
image CreatePickerByScript( image SI, number t, number l, number b, number r )
{
    if ( SI.ImageGetNumDimensions()<3 ) Throw( "Sorry, LineScans are not supported here." )
    // Adding a non-volatile ROI of specific RoiNAME acts as if using
    // the picker-tool. The ID string must be unique!
    ROI pickerROI = NewROI()
    pickerROI.RoiSetVolatile( 0 )
    string uniqueID = GetDate(0)+"#"+GetTime(1)+";"+round(random()*1000)
    pickerROI.RoiSetName( "SICursor(##"+uniqueID+"##)" )
    SI.ImageGetImageDisplay(0).ImageDisplayAddROI( pickerROI )
    // This creates the picker image.
    // So the child is now the "newest" image in memory
    image child := GetNewestImage()
    return child
}
void AddTestMasksToDP_ROIs( image DP )
{
    // Add ROIs to the DP which are your masks (any number and type of ROI works)
    imageDisplay DPdisp = DP.ImageGetImageDisplay(0)
    number dpX = DP.ImageGetDimensionSize(0)
    number dpY = DP.ImageGetDimensionSize(1)
    // Only simple RECT ROIs are supported
    ROI maskRoi1 = NewROI()
    maskRoi1.ROISetRectangle( dpY*0.1, dpX*0.1, dpY*0.8, dpX*0.3 )
    // Arbitrary multi-vertex (use for ovals etc.)
    ROI maskRoi2 = NewROI()
    maskRoi2.ROISetRectangle( dpY*0.7, dpX*0.1, dpY*0.9, dpX*0.9 )
}
void AddTestMasksToDP_Threshold( image DP )
{
    // Add intensity threshold mask (highest 95% intensity range)
    imageDisplay DPdisp = DP.ImageGetImageDisplay(0)
    DPdisp.RasterImageDisplaySetThresholdOn( 1 )
    number low = max(DP) * 0.05
    number high = max(DP)
    DPdisp.RasterImageDisplaySetThresholdLimits( low, high )
}
void AddTestMasksToDP_DPMasks( image DP )
{
    // Add Diffraction masks to the DP
    imageDisplay DPdisp = DP.ImageGetImageDisplay(0)
    // Spot masks (always symmetric pair)
    Component spotMask = NewComponent(8,0,0,0,0) // 8 = Spotmask
    spotMask.ComponentSetControlPoint(4, 0, 0,0) // 4 = TopLeft of one spot [Size only]
    spotMask.ComponentSetControlPoint(7,10,10,0) // 7 = BottomRight of one spot [Size only]
    spotMask.ComponentSetControlPoint(8,150,0,0) // 8 = Spot position [center]
    // Bandpass mask (Only circles are correctly supported)
    Component bandpassMask = NewComponent(15,0,0,0,0) // 15 = Bandpass (ring)
    number r1 = 100
    number r2 = 120
    bandpassMask.ComponentSetControlPoint(7,r1,r1,0) // 7 = BottomRight of one ring [Size only]
    bandpassMask.ComponentSetControlPoint(14,r2,r2,0) // 14 = BottomRight of one ring [Size only]
    // Wedge mask (symmetric)
    Component wedgeMask = NewComponent(19,0,0,0,0) // 19 = wedgemask (ringsegment)
    wedgeMask.ComponentSetControlPoint(9,10,20,0) // 9 = One wedge vector
    wedgeMask.ComponentSetControlPoint(10,-20,40,0) // 10 = Other wedge vector
    // Array mask (symmetric)
    Component arrayMask = NewComponent(9,0,0,0,0) // 9 = arrayMask (ringsegment)
    arrayMask.ComponentSetControlPoint(9,-70,-60,0) // 9 = One array vector
    arrayMask.ComponentSetControlPoint(10,99,-99,0) // 10 = Other array vector
    arrayMask.ComponentSetControlPoint(4, 0, 0,0) // 4 = TopLeft of one spot [Size only]
    arrayMask.ComponentSetControlPoint(7,20,20,0) // 7 = BottomRight of one spot [Size only]
}
image GetIntegratedSignalViaSIMenu( image pickerChild )
{
    // Call the Menu to do the work
    // The picker-spectrum or DP needs to be front-most
    // The created signal map is NOT the newest image
    // (some internal images are created for the mask)
    // but it is the front-most displayed one.
    image signalMap := GetFrontImage()
    return signalMap
}
image GetMaskFromSignalMap( image signalMap, number DPx, number DPy )
{
    // The actual mask is stored in the tags
    string tagPath = "Processing:[0]:Parameters:Mask"
    tagGroup tg = signalMap.ImageGetTagGroup()
    if ( !tg.TagGroupDoesTagExist(tagPath) )
        Throw( "Sorry, no mask tag found." )
    image mask := RealImage("Mask",4,DPx, DPy )
    if ( !tg.TagGroupGetTagAsArray(tagPath,mask) )
        Throw( "Sorry, could not retrieve mask. Maybe wrong size?" )
    return mask
}
void TestSignalIntegrationInSI()
{
    image STEMDI := GetFrontImage()
    image DP := STEMDI.CreatePickerByScript(0,0,1,1)
    if ( TwoButtonDialog( "Add ROIs as mask?", "Yes", "No" ) )
        AddTestMasksToDP_ROIs( DP )
    else if ( TwoButtonDialog( "Add intensity threshold as mask?", "Yes", "No" ) )
        AddTestMasksToDP_Threshold( DP )
    else if ( TwoButtonDialog( "Add diffraction masks as mask?", "Yes", "No" ) )
        AddTestMasksToDP_DPMasks( DP )
    image signalMap := GetIntegratedSignalViaSIMenu( DP )
    number dpX = DP.ImageGetDimensionSize(0)
    number dpY = DP.ImageGetDimensionSize(1)
    // We may want to close the DP again. No longer needed
    // Verification: Get Mask image from SignalMap
    image usedMask := GetMaskFromSignalMap( signalMap, dpX, dpY )
    usedMask.SetName( "This mask was used." )
}
The solution below utilizes the intrinsic expression loops by performing an in-place multiplication followed by projections.
Disappointingly, it turns out this solution is actually slower than the for-loop with the SliceN command.
For the same test-data of size [53 x 52 x 512 x 512] the resulting timing is:
Data copy: 1.28073 sec
Inplace multiply: 30.1978 sec
Project 1/2: 1.1208 sec
Project 2/2: 0.0019557 sec
InPlace multiplication with projections (total): 32.9045 sec
InPlace multiplication with projections (total): 34.9853 sec
// Helper class for timing
class CTimer{
    number s
    string n
    ~CTimer(object self){result("\n"+n+": "+ (GetHighResTickCount()-s)/GetHighResTicksPerSecond()+" sec");}
    object Start(object self, string n_) { n=n_; s=GetHighResTickCount(); return self;}
}
image MaskMultipliedSum( image STEM4D, image MASK2D, number copyFirst )
{
    // Boring feasibility checks...
    if ( 4 != STEM4D.ImageGetNumDimensions() )
        Throw( "Input data is not 4D." )
    if ( 2 != MASK2D.ImageGetNumDimensions() )
        Throw( "Input mask is not 2D." )
    Number ScanX = STEM4D.ImageGetDimensionSize(0)
    Number ScanY = STEM4D.ImageGetDimensionSize(1)
    Number Dx = STEM4D.ImageGetDimensionSize(2)
    Number Dy = STEM4D.ImageGetDimensionSize(3)
    if ( Dx != MASK2D.ImageGetDimensionSize(0) )
        Throw ("X dimension of mask does not match input data." )
    if ( Dy != MASK2D.ImageGetDimensionSize(1) )
        Throw ("Y dimension of mask does not match input data." )
    // Do the maths!
    image workCopy4D
    if ( copyFirst )
    {
        object timer = Alloc(CTimer).Start("Data copy")
        workCopy4D = STEM4D
    }
    else
        workCopy4D := STEM4D
    {
        object timer = Alloc(CTimer).Start("Inplace multiply")
        workCopy4D *= MASK2D[idimindex(2),idimindex(3)]
    }
    // Now we want to "sum up" over Dx and Dy
    image p1,p2
    {
        object timer = Alloc(CTimer).Start("Project 1/2")
        p1 := project( workCopy4D, 3 )
    }
    {
        object timer = Alloc(CTimer).Start("Project 2/2")
        p2 := project( p1, 2 )
    }
    return p2
}
image stack4D, mask2D
If ( GetTwoLabeledImagesWithPrompt("Please select 4D data and 2D mask", "Select input", "4D data", stack4D, "2D mask", mask2D ) )
{
    number doCopy = TwoButtonDialog("Create workcopy?","Yes (takes time)","No (overwrites input data!)")
    object timer = Alloc(CTimer).Start("InPlace multiplication with projections (total)")
    image summed := MaskMultipliedSum( stack4D, mask2D, doCopy )
}

Shift in Point Cloud acquired using Kinect v2 API

I am acquiring a point cloud using the Kinect v2 API on a 64-bit Windows 10 system. Below is the code snippet:
depthFrame = multiSourceFrame.DepthFrameReference.AcquireFrame();
colorFrame = multiSourceFrame.ColorFrameReference.AcquireFrame();
if (depthFrame == null || colorFrame == null) return;
coordinateMapper.MapDepthFrameToCameraSpace(depthData, cameraSpacePoints);
coordinateMapper.MapDepthFrameToColorSpace(depthData, colorSpacePoints);
colorFrame.CopyConvertedFrameDataToArray(pixels, ColorImageFormat.Rgba);
for (var index = 0; index < depthData.Length; index++)
{
    // Color pixel that corresponds to this depth pixel
    int u = (int)Math.Floor(colorSpacePoints[index].X);
    int v = (int)Math.Floor(colorSpacePoints[index].Y);
    if (u < 0 || u >= COLOR_FRAME_WIDTH || v < 0 || v >= COLOR_FRAME_HEIGHT) continue;
    int pixelsBaseIndex = (v * COLOR_FRAME_WIDTH + u) * COLOR_BYTES_PER_PIXEL;
    float x = cameraSpacePoints[index].X;
    float y = cameraSpacePoints[index].Y;
    float z = cameraSpacePoints[index].Z;
    byte red = pixels[pixelsBaseIndex + 0];
    byte green = pixels[pixelsBaseIndex + 1];
    byte blue = pixels[pixelsBaseIndex + 2];
    byte alpha = pixels[pixelsBaseIndex + 3];
    PointXYZRGB point = new PointXYZRGB(); // Color point in 3D
    point.postion(x, y, z);
    point.color(red, green, blue, alpha);
}
Please see below a screenshot of the point cloud-
Please look around the orange-colored ball in the picture above. Upon close inspection, a shift in the point cloud is visible.
I am wondering why such a shift exists and how to remove or minimize it. Is there any workaround?
The shift between the color overlay and the depth map can be due to a number of reasons.
The depth and color frames are not acquired at the same instant (that is how the _reader_MultiSourceFrameArrived function in the Kinect SDK works). The timestamps for both cameras are slightly different, hence the slight shift. This is more prominent if you are moving the object in view.
The coordinateMapper function in the SDK that maps between the color frame and the depth frame uses the camera calibration parameters. The default camera calibration parameters are hard-coded in the SDK; however, there are slight differences between individual devices. You could try to recalibrate the Kinect cameras and use the updated calibration parameters to get the correct overlay of the color and depth maps. Note, however, that simply replacing the camera calibration parameters in the Kinect Fusion code and recompiling does not work, because the parameters are overridden by the closed-source Kinect Fusion DLL. So you'll have to write your own code to update each frame at runtime.
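To get a feeling for the first cause, here is a rough back-of-the-envelope sketch in Python. It assumes a simple pinhole camera model, and every number in it (object speed, timestamp difference, distance, focal length in pixels) is a made-up illustration value, not a Kinect specification.

```python
def apparent_shift_px(speed_m_per_s, dt_s, depth_m, focal_px):
    """Pixel shift of a point moving sideways at speed_m_per_s, seen at depth_m,
    between two frames captured dt_s apart (pinhole model, small motion)."""
    sideways_motion_m = speed_m_per_s * dt_s
    return focal_px * sideways_motion_m / depth_m


# Hypothetical values: object moving at 1 m/s, 10 ms between depth and color capture,
# object 1 m from the sensor, focal length of 1000 px
shift = apparent_shift_px(speed_m_per_s=1.0, dt_s=0.010, depth_m=1.0, focal_px=1000.0)
print(shift)
```

Under these assumed numbers the color would be displaced by roughly ten pixels relative to the depth, and a static scene (speed 0) would show no shift from this cause at all, which is a quick way to tell the timing effect apart from a calibration error.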
Hope this helps.

Converting meters into pixel units

I am trying to convert distances from meters to pixels in a ROS node, using the PCL library and a Kinect for Xbox. I was using the code below to access the Euclidean coordinates of every point from the Kinect inside the ROS node, which are in meters. But I want to get these measurements in pixel units. What should I do?
void cloud_cb (const sensor_msgs::PointCloud2ConstPtr& input)
{
    pcl::PointCloud<pcl::PointXYZRGB> output;
    pcl::fromROSMsg(*input,output );
    for(int i=0;i<=400;i++)
    {
        for(int j=0;j<=400;j++)
        {
            p[i][j] = output.at(i,j);
            ROS_INFO("\n p.z = %f \t p.x = %f \t p.y = %f",p[i][j].z,p[i][j].x,p[i][j].y);
        }
    }
    sensor_msgs::PointCloud2 cloud;
    pub.publish (cloud);
}
Here p[row][col] is a point structure which contains the x, y, z coordinate values in meters, which I want to convert to pixel units. As far as I can see, the size of a pixel is not constant, so I can't use any value found on Google.
I found a similar question here: Kinect depth conversion from mm to pixels, but it has no solution.
There's a problem with trying to convert meters to pixels. Pixels aren't a standard unit. The physical size of 1 pixel varies between devices depending on the resolution and size of the screen.
Even if you know the resolution of the screen, the conversion is still non-trivial.
const int L = 1920; //screen width
const int H = 1280; //screen height
for(int i=0;i<=L;i++){
    for(int j=0;j<=H;j++){
        p[i][j] = output.at(i*400/L,j*400/H);
    }
}
Thus for every pixel you'll have a depth value corresponding to the depth value in the map. This will need some int conversion and improvement.
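The index scaling used above can be tried out in isolation. This is a small Python sketch of the same integer arithmetic; the function name is made up, and the 400x400 cloud size and the 1920x1280 screen are simply the numbers from the question and answer.

```python
def cloud_index_for_pixel(i, j, screen_w, screen_h, cloud_w=400, cloud_h=400):
    """Map screen pixel (i, j) to a cloud index by integer scaling (rounds down)."""
    return i * cloud_w // screen_w, j * cloud_h // screen_h


print(cloud_index_for_pixel(0, 0, 1920, 1280))        # top-left screen pixel
print(cloud_index_for_pixel(960, 640, 1920, 1280))    # screen centre
print(cloud_index_for_pixel(1919, 1279, 1920, 1280))  # last screen pixel
```

Several neighbouring screen pixels map to the same cloud point, since the screen has more pixels than the cloud has points; this is plain nearest-lower-index resampling, not a physical unit conversion.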

A* Pathfinding - how to modify G and H to include rough terrain movement cost?

I have A* pathfinding implemented in my 2D game and it works well on a plain map with obstacles. Now I'm trying to understand how to modify the algorithm so that it counts rough terrain (hills, forest, etc.) as 2 moves instead of 1.
With a movement cost of 1, the algorithm uses the integers 10 and 14 in the move cost function. I'm interested in how to modify these values if a cell actually has a movement cost of 2. Will it be 20:17?
Here's how my current algorithm computes G and H (adapted from Ray Wenderlich):
// Compute the H score from a position to another (from the current position to the final desired position)
- (int)computeHScoreFromCoord:(CGPoint)fromCoord toCoord:(CGPoint)toCoord
{
    // Here we use the Manhattan method, which calculates the total number of steps moved horizontally and vertically to reach the
    // final desired step from the current step, ignoring any obstacles that may be in the way
    return abs(toCoord.x - fromCoord.x) + abs(toCoord.y - fromCoord.y);
}
// Compute the cost of moving from a step to an adjacent one
- (int)costToMoveFromStep:(ShortestPathStep *)fromStep toAdjacentStep:(ShortestPathStep *)toStep
{
    return ((fromStep.position.x != toStep.position.x)
            && (fromStep.position.y != toStep.position.y))
            ? 14 : 10;
}
If some of the edges have movement cost 2, you will simply add 2 to the G of the parent node, rather than 1.
As for H: it doesn't need to change. The resulting heuristic will still be admissible/consistent.
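As a minimal sketch of this advice, here is a small A* in Python on a 4-connected grid where each cell carries its own entry cost. It is not the tutorial's code: the grid layout, the cost values, and all names are made up for illustration. G accumulates the cost of the cell being entered, while H stays the plain Manhattan distance.

```python
import heapq


def a_star(grid, start, goal):
    """grid[y][x] is the cost of entering a cell (1 = plain, 2 = rough, None = blocked).
    Cells are (x, y) tuples. Returns (total_cost, path) or None if unreachable."""
    def h(cell):
        # Manhattan distance; it never overestimates while every step costs at least 1
        return abs(cell[0] - goal[0]) + abs(cell[1] - goal[1])

    open_heap = [(h(start), 0, start)]   # entries are (F, G, cell)
    best_g = {start: 0}
    parent = {}
    while open_heap:
        f, g, cell = heapq.heappop(open_heap)
        if cell == goal:
            path = [cell]
            while cell in parent:
                cell = parent[cell]
                path.append(cell)
            return g, path[::-1]
        if g > best_g[cell]:
            continue  # stale heap entry, a cheaper route to this cell was found later
        x, y = cell
        for nx, ny in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
            if not (0 <= ny < len(grid) and 0 <= nx < len(grid[0])):
                continue
            step_cost = grid[ny][nx]
            if step_cost is None:
                continue  # obstacle
            new_g = g + step_cost  # G grows by the terrain cost, not by a fixed 1
            if new_g < best_g.get((nx, ny), float("inf")):
                best_g[(nx, ny)] = new_g
                parent[(nx, ny)] = cell
                heapq.heappush(open_heap, (new_g + h((nx, ny)), new_g, (nx, ny)))
    return None


# Three rough cells (cost 2) lie on the direct route; the detour below is cheaper
grid = [
    [1, 2, 2, 2, 1],
    [1, 1, 1, 1, 1],
]
cost, path = a_star(grid, (0, 0), (4, 0))
print(cost, path)
```

Going straight through the rough cells would cost 7, so the search takes the six-step detour along the bottom row instead, without any change to the heuristic.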
I think I got it: with this line the tutorial author checks whether the move is 1 square or 2 squares (diagonal) away from the step that is currently being considered.
return ((fromStep.position.x != toStep.position.x)
&& (fromStep.position.y != toStep.position.y))
? 14 : 10;
Unfortunately, this is a really simple case and does not really explain what has to be done. The number 10 is used to make calculations easier (10 = the cost of 1 straight move), and 14 (= 1 diagonal move) is an approximation of sqrt(10*10 + 10*10) ≈ 14.14.
I attempted to introduce terrain cost below, and this requires extra information - I need to know which cell I'm going through to reach the destination. This turned out to be really annoying, and the code below is clearly not my best, but I attempted to spell out what's going on at each step.
If I'm making a diagonal move, I need to know its move cost AND the move cost of the 2 squares that can be used to get there. I can then pick the lower movement cost of the two squares and plug it into an equation of the form:
moveCost = (int)sqrt(lowestMoveCost*lowestMoveCost + (stepNode.moveCost*10) * (stepNode.moveCost*10));
Here's the entire loop that checks adjacent steps and creates new steps out of them with the move cost. It finds the tile in my map array and returns its terrain cost.
NSArray *adjSteps = [self walkableAdjacentTilesCoordForTileCoord:currentStep.position];
for (NSValue *v in adjSteps) {
    ShortestPathStep *step = [[ShortestPathStep alloc] initWithPosition:[v CGPointValue]];
    // Check if the step isn't already in the closed set
    if ([self.spClosedSteps containsObject:step]) {
        continue; // Ignore it
    }
    tileIndex = [MapOfTiles tileIndexForCoordinate:step.position];
    DLog(@"point (x%.0f y%.0f):%i",step.position.x,step.position.y,tileIndex);
    stepNode = [[MapOfTiles sharedInstance] mapTiles] [tileIndex];
    // int moveCost = [self costToMoveFromStep:currentStep toAdjacentStep:step];
    //in my case 0,0 is bottom left, y points up x points right
    BOOL isDiagonal = (currentStep.position.x != step.position.x) && (currentStep.position.y != step.position.y);
    if (!isDiagonal) {
        //move one step away - easy, multiply move cost by 10
        moveCost = stepNode.moveCost*10;
    } else {
        possibleMove1 = 0;
        possibleMove2 = 0;
        //we are moving diagonally, figure out in which direction
        if(step.position.y > currentStep.position.y) {
            //moving up
            possibleMove1 = tileIndex + 1;
            if(step.position.x > currentStep.position.x) {
                //moving right and up
                possibleMove2 = tileIndex + tileCountTall;
            } else {
                //moving left and up
                possibleMove2 = tileIndex - tileCountTall;
            }
        } else {
            //moving down
            possibleMove1 = tileIndex - 1;
            if(step.position.x > currentStep.position.x) {
                //moving right and down
                possibleMove2 = tileIndex + tileCountTall;
            } else {
                //moving left and down
                possibleMove2 = tileIndex - tileCountTall;
            }
        }
        moveNode1 = nil;
        moveNode2 = nil;
        CGPoint coordinate1 = [MapOfTiles tileCoordForIndex:possibleMove1];
        CGPoint coordinate2 = [MapOfTiles tileCoordForIndex:possibleMove2];
        if([adjSteps containsObject:[NSValue valueWithCGPoint:coordinate1]]) {
            //we know that possible move to reach destination has been deemed walkable, get its move cost from the map
            moveNode1 = [[MapOfTiles sharedInstance] mapTiles] [possibleMove1];
        }
        if([adjSteps containsObject:[NSValue valueWithCGPoint:coordinate2]]) {
            //we know that the second possible move is walkable
            moveNode2 = [[MapOfTiles sharedInstance] mapTiles] [possibleMove2];
        }
#warning not sure about this one if the algorithm has to backtrack really far back
        //find out which square has the lowest move cost
        lowestMoveCost = fminf(moveNode1.moveCost, moveNode2.moveCost) * 10;
        moveCost = (int)sqrt(lowestMoveCost*lowestMoveCost + (stepNode.moveCost*10) * (stepNode.moveCost*10));
    }
    // Compute the cost from the current step to that step
    // Check if the step is already in the open list
    NSUInteger index = [self.spOpenSteps indexOfObject:step];
    if (index == NSNotFound) { // Not on the open list, so add it
        // Set the current step as the parent
        step.parent = currentStep;
        // The G score is equal to the parent G score + the cost to move from the parent to it
        step.gScore = currentStep.gScore + moveCost;
        // Compute the H score which is the estimated movement cost to move from that step to the desired tile coordinate
        step.hScore = [self computeHScoreFromCoord:step.position toCoord:toTileCoord];
        // Adding it with the function which is preserving the list ordered by F score
        [self insertInOpenSteps:step];
    }
    else { // Already in the open list
        step = (self.spOpenSteps)[index]; // To retrieve the old one (which has its scores already computed ;-)
        // Check to see if the G score for that step is lower if we use the current step to get there
        if ((currentStep.gScore + moveCost) < step.gScore) {
            // The G score is equal to the parent G score + the cost to move from the parent to it
            step.gScore = currentStep.gScore + moveCost;
            // Because the G Score has changed, the F score may have changed too
            // So to keep the open list ordered we have to remove the step, and re-insert it with
            // the insert function which is preserving the list ordered by F score
            // Now we can remove it from the list without being afraid that it can be released
            [self.spOpenSteps removeObjectAtIndex:index];
            // Re-insert it with the function which is preserving the list ordered by F score
            [self insertInOpenSteps:step];
        }
    }
}
These types of problems are quite common in, say, chip routing and, yes, gamedev.
The standard approach is to build a graph (in C++ you could use Boost's "grid graph" or a similar structure). If you can afford to have an object for each vertex, then the solution is quite easy.
You connect two vertices (neighbors or diagonally adjacent) by an edge, unless there is an obstacle between them. You assign this edge a weight equal to the edge length (10 or 14) times the terrain cost. Sometimes people prefer not to exclude obstacle edges but to assign extremely high weights to them (an advantage of this approach is that you are guaranteed to find at least some path, even when the object is stuck on an island).
Then you apply the A* algorithm. Your heuristic function (H) can be "pessimistic" (equal to the Euclidean distance times the max move cost), "optimistic" (Euclidean distance times the min move cost), or anything in between. Note that a pessimistic heuristic can overestimate the remaining cost, in which case the path found is no longer guaranteed to be the cheapest one. Different heuristics will give your search slightly different "personalities" but usually do not matter much.
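The edge weighting and the two heuristic flavours described above could look roughly like this in Python. The function names, the cell coordinates, and the terrain costs are illustrative assumptions; only the constants 10 and 14 come from the question.

```python
import math

STRAIGHT, DIAGONAL = 10, 14   # edge lengths used in the question


def edge_weight(from_cell, to_cell, terrain_cost):
    """Weight of the edge between two adjacent cells: edge length times terrain cost."""
    dx = abs(to_cell[0] - from_cell[0])
    dy = abs(to_cell[1] - from_cell[1])
    length = DIAGONAL if dx == 1 and dy == 1 else STRAIGHT
    return length * terrain_cost


def heuristic(cell, goal, cost_factor):
    """Euclidean distance in the same units as the edge lengths, scaled by a terrain
    cost factor: the minimum cost on the map gives the 'optimistic' variant,
    the maximum cost gives the 'pessimistic' one."""
    return STRAIGHT * math.hypot(goal[0] - cell[0], goal[1] - cell[1]) * cost_factor


print(edge_weight((0, 0), (1, 0), 2))   # straight move into rough terrain
print(edge_weight((0, 0), (1, 1), 2))   # diagonal move into rough terrain
print(heuristic((0, 0), (3, 4), 1))     # optimistic estimate
print(heuristic((0, 0), (3, 4), 2))     # pessimistic estimate
```

With a terrain cost of 2, the straight and diagonal weights become 20 and 28, which also answers the "20:17" guess in the question under this weighting scheme.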

Microsoft Kinect SDK depth data to real world coordinates

I'm using the Microsoft Kinect SDK to get the depth and color information from a Kinect and then convert that information into a point cloud. I need the depth information to be in real world coordinates with the centre of the camera as the origin.
I've seen a number of conversion functions, but these are apparently for OpenNI and non-Microsoft drivers. I've read that the depth information coming from the Kinect is already in millimetres, and is contained in the 11 bits... or something.
How do I convert this bit information into real world coordinates that I can use?
Thanks in advance!
This is catered for within the Kinect for Windows library using the Microsoft.Research.Kinect.Nui.SkeletonEngine class, and the following method:
public Vector DepthImageToSkeleton (
    float depthX,
    float depthY,
    short depthValue
)
This method will map the depth image produced by the Kinect into one that is vector scalable, based on real world measurements.
From there (when I've created a mesh in the past), after enumerating the byte array in the bitmap created by the Kinect depth image, you create a new list of Vector points similar to the following:
var width = image.Image.Width;
var height = image.Image.Height;
var greyIndex = 0;
var points = new List<Vector>();
for (var y = 0; y < height; y++)
{
    for (var x = 0; x < width; x++)
    {
        short depth;
        switch (image.Type)
        {
            case ImageType.DepthAndPlayerIndex:
                // the low 3 bits hold the player index, so shift them out
                depth = (short)((image.Image.Bits[greyIndex] >> 3) | (image.Image.Bits[greyIndex + 1] << 5));
                if (depth <= maximumDepth)
                {
                    points.Add(nui.SkeletonEngine.DepthImageToSkeleton(((float)x / image.Image.Width), ((float)y / image.Image.Height), (short)(depth << 3)));
                }
                break;
            case ImageType.Depth: // depth comes back mirrored
                depth = (short)((image.Image.Bits[greyIndex] | image.Image.Bits[greyIndex + 1] << 8));
                if (depth <= maximumDepth)
                {
                    points.Add(nui.SkeletonEngine.DepthImageToSkeleton(((float)(width - x - 1) / image.Image.Width), ((float)y / image.Image.Height), (short)(depth << 3)));
                }
                break;
        }
        greyIndex += 2; // two bytes per depth pixel
    }
}
The end result is a list of vectors in skeleton space, which is measured in meters; if you want centimeters, multiply by 100 (etc.).
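The bit manipulation in the two switch branches can be checked with a short Python sketch. The function names are made up, and the layout (low 3 bits = player index, remaining bits = depth, little-endian byte order) is inferred from the shifts in the code above rather than taken from the SDK documentation.

```python
def unpack_depth_and_player(low_byte, high_byte):
    """Split a 16-bit little-endian sample into (depth, player_index)."""
    value = low_byte | (high_byte << 8)
    return value >> 3, value & 0b111


def unpack_depth_only(low_byte, high_byte):
    """Plain 16-bit little-endian depth, as in the ImageType.Depth branch."""
    return low_byte | (high_byte << 8)


# Build a sample by hand: depth 1000 with player index 2
packed = (1000 << 3) | 2
low, high = packed & 0xFF, packed >> 8

print(unpack_depth_and_player(low, high))
# Same result as the expression used in the C# code above
print((low >> 3) | (high << 5))
```

Shifting the combined 16-bit value right by 3 is equivalent to `(low >> 3) | (high << 5)`, which is why the C# code can work on the two bytes separately.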