Generating tensors with elements based on position in xtensor - xtensor

I am trying to build a data structure to represent an RGB image in xtensor (a 3D matrix, with shape in the form of (WIDTH, HEIGHT, 3).
Each "pixel" contains data collected by a function of the pixel coordinates. Basically, I want to replicate what this code does in python:
image = [[cell_info(x, y) for x in range(WIDTH)]
for y in range(HEIGHT)]
where cell info returns a 3 elements list representing the color channels.
I suppose the proper way to do this should be using an xgenerator, but to be honest I cannot understand how to use that class.

I found a solution:
I changed cell_info to accept a int channel parameter, so that it returns an integer instead of an array. Then I wrote this:
class img_generator_fn {
public:
using value_type = int;
img_generator_fn(const Map *map, const Position &center, shared_ptr<const Player> player,
const unsigned int field_radius)
: m_map(map), m_center(center), m_player(player), m_translation(-(field_radius + 1)) {}
~img_generator_fn() { m_map = nullptr; }
inline auto operator()(const unsigned int x, const unsigned int y, const unsigned int channel) const {
return m_map->at(m_center + Position(x, y))->cell_info(m_player, channel);
}
template <class It> inline auto element(It, It end) const {
return m_map->at(m_center + Position(*(end - 2) + m_translation, (*(end - 3)) + m_translation))
->cell_info(m_player, *(end - 1));
}
private:
const Map *m_map;
const Position &m_center;
shared_ptr<const Player> m_player;
const unsigned int m_translation;
};
template <unsigned int field_side> auto field(const Position &center, shared_ptr<const Player> player) const {
const array<unsigned int, 3> shape = {field_side, field_side, 3};
auto gen = xt::detail::make_xgenerator(img_generator_fn(this, center, player, (field_side - 1) / 2), shape);
return xt::xtensor_fixed<int, xt::xshape<field_side, field_side, 3>>(gen);
}
Here Map represents a 2D matrix, it is the structure which contains, togheter with player, the informations I want to store in an image. The function at picks up the map cell at the specified position (the map cell will be converted into a pixel). The function field generates an image centered around center from a given Map, using an xgenerator.

Related

How do I extract the output from CGAL::poisson_surface_reconstruction_delaunay?

I am trying to convert a point cloud to a trimesh using CGAL::poisson_surface_reconstruction_delaunay() and extract the data inside the trimesh to an OpenGL friendly format:
// The function below should set vertices and indices so that:
// triangle 0: (vertices[indices[0]],vertices[indices[1]],vertices[indices[2]]),
// triangle 1: (vertices[indices[3]],vertices[indices[4]],vertices[indices[5]])
// ...
// triangle n - 1
void reconstructPointsToSurfaceInOpenGLFormat(const& std::list<std::pair<Kernel::Point_3, Kernel::Vector_3>> points, // input: points and normals
std::vector<glm::vec3>& vertices, // output
std::vector<unsigned int>& indices) { // output
CGAL::Surface_mesh<Kernel::Point_3> trimesh;
double spacing = 10;
bool ok = CGAL::poisson_surface_reconstruction_delaunay(points.begin(), points.end(),
CGAL::First_of_pair_property_map<std::pair<Kernel::Point_3, Kernel::Vector_3>>(),
CGAL::Second_of_pair_property_map<std::pair<Kernel::Point_3, Kernel::Vector_3>>(),
trimesh, spacing);
// How do I set the vertices and indices values?
}
Please help me on iterating trough the triangles in trimesh and setting the vertices and indices in the code above.
The class Polyhedron_3 is not indexed based so you need to provide a item class with ids like Polyhedron_items_with_id_3. You will then need to call CGAL::set_halfedgeds_items_id(trimesh) to init the ids. If you can't modify the Polyhedron type, then you can use dynamic properties and will need to init the ids.
Note that Surface_mesh is indexed based and no particular handling is needed to get indices.
Based on sloriots code from his answer:
void mesh2GLM(CGAL::Surface_mesh<Kernel::Point_3>& trimesh, std::vector<glm::vec3>& vertices, std::vector<int>& indices) {
std::map<size_t, size_t> meshIndex2Index;
// Loop over all vertices in mesh:
size_t index = 0;
for (Mesh::Vertex_index v : CGAL::vertices(trimesh)) {
CGAL::Epick::Point_3 point = trimesh.point(v);
std::size_t vi = v;
vertices.push_back(glm::vec3(point.x(), point.y(), point.z()));
meshIndex2Index[vi] = index;
index++;
}
// Loop over all triangles (faces):
for (Mesh::Face_index f : faces(trimesh)) {
for (Mesh::Vertex_index v : CGAL::vertices_around_face(CGAL::halfedge(f, trimesh), trimesh)) {
trimesh.point(v);
std::size_t vi = v;
size_t index = meshIndex2Index[vi];
indices.push_back(index);
}
}
}
Seems to work fine.

Skeletal animation bug with Assimp in DirectX 12

I am using Assimp to load an FBX model with animation (created in Blender) into my DirectX 12 game, but I'm experiencing a very frustrating bug with the animation rendered by the game application.
The test model is a simple 'flagpole' containing four bones like so:
Bone0 -> Bone1 -> Bone2 -> Bone3
The model renders correctly in its rest pose when the keyframe animation is bypassed.
The model also renders and animates properly when the animation rotates the model only by the root bone (Bone0).
However, when importing a model that rotates at the first joint (i.e. at Bone1), the vertices clustered around each joint seem 'stuck' in their original positions, while the vertices surrounding the 'bones' proper appear to follow through with the correct animation.
The result is a crappy zigzag of stretched geometry like so:
Instead the model should resemble an 'allen-key' shape at the end of its animation pose, as shown by the same model rendered in the AssimpViewer utility tool:
Since the model is rendering correctly in AssimpViewer, it's reasonable to assume there are no issues with the FBX file exported by Blender. I then checked and confirmed that the vertices 'stuck' around the joints did indeed have their vertex weights correctly assigned by the game loading code.
The C++ model loading and animation code is based on the popular OGLDev tutorial: https://ogldev.org/www/tutorial38/tutorial38.html
Now the infuriating thing is, since the AssimpViewer tool was correctly rendering the model animation, I also copied in the SceneAnimator and AnimEvaluator classes from that tool to generate the final bone transforms via that code branch as well... only to end up with exactly the same zigzag bug in the game!
I'm reasonably confident there aren't any issues with finding the bone hierarchy structure at initialization, so here are the key functions that traverse the hierarchy and interpolate key frames each frame.
VOID Mesh::ReadNodeHeirarchy(FLOAT animationTime, CONST aiNode* pNode, CONST aiAnimation* pAnim, CONST aiMatrix4x4 parentTransform)
{
std::string nodeName(pNode->mName.data);
// nodeTransform is a relative transform to parent node space
aiMatrix4x4 nodeTransform = pNode->mTransformation;
CONST aiNodeAnim* pNodeAnim = FindNodeAnim(pAnim, nodeName);
if (pNodeAnim)
{
// Interpolate scaling and generate scaling transformation matrix
aiVector3D scaling(1.f, 1.f, 1.f);
CalcInterpolatedScaling(scaling, animationTime, pNodeAnim);
// Interpolate rotation and generate rotation transformation matrix
aiQuaternion rotationQ (1.f, 0.f, 0.f, 0.f);
CalcInterpolatedRotation(rotationQ, animationTime, pNodeAnim);
// Interpolate translation and generate translation transformation matrix
aiVector3D translat(0.f, 0.f, 0.f);
CalcInterpolatedPosition(translat, animationTime, pNodeAnim);
// build the SRT transform matrix
nodeTransform = aiMatrix4x4(rotationQ.GetMatrix());
nodeTransform.a1 *= scaling.x; nodeTransform.b1 *= scaling.x; nodeTransform.c1 *= scaling.x;
nodeTransform.a2 *= scaling.y; nodeTransform.b2 *= scaling.y; nodeTransform.c2 *= scaling.y;
nodeTransform.a3 *= scaling.z; nodeTransform.b3 *= scaling.z; nodeTransform.c3 *= scaling.z;
nodeTransform.a4 = translat.x; nodeTransform.b4 = translat.y; nodeTransform.c4 = translat.z;
}
aiMatrix4x4 globalTransform = parentTransform * nodeTransform;
if (m_boneMapping.find(nodeName) != m_boneMapping.end())
{
UINT boneIndex = m_boneMapping[nodeName];
// the global inverse transform returns us to mesh space!!!
m_boneInfo[boneIndex].FinalTransform = m_globalInverseTransform * globalTransform * m_boneInfo[boneIndex].BoneOffset;
//m_boneInfo[boneIndex].FinalTransform = m_boneInfo[boneIndex].BoneOffset * globalTransform * m_globalInverseTransform;
m_shaderTransforms[boneIndex] = aiMatrixToSimpleMatrix(m_boneInfo[boneIndex].FinalTransform);
}
for (UINT i = 0u; i < pNode->mNumChildren; i++)
{
ReadNodeHeirarchy(animationTime, pNode->mChildren[i], pAnim, globalTransform);
}
}
VOID Mesh::CalcInterpolatedRotation(aiQuaternion& out, FLOAT animationTime, CONST aiNodeAnim* pNodeAnim)
{
UINT rotationKeys = pNodeAnim->mNumRotationKeys;
// we need at least two values to interpolate...
if (rotationKeys == 1u)
{
CONST aiQuaternion& key = pNodeAnim->mRotationKeys[0u].mValue;
out = key;
return;
}
UINT rotationIndex = FindRotation(animationTime, pNodeAnim);
UINT nextRotationIndex = (rotationIndex + 1u) % rotationKeys;
assert(nextRotationIndex < rotationKeys);
CONST aiQuatKey& key = pNodeAnim->mRotationKeys[rotationIndex];
CONST aiQuatKey& nextKey = pNodeAnim->mRotationKeys[nextRotationIndex];
FLOAT deltaTime = FLOAT(nextKey.mTime) - FLOAT(key.mTime);
FLOAT factor = (animationTime - FLOAT(key.mTime)) / deltaTime;
assert(factor >= 0.f && factor <= 1.f);
aiQuaternion::Interpolate(out, key.mValue, nextKey.mValue, factor);
}
I've just included the rotation interpolation here, since the scaling and translation functions are identical. For those unaware, Assimp's aiMatrix4x4 type follows a column-vector math convention, so I haven't messed with original matrix multiplication order.
About the only deviation between my code and the two Assimp-based code branches I've adopted is the requirement to convert the final transforms from aiMatrix4x4 types into a DirectXTK SimpleMath Matrix (really an XMMATRIX) with this conversion function:
Matrix Mesh::aiMatrixToSimpleMatrix(CONST aiMatrix4x4 m)
{
return Matrix
(m.a1, m.a2, m.a3, m.a4,
m.b1, m.b2, m.b3, m.b4,
m.c1, m.c2, m.c3, m.c4,
m.d1, m.d2, m.d3, m.d4);
}
Because of the column-vector orientation of aiMatrix4x4 Assimp matrices, the final bone transforms are not transposed for HLSL consumption. The array of final bone transforms are passed to the skinning vertex shader constant buffer as follows.
commandList->SetPipelineState(m_psoForwardSkinned.Get()); // set PSO
// Update vertex shader with current bone transforms
CONST std::vector<Matrix> transforms = m_assimpModel.GetShaderTransforms();
VSBonePassConstants vsBoneConstants{};
for (UINT i = 0; i < m_assimpModel.GetNumBones(); i++)
{
// We do not transpose bone matrices for HLSL because the original
// Assimp matrices are column-vector matrices.
vsBoneConstants.boneTransforms[i] = transforms[i];
//vsBoneConstants.boneTransforms[i] = transforms[i].Transpose();
//vsBoneConstants.boneTransforms[i] = Matrix::Identity;
}
GraphicsResource vsBoneCB = m_graphicsMemory->AllocateConstant(vsBoneConstants);
vsPerObjects.gWorld = m_assimp_world.Transpose(); // vertex shader per object constant
vsPerObjectCB = m_graphicsMemory->AllocateConstant(vsPerObjects);
commandList->SetGraphicsRootConstantBufferView(RootParameterIndex::VSBoneConstantBuffer, vsBoneCB.GpuAddress());
commandList->SetGraphicsRootConstantBufferView(RootParameterIndex::VSPerObjConstBuffer, vsPerObjectCB.GpuAddress());
//commandList->SetGraphicsRootDescriptorTable(RootParameterIndex::ObjectSRV, m_shaderTextureHeap->GetGpuHandle(ShaderTexDescriptors::SuzanneDiffuse));
commandList->SetGraphicsRootDescriptorTable(RootParameterIndex::ObjectSRV, m_shaderTextureHeap->GetGpuHandle(ShaderTexDescriptors::DefaultDiffuse));
for (UINT i = 0; i < m_assimpModel.GetMeshSize(); i++)
{
commandList->IASetVertexBuffers(0u, 1u, &m_assimpModel.meshEntries[i].GetVertexBufferView());
commandList->IASetIndexBuffer(&m_assimpModel.meshEntries[i].GetIndexBufferView());
commandList->IASetPrimitiveTopology(D3D_PRIMITIVE_TOPOLOGY_TRIANGLELIST);
commandList->DrawIndexedInstanced(m_assimpModel.meshEntries[i].GetIndexCount(), 1u, 0u, 0u, 0u);
}
Please note I am using the Graphics Resource memory management helper object found in the DirectXTK12 library in the code above. Finally, here's the skinning vertex shader I'm using.
// Luna (2016) lighting model adapted from Moller
#define MAX_BONES 4
// vertex shader constant data that varies per object
cbuffer cbVSPerObject : register(b3)
{
float4x4 gWorld;
//float4x4 gTexTransform;
}
// vertex shader constant data that varies per frame
cbuffer cbVSPerFrame : register(b5)
{
float4x4 gViewProj;
float4x4 gShadowTransform;
}
// bone matrix constant data that varies per object
cbuffer cbVSBonesPerObject : register(b9)
{
float4x4 gBoneTransforms[MAX_BONES];
}
struct VertexIn
{
float3 posL : SV_POSITION;
float3 normalL : NORMAL;
float2 texCoord : TEXCOORD0;
float3 tangentU : TANGENT;
float4 boneWeights : BONEWEIGHT;
uint4 boneIndices : BONEINDEX;
};
struct VertexOut
{
float4 posH : SV_POSITION;
//float3 posW : POSITION;
float4 shadowPosH : POSITION0;
float3 posW : POSITION1;
float3 normalW : NORMAL;
float2 texCoord : TEXCOORD0;
float3 tangentW : TANGENT;
};
VertexOut VS_main(VertexIn vin)
{
VertexOut vout = (VertexOut)0.f;
// Perform vertex skinning.
// Ignore BoneWeights.w and instead calculate the last weight value
// to ensure all bone weights sum to unity.
float4 weights = vin.boneWeights;
//weights.w = 1.f - dot(weights.xyz, float3(1.f, 1.f, 1.f));
//float4 weights = { 0.f, 0.f, 0.f, 0.f };
//weights.x = vin.boneWeights.x;
//weights.y = vin.boneWeights.y;
//weights.z = vin.boneWeights.z;
weights.w = 1.f - (weights.x + weights.y + weights.z);
float4 localPos = float4(vin.posL, 1.f);
float3 localNrm = vin.normalL;
float3 localTan = vin.tangentU;
float3 objPos = mul(localPos, (float4x3)gBoneTransforms[vin.boneIndices.x]).xyz * weights.x;
objPos += mul(localPos, (float4x3)gBoneTransforms[vin.boneIndices.y]).xyz * weights.y;
objPos += mul(localPos, (float4x3)gBoneTransforms[vin.boneIndices.z]).xyz * weights.z;
objPos += mul(localPos, (float4x3)gBoneTransforms[vin.boneIndices.w]).xyz * weights.w;
float3 objNrm = mul(localNrm, (float3x3)gBoneTransforms[vin.boneIndices.x]) * weights.x;
objNrm += mul(localNrm, (float3x3)gBoneTransforms[vin.boneIndices.y]) * weights.y;
objNrm += mul(localNrm, (float3x3)gBoneTransforms[vin.boneIndices.z]) * weights.z;
objNrm += mul(localNrm, (float3x3)gBoneTransforms[vin.boneIndices.w]) * weights.w;
float3 objTan = mul(localTan, (float3x3)gBoneTransforms[vin.boneIndices.x]) * weights.x;
objTan += mul(localTan, (float3x3)gBoneTransforms[vin.boneIndices.y]) * weights.y;
objTan += mul(localTan, (float3x3)gBoneTransforms[vin.boneIndices.z]) * weights.z;
objTan += mul(localTan, (float3x3)gBoneTransforms[vin.boneIndices.w]) * weights.w;
vin.posL = objPos;
vin.normalL = objNrm;
vin.tangentU.xyz = objTan;
//vin.posL = posL;
//vin.normalL = normalL;
//vin.tangentU.xyz = tangentL;
// End vertex skinning
// transform to world space
float4 posW = mul(float4(vin.posL, 1.f), gWorld);
vout.posW = posW.xyz;
// assumes nonuniform scaling, otherwise needs inverse-transpose of world matrix
vout.normalW = mul(vin.normalL, (float3x3)gWorld);
vout.tangentW = mul(vin.tangentU, (float3x3)gWorld);
// transform to homogenous clip space
vout.posH = mul(posW, gViewProj);
// pass texcoords to pixel shader
vout.texCoord = vin.texCoord;
//float4 texC = mul(float4(vin.TexC, 0.0f, 1.0f), gTexTransform);
//vout.TexC = mul(texC, gMatTransform).xy;
// generate projective tex-coords to project shadow map onto scene
vout.shadowPosH = mul(posW, gShadowTransform);
return vout;
}
Some last tests I tried before posting:
I tested the code with a Collada (DAE) model exported from Blender, only to observe the same distorted zigzagging in the Win32 desktop application.
I also confirmed the aiScene object for the loaded model returns an identity matrix for the global root transform (also verified in AssimpViewer).
I have stared at this code for about a week and am going out of my mind! Really hoping someone can spot what I have missed. If you need more code or info, please ask!
This seems to be a bug with the published code in the tutorials / documentation. It would be great if you could open an issue-report here: Assimp-Projectpage on GitHub .
It's taken almost another two weeks of pain, but I finally found the bug. It was in my own code, and it was self-inflicted. Before I show the solution, I should explain the further troubleshooting I did to get there.
After losing faith with Assimp (even though the AssimpViewer tool was animating my model correctly), I turned to the FBX SDK. The FBX ViewScene command line utility tool that's available as part of the SDK was also showing and animating my model properly, so I had hope...
So after a few days reviewing the FBX SDK tutorials, and taking another week to write an FBX importer for my Windows desktop game, I loaded my model and... saw exactly the same zig-zag animation anomaly as the version loaded by Assimp!
This frustrating outcome meant I could at least eliminate Assimp and the FBX SDK as the source of the problem, and focus again on the vertex shader. The shader I'm using for vertex skinning was adopted from the 'Character Animation' chapter of Frank Luna's text. It was identical in every way, which led me to recheck the C++ vertex structure declared on the application side...
Here's the C++ vertex declaration for skinned vertices:
struct Vertex
{
// added constructors
Vertex() = default;
Vertex(FLOAT x, FLOAT y, FLOAT z,
FLOAT nx, FLOAT ny, FLOAT nz,
FLOAT u, FLOAT v,
FLOAT tx, FLOAT ty, FLOAT tz) :
Pos(x, y, z),
Normal(nx, ny, nz),
TexC(u, v),
Tangent(tx, ty, tz) {}
Vertex(DirectX::SimpleMath::Vector3 pos,
DirectX::SimpleMath::Vector3 normal,
DirectX::SimpleMath::Vector2 texC,
DirectX::SimpleMath::Vector3 tangent) :
Pos(pos), Normal(normal), TexC(texC), Tangent(tangent) {}
DirectX::SimpleMath::Vector3 Pos;
DirectX::SimpleMath::Vector3 Normal;
DirectX::SimpleMath::Vector2 TexC;
DirectX::SimpleMath::Vector3 Tangent;
FLOAT BoneWeights[4];
BYTE BoneIndices[4];
//UINT BoneIndices[4]; <--- YOU HAVE CAUSED ME A MONTH OF PAIN
};
Quite early on, being confused by Luna's use of BYTE to store the array of bone indices, I changed this structure element to UINT, figuring this still matched the input declaration shown here:
static CONST D3D12_INPUT_ELEMENT_DESC inputElementDescSkinned[] =
{
{ "SV_POSITION", 0u, DXGI_FORMAT_R32G32B32_FLOAT, 0u, D3D12_APPEND_ALIGNED_ELEMENT, D3D12_INPUT_CLASSIFICATION_PER_VERTEX_DATA, 0u },
{ "NORMAL", 0u, DXGI_FORMAT_R32G32B32_FLOAT, 0u, D3D12_APPEND_ALIGNED_ELEMENT, D3D12_INPUT_CLASSIFICATION_PER_VERTEX_DATA, 0u },
{ "TEXCOORD", 0u, DXGI_FORMAT_R32G32_FLOAT, 0u, D3D12_APPEND_ALIGNED_ELEMENT, D3D12_INPUT_CLASSIFICATION_PER_VERTEX_DATA, 0u },
{ "TANGENT", 0u, DXGI_FORMAT_R32G32B32_FLOAT, 0u, D3D12_APPEND_ALIGNED_ELEMENT, D3D12_INPUT_CLASSIFICATION_PER_VERTEX_DATA, 0u },
//{ "BINORMAL", 0, DXGI_FORMAT_R32G32B32_FLOAT, 0, D3D12_APPEND_ALIGNED_ELEMENT, D3D12_INPUT_CLASSIFICATION_PER_VERTEX_DATA, 0 },
{ "BONEWEIGHT", 0u, DXGI_FORMAT_R32G32B32A32_FLOAT, 0u, D3D12_APPEND_ALIGNED_ELEMENT, D3D12_INPUT_CLASSIFICATION_PER_VERTEX_DATA, 0u },
{ "BONEINDEX", 0u, DXGI_FORMAT_R8G8B8A8_UINT, 0u, D3D12_APPEND_ALIGNED_ELEMENT, D3D12_INPUT_CLASSIFICATION_PER_VERTEX_DATA, 0u },
};
Here was the bug. By declaring UINT in the vertex structure for bone indices, four bytes were being assigned to store each bone index. But in the vertex input declaration, the DXGI_FORMAT_R8G8B8A8_UINT format specified for the "BONEINDEX" was assigning one byte per index. I suspect this data type and format size mismatch was resulting in only one valid bone index being able to fit in the BONEINDEX element, and so only one index value was passed to the vertex shader each frame, instead of the whole array of four indices for correct bone transform lookups.
So now I've learned... the hard way... why Luna had declared an array of BYTE for bone indices in the original C++ vertex structure.
I hope this experience will be of value to someone else, and always be careful changing code from your original learning sources.

Tensorflow retrained graph in C# (Tensorflowsharp)

I'am just trying to use a retrained inception model in Tensorflow sharp in Unity.
The retrained model was prepared with optimize_for_inference and is working like a charm in python.
But it is pretty inaccurate in c#.
the code works like this:
First i get the Picture
//webcamtexture transformed to picture in jpg
var pic = _texture.EncodeToJpg();
//added Picture to queue for the object detection thread
_detectedObjects.addTens(pic);
After that a thread will handle each collected picture
public void HandlePicture(byte[] picture)
{
var tensor = ImageUtil.CreateTensorFromImageFile(picture);
var runner = session.GetRunner();
runner.AddInput(g_input, tensor).Fetch(g_output);
var output = runner.Run();
var bestIdx = 0;
float best = 0;
var result = output[0];
var rshape = result.Shape;
var probabilities = ((float[][])result.GetValue(jagged: true))[0];
for (int r = 0; r < probabilities.Length; r++)
{
if (probabilities[r] > best)
{
bestIdx = r;
best = probabilities[r];
}
}
Debug.Log("Tensorflow thinks this is: " + labels[bestIdx] + " Prob : " + best * 100);
}
so my guess is:
1.it has something to do with retrained graphs (because i can't find any application/test it is used and working).
2.It has something to do with how i handle the picture transform into a tensor?! (but if that is wrong i could need help there, the code further down)
to transform the picture i'am also using a graph like it is used in the tensorsharp example
public static class ImageUtil
{
// Convert the image in filename to a Tensor suitable as input to the Inception model.
public static TFTensor CreateTensorFromImageFile(byte[] contents, TFDataType destinationDataType = TFDataType.Float)
{
// DecodeJpeg uses a scalar String-valued tensor as input.
var tensor = TFTensor.CreateString(contents);
TFGraph graph;
TFOutput input, output;
// Construct a graph to normalize the image
ConstructGraphToNormalizeImage(out graph, out input, out output, destinationDataType);
// Execute that graph to normalize this one image
using (var session = new TFSession(graph))
{
var normalized = session.Run(
inputs: new[] { input },
inputValues: new[] { tensor },
outputs: new[] { output });
return normalized[0];
}
}
// The inception model takes as input the image described by a Tensor in a very
// specific normalized format (a particular image size, shape of the input tensor,
// normalized pixel values etc.).
//
// This function constructs a graph of TensorFlow operations which takes as
// input a JPEG-encoded string and returns a tensor suitable as input to the
// inception model.
private static void ConstructGraphToNormalizeImage(out TFGraph graph, out TFOutput input, out TFOutput output, TFDataType destinationDataType = TFDataType.Float)
{
// Some constants specific to the pre-trained model at:
// https://storage.googleapis.com/download.tensorflow.org/models/inception5h.zip
//
// - The model was trained after with images scaled to 224x224 pixels.
// - The colors, represented as R, G, B in 1-byte each were converted to
// float using (value - Mean)/Scale.
const int W = 299;
const int H = 299;
const float Mean = 128;
const float Scale = 1;
graph = new TFGraph();
input = graph.Placeholder(TFDataType.String);
output = graph.Cast(graph.Div(
x: graph.Sub(
x: graph.ResizeBilinear(
images: graph.ExpandDims(
input: graph.Cast(
graph.DecodeJpeg(contents: input, channels: 3), DstT: TFDataType.Float),
dim: graph.Const(0, "make_batch")),
size: graph.Const(new int[] { W, H }, "size")),
y: graph.Const(Mean, "mean")),
y: graph.Const(Scale, "scale")), destinationDataType);
}
}

calculating forward kinematics using D-H matrix

I have a 6-DOF robot arm model:
robot arm structure
I want to calculate forward kinematics, so I uses the D-H matrix. the D-H parameters are:
static const std::vector<float> theta = {
0,0,90.0f,0,-90.0f,0};
// d
static const std::vector<float> d = {
380.948f,0,0,-560.18f,0,0};
// a
static const std::vector<float> a = {
-220.0f,522.331f,80.0f,0,0,94.77f};
// alpha
static const std::vector<float> alpha = {
90.0f,0,90.0f,-90.0f,-90.0f,0};
and the calculation :
glm::mat4 Robothand::armForKinematics() noexcept
{
glm::mat4 pose(1.0f);
float cos_theta, sin_theta, cos_alpha, sin_alpha;
for (auto i = 0; i < 6;i++)
{
cos_theta = cosf(glm::radians(theta[i]));
sin_theta = sinf(glm::radians(theta[i]));
cos_alpha = cosf(glm::radians(alpha[i]));
sin_alpha = sinf(glm::radians(alpha[i]));
glm::mat4 Ai = {
cos_theta, -sin_theta * cos_alpha,sin_theta * sin_alpha, a[i] * cos_theta,
sin_theta, cos_theta * cos_alpha, -cos_theta * sin_alpha,a[i] * sin_theta,
0, sin_alpha, cos_alpha, d[i],
0, 0, 0, 1 };
pose = pose * Ai;
}
return pose;
}
the problem I have is that, I can't get the correct result, for example, I want to calculate the transformation matrix from first joint to the 4th joint, I will change the for loop i < 3,then I can get the pose matrix, and I can the origin coordinate in 4th coordinate system by pose * (0,0,0,1).but the result (380.948,382.331,0) seems not correct because it should be move along x-axis not y-axis. I have read many books and materials about D-H matrix, but I can't figure out what's wrong with it.
I have figured it out by myself, the real problem behind is glm::mat, glm::mat is col-type which means columns will be initialized before rows,I changed the code and get the correct result:
for (int i = 0; i < joint_num; ++i)
{
pose = glm::rotate(pose, glm::radians(degrees[i]), glm::vec3(0, 0, 1));
pose = glm::translate(pose,glm::vec3(0,0,d[i]));
pose = glm::translate(pose, glm::vec3(a[i], 0, 0));
pose = glm::rotate(pose,glm::radians(alpha[i]),glm::vec3(1,0,0));
}
then I can get the position by:
auto pos = pose * glm::vec4(x,y,z,1);

In Kinect SDK v2.0 how do I map a pixel in the color image to a voxel in the depth image?

I know how to go the other way around. What I am looking for is, given a (x,y) coordinate in the pixel space (of the 1920x1080 image), how do I get the corresponding (if available) (x,y,z) (in meters) of the depth image. I realize that there are more pixels than voxels and it could be possible not to find any, but Microsoft's SDK has a CoordinateMapper class. This exposes the MapColorFrameToCameraSpace function. If I use that, I can get an array of points in the camera space (x,y,z) but I am unable to figure out how to extract the mapping for a specific pixel.
You probably need to use
CoordinateMapper.MapDepthFrameToColorSpace
Find the color space coordinates of all depth point.
Then, compare your pixel coordinate (x, y) with these color space coordinates. My solution is to find the closest point (might have better way), because the mapped coordinates are floats.
If you use C#, here is the code. Hope it helps!
private ushort GetDepthValueFromPixelPoint(KinectSensor kinectSensor, ushort[] depthData, float PixelX, float PixelY)
{
ushort depthValue = 0;
if (null != depthData)
{
ColorSpacePoint[] depP = new ColorSpacePoint[512 * 424];
kinectSensor.CoordinateMapper.MapDepthFrameToColorSpace(_depthData, depP);
int depthIndex = FindClosestIndex(depP, PixelX, PixelY);
if (depthIndex < 0)
Console.WriteLine("-1");
else
{
depthValue = _depthData[depthIndex];
Console.WriteLine(depthValue);
}
}
return depthValue;
}
private int FindClosestIndex(ColorSpacePoint[] depP, float PixelX, float PixelY)
{
int depthIndex = -1;
float closestPoint = float.MaxValue;
for (int j = 0; j < depP.Length; ++j)
{
float dis = DistanceOfTwoPoints(depP[j], PixelX, PixelY);
if (dis < closestPoint)
{
closestPoint = dis;
depthIndex = j;
}
}
return depthIndex;
}
private float DistanceOfTwoPoints(ColorSpacePoint colorSpacePoint, float PixelX, float PixelY)
{
float x = colorSpacePoint.X - PixelX;
float y = colorSpacePoint.Y - PixelY;
return (float)Math.Sqrt(x * x + y * y);
}