How to get the coordinates of an object in YOLO object detection using a depth camera?

I want to get the coordinates of a detected object using a depth camera. The depth camera is an Astra Pro, and I am running Ubuntu 20.04.

Related

How to highlight the desk in RGB with the assistance of a 3D point cloud?

What I describe next is a preprocessing step for object detection.
I have a desk with some snacks and pasters on it, and a depth camera (Intel RealSense R300). I want to mask out the pasters on the desk (set their pixel values to 0) and also ignore interfering objects around the desk (objects that are on the ground rather than on the desk). To make full use of the depth camera, my idea is to take the depth image and use the RANSAC algorithm to detect the plane of the desk, which would exclude the pasters at the same time.
However, the problem is that I have successfully extracted the desk plane with RANSAC in the 3D point cloud, but I don't know how to map it back to the 2D RGB image so that I can set the plane's pixel values to 0.
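If the depth image is registered to the RGB image (so each depth pixel corresponds to an RGB pixel), one way to do the mapping is to back-project every depth pixel to 3D with the depth intrinsics, measure its distance to the RANSAC plane, and zero out the RGB pixels whose points lie on the plane. A minimal numpy sketch, assuming placeholder intrinsics fx, fy, cx, cy and the plane coefficients (a, b, c, d) you already obtained:

    import numpy as np

    # Minimal sketch, assuming the depth image is registered to the RGB image,
    # depth is in millimetres (uint16), and fx, fy, cx, cy are the depth camera
    # intrinsics. 'plane' is the (a, b, c, d) found by RANSAC: a*x + b*y + c*z + d = 0.
    def mask_plane_in_rgb(rgb, depth, fx, fy, cx, cy, plane,
                          depth_scale=0.001, dist_thresh=0.02):
        a, b, c, d = plane
        h, w = depth.shape
        u, v = np.meshgrid(np.arange(w), np.arange(h))   # pixel grid
        z = depth.astype(np.float32) * depth_scale       # depth in metres
        x = (u - cx) * z / fx                            # back-project every pixel
        y = (v - cy) * z / fy                            # to a 3D point
        # Point-to-plane distance for every pixel.
        dist = np.abs(a * x + b * y + c * z + d) / np.sqrt(a * a + b * b + c * c)
        on_plane = (z > 0) & (dist < dist_thresh)        # 2D mask of the desk plane
        out = rgb.copy()
        out[on_plane] = 0                                # zero those RGB pixels
        return out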

3D Human Pose Estimation

I am working on human pose estimation.
I am able to generate the 2D coordinates of the different joints of a person in an image.
But I need 3D coordinates for the purpose of my project.
Is there any library or code available to generate the 3D coordinates of the joints?
Please help.
For 3D coordinates in pose estimation there is a limitation: you cannot recover a 3D pose from a single (monocular) camera alone. You have two ways to estimate it:
use an RGB-D (red, green, blue and depth) camera such as the Kinect, or
use stereo vision with at least two cameras.
For RGB-D, OpenCV's contrib modules provide support for this.
If you want to use stereo vision, the steps are as follows (see the sketch after this list):
1. Get the camera calibration parameters; for calibration you can follow this.
2. Undistort your points using the calibration parameters.
3. Compute the projection matrices of both cameras.
4. Finally, use OpenCV's triangulation to get the 3D coordinates.
For more information about each step, you can read up on stereo vision, camera calibration, triangulation, etc.
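A minimal OpenCV sketch of steps 2 to 4, assuming calibration already gave you K1, D1, K2, D2 (the intrinsics and distortion of both cameras) and R, t (the pose of the second camera relative to the first); pts1 and pts2 are the matching 2D joint locations in the two views, and all of these names are placeholders:

    import cv2
    import numpy as np

    # Minimal sketch of steps 2-4. pts1 and pts2 are matching 2D joint positions
    # in the two views, shape (N, 2). K1, D1, K2, D2, R, t come from calibration.
    def triangulate_joints(pts1, pts2, K1, D1, K2, D2, R, t):
        # Step 2: undistort to normalized image coordinates.
        n1 = cv2.undistortPoints(pts1.reshape(-1, 1, 2).astype(np.float32), K1, D1)
        n2 = cv2.undistortPoints(pts2.reshape(-1, 1, 2).astype(np.float32), K2, D2)
        # Step 3: projection matrices (in normalized coordinates K is the identity).
        P1 = np.hstack([np.eye(3), np.zeros((3, 1))])    # camera 1 at the origin
        P2 = np.hstack([R, t.reshape(3, 1)])             # camera 2 pose
        # Step 4: triangulate; OpenCV returns 4xN homogeneous coordinates.
        X_h = cv2.triangulatePoints(P1, P2,
                                    n1.reshape(-1, 2).T.astype(float),
                                    n2.reshape(-1, 2).T.astype(float))
        return (X_h[:3] / X_h[3]).T                      # Nx3 3D joint positions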

Bounding box coordinates via video object detection

Is it possible to get the coordinates of a detected object through video object detection? I used OpenCV to split the video into images and LabelImg to put bounding boxes on them. For the output, I want to be able to read (for a video file) the coordinates of each bounding box so I can get the centre of the box.
Object detection works on a per-image or per-frame basis.
A basic object detection model takes an image as input and returns the detected bounding boxes.
Once your model is trained, read the video frame by frame (or skipping every 'n' frames), pass each frame to the object detector, get the box coordinates from it, and draw them on the output video frame (a minimal sketch of this loop follows the links below).
That is how object detection works for a video.
Please refer to the links below for reference:
https://github.com/tensorflow/models/tree/master/research/object_detection
https://www.edureka.co/blog/tensorflow-object-detection-tutorial/
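A minimal sketch of that per-frame loop with OpenCV; here detect is a placeholder for whatever trained detector you use (e.g. a TensorFlow Object Detection API model) and is assumed to return (x, y, w, h) boxes for a single frame:

    import cv2

    # Minimal per-frame detection loop. `detect` is a placeholder for your trained
    # detector and is assumed to return a list of (x, y, w, h) boxes per frame.
    def run_on_video(video_path, detect, skip=1):
        cap = cv2.VideoCapture(video_path)
        frame_idx = 0
        while True:
            ok, frame = cap.read()
            if not ok:                                   # end of the video file
                break
            if frame_idx % skip == 0:
                for (x, y, w, h) in detect(frame):
                    cx, cy = x + w // 2, y + h // 2      # centre of the bounding box
                    cv2.rectangle(frame, (x, y), (x + w, y + h), (0, 255, 0), 2)
                    cv2.circle(frame, (cx, cy), 3, (0, 0, 255), -1)
            cv2.imshow("detections", frame)
            if cv2.waitKey(1) & 0xFF == ord("q"):
                break
            frame_idx += 1
        cap.release()
        cv2.destroyAllWindows()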

How to transfer a depth pixel to camera space using Kinect SDK v2

I'm using the Kinect v2 and the Kinect SDK v2.
I have a couple of questions about coordinate mapping:
1. How do I transfer a camera space point (a point in the 3D coordinate system) to depth space together with its depth value? The current MapCameraPointToDepthSpace method only returns the depth space coordinate, and without the depth value it is of little use. Does anyone know how to get the depth value?
2. How do I get the color camera intrinsics? There is only a GetDepthCameraIntrinsics method, which returns the depth camera intrinsics. What about the color camera?
3. How do I use the depth camera intrinsics? The Kinect v2 seems to account for radial distortion, but how do I use these intrinsics to transform between a depth pixel and a 3D point? Is there any example code that does this?
Regarding 1: The depth value of your remapped world coordinate is the same as the original world coordinate's Z value. Read the description of the depth buffer and of world coordinate space: in both, this value is simply the distance from the point to the Kinect's plane, in metres. If instead you want the depth value of the object seen on the depth frame directly behind your remapped coordinate, you have to read the depth image buffer at that position.
Regarding 3: You use the camera intrinsics when you have to construct a CoordinateMapper manually (i.e. when you don't have a Kinect available). When you get the CoordinateMapper associated with a Kinect (through the Kinect object's CoordinateMapper property), it already contains that Kinect's intrinsics; that's why there is a GetDepthCameraIntrinsics method which returns that specific Kinect's intrinsics (they can vary from device to device).
Regarding 2: There is no way to get the color camera intrinsics from the SDK. You have to estimate them through camera calibration.
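To illustrate the depth-pixel-to-3D-point direction asked about in question 3: with the pinhole model, the focal lengths and principal point from GetDepthCameraIntrinsics are enough to back-project a depth pixel. A minimal sketch that ignores the radial distortion terms (undistort the pixel first if you need them); the sign conventions may differ from the SDK's CameraSpacePoint:

    import numpy as np

    # Minimal pinhole back-projection sketch. fx, fy, cx, cy correspond to the
    # FocalLengthX/Y and PrincipalPointX/Y returned by GetDepthCameraIntrinsics;
    # depth_m is the depth frame value at pixel (u, v), converted to metres.
    # Radial distortion is ignored here; to account for it, undistort (u, v)
    # first using the radial coefficients from the same intrinsics.
    def depth_pixel_to_camera_space(u, v, depth_m, fx, fy, cx, cy):
        x = (u - cx) * depth_m / fx
        y = (v - cy) * depth_m / fy
        # Note: Kinect camera space has Y pointing up, so the sign of y may need
        # to be flipped to match CameraSpacePoint exactly.
        return np.array([x, y, depth_m])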

How to texture map point cloud from Kinect with image taken by another camera?

The Kinect camera has a very low-resolution RGB image. I want to use the point cloud from the Kinect's depth sensor but texture-map it with an image taken by another camera.
Could anyone please guide me on how to do that?
See the Kinect Calibration Toolbox, v2.0 http://sourceforge.net/projects/kinectcalib/files/v2.0/
2012-02-09 - v2.0 - Major update. Added new disparity distortion model and simultaneous calibration of external RGB camera.
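For the texture mapping step itself: once the toolbox has given you the external camera's intrinsics K and its pose R, t relative to the Kinect's depth camera, you project every 3D point into the external image and sample its colour. A minimal sketch with placeholder names (points is an N×3 array in the depth camera frame, image is the external camera's picture):

    import numpy as np

    # Minimal sketch: colour each point of the Kinect point cloud from an external
    # image, assuming R, t map depth-camera coordinates into the external camera's
    # frame and K is that camera's 3x3 intrinsic matrix (all from calibration).
    def colorize_point_cloud(points, image, K, R, t):
        cam = points @ R.T + t                       # into the external camera frame
        uv = cam @ K.T                               # pinhole projection
        z = cam[:, 2]
        valid = z > 0                                # points in front of the camera
        u = np.zeros(len(points), dtype=int)
        v = np.zeros(len(points), dtype=int)
        u[valid] = np.round(uv[valid, 0] / z[valid]).astype(int)
        v[valid] = np.round(uv[valid, 1] / z[valid]).astype(int)
        h, w = image.shape[:2]
        valid &= (u >= 0) & (u < w) & (v >= 0) & (v < h)
        colors = np.zeros((len(points), 3), dtype=image.dtype)
        colors[valid] = image[v[valid], u[valid]]    # sample the texture per point
        return colors, valid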