Dual kinect calibration using powerfull IR LED illuminator - kinect

i am using multiple Kinects within the scene. So I need to calibrate them and find the extrinsic parameters like translation and rotation world coordinate system. Once I have that information, i can reconstruct the scene at highest level of accuracy. the important point is : i want to get submillimeter accuracy and may be it would be nice if i could use powerfull IR projector in my system. But i do not have any Background about IR sensor and calibration methods. So i need to know about tow subject : 1- is it possible to add IR LED illuminator to kinect and manage it? 2- if i could add how to calibrate my new system?

Calibration (determining relative transforms (rotation, scale, position)) is only part of the problem. You also need to consider whether each Kinect can handle the interference of the other Kinect's projected IR reference patterns.
"Shake n Sense" (by Microsoft Research) is a novel approach that you may be able to use that has been demonstrated to work.
https://www.youtube.com/watch?v=CSBDY0RuhS4

Related

Is programming a voxel based graphics API theoretically possible?

This is entirely a theoretical question because I understand the time it would take to do such a thing would be ridiculous
I've been working with "voxels" a lot lately and the only way I can display them to a user is to either triangulate the visible surfaces or make a CPU ray-tracer but both come with their own problems.
Simply put, if we dismiss the storage space needed for voxel meshs and targeted a very specific GPU would someone who was wanting to create a graphics API like OpenGL but with "true" voxel primitives that don't need to be converted be able to make such thing or are GPUs designed specifically for triangles with no way to introduce a new base primitive?
Its possible and it was already done many times
games like Minecraft,SpaceEngineers...
3D printing tools and slicers
MRI/PET scans tools
Yes rendering on GPU is possible with the two base methods you mention. Games usually use the transform to boundary representation 3D geometry. With rise of shaders even ray tracers are now possible here mine:
simple GLSL voxel ray tracer
using native OpenGL architecture and passing geometry as 3D texture. In order to obtain speed you need to add BVH or similar spatial subdivision of geometry...
However voxel based tools have been here for quite some time. For example many isometric games/engines are voxel based (tile is a voxel) like this one:
Improving performance of click detection on a staggered column isometric grid
Also do you remember UFO ? It was playable on x286 and it was also "voxel/tile" based isometric.

It is possible to recognize all objects from a room with Microsoft Kinect?

I have a project where I have to recognize an entire room so I can calculate the distances between objects (like big ones eg. bed, table, etc.) and a person in that room. It is possible something like that using Microsoft Kinect?
Thank you!
Kinect provides you following
Depth Stream
Color Stream
Skeleton information
Its up to you how you use this data.
To answer your question - Official Micorosft Kinect SDK doesnt provides shape detection out of the box. But it does provide you skeleton data/face tracking with which you can detect distance of user from kinect.
Also with mapping color stream to depth stream you can detect how far a particular pixel is from kinect. In your implementation if you have unique characteristics of different objects like color,shape and size you can probably detect them and also detect the distance.
OpenCV is one of the library that i use for computer vision etc.
Again its up to you how you use this data.
Kinect camera provides depth and consequently 3D information (point cloud) about matte objects in the range 0.5-10 meters. With this information it is possible to segment out the floor (by fitting a plane) of the room and possibly walls and the ceiling. This step is important since these surfaces often connect separate objects making them a one big object.
The remaining parts of point cloud can be segmented by depth if they don't touch each other physically. Using color one can separate the objects even further. Note that we implicitly define an object as 3D dense and color consistent entity while other definitions are also possible.
As soon as you have your objects segmented you can measure the distances between your segments, analyse their shape, recognize artifacts or humans, etc. To the best of my knowledge however a Skeleton library can recognize humans after they moved for a few seconds. Below is a simple depth map that was broken on a few segments using depth but not color information.

how can I detect floor movements such as push-ups and sit-ups with Kinect?

I have tried to implement this using skeleton tracking provided by Kinect. But it doesn't work when I am lying down on a floor.
According to Blitz Games CTO Andrew Oliver, there are specific ways to implement with depth stream or tracking silhouette of a user instead of using skeleton frames from Kinect API. You may find a good example in video game Your Shape Fitness. Here is a link showing floor movements such as push-ups!
Do you guys have any idea how to implement this or detect movements and compare them with other movements using depth stream?
What if a 360 degree sensor was developed, one that recognises movements not only directly in front, to the left, or right of it, but also optimizes for movement above(?)/below it? The image that I just imagined was the spherical, 360 degree motion sensors often used in secure buildings and such.
Without another sensor I think you'll need to track the depth data yourself. Here's a paper with some details about how MS implements skeletal tracking in the Kinect SDK, that might get you started. They implement object matching while parsing the depth data to capture joints in the body, you may need to implement some object templates and algorithms to do the matching yourself. Unless you can reuse some of the skeletal tracking libraries to parse out objects from the depth data for you.

What frameworks for depth cameras are out there?

I want to evaluate the performance of several SDKs / frameworks for depth cameras. These cameras can either be using Time-of-Flight or structured light.
The framework should be capable (at least) of person tracking / blob detection and gesture recognition.
So far I found the following frameworks:
OpenNI (structured light only)
Microsoft Kinect SDK (Kinect only)
Beckon SDK by Omek Interactive (ToF and structured light)
iisu by SoftKinetic (ToF and structured light)
Are there any other frameworks I should be aware of?
EDIT: I found this article by Techradar that seems to indicate that these are indeed the only options currently available.
Any feedback would be very much appreciated!
I have found some interesting links on this. You can take MIT's approach using CodAC . They list lots of facts on this post, the most important ones I will post here.
9. What are limitations of this technique?
The main limitation of our framework is inapplicability to scenes with curvilinear
objects, which would require extensions of the current mathematical model.
Another limitation is that a periodic light source creates a wrap-around error
as it does in other TOF devices. For scenes in which surfaces have high reflectance
or texture variations, availability of a traditional 2D image prior to our data
acquisition allows for improved depth map reconstruction as discussed in our paper.
10. What are advantages of this technique/device and how does it
compare with existing TOF-based range sensing techniques?
In laser scanning, spatial resolution is limited by the scanning time.
TOF cameras do not provide high spatial resolution because they rely on a
low-resolution 2D pixel array of range-sensing pixels. CoDAC is a single-sensor,
high spatial resolution depth camera which works by exploiting the sparsity of natural
scene structure.
11. What is the range resolution and spatial resolution of the CoDAC system?
We have demonstrated sub-centimeter range resolution in our experiments.
This is significantly better than fundamental limit of about 10 cm that would
arise from using a detector with 0.7 nanosecond rise time if we were not using
parametric signal modeling. The improvement in range resolution comes from the
parametric modeling and deconvolution in our framework. We refer the reader to
our publications for complete details and analysis.
We have demonstrated 64-by-64 pixel spatial resolution,
as this is the spatial resolution of our spatial light modulator.
Spatially patterning with a digital micromirror device (DMD) will enable
much higher spatial resolution. Our experiments use only 205 projection patterns,
which correspond to just 5% of number of pixels in the reconstructed depth map.
This is a significant improvement over raster scanning in LIDAR, and it is
obtained without the 2D sensor array used in TOF cameras.
Also another interesting project I found on Youtube uses libfreenect and libusb
There is also dSensingNI which is described as
This work presents an approach to overcome the disadvantages of existing interaction
frameworks and technologies for touch detection and object interaction. The robust and
easy to use framework dSensingNI (Depth Sensing Natural Interaction) is described,
which supports multitouch and tangible interaction with arbitrary objects. It uses
images from a depth-sensing camera and provides tracking of users fingers of palm of
hands and combines this with object interaction, such as grasping, grouping and
stacking, which can be used for advanced interaction techniques.
So you have hit most of them out there, especially that use Kinect, but there are a few other options out there! Hope this Helps!

Robot camera + motion detection

I have a project in which we (me and my student) will develop a system for robot.
In this robot we have a camera that capture.
My question is how to detect motions, movements.
Is there a solution?? Which technics and tools to use ??
Which language to use (possible for Java for example) ??
Thanks in advance.
Best regards.
Ali
Consider using OpenCV:
http://opencv.org
It has a lot of useful vision algorithms built in, and supports, C, C++ and Python, as well as GPU functionality.
I will suggest you Microsoft Visual Studio wich is an integrated development environment and c# programming language. Emgu CV library wich is a cross platform .Net wrapper to the OpenCV image processing library. A simple method from a static position is this:
Convert a single frame to grayscale.
Convert a new frames from real time into grayscale.
Make abstractions between the first frame and new frame from real time.
The result of this is a third, new frame comprised of the differences between the first two. Use erosion and thresholding for that to get a frame with white representing the motioned section and black representing the rest of the space.
If the objects you attempt to track have a distinct color, you should be able to target them adequately.
One way to accomplish this is to choose an appropriate space for color as RGB space. Keep in mind this may be too sensitive, even to small illumination variance. (It really depends on the objects you want to track and tracking scenario.)
You can use OpenCV
Here you can find a C++ tutorial: http://blog.cedric.ws/opencv-simple-motion-detection
using two cameras instead of one could be useful in your project, for detecting depth in the image and real distance of a motion (in a perspective vision)
Stereoscopic depth rendition
Real-time detection of independent motion using stereo