Kinect SDK skeletonization method - kinect

I was wondering if there's a way to modify the depth map before it is sent to the skeletonization algorithm used by the Kinect, for example, if we want to run the skeletonization on the output of a segmented depth image. So far I have reviewed the methods in the SDK, but I haven't been able to find an exposed skeletonization method. It's like you can either turn the skeleton on or off, but you have no control over its inputs.
If anyone has any idea regarding this topic I will be much obliged.
Shamita: skeletonization means tracking the joints of the user in real time. I am editing because I can't comment (not enough reputation).

Every joint comes with a depth coordinate, and I don't think you can modify the Kinect hardware input stream. But you can categorize the joints by depth segment. For example, on the live stream, a joint whose depth is above 5 and below 10 falls into category A. This can be done on the live stream itself because it is just a simple calculation.
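A minimal sketch of that categorization, assuming the joint depths have already been read from a skeleton frame into a plain dictionary. The band limits, the second band "B", and the joint names are hypothetical and only illustrate the idea:

```python
from typing import Dict, Optional, Tuple

# Hypothetical depth bands as (lower, upper) limits, both exclusive.
# Units are whatever the joint depth is reported in.
DEPTH_BANDS: Dict[str, Tuple[float, float]] = {
    "A": (5.0, 10.0),
    "B": (10.0, 15.0),
}


def categorize_depth(depth: float) -> Optional[str]:
    """Return the name of the band containing `depth`, or None if none matches."""
    for name, (low, high) in DEPTH_BANDS.items():
        if low < depth < high:
            return name
    return None


def categorize_joints(joint_depths: Dict[str, float]) -> Dict[str, Optional[str]]:
    """Map each joint name to its depth category for one skeleton frame."""
    return {joint: categorize_depth(z) for joint, z in joint_depths.items()}


# One frame of (made-up) joint depths.
frame = {"head": 7.2, "hand_left": 4.1, "hand_right": 12.0}
print(categorize_joints(frame))
```

Calling `categorize_joints` once per incoming frame is enough, since each lookup is only a couple of comparisons.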

Related

Are there cameras that capture not only color but the actual depth?

Getting into 3D reconstruction techniques, I'm curious whether there are cameras that capture not only color but also depth at the moment the image is captured. It would appear that getting the depth of a particular pixel on the sensor would be far more accurate than reconstructing it after the fact from many images.
Thoughts? Suggestions?
Yes, there are sensors that are not based on the triangulation principle. They use time of flight or similar principles to capture the depth for a particular pixel. Take a look at PMD-Sensors.
There are also the Microsoft Kinect and Intel® RealSense™ that you can take a look at.

Insert skeleton in 3D model programmatically

Background
I'm working on a project where a user gets scanned by a Kinect (v2). The result will be a generated 3D model which is suitable for use in games.
The scanning aspect is going quite well, and I've generated some good user models.
Example:
Note: This is just an early test model. It still needs to be cleaned up, and the stance needs to change to properly read skeletal data.
Problem
The problem I'm currently facing is that I'm unsure how to place skeletal data inside the generated 3D model. I can't seem to find a program that will let me insert the skeleton in the 3D model programmatically. I'd like to do this either via a program that I can control programmatically, or adjust the 3D model file in such a way that skeletal data gets included within the file.
What have I tried
I've been looking around for similar questions on Google and StackOverflow, but they usually refer to either motion capture or skeletal animation. I know Maya has the option to insert skeletons in 3D models, but as far as I could find that is always done by hand. Maybe there is a more technical term for the problem I'm trying to solve, but I don't know it.
I do have a train of thought on how to achieve the skeleton insertion. I imagine it to go like this:
1. Scan the user and generate a 3D model with Kinect;
1.2. Clean user model, getting rid of any deformations or unnecessary information. Close holes that are left in the clean up process.
2. Scan user skeletal data using the Kinect.
2.2. Extract the skeleton data.
2.3. Get joint locations and store as xyz-coordinates for 3D space. Store bone length and directions.
3. Read 3D skeleton data in a program that can create skeletons.
4. Save the new model with inserted skeleton.
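The step of storing joint locations, bone lengths and directions could look roughly like the sketch below. It assumes the joints have already been extracted as xyz-coordinates. The joint names and the parent/child pairs in `BONES` are hypothetical, and a real skeleton has many more joints:

```python
import math
from typing import Dict, List, Tuple

Vec3 = Tuple[float, float, float]

# Hypothetical (parent, child) joint pairs that define the bones.
BONES: List[Tuple[str, str]] = [
    ("spine_base", "spine_mid"),
    ("spine_mid", "neck"),
    ("neck", "head"),
]


def bone_table(joints: Dict[str, Vec3]) -> List[dict]:
    """For each bone, compute its length and unit direction from parent to child."""
    rows = []
    for parent, child in BONES:
        px, py, pz = joints[parent]
        cx, cy, cz = joints[child]
        dx, dy, dz = cx - px, cy - py, cz - pz
        length = math.sqrt(dx * dx + dy * dy + dz * dz)
        if length == 0.0:
            raise ValueError(f"joints {parent!r} and {child!r} coincide")
        rows.append({
            "parent": parent,
            "child": child,
            "length": length,
            "direction": (dx / length, dy / length, dz / length),
        })
    return rows


# Made-up joint positions in meters, in the sensor's coordinate frame.
joints = {
    "spine_base": (0.0, 0.0, 2.0),
    "spine_mid": (0.0, 0.3, 2.0),
    "neck": (0.0, 0.6, 2.0),
    "head": (0.0, 0.75, 2.0),
}
for row in bone_table(joints):
    print(row)
```

A table like this can be written to any intermediate format and read back by whatever program ends up creating the skeleton.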
Question
Can anyone recommend (I know, this is perhaps "opinion based") a program to read the skeletal data and insert it into a 3D model? Is it possible to utilize Maya for this purpose?
Thanks in advance.
Note: I opted to post the question here and not on Graphics Design Stack Exchange (or other Stack Exchange sites) because I feel it's more coding related, and perhaps more useful for people who will search here in the future. Apologies if it's posted on the wrong site.
A tricky part of your question is what you mean by "inserting the skeleton". Typically bone data is very separate from your geometry, and stored in different places in your scene graph (with the bone data being hierarchical in nature).
There are file formats you can export to where you might establish some association between your geometry and skeleton, but that's very format-specific as to how you associate the two together (ex: FBX vs. Collada).
Probably the closest thing to "inserting" or, more appropriately, "attaching" a skeleton to a mesh is skinning. There you compute weight assignments, basically determining how much each bone influences a given vertex in your mesh.
This is a tough part to get right (both programmatically and artistically). Depending on your quality needs, it is often semi-automatic at best: for the highest quality needs (commercial games, films, etc.), artists labor over tweaking the resulting weight assignments and/or the skeleton.
There are algorithms for determining these weight assignments that range from simple heuristics, like assigning weights based on the nearest line distance (very crude, and it will often fall apart near tricky areas like the pelvis or shoulder), to quite sophisticated ones that treat the mesh as a solid volume (using voxels or tetrahedral representations) when assigning weights. Example: http://blog.wolfire.com/2009/11/volumetric-heat-diffusion-skinning/
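To make the crude end of that range concrete, here is a rough sketch of the nearest-line-distance heuristic. It treats each bone as a line segment and weights it by inverse distance to the vertex. The `falloff` exponent and the inverse-distance formula are my own assumptions for illustration, not what any particular package does:

```python
import math
from typing import List, Sequence, Tuple

Vec3 = Tuple[float, float, float]
Bone = Tuple[Vec3, Vec3]  # a bone as a line segment (start joint, end joint)


def point_segment_distance(p: Vec3, a: Vec3, b: Vec3) -> float:
    """Shortest distance from point p to the segment a-b."""
    ab = tuple(bi - ai for ai, bi in zip(a, b))
    ap = tuple(pi - ai for ai, pi in zip(a, p))
    ab_len_sq = sum(c * c for c in ab)
    # Parameter of the projection of p onto the line, clamped to the segment.
    t = 0.0 if ab_len_sq == 0.0 else sum(x * y for x, y in zip(ap, ab)) / ab_len_sq
    t = max(0.0, min(1.0, t))
    closest = tuple(ai + t * c for ai, c in zip(a, ab))
    return math.dist(p, closest)


def skin_weights(vertex: Vec3, bones: Sequence[Bone],
                 falloff: float = 4.0, eps: float = 1e-9) -> List[float]:
    """Normalized inverse-distance weights of each bone for one vertex."""
    raw = [1.0 / (point_segment_distance(vertex, a, b) + eps) ** falloff
           for a, b in bones]
    total = sum(raw)
    return [w / total for w in raw]


# Two bones laid end to end along the x axis.
bones = [((0.0, 0.0, 0.0), (1.0, 0.0, 0.0)), ((1.0, 0.0, 0.0), (2.0, 0.0, 0.0))]
print(skin_weights((0.5, 0.1, 0.0), bones))  # dominated by the first bone
print(skin_weights((1.0, 0.2, 0.0), bones))  # at the joint: split evenly
```

Because it only looks at straight-line distance, a vertex on one leg can pick up weight from the other leg's bone, which is exactly the kind of failure near the pelvis mentioned above.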
However, you might be able to get decent results using an algorithm like delta mush which allows you to get a bit sloppy with weight assignments but still get reasonably smooth deformations.
Now if you want to do this externally, pretty much any 3D animation software will do, including free ones like Blender. However, skinning and character animation in general is something that tends to take quite a bit of artistic skill and a lot of patience, so it's worth noting that it's not quite as easy as it might seem to make characters leap and dance and crouch and run and still look good even when you have a skeleton in advance. That weight association from skeleton to geometry is the toughest part. It's often the result of many hours of artists laboring over the deformations to get them to look right in a wide range of poses.

Is it possible to recognize all objects in a room with Microsoft Kinect?

I have a project where I have to recognize an entire room so I can calculate the distances between objects (big ones, e.g. bed, table, etc.) and a person in that room. Is something like that possible using Microsoft Kinect?
Thank you!
Kinect provides you with the following:
Depth Stream
Color Stream
Skeleton information
It's up to you how you use this data.
To answer your question: the official Microsoft Kinect SDK doesn't provide shape detection out of the box. But it does provide skeleton data and face tracking, with which you can detect the distance of the user from the Kinect.
Also, by mapping the color stream to the depth stream you can detect how far a particular pixel is from the Kinect. In your implementation, if the objects have unique characteristics such as color, shape, and size, you can probably detect them and also measure their distance.
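As a rough illustration of turning a depth pixel into a distance, the sketch below back-projects a pixel with a simple pinhole camera model. The intrinsics `FX`, `FY`, `CX`, `CY` are placeholder values I made up for a 640 x 480 depth image, and the depth is assumed to be in meters already. Real values come from calibrating your sensor or from the SDK's coordinate mapping:

```python
import math
from typing import Tuple

Vec3 = Tuple[float, float, float]

# Hypothetical pinhole intrinsics (focal lengths and principal point, in pixels).
FX, FY = 525.0, 525.0
CX, CY = 319.5, 239.5


def pixel_to_point(u: float, v: float, depth_m: float) -> Vec3:
    """Back-project pixel (u, v) with depth in meters to a 3D point in the camera frame."""
    x = (u - CX) * depth_m / FX
    y = (v - CY) * depth_m / FY
    return (x, y, depth_m)


def distance_from_sensor(u: float, v: float, depth_m: float) -> float:
    """Euclidean distance from the camera origin to the back-projected point."""
    return math.dist((0.0, 0.0, 0.0), pixel_to_point(u, v, depth_m))


def distance_between(a: Tuple[float, float, float],
                     b: Tuple[float, float, float]) -> float:
    """Distance between two pixels, each given as (u, v, depth_m)."""
    return math.dist(pixel_to_point(*a), pixel_to_point(*b))


# A pixel at the image center, 2 m away, and the same pixel 3 m away.
print(distance_from_sensor(319.5, 239.5, 2.0))
print(distance_between((319.5, 239.5, 2.0), (319.5, 239.5, 3.0)))
```

With the object and the person each reduced to a representative pixel (or a centroid of pixels), `distance_between` gives the kind of object-to-person distance the question asks about.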
OpenCV is one of the libraries that I use for computer vision.
Again, it's up to you how you use this data.
The Kinect camera provides depth and consequently 3D information (a point cloud) about matte objects in the range 0.5-10 meters. With this information it is possible to segment out the floor of the room (by fitting a plane) and possibly the walls and the ceiling. This step is important since these surfaces often connect separate objects, making them one big object.
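One common way to fit that floor plane is RANSAC, sketched below in plain Python. This is my choice of technique, not necessarily what the answer had in mind. The threshold, iteration count, and the synthetic scene are assumptions for illustration only:

```python
import math
import random
from typing import List, Optional, Sequence, Tuple

Vec3 = Tuple[float, float, float]
Plane = Tuple[Vec3, float]  # unit normal n and offset d, with n . p + d = 0 on the plane


def dot(a: Vec3, b: Vec3) -> float:
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def plane_through(p0: Vec3, p1: Vec3, p2: Vec3) -> Optional[Plane]:
    """Plane through three points, or None if they are (nearly) collinear."""
    u = (p1[0] - p0[0], p1[1] - p0[1], p1[2] - p0[2])
    v = (p2[0] - p0[0], p2[1] - p0[1], p2[2] - p0[2])
    n = (u[1] * v[2] - u[2] * v[1],
         u[2] * v[0] - u[0] * v[2],
         u[0] * v[1] - u[1] * v[0])
    norm = math.sqrt(dot(n, n))
    if norm < 1e-12:
        return None
    n = (n[0] / norm, n[1] / norm, n[2] / norm)
    return n, -dot(n, p0)


def ransac_plane(points: Sequence[Vec3], threshold: float = 0.02,
                 iterations: int = 200, seed: int = 0
                 ) -> Tuple[Optional[Plane], List[int]]:
    """Return the plane with the most inliers and the indices of those inliers."""
    rng = random.Random(seed)
    best_plane: Optional[Plane] = None
    best_inliers: List[int] = []
    for _ in range(iterations):
        plane = plane_through(*rng.sample(list(points), 3))
        if plane is None:
            continue
        n, d = plane
        inliers = [i for i, p in enumerate(points) if abs(dot(n, p) + d) <= threshold]
        if len(inliers) > len(best_inliers):
            best_plane, best_inliers = plane, inliers
    return best_plane, best_inliers


# Synthetic scene: a 10 x 10 floor patch at y = 0 plus 20 points of an object above it.
floor = [(0.1 * i, 0.0, 0.1 * j) for i in range(10) for j in range(10)]
obj = [(0.2 + 0.01 * k, 0.5 + 0.02 * k, 0.3) for k in range(20)]
cloud = floor + obj
plane, floor_idx = ransac_plane(cloud)
floor_set = set(floor_idx)
remaining = [p for i, p in enumerate(cloud) if i not in floor_set]
print(len(floor_idx), len(remaining))
```

The `remaining` points are what you would pass on to the later segmentation steps. Running the same routine again on them is one way to peel off walls.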
The remaining parts of the point cloud can be segmented by depth if they don't touch each other physically. Using color, one can separate the objects even further. Note that we implicitly define an object as a dense, color-consistent 3D entity, while other definitions are also possible.
As soon as you have your objects segmented, you can measure the distances between your segments, analyse their shape, recognize artifacts or humans, etc. To the best of my knowledge, however, a skeleton library can recognize humans only after they have moved for a few seconds. Below is a simple depth map that was broken into a few segments using depth but not color information.

Transformation between point clouds

I hope to find some hints where to start with a problem I am dealing with.
I am using a Kinect sensor to capture 3d point clouds. I created a 3d object detector which is already working.
Here is my task:
Let's say I have point cloud 1. I detected an object in cloud 1 and I know the centroid position of my object (x1,y1,z1). Now I move my sensor along a path and create new clouds (e.g. cloud 2). In cloud 2 I see the same object, but e.g. from the side, where the object detection does not work well.
I would like to transform the detected object from cloud 1 to cloud 2, to get the centroid in cloud 2 as well. To me it sounds like I need a matrix (translation, rotation) to transform points from 1 to 2.
Any ideas how I could solve my problem?
Maybe ICP? Are there better solutions?
THX!
In general, this task is called registration. It relies on having a good estimate of which points in cloud 1 correspond to which points in cloud 2 (more specifically, given a point in cloud 1, which point in cloud 2 represents the same location on the detected object). There's a good overview in the PCL library documentation.
If you have such a correspondence, you're in luck and you can directly compute a rotation and translation as demonstrated here.
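For known correspondences, one standard way to compute the rotation and translation is the SVD-based (Kabsch) solution sketched below with NumPy. This is my own sketch of a common method and not necessarily what the linked demonstration used. The point sets and the centroid are made up:

```python
import numpy as np


def rigid_transform(src: np.ndarray, dst: np.ndarray):
    """Least-squares R, t such that R @ src[i] + t is close to dst[i].

    src and dst are (N, 3) arrays of corresponding points, N >= 3 and not collinear.
    """
    src = np.asarray(src, dtype=float)
    dst = np.asarray(dst, dtype=float)
    src_c = src.mean(axis=0)
    dst_c = dst.mean(axis=0)
    # Cross-covariance of the centered point sets.
    H = (src - src_c).T @ (dst - dst_c)
    U, _, Vt = np.linalg.svd(H)
    # Guard against a reflection (determinant -1) in the SVD solution.
    d = 1.0 if np.linalg.det(Vt.T @ U.T) >= 0 else -1.0
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    t = dst_c - R @ src_c
    return R, t


# Made-up ground truth: rotate 30 degrees about z, then translate.
angle = np.deg2rad(30.0)
R_true = np.array([[np.cos(angle), -np.sin(angle), 0.0],
                   [np.sin(angle), np.cos(angle), 0.0],
                   [0.0, 0.0, 1.0]])
t_true = np.array([0.5, -0.2, 1.0])

# Corresponding points in cloud 1 and cloud 2.
cloud1_pts = np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0],
                       [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]])
cloud2_pts = cloud1_pts @ R_true.T + t_true

R, t = rigid_transform(cloud1_pts, cloud2_pts)

# Carry the object centroid detected in cloud 1 over to cloud 2.
centroid1 = np.array([0.25, 0.25, 0.25])
centroid2 = R @ centroid1 + t
print(centroid2)
```

The last two lines are the part that answers the question directly: once `R` and `t` are known, the centroid from cloud 1 maps into cloud 2 with a single matrix-vector product.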
If not, you'll need to estimate that correspondence. ICP does that for approximately aligned point clouds. If your point clouds are not already fairly well aligned, you may want to start by estimating "key points" (such as book corners, distinct colors, etc.) in your point clouds, computing a rotation and translation as above, and then performing ICP. As D.J.Duff mentioned, ICP works better in practice on point clouds that are already approximately aligned, because it estimates correspondences using one of two metrics: minimal point-to-point distance or minimal point-to-plane distance. According to Wikipedia, the latter works better in practice, but it does involve estimating normals, which can be tricky. If the correspondences are far off, the transforms likely will be as well.
I think what you were asking about was specific to the Kinect sensor and the API Microsoft released for it.
If you are not planning to do reconstruction, you can look into the AlignPointClouds function in Sensor Fusion namespace. This should take care of it automatically, using methods similar to those in the answer given by #pnhgiol.
On the other hand, if you are looking at doing reconstruction as well as point cloud transforms, the Reconstruction class is what you are looking for. Both are documented here.

How can I detect floor movements such as push-ups and sit-ups with Kinect?

I have tried to implement this using the skeleton tracking provided by Kinect, but it doesn't work when I am lying down on the floor.
According to Blitz Games CTO Andrew Oliver, there are specific ways to implement this with the depth stream or by tracking the silhouette of a user instead of using skeleton frames from the Kinect API. You may find a good example in the video game Your Shape Fitness. Here is a link showing floor movements such as push-ups!
Do you guys have any idea how to implement this or detect movements and compare them with other movements using depth stream?
What if a 360 degree sensor was developed, one that recognises movements not only directly in front, to the left, or right of it, but also optimizes for movement above(?)/below it? The image that I just imagined was the spherical, 360 degree motion sensors often used in secure buildings and such.
Without another sensor, I think you'll need to track the depth data yourself. Here's a paper with some details about how MS implements skeletal tracking in the Kinect SDK that might get you started. They implement object matching while parsing the depth data to capture the joints of the body, so you may need to implement some object templates and matching algorithms yourself, unless you can reuse some of the skeletal tracking libraries to parse objects out of the depth data for you.