Separate noise from skeleton with Kinect - kinect

I'm looking to do a proof of concept and am new to Kinect. I believe this is possible, but I'm trying to gauge the difficulty and would appreciate links to tutorials etc. explaining how it might work.
I want the Kinect to look at a walkway and essentially detect the movement of people. This does not mean skeletal movement, but rather "foot traffic". I want to determine the noise level of the traffic, i.e. are there a lot of people walking past, or only a few? (Note that this does not mean counting, just a rough indication. Could this be done with just pixel movement etc.?)
Secondly, if a person then stops and faces the Kinect, pick them up as a user, and track rudimentary movements.
The second part I'm relatively comfortable with, the first I'm not.
Any help in pointing me in the right direction is appreciated. We are a Microsoft house, so any indication of whether the Microsoft SDK or OpenKinect is the better path would be great too.
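To make the "just pixel movement" idea concrete, here is a minimal sketch of frame differencing, assuming each frame has already been read from the sensor into a plain list of rows of integers (depth in millimetres or grey levels). The function names, the per-pixel threshold, and the "busy"/"quiet" cut-offs are all hypothetical values for illustration and would need tuning against a real walkway.

```python
from typing import List

Frame = List[List[int]]  # rows of per-pixel values (depth in mm or grey level)


def motion_fraction(prev: Frame, curr: Frame, threshold: int = 50) -> float:
    """Return the fraction of pixels that changed by more than `threshold`."""
    changed = 0
    total = 0
    for prev_row, curr_row in zip(prev, curr):
        for p, c in zip(prev_row, curr_row):
            total += 1
            if abs(c - p) > threshold:
                changed += 1
    return changed / total if total else 0.0


def traffic_level(fraction: float,
                  busy_at: float = 0.15,
                  quiet_below: float = 0.02) -> str:
    """Map a motion fraction to a rough label (cut-offs are assumptions)."""
    if fraction >= busy_at:
        return "busy"
    if fraction < quiet_below:
        return "quiet"
    return "moderate"
```

Averaging `motion_fraction` over a few seconds of frames before labelling it should give a steadier indication than a single frame pair, since one person walking close to the sensor can change many pixels at once.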

Related

Kinect fast hand tracking method

How can I track fast hand movement using the Kinect?
I've tried both OpenNI and the Microsoft SDK to track hands. With both of them, there is a lot of jitter and the joint positions are inaccurate.
Here is an example video of Kinect Fruit Ninja: Example Video
In that video there is no jitter or inaccuracy, and it also tracks fast hand movements.
What am I missing? Are there particular Kinect hardware versions or types that I should look into?
My best guess is that Fruit Ninja applies some sort of smoothing at some point. What you're seeing in that video is almost certainly not the raw data they're getting from the Kinect. The data from the Kinect will always have some kind of jitter; real-world sensor data almost always does. You'll need to smooth it - exactly how to do that depends on the application; it could be something simple like modelling a kind of damping and/or inertia on the point that's being moved by the hand (which is what I suspect Fruit Ninja is doing), or you could look at something like a Kalman filter for a robust (but more computationally-intensive) way to reduce the noise in your sensor readings.
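As a rough illustration of the simple damping approach (not what Fruit Ninja actually does, which I don't know), here is a minimal sketch of exponential smoothing applied to a 3D joint position. The class name and the `alpha` parameter are hypothetical; it assumes you already have each raw hand position as an `(x, y, z)` tuple.

```python
from typing import Optional, Tuple

Point = Tuple[float, ...]  # e.g. (x, y, z) of a hand joint


class ExponentialSmoother:
    """Damp jitter by blending each new sample with the previous estimate."""

    def __init__(self, alpha: float = 0.5) -> None:
        # alpha near 1.0 follows the raw data closely (less lag, more jitter);
        # alpha near 0.0 is smoother but lags behind fast movements.
        if not 0.0 < alpha <= 1.0:
            raise ValueError("alpha must be in (0, 1]")
        self.alpha = alpha
        self.state: Optional[Point] = None

    def update(self, sample: Point) -> Point:
        """Feed one raw sample and return the smoothed position."""
        if self.state is None:
            # First sample: nothing to blend with yet.
            self.state = tuple(sample)
        else:
            a = self.alpha
            self.state = tuple(
                a * s + (1.0 - a) * prev
                for s, prev in zip(sample, self.state)
            )
        return self.state
```

The trade-off is lag: a small `alpha` removes more jitter but makes fast swipes trail behind the hand, so for fast movements you would probably start with a fairly high `alpha` and lower it until the jitter becomes acceptable.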

Shape (preferably human) recognition API for use with standard webcam

I am interested in getting into user interaction/shape detection with a simple USB webcam. I can use multiple webcams, but don't want to be restricted to using something like the Kinect sensor. My detection cameras need to be set up on either side of a helmet (or, if there is only one, on top). I have found some APIs, but they don't really have the functionality I need, and most are geared towards facial recognition. I need to be able to detect a basic human skeletal structure and determine if something is obstructing it. I would much rather do it without using any sort of marker system on the target person, and I would like it to be able to target multiple structures. Obviously I am willing to do some tweaking if necessary, but I want to see how close I can get to what I need before I reinvent the wheel. I am trying to design an AI system that can determine how many people are in an area and where they are.
Doubt there will be anything like this since Microsoft spent a ton of money on the R&D for Kinect and it's probably all locked behind an NDA. I'm also guessing there's a lot of hardware within the Kinect that is not available in a standard webcam.
The closest thing I could find to what you're looking for is the OpenKinect project, which might be a good place to start your research.

Kinect - Tracking people in a crowd - Sports Motion Tracking

I'm interested in programming the Kinect to track people over a largish area.
In particular, I'm looking to track players on a small sports field using gestures to record events in a sports game.
So far I have not found any examples of this being done before, other than Processing examples of tracking players on recorded video.
Could anybody please provide any examples of Microsoft's Kinect technology being applied to sport?
This is not what the Kinect is designed to do, and is not something that it will do for you.
The Kinect is capable of tracking no more than 6 people at a time, and only 2 of them actively. It works best for people 6-8 feet away and will not track anyone much further away than that.
For what you are proposing the Kinect would not benefit you. It would probably hinder you, since it is designed for 1-2 persons at a limited distance. You would be better off with a higher quality camera.
I have never used or tried the Kinect (but I really want to get one :)), so I don't know if it suits what you need. But there is this amazing tutorial from Amnon Owed on Kinect and Processing. The example is with one person, but it might be of interest to you. The video is very cool.
It is possible. You will need to depart from the skeleton tracking available to you in the Kinect SDK and process the depth data yourself. You'll need multiple Kinects to handle a larger area. Mounting them on the ceiling, pointing straight down, will let you distinguish individuals more easily; from a side view, people standing close together would be grouped into a single multi-person blob.
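A minimal sketch of what "process the depth data yourself" could look like for a top-down view, assuming the depth frame is a list of rows of distances in millimetres from a ceiling-mounted sensor, that a reading of 0 means "no data", and that the distance to the floor is known. The function name and all thresholds are hypothetical. It marks pixels that stand well above the floor and counts 4-connected regions with a breadth-first flood fill.

```python
from collections import deque
from typing import List


def count_blobs(depth: List[List[int]],
                floor_mm: int,
                min_height_mm: int = 1000,
                min_pixels: int = 2) -> int:
    """Count connected regions at least `min_height_mm` above the floor."""
    rows = len(depth)
    cols = len(depth[0]) if rows else 0

    def is_foreground(r: int, c: int) -> bool:
        d = depth[r][c]
        # d == 0 is treated as an invalid reading (assumption).
        return d > 0 and floor_mm - d >= min_height_mm

    seen = [[False] * cols for _ in range(rows)]
    blobs = 0
    for r in range(rows):
        for c in range(cols):
            if seen[r][c] or not is_foreground(r, c):
                continue
            # Flood-fill this region and measure its size in pixels.
            size = 0
            queue = deque([(r, c)])
            seen[r][c] = True
            while queue:
                y, x = queue.popleft()
                size += 1
                for ny, nx in ((y + 1, x), (y - 1, x), (y, x + 1), (y, x - 1)):
                    if (0 <= ny < rows and 0 <= nx < cols
                            and not seen[ny][nx] and is_foreground(ny, nx)):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            # Ignore tiny regions, which are more likely noise than people.
            if size >= min_pixels:
                blobs += 1
    return blobs
```

Two people touching shoulders would still merge into one region here; splitting them would need something extra, such as looking for separate local height maxima (heads) inside each region.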

Is the depth image returned by Microsoft Kinect SDK already undistorted?

Supposedly the Microsoft SDK has access to the Kinect's intrinsic parameters but does anyone have any idea if the depth image it returns is actually undistorted? I couldn't find anything relevant.
Let me know if I'm off topic, although I consider this an implicit programming question :)
edit: some other useful links I found that support @Coeffect's answer
http://ros.org/wiki/kinect_calibration/technical
Accuracy Analysis of Kinect Depth Data, by K.Khoshelham
So the IR pattern that the Kinect projects isn't your normal grid. For an example, check out this blog post. The Kinect handles making a normal depth map out of this. Thinking about focal lengths and such for this system is just going to dig you into a hole. Your thoughts about precision are probably misplaced. The Kinect isn't accurate enough to BE picky about such things. Having used the Kinect for motion detection, I can say there's a lot of noise. If you have a certain situation in mind, you might want to post about it.
edit: Here's a post showing that the depth isn't linear, and more of the precision is focused on closer objects. So the farther away you are, the less precise the data is, and the more severe the noise will become (because having the returned depth change by 1 nearby is pretty much nothing, but farther away that accounts for a larger distance change).
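To put rough numbers on that, here is a small sketch based on the standard triangulation relation z = f * b / d (depth from focal length, baseline and disparity), from which one disparity step corresponds to a depth step of about z^2 / (f * b) times the disparity step. The default values below (focal length of about 580 pixels, baseline of about 7.5 cm, disparity quantised to 1/8 pixel) are approximate figures commonly quoted for the original Kinect and should be treated as assumptions, not calibration data; the function name is hypothetical.

```python
def depth_step_m(z_m: float,
                 focal_px: float = 580.0,
                 baseline_m: float = 0.075,
                 disparity_step_px: float = 0.125) -> float:
    """Approximate depth change caused by one disparity step at distance z_m.

    Derived from z = f * b / d, so |dz| ~= z**2 / (f * b) * |dd|.
    All default parameters are assumed, approximate values.
    """
    return (z_m ** 2) / (focal_px * baseline_m) * disparity_step_px
```

With these assumed values a single step is on the order of 1 cm at 2 m and about 7 cm at 5 m, and doubling the distance quadruples the step, which is why the noise gets so much worse far from the sensor.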

Kinect joint detection from top

I'm wondering whether the Kinect detects joints correctly when it's mounted overhead (on the ceiling).
I don't have the necessary equipment to attach it to the ceiling and test, but I was wondering whether it reliably detects humans. I'm OK even if it confuses the joints, actually.
Has anybody tested this?
From what I've seen while using it, the skeleton detection is iffy from any angle other than directly pointing at a person's front or back. A Kinect pointed straight down with people walking under it would almost certainly not detect anyone, because the human form from above does not look anything like it does from the front. I have had the Kinect pick up random people around me in odd positions (sitting, viewed from the side, etc.), but the joints were largely erratic. If you have it mounted on the ceiling and pointed downwards at an angle that still lets it see people from the front instead of from above, it could do a fairly good job of picking them up.
So when you say on the ceiling do you mean pointing straight down or still looking at a fairly horizontal angle?
I did a little bit of testing with the Kinect mounted in a very high position (2.5 m, 70° to the ground). As answered by Coeffect it just doesn't work. It doesn't work with Microsoft SDK nor with OpenNI. What I can add is that the skeleton recognition only works if the user is facing the camera with her/his whole body-front. Even worse, both frameworks seem to expect the head at the top of the depth-frame.