Defining a tracking area for the Kinect

Is it possible to specify a (rectangular) area for skeleton tracking with the Kinect (using any of the available SDKs)? I want to make sure that only users inside that designated area are tracked and that the sensor is not distracted by people outside it. Think of a game zone, in which a player interacts with the Kinect and where bystanders outside of the zone should be ignored lest they confuse the sensor.
The reason I want this is that the Kinect often "locks" onto someone or even something, whether it should or not, and then it's difficult for the sensor to track other individuals who come into tracking range. I want to avoid that by defining this zone.

It's not possible to specify a target area for the skeleton tracking with Microsoft's official SDKs, but there are some potential workarounds.
(Note that I'm not familiar with other SDKs for the Kinect, and note that I'm not sure if you are using the Kinect v1 or v2.)
If you are using the Kinect v1, note that it can track 6 players simultaneously (with a skeleton body position), but it can only provide full-body skeletal tracking for 2 players at a time. It's possible to specify which 2 players you want full skeletal tracking for in the official SDK, and you can do this based on which skeletons are in your target game zone.
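If you go this route with the official Kinect v1 SDK, the relevant pieces are SkeletonStream.AppChoosesSkeletons and SkeletonStream.ChooseSkeletons. Below is a minimal sketch in C#; the rectangular zone boundaries are made-up example values, and the selection policy (first two skeletons found inside the zone) is just one possible choice:

using System.Linq;
using Microsoft.Kinect;

class ZoneTracker
{
    // Example game-zone boundaries in sensor space (metres); adjust to your setup.
    const float ZoneMinX = -0.75f, ZoneMaxX = 0.75f;   // left/right
    const float ZoneMinZ = 1.0f, ZoneMaxZ = 2.5f;      // near/far

    readonly Skeleton[] skeletons = new Skeleton[6];

    public void Start(KinectSensor sensor)
    {
        sensor.SkeletonStream.Enable();
        sensor.SkeletonStream.AppChoosesSkeletons = true;   // we decide who gets full tracking
        sensor.SkeletonFrameReady += OnSkeletonFrameReady;
        sensor.Start();
    }

    void OnSkeletonFrameReady(object sender, SkeletonFrameReadyEventArgs e)
    {
        using (SkeletonFrame frame = e.OpenSkeletonFrame())
        {
            if (frame == null) return;
            frame.CopySkeletonDataTo(skeletons);
        }

        // Pick up to two tracking IDs whose body position lies inside the zone.
        int[] idsInZone = skeletons
            .Where(s => s.TrackingState != SkeletonTrackingState.NotTracked)
            .Where(s => s.Position.X >= ZoneMinX && s.Position.X <= ZoneMaxX &&
                        s.Position.Z >= ZoneMinZ && s.Position.Z <= ZoneMaxZ)
            .Select(s => s.TrackingId)
            .Take(2)
            .ToArray();

        var sensor = (KinectSensor)sender;
        if (idsInZone.Length == 2)
            sensor.SkeletonStream.ChooseSkeletons(idsInZone[0], idsInZone[1]);
        else if (idsInZone.Length == 1)
            sensor.SkeletonStream.ChooseSkeletons(idsInZone[0]);
    }
}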
If that isn't the issue, and instead the Kinect (v1 or v2) has already detected 6 players and cannot detect a 7th individual who is standing in your game zone, then you have a harder problem. With the official SDK, you have no control over which 6 players are selected for tracking. The sensor locks onto the first 6 players it finds, so if a 7th player walks in, there is no simple way to switch to that player.
However, there are some possible workarounds that involve resetting the sensor, clearing all skeletons so that the 6 tracked skeletons are re-selected (see the thread Skeleton tracking in crowds - Kinect v2):
Kinect body tracking is always scanning and finding candidate bodies to track. The body tracking only locks on when it detects the head and shoulders of the person facing the camera. You could do something like look for stable blob points in the target area and, if there isn't a tracked body, reset the Kinect Monitor service.
The SDK is resilient to this type of failure of the runtime, but it is a hard approach. Additionally, you could employ a way to cover the depth camera (your hand) to reset the tracking, since this will make all depth/IR data invalid and the tracking will need to rebuild.
-- Carmine Sirignano - MSFT
In the same thread, RobAcheson points out that restarting the sensor is another workaround:
I've been using the by-hand method successfully for a while and that definitely works - when I'm in the crowd :)
I have started calling KinectSensor.Close() and KinectSensor.Open() when there are >6 skeletons if none are in the target area. That seems to be working well too. Now I just need a crowd to test with.
-- RobAcheson
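For reference, here is a rough sketch of that reset heuristic with the official Kinect v2 SDK in C#. It is only an illustration of the quotes above, not a tested recipe: the zone boundaries, the cooldown, and the "reset when all body slots are taken but nobody is in the zone" rule are assumptions.

using System;
using System.Linq;
using Microsoft.Kinect;

class ZoneResetMonitor
{
    // Example game-zone boundaries in camera space (metres); adjust to your setup.
    const float ZoneMinX = -0.75f, ZoneMaxX = 0.75f;
    const float ZoneMinZ = 1.0f, ZoneMaxZ = 2.5f;

    readonly KinectSensor sensor = KinectSensor.GetDefault();
    Body[] bodies;
    DateTime lastReset = DateTime.MinValue;

    public void Start()
    {
        sensor.Open();
        bodies = new Body[sensor.BodyFrameSource.BodyCount];
        BodyFrameReader reader = sensor.BodyFrameSource.OpenReader();
        reader.FrameArrived += OnFrameArrived;
    }

    void OnFrameArrived(object sender, BodyFrameArrivedEventArgs e)
    {
        using (BodyFrame frame = e.FrameReference.AcquireFrame())
        {
            if (frame == null) return;
            frame.GetAndRefreshBodyData(bodies);
        }

        var tracked = bodies.Where(b => b != null && b.IsTracked).ToList();
        bool allSlotsUsed = tracked.Count == bodies.Length;
        bool anyoneInZone = tracked.Any(b =>
        {
            CameraSpacePoint p = b.Joints[JointType.SpineBase].Position;
            return p.X >= ZoneMinX && p.X <= ZoneMaxX &&
                   p.Z >= ZoneMinZ && p.Z <= ZoneMaxZ;
        });

        // Reset at most every few seconds so the tracker has time to re-acquire.
        if (allSlotsUsed && !anyoneInZone &&
            DateTime.UtcNow - lastReset > TimeSpan.FromSeconds(5))
        {
            sensor.Close();
            sensor.Open();
            lastReset = DateTime.UtcNow;
        }
    }
}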

Related

Relocalize a smartphone on a preloaded point cloud

Being a novice, I need advice on how to solve the following problem.
Say, with photogrammetry, I have obtained a point cloud of part of my room. Then I upload this point cloud to an Android phone and I want it to track its camera pose relative to this point cloud in real time.
As far as I know there can be problems with different cameras' intrinsics (a simple camera or another phone's camera vs. my phone's camera) that can affect the precision of localisation, right?
Actually, it's supposed to be an AR app, so I've tried existing SDKs - Vuforia, Wikitude, Placenote (I haven't tried ARCore yet because my device most likely won't support it). The problem is they all use their own clouds for their services, and I don't want to depend on them. Ideally, it's my own PC where I perform the 3D reconstruction and from which my phone downloads the point cloud.
I need SLAM (with IMU fusion) or VIO on my phone, don't I? Are there any ready-to-go implementations in libraries like ARToolKit or, maybe, PCL? Will an existing SLAM system pick up a map reconstructed with other algorithms, or should I use one and the same SLAM system for both mapping and localization?
So, the main question is how to do everything ARCore and Vuforia do without using third-party servers. (I suspect the answer is to devise the same underlying layer which Vuforia and the other SDKs use to employ all the available hardware.)

Postprocess Depth Image to get Skeleton using the Kinect SDK / other tools?

The short question: I am wondering whether the Kinect SDK / NiTE can be exploited to get "depth image in, skeleton out" software.
The long question: I am trying to dump the depth, RGB, and skeleton data streams captured from a Kinect v2 into rosbags. However, to the best of my knowledge, capturing the skeleton stream on Linux with ROS and the Kinect v2 isn't possible yet. Therefore, I was wondering if I could dump rosbags containing the RGB and depth streams, and then post-process these to get the skeleton stream.
I can capture all three streams on Windows using the Microsoft Kinect v2 SDK, but dumping them to rosbags with all the metadata (camera_info, sync info, etc.) would be painful (correct me if I am wrong).
It's been quite some time since I worked with NiTE (and I only used the Kinect v1), so maybe someone else can give a more up-to-date answer, but from what I remember, this should easily be possible.
As long as all relevant data is published via ROS topics, it is quite easy to record them with rosbag and play them back afterwards. Every node that can handle live data from the sensor will also be able to do the same work on recorded data coming from a bag file.
One issue you may encounter is that if you record Kinect data, the bag files quickly become very large (several gigabytes). This can be problematic if you want to edit the file afterwards on a machine with very little RAM. If you only want to play the file back, or if you have enough RAM, this should not really be a problem, though.
Indeed it is possible to perform NiTE2 skeleton tracking on any depth image stream.
Refer to:
https://github.com/VIML/VirtualDeviceForOpenNI2/wiki/How-to-use
and
https://github.com/VIML/VirtualDeviceForOpenNI2/wiki/About-PrimeSense-NiTE
With this extension you can add a virtual device which allows you to manipulate each pixel of the depth stream. This device can then be used to create a UserTracker object. As long as the right device name is provided, skeleton tracking can be done:
\OpenNI2\VirtualDevice\Kinect
but consider the usage limits: NiTE is only allowed to be used with "Authorized Hardware".

Will Kinect v2 support multiple sensors?

Working with multiple Kinect v1 sensors is very difficult because of the IR interference between the sensors.
Based on what I read in this Gamasutra article, Microsoft got rid of the interference problem with the time-of-flight mechanism that the Kinect v2 sensor uses to gauge depth.
Does that mean I could use multiple Kinect v2 sensors at the same time, or did I misunderstand the article?
Thanks for the help!
I asked this question, in person, of the dev team at the meetup in San Francisco in April. The answer I got was:
"This feature is 3+ months away. We want to prioritize single-Kinect features before working on multiple Kinects."
I'm a researcher, and my goal is to have a bunch of odd setups, so this is a frustrating answer, but I understand that they need to prioritize usage that will be immediately useful to a larger market.
Could you connect them to multiple computers and stream data back and forth?
As @escapecharacter mentioned, support for multiple Kinect v2 sensors is not likely in the very near future.
I can also confirm that one of the Kinect v2 SDK samples has this comment:
// for Alpha, one sensor is supported
this.kinectSensor = KinectSensor.Default;
I think the hardware itself is capable of avoiding the interference problem. Hopefully the slightly larger amount of data (the higher-resolution RGB stream) won't be a problem with multiple sensors (and the available USB bandwidth), and it will simply be a matter of enabling the SDK to safely handle multiple sensor instances in the future.
I wouldn't expect a quick SDK update enabling this, though, so in the meantime, although not ideal, you could try either:
Using multiple v2 sensors on multiple machines communicating over a local network, passing only processed/minimal data to keep the delay as small as possible (see the sketch below)
Using multiple v1 sensors with Shake'n'Sense (PDF link to paper) to reduce interference
At least you would, to a certain extent, make some progress testing some of your project's assumptions with multiple sensors, and you could update the project once the updated SDK is out.
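To make the first option concrete, here is a small sketch (C#) of the kind of minimal data you might pass around: each machine sends only a few joint coordinates per tracked body as tiny UDP packets to an aggregating machine. The IP address, port, and packet layout are arbitrary example choices:

using System;
using System.Net;
using System.Net.Sockets;
using System.Text;

class JointSender
{
    static readonly UdpClient udp = new UdpClient();
    // Example address of the machine that fuses the data from all sensors.
    static readonly IPEndPoint aggregator = new IPEndPoint(IPAddress.Parse("192.168.0.10"), 9000);

    // Call this from your body-frame handler with whatever minimal data you need.
    public static void SendJoint(int sensorId, int bodyIndex, float x, float y, float z)
    {
        // Tiny text packet "sensor;body;x;y;z" - small enough to keep latency low.
        string payload = string.Format("{0};{1};{2:F3};{3:F3};{4:F3}", sensorId, bodyIndex, x, y, z);
        byte[] bytes = Encoding.UTF8.GetBytes(payload);
        udp.Send(bytes, bytes.Length, aggregator);
    }
}

class JointReceiver
{
    public static void Listen()
    {
        var udp = new UdpClient(9000);
        var remote = new IPEndPoint(IPAddress.Any, 0);
        while (true)
        {
            byte[] data = udp.Receive(ref remote);
            string payload = Encoding.UTF8.GetString(data);
            // Parse the packet and fuse the data from the different sensors here.
            Console.WriteLine(payload);
        }
    }
}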
I realize I misread your question and interpreted it as "how can I connect two Kinect v2 sensors to one computer?", when you were actually asking how to avoid interference, with Kinect v2 as your hoped-for solution.
You can hack around Kinect v1 interference by lightly shaking one of the sensors independently of the other. See here:
http://channel9.msdn.com/coding4fun/kinect/Shaking-some-sense-into-using-multiple-Kinects-with-Shake-n-Sense
One of the craziest things I've ever seen that actually worked. I was at Microsoft Research when they figured this out, and it works quite well.
You can have a Kinect v1 viewing the same scene as a Kinect v2 without interference. I know this isn't exactly what you're looking for, but it could be useful.
Two years later, and this still cannot be done.
See:
https://social.msdn.microsoft.com/Forums/en-US/8e2233b6-3c4f-485b-a683-6bacd6a74d53/how-to-prevent-interference-between-multiple-kinect-v2-sensors?forum=kinectv2sdk
https://github.com/OpenKinect/libfreenect2/issues/424
As stated in the second link,
What happens is this: Each Kinect v2 continuously switches between different modulation frequencies. When two Kinects switch to the same frequency range, the interference occurs. They typically gradually drift into the same range and after a while drift out of that range again. So, theoretically, you just have to wait a bit until the interference is gone. The only way I found to stop the interference immediately was to disconnect (and reconnect) the concerned Kinect from its power supply
...
Quite unfortunate that these modulation frequencies aren't controllable at this time. Let's hope MS surprises us with that custom firmware
IIRC, I came across a group at MIT that got custom firmware from MS which solved the problem, but I can't seem to find the reference. Unfortunately, it is not available to the public.
I don't think we can use multiple Kinect v2 sensors in the same environment, because they will interfere with each other much more than Kinect v1 sensors do. Since Kinect v2 depth sensing is based on the time-of-flight principle, multiple Kinect v2 sensors will interfere with each other considerably. For Kinect v1, the interference is not as severe.

How to modify DirectX camera

Suppose I have a 3D (but not stereoscopic) DirectX game or program. Is there a way for a second program (or a driver) to change the camera position in the game?
I'm trying to build a head-tracking plugin or driver that I can use for my DirectX games/programs. An inertial motion sensor will give me the position of my head but my problem is using that position data to change the camera position, not with the hardware/math concerns of head tracking.
I haven't been able to find anything on how to do this so far, but iZ3D was able to create two cameras near the original camera and use them for stereoscopic rendering, so I know there exists some hook/link/connection into DirectX that makes camera manipulation by a second program possible.
If I am able to get this to work I'll release the code.
-Shane
Hooking Direct3D calls is by its nature just hooking DLL calls; i.e. it's not something special to D3D but a generic technique. Try googling for "hook dll" or start from here: [C++] Direct3D hooking sample. As always happens with hooks, there are many caveats, and you'll have to write a pretty large amount of boilerplate to satisfy all the needs of the hooked application.
That said, manipulating the camera in games usually does not give good results. There are at least two key features of modern PC games which will severely limit your idea:
1. Pre-clipping. Almost any game engine filters out objects that are behind the viewing plane. So when you rotate the camera to the side, you won't see the objects you'd expect to see in the real world - they were simply never sent to D3D, since the game doesn't know that the viewing plane has changed.
2. Multi-pass rendering. Many popular post-processing effects are done in extra passes (either over the whole scene or just part of it). Mirrors and "screens" are the best-known such effects. Without knowing which camera you're manipulating, you'll most likely just break the scene.
By the way, #2 is the reason why stereoscopic mode is not 100% compatible with all games. For example, in the Source engine, HDR scenes are rendered in three passes, and if you don't know how to distinguish them you'll do nothing but break the game. Take a look at how NVIDIA implements their stereoscopic mode: they write a separate hook for every popular game, and even with this approach it's not always possible to get the expected result.

How to detect heart pulse rate without using any instrument in the iOS SDK?

I am currently working on an application where I need to find out the user's heart rate. I have found plenty of applications that do this, but I have not been able to find a single private or public API that supports it.
Is there any framework available that could help with this? I was also wondering whether the UIAccelerometer class could be helpful, and what level of accuracy could be achieved with it?
How could the same feature be implemented: by putting a finger on the iPhone camera, by placing the microphone against the jaw or wrist, or in some other way?
Is there any way to detect changes in blood circulation and find the heart rate using those approaches, or UIAccelerometer? Any API or sample code? Thank you.
There is no API for detecting heart rate; these apps do it in a variety of ways.
Some use the accelerometer to measure how the device shakes with each pulse. Others use the camera lens, with the flash on, and detect blood moving through the finger from the changes in the light level that comes back.
Various DSP techniques can potentially be used to discern very low-level periodic signals from a long enough set of samples taken at an appropriate sample rate (accelerometer readings or reflected light colour).
Some of the advanced math functions in the Accelerate framework API can be used as building blocks for these various DSP techniques. An explanation would require several chapters of a Digital Signal Processing textbook, so that might be a good place to start.
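To give a feel for what such a DSP pipeline can look like, here is a language-agnostic sketch (written in C#, deliberately not using the Accelerate API): it assumes you already have a buffer of brightness samples (e.g. the mean red-channel intensity of each flash-lit camera frame) at a known sample rate, and it estimates the pulse by finding the dominant period with a simple autocorrelation search.

using System;
using System.Linq;

static class PulseEstimator
{
    // samples: e.g. mean red-channel intensity of each camera frame
    // sampleRate: samples (frames) per second
    public static double EstimateBpm(double[] samples, double sampleRate)
    {
        // Remove the DC component so slow lighting drift doesn't dominate.
        double mean = samples.Average();
        double[] x = samples.Select(s => s - mean).ToArray();

        // Only consider physiologically plausible rates (40-200 BPM).
        int minLag = (int)(sampleRate * 60.0 / 200.0);
        int maxLag = (int)(sampleRate * 60.0 / 40.0);

        int bestLag = Math.Max(1, minLag);
        double bestCorr = double.MinValue;
        for (int lag = Math.Max(1, minLag); lag <= maxLag && lag < x.Length; lag++)
        {
            double corr = 0;
            for (int i = 0; i + lag < x.Length; i++)
                corr += x[i] * x[i + lag];          // autocorrelation at this lag
            if (corr > bestCorr) { bestCorr = corr; bestLag = lag; }
        }

        return 60.0 * sampleRate / bestLag;         // period in samples -> BPM
    }
}

In a real app you would band-pass filter the samples first and require several consistent estimates before displaying a number, since finger pressure and ambient light changes produce a lot of noise.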