I have a bunch of JPEG and depth (raw) files saved on disk using the Kinect SDK.
Is there a way to create skeleton data (joint points) from these files with OpenNI?
If so, how could it be done?
Thanks!!
OpenNI does not handle the skeleton tracking itself. Rather, it is done through the NITE middleware layer that plugs into OpenNI. NITE, and the algorithms that handle the skeleton generation, are closed source and not available for dissection.
I am not aware of an API call to push a raw image into the skeleton process to pull out the skeleton data. I'd bet that movement within the stream actually plays a part in the algorithm, making single-image processing very imprecise.
The short question: I am wondering if the Kinect SDK / NiTE can be exploited to build depth-image-in, skeleton-out software.
The long question: I am trying to dump the depth, RGB, and skeleton data streams captured from a Kinect v2 into rosbags. However, to the best of my knowledge, capturing the skeleton stream on Linux with ROS and a Kinect v2 isn't possible yet. Therefore, I was wondering if I could dump rosbags containing the RGB and depth streams, and then post-process these to get the skeleton stream.
I can capture all three streams on Windows using the Microsoft Kinect v2 SDK, but then dumping them to rosbags with all the metadata (camera_info, sync info, etc.) would be painful (correct me if I am wrong).
It's been quite some time since I worked with NiTE (and I only used the Kinect v1), so maybe someone else can give a more up-to-date answer, but from what I remember, this should easily be possible.
As long as all relevant data is published via ROS topics, it is quite easy to record them with rosbag and play them back afterwards. Every node that can handle live data from the sensor will also be able to do the same work on recorded data coming from a bag file.
One issue you may encounter is that if you record Kinect data, the bag files quickly become very large (several gigabytes). This can be problematic if you want to edit the file afterwards on a machine with very little RAM. If you only want to play the file back, or if you have enough RAM, this should not really be a problem, though.
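If you end up writing the bag yourself from the data you captured with the Windows SDK (instead of recording live topics with rosbag record), the rosbag C++ API keeps this fairly compact. A minimal sketch, where the topic names and the way the messages get filled from the SDK frames are my own assumptions:

    #include <rosbag/bag.h>
    #include <sensor_msgs/Image.h>
    #include <sensor_msgs/CameraInfo.h>

    int main()
    {
        rosbag::Bag bag;
        bag.open("kinect_session.bag", rosbag::bagmode::Write);

        // These messages would be filled from the frames grabbed with the Kinect SDK.
        sensor_msgs::Image rgb, depth;
        sensor_msgs::CameraInfo info;

        // Use the capture timestamp of each frame so playback keeps the original timing.
        ros::Time stamp(1500000000, 0);  // placeholder timestamp

        // Topic names are assumptions -- use whatever your processing nodes expect.
        bag.write("/kinect2/rgb/image_raw", stamp, rgb);
        bag.write("/kinect2/depth/image_raw", stamp, depth);
        bag.write("/kinect2/rgb/camera_info", stamp, info);

        bag.close();
        return 0;
    }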
Indeed, it is possible to perform NiTE2 skeleton tracking on any depth image stream.
Refer to:
https://github.com/VIML/VirtualDeviceForOpenNI2/wiki/How-to-use
and
https://github.com/VIML/VirtualDeviceForOpenNI2/wiki/About-PrimeSense-NiTE
With this extension you can add a virtual device that lets you manipulate each pixel of the depth stream. This device can then be used to create a UserTracker object. As long as the right device name is provided, skeleton tracking can be done:
\OpenNI2\VirtualDevice\Kinect
But consider the usage limits:
NiTE is only allowed to be used with "Authorized Hardware".
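Once the virtual device is in place, the tracking loop itself is plain NiTE2. A minimal sketch, assuming the virtual device driver from the links above is installed and that you copy your recorded depth frames into its depth stream where the comment indicates:

    #include <cstdio>
    #include <OpenNI.h>
    #include <NiTE.h>

    int main()
    {
        openni::OpenNI::initialize();
        nite::NiTE::initialize();

        // Open the virtual device instead of a physical sensor
        // (URI as documented in the VirtualDeviceForOpenNI2 wiki).
        openni::Device device;
        if (device.open("\\OpenNI2\\VirtualDevice\\Kinect") != openni::STATUS_OK)
            return 1;

        // ... create the virtual depth stream here and copy your recorded
        //     depth frames into it, as described in the "How to use" page ...

        nite::UserTracker tracker;
        if (tracker.create(&device) != nite::STATUS_OK)
            return 1;

        nite::UserTrackerFrameRef frame;
        while (tracker.readFrame(&frame) == nite::STATUS_OK)
        {
            const nite::Array<nite::UserData>& users = frame.getUsers();
            for (int i = 0; i < users.getSize(); ++i)
            {
                if (users[i].isNew())
                    tracker.startSkeletonTracking(users[i].getId());
                else if (users[i].getSkeleton().getState() == nite::SKELETON_TRACKED)
                {
                    const nite::SkeletonJoint& head =
                        users[i].getSkeleton().getJoint(nite::JOINT_HEAD);
                    std::printf("user %d head: %.1f %.1f %.1f\n", users[i].getId(),
                                head.getPosition().x, head.getPosition().y,
                                head.getPosition().z);
                }
            }
        }

        nite::NiTE::shutdown();
        openni::OpenNI::shutdown();
        return 0;
    }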
I'm working with a new Kinect v2 sensor and using Kinect Studio to record the Kinect stream data during some experiments. The problem is that our experiments are expected to last ~10 minutes, which, including the uncompressed video, would be equivalent to ~80 GB. In addition, the buffer fills up quite fast; around 2 minutes in, the remainder of the data ends up stuttering at around 2 fps instead of the smooth 25 fps.
Is there any way I can record all the data I need in compressed form? Would it be easy to create an app similar to Kinect Studio that just writes out a video file and an .xef file containing all the other sensor data?
Kinect Studio does have APIs that can be used to programmatically record particular data streams into an XEF file. Additionally, it's possible to have multiple applications using the sensor simultaneously, so in theory you should be able to have three applications collecting data from the sensor (you could combine these into one application as well):
Your application;
An application using the Kinect Studio APIs, or Kinect Studio itself, to record the non-RGB streams;
Another application that collects the RGB data stream and performs compression and then saves the data.
However, the latency and buffer issue is likely to be a problem here. Kinect Studio data collection is extremely resource-intensive and it may not be possible to do real-time video compression while maintaining 25fps. Depending on the network infrastructure available you might be able to offload the RGB data to another machine to compress and store, but this would need to be well tested. This is likely to be a lot of work.
I'd suggest that you first see whether switching to another high-spec machine, with a fast SSD drive and good CPU and GPU, makes the buffering issue go away. If this is the case you could then record using Kinect Studio and then post-process the XEF files after the sessions to compress the videos (using the Kinect Studio APIs to open the XEF files).
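To give an idea of the second item in the list above, driving the Kinect Studio APIs (Microsoft.Kinect.Tools) from code looks roughly like the KSUtil sample that ships with the SDK. A sketch, with the file path, stream selection and duration as placeholders:

    using System;
    using System.Threading;
    using Microsoft.Kinect.Tools;

    class RecordNonRgbStreams
    {
        static void Main()
        {
            using (KStudioClient client = KStudio.CreateClient())
            {
                client.ConnectToService();

                // Record only the non-RGB streams here; a separate app would grab,
                // compress and store the RGB stream.
                var streams = new KStudioEventStreamSelectorCollection();
                streams.Add(KStudioEventStreamDataTypeIds.Depth);
                streams.Add(KStudioEventStreamDataTypeIds.Ir);
                streams.Add(KStudioEventStreamDataTypeIds.Body);
                streams.Add(KStudioEventStreamDataTypeIds.BodyIndex);

                using (KStudioRecording recording =
                       client.CreateRecording(@"C:\data\session.xef", streams))  // placeholder path
                {
                    recording.StartTimed(TimeSpan.FromMinutes(10));
                    while (recording.State == KStudioRecordingState.Recording)
                    {
                        Thread.Sleep(500);
                    }
                }

                client.DisconnectFromService();
            }
        }
    }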
I've been looking at the NAudio demo application "Audio file playback". What I'm missing from this demo is a way to get hold of the samples while the audio file is being played.
I figured that it would somehow be possible to fill a BufferedWaveProvider with samples using a callback whenever new samples are needed, but I can't figure out how.
My other (non-preferred) idea is to make a special version of e.g. DirectSoundOut where I can get hold of the samples before they are written to the sound card.
Any ideas?
With audio file playback in NAudio you construct an audio pipeline, starting with your audio file and going through various transformations (e.g. changing volume) along the way before ending up at your output device. The NAudioDemo does in fact show how the samples can be accessed along the way by drawing a waveform (pre-volume adjustment) and by showing a volume meter (post-volume adjustment).
You could, for example, create your own implementation of IWaveProvider or ISampleProvider and insert it into the pipeline. Then, in the Read method, you read from your source, and you can process, examine, or write the samples to disk before passing them on to the next stage in the pipeline. Look at AudioPlaybackPanel.CreateInputStream to see how this is done in the demo.
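A minimal pass-through ISampleProvider that taps the samples as they flow by could look like this (the class name is made up; only the ISampleProvider interface itself comes from NAudio):

    using NAudio.Wave;

    // A pass-through stage: reads from its source and exposes the samples
    // before handing them on to the next stage in the pipeline.
    public class TappedSampleProvider : ISampleProvider
    {
        private readonly ISampleProvider source;

        public TappedSampleProvider(ISampleProvider source)
        {
            this.source = source;
        }

        public WaveFormat WaveFormat => source.WaveFormat;

        public int Read(float[] buffer, int offset, int count)
        {
            int samplesRead = source.Read(buffer, offset, count);
            // buffer[offset] .. buffer[offset + samplesRead - 1] now holds the float
            // samples: examine them, feed a meter, or write them to disk here.
            return samplesRead;
        }
    }

You would then wrap your reader before calling Init, for example waveOut.Init(new SampleToWaveProvider(new TappedSampleProvider(new AudioFileReader("test.mp3")))).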
Do you know of any application that will display all the headers/parameters of a single H.264 frame? I don't need to decode it; I just want to see how it is built up.
Three ways come to my mind (if you are looking for something free, otherwise google "h264 analysis" for paid options):
Download the h.264 parser from this thread at the doom9 forums
Download the h.264 reference software
libh264bitstream provides h.264 bitstream reading/writing
This should get you started. By the way, the H.264 bitstream is described in Annex B of the ITU spec.
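If you go the libh264bitstream route, a small header dumper looks roughly like this (a sketch following the library's README-style usage; the file name is a placeholder, error handling is mostly omitted, and the exact field and function names should be checked against the current h264_stream.h):

    #include <stdio.h>
    #include <stdlib.h>
    #include <stdint.h>
    #include "h264_stream.h"   /* from h264bitstream */

    int main(void)
    {
        /* Read the raw Annex B byte stream into memory (placeholder file name). */
        FILE* f = fopen("input.264", "rb");
        if (!f) return 1;
        fseek(f, 0, SEEK_END);
        int size = (int)ftell(f);
        fseek(f, 0, SEEK_SET);
        uint8_t* buf = (uint8_t*)malloc(size);
        fread(buf, 1, size, f);
        fclose(f);

        h264_stream_t* h = h264_new();
        uint8_t* p = buf;
        int remaining = size;
        int nal_start, nal_end;

        /* Walk the stream NAL unit by NAL unit and print a few parsed header fields. */
        while (find_nal_unit(p, remaining, &nal_start, &nal_end) > 0)
        {
            p += nal_start;
            read_nal_unit(h, p, nal_end - nal_start);

            printf("NAL: ref_idc=%d type=%d\n", h->nal->nal_ref_idc, h->nal->nal_unit_type);
            if (h->nal->nal_unit_type == 7)   /* SPS */
                printf("  SPS: profile_idc=%d level_idc=%d\n",
                       h->sps->profile_idc, h->sps->level_idc);

            p += nal_end - nal_start;
            remaining = size - (int)(p - buf);
        }

        h264_free(h);
        free(buf);
        return 0;
    }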
I've created a Web version - https://mradionov.github.io/h264-bitstream-viewer/
It is based on h264bitstream and inspired by H264Naked, done by compiling h264bitstream into WebAssembly and building a simple UI on top of it. The output information for NAL units is taken from H264Naked at the moment. It also supports files of any size; it will just take some time initially to load the file, but navigation through the stream should be seamless.
I had the same question. I tried h264 analysis, but it only supports Windows, so I made a similar tool with Qt to support different platforms: download H264Naked. This tool is essentially a wrapper around libh264bitstream.
I am planning to write an application (C/C++/Objective-C) that will play media files in my own (private) container format. The files will contain: multiple video streams, encoded by a video codec (like XviD or H.264; it is assumed that components capable of decoding the video formats are present in the system); and multiple audio streams in some compressed formats (it is assumed that decoding will be performed by a system component or by my own code).
So, it seems it is required to implement the following scheme:
1) Implement a container demuxer (maybe in the form of a media handler component).
2) Pass video frames to a video decoder component, and mix the decompressed frames (using my own rules).
3) Pass audio data to an audio decoder component, or decompress the audio with my own code, and mix the decoded audio data.
4) Render video frames to a window.
5) Pass audio data to a selected audio board.
Could anybody provide tips for any of the above stages, i.e. toolkits I should use, useful samples, maybe names of functions to be used, maybe improvements to the scheme, ...? A rough structural sketch of what I have in mind follows below.
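The sketch (all type and function names here are hypothetical placeholders with empty stub bodies so it compiles, not a real toolkit; the system decoder components would plug in at the marked points):

    #include <cstdint>
    #include <vector>

    // Hypothetical placeholder types -- not a real toolkit.
    struct Packet     { bool is_video = false; std::vector<uint8_t> data; int64_t pts = 0; };
    struct VideoFrame { int64_t pts = 0; /* decoded pixels would go here */ };
    struct AudioBlock { int64_t pts = 0; /* decoded PCM samples would go here */ };

    // Stage 1: container demuxer for the private format.
    class Demuxer {
    public:
        bool open(const char* /*path*/)      { return true;  }   // stub
        bool readNextPacket(Packet& /*pkt*/) { return false; }   // stub: false = end of file
    };

    // Stage 2: wraps a system video decoder component (XviD, H.264, ...).
    class VideoDecoder {
    public:
        VideoFrame decode(const Packet&) { return VideoFrame(); }   // stub
    };

    // Stage 3: system audio decoder component or own code.
    class AudioDecoder {
    public:
        AudioBlock decode(const Packet&) { return AudioBlock(); }   // stub
    };

    // Stages 2+4: mix the decoded video streams and render to a window.
    void mixAndRenderVideo(const VideoFrame&) {}
    // Stages 3+5: mix the decoded audio streams and hand them to the audio device.
    void mixAndPlayAudio(const AudioBlock&)   {}

    int main()
    {
        Demuxer demux;
        if (!demux.open("movie.private")) return 1;   // placeholder file name

        VideoDecoder videoDecoder;
        AudioDecoder audioDecoder;

        Packet pkt;
        while (demux.readNextPacket(pkt))             // stage 1: pull packets from the container
        {
            if (pkt.is_video)
                mixAndRenderVideo(videoDecoder.decode(pkt));
            else
                mixAndPlayAudio(audioDecoder.decode(pkt));
        }
        return 0;
    }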
I know I am quite late, so you might not need this anymore, but I just wanted to mention that the right way to do it is to write a QuickTime component.
Although it is pretty old school, it's the same approach Apple uses to support new formats and codecs.
Look at the Perian project as an orientation point.
Best