I am interested in tracking the leg and foot positions of a human body, with the upper body being outside the field of view. Since version 1.5, Microsoft's Kinect SDK offers a "seated" skeleton tracking mode, in which lower-body joints are disregarded. A corresponding mode for the opposite case (tracking only the lower body) does not appear to be available. However, there is the FrameEdges enumeration, which can seemingly be used to "Find out how much of a user's body is visible [...]". So does that mean the SDK supports tracking of the lower-body joints, using the default tracking mode, while the top of the user's body is outside the field of view? Does anyone have experience with this task and any tips?
Apparently OpenNI offers partial skeleton tracking, but I wasn't able to find more details on that. Can anyone help me out?
You can start with the NiSimpleSkeleton sample.
It should be a matter of replacing the
XN_SKEL_PROFILE_ALL
constant with
XN_SKEL_PROFILE_LOWER
in the sample.
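A minimal sketch of that change, assuming the OpenNI 1.x C++ wrapper that NiSimpleSkeleton is built on (the helper function is mine; the rest of the sample's setup stays as-is):

    #include <XnCppWrapper.h>

    // Ask OpenNI to track only the lower-body joints; the sample passes
    // XN_SKEL_PROFILE_ALL in the equivalent call during setup.
    void SelectLowerBodyProfile(xn::UserGenerator& userGenerator)
    {
        userGenerator.GetSkeletonCap().SetSkeletonProfile(XN_SKEL_PROFILE_LOWER);
    }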
I am writing an application for previewing, capturing, and snapshotting camera input. To this end I am using Media Foundation for the input. One of the requirements is that this works with a Blackmagic Intensity Pro 4K capture card, which behaves similarly to a normal camera.
Media Foundation is unfortunately unable to create an IMFMediaSource object from this device. Some research led me to believe that I could implement my own media source.
Then I started looking at samples, and tried to unravel the documentation.
At that point I encountered some questions:
Does anyone know if what I am trying to do is possible?
A Windows example shows a basic implementation of a source, but uses IMFSensorProfile. What is a Sensor Profile, and what should I use it for? There is almost no documentation about this.
Can somebody explain how implementing a custom media source works, i.e., what actually happens on the inside? Am I simply creating my own format, or does it allow me to pull my own frames from the camera and process them myself? I tried following the MSDN guide, but no luck so far.
Specifics:
Using WPF with C# but I can write C++ and use it in C#.
Rendering to screen uses Direct3D9.
The capture card specs can be found on their site (BlackMagic Intensity Pro 4K).
The specific problem that occurs is that I can acquire the IMFActivate for the device, but I am not able to activate it. On activation, an MF_E_INVALIDMEDIATYPE error occurs (see the sketch below).
The IMFActivate can tell me that the device should output a UYVY format.
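For reference, a minimal sketch of where that error surfaces, assuming standard Media Foundation device enumeration (error handling is reduced to printing the HRESULT):

    #include <windows.h>
    #include <mfapi.h>
    #include <mfidl.h>
    #include <cstdio>

    // Enumerate the video capture devices and try to activate each one as a
    // media source. For the Intensity Pro 4K, ActivateObject is the call
    // that returns MF_E_INVALIDMEDIATYPE.
    int main()
    {
        CoInitializeEx(nullptr, COINIT_MULTITHREADED);
        MFStartup(MF_VERSION);

        IMFAttributes* attrs = nullptr;
        MFCreateAttributes(&attrs, 1);
        attrs->SetGUID(MF_DEVSOURCE_ATTRIBUTE_SOURCE_TYPE,
                       MF_DEVSOURCE_ATTRIBUTE_SOURCE_TYPE_VIDCAP_GUID);

        IMFActivate** devices = nullptr;
        UINT32 count = 0;
        MFEnumDeviceSources(attrs, &devices, &count);

        for (UINT32 i = 0; i < count; ++i)
        {
            IMFMediaSource* source = nullptr;
            HRESULT hr = devices[i]->ActivateObject(IID_PPV_ARGS(&source));
            std::printf("device %u: hr = 0x%08lx\n", i, static_cast<unsigned long>(hr));
            if (source) source->Release();
            devices[i]->Release();
        }

        CoTaskMemFree(devices);
        attrs->Release();
        MFShutdown();
        CoUninitialize();
        return 0;
    }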
My last resort is using the DeckLinkAPI, but since I am working with several different types of cameras, I do not want to be stuck with another dependency.
Any pointers or help would be appreciated. Let me know if anything is unclear or needs more detail.
Please help me choose a tool for testing a watermark/image overlay. The transparency can be set to 0%, so that should not be a problem.
The application under test is a WPF desktop application on Windows; the autotests are written in WinAppDriver + C#. Right now it looks like I have to take a screenshot of a specific element and compare the actual image against the ideal sample using a mask.
The product under test is a video camera with the ability to insert a logotype/watermark and/or additional details (date/name/address) into the image and video. The task is to automatically verify the correctness of the inserted logo and of the inserted details in the image/video (size, color, whether the logo was mirrored after insertion, whether a name was rendered badly, and so on).
At the moment I am thinking about using OpenCV or Sikuli. I know that Appium had something similar but it probably won't work with my driver.
It is also unclear how and what can be tested with video. Should I just grab one frame at random and test it the same way as an image?
Many thanks for your help and suggestions!
Perhaps not a complete answer to your questions, but a few words on how Sikuli works and what might be a disadvantage, if I understand your needs correctly. First of all, Sikuli uses OpenCV internally by calling the Imgproc.matchTemplate() function. There is not much control over it from Sikuli, but you can set a minimum similarity score that varies between 0 (everything matches) and 1 (pixel-perfect comparison). Given that you intend to use it for video-originated patterns, you'd want to be somewhere in the middle. Having said that, I am not sure what quality of comparison you'd like to obtain, so I'm not sure the minimum similarity by itself will be enough.
Another thought is to integrate the OpenCV library itself into your code and use it directly, as sketched below. This is not an easy task, and some basic understanding of image processing techniques may be required.
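A minimal sketch of that direct approach, mirroring what Sikuli does internally (the file names and the 0.9 threshold are placeholders to be tuned per asset):

    #include <opencv2/opencv.hpp>
    #include <iostream>

    int main()
    {
        // Screenshot of the element (e.g. captured via WinAppDriver) and the
        // ideal watermark sample to search for inside it.
        cv::Mat screenshot = cv::imread("element_screenshot.png");
        cv::Mat logo       = cv::imread("expected_logo.png");

        cv::Mat result;
        cv::matchTemplate(screenshot, logo, result, cv::TM_CCOEFF_NORMED);

        double minVal, maxVal;
        cv::Point minLoc, maxLoc;
        cv::minMaxLoc(result, &minVal, &maxVal, &minLoc, &maxLoc);

        // maxVal plays the role of Sikuli's similarity score (0..1).
        std::cout << "best match " << maxVal << " at " << maxLoc << std::endl;
        return maxVal >= 0.9 ? 0 : 1;
    }

For the video case, cv::VideoCapture can pull individual frames into the same comparison; sampling several frames rather than one random frame gives more confidence that the overlay is stable throughout the clip.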
This is a follow-up question of How do you draw text in DirectX 11?
In Direct3D 12, things have become much more complex, and since it is new I couldn't find any suitable libraries online.
I'm building a basic Direct3D 12 FPS test application, and I'd like to display the FPS data on screen along with my rendered image.
The general answer to questions like this is "if you have to ask, then you probably should be using DirectX 11." DirectX 12 is an expert graphics API that provides immense control and is not particularly concerned with ease of use for novices. See this thread for more thoughts in this vein.
With that out of the way, one option is to use device interop and Direct2D/DirectWrite. See Working with Direct3D 11, Direct3D 10 and Direct2D.
UPDATE: DirectX Tool Kit for DirectX 12 is now available. It includes a SpriteFont / SpriteBatch implementation that will draw text on Direct3D 12 render targets. See this tutorial.
For pure DirectX 12, you need to load the font glyph data into a vertex buffer and render it with a vertex shader and pixel shader. You mentioned libraries online; well, this is expert stuff, and fortunately James Stanard at Microsoft released a how-to with their open-source MiniEngine project. He handles multiple fonts, antialiasing, and drop shadows in DirectX 12.
Find the project files on GitHub at https://github.com/Microsoft/DirectX-Graphics-Samples/tree/master/MiniEngine and check out TextRenderer.h and TextRenderer.cpp.
If you want maximum feature set with minimum work you probably should go with DirectWrite on top of a D3D11 interop device, like Chuck said in his answer.
If you want to roll your own high-performance text rendering, you may want to take a look at the text renderer in the MiniEngine example repository on GitHub; it has some interesting ideas.
Unfortunately, the only ways have already been described: interface with DirectWrite, or create your own glyph/texture system.
What you are doing is importing a texture file with glyphs on it, cutting out small squares around each character from the glyph texture, and then gluing it all together to form a string, as sketched below. It results in somewhat faster drawing (in some cases).
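Here is a minimal sketch of that idea with no D3D12 calls at all, assuming a fixed 16x16 ASCII grid in the atlas texture (the cell layout and the Vertex struct are illustrative, not MiniEngine's actual format):

    #include <cstdio>
    #include <string>
    #include <vector>

    struct Vertex { float x, y, u, v; };

    // Build two textured triangles (six vertices) per character, reading
    // texture coordinates from a 16x16 glyph grid indexed by ASCII code.
    std::vector<Vertex> BuildTextQuads(const std::string& text,
                                       float penX, float penY, float size)
    {
        const float cell = 1.0f / 16.0f;
        std::vector<Vertex> verts;
        for (unsigned char c : text)
        {
            float u = (c % 16) * cell;   // column in the atlas
            float v = (c / 16) * cell;   // row in the atlas
            verts.push_back({penX,        penY,        u,        v});
            verts.push_back({penX + size, penY,        u + cell, v});
            verts.push_back({penX,        penY + size, u,        v + cell});
            verts.push_back({penX + size, penY,        u + cell, v});
            verts.push_back({penX + size, penY + size, u + cell, v + cell});
            verts.push_back({penX,        penY + size, u,        v + cell});
            penX += size;                // fixed advance; real fonts use per-glyph metrics
        }
        return verts;
    }

    int main()
    {
        auto quads = BuildTextQuads("FPS: 60", -0.9f, 0.8f, 0.05f);
        std::printf("%zu vertices for the upload buffer\n", quads.size());
    }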
I think the approach referenced by the others is slightly outdated and destined to fail. Direct3D 11 had the same lack of text-drawing support as Direct3D 12 (there is perhaps some misinformation on that). It was Direct3D 9 which had the built-in text-drawing support, which nonetheless worked fine, and which later supported sprite-batch drawing so you could render all text in one sprite.
It seems backwards to state that you simply "need to know" more or "are not an expert" in order to implement such a basic yet tedious system. Such a system is destined to fail in the same way that nobody wants to use assembly to code something they can code in C and onward.
The D3D11 and D3D12 math libraries also suffer from the same failures. To define and convert vectors, you are better off including the D3DX9 math headers or custom math structures, because the newer methods are so backwards. "Someone" made them and must like them, but I remember filing a complaint showing how easy vector operations were before versus afterward: the new approach nearly doubles or triples the number of lines needed to perform basic vector operations and conversions, not even counting the references and the learning time you need to understand someone else's library. It seems to be a big failure presented by mathematicians who were never good at programming. A comparison is sketched below.
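For what it's worth, a small illustration of the verbosity point, comparing DirectXMath (the D3D11/12-era library) with the old D3DX9 helper; both normalize a 3-component vector:

    #include <DirectXMath.h>
    using namespace DirectX;

    // DirectXMath keeps its math in SIMD registers, so every operation on a
    // stored XMFLOAT3 needs an explicit load and store around it.
    XMFLOAT3 NormalizeStored(const XMFLOAT3& in)
    {
        XMVECTOR v = XMLoadFloat3(&in);
        v = XMVector3Normalize(v);
        XMFLOAT3 out;
        XMStoreFloat3(&out, v);
        return out;
    }

    // The D3DX9 equivalent was a single call:
    //     D3DXVec3Normalize(&out, &in);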
I have been working with Kinect gestures for a while now, and so far the tools available to create gestures are limited to tracking whole-body movements, for instance swiping your arm left and right. The joint types available in the original Kinect SDK cover elbows, wrists, hands, shoulders, etc., but do not include finer details like the index finger, thumb, and middle finger. I am mentioning all this because I am trying to create gestures involving only hand movements (like a victory sign, or thumb up/down). Can anyone guide me through this? Is there a blog or website where code for hand movements is available?
I was developing an application with Kinect a year ago, and back then it was very hard or nearly impossible to do that. Now Google shows me projects like this; be sure to check it out. If you generally want to focus on hand gestures, I really advise you to use LEAP Motion.
My friends at SigmaRD have developed something called the SigmaNIL Framework. You can get it from the OpenNI website.
It offers "HandSegmentation", "HandSkeleton", "HandShape" and "HandGesture" modules which may cover your needs.
Also check out the rest of the OpenNI Middleware and Libraries that you can download from their website. Some of them also work with the Microsoft SDK.
iOS 4 automatically detects tracking numbers found in emails, notes, and messages and turns them into clickable links.
These links redirect to a URL like this:
http://trackingshipment.apple.com/?Company=UPS&Locale=&TrackingNumber=1Z1234567890123456
How can we use this API or library in our own iOS apps so that they automatically detect (or force-detect) shipping numbers?
Unfortunately, the publicly released data detector types don't include common carrier tracking numbers. I wrote a small project showing how to detect UPS, USPS, and FedEx package numbers and got pretty good results.
You'll have to do the work of assembling the tracking URLs yourself, but this sample code may help you get started; a rough sketch of the idea follows. Download here.
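As a rough sketch of that idea in plain C++ (the pattern is an assumption covering only the common "1Z" UPS format; USPS and FedEx need their own patterns, and this does not validate the check digit):

    #include <iostream>
    #include <regex>
    #include <string>

    int main()
    {
        std::string text = "Your parcel 1Z1234567890123456 shipped today.";

        // UPS "1Z" numbers: "1Z" followed by 16 alphanumeric characters.
        std::regex ups(R"(\b1Z[0-9A-Z]{16}\b)");

        for (std::sregex_iterator it(text.begin(), text.end(), ups), end;
             it != end; ++it)
        {
            // Assemble the tracking URL yourself, as noted above.
            std::cout << "http://trackingshipment.apple.com/?Company=UPS"
                         "&TrackingNumber=" << it->str() << std::endl;
        }
        return 0;
    }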
The class being used to do this is called NSDataDetector.
It is a subclass of NSRegularExpression in which you can specify certain built-in patterns to look for.
The list of built-in type values in the NSTextCheckingType enum can be seen here.
I don't see one specifically for tracking information, but the closest thing appears to be NSTextCheckingTypeTransitInformation. That is most likely the one you're going to be using.
Good luck!