Object Detection Using WebRTC and Kurento

I have been following a very interesting set of tutorials about Kurento, the media server for WebRTC, which allows multimedia communication directly between browsers. I have run the tutorials at http://www.kurento.org/docs/current/tutorials.html and found them interesting.
My plan now is to implement a very basic object detection/recognition algorithm on top of WebRTC (or Kurento) that, given a simple object, can detect it. To do this, I thought of the following steps.
Let's say we wish to find fruits, like apples and oranges:
Step 1: Put a fixed bounding box on the screen that limits the detection area, like the green shape around the user's face in the US green card photo tool: http://travel.state.gov/content/visas/en/general/photos.html
Step 2: Implement a button that, once pressed, tells you whether the object inside the bounding box is an apple or an orange (e.g. based on its color or shape).
If you have any ideas, I would appreciate hearing them. Thanks.

You will need to create your own module for that. There is documentation on how to do this. You can use any of the existing modules as an example, but I think the face detector filter is quite similar to what you are proposing.
Creating pluggable modules for Kurento is very easy. The hardest part is adding the computer vision algorithms inside those modules ;-)
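The answer is right that the computer vision part is the hard bit, but for apples vs. oranges a first cut can be very simple: average the hue inside the bounding box and threshold it. Kurento filter modules are typically written in C++ with OpenCV; the sketch below shows only the classification idea, in C#, with purely illustrative thresholds and a hypothetical `ClassifyFruit` helper:

```csharp
using System;

// Hypothetical sketch: classify the dominant color inside the bounding
// box as "apple" (red) or "orange" (orange hue). The thresholds are
// illustrative, not tuned values.
static class FruitClassifier
{
    // pixels: packed RGB bytes (3 per pixel) for the crop inside the box
    public static string ClassifyFruit(byte[] pixels)
    {
        double sumHue = 0;
        int counted = 0;

        for (int i = 0; i + 2 < pixels.Length; i += 3)
        {
            double r = pixels[i] / 255.0;
            double g = pixels[i + 1] / 255.0;
            double b = pixels[i + 2] / 255.0;

            double max = Math.Max(r, Math.Max(g, b));
            double min = Math.Min(r, Math.Min(g, b));
            double delta = max - min;

            if (delta < 0.15) continue; // skip gray/background pixels

            // Standard RGB -> hue conversion (degrees, 0..360)
            double hue;
            if (max == r)      hue = 60 * (((g - b) / delta) % 6);
            else if (max == g) hue = 60 * (((b - r) / delta) + 2);
            else               hue = 60 * (((r - g) / delta) + 4);
            if (hue < 0) hue += 360;

            sumHue += hue;
            counted++;
        }

        if (counted == 0) return "unknown";
        double avgHue = sumHue / counted;

        // Illustrative: red apples sit near 0 degrees, oranges near 20-45
        if (avgHue < 15 || avgHue > 345) return "apple";
        if (avgHue <= 45) return "orange";
        return "unknown";
    }
}
```

A real module would run this per frame inside the filter's processing callback, and you would likely switch to OpenCV's HSV conversion plus a histogram rather than a raw average.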

Related

Displaying an interactive map of polygons efficiently in React Native

I am currently working on a project where I want to display a map on which the user can click on certain regions. The user should also be able to pan and zoom the map so that they can click on smaller regions too.
I have formatted a shapefile with ±500 polygons into JSON data I can use. The exact data format is not important; I can adapt it to whatever library I end up using.
What I have tried:
First, I went for react-native-maps because it looked promising: it allows inserting custom polygons and clicking on them. But it turned out that I need a Google Maps API key just to use the library, even when I am not actually using Google Maps at all. I have no interest in setting up an API key and a Google account for something I don't use, so I started looking for an alternative.
It turns out there aren't many alternatives that support just rendering polygons without an API key (Mapbox has the same issue).
I also tried the react-native-svg-pan-zoom library, but it had performance issues: at a resolution high enough that people would not see pixels after zooming in a bit (canvasWidth above 2000) my app would just crash, and even at a value of 1500 the movement was not smooth.
I'm a bit stuck now because I can't find any alternatives. It's very frustrating, because react-native-maps would probably do the trick, but it requires an API token for no reason.
I created this question because maybe there is still a way to avoid the API key, or there is some other library I just couldn't find; maybe you know of one.
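One thing worth noting: whichever rendering library you settle on, the tap hit-testing itself does not need a map library at all. A standard ray-casting point-in-polygon test is only a few lines. A sketch of the even-odd rule, shown in C# but portable line-for-line to JavaScript (the screen-to-map coordinate transform is assumed to happen before the call):

```csharp
// Minimal ray-casting point-in-polygon test (even-odd rule).
// polygon: vertices in order; (px, py): the tapped point, already
// transformed from screen coordinates into the polygon's space.
static class HitTest
{
    public static bool Contains(double[][] polygon, double px, double py)
    {
        bool inside = false;
        int n = polygon.Length;
        for (int i = 0, j = n - 1; i < n; j = i++)
        {
            double xi = polygon[i][0], yi = polygon[i][1];
            double xj = polygon[j][0], yj = polygon[j][1];

            // Does a horizontal ray from (px, py) cross edge (j, i)?
            bool crosses = (yi > py) != (yj > py)
                && px < (xj - xi) * (py - yi) / (yj - yi) + xi;
            if (crosses) inside = !inside;
        }
        return inside;
    }
}
```

With roughly 500 polygons a linear scan per tap is fine; rendering, not hit-testing, is the performance constraint you ran into.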

Media Foundation - Custom Media Source & Sensor Profile

I am writing an application for previewing, capturing and snapshotting camera input, and I am using Media Foundation for the input. One of the requirements is that this works with a Blackmagic Intensity Pro 4K capture card, which behaves similarly to a normal camera.
Media Foundation is unfortunately unable to create an IMFMediaSource object from this device. Some research led me to believe that I could implement my own media source.
Then I started looking at samples, and tried to unravel the documentation.
At that point I encountered some questions:
Does anyone know if what I am trying to do is possible?
A Windows example shows a basic implementation of a source, but uses IMFSensorProfile. What is a Sensor Profile, and what should I use it for? There is almost no documentation about this.
Can somebody explain how implementing a custom media source works in terms of what actually happens on the inside? Am I simply creating my own format, or does it allow me to pull my own frames from the camera and process them myself? I tried following the MSDN guide, but no luck so far.
Specifics:
Using WPF with C# but I can write C++ and use it in C#.
Rendering to screen uses Direct3D9.
The capture card specs can be found on their site (BlackMagic Intensity Pro 4K).
The specific problem that occurs is that I can acquire the IMFActivate for the device, but I am not able to activate it. On activation, an MF_E_INVALIDMEDIATYPE error occurs.
The IMFActivate can tell me that the device should output the UYVY format.
My last resort is using the DeckLink API, but since I am working with several different types of cameras, I do not want to be stuck with another dependency.
Any pointers or help would be appreciated. Let me know if anything is unclear or needs more detail.
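Not an answer to the activation failure itself, but if you do end up pulling raw frames yourself (through a custom media source or the DeckLink API), UYVY is easy to unpack: it is a packed 4:2:2 format where every four bytes (U0, Y0, V0, Y1) describe two horizontally adjacent pixels. A minimal BT.601 conversion sketch in C# (stride handling omitted; the class and method names are illustrative):

```csharp
// Sketch: convert one row of packed UYVY (U0 Y0 V0 Y1 = two pixels)
// into interleaved RGB bytes using BT.601 integer coefficients.
static class Uyvy
{
    public static void RowToRgb(byte[] uyvy, byte[] rgb, int widthInPixels)
    {
        for (int p = 0; p < widthInPixels; p += 2)
        {
            int i = p * 2;                 // 2 bytes per pixel in UYVY
            int u  = uyvy[i]     - 128;
            int y0 = uyvy[i + 1] - 16;
            int v  = uyvy[i + 2] - 128;
            int y1 = uyvy[i + 3] - 16;

            WritePixel(rgb, p * 3,       y0, u, v);
            WritePixel(rgb, (p + 1) * 3, y1, u, v);
        }
    }

    static void WritePixel(byte[] rgb, int o, int y, int u, int v)
    {
        // Limited-range BT.601 YUV -> RGB
        int c = 298 * y;
        rgb[o]     = Clamp((c + 409 * v + 128) >> 8);            // R
        rgb[o + 1] = Clamp((c - 100 * u - 208 * v + 128) >> 8);  // G
        rgb[o + 2] = Clamp((c + 516 * u + 128) >> 8);            // B
    }

    static byte Clamp(int x) => (byte)(x < 0 ? 0 : x > 255 ? 255 : x);
}
```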

Create Kinect skeleton for comparison

I'm going to build an application where the user is supposed to try to mimic the static pose of a person in a picture. I'm thinking that a Kinect is the most suitable way to get information about the user's pose.
I have found answers here on Stack Overflow suggesting that the comparison of the two skeletons (the skeleton defining the pose in the picture and the skeleton of the user) is best done by comparing the joint angles, etc. I was expecting the SDK to already provide some functionality for comparing skeleton poses, but I haven't found any information saying that it does.
One thing makes me very unsure:
Is it possible to define a skeleton manually, so I can somehow construct the static pose from the picture? Or do I need to record it with the help of Kinect Studio? I would really prefer some tool for creating the poses by hand...
If you want the user to pose and be recognized for making the correct pose, you can follow these steps to implement it in C#.
You can refer to the sample project Controls Basics-WPF provided by Microsoft in the SDK Browser v2.0 (Kinect for Windows).
Steps:
1) Record the position you want for the pose in Kinect Studio 2.
2) Open Visual Gesture Builder to train your clips (tagging the parts of each clip that are correct).
3) Build the VGB solution in Visual Gesture Builder to produce a .gbd file (this is imported into your project as the file that GestureDetector.cs reads and applies).
4) Code your own logic for what happens when the user matches a pose, in GestureResultView.cs.
5) Start off with one pose, then turn the files into an array to loop over once you have multiple poses.
I would prefer this approach over coding out the exact skeleton joints of the poses.
Cheers!
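If you do go the manual route that the question mentions (comparing joint angles directly), the per-joint math is small. A minimal sketch, using System.Numerics.Vector3 as a stand-in for the SDK's joint positions (Joint.Position gives X/Y/Z in meters); the tolerance value is purely illustrative:

```csharp
using System;
using System.Numerics;

// Sketch: compare poses by joint angles. Positions would come from the
// Kinect SDK; Vector3 stands in so the sample is self-contained.
static class PoseComparer
{
    // Angle in degrees at joint b formed by segments b->a and b->c,
    // e.g. the elbow angle from shoulder (a), elbow (b) and wrist (c).
    public static double JointAngle(Vector3 a, Vector3 b, Vector3 c)
    {
        Vector3 u = Vector3.Normalize(a - b);
        Vector3 v = Vector3.Normalize(c - b);
        double cos = Math.Max(-1.0, Math.Min(1.0, Vector3.Dot(u, v)));
        return Math.Acos(cos) * 180.0 / Math.PI;
    }

    // Illustrative check: every angle within a tolerance (in degrees).
    public static bool Matches(double[] target, double[] actual,
                               double toleranceDeg = 15)
    {
        for (int i = 0; i < target.Length; i++)
            if (Math.Abs(target[i] - actual[i]) > toleranceDeg)
                return false;
        return true;
    }
}
```

This also bears on the question about defining a skeleton by hand: a hand-authored pose is then just a list of target angles, with no Kinect Studio recording required.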

SpriteKit NPC quests & scripting in general

I'm in the process of developing an RPG using Apple's SpriteKit framework. All is well: I have NPCs and game objects that the player can interact with, plus a text box to display text. Now I want to implement quests and such, but I'm currently stumped on a good way to create that kind of game content. I did come across Ray Wenderlich's tutorial on making an RPG (https://www.raywenderlich.com/30561/how-to-make-a-rpg), which uses Lua as the scripting language, but after some trial and error I realized the Lua/Objective-C bridge it relies on is far too old and deprecated (I'm not sure how to fix the myriad of defects and errors), and there aren't any viable alternatives. I tried looking on GitHub but couldn't find anything useful to get me started. So I realize I must code my own implementation from scratch.
Any suggestions or tips on how to go about this? Should I have some sort of JSON file that stores the dialogue text, from which the appropriate content is retrieved and displayed inside the text box?
I'm hoping to make this a more-or-less flexible solution so I can use it in future projects.
If you are just starting your game now, go ahead and use the iOS 10 tile interface for world construction; see the Ray Wenderlich tutorial on it for more. The tile interface handles constructing the world and the performance of tile-based rendering for you; you still need to implement the game logic, items, and so on. JSON is just as good as any other storage means, and you are going to need to implement the dialogue logic yourself. I would also suggest that you have a peek at my texture memory reduction framework, which is very useful for loading complex images that would otherwise consume a ton of texture memory.
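On the JSON question: a flat list of dialogue nodes with explicit "next" pointers is a simple shape that stays engine-agnostic. A minimal sketch of the schema and loader, in C# with System.Text.Json (all names are illustrative; the same structure maps directly onto Swift's Codable in a SpriteKit project):

```csharp
using System.Collections.Generic;
using System.Text.Json;

// Illustrative quest/dialogue schema: a quest is a list of dialogue
// nodes; each node carries the text box contents and the id of the
// next node (null means the conversation ends).
public class DialogueNode
{
    public string Id { get; set; }
    public string Speaker { get; set; }
    public string Text { get; set; }
    public string Next { get; set; }   // null terminates the dialogue
}

public class Quest
{
    public string Id { get; set; }
    public string Title { get; set; }
    public List<DialogueNode> Dialogue { get; set; }
}

public static class QuestLoader
{
    // json: the contents of e.g. a hypothetical quests/village_elder.json
    public static Quest Load(string json) =>
        JsonSerializer.Deserialize<Quest>(json,
            new JsonSerializerOptions { PropertyNameCaseInsensitive = true });
}
```

The text box then just follows the Next pointers until it hits null; branching choices become a list of (choice text, next id) pairs on the node.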

3D files in VB.NET

I know this will be a difficult question, so I am not necessarily looking for a direct answer, but maybe a tutorial or a pointer in the right direction.
What I am doing is programming a robot that will be controlled by a remote operator. We have a 3D rendering of the robot in SolidWorks. What I am looking to do is get the 3D file into VB (probably using DX9) and be able to manipulate it in code, so that the remote operator will have a better idea of what the robot is doing. The operator will also have live video to look at, but that doesn't really matter for this question.
Any help would be greatly appreciated. Thanks!
Sounds like a tough idea to implement. For VB you are stuck with MDX 1.1 (comes with the DirectX SDK) or SlimDX (or another third-party managed DirectX wrapper). The latest XNA (the replacement for MDX 1.1/2.0b) is only available to C# coders. You can try some workarounds, but they aren't recommended and you won't get much community support. That is the minimum you need to get VB to display some 3D content.
If you want to save some trouble, you could use a ready-made game engine to simplify the job. Try Ogre and its managed wrapper MOgre. It was one of the candidates for my project, but I ended up with SlimDX because Ogre does not support video very well. Since video is not a requirement for you, you can seriously consider it. Most samples are in C#, so you will need to convert them to VB.NET to use them; it won't be hard.
Here comes the harder part: you need to export your model from SolidWorks to the DirectX format (*.x). I did a quick Google search and only found a few paid tools for this, so you might need to spend a bit on that, or spend more time looking for free converters.
That's about it. If you have more questions, post again. Good luck.
I'm not sure what the real question is, but I suspect that you are trying to manipulate a SolidWorks model of a robot with some sort of manual input. Assuming that is the correct question, there are two aspects to deal with:
1) The SolidWorks module: once the model of the robot is working properly in SolidWorks, a program can be written in VB.NET that manipulates the positional mates for each of the joints. Also using VB, a window can be programmed with slider bars etc. that allow the operator to "remotely" control the robot. Once this is done, there is a great opportunity to set up a table that stores the sequential steps, and the VB program could be further developed to let the robot "cycle" through a sequence of moves. If obstacles are also added to the model, this becomes a great tool for collision detection and offline training.
2) If the question also covers incorporating a physical operator pendant, there are a number of potential solutions. Ideally the robot software provides a VB library for communicating with and commanding the robot programmatically. If so, the VB code could be developed with a "run" mode where the SolidWorks robot is controlled by the operator pendant instead of by the controls in the VB window (as mentioned above). This would allow the operator to work "offline" with a virtual robot.
Hope this helps.
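One more note on the "manipulate it using code" part: whichever renderer is chosen (SlimDX, MOgre, or SolidWorks itself), driving the on-screen robot from operator input boils down to a chain of joint transforms. A minimal forward-kinematics sketch using System.Numerics (a hypothetical planar two-joint arm; the link lengths and joint axes are placeholders for the real robot's geometry):

```csharp
using System;
using System.Numerics;

// Sketch: forward kinematics for a planar two-joint arm. Each joint
// contributes a rotation followed by a translation along its link;
// the rendered model would apply the same matrices to its link meshes.
static class RobotArm
{
    public static Vector3 EndEffector(float joint1Deg, float joint2Deg,
                                      float link1Len = 1.0f, float link2Len = 0.8f)
    {
        // System.Numerics uses row vectors, so the leftmost matrix is
        // applied first: link-2 offset, joint-2 rotation, link-1 offset,
        // then joint-1 rotation.
        Matrix4x4 m =
            Matrix4x4.CreateTranslation(link2Len, 0, 0) *
            Matrix4x4.CreateRotationZ(Rad(joint2Deg)) *
            Matrix4x4.CreateTranslation(link1Len, 0, 0) *
            Matrix4x4.CreateRotationZ(Rad(joint1Deg));

        return Vector3.Transform(Vector3.Zero, m);
    }

    static float Rad(float deg) => deg * MathF.PI / 180f;
}
```

In a render loop you would apply the intermediate matrices to each link's mesh, not just compute the end point; the same joint values can come from the slider window or from the operator pendant.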