My app uses OpenCV's cv::findContours on a binary image. I now need to make it run in real time. GPUImage has a Canny edge detection filter, but I couldn't find anything related to findContours. Does GPUImage have anything that closely resembles findContours? If not, can someone suggest an alternative?
Thanks
There was an issue reported in this regard in 2015.
https://github.com/BradLarson/GPUImage/issues/2094
The answer on that issue states that GPUImage does not offer an equivalent at the moment. So the best option is probably to feed the output from GPUImage's Canny or Sobel edge detection into a contour determination operation outside of the GPUImage framework.
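As a rough illustration of what such a CPU-side step could look like, here is a minimal Python sketch that groups the nonzero pixels of an edge map into 8-connected components. It is not cv::findContours (which uses a border-following algorithm and returns ordered contours with a hierarchy); the function name group_edge_pixels and the list-of-lists input format are assumptions made for the example.

```python
from collections import deque

def group_edge_pixels(edges):
    """Group nonzero pixels of a 2D edge map into 8-connected components.

    edges: list of rows, each a list of 0/1 values (hypothetical format).
    Returns a list of components, each a list of (x, y) pixel coordinates.
    """
    height = len(edges)
    width = len(edges[0]) if height else 0
    seen = [[False] * width for _ in range(height)]
    components = []
    for y in range(height):
        for x in range(width):
            if not edges[y][x] or seen[y][x]:
                continue
            # Breadth-first flood fill starting from this unvisited edge pixel.
            seen[y][x] = True
            queue = deque([(x, y)])
            component = []
            while queue:
                cx, cy = queue.popleft()
                component.append((cx, cy))
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        nx, ny = cx + dx, cy + dy
                        if (0 <= nx < width and 0 <= ny < height
                                and edges[ny][nx] and not seen[ny][nx]):
                            seen[ny][nx] = True
                            queue.append((nx, ny))
            components.append(component)
    return components

# Toy edge map: a small ring on the left and a 2x2 block on the right.
edge_map = [
    [1, 1, 1, 0, 0, 0],
    [1, 0, 1, 0, 0, 0],
    [1, 1, 1, 0, 1, 1],
    [0, 0, 0, 0, 1, 1],
]
groups = group_edge_pixels(edge_map)
```

In a real pipeline you would read the filtered frame back from the GPU and most likely hand it to cv::findContours itself; the sketch only shows that the grouping step is independent of how the edge map was produced.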
I have a task: to determine the sound source location.
I have some experience working with TensorFlow, building predictions from simple features and datasets. I assume this task would require analyzing the sound frequencies, and probably other related data, during the training and prediction steps. The sound comes from a headset, so the human ear is able to detect the direction.
1) Has anybody already done this? (Unfortunately, I couldn't find any similar project.)
2) What pitfalls might I run into while trying to achieve it?
3) Can I do this with this approach? Are there any other sound processing frameworks, technologies, or open source projects that could help me?
I am asking here because my research on Google, GitHub, and Stack Overflow didn't turn up any relevant results on this specific topic, so any help is highly appreciated!
This is typically done with more traditional DSP and multiple sensors. You might want to look into time difference of arrival (TDOA) and direction of arrival (DOA). Algorithms such as GCC-PHAT and MUSIC will be helpful.
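To make the TDOA idea concrete, here is a minimal GCC-PHAT sketch in Python with NumPy for a single pair of microphones. The function names, the 16 kHz sample rate, the 0.2 m microphone spacing, and the far-field conversion from delay to angle are all assumptions chosen for illustration, not a recommended configuration.

```python
import numpy as np

def gcc_phat(sig, ref, fs):
    """Estimate the delay of `sig` relative to `ref` in seconds using GCC-PHAT."""
    n = len(sig) + len(ref)              # zero-pad to avoid circular wrap-around
    SIG = np.fft.rfft(sig, n=n)
    REF = np.fft.rfft(ref, n=n)
    cross = SIG * np.conj(REF)           # cross-power spectrum
    cross /= np.abs(cross) + 1e-12       # PHAT weighting: keep only the phase
    cc = np.fft.irfft(cross, n=n)        # generalized cross-correlation
    max_shift = n // 2
    # Reorder so that index `max_shift` corresponds to zero lag.
    cc = np.concatenate((cc[-max_shift:], cc[:max_shift + 1]))
    lag = int(np.argmax(np.abs(cc))) - max_shift
    return lag / fs

def doa_angle(tau, mic_distance, speed_of_sound=343.0):
    """Far-field angle in degrees from broadside for a two-microphone pair."""
    ratio = np.clip(speed_of_sound * tau / mic_distance, -1.0, 1.0)
    return float(np.degrees(np.arcsin(ratio)))

# Synthetic check: the second channel is the first one delayed by 5 samples.
fs = 16000
rng = np.random.default_rng(0)
ref = rng.standard_normal(1024)
sig = np.concatenate((np.zeros(5), ref))
tau = gcc_phat(sig, ref, fs)
angle = doa_angle(tau, mic_distance=0.2)
```

The synthetic signal is noise-free and anechoic, so the peak is unambiguous; with real recordings the correlation peak becomes broader and less reliable as reverberation increases.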
One issue you might encounter: DOA accuracy is a function of the direct-to-reverberant ratio of the source, i.e. the more reverberant the environment, the harder it is to determine the source location.
Also, you might want to consider the number of location dimensions you want to resolve. A point in 3D space is much more difficult than a direction relative to the sensors.
Using ML as an approach to this is not entirely without merit, but you will have to consider what it is you would be learning, i.e. you probably don't want to learn the test room's reverberant properties but rather the sensors' spatial properties.
How to overcome the scale problem in monocular visual odometry?
Can someone suggest a good paper on the implementation?
Visual-inertial fusion is the most recent approach used for monocular odometry. You need to calculate the scale between the arbitrary world units of the monocular reconstruction and meters.
You can use either an Extended Kalman Filter or an optimization-based approach. Try this paper: http://www.ee.ucr.edu/~mourikis/papers/Li2012-ICRA.pdf
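To illustrate what "calculating the scale" means in its simplest form, here is a Python sketch that fits a single scale factor by least squares. It is not the filter described in the linked paper: it assumes you already have matching frame-to-frame displacements from the monocular odometry (in arbitrary units) and from a metric source such as integrated IMU data, and the names estimate_scale, vo_steps and metric_steps are hypothetical.

```python
def estimate_scale(vo_steps, metric_steps):
    """Least-squares scale s minimizing sum ||s * v - m||^2 over paired steps.

    vo_steps: displacement vectors from monocular odometry (arbitrary units).
    metric_steps: the same displacements in meters (assumed to be available).
    """
    numerator = 0.0
    denominator = 0.0
    for v, m in zip(vo_steps, metric_steps):
        numerator += sum(a * b for a, b in zip(v, m))
        denominator += sum(a * a for a in v)
    if denominator == 0.0:
        raise ValueError("odometry displacements are all zero; scale is unobservable")
    return numerator / denominator

# Toy data: the metric motion is exactly 2.5 times the odometry motion.
vo_steps = [(1.0, 0.0, 0.0), (0.0, 2.0, 0.0), (1.0, 1.0, 1.0)]
metric_steps = [(2.5, 0.0, 0.0), (0.0, 5.0, 0.0), (2.5, 2.5, 2.5)]
scale = estimate_scale(vo_steps, metric_steps)
```

A batch fit like this ignores IMU bias, noise, and the drift of the scale over time, which is what the EKF and optimization-based formulations are designed to handle.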
I know that object detection is not possible with the Kinect v1 on its own; we need to use third-party libraries like OpenCV or the Point Cloud Library (PCL).
But I was curious: can it be achieved using the Kinect v2? Has anyone done any work on it?
Take a look at this project, and search Google for the keyword "blob detection".
The short answer is that object detection using the Kinect v2 is possible in two ways, but there aren't many complete solutions out there right now (Nov. 2014) because the device is so new and hasn't been hacked yet. Currently, I am trying to build PCL on Windows 8 with Visual Studio 2012, which are the bare minimum requirements for Kinect v2, and I will keep you posted on how it goes:
http://cs.unc.edu/~kwaegel/pcl/pcl_build_notes.html
Realistically, the fastest approach would likely be using the v2 SDK. (Sorry about the link above: I don't have enough reputation to share more than 2 links, so I had to fuse it into the text to keep StackExchange from recognizing it as a link. :P) In a brief search I found that this guy acquired the color point clouds from the Kinect v2 and was able to output them:
http://laht.info/kinect-v2-colored-point-clouds/
After that, you should be able to segment the point clouds with existing open source software by simply importing your newly acquired point cloud!
http://pointclouds.org/documentation/tutorials/random_sample_consensus.php
Again, I haven't gotten all of these moving parts working together in one environment yet, but it is definitely possible.
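For readers who want to see the idea behind the random sample consensus tutorial linked above without setting up PCL, here is a minimal pure-Python sketch of RANSAC plane fitting. It does not use PCL's API; the function names, the distance threshold, the iteration count, and the toy point cloud are assumptions for illustration only.

```python
import math
import random

def fit_plane(p1, p2, p3):
    """Return (a, b, c, d) with unit normal for the plane through three points,
    or None if the points are (nearly) collinear."""
    u = [p2[i] - p1[i] for i in range(3)]
    v = [p3[i] - p1[i] for i in range(3)]
    normal = (u[1] * v[2] - u[2] * v[1],
              u[2] * v[0] - u[0] * v[2],
              u[0] * v[1] - u[1] * v[0])
    length = math.sqrt(sum(c * c for c in normal))
    if length < 1e-12:
        return None
    a, b, c = (comp / length for comp in normal)
    d = -(a * p1[0] + b * p1[1] + c * p1[2])
    return a, b, c, d

def ransac_plane(points, threshold=0.01, iterations=200, seed=0):
    """Find the plane supported by the most points within `threshold` distance."""
    rng = random.Random(seed)            # fixed seed keeps the sketch repeatable
    best_model, best_inliers = None, []
    for _ in range(iterations):
        model = fit_plane(*rng.sample(points, 3))
        if model is None:
            continue
        a, b, c, d = model
        inliers = [i for i, (x, y, z) in enumerate(points)
                   if abs(a * x + b * y + c * z + d) <= threshold]
        if len(inliers) > len(best_inliers):
            best_model, best_inliers = model, inliers
    return best_model, best_inliers

# Toy cloud: a 5x5 patch of the plane z = 1 followed by five off-plane points.
cloud = [(float(x), float(y), 1.0) for x in range(5) for y in range(5)]
cloud += [(0.0, 0.0, 5.0), (1.0, 2.0, -3.0), (3.0, 1.0, 7.0),
          (2.0, 2.0, 4.0), (4.0, 0.0, -2.0)]
plane, plane_indices = ransac_plane(cloud)
```

Removing the inlier indices from the cloud and clustering what remains is one common way to get from a dominant plane (floor or table) to candidate objects, though a Kinect-sized cloud would call for PCL or NumPy rather than plain Python lists.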
I've been working with augmented reality APIs lately but haven't been able to achieve detection of irregular shapes, namely the hand. I want to be able to detect hand shapes through the video/camera and execute code based on hand signs. Does anything like this already exist?
Did you have a look at OpenCV?
These are some of the links I found just using Google: Face Detection using OpenCV, Vision For Robots
I'm currently using a Processing Kinect library which supplies a depth map. I was wondering how I could take that and use it to create a 2D skeleton, if possible. Not looking for any code here, just a general process I could use to achieve those results.
Also, given that we've seen this in several of the Kinect games so far, would it be difficult to have multiple skeletons running at once?
Disclaimer: the reason you haven't gotten an answer to this question yet is probably that it's a current research problem. So I can't give you a direct answer, but I will try to help with some information and useful resources on the topic.
There are mainly 2 different approaches to create a skeleton from a depth map. The first one is to use machine learning, the second is purely algorithmic.
For the machine learning one, you'd need many samples of people doing a predetermined move, and you'd use those samples to train your favorite learning algorithm. That's the approach that was taken and implemented by Microsoft in the Xbox (source). It works really well BUT you need millions of samples to make it reliable... quite a drawback.
The "algorithmic" approach (that is, one that doesn't use a training set) can be done in many different ways and is a research problem. It's often based on modeling the possible body postures and trying to match them against the depth image received. That's the approach that was chosen by PrimeSense (the guys behind the Kinect depth camera technology) for their skeleton tracking tool NITE.
The OpenKinect community maintains a wiki where they list some interesting research material about this topic. You might also be interested in this thread on the OpenNI mailing list.
If you're looking for an implementation of a skeleton tracking tool, PrimeSense released NITE (closed source), the one they made: it's part of the OpenNI framework. That's what's used in most of the videos you might have seen that involve skeleton tracking. I think it's able to handle up to 2 skeletons at the same time, but that requires confirmation.
The best solution is to use FAAST (http://projects.ict.usc.edu/mxr/faast/) which requires OpenNI. I have struggled to get OpenNI to work on my computer. I have not seen an approach yet using Code Laboratories' CL NUI.
An algorithmic approach is http://code.google.com/p/skeletonization/ but you may have a problem because your depth map only represents surfaces, not closed objects.