I want to label all of the muscles on an athlete's body. I have a lot of images in which the athletes are in almost the same pose, but drawing a box around each muscle is inaccurate because the box ends up overlapping other muscles. Drawing exact outlines is difficult because there are a lot of smaller muscles, and the outlines come out inconsistent across 20-30 images. Is there a way to feed in a human anatomy reference and then have TensorFlow label all of the muscles in the given pictures?
Or do you have a different idea for how to approach this problem?
I don't have anybody else to ask, and I've been researching this for a while, so if I missed or overlooked something, please forgive me.
The way I see it, you need to add some preprocessing steps to normalize the target object in the image, such as:
identify the human,
identify the pose or skeleton (there are many open-source options nowadays, such as openpose-plus),
use the pose estimation results to label the limbs or body parts, which you can then process further with hand-crafted image processing or another segmentation model.
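As a rough illustration of the last step, here is a minimal Python sketch that turns two pose keypoints into a rectangle aligned with the limb, so the region follows the limb instead of being an axis-aligned box. The keypoint names, the coordinates, and the `width_ratio` parameter are assumptions made up for this example; they are not the output format of openpose-plus or any other library.

```python
import math

def limb_box(p_start, p_end, width_ratio=0.5):
    """Return the four corners of a rectangle aligned with the limb segment.

    width_ratio is a hypothetical tuning knob: region width as a fraction
    of the limb length.
    """
    (x1, y1), (x2, y2) = p_start, p_end
    dx, dy = x2 - x1, y2 - y1
    length = math.hypot(dx, dy)
    if length == 0:
        raise ValueError("keypoints coincide; limb direction is undefined")
    # Normal to the limb axis, scaled to half of the region width.
    half = 0.5 * width_ratio * length
    nx, ny = -dy / length * half, dx / length * half
    return [
        (x1 + nx, y1 + ny),
        (x2 + nx, y2 + ny),
        (x2 - nx, y2 - ny),
        (x1 - nx, y1 - ny),
    ]

# Hypothetical keypoints in pixel coordinates, such as a pose estimator might give.
keypoints = {"right_shoulder": (120.0, 80.0), "right_elbow": (120.0, 180.0)}
upper_arm = limb_box(keypoints["right_shoulder"], keypoints["right_elbow"])
print(upper_arm)
```

Each such region could then be cropped and handed to a segmentation model or to hand-crafted processing, which keeps neighbouring muscles out of the crop better than a plain bounding box would.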
I have a task: to determine the sound source location.
I have some experience working with TensorFlow, making predictions from some simple features and datasets. I assume that for this task it would be necessary to analyze the sound frequencies, and probably other related data, in the training and prediction steps. The sound comes from a headset, so the human ear is able to detect the direction.
1) Has anybody already done this? (Unfortunately, I couldn't find any similar project.)
2) What caveats might I run into while trying to achieve this?
3) Can I do this with this technology? Are there any other sound processing frameworks / technologies / open source projects that could help me?
I am asking here because my research on Google, GitHub, and Stack Overflow didn't show me any relevant results on this specific topic, so any help is highly appreciated!
This is typically done with more traditional DSP and multiple sensors. You might want to look into time difference of arrival (TDOA) and direction of arrival (DOA). Algorithms such as GCC-PHAT and MUSIC will be helpful.
Issues that you might encounter: DOA accuracy is a function of the direct-to-reverberant ratio of the source, i.e. the more reverberant the environment, the harder it is to determine the source location.
Also, you might want to consider the number of location dimensions you want to resolve. A point in 3D space is much more difficult to resolve than a direction relative to the sensors.
Using ML as an approach to this is not entirely without merit, but you will have to consider what it is you would be learning, i.e. you probably don't want to learn the test room's reverberant properties but instead the sensors' spatial properties.
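To make the TDOA idea concrete, here is a minimal Python sketch for two microphones. It uses plain time-domain cross-correlation, which is a simpler relative of GCC-PHAT (GCC-PHAT additionally whitens the cross-spectrum in the frequency domain), so treat it as an illustration rather than the recommended algorithm. The signal, the sample rate, the microphone spacing, and the far-field geometry are all assumptions chosen for the example.

```python
import math

def estimate_delay(sig_a, sig_b, max_lag):
    """Return the lag in samples at which sig_b best matches sig_a.

    A positive result means the sound reached microphone B later.
    """
    best_lag, best_score = 0, float("-inf")
    for lag in range(-max_lag, max_lag + 1):
        score = 0.0
        for i, a in enumerate(sig_a):
            j = i + lag
            if 0 <= j < len(sig_b):
                score += a * sig_b[j]
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag

def direction_of_arrival(delay_samples, sample_rate, mic_spacing,
                         speed_of_sound=343.0):
    """Convert a delay into an angle in degrees from broadside (far-field assumption)."""
    ratio = delay_samples / sample_rate * speed_of_sound / mic_spacing
    ratio = max(-1.0, min(1.0, ratio))  # guard against noise pushing |ratio| above 1
    return math.degrees(math.asin(ratio))

# Synthetic example: a short decaying burst that reaches mic B 3 samples after mic A.
burst = [math.sin(0.9 * k) * math.exp(-0.05 * k) for k in range(64)]
mic_a = burst + [0.0] * 8
mic_b = [0.0] * 3 + burst + [0.0] * 5

delay = estimate_delay(mic_a, mic_b, max_lag=8)
angle = direction_of_arrival(delay, sample_rate=16000, mic_spacing=0.2)
print(delay, angle)
```

With real recordings the correlation peak gets smeared by reverberation, which is exactly the direct-to-reverberant issue described above; that is one reason the phase-transform weighting in GCC-PHAT is usually preferred over the plain correlation shown here.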
I'm trying to do the training process, but I don't even understand how to start. I would like to train it to read numbers. My images are from the real world, so the recognition didn't go very well.
It says that I need a ".tif" image with the examples. Is that a single image for every number (in this case), or one image with a lot of different kinds of numbers (same font, though)?
And what about the makebox? The command didn't work here.
https://code.google.com/p/tesseract-ocr/wiki/TrainingTesseract3
Could someone explain it better, or at least explain how to start?
I saw a few programs that do this more quickly, and I tried one (SunnyPage 1.8), but it isn't free. Does anyone know of any free software that does this? Or a good tutorial?
Using Tesseract 3, Windows 8 (32-bit).
It is important to patiently follow the training wiki on the Google Code project site, multiple times if needed. It is an open source library and is constantly evolving.
You will have to create a training image (TIFF) with a lot of different types of numbers; it should probably contain all the numbers you wish the engine to recognize.
Please consider posting the exact error message you got with makebox.
I think Tesseract is the best free solution available. You have to keep working at it and seek help from the community.
There is a very good post from Cédric here explaining the training process for Tesseract.
A good free OCR program is PDF OCR X, which is also based on Tesseract. I tried it on my German notes, which I had scanned at 1200 dpi, and the results were commendable but not perfect. I found that this website - http://onlineocr.net - is a lot more accurate. If you are not registered, it allows a maximum file size of 4 MB in most image formats (BMP, PNG, JPEG etc.) and PDF. It can output the result as a Word file, an Excel file or a TXT file.
Hope this helps.
In a few months I've got a game to play, but I want to cheat a bit :P by using some webcams.
The game is simple: you drive around in your car and try to spot images that were taken of the roadside. I was thinking that if I use some webcams, I could scan the surroundings for these photos. I will probably still need to look for them myself, but it would help me a lot.
I was thinking maybe I should use OpenCV with feature detection, but is this the smartest approach?
Rather than the smartest approach, you should look for the most robust one.
Some time ago I worked with OpenCV + SIFT, which tolerates rotation and translation of the object you're looking for and can search progressively by adding features to the checks.
It may be a good idea, and you can find the OpenCV implementation here
There are lots of non-image-based CAPTCHA ideas floating around. But what about the old-fashioned way?
What are the elements of a good image CAPTCHA? What visual elements are hard for computers, but easier for humans? What about mistakes, elements that are easier for computers than they are for humans? What are good techniques for increasing the speed of a CAPTCHA generator?
Here's an example of a CAPTCHA I've been working on. It generates the functions for two sine waves, then stretches a text between them. It lays that over a background drawn from a pool of images.
How could this be improved? (Specifically, I'm using PHP GD.) Things that come to mind are:
Change the color of the text, possibly making it multicolored.
Add "scratches" or marks that mildly obscure the text.
Add to the distortion so that it's affected by sine waves horizontally as well.
What goes into a superb image CAPTCHA?
Edit:
I know that there are some very worthy third-party CAPTCHA resources. I'm looking for attributes that make them good. I'd like to use my own CAPTCHAs, just for the purpose of self-improvement. So, you can talk about reCAPTCHA, but it's not exactly what I'm looking for.
Also, it has been brought up that not only the image, but also the experience matters, so feel free to comment on that.
Make each letter/number out of a pattern, i.e. unconnected dots. That means the computer has no way of knowing that a dot is part of a letter other than pattern recognition (which they don't have yet). Then add the usual distortions and random lines.
How you do this is the challenge.
EDIT: Also, bonus points for patterns of different shapes, and try alpha transparency on the characters (on the edges or the whole character), so they merge with the background.
Make letters difficult to separate. Use a handwriting-like font or add lines that join letters. Decrease and randomize the spacing between letters.
Add wave distortion in the other axis too. Distortion in one axis only can be relatively easily analyzed and reversed.
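A two-axis wave distortion can be sketched in a few lines of Python. This is only an illustration on a plain list-of-lists image, not PHP GD code; the amplitudes, periods, and the test image are assumed values. Each output pixel is shifted horizontally by a sine wave that depends on its row, and vertically by a sine wave that depends on its column.

```python
import math

def wave_distort(image, amp_x=2.0, amp_y=2.0, period_x=12.0, period_y=9.0,
                 background=0):
    """Warp a 2D image (list of rows) with sine waves along both axes."""
    height, width = len(image), len(image[0])
    out = [[background] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            # Inverse mapping: for every output pixel, look up its source pixel.
            src_x = x + round(amp_x * math.sin(2 * math.pi * y / period_x))
            src_y = y + round(amp_y * math.sin(2 * math.pi * x / period_y))
            if 0 <= src_x < width and 0 <= src_y < height:
                out[y][x] = image[src_y][src_x]
    return out

# Test image: a horizontal bar two pixels thick on a 12x24 canvas.
bar = [[1 if y in (5, 6) else 0 for x in range(24)] for y in range(12)]
warped = wave_distort(bar)
for row in warped:
    print("".join("#" if px else "." for px in row))
```

Randomizing the amplitudes and periods per image (and adding a random phase) would make the warp harder to estimate and undo than a fixed one.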
Don't bother with color background at all. It's super-easy to automatically filter black from other colors. Your background hinders only humans.
Don't add scratches or other noise unless it has the same thickness as letters. Noise-removal algorithms can easily remove things that are thinner than letters.
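To see why thin noise is easy to strip, here is a small Python sketch of a morphological opening (erosion followed by dilation) on a binary image. The image is a made-up example: a 5x5 block stands in for a thick letter stroke and a one-pixel line stands in for a scratch, kept apart from the block for clarity. This is one common noise-removal technique, not necessarily what any particular CAPTCHA breaker uses.

```python
# Cross-shaped neighbourhood: the pixel itself plus its 4 direct neighbours.
CROSS = [(0, 0), (-1, 0), (1, 0), (0, -1), (0, 1)]

def _get(img, y, x):
    """Read a pixel, treating everything outside the image as background."""
    return img[y][x] if 0 <= y < len(img) and 0 <= x < len(img[0]) else 0

def erode(img):
    """Keep a pixel only if it and all 4 neighbours are set."""
    return [[int(all(_get(img, y + dy, x + dx) for dy, dx in CROSS))
             for x in range(len(img[0]))] for y in range(len(img))]

def dilate(img):
    """Set a pixel if it or any of its 4 neighbours is set."""
    return [[int(any(_get(img, y + dy, x + dx) for dy, dx in CROSS))
             for x in range(len(img[0]))] for y in range(len(img))]

def opening(img):
    """Erosion followed by dilation: removes features thinner than the neighbourhood."""
    return dilate(erode(img))

# 10x16 image: thick block in rows 1-5, cols 2-6; thin scratch along row 8.
image = [[0] * 16 for _ in range(10)]
for y in range(1, 6):
    for x in range(2, 7):
        image[y][x] = 1
image[8] = [1] * 16

cleaned = opening(image)
for row in cleaned:
    print("".join("#" if px else "." for px in row))
```

The scratch disappears while the thick stroke survives (minus its corners), which is the reason noise needs roughly the same thickness as the letters to be of any use.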
What if the color of the letters faded into other colors... for instance the 5 can start off as yellow on top and fade into blue or something. The colors chosen should be random.
With the multicolored background, it might be hard for the computer to pick up where the background ends and the character begins, and hopefully it would not be too difficult for a human to actually pick up the pattern.
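The fading idea boils down to linear interpolation between two random colours. A minimal Python sketch, assuming a vertical fade and one RGB colour per pixel row of the glyph (the function name and the seed are made up for the example):

```python
import random

def vertical_gradient(height, top, bottom):
    """Return one RGB colour per pixel row, blending linearly from top to bottom."""
    if height < 1:
        raise ValueError("height must be at least 1")
    if height == 1:
        return [tuple(top)]
    return [
        tuple(round(t + (b - t) * row / (height - 1)) for t, b in zip(top, bottom))
        for row in range(height)
    ]

# Pick two random colours; a seeded generator keeps the example repeatable.
rng = random.Random(7)
top = tuple(rng.randrange(256) for _ in range(3))
bottom = tuple(rng.randrange(256) for _ in range(3))
colours = vertical_gradient(20, top, bottom)
print(colours[0], colours[-1])
```

When drawing the character, each row of glyph pixels would be painted with the matching entry of `colours`.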
Instead of generating CAPTCHAs, you can create a CAPTCHA table in your database and fill it yourself by searching Google for good CAPTCHA images.
So no need to worry "Will this generation method work?"
I really hate CAPTCHAs on sites; they just annoy me. But if you want to try to make a robust one, try the following:
Ability to get a new image without submitting
Spoken version for the visually impaired
Non-uniform characters
I've used reCAPTCHA on a few sites; it's a nice and robust solution.
Or if you want to be really funky about it check out this: http://research.microsoft.com/asirra/
Algorithms that try to break CAPTCHAs are pattern matchers that work in a few different ways: scaling and skewing the symbols that they already know about, finding and tracing edges, and counting interior holes. If you can break the letters up into pieces, vary the letter quality, or add strong lines or "scratches" along the letters, these measures will help. However, all of this is fairly moot considering we have reCAPTCHA for this purpose, and it's a wonderful third-party service. Additionally, a CAPTCHA will help the security of your site, but it will not stop those who are truly determined.
I like the idea of KittenAuth and Microsoft's Asirra project. The idea is that, while OCR will eventually evolve to break your traditional captcha, the ability to distinguish a kitten from a dog is many orders of magnitude more complex a problem, while absolutely trivial for humans.
This solution, while probably the sexiest captcha idea ever, has the limitation of not being easily portable to hearing-impaired methods.
What about shearing and shuffling bands to mangle display and mouse-only input?
Start by taking your sine-wave morphed text, then divide it into horizontal bands or maybe even a grid.
That makes optical recognition harder and might allow you to avoid the kind of nasty background games that make some captchas hard for humans.
For a site where you can rely on local drag in the browser, instead of typing in an entry use shuffling requiring the user to re-order pieces (just in sloppy order, not like one of those puzzles). Or, if you wanted to use clicks alone, the classic sliding tile puzzle.
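The band idea can be sketched in Python as follows. This is only an illustration on a list-of-rows image: the function names, the band height, and the seed are assumptions, and the drag-and-drop part in the browser is not shown. The server would keep `order` secret and compare it with the order the user produces.

```python
import random

def shuffle_bands(image, band_height, seed):
    """Cut the image into horizontal bands and return them in shuffled order."""
    if len(image) % band_height != 0:
        raise ValueError("image height must be a multiple of band_height")
    bands = [image[i:i + band_height] for i in range(0, len(image), band_height)]
    order = list(range(len(bands)))
    random.Random(seed).shuffle(order)
    shuffled = [row for idx in order for row in bands[idx]]
    return shuffled, order

def restore_bands(shuffled, band_height, order):
    """Undo shuffle_bands given the secret band order."""
    bands = [shuffled[i:i + band_height] for i in range(0, len(shuffled), band_height)]
    restored = [None] * len(bands)
    for position, idx in enumerate(order):
        restored[idx] = bands[position]
    return [row for band in restored for row in band]

# Toy 8-row image in which every row is labelled with its own index.
image = [[y] * 6 for y in range(8)]
shuffled, order = shuffle_bands(image, band_height=2, seed=42)
print(order)
```

A grid variant would apply the same cut-and-permute step to columns as well.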
Note, I've run into a captcha where you had to identify which of N cartoons had an animal in them which succeeded in blocking me!
Wellington Grey sums up the AI CAPTCHA race nicely.
You could add a random array of fonts so that GD renders each character using a different one.
Be wary of suggestions of reCAPTCHA. I have submitted incorrect input to it a few dozen times and have had success each time. Several of those times I submitted incorrect input for both words rather than just the most obscured word; the success rate, as I said, has been 100%.
I also think that image-based CAPTCHAs are user-hostile and should be avoided wherever possible. The advantage of text-based solutions is that you can tailor them to your site's audience, adding a level of obscurity that may trip up machines as they become more savvy with text-based solutions.
At the very least, don't use this all the time:
(source: codinghorror.com)