Detecting images using tensorflow - tensorflow

I am working on a deep learning program and trying to build a convolutional neural network for image recognition. My problem is: how can we identify, in the input pictures, the different patterns that we are trying to extract?
Thanks

Related

Bald detection using Keras

I was wondering if anyone can help by providing me with some guidelines for creating a bald-or-not image classifier.
So far I have a model for face and eye detection. To sum up, these are my main questions:
Where can I find datasets for this kind of classification without going to Google and downloading thousands of images by hand?
What classification model (i.e. the structure of layers in the network) should be used for this?
Question 1:
You could start by looking at the datasets available on Kaggle or in TensorFlow Datasets to see if anything suitable exists.
If not, you could use an image scraper tool, which downloads images much faster than doing it by hand.
Question 2:
An image classification model typically uses convolutional layers and max-pooling layers, in addition to the dense layers commonly used in a multi-layer perceptron.
To get started, you can study the TensorFlow tutorial for image classification,
which classifies whether an image shows a cat or a dog.
This example gives you the general idea of how to build an image classifier.
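As a rough illustration of the layer structure described above, here is a minimal Keras sketch of a binary (bald / not bald) classifier. The input size of 128x128 RGB, the filter counts, and the function name `build_classifier` are assumptions chosen for the example, not values taken from the tutorial.

```python
import tensorflow as tf
from tensorflow.keras import layers


def build_classifier(input_shape=(128, 128, 3)):
    """Small CNN: conv + max-pooling blocks, then dense layers (sizes are assumptions)."""
    model = tf.keras.Sequential([
        tf.keras.Input(shape=input_shape),
        layers.Conv2D(16, 3, activation="relu"),
        layers.MaxPooling2D(),
        layers.Conv2D(32, 3, activation="relu"),
        layers.MaxPooling2D(),
        layers.Flatten(),
        layers.Dense(64, activation="relu"),
        # One sigmoid unit: probability of the positive class.
        layers.Dense(1, activation="sigmoid"),
    ])
    model.compile(optimizer="adam",
                  loss="binary_crossentropy",
                  metrics=["accuracy"])
    return model


model = build_classifier()
model.summary()
```

Training would then be a call to `model.fit` on labeled face crops; the face detector you already have could be used to produce those crops.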
Hope this helps you. Thanks

Fixing error output from seq2seq model

I want to ask how we can effectively re-train a trained seq2seq model to remove or mitigate a specific observed error in its output. I'm going to give an example from speech synthesis, but ideas from other domains that use seq2seq models, such as machine translation and speech recognition, will be appreciated.
I learned the basics of the seq2seq-with-attention model, especially as used for speech synthesis in systems such as Tacotron-2.
Using a distributed, well-trained model showed me how naturally a computer can speak with a seq2seq (end-to-end) model (you can listen to some audio samples here). Still, the model fails to read some words properly; e.g., it misreads "obey [əˈbā]" in multiple ways, such as [əˈbī] and [əˈbē].
The reason is obvious: the word "obey" appears too rarely in our dataset (LJ Speech), only three times out of 225,715 words, and the model had no luck learning it.
So, how can we re-train the model to overcome the error? Adding extra audio clips containing the "obey" pronunciation sounds impractical, but reusing the three existing audio clips risks overfitting. Also, since we are assuming a well-trained model, "simply training more" is not an effective solution.
This is one of the drawbacks of seq2seq models that is not talked about much. The model successfully simplified the pipelines of traditional systems; e.g., for speech synthesis, it replaced an acoustic model, a text analysis frontend, etc. with a single neural network. But we lost controllability of the model entirely. It is impossible to make the system read a word in a specific way.
Again, if you use a seq2seq model in any field and get an undesirable output, how do you fix it? Is there a data-science workaround for this problem, or perhaps a cutting-edge neural network mechanism that gives more controllability in seq2seq models?
Thanks.
I found an answer to my own question in Section 3.2 of the paper (Deep Voice 3).
They trained both a phoneme-based model and a character-based model, mainly using phoneme inputs; the character-based model is used only for words that cannot be converted to their phoneme representations.
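To make the fallback idea concrete, here is a small sketch of the input side only: each word is looked up in a pronunciation lexicon and passed on as phonemes when found, otherwise as characters. The tiny `LEXICON` dictionary, its ARPAbet-style entries, and the function name `to_model_input` are illustrative assumptions, not the paper's actual preprocessing code.

```python
# Hypothetical pronunciation lexicon (ARPAbet-style symbols, for illustration only).
LEXICON = {
    "obey": ["OW0", "B", "EY1"],
    "the": ["DH", "AH0"],
}


def to_model_input(text, lexicon=LEXICON):
    """Return one (kind, symbols) pair per word: phonemes if known, else characters."""
    tokens = []
    for word in text.lower().split():
        if word in lexicon:
            tokens.append(("phoneme", lexicon[word]))
        else:
            # Word cannot be converted to phonemes: fall back to its characters.
            tokens.append(("char", list(word)))
    return tokens


print(to_model_input("obey the zzyzx"))
```

With this kind of frontend, a mispronounced word such as "obey" could in principle be corrected by editing its lexicon entry rather than by re-training on more audio.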

TensorFlow and person recognition in video stream

I'd like to make an app that recognizes people in a video stream using TensorFlow or Keras.
What kind of neural network can I use: a CNN or an RNN? Should I analyze frames one by one, or the video stream as a whole? Are there any good sources to learn from?
This is a very broad question, and a hard task.
I think the easy way is to extract one frame per second if the source is a video stream.
Then use OpenCV to run face detection.
Once you have the faces, feed them to a neural network for recognition.
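The first step, keeping roughly one frame per second, can be sketched independently of any video library. The helper name `sample_one_per_second` is hypothetical, and the frames here are stand-in integers; in practice they would be images decoded from the stream (for example with OpenCV), and each kept frame would then go to the face detector.

```python
def sample_one_per_second(frames, fps):
    """Yield (index, frame) for roughly one frame per second of video."""
    step = max(1, round(fps))  # number of frames that make up about one second
    for index, frame in enumerate(frames):
        if index % step == 0:
            yield index, frame


# Stand-in for decoded frames: 100 frame numbers from a 30 fps stream.
kept = [index for index, _ in sample_one_per_second(range(100), fps=30)]
print(kept)  # [0, 30, 60, 90]
```

Because the function is a generator over any iterable, it also works on a live stream without loading all frames into memory.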
Some links for face recognition in Deep Learning:
https://aboveintelligent.com/face-recognition-with-keras-and-opencv-2baf2a83b799
https://github.com/rajathkumarmp/FaceRecog-Keras/blob/master/faceRecog.ipynb

Deep learning for shape localization and recognition

There is a set of images, each of which contains different shape entities, such as shown in the following figure. I am trying to localize and recognize these different shapes. For instance, adding a bounding box for each different shape and maybe even label it. What are the major research papers/deep learning models that have been able to solve this kind of problem?
Object detection papers such as R-CNN, Faster R-CNN, YOLO and SSD would help you solve this if you are set on using a deep learning approach.
It’s easy to say this is a trivial problem that can be solved with tools in OpenCV and deep learning is overkill, but I can see many reasons to use deep learning tools and that does not answer your question.
We assume that your shapes have different scales and rotations. The main image shown above is very large for training, and a model trained on images of that size would need a lot of training samples to reach good accuracy on test samples. In this case it is better to train a convolutional neural network on small images (like 128x128) with only one shape per image, and then use the slide trick (a sliding window).
This project will have three main steps:
Generate test and train samples; each image should contain only one shape.
Train a classifier to recognize a single shape within each input image.
Use the slide trick: break your original image containing many shapes into overlapping blocks of size 128x128, and pass each block to the model trained in the second step.
This way you end up with a label for each shape from your trained model, and the slide trick also gives you the location of each shape.
For the classifier, you can use exactly the CNN structure from TensorFlow's MNIST tutorial.
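The third step (the "slide trick", i.e. a sliding window) can be sketched as follows. The stride of 64, the `"none"` background label, and the function names are assumptions for the example, and `classify` stands for whatever single-shape classifier was trained in step two. Images are plain nested lists here so the sketch has no dependencies.

```python
def sliding_windows(height, width, size=128, stride=64):
    """Yield (top, left) corners of overlapping size x size blocks.

    Note: edges are not covered if (height - size) or (width - size)
    is not a multiple of the stride.
    """
    for top in range(0, max(height - size, 0) + 1, stride):
        for left in range(0, max(width - size, 0) + 1, stride):
            yield top, left


def localize(image, classify, size=128, stride=64, background="none"):
    """Classify every block; return (label, x, y, width, height) for non-background blocks."""
    detections = []
    height, width = len(image), len(image[0])
    for top, left in sliding_windows(height, width, size, stride):
        block = [row[left:left + size] for row in image[top:top + size]]
        label = classify(block)  # hypothetical single-shape classifier
        if label != background:
            detections.append((label, left, top, size, size))
    return detections


# Toy usage: a 4x4 "image" with one bright pixel and a trivial stand-in classifier.
toy = [[0, 0, 0, 0],
       [0, 0, 0, 0],
       [0, 0, 0, 1],
       [0, 0, 0, 0]]
is_dot = lambda block: "dot" if any(any(row) for row in block) else "none"
print(localize(toy, is_dot, size=2, stride=2))  # [('dot', 2, 2, 2, 2)]
```

Since the blocks overlap, one shape may be reported by several neighbouring windows; merging those overlapping boxes is a separate step not shown here.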
Here is a paper that applies exactly the same method to fingerprint images to extract local features:
A direct fingerprint minutiae extraction approach based on convolutional neural networks

Can we use YOLO to detect and recognize text in an image

Currently I am using a deep learning model called "YOLOv2" for object detection, and I want to use it to extract text and save it to disk, but I don't know how to do that. If anyone knows more about this, please advise me.
I use TensorFlow.
Thanks
If you use the pretrained model, you would need to save its outputs (the detected regions) and feed the corresponding image crops into a character recognition network, if you want to use a neural net, or into another recognition approach.
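A minimal sketch of that two-stage pipeline, assuming the detector returns boxes as `(x, y, width, height)` tuples: crop each detected region, pass it to a recognizer, and write the resulting strings to disk. The function names and the `recognize` callable are hypothetical placeholders; neither is part of YOLO or TensorFlow.

```python
def crop_boxes(image, boxes):
    """Cut each (x, y, width, height) box out of an image stored as a list of rows."""
    return [[row[x:x + w] for row in image[y:y + h]] for (x, y, w, h) in boxes]


def recognize_regions(image, boxes, recognize):
    """Run a (hypothetical) character recognizer on every detected region."""
    return [recognize(crop) for crop in crop_boxes(image, boxes)]


def save_lines(lines, path):
    """Write one recognized string per line to a text file."""
    with open(path, "w", encoding="utf-8") as handle:
        handle.write("\n".join(lines))


# Toy usage: rows of characters stand in for pixels, joining them stands in for OCR.
toy_image = ["abcd", "efgh", "ijkl"]
toy_boxes = [(1, 0, 2, 2)]
texts = recognize_regions(toy_image, toy_boxes, lambda crop: "".join(crop))
print(texts)  # ['bcfg']
```

Calling `save_lines(texts, "output.txt")` would then store the recognized text; the file name is just an example.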
What you are doing is "scene text recognition". You can check out the Reading Text in the Wild with Convolutional Neural Networks paper; here's a demo and homepage. GitHub user chongyangtao has a whole list of resources on the topic.
I have a similar question, and I am building a digit detection model with the SVHN dataset. It is not a finished project yet, but it seems to work well. You can see the code at Yolo-digit-detector.