Text recognition with tensorfow - tensorflow

I'm new to tensorflow and played around with the hand written numbers MNIST set.
I'd like to do my own project that recognises text instead of numbers but can't find a good tutorial.
Is it the same principle as numbers but instead of 10 layers at the end I have to use 26? Or include upper and lowercase and special characters?
If so I'd have to first crop the words into each character, right? Or is there a way to recognise entire sentences?
I'd like to train three different fonts, so no handwriting, and don't care about upper or lower case.
Later I'd like to use the trained model on photographs. A printed article for example. Does the model work if I align the image, do I have to retrain for a little bit or train it from the start with the new data?
Where do I start? The Keras example is overwhelming.

You're looking for an OCR model, a simple CNN can't detect text from scanned images, you need to segment them first which can be completed based on the language script.
You can start with tesseract. There is a python wrapper named pytesseract.
import pytesseract
from PIL import Image
text = pytesseract.image_to_string(Image.open("temp.jpg"), lang='eng',
config='--psm 10 --oem 3 -c tessedit_char_whitelist=0123456789')
print(text)
For your own model, try CRNN models. https://github.com/qjadud1994/CRNN-Keras

Related

OCR input format for tensorflow model

I am trying to build a ocr model using tensorflow 2.0. So I want to start with CNN. I know that I can give image in array format to the model. But I am not able to understand how can I give input format for my labels for training the model. Label length is 10 and it can contain any 4 alphabets and 6 numbers. I was looking for the solution on google and found out https://github.com/JackonYang/captcha-tensorflow/blob/master/captcha-solver-tf2-4digits-AlexNet-98.8.ipynb. But here they only have digits in label but my label has characters also. Can anybody please throw some light on this?

How to makeup FSNS dataset with my own image for attention OCR tensorflow model

I want to apply attention-ocr to detect all digits on number board of cars.
I've read your README.md of attention_ocr on github(https://github.com/tensorflow/models/tree/master/research/attention_ocr), and also the way I should do to use my own image data to train model with the StackOverFlow page.(https://stackoverflow.com/a/44461910/743658)
However, I didn't get any information of how to store annotation or label of the picture, or the format of this problem.
For object detection model, I was able to make my dataset with LabelImg and converting this into csv file, and finally make .tfrecord file.
I want to make .tfrecord file on FSNS dataset format.
Can you give me your advice to go on this training steps?
Please reread the mentioned answer it has a section explaining how to store the annotation. It is stored in the three features image/text, image/class and image/unpadded_class. The image/text field is used for visualization, some models support unpadded sequences and use image/unpadded_class, while the default version relies on the text padded with null characters to have the same length stored in the feature image/class. Here is the excerpt to store the text annotation:
char_ids_padded, char_ids_unpadded = encode_utf8_string(
text, charset, length, null_char_id)
example = tf.train.Example(features=tf.train.Features(
feature={
'image/class': _int64_feature(char_ids_padded),
'image/unpadded_class': _int64_feature(char_ids_unpadded),
'image/text': _bytes_feature(text)
...
}
))
If you have worked with tensorflow object detection, then the apporach should be much easier for you.
You can create the annotation file (in .csv format) using labelImg or any other annotation tool.
However, before converting it into tensorflow format (.tfrecord), you should keep in mind the annotation format. (FSNS format in this case)
The format is : files text xmin ymin xmax ymax
So while annotating dont bother much about the class (as you would have done in object detection !! Some random name should suffice.)
Convert it into .tfrecords.
And finally labelMap is a list of characters which you have annotated.
Hope it helps !

How to train a reverse embedding, like vec2word?

how do you train a neural network to map from a vector representation, to one hot vectors? The example I'm interested in is where the vector representation is the output of a word2vec embedding, and I'd like to map onto the the individual words which were in the language used to train the embedding, so I guess this is vec2word?
In a bit more detail; if I understand correctly, a cluster of points in embedded space represents similar words. Thus if you sample from points in that cluster, and use it as the input to vec2word, the output should be a mapping to similar individual words?
I guess I could do something similar to an encoder-decoder, but does it have to be that complicated/use so many parameters?
There's this TensorFlow tutorial, how to train word2vec, but I can't find any help to do the reverse? I'm happy to do it using any deeplearning library, and it's OK to do it using sampling/probabilistic.
Thanks a lot for your help, Ajay.
One easiest thing that you can do is to use the nearest neighbor word. Given a query feature of an unknown word fq, and a reference feature set of known words R={fr}, then you can find out what is the nearest fr* for fq, and use the corresponding fr* word as fq's word.

how to find similar words for a certain word in tensorflow_word2vec like using model.most_similar in gensim?

I've using tensorflow to build word2vec model,reference here:https://github.com/tensorflow/tensorflow/blob/master/tensorflow/examples/tutorials/word2vec/word2vec_basic.py#L118
my question is that, how can i find top n similar words for a certain word.I know in gensim, I can save and load word2vec model,and then use model.most_similar to find what I want.but how in tensorflow and even more is there any way to save model in tensorflow since i find what i get is only an embedding vector,is that right?
I think as long as you have computed the weight vector for each token, then you can manipulate all the tokens in the vector space. You can simply calculate the cosine similarity between each vector and then sort by score. For your reference, you can look at the source code of most_similar method implemented in gensim word2vec model. Hope this helps.

Neural Network in VB.NET

I want to implement a very simple neural net in VB.NET.
I have array of integer, that is actually unmapped from black and white bitmap - we can assume every integer is 1 or 0. There is 1 spot painted on the bitmap, nothing else. I want to create and train the neural network to tell me if it's more like:
1-circle
2-square
3-horizontal line
4-vertical line
5-horizontal ellipse
6-vertical ellipse
7-horizontal and vertical elipses merged
8-vertical and horizontal elipses merged
9-2horizontal elipses merged
And the shapes are really ugly, but a person can clearly make this decision, so I think a NN can be easily trained to do that. But I'm new to neural networks, and I have no idea how to approach the problem, for example different spots have various sizes so I don't know how to get fixed number of inbuts to feed in my NN - resizing is not an option
One option would be to use an existing Neural Net implementation, I'd recommend checking out the Accord.Net libary and the corresponding Accord.Neuro namespace.
Neural nets have been used successfully for handwriting recognition, which may be close to your problem. Check out the article Classifying Digits with Deep Belief Nets which includes a sample using Accord.Net.
Note: you would need hand labelled set of data to train the neural net with.