Trying to custom train MobilenetV2 with 40x40px images - wrong results after training - tensorflow

I need to classify small images into 4 different categories, plus a "background" class for false detections.
While training, the loss quickly drops to 0.7 but stays there even after 800k steps. In the end, the frozen graph seems to classify most images as background.
I'm probably missing something; I'll detail the steps I used below, and any feedback is welcome.
I'm new to tf-slim, so it could be an obvious mistake, or maybe too few samples? I'm not looking for top accuracy, just something that works for prototyping.
Source materials can be found here: https://www.dropbox.com/s/k55xoygdzb2efag/TilesDataset.zip?dl=0
I used tensorflow-gpu 1.15.3 on windows 10.
I created the dataset using:
python ./createTfRecords.py --tfrecord_filename=tilesV2_40 --dataset_dir=.\tilesV2\Tiles_40
I added a dataset provider in models-master\research\slim\datasets based on the flowers provider.
I modified mobilenet_v2.py in models-master\research\slim\nets\mobilenet, changing num_classes=5 and mobilenet.default_image_size = 40.
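For reference, here is a minimal sketch of what such a dataset provider might look like, modeled directly on datasets/flowers.py from slim; the module name tiles_40.py, the file pattern and the split sizes below are assumptions and need to be adjusted to match the TFRecords produced by createTfRecords.py:

# tiles_40.py -- sketch of a slim dataset provider, based on datasets/flowers.py
import os
import tensorflow as tf
from datasets import dataset_utils

slim = tf.contrib.slim

_FILE_PATTERN = 'tilesV2_40_%s_*.tfrecord'             # assumed from --tfrecord_filename
SPLITS_TO_SIZES = {'train': 4000, 'validation': 1000}  # placeholder sample counts
_NUM_CLASSES = 5                                       # 4 categories + background
_ITEMS_TO_DESCRIPTIONS = {
    'image': 'A color image of varying size.',
    'label': 'A single integer between 0 and 4',
}

def get_split(split_name, dataset_dir, file_pattern=None, reader=None):
  if split_name not in SPLITS_TO_SIZES:
    raise ValueError('split name %s was not recognized.' % split_name)
  if not file_pattern:
    file_pattern = _FILE_PATTERN
  file_pattern = os.path.join(dataset_dir, file_pattern % split_name)
  if reader is None:
    reader = tf.TFRecordReader
  keys_to_features = {
      'image/encoded': tf.FixedLenFeature((), tf.string, default_value=''),
      'image/format': tf.FixedLenFeature((), tf.string, default_value='png'),
      'image/class/label': tf.FixedLenFeature(
          [], tf.int64, default_value=tf.zeros([], dtype=tf.int64)),
  }
  items_to_handlers = {
      'image': slim.tfexample_decoder.Image(),
      'label': slim.tfexample_decoder.Tensor('image/class/label'),
  }
  decoder = slim.tfexample_decoder.TFExampleDecoder(keys_to_features, items_to_handlers)
  labels_to_names = None
  if dataset_utils.has_labels(dataset_dir):
    labels_to_names = dataset_utils.read_label_file(dataset_dir)
  return slim.dataset.Dataset(
      data_sources=file_pattern,
      reader=reader,
      decoder=decoder,
      num_samples=SPLITS_TO_SIZES[split_name],
      items_to_descriptions=_ITEMS_TO_DESCRIPTIONS,
      num_classes=_NUM_CLASSES,
      labels_to_names=labels_to_names)

The new module also has to be registered in datasets/dataset_factory.py (added to its datasets_map) under the exact name passed to --dataset_name, here Tiles_40.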
I trained the net with: python ./models-master/research/slim/train_image_classifier.py --model_name "mobilenet_v2" --learning_rate 0.045 --preprocessing_name "inception_v2" --label_smoothing 0.1 --moving_average_decay 0.9999 --batch_size 96 --learning_rate_decay_factor 0.98 --num_epochs_per_decay 2.5 --train_dir ./weight --dataset_name Tiles_40 --dataset_dir .\tilesV2\Tiles_40
When I run python .\models-master\research\slim\eval_image_classifier.py --alsologtostderr --checkpoint_path ./weight/model.ckpt-XXX --dataset_dir ./tilesV2/Tiles_40 --dataset_name Tiles_40 --dataset_split_name validation --model_name mobilenet_v2, I get eval/Recall_5[1] eval/Accuracy[1].
I then export the graph with python .\models-master\research\slim\export_inference_graph.py --alsologtostderr --model_name mobilenet_v2 --image_size 40 --output_file .\export\output.pb --dataset_name Tiles_40
And freeze it with freeze_graph --input_graph .\export\output.pb --input_checkpoint .\weight\model.ckpt-XXX --input_binary true --output_graph .\export\frozen.pb --output_node_names MobilenetV2/Predictions/Reshape_1
I then test the net on images from the dataset with python .\label_image.py --graph .\export\frozen.pb --labels .\tilesV2\Tiles_40\labels.txt --image .\tilesV2\Tiles_40\photos\lac\1_1.png --input_layer input --output_layer MobilenetV2/Predictions/Reshape_1. This is where I get wrong classifications,
like 0:background 0.92839915 2:lac 0.020171663 1:house 0.019106707 3:road 0.01677236 4:start 0.0155500565 for a "lac" image from the dataset.
I have tried changing the depth_multiplier, changing the learning rate, training on a CPU, and removing --preprocessing_name "inception_v2" from the training command. I'm out of ideas.

Change your learning rate; 0.045 is very high. Maybe start from the usual choice of 3e-5.
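For example, the training command from the question with only the learning rate lowered (every other flag left exactly as in the post; this is just an illustration, not a tuned configuration):
python ./models-master/research/slim/train_image_classifier.py --model_name "mobilenet_v2" --learning_rate 3e-5 --preprocessing_name "inception_v2" --label_smoothing 0.1 --moving_average_decay 0.9999 --batch_size 96 --learning_rate_decay_factor 0.98 --num_epochs_per_decay 2.5 --train_dir ./weight --dataset_name Tiles_40 --dataset_dir .\tilesV2\Tiles_40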

Related

How to Modify Data Augmentation in Darknet's YOLOv4

I built a digital scale reader using Darknet's YOLOv4-tiny. It keeps confusing 2's and 5's, which leads me to believe that I am doing some unwanted data augmentation during training. (The results are mostly correct, and glare could be a factor, but I am expecting better results.)
I have referenced this post:
Understanding darknet's yolo.cfg config files
and the darknet github:
https://github.com/AlexeyAB/darknet/wiki/CFG-Parameters-in-the-%5Bnet%5D-section
Below is a link to the yolov4-tiny.cfg that I modified for my model:
https://github.com/AlexeyAB/darknet/blob/master/cfg/yolov4-tiny.cfg
And a snippet from the link above:
[net]
# Testing
#batch=1
#subdivisions=1
# Training
batch=64
subdivisions=1
width=416
height=416
channels=3
momentum=0.9
decay=0.0005
angle=0
saturation = 1.5
exposure = 1.5
hue=.1
Am I correct that angle=0 means that there is no rotation?
Are there any other possible ways I might be augmenting my data that could cause an issue?
Edit: If I wanted, how could I eliminate all data augmentation?
Or do I just need more data (currently 2484 images for 10 digit classes)?
Horizontal flip is applied by default; add "flip=0" to disable it.
https://github.com/AlexeyAB/darknet
Here, angle, saturation, exposure, and hue are all part of data augmentation, so yes, angle=0 means no rotation is applied. You can switch these augmentations off by setting angle and hue to 0 and saturation and exposure to 1 (the latter two are multiplicative ranges, so 1 means no change). These values are hyperparameters, and changing them can make accuracy better or worse; I suggest keeping the values shipped with darknet, since they were chosen because they generally give good accuracy. Darknet uses these augmentations to generate extra training images, but adding more real (non-duplicated) images is always good, as it helps the model learn the necessary complexity during training.
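As a concrete example, a [net] section with the augmentation-related settings switched off might look like the sketch below (flip=0 comes from the other answer; setting saturation and exposure to 1 rather than 0 reflects my reading of them as multiplicative ranges, so treat those two values as an assumption):
[net]
batch=64
subdivisions=1
width=416
height=416
channels=3
momentum=0.9
decay=0.0005
# augmentation switched off
angle=0        # no rotation
flip=0         # no horizontal flip (flip is on by default)
saturation=1   # multiplicative range of 1 = no saturation jitter
exposure=1     # no exposure jitter
hue=0          # no hue shift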

Darkflow accurate on demo but not on code

I trained my own model with darkflow yolov2 for just one class, and the results are pretty good when running this on the terminal with a threshold configuration of 0.55
python3 flow --model cfg/yolov2-tiny-voc-1c.cfg --load 5250 --demo BARCELONA_WALK.mp4
but then I converted the checkpoint to .pb and .meta files to use in code,
and when I specify the threshold in the code like this:
options = {"model": "cfg/yolov2-tiny-voc-1c.cfg",
"pbload": "built_graph/yolov2-tiny-voc-1c.pb",
"metaload": "built_graph/yolov2-tiny-voc-1c.meta",
"threshold": 0.55,
"gpu": 0.9}
it detects nothing in my image samples, but when the threshold is 0.5 or lower it detects around 280 objects, about 190 of which have confidence greater than 0.5. So why doesn't the neural network behave the same way from code as it does when running the demo from the terminal, if I'm using the same weights and the same threshold?
SOLVED!!! In my options I had to put "pbLoad" and "metaLoad" instead of "pbload" and "metaload". Too bad it didn't throw any errors, but anyway, I realized it might be the uppercase letters after reading this post. I hope this helps someone in the future!
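For reference, the corrected options with the camel-cased keys, plus a minimal prediction sketch; the TFNet / return_predict usage follows the darkflow README, and the image path is just a placeholder:
from darkflow.net.build import TFNet
import cv2

options = {"model": "cfg/yolov2-tiny-voc-1c.cfg",
           "pbLoad": "built_graph/yolov2-tiny-voc-1c.pb",      # capital L, not "pbload"
           "metaLoad": "built_graph/yolov2-tiny-voc-1c.meta",  # capital L, not "metaload"
           "threshold": 0.55,
           "gpu": 0.9}

tfnet = TFNet(options)
img = cv2.imread("sample.jpg")        # placeholder image
results = tfnet.return_predict(img)   # list of dicts: label, confidence, topleft, bottomright
print(results)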

what's the pipeline to train tensorflow attention-ocr on customized dataset?

I've read some questions on Stack Overflow about attention-ocr, and most of them are about the implementation details of a specific step. What I want to know is the pipeline for fine-tuning this model on our own dataset.
As far as I know, the steps should be:
0) Should we first download the FSNS dataset? I tried to bypass this step and run inference on just one image, but it always gives me the error "ImportError: No module named 'fsns'". So I wonder if this error will go away once I set up my own dataset.
1) Store our data in the same format as FSNS. (Links on this topic: How to create dataset in the same format as the FSNS dataset?, how to create cutomized dataset for google tensorflow attention ocr? )
2) Download the pre-trained checkpoint(http://download.tensorflow.org/models/attention_ocr_2017_08_09.tar.gz)
3) Somehow modify the 'model.py' to fit your own purpose.
4) Somehow modify 'train.py' to train your own model using tensorflow serving.
I am still at the early stage of this project (creating my own dataset), and I'm confused about how to do that and what the next stage is.
The error was caused by an incorrect version of Python. The scripts should be run with Python 2, and you can change the import statement to resolve this error: change 'import fsns' to 'from datasets import fsns'.
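In code, that is the whole change (applied in whichever script does the import):
# before
import fsns
# after
from datasets import fsns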

Trying to convert nasnet tensorflow model to caffemodel. cell_stem_1/1x1 output is a little different

I'm trying to convert a nasnet tensorflow model to a caffemodel. The cell_stem_0 block's output is right, but at cell_stem_1/1x1 the output feature map of my caffemodel is a little different from the tensorflow model's:
in all 22 feature maps the border pixels are different, but the rest are right.
Will a 1x1 conv cause this difference? Is there any difference between the 1x1 conv in cell_stem_0 and the one in cell_stem_1 (in my caffemodel, the output of cell_stem_0/1x1 is right)?
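No answer was posted, but to pin down how different the outputs really are, a comparison along these lines can confirm that only the border pixels disagree; the file names are placeholders, and it assumes both feature maps were dumped as NumPy arrays with the same NCHW shape:
import numpy as np

tf_fm = np.load("tf_cell_stem_1_1x1.npy")        # placeholder dump from the tensorflow model
caffe_fm = np.load("caffe_cell_stem_1_1x1.npy")  # placeholder dump from the caffemodel
assert tf_fm.shape == caffe_fm.shape             # e.g. (N, C, H, W)

diff = np.abs(tf_fm - caffe_fm)
interior = diff[..., 1:-1, 1:-1]                 # everything except the 1-pixel border
print("max diff overall :", diff.max())
print("max diff interior:", interior.max())      # near zero here means only the border differs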

Increasing number of predictions in Inception for Tensorflow

I am going through the tutorial on retraining Inception's final layer, having installed Tensorflow on Ubuntu with regular CPU support. I successfully made the flowers example work; however, after switching to a new set of categories with ten sub-folders, I cannot make Inception produce ten scores for each input image rather than the default five. My current command line to run a test image looks like this, with labels named 0-9:
bazel build tensorflow/examples/label_image:label_image && \
bazel-bin/tensorflow/examples/label_image/label_image \
--graph=/tmp/output_graph.pb --labels=/tmp/output_labels.txt \
--output_layer=final_result --input_layer=Mul \
--image=$HOME/Input/Example.jpg
This produces the following result:
5 (4): 0.642959
3 (2): 0.243444
9 (8): 0.0513504
4 (5): 0.0231318
6 (7): 0.0180509
However, I cannot find anything in the programs that Inception runs that would reconfigure how many output scores are produced, so that all ten of my categories get scores rather than just five. How do I change this?
I tried with 8 categories and was able to get results for all of them.
If your code has the line below
top_k = predictions[0].argsort()[-5:][::-1]
change it to
top_k = predictions[0].argsort()[-len(predictions[0]):][::-1]
If the code contains predictions = np.squeeze(predictions), then use predictions instead of predictions[0].
I ran this using the following command instead of bazel, and found it easier:
python /path_to_file/label_image.py /path_to_image/image.jpeg
First make sure that the graph has been created after you run retrain.py and that it is in the correct location (the default is inside /tmp/).
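Putting the change into context, the scoring part of a retrain-style label_image.py might look like the sketch below; it assumes the classic retraining tutorial layout (graph at /tmp/output_graph.pb, output tensor final_result:0, JPEG bytes fed through DecodeJpeg/contents:0), so adjust those names if your script differs:
import numpy as np
import tensorflow as tf

image_path = '/path_to_image/image.jpeg'   # placeholder

# load the retrained graph
with tf.gfile.GFile('/tmp/output_graph.pb', 'rb') as f:
    graph_def = tf.GraphDef()
    graph_def.ParseFromString(f.read())
tf.import_graph_def(graph_def, name='')

labels = [line.strip() for line in tf.gfile.GFile('/tmp/output_labels.txt')]

with tf.Session() as sess:
    softmax = sess.graph.get_tensor_by_name('final_result:0')
    image_data = tf.gfile.GFile(image_path, 'rb').read()
    predictions = sess.run(softmax, {'DecodeJpeg/contents:0': image_data})
    predictions = np.squeeze(predictions)

    # rank every class instead of only the top 5
    top_k = predictions.argsort()[-len(predictions):][::-1]
    for node_id in top_k:
        print('%s (score = %.5f)' % (labels[node_id], predictions[node_id]))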