General steps for fine-tuning/pre-training a model for an object detector - tensorflow

I am facing a question similar to the one asked here [How to Pre-train the Resnet101 for a Faster RCNN in Object Detection API of Tensorflow].
I am using a pre-trained Faster R-CNN ResNet101 model and want to fine-tune it with similar data. Currently, I am working with the TF OD API.
So far, I have only run the code described here [https://github.com/tensorflow/models/blob/master/research/object_detection/g3doc/running_locally.md#running-the-training-job]. However, the accuracy on the validation set is either -1.000 or 0.02% after 10k iterations.
Is running model_main.py the correct way to fine-tune a model, or am I missing a step in between?
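(For reference, a minimal sanity check — with a placeholder path, and assuming the OD API's config_util helper — is to confirm that the pipeline config passed to model_main.py actually references the downloaded checkpoint; an empty fine_tune_checkpoint would mean training from scratch, which would fit near-zero validation metrics after only 10k steps.)

from object_detection.utils import config_util

# Placeholder path: point this at the pipeline.config passed to model_main.py.
configs = config_util.get_configs_from_pipeline_file(
    "path/to/faster_rcnn_resnet101_pipeline.config")
train_config = configs["train_config"]

# If fine_tune_checkpoint is empty, training starts from random weights,
# which would explain very low validation metrics after only 10k steps.
print("fine_tune_checkpoint:", train_config.fine_tune_checkpoint)
print("from_detection_checkpoint:", train_config.from_detection_checkpoint)
print("num_classes:", configs["model"].faster_rcnn.num_classes)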
Any help/remark/hint is highly appreciated.
Thanks a lot in advance

Best case to use tensorflow

I followed all the steps mentioned in the article:
https://stackabuse.com/tensorflow-2-0-solving-classification-and-regression-problems/
Then I compared the results with plain linear regression and found that its error (68) is lower than the TensorFlow model's (84).
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error
import numpy as np

# Fit an ordinary linear regression on the same split and report the test RMSE.
lin_reg = LinearRegression()
lin_reg.fit(X_train, y_train)
pred = lin_reg.predict(X_test)
print(np.sqrt(mean_squared_error(y_test, pred)))
Does this mean that if I have a large dataset, I will get better results than linear regression?
What is the best situation - when I should be using tensorflow?
Answering your first question: neural networks are notorious for overfitting on smaller datasets, and here you are comparing a simple linear regression model against a neural network with two hidden layers on the test set. It is therefore not very surprising to see the MLP fall behind the linear regression model, assuming you are working with a relatively small dataset. Larger datasets will definitely help the neural network learn more accurate parameters and generalize well.
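To make the overfitting visible, it helps to train the same kind of two-hidden-layer regressor with a validation split and compare the training and validation losses directly; here is a rough sketch (layer sizes and training settings are placeholders, not the tutorial's exact code):

import numpy as np
import tensorflow as tf

# Two-hidden-layer regression MLP; the validation split exposes the
# train/validation gap, i.e. the overfitting described above.
model = tf.keras.Sequential([
    tf.keras.layers.Dense(100, activation='relu', input_shape=(X_train.shape[1],)),
    tf.keras.layers.Dense(50, activation='relu'),
    tf.keras.layers.Dense(1)  # single continuous output for regression
])
model.compile(optimizer='adam', loss='mse')

history = model.fit(X_train, y_train, validation_split=0.2,
                    epochs=100, batch_size=32, verbose=0)

pred = model.predict(X_test).ravel()
print("final train MSE:", history.history['loss'][-1])
print("final val MSE:  ", history.history['val_loss'][-1])
print("test RMSE:      ", np.sqrt(np.mean((np.asarray(y_test).ravel() - pred) ** 2)))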
Now coming to your second question: TensorFlow is, at its core, a library for building deep learning models. Whenever you work on a deep learning problem such as image recognition or natural language processing, you need massive computational power and will be processing a ton of data to train your models. This is where TensorFlow comes in handy: it offers GPU support, which significantly speeds up training that would otherwise be practically impossible. Moreover, if you are building a product that has to be deployed in a production environment, you can make use of TensorFlow Serving, which helps you take your models much closer to your customers.

Quantization aware training examples?

I want to do quantization-aware training with a basic convolutional neural network that I define directly in TensorFlow (I don't want to use other APIs such as Keras). The only resource I am aware of is the README here:
https://github.com/tensorflow/tensorflow/tree/master/tensorflow/contrib/quantize
However, it is not clear exactly where the different quantization commands should go in the overall process of training and then freezing the graph for actual inference.
So I am wondering: is there any code example out there that shows how to define, train, and freeze a simple convolutional neural network with quantization-aware training in TensorFlow?
It seems that others have had the same question as well; see for instance here.
Thanks!
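For anyone with the same question, a rough sketch of the placement the contrib README appears to describe (build_convnet is a placeholder for your own model function, and this is not a verified end-to-end recipe) is to rewrite the training graph after the loss is built but before the train op is created, and to rewrite a fresh inference graph before freezing:

import tensorflow as tf

# --- Training side: build the float model, then insert fake-quant ops ---
train_graph = tf.Graph()
with train_graph.as_default():
    images = tf.placeholder(tf.float32, [None, 28, 28, 1])
    labels = tf.placeholder(tf.int64, [None])
    logits = build_convnet(images)  # placeholder for your own model function
    loss = tf.losses.sparse_softmax_cross_entropy(labels=labels, logits=logits)

    # Rewrite the graph for quantization-aware training; quant_delay lets the
    # float model settle before the fake-quant ops start taking effect.
    tf.contrib.quantize.create_training_graph(input_graph=train_graph,
                                              quant_delay=2000)

    train_op = tf.train.AdamOptimizer(1e-3).minimize(loss)
    saver = tf.train.Saver()
    # ... run the usual training loop and saver.save(sess, 'ckpt/model') ...

# --- Eval/freeze side: rebuild for inference, rewrite, restore, freeze ---
eval_graph = tf.Graph()
with eval_graph.as_default():
    images = tf.placeholder(tf.float32, [None, 28, 28, 1], name='input')
    logits = build_convnet(images)
    outputs = tf.nn.softmax(logits, name='output')

    tf.contrib.quantize.create_eval_graph(input_graph=eval_graph)

    saver = tf.train.Saver()
    with tf.Session(graph=eval_graph) as sess:
        saver.restore(sess, tf.train.latest_checkpoint('ckpt'))
        frozen = tf.graph_util.convert_variables_to_constants(
            sess, eval_graph.as_graph_def(), ['output'])
        tf.train.write_graph(frozen, '.', 'quantized_frozen_graph.pb',
                             as_text=False)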

How to implement minibatch gradient descent in tensorflow without using feed_dict?

From what I understand, using feed_dict is a computationally expensive process and should be avoided according to this article. TensorFlow's input pipelines are supposedly better.
All mini-batch gradient descent tutorials that I've found are implemented with feed_dict. Is there a way to use input pipelines and minibatch gradient descent?
If you are just making a small model, you will do fine with feed_dict; many large models have been trained with the feed_dict method in the past. If you are scaling up to a very deep convnet with a large dataset or something, you may want to use tf.data and the Dataset pipeline, probably with the dataset serialized to a .tfrecord file so that the data can be pre-fetched to the GPU to reduce idle time. These optimizations are worthwhile with large models, and if you would really like to learn the API, see the quickstart guide and this helpful Medium article on getting started with it.
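To give a concrete starting point, here is a rough TF 1.x-style sketch (toy arrays and a toy linear model, just to show the mechanics) of mini-batch gradient descent driven by tf.data instead of feed_dict:

import numpy as np
import tensorflow as tf

# Toy data; in practice this could come from a .tfrecord via tf.data.TFRecordDataset.
features = np.random.rand(1000, 10).astype(np.float32)
targets = np.random.rand(1000, 1).astype(np.float32)

# Input pipeline: shuffle, batch, repeat, prefetch -- no feed_dict involved.
dataset = (tf.data.Dataset.from_tensor_slices((features, targets))
           .shuffle(buffer_size=1000)
           .batch(32)
           .repeat()
           .prefetch(1))
x_batch, y_batch = dataset.make_one_shot_iterator().get_next()

# A tiny linear model built directly on the batch tensors.
w = tf.Variable(tf.zeros([10, 1]))
b = tf.Variable(tf.zeros([1]))
loss = tf.reduce_mean(tf.square(tf.matmul(x_batch, w) + b - y_batch))
train_op = tf.train.GradientDescentOptimizer(0.1).minimize(loss)

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    for step in range(1000):
        _, current_loss = sess.run([train_op, loss])  # each run pulls the next mini-batch
        if step % 100 == 0:
            print(step, current_loss)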

Tensorflow SSD-Mobilenet model accuracy drop after quantization using transform_graph

I am working on the recently released "SSD-Mobilenet" model by Google for object detection.
The model was downloaded from the following location: https://github.com/tensorflow/models/blob/master/object_detection/g3doc/detection_model_zoo.md
The frozen graph file downloaded from the site works as expected; however, after quantization the accuracy drops significantly (mostly random predictions).
I built tensorflow r1.2 from source and used the following method to quantize:
bazel-bin/tensorflow/tools/graph_transforms/transform_graph --in_graph=frozen_inference_graph.pb --out_graph=optimized_graph.pb --inputs='image_tensor' --outputs='detection_boxes','detection_scores','detection_classes','num_detections' --transforms='add_default_attributes strip_unused_nodes(type=float, shape="1,224,224,3") fold_constants(ignore_errors=true) fold_batch_norms fold_old_batch_norms quantize_weights strip_unused_nodes sort_by_execution_order'
I tried various combinations in the "transforms" part; the transforms mentioned above sometimes gave correct predictions, but nowhere near the original model's accuracy.
Is there any other way to improve performance of the quantized model?
In this case, SSD uses MobileNet as its feature extractor in order to increase speed. If you read the MobileNet paper, it is a lightweight convolutional network that relies on depthwise separable convolutions to reduce the number of parameters.
As I understand it, separable convolutions can lose information because of the channel-wise convolution.
When quantizing a graph with the TF implementation, ops and weights are reduced to 8 bits. The TF quantization tutorial clearly mentions that this operation is more like adding noise to an already-trained network, in the hope that the model has generalized well.
So this works really well and is almost lossless in terms of accuracy for heavy models like Inception, ResNet, etc. But given how light and simple SSD with MobileNet is, it really can cause an accuracy loss.
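To see how much damage the transform is actually doing, one rough diagnostic (a sketch; the tensor names are the same inputs/outputs used in the transform_graph command above) is to load the original and the quantized .pb side by side and compare the detection scores for the same image:

import numpy as np
import tensorflow as tf

def load_graph(pb_path):
    # Load a frozen GraphDef into its own tf.Graph.
    graph_def = tf.GraphDef()
    with tf.gfile.GFile(pb_path, 'rb') as f:
        graph_def.ParseFromString(f.read())
    graph = tf.Graph()
    with graph.as_default():
        tf.import_graph_def(graph_def, name='')
    return graph

def run_detection(graph, image):
    with tf.Session(graph=graph) as sess:
        scores, classes = sess.run(
            ['detection_scores:0', 'detection_classes:0'],
            feed_dict={'image_tensor:0': image})
    return scores[0], classes[0]

# Run the same image (ideally a real test image) through both graphs.
image = np.random.randint(0, 256, size=(1, 300, 300, 3), dtype=np.uint8)
orig_scores, orig_classes = run_detection(load_graph('frozen_inference_graph.pb'), image)
quant_scores, quant_classes = run_detection(load_graph('optimized_graph.pb'), image)

# Large differences on the top detections point at quantization damage.
print('top-5 original: ', orig_scores[:5], orig_classes[:5])
print('top-5 quantized:', quant_scores[:5], quant_classes[:5])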
MobileNets: Efficient Convolutional Neural Networks for Mobile Vision Applications
How to Quantize Neural Networks with TensorFlow

Would adding dropout help reduce overfitting when following tensorflow's transfer learning example?

I'm using the pretrained tensorflow inception v3 model and transfer learning to do some image classification on a new image training set I have. I'm following the instructions laid out here:
https://www.tensorflow.org/versions/r0.8/how_tos/image_retraining/index.html
However, I'm getting some severe overfitting (training accuracy is in the high 90s but CV/test accuracy is in the 50s).
Besides doing some image augmentation to try to increase my training sample size, I was wondering if adding some dropout in the retrain phase might help.
I am using this file (that came with tensorflow) as the base/template for my retraining/transfer learning:
https://github.com/tensorflow/tensorflow/blob/master/tensorflow/examples/image_retraining/retrain.py
Looking at the inception v3 model, dropout is in there. However, I don't see any dropout added in the retrain.py file.
Does it make sense that I could try to add dropout to the retraining to solve my overfitting? If so, where would I add that? If not, why?
Thanks
From Max's comment above, which was a good answer:
Max got a good improvement by adding dropout to the retrain.py source. If you want to try it, you can reference his forked script. It has some additional updates, but the main part you should look at starts on line 784.
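For illustration only — this is not Max's actual code, just the general idea of where dropout could go inside retrain.py's add_final_training_ops(): apply tf.nn.dropout to the bottleneck input with a keep probability that defaults to 1.0, so evaluation and inference are unaffected.

import tensorflow as tf

# Sketch only: after retrain.py creates the bottleneck_input placeholder,
# dropout could be inserted in front of the new final layer.
keep_prob = tf.placeholder_with_default(1.0, shape=[], name='keep_prob')
dropped_bottleneck = tf.nn.dropout(bottleneck_input, keep_prob)

# The existing final layer then consumes dropped_bottleneck instead of
# bottleneck_input (layer_weights/layer_biases are the variables retrain.py
# already creates for the new softmax layer).
logits = tf.matmul(dropped_bottleneck, layer_weights) + layer_biases

# During training steps, pass e.g. keep_prob: 0.5 in the feed_dict; evaluation
# runs leave it at the default of 1.0, so no inference-time changes are needed.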