Is it possible to bias the training of an object detection model towards classification in tensorflow ModelMaker? - tensorflow

I'm using Tensorflow 2 Model Maker to perform transfer training of EfficientDet-Lite (ultimately to run on a Coral EdgeTPU) and I care much more about the classification output and much less about the precision of the bounding boxes. Is there a way to modify some training parameters to improve the accuracy of the classes at the expense of the accuracy of the bounding boxes? Or does this not make sense?

Unfortunately, TensorFlow 2 Model Maker doesn't support such customization at this moment.
If you want to do so, you can bypass Model Maker and directly use AutoML repo. The technical detail is to adjust weights for different losses by adding loss_weights in compile() function.

Related

How do I perform a confusion matrix calculation on my already custom trained YOLOv3 Tiny model?

I just finished training a yolov3 tiny model via google colab. I need the information regarding its accuracy in detection. How do I perform the evaluation for this model in terms of confusion matrix?

Why are Keras models instantiated with imagenet weights only?

If we look at the list of available models in Keras as shown here we see that almost all of them are instantiated with weights='imagenet'. For instance:
model = VGG16(weights='imagenet', include_top=False)
Why always imagenet? is it because it is the baseline? If not, what are the other options available?
Thank you
Imagenet is a defacto standard for images classification. A yearly contest is run with millions of training images in 1000 categories. The models used in the imagenet classification competitions are measured against each other for performance. Therefore it provides a "standard" measure for how good a model is for image classification. So many often used transfer learning model models use the imagenet weights. Your model if you are using transfer learning can be customized for your application by adding additional layers to the model. You do not have to use the imagenet weighst but it generally is beneficial as it helps the model converge in less epochs. I use them but I also set all layers to be trainable which helps adapt the weights of the model to your application.

Is that a good idea to use transfer learning in real world projects?

SCENARIO
What if my intention is to train for a dataset of medical images and I have chosen a coco pre-trained model.
My Doubts
1 Since I have chosen medical images there is no point of train it on COCO dataset, right? if so what is a possible solution to do the same?
2 Adding more layers to a pre-trained model will screw the entire model? with classes of around 10 plus and 10000's of training datasets?
3 Without train from scratch what are the possible solutions , like fine-tuning the model?
PS - let's assume this scenario is based on deploying the model for business purposes.
Thanks-
Yes, it is a good idea to reuse the Pre-Trained Models or Transfer Learning in Real World Projects, as it saves Computation Time and as the Architectures are proven.
If your use case is to classify the Medical Images, that is, Image Classification, then
Since I have chosen medical images there is no point of train it on
COCO dataset, right? if so what is a possible solution to do the same?
Yes, COCO Dataset is not a good idea for Image Classification as it is efficient for Object Detection. You can reuse VGGNet or ResNet or Inception Net or EfficientNet. For more information, refer TF HUB Modules.
Adding more layers to a pre-trained model will screw the entire model?
with classes of around 10 plus and 10000's of training datasets?
No. We can remove the Top Layer of the Pre-Trained Model and can add our Custom Layers, without affecting the performance of the Pre-Trained Model.
Without train from scratch what are the possible solutions , like
fine-tuning the model?
In addition to using the Pre-Trained Models, you can Tune the Hyper-Parameters of the Model (Custom Layers added by you) using HParams of Tensorboard.

TensorFlow model saving to be approached differently during training Vs. deployment?

Assume that I have a CNN which I am training on some dataset. The most important part of the model is the CNN architecture.
Now when I write a code, I define the model structure in a Python class. However, outside that class, I define a number of other nodes such as loss, accuracy, tf.Variable to keep count of epochs and so on.
When I am training, for properly resuming the training, I'd like to save all these nodes (e.g - loss, epoch variable etc), and not just the CNN structure.
However, once I am done with training, I would like to save only the CNN architecture and no nodes for loss, accuracy etc. This is because it will enable people using my model to exercise freedom in writing their own finetuning codes.
How to achieve this in TF code ? Can someone show an example ?
Is this approach towards saving followed by others also ? I just want to know if my approach is right.

Pre Trained LeNet Model for License plate Recognition

I have implemented a form of the LeNet model via tensorflow and python for a Car number plate recognition system. My model was trained solely on my train data and tested on the test data. My dataset contains segmented images wherein every image has only one character in them. This is what my data looks like. My created model does not perform very well, so I'm now looking for models which I can use via Transfer Learning. Since most models, are already trained on a humongous dataset, I looked over a few like AlexNet, ResNet, GoogLeNet and Inception v2. Most of these models have not been trained on the type of data that I want which would be, Letters and digits.
Question: Should I still go forward with one of these models and train them on my dataset or are there any better models which would help ? For such models would keras be a better option since it is more high level than Tensorflow?
Question: I'd prefer to work with the LeNet model itself since training the other models would definitely take a long time due to the insufficient specs of my laptop. So is there any implementation of the model which uses machine printed character images to train the model which I could use to then train the final layers of the model on my data?
to get good results you should use a model explicitly designed for text recognition.
First, (roughly) crop the input image to the region around the text.
Then, feed the image of the text into a neural network (NN) to detect the text.
A typical NN for text recognition extracts relevant features (with convolutional NN), propagates those features through the image (with recurrent NN) and finally predicts a character score for each position in the image.
Usually, those networks are trained with the CTC loss.
As a starting point I would suggest looking at the CRNN implementation (they also provide a pre-trained model) [1] and the corresponding paper [2]. There is, as far as I remember, also a TensorFlow implementation on github.
You can use any framework (e.g TensorFlow or CNTK or ...) you like as long as it features convolutional and recurrent NN and the CTC loss.
I once attended a presentation about CNTK where they claimed that they have a very fast implementation of recurrent NN - so maybe CNTK would be a good choice for your slow computer?
[1] CRNN implementation: https://github.com/bgshih/crnn
[2] Shi - An End-to-End Trainable Neural Network for Image-based Sequence Recognition and Its Application to Scene Text Recognition