Understanding what a model is in regards to Tensorflow and object detection - tensorflow

I'm starting to dive into tensor and object detection for a drone my friend and I are building. I keep seeing the word "model" thrown around and I'm sorry but I don't know what I should be picturing when I see the word "model" in terms of tensorflow and object detection.

Usually, in deep learning model is simply architecture of a neural network. It defines type of layers, number of nodes, connections, etc. Tensorflow uses static graph, which describes your model architecture in terms of nodes and operations. As a start you can use Keras API for defining your model.
https://keras.io/
Also read more about TF graph https://www.tensorflow.org/guide/graphs and take a look at tutorials https://www.tensorflow.org/tutorials

Related

how to use tensorflow object detection API for face detection

Open CV provides a simple API to detect and extract faces from given images. ( I do not think it works perfectly fine though because I experienced that it cuts frames from the input pictures that have nothing to do with face images. )
I wonder if tensorflow API can be used for face detection. I failed finding relevant information but hoping that maybe an experienced person in the field can guide me on this subject. Can tensorflow's object detection API be used for face detection as well in the same way as Open CV does? (I mean, you just call the API function and it gives you the face image from the given input image.)
You can, but some work is needed.
First, take a look at the object detection README. There are some useful articles you should follow. Specifically: (1) Configuring an object detection pipeline, (3) Preparing inputs and (3) Running locally. You should start with an existing architecture with a pre-trained model. Pretrained models can be found in Model Zoo, and their corresponding configuration files can be found here.
The most common pre-trained models in Model Zoo are on COCO dataset. Unfortunately this dataset doesn't contain face as a class (but does contain person).
Instead, you can start with a pre-trained model on Open Images, such as faster_rcnn_inception_resnet_v2_atrous_oid, which does contain face as a class.
Note that this model is larger and slower than common architectures used on COCO dataset, such as SSDLite over MobileNetV1/V2. This is because Open Images has a lot more classes than COCO, and therefore a well working model need to be much more expressive in order to be able to distinguish between the large amount of classes and localizing them correctly.
Since you only want face detection, you can try the following two options:
If you're okay with a slower model which will probably result in better performance, start with faster_rcnn_inception_resnet_v2_atrous_oid, and you can only slightly fine-tune the model on the single class of face.
If you want a faster model, you should probably start with something like SSDLite-MobileNetV2 pre-trained on COCO, but then fine-tune it on the class of face from a different dataset, such as your own or the face subset of Open Images.
Note that the fact that the pre-trained model isn't trained on faces doesn't mean you can't fine-tune it to be, but rather that it might take more fine-tuning than a pre-trained model which was pre-trained on faces as well.
just increase the shape of the input, I tried and it's work much better

Evaluate a model created using Tensorflow Object Detection API

I trained a model using Tensorflow object detection API for detecting swimming pools using satellite images. I used 'faster_rcnn_inception_v2_coco_2018_01_28' model for training. I generated a frozen inference graph (.pb). I want to evaluate the precision and recall of the model. Can someone tell me how I can do that, preferably without using pycocotools as I was facing some issues with that. Any suggestions are welcome :)
From the Object Detection API you can run "eval.py" from "models/research/object_detection/legacy/".
Your have to define an evaluation metric in your config file (see the supported evaluation protocols)
For example:
eval_config: {metrics_set: "coco_detection_metrics"}
The Pascal VOC e.g. then gives you the mean Average Precsion (mAP)

Customize MobileNet model architecture with Tensorflow Object Detection API

Tensorflow object detection API provides a number of pretrained object detection models to choose from. However, I would like to introduce modifications to the architecture of those models.
Particularly, I would like to make Faster RCNN into a more shallow network and use it to train my model. I want to gain in performance despite loss in accuracy. MobileNet is too inaccurate for my application.
Is it possible to achieve this without having to implement everything from scratch ?
Thank you.

What does freezing a graph in TensorFlow mean?

I am a beginner in NN APIs and TensorFlow.
I am trying to save my trained model in protobuff format (.pb), there are many blogs explaining how to save the model as protobuff. One thing I did not understand is what is the importance of freezing the graph before saving it as protobuff? I read that freezing coverts variable to constants, does that mean the model is not trainable anymore?
What else will freezing do on models?
What is that model loses after freezing?
Can anyone please explain or give some pointers on details of freezing?
This is only a partial answer to your question.
A freezed graph is easily optimizable. When doing inference (forward propagation) for instance you can fuse some of the layers together. This you can't do with a graph separated between variables and operations (a not frozen graph). Why would you want to fuse layers together? There are multiple reasons. Going hardware specific: it might be easier to compute a number of operations together in a group of tensors, specific to the structure of your cpu or gpu. TensorRT is a graph optimizer for instance that works starting from a frozen graph (here more info on graph optimizations done by tensorRT: https://devblogs.nvidia.com/tensorrt-integration-speeds-tensorflow-inference/ ). This software does graph optimizations as well as hardware specific ones.
As far as I understand you can unfreeze a graph. I have only worked optimizing them, so I haven't use this feature. But here there is code for it: https://gist.github.com/tokestermw/795cc1fd6d0c9069b20204cbd133e36b
Here is another question that might be useful:
TensorFlow: Is there a way to convert a frozen graph into a checkpoint model?
It has not yet been answered though.
Freezing the model means producing a singular file containing information about the graph and checkpoint variables, but saving these hyperparameters as constants within the graph structure. This eliminates additional information saved in the checkpoint files such as the gradients at each point, which are included so that the model can be reloaded and training continued from where you left off. As this is not needed when serving a model purely for inference they are discarded in freezing. A frozen model is a file of the Google .pb file type.

Custom Object detection using tensorflow

I have trained the object detection API using ssd_mobilenet_v1_coco_2017_11_17 model to detect a custom object. But after training, the API only detects the custom object and not the objects for which the API is already trained. ssd_mobilenet_v1_coco_2017_11_17 model detect 90 objects.
Is there any way to add more classes to an existing model so that it can detect new objects along with the one it has been trained for?
This question is already asked here and some answer could be found here.
The very last layer of the networks is softmax layer. when the network is trained, the weights of the network is optimized for the exact number of classes on the training set. So, if you need to add a new class and also the classes it was trained, the easiest way is to get the original dataset it was trained on along with your new class images. Then start the training from the pre-trained model weights. The training should converge faster as it has to do relatively little adjustments.