How to build tensorflow object detection model for custom classes and also include the 90 classess the SSD Mobilenet model contains - tensorflow

I am building a new tensorflow model based off of SSD V1 coco model in order to perform real time object detection in a video but i m trying to find if there is a way to build a model where I can add a new class to the existing model so that my model has all those 90 classes available in SSD MOBILENET COCO v1 model and also contains the new classes that i want to classify.
For example, I have created training data for two classes: man, woman
Now, I built a new tensorflow model that identifies a man and/or woman in a video. However, my model does not have the other 90 classes present in original SSD Mobilenet model. I am looking for a way to concatenate both models or pass more than one model to my code to detect the objects.
If you have any questions or if I am not clear, please feel free to probe me further.

The only way i find is you need to get dataset of SSD Mobilenet model on which it was trained.
Make sure all the images are present in one directory and annotations in another directory.
We should have a corresponding annotation file for each image file
ex: myimage.jpg and myimage.xml
If all the images of your customed dataset are of same formate with SSD Mobilenet model then annotate it with a tool called LabelImg.
Add that images and annotated files to respective images and annotations directory where we have already saved SSD Mobilenet.
Try regenerate new TFrecord and continue with remaining procedure on it.

You can use transfer learning with Tensorflow API.
Transfer learning allows you to load re-trained network and modify the fully connected layer by introducing your classes.
There is full description for this in the following references:
Codelab
A good explanation here
Tensorflow API here for more details
Also you can use google cloud platform for better and faster results:
I wish this helps you.

I don't think there is a way you can add your classes to the existing 90 classes without using the dataset it is previously trained with. Your only way is to use that dataset plus your own and retrain the model.

Related

How to extract weights of DQN agent in TF-Agents framework?

I am using TF-Agents for a custom reinforcement learning problem, where I train a DQN (constructed using DqnAgents from the TF-Agents framework) on some features from my custom environment, and separately use a keras convolutional model to extract these features from images. Now I want to combine these two models into a single model and use transfer learning, where I want to initialize the weights of the first part of the network (images-to-features) as well as the second part which would have been the DQN layers in the previous case.
I am trying to build this combined model using keras.layers and compiling it with the Tf-Agents tf.networks.sequential class to bring it to the necessary form required when passing it to the DqnAgent() class. (Let's call this statement (a)).
I am able to initialize the image feature extractor network's layers with the weights since I saved it as a .h5 file and am able to obtain numpy arrays of the same. So I am able to do the transfer learning for this part.
The problem is with the DQN layers, where I saved the policy from the previous example using the prescribed Tensorflow Saved Model Format (pb) which gives me a folder containing model attributes. However, I am unable to view/extract the weights of my DQN in this way, and the recommended tf.saved_model.load('policy_directory') is not really transparent with respect to what data I can see regarding the policy. If I have to follow the transfer learning as I do in statement (a), I need to extract the weights of my DQN and assign them to the new network. The documentation seems to be quite sparse for this case where transfer learning needs to be applied.
Can anyone help me in this, by explaining how I can extract weights from the Saved Model method (from the pb file)? Or is there a better way to go about this problem?

Is there any way to modify available TensorFlow models architecture (such as ssd or fast r-cnn) so it is optimized for only one object detection?

I am new to Machine Learning and TensorFlow so I'm sorry and please correct me if my understanding is wrong. I have this project, developing a real time traffic-light detection with TensorFlow.
I've been working with pre-trained TensorFlow models such as SSD Mobilenet and Faster R-CNN Resnet. However, the expected accuracy result have not yet reached. I already considered to add some more data to the dataset (my dataset contains +/-1000 images), but because it is more work to add more data (since I have to do another data taking and label all images), which could take days. I want to consider another option.
Is there any way to modify TensorFlow models architecture so I could optimize and make it focused for only traffic light detection? I've been looking through TensorFlow models folder and could not find in which file these model architectures defined.
Any help will be appreciated. Thank you
What you need is fine-tuning a pretrained model on your traffic light dataset. Given you already have about 1000 images for just one class, this is a descent dataset.
In order to perform fine-tuning, there are some important steps to do.
First you need to transform your data into tfrecord format. Follow this tutorial to generate tfrecord files. This is actually a difficult step.
Create a label_map.pbtxt for your model, since it is only traffic light, what you need is this. Here are sample label_map files.
item {
name: "traffic-light"
id: 1
display_name: "traffic-light"
}
Then you need to prepare a pipeline config file for the model. Since you want to have a real-time detector, I suggest you use SSD-mobilenet models. Some sample config files are available here. You can take one of this sample configs and modify some fields to get the pipeline config for your model.
Suppose you choose ssd_mobilenet, then you can modify this config file, ssd_mobilenet_v2_coco.config. Specifically you need to modify these fields:
num_classes: you need to change from 90 to 1 since you only have traffic lights to detect.
input_path: (in both train_input_reader and eval_input_reader), point this to the tfrecord file you created.
label_map_path: (in both train_input_reader and eval_input_reader), point this to the label_map file you created.
fine_tune_checkpoint: set this path to a downloaded pretrained ssd-mobilenet model.
Depending on your training results, you may need to further adjust some of the fields in the config file, but after the training your model will focus only on traffic light class and will likely have a high accuracy.
All the specific tutorials can be found on the repo site. If you have any further questions, you can ask on stackoverflow with tag: object-detection-api and a lot people would help.

how to use tensorflow object detection API for face detection

Open CV provides a simple API to detect and extract faces from given images. ( I do not think it works perfectly fine though because I experienced that it cuts frames from the input pictures that have nothing to do with face images. )
I wonder if tensorflow API can be used for face detection. I failed finding relevant information but hoping that maybe an experienced person in the field can guide me on this subject. Can tensorflow's object detection API be used for face detection as well in the same way as Open CV does? (I mean, you just call the API function and it gives you the face image from the given input image.)
You can, but some work is needed.
First, take a look at the object detection README. There are some useful articles you should follow. Specifically: (1) Configuring an object detection pipeline, (3) Preparing inputs and (3) Running locally. You should start with an existing architecture with a pre-trained model. Pretrained models can be found in Model Zoo, and their corresponding configuration files can be found here.
The most common pre-trained models in Model Zoo are on COCO dataset. Unfortunately this dataset doesn't contain face as a class (but does contain person).
Instead, you can start with a pre-trained model on Open Images, such as faster_rcnn_inception_resnet_v2_atrous_oid, which does contain face as a class.
Note that this model is larger and slower than common architectures used on COCO dataset, such as SSDLite over MobileNetV1/V2. This is because Open Images has a lot more classes than COCO, and therefore a well working model need to be much more expressive in order to be able to distinguish between the large amount of classes and localizing them correctly.
Since you only want face detection, you can try the following two options:
If you're okay with a slower model which will probably result in better performance, start with faster_rcnn_inception_resnet_v2_atrous_oid, and you can only slightly fine-tune the model on the single class of face.
If you want a faster model, you should probably start with something like SSDLite-MobileNetV2 pre-trained on COCO, but then fine-tune it on the class of face from a different dataset, such as your own or the face subset of Open Images.
Note that the fact that the pre-trained model isn't trained on faces doesn't mean you can't fine-tune it to be, but rather that it might take more fine-tuning than a pre-trained model which was pre-trained on faces as well.
just increase the shape of the input, I tried and it's work much better

How to detect objects in addition to coco dataset?

I'm using tensorflow objection detection API with the coco dataset provided in the tutorial.
If I use the api to detect custom objects, how do I "add" to the list of objects being detected from the coco dataset? Is there a way to merge?
If you mean using a model which is trained on the COCO dataset to detect objects that are not in the COCO dataset, you cannot do that. I think you will need to train a model, in this case one already trained on COCO, on your new objects that you want to detect. There is a tutorial here that shows how to train a model on a custom dataset.
If you do not want to train a model you need to find one that is already trained for the objects that you want to detect.
Have I understood correctly?

Object detection using CNTK

I am very new to CNTK.
I wanted to train a set of images (to detect objects like alcohol glasses/bottles) using CNTK - ResNet/Fast-R CNN.
I am trying to follow below documentation from GitHub; However, it does not appear to be a straight forward procedure. https://github.com/Microsoft/CNTK/wiki/Object-Detection-using-Fast-R-CNN
I cannot find proper documentation to generate ROI's for the images with different sizes and shapes. And how to create object labels based on the trained models? Can someone point out to a proper documentation or training link using which I can work on the cntk model? Please see the attached image in which I was able to load a sample image with default ROI's in the script. How do I properly set the size and label the object in the image ? Thanks in advance!
sample image loaded for training
Not sure what you mean by proper documentation. This is an implementation of the paper (https://arxiv.org/pdf/1504.08083.pdf). Looks like you are trying to generate ROI's. Can you look through the helper functions as documented at the site to parse what you might need:
To run the toy example, make sure that in PARAMETERS.py the datasetName is set to "grocery".
Run A1_GenerateInputROIs.py to generate the input ROIs for training and testing.
Run A2_RunCntk_py3.py to train a Fast R-CNN model using the CNTK Python API and compute test results.
The algo will work on several candidate regions and then generate outputs: one for the classes of objects and another one that generates the bounding boxes for the objects belonging to those classes. Please refer to the code for getting the details of the implementation.
Can someone point out to a proper documentation or training link using which I can work on the cntk model?
You can take a look at my repository on GitHub.
It will guide you through all the steps required to train your own model for object detection and classification with CNTK.
But in short the proper steps should look something like this:
Setup environment
Prepare data
Tag images (ground truth)
Download pretrained model and create mappings for your custom dataset
Run training
Evaluate the model on test set