I am training an ssd_inception neural network using the TensorFlow Object Detection API. In the pipeline config file, there are preprocessor options to augment images during training. Is there any way to introduce a probability of applying a given preprocessing step, e.g. a 20% chance that the image's contrast will be changed? If not, are there any plans to add this?
We don't have any current plans to do that, but feel free to send in a pull request. We are happy to review.
See https://github.com/tensorflow/models/blob/master/object_detection/builders/preprocessor_builder.py and https://github.com/tensorflow/models/blob/master/object_detection/core/preprocessor.py to get started.
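If you do roll your own, here is a minimal sketch of what such a probabilistic wrapper could look like in TF 1.x (the helper name and the example probability are illustrative, not part of the API):

    import tensorflow as tf

    def random_apply(fn, image, prob=0.2, seed=None):
        # Apply `fn` to `image` with probability `prob`; otherwise pass it through.
        coin = tf.random_uniform([], 0.0, 1.0, seed=seed)
        return tf.cond(coin < prob, lambda: fn(image), lambda: tf.identity(image))

    # e.g. change contrast with a 20% chance:
    # image = random_apply(lambda im: tf.image.random_contrast(im, 0.8, 1.2), image)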
How can I evaluate my object detection model in a simple and understandable way? I used TensorFlow's Object Detection API, but I didn't understand the TensorBoard graphs. Can I evaluate it manually?
Any help? :(
Welcome to StackOverflow!
In short, yes you can, but it could be quite time-consuming to achieve your goal.
Here are the steps you might want to follow (assuming you have some basic understanding of TensorFlow graphs and sessions; otherwise, please update your question):
Export your model to a frozen graph (*.pb file) via HERE. This step gives you an out-of-the-box model that you can load without any dependency on the Object Detection API.
Write a script to load your model (the frozen graph) and perform the evaluation. Some instructions can be found HERE. Make sure you use a tool such as Netron to check the input and output node names of your frozen graph (see the sketch after these steps).
Once you can perform inference, you can compute metrics such as mAP on your own dataset by looping through all images.
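For step 2, a minimal sketch of loading a frozen graph and running one image through it (TF 1.x; the tensor names below are the conventional ones for exported detection models, so verify yours with Netron):

    import numpy as np
    import tensorflow as tf

    # Load the frozen graph.
    with tf.gfile.GFile('frozen_inference_graph.pb', 'rb') as f:
        graph_def = tf.GraphDef()
        graph_def.ParseFromString(f.read())

    graph = tf.Graph()
    with graph.as_default():
        tf.import_graph_def(graph_def, name='')

    # Run inference on a dummy batch; replace with a real image.
    with tf.Session(graph=graph) as sess:
        image = np.zeros((1, 300, 300, 3), dtype=np.uint8)
        boxes, scores, classes, num = sess.run(
            ['detection_boxes:0', 'detection_scores:0',
             'detection_classes:0', 'num_detections:0'],
            feed_dict={'image_tensor:0': image})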
You could use a confusion matrix to evaluate your model on the test dataset.
After training the model on your dataset, export the inference graph for evaluation.
Find the attached link, which guides you step by step through the evaluation.
Best of luck!
confusion_matrix
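For reference, here is a rough sketch of the matching logic a detection confusion matrix typically uses (the IoU threshold and helper names are my own, not from the linked script):

    import numpy as np

    def iou(a, b):
        # Intersection-over-union of two [ymin, xmin, ymax, xmax] boxes.
        ymin, xmin = max(a[0], b[0]), max(a[1], b[1])
        ymax, xmax = min(a[2], b[2]), min(a[3], b[3])
        inter = max(0.0, ymax - ymin) * max(0.0, xmax - xmin)
        union = ((a[2] - a[0]) * (a[3] - a[1]) +
                 (b[2] - b[0]) * (b[3] - b[1]) - inter)
        return inter / union if union > 0 else 0.0

    def update_confusion(matrix, gt_boxes, gt_labels, det_boxes, det_labels,
                         iou_thresh=0.5):
        # matrix is (num_classes + 1) x (num_classes + 1); the extra row/column
        # counts missed ground truths and spurious detections. Labels are 0-based.
        matched = set()
        for gt_box, gt_label in zip(gt_boxes, gt_labels):
            best, best_iou = None, iou_thresh
            for i, det_box in enumerate(det_boxes):
                v = iou(gt_box, det_box)
                if i not in matched and v >= best_iou:
                    best, best_iou = i, v
            if best is None:
                matrix[gt_label, -1] += 1              # missed ground truth
            else:
                matched.add(best)
                matrix[gt_label, det_labels[best]] += 1
        for i, det_label in enumerate(det_labels):
            if i not in matched:
                matrix[-1, det_label] += 1             # spurious detection
        return matrix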
I've been hand-rolling augmenters using imgaug, as I really like some of the options that are not available in the TF Object Detection API. For instance, I use motion blur because so much of my data has fast-moving, blurry objects.
How can I best integrate my augmentation sequence with the api for on-the-fly training?
E.g., say I have an augmenter:
    import imgaug.augmenters as iaa

    aug = iaa.SomeOf((0, 2), [
        iaa.Fliplr(0.5),
        iaa.Flipud(0.5),
        iaa.Affine(rotate=(-10, 10)),
    ])
Is there some way to configure the object detection api to work with this?
What I am currently doing is using imgaug to generate augmented training data and then creating tfrecord files from each iteration of this augmentation pipeline. This is very inefficient, as I am saving large amounts of data to disk rather than running augmentation on the fly during training.
Someone has made a repo for this:
https://github.com/JinLuckyboy/TensorFlowObjectDetectionAPI-with-imgaug
Sorry, this is not a code answer, and I have not actually looked into the repo, so I will not mark this as officially answered. If I ever get a chance to test it, I will let people know.
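That said, here is a rough sketch of what on-the-fly imgaug augmentation inside a TF 1.x input pipeline could look like (the box format and function names are my own assumptions, not that repo's API):

    import numpy as np
    import tensorflow as tf
    import imgaug.augmenters as iaa
    from imgaug.augmentables.bbs import BoundingBox, BoundingBoxesOnImage

    aug = iaa.SomeOf((0, 2), [
        iaa.Fliplr(0.5),
        iaa.Flipud(0.5),
        iaa.Affine(rotate=(-10, 10)),
    ])

    def _augment(image, boxes):
        # numpy-side augmentation; boxes are [N, 4] as [y1, x1, y2, x2] in pixels.
        bbs = BoundingBoxesOnImage(
            [BoundingBox(x1=b[1], y1=b[0], x2=b[3], y2=b[2]) for b in boxes],
            shape=image.shape)
        det = aug.to_deterministic()  # same transform for image and boxes
        image_aug = det.augment_image(image)
        bbs_aug = (det.augment_bounding_boxes([bbs])[0]
                   .remove_out_of_image().clip_out_of_image())
        boxes_aug = np.array(
            [[bb.y1, bb.x1, bb.y2, bb.x2] for bb in bbs_aug.bounding_boxes],
            dtype=np.float32).reshape(-1, 4)
        return image_aug, boxes_aug

    def augment_fn(image, boxes):
        # Wrap the numpy function so it can run inside a tf.data pipeline;
        # note that py_func drops static shape information.
        image, boxes = tf.py_func(_augment, [image, boxes],
                                  [image.dtype, tf.float32])
        return image, boxes

    # dataset = dataset.map(augment_fn)  # augmentation now runs during training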
I am currently working with the TensorFlow Object Detection API, and I want to fine-tune a pre-trained model. This requires hyperparameter tuning.
Does the API already provide some kind of hyperparameter tuning (like a grid search)? If not, how can I implement a simple grid search to tune the most relevant hyperparameters?
Furthermore, does the API provide some kind of early-stopping function that automatically aborts the training process if the accuracy no longer improves?
Thanks a lot in advance.
Is it possible and/or easy to disable the non-maximum suppression part of the off-the-shelf object detectors provided in the TensorFlow Object Detection API? E.g., I'd like to run the provided SSD MobileNet (trained on MSCOCO) without the non-maximum suppression operation at the end. How can I achieve this?
If you want to do this for speed reasons, the only way is to edit the code itself (see https://github.com/tensorflow/models/blob/master/object_detection/meta_architectures/ssd_meta_arch.py#L331). This is a bit involved, as you need to replace the call to NMS with code that still puts the boxes in the expected output format.
If you just want to get rid of the effect of NMS, you can simply set the score_threshold and iou_threshold in the post-processing part of the config file:
https://github.com/tensorflow/models/blob/master/object_detection/samples/configs/ssd_mobilenet_v1_pets.config#L131
to 0.0 and 1.0 respectively, meaning: don't filter out low-scoring boxes, and prune boxes based on IoU only if they overlap perfectly (which in practice will never happen).
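For reference, the relevant part of the config would then look roughly like this (the max-detection values are illustrative, not required):

    post_processing {
      batch_non_max_suppression {
        score_threshold: 0.0
        iou_threshold: 1.0
        max_detections_per_class: 100
        max_total_detections: 100
      }
      score_converter: SIGMOID
    }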
I am very new to CNTK.
I want to train a set of images (to detect objects like alcohol glasses/bottles) using CNTK with ResNet/Fast R-CNN.
I am trying to follow the documentation below from GitHub; however, it does not appear to be a straightforward procedure. https://github.com/Microsoft/CNTK/wiki/Object-Detection-using-Fast-R-CNN
I cannot find proper documentation on generating ROIs for images of different sizes and shapes, or on creating object labels based on the trained models. Can someone point me to proper documentation or a training link that I can use to work on the CNTK model? Please see the attached image, in which I was able to load a sample image with default ROIs in the script. How do I properly set the size and label the object in the image? Thanks in advance!
sample image loaded for training
Not sure what you mean by proper documentation. This is an implementation of the paper (https://arxiv.org/pdf/1504.08083.pdf). It looks like you are trying to generate ROIs. Have a look through the helper functions documented on that site to find what you need:
To run the toy example, make sure that in PARAMETERS.py the datasetName is set to "grocery".
Run A1_GenerateInputROIs.py to generate the input ROIs for training and testing.
Run A2_RunCntk_py3.py to train a Fast R-CNN model using the CNTK Python API and compute test results.
The algorithm works on several candidate regions and generates two outputs: one for the classes of the objects and another for the bounding boxes of the objects belonging to those classes. Please refer to the code for the details of the implementation; a rough sketch of the two-headed output follows below.
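To make the two outputs concrete, here is a minimal sketch of such a two-headed layer in the CNTK Python API (the feature size and class count are placeholders, not the repo's values):

    import cntk as C

    num_classes = 3        # e.g. background, glass, bottle (placeholder)
    roi_features = C.input_variable((4096,))  # pooled ROI features (assumed size)

    # One head scores the classes, the other regresses a box per class.
    cls_scores = C.layers.Dense(num_classes, activation=None,
                                name='cls_score')(roi_features)
    bbox_deltas = C.layers.Dense(num_classes * 4, activation=None,
                                 name='bbox_regr')(roi_features)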
Can someone point me to proper documentation or a training link that I can use to work on the CNTK model?
You can take a look at my repository on GitHub.
It will guide you through all the steps required to train your own model for object detection and classification with CNTK.
But in short, the proper steps look something like this:
Setup environment
Prepare data
Tag images (ground truth)
Download pretrained model and create mappings for your custom dataset
Run training
Evaluate the model on test set