I am a beginner in machine learning, and I am trying to do object detection with my own dataset. It would be more practical if the objects were labeled with polygon-shaped bounds, yet the TensorFlow Object Detection API only accepts bounding boxes.
So is it possible to modify the API so that it can accept a polygon-labeled dataset?
Yes, it is possible. You have to provide the directory of the training set. However, bounding boxes are recommended, because at inference time you will get a bounding box around each detected object. You can see an example on tensorflow.org.
For labeling you can use LabelImg, which is simple and easy to use; careful labeling will also improve detection accuracy.
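For reference, LabelImg saves each image's boxes as a PASCAL VOC XML file. Here is a minimal sketch of reading those boxes back in Python; the file name annotation.xml is just a placeholder:

```python
import xml.etree.ElementTree as ET

def read_voc_boxes(xml_path):
    """Parse a PASCAL VOC XML file (as written by LabelImg) into (label, box) pairs."""
    root = ET.parse(xml_path).getroot()
    boxes = []
    for obj in root.findall("object"):
        label = obj.findtext("name")
        bb = obj.find("bndbox")
        box = tuple(int(float(bb.findtext(k))) for k in ("xmin", "ymin", "xmax", "ymax"))
        boxes.append((label, box))
    return boxes

# Example: print all labeled boxes in one annotation file.
for label, (xmin, ymin, xmax, ymax) in read_voc_boxes("annotation.xml"):
    print(f"{label}: ({xmin}, {ymin}) -> ({xmax}, {ymax})")
```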
I need to train a new model (Keras + TensorFlow), and I was asking myself whether there is any difference between:
- providing a bunch of images containing only the object of interest (cropped from the original image), and
- providing bigger images with object annotations (the coordinates of the bounding box and the class).
My logic tells me that internally the training is most probably done only on the cropped part, so technically there shouldn't be a difference.
Regards
The two approaches you are describing are commonly referred to as image classification (where a model only needs to classify the image) and object detection (where a model needs to detect the location of an object in an image and classify it), sometimes simply differentiated as "classification" and "detection". These two approaches require different techniques, and different models have been developed to handle each. In general, image classification is the easier problem, as you may have intuited.
Which approach to use depends on your end application. If you only need to know "does an object exist in this image?", you can use classification techniques. If you need to know "where in this image is the object?" or "how many of these objects are in the image?", you should use detection techniques.
What may be non-intuitive is that object detection is not simply an extension of image classification, so if you need object detection, it is best to start with object detection models rather than building an image classifier and then extending it to object detection. This article provides some intuition on the topic.
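To make the difference concrete, here is a rough sketch of the two output signatures. The shapes and class counts are illustrative assumptions, not tied to any particular model:

```python
import numpy as np

# Image classification: one image in, one score per class out.
def classify(image):                       # image: (height, width, 3)
    num_classes = 1000                     # e.g. ImageNet; an illustrative choice
    return np.random.rand(num_classes)     # stand-in for a real model's output

# Object detection: one image in, a variable number of (box, class, score) out.
def detect(image):                         # image: (height, width, 3)
    n = 5                                  # number of detections varies per image
    boxes = np.random.rand(n, 4)           # [ymin, xmin, ymax, xmax], normalized
    classes = np.random.randint(0, 90, n)  # e.g. COCO's 90 classes
    scores = np.random.rand(n)
    return boxes, classes, scores
```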
I use the TensorFlow Object Detection API with MobileNetV2 as the network backbone and SSD as the meta-architecture to do object detection.
In SSD, for each anchor point we generate several candidate bounding boxes with different aspect ratios. For each candidate box, if its intersection over union (IoU) with a ground-truth bounding box is greater than a threshold, we say that this box is positive; otherwise it is negative. These positive and negative samples are then used for training. (So it is important to note that it is NOT the entire image that is used for training, but only one or several crops of it.)
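That matching step can be sketched in a few lines of NumPy. This is an illustration only, with an assumed [ymin, xmin, ymax, xmax] box format and a single 0.5 threshold; the API's real matcher is more involved (separate negative threshold, hard-negative mining):

```python
import numpy as np

def iou(box_a, box_b):
    """Intersection-over-union of two boxes in [ymin, xmin, ymax, xmax] format."""
    ymin = max(box_a[0], box_b[0])
    xmin = max(box_a[1], box_b[1])
    ymax = min(box_a[2], box_b[2])
    xmax = min(box_a[3], box_b[3])
    inter = max(0.0, ymax - ymin) * max(0.0, xmax - xmin)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    return inter / (area_a + area_b - inter)

def label_anchors(anchors, gt_boxes, pos_thresh=0.5):
    """Mark each anchor positive (1) if it overlaps any ground truth by IoU >= pos_thresh."""
    labels = []
    for anchor in anchors:
        best = max((iou(anchor, gt) for gt in gt_boxes), default=0.0)
        labels.append(1 if best >= pos_thresh else 0)
    return np.array(labels)
```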
To debug, I'd like to save these positive and negative crops to disk to see what samples the algorithm really uses for training.
I read the Python code of the TensorFlow Object Detection API, but I'm lost :(
If you have any hints, please share them!
Thanks!
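This is not an answer to where the matching lives in the API's code, but once you have extracted the matched boxes, the saving step itself could look like this sketch (PIL and pixel coordinates are assumptions):

```python
import os
from PIL import Image

def save_crops(image_path, boxes, labels, out_dir="debug_crops"):
    """Save each matched region as a file, named pos_* or neg_* by its label.

    boxes: pixel coordinates [ymin, xmin, ymax, xmax]; labels: 1 = positive, 0 = negative.
    """
    os.makedirs(out_dir, exist_ok=True)
    image = Image.open(image_path)
    for i, ((ymin, xmin, ymax, xmax), label) in enumerate(zip(boxes, labels)):
        tag = "pos" if label == 1 else "neg"
        # PIL's crop takes (left, upper, right, lower).
        image.crop((xmin, ymin, xmax, ymax)).save(os.path.join(out_dir, f"{tag}_{i}.png"))
```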
I'm using the TensorFlow models object detection for my use case, and there are some boxes/classes that I would like to ignore during training because their quality is not the best.
I don't want to black out the box areas with rectangles, because that would change the image,
and I don't want them to become negative examples in the training process.
Is there an easy way to do that?
I'm using the TensorFlow models object detection Faster R-CNN implementation with the PASCAL VOC data format.
I was working with TensorFlow's recently released API for object detection, with Faster R-CNN on ResNet-101, on my own dataset. It seems to train and evaluate on validation data, but I was hoping there was a way to store the predicted bounding boxes for all images in the eval set to a file, or to find the location in the source code where I can get the predicted bounding boxes along with the image names.
If you just want to obtain the detected bounding boxes given a set of images, the Jupyter notebook contains a good example of how to do this.
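In the spirit of that notebook, here is a condensed TF1-style sketch that collects boxes per image. The image paths are placeholders, and the tensor names are the ones the API's export script writes into the frozen graph:

```python
import numpy as np
import tensorflow as tf
from PIL import Image

# Load a frozen detection graph exported by the Object Detection API.
graph = tf.Graph()
with graph.as_default():
    graph_def = tf.GraphDef()
    with tf.gfile.GFile("frozen_inference_graph.pb", "rb") as f:
        graph_def.ParseFromString(f.read())
    tf.import_graph_def(graph_def, name="")

with tf.Session(graph=graph) as sess:
    for image_path in ["image1.jpg", "image2.jpg"]:  # placeholder paths
        image = np.expand_dims(np.array(Image.open(image_path)), axis=0)
        boxes, scores, classes = sess.run(
            ["detection_boxes:0", "detection_scores:0", "detection_classes:0"],
            feed_dict={"image_tensor:0": image})
        # boxes are normalized [ymin, xmin, ymax, xmax]; write them out per image.
        print(image_path, boxes[0][:3], scores[0][:3], classes[0][:3])
```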
We have been using TensorFlow for image classification, and we have all seen the results for the Admiral Grace Hopper image, where we get:
military uniform (866): 0.647296
suit (794): 0.0477196
academic gown (896): 0.0232411
bow tie (817): 0.0157356
bolo tie (940): 0.0145024
I was wondering if there is any way to get the coordinates for each category within the image.
TensorFlow doesn't have sample code yet for image detection and localization, but it's an open research problem with different approaches using deep nets; for example, you can look up the papers on algorithms called OverFeat and YOLO (You Only Look Once).
Also, there is usually some preprocessing of the object-coordinate labels, or postprocessing to suppress duplicate detections. Usually a second, different network is used to classify the object once it is detected.
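The duplicate-suppression step mentioned above is usually non-maximum suppression (NMS). Here is a minimal NumPy sketch, assuming boxes in [ymin, xmin, ymax, xmax] format:

```python
import numpy as np

def non_max_suppression(boxes, scores, iou_thresh=0.5):
    """Greedy NMS: keep the best-scoring box, drop boxes overlapping it too much, repeat."""
    order = np.argsort(scores)[::-1]  # indices sorted by descending score
    keep = []
    while order.size > 0:
        i = order[0]
        keep.append(i)
        rest = order[1:]
        # IoU of box i against all remaining boxes.
        ymin = np.maximum(boxes[i, 0], boxes[rest, 0])
        xmin = np.maximum(boxes[i, 1], boxes[rest, 1])
        ymax = np.minimum(boxes[i, 2], boxes[rest, 2])
        xmax = np.minimum(boxes[i, 3], boxes[rest, 3])
        inter = np.maximum(0.0, ymax - ymin) * np.maximum(0.0, xmax - xmin)
        area_i = (boxes[i, 2] - boxes[i, 0]) * (boxes[i, 3] - boxes[i, 1])
        area_rest = (boxes[rest, 2] - boxes[rest, 0]) * (boxes[rest, 3] - boxes[rest, 1])
        iou = inter / (area_i + area_rest - inter)
        order = rest[iou <= iou_thresh]
    return keep
```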