Tensorflow object detection API how to add background class samples? - tensorflow

I am using the TensorFlow Object Detection API with two classes of interest. In a first trial I got reasonable results, but the model easily produced false positives of both classes on pure background images. These background images (i.e., images without any class bounding box) were not included in the training set.
How can I add them to the training set? Simply adding samples without bounding boxes does not seem to work.

Your goal is to add negative images to your training dataset to strengthen the background class (id 0 in the Detection API). You can achieve this with the Pascal VOC XML annotation format: for a negative image, the XML file contains only the image's height and width, with no object entries. Normally the XML file lists each object's name along with its coordinates, height, and width. If you use labelImg, you can generate an XML file corresponding to your negative image with the Verify button. Roboflow can also generate XML files with and without objects.
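As a minimal sketch of the idea above, the following standard-library Python generates a Pascal VOC-style annotation for a negative (background-only) image; the filename and image size are hypothetical placeholders, and the key point is that no <object> elements are emitted:

```python
# Sketch: build a Pascal VOC annotation for a negative image, i.e. an
# annotation that records the image size but contains no <object> entries.
import xml.etree.ElementTree as ET

def make_negative_annotation(filename, width, height, depth=3):
    """Return a VOC XML string with image metadata but no <object> elements."""
    root = ET.Element("annotation")
    ET.SubElement(root, "filename").text = filename
    size = ET.SubElement(root, "size")
    ET.SubElement(size, "width").text = str(width)
    ET.SubElement(size, "height").text = str(height)
    ET.SubElement(size, "depth").text = str(depth)
    # Deliberately add no <object> elements: this marks a pure background image.
    return ET.tostring(root, encoding="unicode")

xml_str = make_negative_annotation("background_001.jpg", 640, 480)
print(xml_str)
```

When such files are converted to TFRecords, the resulting examples simply carry empty box lists, which is what lets the trainer treat the whole image as background.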

Related

Is it possible to use polygon data annotation to perform tensorflow object detection?

My problem is not exactly annotating data using polygons, circles, or lines; it's how to use these annotated data to generate a ".tfrecord" file and perform object detection. The tutorials I saw use rectangle annotation, like these: taylor swift detection, raccoon detection.
Rectangle annotation would work well for me if the objects I want to detect (pipelines) were not so close to each other.
Example of rectangle drawn in PASCAL VOC format:
<bndbox>
  <xmin>82</xmin>
  <xmax>172</xmax>
  <ymin>108</ymin>
  <ymax>146</ymax>
</bndbox>
Is there a way to add a "mask" to highlight some part of this bounding box?
If it's something unclear, please let me know.
You can go for instance segmentation instead of object detection if your objects are very close to each other; there you can use polygons to generate masks and bounding boxes to train the model.
Consider this well-presented and easy-to-use repository for Mask R-CNN (a kind of instance segmentation):
https://github.com/matterport/Mask_RCNN
Also check this for a lightweight Mask R-CNN.
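To illustrate how polygon annotations relate to rectangular boxes, here is a plain-Python sketch (the vertex-list polygon format is an assumption, not any tool's exact export format): the axis-aligned bounding box of a polygon is just the min/max of its vertex coordinates.

```python
# Sketch: derive an axis-aligned bounding box from a polygon annotation.
# A polygon is assumed to be a list of (x, y) vertices.

def polygon_to_bbox(points):
    """Return (xmin, ymin, xmax, ymax) enclosing the polygon vertices."""
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    return min(xs), min(ys), max(xs), max(ys)

# A polygon traced around a tilted pipeline segment:
pipeline = [(82, 120), (120, 108), (172, 130), (140, 146)]
print(polygon_to_bbox(pipeline))  # → (82, 108, 172, 146)
```

Instance-segmentation pipelines do essentially this to get the box, while also rasterizing the polygon into a per-pixel mask.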

Object detection when the object occupies the full region on the image?

I am working on object detection using TensorFlow with a mix of 7-8 classes. We initially had an image classification model and are now moving to an object detection model. For one class alone, the object to be detected occupies the entire image. Can the bounding box dimensions be the entire width and height of the image? Will that hinder performance?
It shouldn't hinder the performance as long as there are enough such examples in the training set.
The OD API clips detections that extend beyond the image, so in these cases the resulting bounding box would be that of the entire image (or one axis would span the entire size and the other less, depending on how much of the image the object occupies).
Assuming your OD model uses anchors, make sure you have anchors responsible for such cases (i.e. with a scale of about the entire image).
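The clipping behavior mentioned above can be sketched as follows (a plain-Python illustration of the idea, not the OD API's actual implementation):

```python
# Sketch: clip a box that extends beyond the image to the image boundaries.
# Boxes are (xmin, ymin, xmax, ymax) in pixels.

def clip_box(box, img_w, img_h):
    xmin, ymin, xmax, ymax = box
    return (max(0, xmin), max(0, ymin), min(img_w, xmax), min(img_h, ymax))

# A detection spilling past a 640x480 image gets clipped to the full frame:
print(clip_box((-12, -5, 655, 490), 640, 480))  # → (0, 0, 640, 480)
```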

TensorFlow: Collecting my own training data set & Using that training dataset to find the location of object

I'm trying to collect my own training data set for image detection (recognition, for now). Right now I have 4 classes and 750 images for each. The images are just regular images of each class; however, some of them are blurry or contain outside elements, such as different backgrounds or other factors (but nothing distinguishable). Using that training data set, image recognition is really bad.
My question is,
1. Does the training image set need to contain the object in various backgrounds/settings/environments (I believe not...)?
2. Let's say training worked fairly accurately and I want to know the location of the object in the image. I figure there is no way to find the location using image recognition alone, so if I use bounding boxes, how/where in the code can I see the location of the bounding box?
Thank you in advance!
It is difficult to know in advance what features your program will learn for each class. Then again, if your unseen images will have the same background, the background will play no role. I would suggest data augmentation in training: random color distortion, random flipping, and random cropping.
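As a hedged illustration, random cropping (one of the augmentations suggested above) can be sketched in plain Python on an image represented as a nested list; a real training pipeline would use library ops such as tf.image.random_crop instead:

```python
import random

# Sketch: random crop augmentation on a toy image (nested list of pixel values).
# This only illustrates the idea; real pipelines operate on tensors.

def random_crop(image, crop_h, crop_w, rng=random):
    h, w = len(image), len(image[0])
    top = rng.randrange(h - crop_h + 1)
    left = rng.randrange(w - crop_w + 1)
    return [row[left:left + crop_w] for row in image[top:top + crop_h]]

image = [[r * 10 + c for c in range(6)] for r in range(4)]  # 4x6 toy image
crop = random_crop(image, 2, 3)
print(len(crop), len(crop[0]))  # → 2 3
```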
You can't see the bounding box location in the code directly. You have to label/annotate the boxes yourself first in your collected data, using a tool such as LabelMe. Then comes training the object detector.

If I resize images using Tensorflow Object Detection API, are the bboxes automatically resized too?

TensorFlow's Object Detection API has an option in the .config file to add a keep_aspect_ratio_resizer. If I resize my training data using this, will the corresponding bounding boxes be resized as well? If they don't match up, the network is seeing incorrect examples.
Yes, the boxes will be resized to stay consistent with the images.
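One way to see why this works: the OD API stores box coordinates normalized to [0, 1], so they remain valid whatever pixel size the image is resized to. A plain-Python sketch of that convention:

```python
# Sketch: boxes kept as fractions of the image size survive resizing;
# converting back to pixels just uses the new image dimensions.

def to_normalized(box, img_w, img_h):
    xmin, ymin, xmax, ymax = box
    return (xmin / img_w, ymin / img_h, xmax / img_w, ymax / img_h)

def to_pixels(nbox, img_w, img_h):
    xmin, ymin, xmax, ymax = nbox
    return (xmin * img_w, ymin * img_h, xmax * img_w, ymax * img_h)

nbox = to_normalized((82, 108, 172, 146), 400, 300)   # box on a 400x300 image
print(to_pixels(nbox, 200, 150))  # same box after a 2x downscale → (41.0, 54.0, 86.0, 73.0)
```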

extract an object from image using some image processing filter

I am working on an application in which I have an image containing, e.g., a glass, a cup, or a chair. The object can be of any type.
My question is: is there any way to apply some image processing filter, or something like that, which returns an image containing just the object with a transparent background?
You can use object detection methods such as
http://opencv.willowgarage.com/documentation/object_detection.html
http://docs.opencv.org/modules/objdetect/doc/latent_svm.html
to detect the object, plot a bounding box around it and extract it from the image.
Depending on your application, you can also use image differencing (background subtraction) to isolate the object...
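A minimal sketch of background subtraction on toy grayscale frames (nested lists), assuming a static background image is available; real code would use something like OpenCV's cv2.absdiff followed by a threshold:

```python
# Sketch: background subtraction by absolute pixel difference and threshold.
# Pixels whose difference from the background exceeds the threshold are
# marked 1 (foreground); the rest are 0 (background).

def foreground_mask(background, frame, threshold=30):
    return [
        [1 if abs(f - b) > threshold else 0 for f, b in zip(frow, brow)]
        for frow, brow in zip(frame, background)
    ]

background = [[10, 10, 10], [10, 10, 10]]
frame      = [[10, 200, 10], [10, 205, 12]]  # a bright object appears mid-frame
print(foreground_mask(background, frame))  # → [[0, 1, 0], [0, 1, 0]]
```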
Actually, I have solved the problem.
The issue was that I did not want to use any advanced method involving template matching, neural networks, or anything like that.
In my case the aim was to recognize an object in an image, and that object could be anything (e.g. a table, a cellphone, a person, a shirt, etc.); the catch was that there could be at most one object in an image.
Using just OpenCV's watershed segmentation, I was able to separate the object from the background.
However, the threshold used for the watershed differs with the frequency content of the image and the difference in shade between the object and the background.