Train AWS SageMaker Object Detection algorithm with empty images? - object-detection

I want to train the SageMaker Object Detection algorithm on my own dataset, and I want to know whether the dataset should include images that contain no objects.
My dataset comes from Ground Truth, with images from an outdoor camera: some have labelled objects, and some have no labels (no object).

Yes, absolutely. This will help your model learn better. For example, the Caltech-256 dataset has a clutter class (#257) with random images containing no objects: http://www.vision.caltech.edu/Image_Datasets/Caltech256/images/
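To check how many empty images a dataset actually contains, the labelled and unlabelled records can be counted before training. The following is a minimal sketch, assuming the Ground Truth output manifest is a JSON Lines file in which each record has a label attribute holding an "annotations" list. The attribute name "bounding-box", the bucket name, and the function name are hypothetical; the real attribute name depends on the labeling job.

```python
import json


def count_empty_images(manifest_lines, label_attribute="bounding-box"):
    """Return (labelled, empty) counts for the lines of a JSON Lines manifest.

    label_attribute is an assumption: use the attribute name of your own
    labeling job.
    """
    labelled = 0
    empty = 0
    for line in manifest_lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        record = json.loads(line)
        annotations = record.get(label_attribute, {}).get("annotations", [])
        if annotations:
            labelled += 1
        else:
            empty += 1  # image with no object
    return labelled, empty


# Two made-up records: one with a box, one without any object.
sample = [
    json.dumps({
        "source-ref": "s3://example-bucket/img1.jpg",
        "bounding-box": {"annotations": [
            {"class_id": 0, "left": 10, "top": 20, "width": 50, "height": 40}
        ]},
    }),
    json.dumps({
        "source-ref": "s3://example-bucket/img2.jpg",
        "bounding-box": {"annotations": []},
    }),
]
print(count_empty_images(sample))  # (1, 1)
```

With a real manifest, pass the open file object instead of the sample list, since iterating over a file yields its lines.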

Related

Model training - cropped image of the object VS bigger image with bounding box

I need to train a new model (Keras + TensorFlow), and I was wondering whether there is any difference between:
Providing a set of images containing only the object of interest (cropped from the original image)
Providing bigger images with object annotations (the coordinates of the bounding box and the class)
My logic tells me that, internally, the training is most probably done only on the cropped part, so technically there shouldn't be a difference.
Regards
The two approaches you are describing are commonly referred to as image classification (where a model only needs to classify the image) and object detection (where a model needs to detect the location of an object in an image and classify it). They are sometimes simply called "classification" and "detection". These two approaches require different techniques, and different models have been developed to handle each one. In general, image classification is the easier problem, as you may have intuited.
Which approach to use depends on your end application. If you only need to know, "does an object exist in this image" then you can use classification techniques. If you need to know "where in this image is the object" or "how many of these objects are in the image", then you should use detection techniques.
What may be non-intuitive is that object detection is not simply an extension of image classification, so if you need object detection it is best to start with object detection models instead of building an image classifier that you then extend to object detection. This article provides some intuition on this topic.
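The difference between the two kinds of training sample can be made concrete with a small sketch. It assumes a toy image stored as a nested list and a hypothetical annotation layout (a "label" plus a pixel "box" given as xmin, ymin, xmax, ymax). It shows that cropping turns one detection sample into classification samples, and that the crop discards the location and the surrounding context that a detector learns from.

```python
def crop_objects(image, annotations):
    """Turn one detection sample into (crop, label) classification samples.

    image: list of rows, each row a list of pixels.
    annotations: list of dicts with "label" and "box" = (xmin, ymin, xmax, ymax),
    a hypothetical layout used only for this illustration.
    """
    samples = []
    for ann in annotations:
        xmin, ymin, xmax, ymax = ann["box"]
        crop = [row[xmin:xmax] for row in image[ymin:ymax]]
        samples.append((crop, ann["label"]))
    return samples


# Toy image, 6 pixels wide and 4 pixels high; each pixel stores its own (x, y).
image = [[(x, y) for x in range(6)] for y in range(4)]

# Detection sample: the full image plus where the object is and what it is.
detection_sample = {
    "image": image,
    "annotations": [{"label": "car", "box": (1, 1, 4, 3)}],
}

# Classification samples: only the cropped pixels and the class.
classification_samples = crop_objects(image, detection_sample["annotations"])
crop, label = classification_samples[0]
print(label, len(crop), len(crop[0]))  # car 2 3
```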

Image Detection & Classification - general approach?

I'm trying to build a detection + classification model that will recognize an object in an image and classify it. Every image will contain at most one object from my 10 classes (i.e. the same image cannot contain two classes). An image can, however, contain none of my classes/objects. I'm struggling with the general approach to this problem, especially because my objects have different sizes. This is what I have tried:
1. Trained a classifier with images that contain only my objects/classes, i.e. every image is the object itself with the background already removed. Since the objects/images have different shapes (aspect ratios), I had to resize the images to the same size (destroying the aspect ratios). This would work just fine if my purpose were only to build a classifier, but since I also need to detect the objects, it didn't work well.
2. The second approach was similar to (1), except that I didn't resize the objects naively, but kept the aspect ratios by padding the image with 0 (black). This completely destroyed my classifier's ability to perform well (accuracy < 5%).
3. Mask RCNN - I followed this blogpost to try to build a detector + classifier in the same model. The approach took forever, and I wasn't sure it was the right one. I even used an external tool (RectLabel) to generate annotated image files containing information about the bounding boxes.
Question:
How should I approach this problem, on a general level:
Should I build 2 separate models? (One for detection/localization and one for classification?)
Should I be annotating my images using annotations file as in approach (3)?
Do I have to reshape my images at any stage?
Thanks,
PS. In all of my approaches, I augmented the images to generate ~500-1000 images per class.
To answer your questions:
No, you don't have to build two separate models. What you are describing is called object detection, which is classification along with localization. There are many models which do this: Mask_RCNN, Yolo, Detectron, SSD, etc.
Yes, you do need to annotate your images to train a model for your custom classes. Each of the models mentioned above needs a different annotation format.
No, you don't need to do any image resizing. Most of the time it is done when the model loads the data for training or inference.
You are on the right track with trying MaskRCNN.
Other than MaskRCNN, you could also try Yolo. It comes with an easy-to-use annotation tool, Yolo-Mark.
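On the resizing point, the arithmetic behind an aspect-ratio-preserving resize with padding (often called letterboxing) can be sketched as follows. This is an illustration under stated assumptions, not what any particular model does internally: the function names are hypothetical, the target is assumed to be square, and boxes are assumed to be pixel coordinates (xmin, ymin, xmax, ymax).

```python
def letterbox_geometry(width, height, target):
    """Scale and padding needed to fit an image into a target x target square.

    Returns (scale, new_width, new_height, pad_x, pad_y), where pad_x and
    pad_y are the left and top padding in pixels.
    """
    scale = target / max(width, height)
    new_width = round(width * scale)
    new_height = round(height * scale)
    pad_x = (target - new_width) // 2
    pad_y = (target - new_height) // 2
    return scale, new_width, new_height, pad_x, pad_y


def box_to_letterbox(box, scale, pad_x, pad_y):
    """Map a pixel box (xmin, ymin, xmax, ymax) into the letterboxed image."""
    xmin, ymin, xmax, ymax = box
    return (
        xmin * scale + pad_x,
        ymin * scale + pad_y,
        xmax * scale + pad_x,
        ymax * scale + pad_y,
    )


# A 400x200 image fitted into a 200x200 square.
scale, new_w, new_h, pad_x, pad_y = letterbox_geometry(400, 200, 200)
print(scale, new_w, new_h, pad_x, pad_y)  # 0.5 200 100 0 50
print(box_to_letterbox((100, 50, 300, 150), scale, pad_x, pad_y))
# (50.0, 75.0, 150.0, 125.0)
```

The bounding boxes have to follow the same scale and offset as the pixels, which is one reason to leave resizing to the training pipeline rather than resizing images by hand after they have been annotated.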
If you go through this tutorial, you should find what you need:
How to train your own Object Detector with TensorFlow's Object Detector API
The SSD model is small, so training it does not take much time.
Other object detection models are available as well.
In RectLabel, you can save bounding boxes in the PASCAL VOC format.
You can also export TFRecord files for TensorFlow.
https://rectlabel.com/help#tf_record
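For reference, a PASCAL VOC annotation is an XML file with one "object" element per box. The following is a minimal sketch of reading one with the Python standard library; the sample file content, the label, and the function name are made up for illustration, and only the fields used here are included.

```python
import xml.etree.ElementTree as ET

# Made-up annotation with a single object, reduced to the fields read below.
VOC_XML = """<annotation>
  <filename>example.jpg</filename>
  <size><width>640</width><height>480</height><depth>3</depth></size>
  <object>
    <name>helmet</name>
    <bndbox><xmin>100</xmin><ymin>120</ymin><xmax>220</xmax><ymax>260</ymax></bndbox>
  </object>
</annotation>"""


def parse_voc(xml_text):
    """Return (filename, boxes), where boxes is a list of label/box dicts."""
    root = ET.fromstring(xml_text)
    boxes = []
    for obj in root.findall("object"):
        bndbox = obj.find("bndbox")
        coords = tuple(
            int(float(bndbox.findtext(tag)))
            for tag in ("xmin", "ymin", "xmax", "ymax")
        )
        boxes.append({"label": obj.findtext("name"), "box": coords})
    return root.findtext("filename"), boxes


filename, boxes = parse_voc(VOC_XML)
print(filename, boxes)
# example.jpg [{'label': 'helmet', 'box': (100, 120, 220, 260)}]
```

An annotation with no "object" elements yields an empty list, which is how an image without objects would appear in this sketch.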

Ignore some class in train

I'm using the TensorFlow Object Detection API (tensorflow/models) for my use case, and I have some boxes/classes that I would like to ignore during training because their quality is not good enough.
I don't want to cover the box areas with black rectangles, because that would change the image,
and I don't want them to be treated as negative examples during training.
Is there an easy way to do that?
I'm using the Faster R-CNN implementation from the TensorFlow Object Detection API, with data in the PASCAL VOC format.

Using Tensorflow object detection API to detect objects and classify objects by color

I am able to use TensorFlow to train a model on my own dataset. For example, I have trained a model to detect only safety helmets, and the results are good.
My plan for the next step is to classify the detected safety helmets by color, but I am still searching for a method.
I am wondering whether I should retrain the model with a different label map, such as [item1 red_helmet] [item2 blue_helmet], and label my training dataset accordingly. Or is there another trick to achieve the same outcome?
You already have the region of interest in the picture.
All you need to do is extract the helmets from the picture and pass the cropped images to an OpenCV routine that can detect colours.
That's it, you are done :)
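The crop-then-classify idea can be sketched as follows. To stay self-contained, this sketch uses the standard-library colorsys module on a nested-list image instead of OpenCV, so it is a stand-in for the OpenCV routine mentioned above, not that routine. It assumes boxes in the normalized (ymin, xmin, ymax, xmax) order that the TensorFlow Object Detection API returns. The function names, the hue thresholds, and the colour names are hypothetical choices for illustration.

```python
import colorsys


def crop_normalized(image, box):
    """Crop a nested-list RGB image with a normalized (ymin, xmin, ymax, xmax) box."""
    height, width = len(image), len(image[0])
    ymin, xmin, ymax, xmax = box
    top, bottom = int(ymin * height), int(ymax * height)
    left, right = int(xmin * width), int(xmax * width)
    return [row[left:right] for row in image[top:bottom]]


def dominant_color_name(crop):
    """Name the average colour of a crop using rough, hypothetical hue bands."""
    pixels = [pixel for row in crop for pixel in row]
    if not pixels:
        raise ValueError("empty crop")
    n = len(pixels)
    r = sum(p[0] for p in pixels) / n / 255
    g = sum(p[1] for p in pixels) / n / 255
    b = sum(p[2] for p in pixels) / n / 255
    hue, saturation, value = colorsys.rgb_to_hsv(r, g, b)
    if saturation < 0.2:
        return "white" if value > 0.8 else "gray"
    hue_deg = hue * 360
    if hue_deg < 30 or hue_deg >= 330:
        return "red"
    if hue_deg < 90:
        return "yellow"
    if hue_deg < 180:
        return "green"
    if hue_deg < 270:
        return "blue"
    return "magenta"


# 4x4 gray image with a 2x2 blue patch in the middle (a stand-in for a helmet).
image = [[(128, 128, 128)] * 4 for _ in range(4)]
for y in (1, 2):
    for x in (1, 2):
        image[y][x] = (0, 0, 255)

helmet_crop = crop_normalized(image, (0.25, 0.25, 0.75, 0.75))
print(dominant_color_name(helmet_crop))  # blue
```

Averaging the whole crop is a simplification: a real crop also contains background and shadows, so a histogram over the hue channel is likely to be more robust.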

How to detect objects in addition to coco dataset?

I'm using the TensorFlow Object Detection API with the COCO dataset provided in the tutorial.
If I use the API to detect custom objects, how do I "add" to the list of objects being detected from the COCO dataset? Is there a way to merge?
If you mean using a model trained on the COCO dataset to detect objects that are not in the COCO dataset, you cannot do that. I think you will need to take a model, in this case one already trained on COCO, and train it on the new objects that you want to detect. There is a tutorial here that shows how to train a model on a custom dataset.
If you do not want to train a model, you need to find one that is already trained on the objects that you want to detect.
Have I understood correctly?