If there are multiple detected objects in a picture, what is the output shape of Faster R-CNN?

If Faster R-CNN is used for object detection and my classification task has 10+1 classes (10 object classes plus background), what is the final output shape of the Faster R-CNN classification head and regression head? Equivalently, what is the corresponding label shape when creating a dataset?
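The question doesn't name a framework, so here is a minimal sketch using torchvision's Faster R-CNN as one common implementation (a recent torchvision is assumed); the shapes generalize to other implementations.

import torch
import torchvision

# 10 object classes + 1 background class, as in the question.
num_classes = 11
model = torchvision.models.detection.fasterrcnn_resnet50_fpn(
    weights=None, weights_backbone=None, num_classes=num_classes)

# Per region proposal, the classification head emits 11 logits and the
# regression head emits 4 box offsets per class (11 * 4 = 44):
print(model.roi_heads.box_predictor.cls_score)  # Linear(1024 -> 11)
print(model.roi_heads.box_predictor.bbox_pred)  # Linear(1024 -> 44)

# At inference time the output is one dict per image; N, the number of
# detections, varies from picture to picture:
model.eval()
with torch.no_grad():
    out = model([torch.rand(3, 480, 640)])[0]
print(out["boxes"].shape, out["labels"].shape, out["scores"].shape)
# torch.Size([N, 4]) torch.Size([N]) torch.Size([N])

# On the dataset side, the training target is likewise per image:
# {"boxes": float tensor of shape (M, 4), "labels": int64 tensor of shape (M,)}
# where M is the number of ground-truth objects in that image.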

Related

How to make TensorFlow object detection work on data having the same shape but different colors?

I have a dataset of 3 types of cars: CarA, CarB, CarC.
All cars have the same shape but a different color and/or logo.
I have trained an object detection model using the TensorFlow Object Detection API with the base model SSD ResNet50 V1 FPN 640x640 (RetinaNet50). Training was run for 10,000 steps and stopped when the loss reached 0.15.
When I test the model, it classifies any car with that shape as all three of CarA, CarB, and CarC. Is the model unable to distinguish based on color/logo, and does it only work based on shape? Can this color aspect be handled better using a specific base model?

Training YOLO on dataset containing "hidden" tableware objects

I need to train YOLO on a dataset containing partly visible tableware that is overlapped by other objects.
Would it be better to exclude such "hard-to-detect" examples from the training dataset?

How to generate the labels of custom data for YOLO

The label format for YOLO is [class, x_center, y_center, width, height], with coordinates normalized to the image size. Since the dataset is very large, is there any shortcut to generate the labels for YOLO, or do we have to hand-label them through measurement?
Method 1: Using pre-trained YOLOv4 models.
YOLOv4 models were pre-trained on the COCO dataset, so if your object(s) can be found in its class list, you can use the pre-trained weights to pseudo-label your object(s).
To process a list of images data/new_train.txt and save detection results in YOLO training format as a label file <image_name>.txt for each image, use: darknet.exe detector test cfg/coco.data cfg/yolov4.cfg yolov4.weights -thresh 0.25 -dont_show -save_labels < data/new_train.txt
Method 2: Using other pre-trained models. It's the same concept: use other pre-trained models to detect your object (as long as they were trained on your object), then export/convert the labels to YOLO format; a conversion sketch follows this list.
Method 3: Use hand-crafted feature descriptors. Examples are shape detection, color-based detection, etc.
Method 4: Manual labelling. If everything else fails, do the labelling yourself or hire a data-labelling service. Here's a list of tools you can use if you want to label the images yourself.
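For Method 2, the conversion step is mechanical. Below is a minimal sketch (the function name and the corner-coordinate input convention are assumptions for illustration) that turns a pixel-space box into a YOLO label line: class index followed by the box center and size, all normalized to [0, 1].

def to_yolo_label(class_id, x_min, y_min, x_max, y_max, img_w, img_h):
    # YOLO expects the box center and size, normalized by the image dimensions.
    x_center = (x_min + x_max) / 2.0 / img_w
    y_center = (y_min + y_max) / 2.0 / img_h
    width = (x_max - x_min) / img_w
    height = (y_max - y_min) / img_h
    return f"{class_id} {x_center:.6f} {y_center:.6f} {width:.6f} {height:.6f}"

# One line per object, saved as <image_name>.txt next to the image:
print(to_yolo_label(0, 120, 80, 360, 240, img_w=640, img_h=480))
# -> 0 0.375000 0.333333 0.375000 0.333333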

TensorFlow Object Detection API - what do the losses mean in the object detection api?

What does each of the following losses mean? (In the TensorFlow Object Detection API, while training Faster R-CNN based models.)
Loss/BoxClassifierLoss/classification_loss/mul_1
Loss/BoxClassifierLoss/localization_loss/mul_1
Loss/RPNLoss/localization_loss/mul_1
Loss/RPNLoss/objectness_loss/mul_1
clone_loss_1
The losses for the Region Proposal Network:
Loss/RPNLoss/localization_loss/mul_1: Localization Loss or the Loss of the Bounding Box regressor for the RPN
Loss/RPNLoss/objectness_loss/mul_1: Loss of the Classifier that classifies if a bounding box is an object of interest or background
The losses for the Final Classifier:
Loss/BoxClassifierLoss/classification_loss/mul_1: Loss for the classification of detected objects into the various classes (cat, dog, airplane, etc.)
Loss/BoxClassifierLoss/localization_loss/mul_1: Localization Loss or the Loss of the Bounding Box regressor
clone_loss_1 is relevant only if you train on multiple GPUs: TensorFlow creates a clone of the model to train on each GPU and reports the loss on each clone. If you are training the model on a single GPU/CPU, you will just see clone_loss_1, which is the same as TotalLoss.
The other losses are as described in Rohit's answer.
There are four losses that you will encounter if you are using the Faster R-CNN network:
1. RPN Loss / Localization Loss
In the Faster R-CNN architecture, a CNN generates the region proposals from the feature map, and loss functions are attached to this stage. This is the localization loss for the bounding boxes of the generated anchors, i.e. how well the RPN regresses each anchor toward its ground-truth box.
2. RPN Loss / Objectness Loss
Also computed while extracting the region proposals: it measures whether an object is present in the anchor box or not.
3. BoxClassifierLoss / Classification Loss
Computed at the final layer: it measures which class the object belongs to (dog or cat?).
4. BoxClassifierLoss / Localization Loss
Also computed at the final layer, for the bounding box of the detected object (the coordinates for the dog or the cat).
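As a rough illustration of the two loss types above, here is a sketch with raw tensors, not the API's actual implementation (which adds anchor matching, normalization, and per-example weighting on top): the localization losses are typically a smooth L1 over box offsets, and the objectness loss is a cross-entropy over object vs. background, shown here in its binary sigmoid form for brevity.

import tensorflow as tf

def smooth_l1(pred, target):
    # Smooth L1 (Huber) loss: quadratic for small errors, linear for large ones.
    diff = tf.abs(pred - target)
    return tf.where(diff < 1.0, 0.5 * diff ** 2, diff - 0.5)

# Localization loss over 8 proposals, 4 box offsets each (dummy zero targets):
loc_loss = tf.reduce_mean(smooth_l1(tf.random.normal([8, 4]), tf.zeros([8, 4])))

# Objectness loss: an object-vs-background decision per anchor.
obj_loss = tf.reduce_mean(tf.nn.sigmoid_cross_entropy_with_logits(
    labels=tf.ones([8, 1]), logits=tf.random.normal([8, 1])))
print(loc_loss.numpy(), obj_loss.numpy())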

Changing Inception-v4 architecture to do Multi-label classification in Tensorflow

I am working on an image tagging and annotation problem: an image may contain multiple objects. I want to train Inception-v4 for multi-label classification. My training data will be an image and a vector whose length equals the number of classes, with a 1 at each index whose object appears in the image. For example, if I have four classes (person, car, tree, buildings) and an image contains a person and a car, then my vector will be (1, 1, 0, 0).
What changes do I need to make to train inception-v4 for the tagging and annotation problem?
Do I only need to change the input format and change the loss function from softmax to sigmoid_cross_entropy_with_logits in the inception-v4 architecture?
https://github.com/tensorflow/models/blob/master/slim/nets/inception_v4.py
Thank you in advance.
If you'd like to retrain a model to output different labels, check out the image_retraining example: https://github.com/tensorflow/tensorflow/blob/r1.1/tensorflow/examples/image_retraining/retrain.py
In that example, we retrain the standard Inception v3 model to recognize flowers instead of the standard ImageNet categories.
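For the specific loss change the question asks about, here is a minimal TF2-style sketch; the logits below are a stand-in for the final-layer output of inception_v4, assumed to have shape [batch, num_classes].

import tensorflow as tf

num_classes = 4  # (person, car, tree, buildings)
logits = tf.random.normal([2, num_classes])   # stand-in for network outputs
labels = tf.constant([[1., 1., 0., 0.],       # person + car
                      [0., 0., 1., 1.]])      # tree + buildings

# Multi-label loss: an independent sigmoid cross-entropy term per class,
# instead of a single softmax over all classes.
per_class = tf.nn.sigmoid_cross_entropy_with_logits(labels=labels, logits=logits)
loss = tf.reduce_mean(tf.reduce_sum(per_class, axis=1))

# At inference, threshold each sigmoid independently instead of taking argmax:
predictions = tf.cast(tf.sigmoid(logits) > 0.5, tf.float32)

This keeps each class decision independent, which is exactly what the multi-hot label vector in the question encodes.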