When resizing images in the TensorFlow Object Detection API, which function resizes the box coordinates? - tensorflow

While reading the code, I found that if I set an image_resizer in the .config file, the image is resized in the function SSDMetaArch.preprocess(), but I cannot find which function resizes the corresponding groundtruth bbox coordinates.
Thank you so much for any help!

Related

Can I get the coordinates of objects in an image with yolov5?

I want to know whether it is possible to use yolov5 to find objects in an image and then get back the type of each object and where it is in the picture, not only the image with the bounding boxes drawn on it.
Found it:
results.pandas().xyxy[0]
or
results.xyxy[0]
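
For illustration, here is a minimal sketch of reading those rows. It assumes each row of results.xyxy[0] has the layout [xmin, ymin, xmax, ymax, confidence, class], matching the column names of the pandas variant. The sample rows, the names mapping and the helper describe_detections are made up for the example and only stand in for a real model's output.

```python
def describe_detections(rows, names):
    """Turn xyxy rows into dictionaries with a label and pixel coordinates.

    rows  : iterable of [xmin, ymin, xmax, ymax, confidence, class_index]
    names : mapping from class index to label (hypothetical here)
    """
    detections = []
    for xmin, ymin, xmax, ymax, conf, cls in rows:
        detections.append({
            "label": names[int(cls)],
            "confidence": float(conf),
            "box": (float(xmin), float(ymin), float(xmax), float(ymax)),
        })
    return detections


# Made-up rows standing in for results.xyxy[0].tolist()
sample_rows = [
    [10.0, 20.0, 110.0, 220.0, 0.91, 0.0],
    [300.0, 50.0, 380.0, 120.0, 0.55, 2.0],
]
sample_names = {0: "person", 2: "car"}

for det in describe_detections(sample_rows, sample_names):
    print(det["label"], det["confidence"], det["box"])
```

With a real model, the rows from results.xyxy[0] and the model's own class-name mapping would take the place of the sample data.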

Bounding box coordinates via video object detection

Is it possible to get the coordinates of the detected object through video object detection? I used OpenCV to convert the video to images and LabelImg to draw bounding boxes on them. For the output, I want to be able to read (for a video file) the coordinates of the bounding box so I can get the centre of the box.
Object detection works on a per-image (per-frame) basis.
A basic object detection model takes an image as input and returns the detected bounding boxes.
So, once your model is trained, read the video frame by frame (or skip 'n' frames between reads), pass each frame to the object detector, take the coordinates from its output, and draw them on the output video frame.
That is how object detection works for a video.
Please refer to the links below for reference:
https://github.com/tensorflow/models/tree/master/research/object_detection
https://www.edureka.co/blog/tensorflow-object-detection-tutorial/
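
The frame-by-frame loop described above could be sketched roughly as follows. The detector is a hypothetical callable (detect) that returns (xmin, ymin, xmax, ymax) boxes in pixels, and the frames are placeholder strings; with OpenCV, the frames would instead come from cv2.VideoCapture(...).read(). The helper names box_center and centers_per_frame are invented for this example.

```python
def box_center(box):
    """Return the (x, y) centre of an (xmin, ymin, xmax, ymax) box."""
    xmin, ymin, xmax, ymax = box
    return ((xmin + xmax) / 2.0, (ymin + ymax) / 2.0)


def centers_per_frame(frames, detect, skip=0):
    """Run `detect` on every (skip + 1)-th frame and collect box centres.

    frames : iterable of frames (e.g. images read from a video)
    detect : hypothetical callable, frame -> list of (xmin, ymin, xmax, ymax)
    skip   : number of frames to skip between processed frames
    """
    results = {}
    for index, frame in enumerate(frames):
        if index % (skip + 1) != 0:
            continue  # frame skipped, no detection run on it
        results[index] = [box_center(box) for box in detect(frame)]
    return results


# Stand-in detector: pretends every frame contains one fixed box.
def fake_detect(frame):
    return [(10, 20, 50, 80)]


fake_frames = ["frame0", "frame1", "frame2", "frame3"]
print(centers_per_frame(fake_frames, fake_detect, skip=1))
```

The result maps each processed frame index to the centres of the boxes found in that frame.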

Resize images for object detection

I want to train Mask RCNN on my images, and my understanding is that all the images need to be the same size. I also read that you can add "padding" to images so that you retain the right aspect ratio.
Does anyone know how to add padding to the images and resize them? Does anyone have code for that, or know an online tool which can do it?
Thanks
The OpenCV library has a padding function that can add borders to your images.
It also has a resize function.
Refer to this webpage.
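
As a rough sketch of the arithmetic involved, assuming a square target size and centred padding (both are choices made for this example, not something the answer specifies): the helpers letterbox_params and letterbox_box below are hypothetical names that only compute the numbers. In OpenCV, the resize itself would be done with cv2.resize and the borders added with cv2.copyMakeBorder.

```python
def letterbox_params(width, height, target):
    """Compute the resized size and padding for a square target canvas.

    Returns (new_w, new_h, pad_left, pad_top, pad_right, pad_bottom).
    """
    scale = target / max(width, height)  # longer side becomes `target`
    new_w = round(width * scale)
    new_h = round(height * scale)
    pad_w = target - new_w
    pad_h = target - new_h
    pad_left, pad_top = pad_w // 2, pad_h // 2
    return new_w, new_h, pad_left, pad_top, pad_w - pad_left, pad_h - pad_top


def letterbox_box(box, width, height, target):
    """Map an (xmin, ymin, xmax, ymax) pixel box into the padded image."""
    new_w, new_h, pad_left, pad_top, _, _ = letterbox_params(width, height, target)
    sx, sy = new_w / width, new_h / height
    xmin, ymin, xmax, ymax = box
    return (xmin * sx + pad_left, ymin * sy + pad_top,
            xmax * sx + pad_left, ymax * sy + pad_top)


# An 800x400 image placed on a 512x512 canvas, with one example box.
print(letterbox_params(800, 400, 512))
print(letterbox_box((100, 100, 300, 200), 800, 400, 512))
```

Any bounding boxes or masks have to go through the same scale and offset as the image, which is what letterbox_box does for a single box.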

Why put the whole image in a tfrecord file? Why not just crop according to the bounding-box and put the cropped object in the tfrecord file?

Why do we put the whole image in a tfrecord file? Why not just crop the image according to the bounding-box and put the cropped object in the tfrecord file? This should greatly reduce the size of that file.
Because you want the network to learn where the object is in the image. In image classification, you would crop the images as you proposed and the network would output "car" or "not car". In object detection, the network outputs the bounding boxes for the objects along with their classes ("car is at x1-x2-y1-y2"). It learns from the whole picture, with the groundtruth bounding boxes used in the loss function.

If I resize images using Tensorflow Object Detection API, are the bboxes automatically resized too?

TensorFlow's Object Detection API has an option in the .config file to add a keep_aspect_ratio_resizer. If I resize my training data using this, will the corresponding bounding boxes be resized as well? If they don't match up, the network will see incorrect examples.
Yes, the boxes will be resized to be compatible with the images as well!
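
One way to see how boxes can stay compatible with a resized image, under the assumption that the groundtruth boxes are stored in normalized [ymin, xmin, ymax, xmax] form (fractions of the image height and width): a plain resize leaves such coordinates unchanged, and only their pixel equivalents scale with the image. The helper to_absolute and the example box are made up for this sketch, which does not cover resizers that also pad the image.

```python
def to_absolute(box, width, height):
    """Convert a normalized [ymin, xmin, ymax, xmax] box to pixel units."""
    ymin, xmin, ymax, xmax = box
    return (ymin * height, xmin * width, ymax * height, xmax * width)


normalized_box = (0.25, 0.125, 0.75, 0.5)  # made-up example box

# The same normalized box, before and after an aspect-preserving resize.
print(to_absolute(normalized_box, width=1200, height=800))
print(to_absolute(normalized_box, width=600, height=400))
```

The pixel coordinates halve along with the image while the normalized box is untouched.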