RGB or BGR for Tensorflow-slim-ResNet V2 pre-trained model?

For CNN training, the expected order of input image channels can differ from library to library, and even from model to model. Caffe usually expects input images in BGR order, while in TensorFlow the order can be arbitrary.
So does anyone know for sure in which order (BGR or RGB) the ResNet_V2 pre-trained model of the TensorFlow-slim library was trained? The documentation says that:
I also checked the script at https://github.com/tensorflow/models/blob/master/research/slim/datasets/build_imagenet_data.py, which says the image is encoded in RGB, but I'm still not sure in which order ResNet_V2 was trained.
Does anyone else have the same confusion about this issue? Thanks for any feedback!

It is RGB. The colorspace depends on how the image was read into memory during data preparation. Caffe uses OpenCV for many image operations, and OpenCV defaults to reading images in BGR order, while the TensorFlow ecosystem more often relies on the PIL library, which reads images in RGB order.
The colorspace stated in the script is RGB, see line 206.
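As a minimal sketch of the pitfall (the file name is hypothetical): if you decode with OpenCV, swap the channels before feeding a TF-slim checkpoint:

```python
import cv2

# OpenCV decodes to BGR; TF-slim's ResNet_V2 checkpoints were trained on
# RGB data, so convert before feeding the network.
bgr = cv2.imread("image.jpg")                 # hypothetical path; H x W x 3, BGR
rgb = cv2.cvtColor(bgr, cv2.COLOR_BGR2RGB)    # now matches the training colorspace
```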

Related

Image decoded/encoded/decoded in Tensorflow is not the same as the original

I store images in TFRecord files for training an image classification model with TensorFlow 2.10. The TFRecords are read into a dataset to which I apply the fit() function. After training, I run inference with:
an image from the dataset
the same image read from disk
I notice that the predictions are not the same, because in the first case the image undergoes (in the process of writing and then reading the TFRecord file to build the dataset) a decode/encode/decode transformation (the TF functions tf.io.decode_jpeg and tf.io.encode_jpeg) that is not symmetrical: the image after the transformation is not the same as the original (even if I encode with quality=100).
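A minimal sketch of the round trip that reproduces the mismatch (the file name is hypothetical):

```python
import tensorflow as tf

raw = tf.io.read_file("image.jpg")                 # hypothetical JPEG
img1 = tf.io.decode_jpeg(raw)                      # first decode
reencoded = tf.io.encode_jpeg(img1, quality=100)   # lossy re-encode
img2 = tf.io.decode_jpeg(reencoded)                # second decode

# JPEG encoding is lossy, so the round trip is not the identity:
diff = tf.abs(tf.cast(img1, tf.int32) - tf.cast(img2, tf.int32))
print(tf.reduce_max(diff))  # typically > 0 even with quality=100
```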
It can make a difference: in the first case, the correct class is not in the top 3; in the second case, it is.
Is there any way to avoid this asymmetrical behavior?

Having a trained classifier like VGG16 how to automate image segmentation?

I Have a trained classifier: VGG16 on say Image Net (or my own images DB and classes). I want to segment my images automatically knowing there are classes on images my classifier knows. How to automate image segmentation?
For this you can extract Grad-CAM features. Keras has published official documentation for Grad-CAM extraction; you can find it here.
So for your task the steps to follow are:
Extract Grad-CAM from the images
Based on the Grad-CAM, create a segmentation mask using simple image processing techniques (see the sketch after this list)
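A minimal sketch of both steps, assuming a stock Keras VGG16 and a preprocessed batch of shape (1, 224, 224, 3); the plain threshold at the end is just one simple image-processing choice among many:

```python
import numpy as np
import tensorflow as tf

# Hypothetical setup: "block5_conv3" is VGG16's last convolutional layer.
model = tf.keras.applications.VGG16(weights="imagenet")
grad_model = tf.keras.Model(
    model.inputs,
    [model.get_layer("block5_conv3").output, model.output],
)

def gradcam_mask(img_batch, threshold=0.5):
    # Step 1: Grad-CAM heatmap for the top predicted class.
    with tf.GradientTape() as tape:
        conv_out, preds = grad_model(img_batch)
        top_class = preds[:, tf.argmax(preds[0])]
    grads = tape.gradient(top_class, conv_out)
    weights = tf.reduce_mean(grads, axis=(1, 2))   # pooled gradients per channel
    cam = tf.reduce_sum(conv_out * weights[:, None, None, :], axis=-1)
    cam = tf.nn.relu(cam)[0]
    cam = cam / (tf.reduce_max(cam) + 1e-8)        # normalize to [0, 1]
    cam = tf.image.resize(cam[..., None], (224, 224))[..., 0]
    # Step 2: crude segmentation mask by thresholding the heatmap.
    return (cam.numpy() > threshold).astype(np.uint8)
```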
With this method you can easily create segmentation masks for your images, but the masks may not be very accurate, as the example Grad-CAM picture (from an Xception model trained on ImageNet) shows.
Hope this helps.

Tensorflow RGB-D Training

I have RGB-D (color & depth) images of a given scene. I would like to use TensorFlow to train a classification model based on a pre-trained network such as Inception. As far as I understand, these pre-trained models were built using 3-channel RGB images and cannot handle the inclusion of a 4th channel.
How do I use RGB-D images directly? Do I need to pre-process the images and separate RGB and D? If so, how do I use the D (1-channel) alone?
Thank you!
If you want to use a pre-trained model you can only use RGB, as such models were only trained to understand RGB. In this case it is as you said: separate the channels and discard the depth.
To use a 4-channel image like this you would need to retrain the network from scratch rather than loading a pre-trained set of weights.
You will probably get good results using the same architecture as is used for 3-channel images (save for the minor change required to support the 4-channel input), so retraining shouldn't be terribly hard; a sketch follows.
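For instance, a minimal Keras sketch (the class count is hypothetical; this assumes the Keras applications API accepts an arbitrary channel count when weights=None, and the ImageNet weights cannot be loaded because the first convolution's shape changes):

```python
import tensorflow as tf

# Reuse the InceptionV3 architecture with a 4-channel (RGB-D) input.
# weights=None means training from scratch: the pre-trained ImageNet
# weights only fit a 3-channel first convolution.
model = tf.keras.applications.InceptionV3(
    weights=None,
    input_shape=(299, 299, 4),
    classes=10,  # hypothetical number of classes
)
model.compile(optimizer="adam", loss="categorical_crossentropy")
```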

Object detection API for grayscale images

I'm training a TF model to recognize various objects, and I have many color images I use to train it. However, in real life my app will encounter many grayscale images due to lighting conditions or camera capabilities. Is there an easy way to tell TF to train the model both on the color images and on their grayscale versions, without creating a doubled image set (color & grayscale)? I saw some options for grayscale in the data augmentation capabilities but am unclear on the exact usage.
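One way to do this, as a sketch (the function name and probability are my own, not from any TF augmentation API): randomly gray out images on the fly instead of duplicating the dataset:

```python
import tensorflow as tf

def random_grayscale(image, prob=0.5):
    # Randomly convert an RGB image to grayscale, keeping 3 channels
    # so the model's input shape is unchanged.
    def to_gray():
        gray = tf.image.rgb_to_grayscale(image)   # (H, W, 1)
        return tf.image.grayscale_to_rgb(gray)    # back to (H, W, 3)
    return tf.cond(tf.random.uniform([]) < prob, to_gray, lambda: image)

# Applied as a dataset augmentation step:
# dataset = dataset.map(lambda img, label: (random_grayscale(img), label))
```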

What to expect from deep learning object detection on black and white pictures?

With TensorFlow, I want to train an object detection model on my own images, based on the ssd_inception_v2_coco model. The problem I have is that all my pictures are black and white. What performance can I expect? Should I try to colorize my B&W pictures first? Or, at the opposite, should I retrain the base network on "uncolorized" images? Are there general guidelines for B&W processing of images for deep-learning object detection?
I wouldn't go through the trouble of colorizing if you are planning on using a pre-trained model. I would expect explicit colorization as a pre-processing step to help very little (if at all), since in theory the features that a colorizing network learns can also be learned by the detection network.
If you are planning on starting from a detection network that was pre-trained on an RGB dataset, make sure you either (i) replace the first convolution in the network with a convolutional layer that expects a single-channel input, or (ii) pad your image with two all-zero channels, as sketched below.
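A minimal sketch of option (ii), using a placeholder tensor in place of a real decoded B&W image:

```python
import tensorflow as tf

# A single-channel (B&W) image; (300, 300, 1) is a stand-in shape.
gray = tf.zeros([300, 300, 1], tf.float32)

# Pad with two all-zero channels so a 3-channel pre-trained network
# accepts the input, giving shape (300, 300, 3).
zeros = tf.zeros_like(gray)
padded = tf.concat([gray, zeros, zeros], axis=-1)
```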
You may get slightly worse detection performance simply because you lose two thirds of the image's pixel information when using BW instead of RGB.