TensorFlow object detection: ValueError: cannot reshape array

I have trained an "SSD with MobileNet" model with TensorFlow, and training went fine.
Now, when I test the performance of the inference graph by running object_detection_tutorial.ipynb on an image, I get the following error:
ValueError: cannot reshape array of size X into shape (a,b,c)
X, a, b and c take different values for different test images.
I don't think the image size is causing the issue, since the model should perform independently of the input image size. In fact, I get this error even with an image I used for training.
Please help here.

As suggested by @Mandroid, programmatically converting the input image to 3 channels might be the way to go, but this is how I ended up solving my issue.
Note: removing the alpha channel from the images may have consequences; it is some kind of information loss, after all.
Replacing image = Image.open(<image_path>) with image = Image.open(<image_path>).convert('RGB') did the job for me.
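For reference, here is a minimal sketch of that workaround as it would be used in the tutorial notebook (the image path is just a placeholder, and PIL/NumPy are assumed):

from PIL import Image
import numpy as np

# Open the image and drop any alpha channel by forcing 3-channel RGB.
image = Image.open('test_images/image1.png').convert('RGB')

# The tutorial notebook feeds a (1, height, width, 3) uint8 array to the graph.
image_np = np.expand_dims(np.array(image), axis=0)
print(image_np.shape)  # always 3 channels now, regardless of the source PNG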

Related

Set batch size of trained keras model to 1

I have a Keras model trained on my own dataset. However, after loading the weights the summary shows None as the first dimension (the batch size).
I want to know how to fix the shape to a batch size of 1, as this is required for me to convert the model to TFLite with GPU support.
What worked for me was to specify the batch size on the Input layer, like this:
input = layers.Input(shape=input_shape, batch_size=1, dtype='float32', name='images')
This then carried through to the rest of the layers.
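As a minimal sketch of what this looks like end to end (the layer stack here is just a stand-in for your own model, not the one from the question):

import tensorflow as tf
from tensorflow.keras import layers

input_shape = (224, 224, 3)  # hypothetical input shape; use your model's

# Fix the batch dimension to 1 on the Input layer.
inputs = layers.Input(shape=input_shape, batch_size=1, dtype='float32', name='images')
x = layers.Conv2D(16, 3, activation='relu')(inputs)
x = layers.GlobalAveragePooling2D()(x)
outputs = layers.Dense(10, activation='softmax')(x)
model = tf.keras.Model(inputs, outputs)

print(model.input_shape)  # (1, 224, 224, 3) -- the batch size is no longer None

# The fixed batch size is kept when converting to TFLite.
converter = tf.lite.TFLiteConverter.from_keras_model(model)
tflite_model = converter.convert()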
The bad news is that despite this "fix" the TFLite runtime still complains about dynamic tensors. I get these non-fatal errors in logcat when it runs:
E/tflite: third_party/tensorflow/lite/core/subgraph.cc:801 tensor.data.raw != nullptr was not true.
E/tflite: Attempting to use a delegate that only supports static-sized tensors with a graph that has dynamic-sized tensors (tensor#26 is a dynamic-sized tensor).
E/tflite: Ignoring failed application of the default TensorFlow Lite delegate indexed at 0.
The good news is that despite these errors it seems to be using the GPU anyway, based on performance testing.
I'm using:
org.tensorflow:tensorflow-lite-support:0.2.0
org.tensorflow:tensorflow-lite-metadata:0.2.1
org.tensorflow:tensorflow-lite:2.6.0
org.tensorflow:tensorflow-lite-gpu:2.3.0
Hopefully, they'll fix the runtime so it doesn't matter whether the batch size is 'None'. It shouldn't matter for doing inference.

How to predict Faster RCNN model in batches?

I have a trained RCNN (Keras-RetinaNet) model, and I can only predict one image at a time:
boxes, scores, labels = model.predict_on_batch(np.expand_dims(image, axis=0))
Full script is here.
Is there a way to predict multiple images at a time?
Thanks
The reason you expand the dims there is to artificially add a batch dimension. Simply stack a bunch of images together and pass it into this function to get a batch of results.
Said differently, right now you are passing in:
[image].
You could instead pass in:
[image_1, image_2, image_3, ...] (as a NumPy array, not a Python list)
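A minimal sketch of that idea (assuming image_1, image_2, image_3 are images already preprocessed to the same shape, and model is the loaded Keras-RetinaNet model):

import numpy as np

# Stack N same-shaped images into a single (N, H, W, 3) batch
# instead of expanding a single image to (1, H, W, 3).
batch = np.stack([image_1, image_2, image_3], axis=0)

# One call now returns boxes, scores and labels for every image in the batch.
boxes, scores, labels = model.predict_on_batch(batch)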

how to manage batches for model.provide_groundtruth

I'm trying to use the TensorFlow 2 Object Detection API with a custom multi-class dataset to train an SSD. I took as a base the example provided by the documentation: https://github.com/tensorflow/models/blob/master/research/object_detection/colab_tutorials/eager_few_shot_od_training_tf2_colab.ipynb
My current problem is when I start the fine-tuning:
InvalidArgumentError: The first dimension of paddings must be the rank
of inputs[2,2] [6] [Op:Pad]
That seems to be related to the model.provide_groundtruth call in train_step_fn. As I mentioned, I read my data from a TFRecord, mapped it to a dataset and divided it into batches using padded_batch on the tf.data.TFRecordDataset. That seems to be the correct way to feed the training with the images, but now my problem is the ground truth, because it is also converted to batches of shape [batch_size, num_detections, coordinate_bbox]. Is this the problem? Any idea on how to fix this issue?
Thanks
P.S. I tried the approach of modifying the pipeline.config file and running model_main_tf2.py, as was done in the past with TensorFlow 1, but this method is buggy.
Just to share with everyone: what resolved my issue was that, although I had split the images and ground truth into batches correctly, I had never converted my labels to one-hot encoding.
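For reference, a rough sketch of that conversion in the style of the few-shot tutorial (the class ids, boxes and num_classes are dummy values, and detection_model is assumed to be the SSD model built/restored as in the tutorial):

import tensorflow as tf

num_classes = 3  # hypothetical number of classes in the custom dataset

# Zero-indexed class ids and normalized boxes for one image (dummy values).
groundtruth_classes = tf.constant([0, 2, 1], dtype=tf.int32)
groundtruth_boxes = tf.constant([[0.1, 0.1, 0.5, 0.5],
                                 [0.2, 0.2, 0.9, 0.9],
                                 [0.0, 0.0, 0.3, 0.4]], dtype=tf.float32)

# provide_groundtruth expects one-hot class tensors of shape
# [num_detections, num_classes], not integer labels.
one_hot_classes = tf.one_hot(groundtruth_classes, num_classes, dtype=tf.float32)

detection_model.provide_groundtruth(
    groundtruth_boxes_list=[groundtruth_boxes],
    groundtruth_classes_list=[one_hot_classes])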

HOW do Keras and PyTorch handle the last batch when the training-data size is not a multiple of batch_size?

Before you suggest this or report it as a duplicate: I am asking HOW the two libraries handle the problem, NOT WHAT HAPPENS. I already know they can make a batch from the remaining data; all I am asking is how.
If I have 100 images as training data and batch_size=15, the last batch will have 10 images to train on. My question is this: the Input() layer already knows that the data is coming with shape (batch_size, channels, width, height) in PyTorch and (batch_size, width, height, channels) in Keras with the TensorFlow backend.
If the last batch has size 10, isn't the model supposed to throw an error, because it will get (10, 1, 28, 28) in place of (15, 1, 28, 28), given we have (28, 28)-pixel grayscale images?
What is happening behind the scenes?
If you look at the documentation, for example for the Input layer, you can see that the batch size is optional. That is because the batch dimension is left undefined when the model is built, so it is treated as a variable and can take a different value in any iteration.
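A minimal sketch illustrating this with a tiny throwaway Keras model (not the asker's):

import numpy as np
import tensorflow as tf
from tensorflow.keras import layers

# Build a model without fixing the batch size: the first dimension stays None.
inputs = layers.Input(shape=(28, 28, 1))
outputs = layers.Dense(10)(layers.Flatten()(inputs))
model = tf.keras.Model(inputs, outputs)
print(model.input_shape)  # (None, 28, 28, 1)

# Both a full batch of 15 and a final partial batch of 10 are accepted,
# because the batch dimension is only resolved at runtime.
print(model(np.zeros((15, 28, 28, 1), dtype='float32')).shape)  # (15, 10)
print(model(np.zeros((10, 28, 28, 1), dtype='float32')).shape)  # (10, 10)

PyTorch behaves the same way: the layers don't fix the batch dimension, and DataLoader simply yields a smaller final batch unless you set drop_last=True.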

Why do I get ValueError('\'image\' must be fully defined.') when transforming image in Tensorflow?

I want to do real-time data augmentation by chaining different image transformation ops in TensorFlow. My code begins with image decoding and then runs different transformations, but it throws a ValueError('\'image\' must be fully defined.'). Here is an example that reproduces this error:
def decode_and_augment(image_raw):
  decoded = tf.image.decode_jpeg(image_raw)
  flipped = tf.image.random_flip_left_right(decoded)
  return flipped
This error arises because the tf.image.random_flip_left_right() op checks the static shape of its input when you build the graph, and tf.image.decode_jpeg() produces tensors whose shape has a data dependency on the contents of image_raw, so the shape isn't statically known. Currently the only way to work around this is to set the static shape of the decoded tensor using Tensor.set_shape(), as follows:
decoded = tf.image.decode_jpeg(image_raw)
decoded.set_shape([IMAGE_HEIGHT, IMAGE_WIDTH, NUM_CHANNELS])
flipped = tf.image.random_flip_left_right(decoded)
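Put back into the original function, the workaround looks like this (IMAGE_HEIGHT, IMAGE_WIDTH and NUM_CHANNELS stand for whatever fixed dimensions your images have):

def decode_and_augment(image_raw):
  decoded = tf.image.decode_jpeg(image_raw)
  # Give the decoded tensor a static shape so ops that inspect shapes at
  # graph-construction time (like random_flip_left_right) can succeed.
  decoded.set_shape([IMAGE_HEIGHT, IMAGE_WIDTH, NUM_CHANNELS])
  flipped = tf.image.random_flip_left_right(decoded)
  return flipped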
The downside of this is that all images must now have the same size (and number of channels).
Many of the image ops don't follow the same gradual and dynamic shape inference as the rest of TensorFlow (which allows you to have unknown shapes or dimensions, assumes that the program is correct as you build the graph, and checks the real shapes at runtime). This is considered a bug at the present time, and we'll figure out a way to fix it.