Classify a batch of images using the TensorFlow MobileNet retrain example

I have trained a classification net using the TensorFlow MobileNet retrain.py example file (https://github.com/tensorflow/tensorflow/blob/master/tensorflow/examples/image_retraining/retrain.py).
When I use the trained net, I manage to run only a single image at a time (the input is a TensorFlow 4D array of shape [1, 128, 128, 3]).
I don't know if it means anything, but the training process used the flag --train_batch_size=100.
When I try to classify a batch of images (for example, a TensorFlow 4D array of shape [2, 128, 128, 3] fed to the 'input' layer), I get the following error:
ValueError: Cannot feed value of shape (2, 128, 128, 3) for Tensor 'input:0', which has shape '(1, 128, 128, 3)'
(Before running this trained net's session, I use a preprocessing TensorFlow session that prepares the images for this net: resize, normalise, etc.)
Does anyone know what I should do in order to run a batch of images through such a net, or how I can configure retrain.py so that it creates a net that allows batch runs?
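One workaround, until the graph is rebuilt with a variable batch dimension, is to keep the fixed [1, 128, 128, 3] input and loop over the batch, stacking the results. A minimal sketch, assuming the retrained graph is already loaded into sess and uses retrain.py's default tensor names 'input:0' and 'final_result:0' (adjust these to your graph):

import numpy as np

def classify_batch(sess, batch):
    # batch: [N, 128, 128, 3]; the graph's input is fixed to a single image
    results = []
    for i in range(batch.shape[0]):
        single = batch[i:i + 1]  # keeps the leading batch dimension: [1, 128, 128, 3]
        results.append(sess.run('final_result:0', feed_dict={'input:0': single}))
    return np.concatenate(results, axis=0)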

Related

Keras Conv2D layer has different output when using AWS Sagemaker

I've been trying to train a model on AWS SageMaker because my computer is no longer powerful enough to train it in a reasonable amount of time. However, when I tried to load the model (after copy-pasting the code from my computer), I got an unexpected error.
After tinkering around for a bit, I found that the very first Conv2D layer has a different output shape than it did on my computer.
Sagemaker output dimensions:
(None, 128, 498, 3)
Expected output dimensions:
(None, 498, 498, 3)
My code is below:
import tensorflow as tf
from tensorflow import keras

model = keras.models.Sequential()
model.add(keras.Input(shape=(500, 500, 3)))
model.add(keras.layers.Conv2D(filters=128, kernel_size=(3, 3), activation='relu'))
model.compile(optimizer=keras.optimizers.Adam(learning_rate=0.0001),
              loss=keras.losses.CategoricalCrossentropy(from_logits=True),
              metrics=['accuracy'])
model.summary()
How can I fix this?
I came here because I had the same problem. I found the solution, but I am still really confused about it. I just want to mention that I use the same TensorFlow version (2.10) locally and on SageMaker, and EXACTLY the same code on both.
If you go to https://keras.io/api/layers/convolution_layers/convolution2d/, it states:
"Output shape
4+D tensor with shape: batch_shape + (filters, new_rows, new_cols) if data_format='channels_first' or 4+D tensor with shape: batch_shape + (new_rows, new_cols, filters) if data_format='channels_last'. rows and cols values might have changed due to padding."
So I forced the SageMaker version to data_format='channels_last'.
Now both versions, the local one and the AWS one, are consistent.
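For reference, a minimal sketch of how you might pin the data format, either globally or per layer. This assumes the SageMaker image was defaulting to 'channels_first'; check keras.backend.image_data_format() to confirm:

import tensorflow as tf
from tensorflow import keras

print(keras.backend.image_data_format())  # shows the current default

# Pin it globally so every layer agrees with the local machine:
keras.backend.set_image_data_format('channels_last')

# Or pin it on the layer itself:
layer = keras.layers.Conv2D(filters=128, kernel_size=(3, 3),
                            activation='relu', data_format='channels_last')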

Multi-channel inputs with the TensorFlow Object Detection API V2

I'd like to construct a network in the TensorFlow V2 Object Detection API using 5-channel images. However, I am stuck on how to modify the weights of the first convolutional layer using the TensorFlow 2.2 framework.
I have downloaded the pre-trained RetinaNet from the V2 Model Zoo. I then tried the following to modify the weights in the first layer of the checkpoint and save them back:
import numpy as np
import tensorflow as tf

tf_path = tf.train.latest_checkpoint('./RetinaNet/checkpoint/')
init_vars = tf.train.list_variables(tf_path)
tf_vars = {}
for name, shape in init_vars:
    array = tf.train.load_variable(tf_path, name)
    try:
        if shape[2] == 3:  # look for a layer whose 3rd input dimension is 3, i.e. the 1st convolutional layer
            # replicate the first two input channels to widen 3 -> 5 channels
            array = np.concatenate((array, array[:, :, :2, :]), axis=2)
            array = array.astype('float32')
        tf_vars[name] = tf.Variable(array)
    except Exception:  # variables with fewer than 3 dimensions land here
        tf_vars[name] = tf.Variable(array)
saver = tf.compat.v1.train.Saver(var_list=tf_vars)
sess = tf.compat.v1.Session()
saver.save(sess, './RetinaNet/checkpoint/ckpt-0')
I loaded the model back in to make sure the 1st convolutional layer had been changed - all looks ok.
But when I go to train the model, I get the following error:
Model was constructed with shape (None, None, None, 3) for input Tensor("input_1:0", shape=(None, None, None, 3), dtype=float32), but it was called on an input with incompatible shape (64, 128, 128, 5)
This leads me to believe my method of modifying the weights is not so "ok" after all. Would anyone have some tips on how to modify these weights correctly?
Thanks
This now works, but the solution is very hacky... it also means not training from the pretrained Model Zoo weights, so you need to comment out everything to do with the fine_tune_checkpoint in the config file.
Then go to .\Lib\site-packages\official\vision\image_classification\efficientnet and change the number of input channels and the number of classes in efficientnet_model.py and efficientnet_config.py.
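For intuition on why patching the checkpoint alone fails: the channel count is baked into the graph definition itself, not just the weights. A minimal sketch with plain Keras (toy shapes, not the actual Object Detection API model):

import numpy as np
import tensorflow as tf

# A toy stand-in for a detector backbone built for 3-channel input.
inputs = tf.keras.Input(shape=(None, None, 3))
outputs = tf.keras.layers.Conv2D(64, 7, strides=2, padding='same')(inputs)
model = tf.keras.Model(inputs, outputs)

x = np.zeros((1, 128, 128, 5), dtype='float32')
# model(x)  # raises the same "incompatible shape ... 5" error regardless of
#           # which weights are loaded: the graph fixes the channel count.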

Convolutional Neural Network (CNN) input shape

I am new to CNNs and I am a bit confused about the input shape of a CNN (specifically with Keras).
My data is 2D (let's say 10x10) in different time slots, so altogether I have 3D data.
I am going to feed this data to my model to predict the coming time slot. So, I will have a certain number of time slots for prediction (let's say 10 slots; so far, I may have 10x10x10 data).
Now, my question is whether I should treat this data as a 2D image with 10 channels (like ordinary CNN data such as RGB images) or as 3D data (Conv2D or Conv3D in Keras).
Thank you in advance for your help.
In your case, Conv2D will be useful. Please refer to the description below for understanding the input shape of a convolutional neural network (CNN) using Conv2D.
Let's see what the input shape looks like. We are assuming that our data is a collection of images.
The input shape is (batch_size, height, width, channels). An RGB image would have 3 channels and a greyscale image would have 1 channel.
Let's look at the following code:
import tensorflow as tf
from tensorflow.keras.layers import Conv2D

model = tf.keras.models.Sequential()
model.add(Conv2D(filters=64, kernel_size=1, input_shape=(10, 10, 3)))
model.summary()
Output:
Model: "sequential"
_________________________________________________________________
Layer (type)                 Output Shape              Param #
=================================================================
conv2d (Conv2D)              (None, 10, 10, 64)        256
=================================================================
Though it looks like our input shape is 3D, you have to pass a 4D array when fitting the data, shaped like (batch_size, 10, 10, 3). Since there is no batch-size value in the input_shape argument, we can go with any batch size while fitting the data.
The output shape is (None, 10, 10, 64). The first dimension represents the batch size, which is None at the moment because the network does not know the batch size in advance.
Note: once you fit the data, None is replaced by the batch size you give while fitting the data.
Let's look at another code example, this time with a batch size:
import tensorflow as tf
from tensorflow.keras.layers import Conv2D

model = tf.keras.models.Sequential()
model.add(Conv2D(filters=64, kernel_size=1, batch_input_shape=(16, 10, 10, 3)))
model.summary()
Output:
Model: "sequential_1"
_________________________________________________________________
Layer (type)                 Output Shape              Param #
=================================================================
conv2d_1 (Conv2D)            (16, 10, 10, 64)          256
=================================================================
Here I have replaced the input_shape argument with batch_input_shape. As the name suggests, this argument fixes the batch size in advance, and you cannot provide any other batch size at the time of fitting the data.
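Applied to the question's 10x10x10 data, treating the 10 time slots as channels, a minimal end-to-end sketch (synthetic data and a hypothetical dense head, just to show the shapes):

import numpy as np
import tensorflow as tf
from tensorflow.keras.layers import Conv2D, Flatten, Dense

# 32 synthetic samples: 10x10 grids with 10 time slots as channels
x = np.random.rand(32, 10, 10, 10).astype('float32')
y = np.random.rand(32, 100).astype('float32')  # next 10x10 slot, flattened

model = tf.keras.models.Sequential([
    Conv2D(filters=64, kernel_size=3, padding='same', activation='relu',
           input_shape=(10, 10, 10)),
    Flatten(),
    Dense(100),
])
model.compile(optimizer='adam', loss='mse')
model.fit(x, y, batch_size=8, epochs=1)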

I use batch size 5 in TensorFlow during the training phase; why can't I use batch size 1 for testing in TensorFlow.js?

I trained a GAN model in TensorFlow with batch size 5, so the generator input size is [5, imagesize, imagesize, 3]. After training, I converted the TensorFlow model into a TensorFlow.js model.
I load the model with loadFrozenModel and then use model.predict to predict an image. However, I get: the shape of dict['concat'] provided in model.execute(dict) must be [5,512,512,12], but was [1,512,512,12].
How can I solve this problem? I used mini-batches in the training phase in TensorFlow, and I only want to predict one image (a single input, not 5) in TensorFlow.js.
It sounds like you set the batch size explicitly as part of the input shape in your training job, e.g.
x = tf.placeholder("float", shape=[5, 512, 512, 12])
Instead you should leave the batch size unspecified, like this:
x = tf.placeholder("float", shape=[None, 512, 512, 12])
That way the graph will work with whatever batch size you give it, both at training and at inference time.
If you have code that needs to know the batch size explicitly, see here for some tips.
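For instance, instead of hard-coding the batch size, you can read it dynamically inside the graph (a sketch in the same TF1 style as the snippet above):

x = tf.placeholder("float", shape=[None, 512, 512, 12])
batch_size = tf.shape(x)[0]  # a tensor, resolved to the actual batch size at run time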

InceptionV3 and transfer learning with tensorflow

I would like to do transfer learning from the InceptionV3 model given in the TensorFlow examples. Following the classify image example and the operator and tensor names given in https://github.com/AKSHAYUBHAT/VisualSearchServer/blob/master/notebooks/notebook_network.ipynb, I can create my graph. But when I put a batch of images of size (100, 299, 299, 3) into the pre-computed Inception graph, I get the following shape error at the pool_3 layer:
ValueError: Cannot reshape a tensor with 204800 elements to shape [1, 2048] (2048 elements)
It seems that this InceptionV3 graph doesn't accept image batches as input. Am I wrong?
Actually it works for transfer learning if you extract the right thing. There is no problem feeding a batch of images in the shape of [N, 299, 299, 3] as ResizeBilinear:0 and then using the pool_3:0 tensor. It's the reshaping afterwards that breaks, but you can reshape yourself (you'll have your own layers afterwards anyway). If you wanted to use the original classifier with a batch, you could add your own reshaping on top of pool_3:0 and then add the softmax layer, reusing the weights/biases tensors of the original softmax.
TLDR: With double_img being a stack of two images with shape (2, 299, 299, 3) this works:
pooled_2 = sess.graph.get_tensor_by_name("pool_3:0").eval(
    session=sess, feed_dict={'ResizeBilinear:0': double_img})
pooled_2.shape
# => (2, 1, 1, 2048)
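A short sketch of the "reshape yourself" step mentioned above, flattening the pooled features so your own layers can consume them (names from the snippet above):

pooled_flat = pooled_2.reshape(pooled_2.shape[0], -1)
pooled_flat.shape
# => (2, 2048)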
You're not wrong. This seems like a very reasonable feature request, so I've opened a ticket for it on github. Follow that for updates.
Something like this should do it:
# TF1-era slim imports
import tensorflow as tf
import tensorflow.contrib.slim as slim
from tensorflow.contrib.slim.nets import inception

with g.as_default():
    inputs = tf.placeholder(tf.float32, shape=[batch_size, 299, 299, 3],
                            name='input')
    with slim.arg_scope(inception.inception_v3_arg_scope()):
        logits, end_points = inception.inception_v3(inputs,
                                                    num_classes=FLAGS.num_classes,
                                                    is_training=False)
    variables_to_restore = slim.get_variables_to_restore(exclude=exclude)

sess = tf.Session()
saver = tf.train.Saver(variables_to_restore)
Then you should be able to call the operation:
sess.run("pool_3:0",feed_dict={'ResizeBilinear:0':images})
etarion made a very good point. However, we don't have to reshape it ourselves; instead, we could change the value of shape that reshape takes as input. I.e.,
input_tensor_name = 'import/input:0'
shape_tensor_name = 'import/InceptionV3/Predictions/Shape:0'
output_tensor_name = 'import/InceptionV3/Predictions/Reshape_1:0'

output_tensor = tf.import_graph_def(
    graph.as_graph_def(),
    input_map={input_tensor_name: image_batch,
               shape_tensor_name: [batch_size, num_class]},
    return_elements=[output_tensor_name])
These tensor names are based on inception_v3_2016_08_28_frozen.pb.
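For completeness, a sketch of how such a frozen graph is typically loaded before the remapping above (the .pb file name comes from the answer; the TF1-style loading code is an assumption):

import tensorflow as tf

graph = tf.Graph()
with graph.as_default():
    graph_def = tf.GraphDef()
    with tf.gfile.GFile('inception_v3_2016_08_28_frozen.pb', 'rb') as f:
        graph_def.ParseFromString(f.read())
    tf.import_graph_def(graph_def)  # default name scope 'import' yields the 'import/...' tensor names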