Input 0 is incompatible with layer model_1: expected shape=(None, 244, 720, 3), found shape=(None, 720, 3) - tensorflow

I wanted to test my model by uploading an image, but I got this error. I think the problem is somewhere in these lines; I'm just not sure how to fix it.
IMAGE_SIZE = [244,720]
inception = InceptionV3(input_shape=IMAGE_SIZE + [3], weights='imagenet',include_top=False)
Also, here's the code that loads my test image:
picture = image.load_img('/content/DSC_0365.JPG', target_size=(244,720))
img = img_to_array(picture)
prediction = model.predict(img)
print (prediction)
I'm still a newbie in Machine learning so my knowledge right now is not yet that deep.

This happens because the input is missing the batch dimension: the model expects a 4-D input of shape (None, 244, 720, 3), but img_to_array returns a single image of shape (244, 720, 3). Here is one possible solution.
Model
from tensorflow.keras.applications import InceptionV3
IMAGE_SIZE = [244,720]
inception = InceptionV3(input_shape=IMAGE_SIZE + [3],
                        weights='imagenet', include_top=False)
# check its input shape
inception.input_shape
(None, 244, 720, 3)
Inference
Let's test a sample by passing it to the model.
from PIL import Image
a = Image.open('/content/1.png').convert('RGB')
display(a)
Check its basic properties.
a.mode, a.size, a.format
('RGB', (297, 308), None)
PIL reports the size as (width, height), so as an array the image has shape (308, 297, 3). To be able to pass it to the model, we need an extra axis, the batch axis. To add it, we can do:
import tensorflow as tf
import numpy as np
a = tf.expand_dims(np.array(a), axis=0)
a.shape
TensorShape([1, 308, 297, 3])
Much better. Now, we may want to normalize our data and resize it according to the model input shape. To do that, we can do:
a = tf.divide(a, 255)
a = tf.image.resize(a, [244,720])
a.shape
TensorShape([1, 244, 720, 3])
And lastly, pass it to the model.
inception(a).shape
TensorShape([1, 6, 21, 2048])
# or keep the prediction for later analysis
y_pred = inception(a)
Updated
If you're using the tf.keras image preprocessing functions, which load the image in PIL format, then you can simply do:
image = tf.keras.preprocessing.image.load_img('/content/1.png',
target_size=(244,720))
input_arr = tf.keras.preprocessing.image.img_to_array(image)
input_arr = np.array([input_arr]) # Convert single image to a batch.
inception(input_arr).shape
TensorShape([1, 6, 21, 2048])
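As a framework-free illustration of the fix above, here is a minimal NumPy sketch. The helper name add_batch_axis is hypothetical (it is not part of Keras), and the zero array only stands in for what img_to_array would return for a 244x720 RGB image.

```python
import numpy as np

def add_batch_axis(img, expected_hwc):
    """Check the (height, width, channels) shape, then prepend a batch axis."""
    img = np.asarray(img)
    if img.shape != tuple(expected_hwc):
        raise ValueError(f"expected {tuple(expected_hwc)}, got {img.shape}")
    return img[np.newaxis, ...]

# Stand-in for a single image array of shape (244, 720, 3)
single = np.zeros((244, 720, 3), dtype=np.float32)
batch = add_batch_axis(single, (244, 720, 3))
print(single.shape)  # (244, 720, 3)   -> what the model rejected
print(batch.shape)   # (1, 244, 720, 3) -> what the model expects
```

The resulting array has the 4-D layout that the error message asks for, so it can be passed to model.predict.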

Related

Correct way to iterate over Keras ragged tensor

I have an input Tensorflow ragged tensor structured like this [batch num_images width height channels] and I need to iterate over the dimension num_images to extract some features relevant for downstream applications.
Example code is the following:
from tensorflow.keras.applications.efficientnet import EfficientNetB7
from tensorflow.keras.layers import Input
import tensorflow as tf
eff_net = EfficientNetB7(weights='imagenet', include_top=False)
input_claim = Input(shape=(None, 600, 600, 3), name='input_1', ragged=True)
eff_out = tf.map_fn(fn=eff_net,
elems=input_claim, fn_output_signature=tf.float32)
The first Input dimension is set to None as it can differ across data points, and for this reason the input receives instances of tf.RaggedTensor.
This code fails with the following error: TypeError: Could not build a TypeSpec for KerasTensor(type_spec=RaggedTensorSpec(TensorShape([None, None, 600, 600, 3]), tf.float32, 1, tf.int64), name='input_1', description="created by layer 'input_1'") of unsupported type <class 'keras.engine.keras_tensor.RaggedKerasTensor'>.
I suspect there is a better way to perform this type of preprocessing though
Update: num_images is needed because (although not described here) I am doing some following reduce operation on this dimension
You can use tf.ragged.map_flat_values to achieve the same result.
Create a model like:
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers
def eff_net(x):  # dummy eff_net for testing that returns [batch, dim]
    return tf.random.normal(shape=tf.shape(x)[:2])
input_claim = keras.Input(shape=(None, 600, 600, 3), name='input_1', ragged=True)
class RaggedMapLayer(layers.Layer):
    def call(self, x):
        return tf.ragged.map_flat_values(eff_net, x)
outputs = RaggedMapLayer()(input_claim)
model = keras.Model(inputs=input_claim, outputs=outputs)
Testing:
inputs = tf.RaggedTensor.from_row_splits( tf.random.normal(shape=(10, 600, 600, 3)), row_splits=[0, 2, 5,10])
#shape [3, None, 600, 600, 3]
model(inputs).shape
#[3, None, 600]
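To make the idea behind the answer above more concrete, here is a rough NumPy sketch of the "map over flat values, then regroup" pattern. It is not the TensorFlow implementation; map_flat_values and dummy_features below are hypothetical helpers written only for illustration, and the 8x8 images are stand-ins for the real inputs.

```python
import numpy as np

def map_flat_values(fn, flat_values, row_splits):
    """Apply fn to all inner items at once, then regroup the results by row."""
    out = fn(flat_values)
    # Items row_splits[i]:row_splits[i + 1] belong to row i
    return [out[start:end] for start, end in zip(row_splits[:-1], row_splits[1:])]

def dummy_features(x):
    # Hypothetical feature extractor: one mean value per image
    return x.reshape(len(x), -1).mean(axis=1, keepdims=True)

flat = np.random.rand(10, 8, 8, 3)  # 10 small "images" in total
row_splits = [0, 2, 5, 10]          # 3 rows holding 2, 3 and 5 images
rows = map_flat_values(dummy_features, flat, row_splits)
print([r.shape for r in rows])      # [(2, 1), (3, 1), (5, 1)]
```

The feature extractor runs once over all images, and the ragged num_images dimension is restored afterwards, so a later reduce over that dimension remains possible.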

How do I use a pretrained network as a layer in Tensorflow?

I want to use a feature extractor (such as ResNet101) and add layers after that which use the output of the feature extractor layer. However, I can't seem to figure out how. I have only found solutions online where an entire network is used without adding additional layers.
I am inexperienced with Tensorflow.
In the code below you can see what I have tried. I can run the code properly without the additional convolutional layer, however my goal is to add more layers after the ResNet.
With this attempt at adding the extra conv layer, this type error is returned:
TypeError: Expected float32, got OrderedDict([('resnet_v1_101/conv1', ...
Once I have added more layers, I would like to start training on a very small test set to see if my model can overfit.
import tensorflow as tf
import tensorflow.contrib.slim as slim
from tensorflow.contrib.slim.python.slim.nets import resnet_v1
import matplotlib.pyplot as plt
numclasses = 17
from google.colab import drive
drive.mount('/content/gdrive')
def decode_text(filename):
    img = tf.io.decode_jpeg(tf.io.read_file(filename))
    img = tf.image.resize_bilinear(tf.expand_dims(img, 0), [224, 224])
    img = tf.squeeze(img, 0)
    img.set_shape((None, None, 3))
    return img
dataset = tf.data.TextLineDataset(tf.cast('gdrive/My Drive/5LSM0collab/filenames.txt', tf.string))
dataset = dataset.map(decode_text)
dataset = dataset.batch(2, drop_remainder=True)
img_1 = dataset.make_one_shot_iterator().get_next()
net = resnet_v1.resnet_v1_101(img_1, 2048, is_training=False, global_pool=False, output_stride=8)
net = slim.conv2d(net, numclasses, 1)
sess = tf.Session()
global_init = tf.global_variables_initializer()
local_init = tf.local_variables_initializer()
sess.run(global_init)
sess.run(local_init)
img_out, conv_out = sess.run((img_1, net))
resnet_v1.resnet_v1_101 does not return just net, but instead returns a tuple net, end_points. The second element is a dictionary, which is presumably why you are getting this particular error message.
From the documentation of this function:
Returns:
net: A rank-4 tensor of size [batch, height_out, width_out,
channels_out]. If global_pool is False,
then height_out and width_out are reduced by a
factor of output_stride compared to the respective height_in and width_in,
else both height_out and width_out equal one. If num_classes is 0 or None,
then net is the output of the last ResNet block, potentially after global
average pooling. If num_classes a non-zero integer, net contains the
pre-softmax activations.
end_points: A dictionary from components of the network to the corresponding
activation.
So you can write for example:
net, _ = resnet_v1.resnet_v1_101(img_1, 2048, is_training=False, global_pool=False, output_stride=8)
net = slim.conv2d(net, numclasses, 1)
You can also choose an intermediate layer, e.g.:
_, end_points = resnet_v1.resnet_v1_101(img_1, 2048, is_training=False, global_pool=False, output_stride=8)
net = slim.conv2d(end_points["main_Scope/resnet_v1_101/block3"], numclasses, 1)
(you can look into end_points to find the names of the endpoints. Your scope name will be different than main_Scope.)
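The tuple return value can be illustrated without TensorFlow. In the sketch below, fake_backbone is a hypothetical stand-in that only mimics the (net, end_points) return shape described in the docstring above; the endpoint name and its channel count are made up for the example.

```python
import numpy as np

def fake_backbone(images, output_stride=8):
    """Hypothetical stand-in returning (net, end_points) like the function above."""
    n, h, w, _ = images.shape
    out_h, out_w = h // output_stride, w // output_stride
    net = np.zeros((n, out_h, out_w, 2048), dtype=np.float32)
    # Made-up endpoint name and channel count, for illustration only
    end_points = {"fake_scope/block3": np.zeros((n, out_h, out_w, 1024), dtype=np.float32)}
    return net, end_points

images = np.zeros((2, 224, 224, 3), dtype=np.float32)

result = fake_backbone(images)
print(type(result).__name__)  # tuple -> this is what the next layer received

net, end_points = result      # unpack before adding more layers
print(net.shape)              # (2, 28, 28, 2048)
print(sorted(end_points))     # ['fake_scope/block3']
```

Printing the keys of end_points in the same way on the real network is a quick way to find the endpoint names available in your scope.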

TensorFlow network is receiving wrong tensor shape after using `dataset.map()`

Following the example at https://www.tensorflow.org/guide/datasets#preprocessing_data_with_datasetmap, I want to create a tf.Dataset which takes in paths to images, and maps these to image tensors.
My first attempt was the following, which is very similar to the example in the above link:
def input_parser(image_path):
    image_data_string = tf.read_file(image_path)
    image_decoded = tf.image.decode_png(image_data_string, channels=3)
    image_float = tf.image.convert_image_dtype(image_decoded, dtype=tf.float32)
    return image_float
def train_model():
    image_paths = ['test_image1.png', 'test_image2.png', 'test_image3.png']
    dataset = tf.data.Dataset.from_tensor_slices(image_paths)
    dataset = dataset.map(map_func=input_parser)
    iterator = dataset.make_initializable_iterator()
    input_images = iterator.get_next()
    with tf.Session() as sess:
        sess.run(tf.global_variables_initializer())
        sess.run(iterator.initializer)
        for i in range(3):
            x = sess.run(input_images)
            print(x.shape)
This seemed to work ok, and printed out:
(64, 64, 3)
(64, 64, 3)
(64, 64, 3)
Which are indeed the dimensions of my images.
So then I tried to actually feed this data into a network to train, and modified the code accordingly:
def input_parser(image_path):
    image_data_string = tf.read_file(image_path)
    image_decoded = tf.image.decode_png(image_data_string, channels=3)
    image_float = tf.image.convert_image_dtype(image_decoded, dtype=tf.float32)
    return image_float
def train_model():
    image_paths = ['test_image1.png', 'test_image2.png', 'test_image3.png']
    dataset = tf.data.Dataset.from_tensor_slices(image_paths)
    dataset = dataset.map(map_func=input_parser)
    iterator = dataset.make_initializable_iterator()
    input_images = iterator.get_next()
    x = tf.layers.conv2d(inputs=input_images, filters=50, kernel_size=[5, 5], name='layer1')
    x = tf.layers.flatten(x, name='layer2')
    prediction = tf.layers.dense(inputs=x, units=4, name='layer3')
    with tf.Session() as sess:
        sess.run(tf.global_variables_initializer())
        sess.run(iterator.initializer)
        for i in range(3):
            p = sess.run(prediction)
            print(p)
This then gave me the following error message:
ValueError: Input 0 of layer layer1 is incompatible with the layer: expected ndim=4, found ndim=3. Full shape received: [None, None, 3]
I have two questions about this:
1) Why is my network receiving an input of shape [None, None, 3], when as we have seen, the data read by the iterator is of shape [64, 64, 3].
2) Why isn't the shape of the input actually [1, 64, 64, 3], i.e. with 4 dimensions? I thought that the first dimension would be 1 because this is the batch size (I am not batching the data, so effectively this is a batch size of 1).
Thanks!
The spatial dimensions are None because, in principle, you could be loading images of any size. There is no guarantee that they will be 64x64, so TensorFlow uses None to allow for inputs of any size. Since you know that the images will always be the same size, you can use the tensor's set_shape method to provide this information. Just add the line image_float.set_shape((64, 64, 3)) to your parse function. Note that set_shape updates the tensor's static shape in place rather than returning a new tensor. There is even an example using images here.
You are not batching the data, so no batch axis is added at all. The elements of the dataset are simply images of shape (64, 64, 3) and these elements are returned one by one by the iterator. If you want batches of size 1 you should use dataset = dataset.batch(1). Now the elements of the dataset are image "batches" of shape (1, 64, 64, 3). Of course you could also use any other method to add an axis in front, such as tf.expand_dims.
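The difference between unbatched elements and batches of size 1 can be sketched in plain NumPy. The batch generator below is a hypothetical helper that only imitates the grouping behaviour described above; it is not the tf.data implementation.

```python
import numpy as np

def batch(elements, batch_size):
    """Group consecutive elements into arrays with a new leading batch axis."""
    buffer = []
    for element in elements:
        buffer.append(element)
        if len(buffer) == batch_size:
            yield np.stack(buffer)
            buffer = []
    if buffer:  # final, possibly smaller, batch
        yield np.stack(buffer)

# Stand-ins for three decoded 64x64 RGB images
images = [np.zeros((64, 64, 3), dtype=np.float32) for _ in range(3)]

print(images[0].shape)                      # (64, 64, 3): no batch axis
print([b.shape for b in batch(images, 1)])  # [(1, 64, 64, 3), (1, 64, 64, 3), (1, 64, 64, 3)]
print([b.shape for b in batch(images, 2)])  # [(2, 64, 64, 3), (1, 64, 64, 3)]
```

A convolution layer that expects 4-D input accepts the batched arrays but not the individual images, which matches the ndim=4 versus ndim=3 error in the question.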

Tensorflow avoid shape information with crop

I have another issue with TensorFlow. I am using an FCN model and need to apply a random crop because of memory usage.
tf.random_crop(combined, size=[512, 512, 4])
Unfortunately, the new size now "sticks" to the tensor and I cannot get rid of it.
As a result, the model only accepts input of size 512x512, and as far as I know there is no clean way to work around that.
Is there a way either to remove the shape information introduced by random_crop, or to easily adapt the size after obtaining a trained model?
Thank you in advance.
I don't know if it will completely suit your use-case, but the size parameter of tf.random_crop() can be a tensor, so you can for instance use a placeholder as shown in the example below.
import tensorflow as tf
import numpy as np
image = tf.placeholder(tf.float64, [None, None, 4])
cropped_size = tf.placeholder(tf.int32, [2])
cropped_image = tf.random_crop(image, size=[cropped_size[0], cropped_size[1], 4])
print(cropped_image.get_shape().as_list())
# [None, None, 4]
with tf.Session() as sess:
    res = sess.run(cropped_image,
                   feed_dict={image: np.random.rand(900, 600, 4), cropped_size: [512, 512]})
    print(res.shape)
    # (512, 512, 4)
EDIT:
There may be other ways to assign the value of cropped_size without a feed_dict, depending on how the crop dimensions are stored, e.g. using TF file readers (the values would stay unknown until read).
Another simple hack: take advantage of tf.placeholder_with_default(default_val, shape), passing the crop dimensions, however they were acquired, as default_val. Because the value of tf.placeholder_with_default() isn't actually assigned until runtime (in case you want to feed the placeholder a different value), your dimensions stay None in the graph:
import tensorflow as tf
image = tf.random_uniform((900, 600, 4)) # image tensor, acquired anyhow e.g. from tf.data
cropped_size_for_this_run = [512, 512] # crop dimensions, acquired anyhow
cropped_size = tf.placeholder_with_default(cropped_size_for_this_run, shape=[2])
cropped_image = tf.random_crop(image, size=[cropped_size[0], cropped_size[1], 4])
print(cropped_image.get_shape().as_list())
# [None, None, 4]
with tf.Session() as sess:
    # You can leave cropped_size with its default value assigned at runtime:
    res = sess.run(cropped_image)
    print(res.shape)
    # (512, 512, 4)
    # ... or you can specify a new one if you wish so:
    res = sess.run(cropped_image, feed_dict={cropped_size: [256, 256]})
    print(res.shape)
    # (256, 256, 4)
    # ... It would switch back to the default value if you don't feed one:
    res = sess.run(cropped_image)
    print(res.shape)
    # (512, 512, 4)
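For reference, the same "crop size decided at call time" idea looks like this in plain NumPy. This is only a sketch of the cropping logic, not of how tf.random_crop is implemented; the random_crop helper and its argument names are hypothetical.

```python
import numpy as np

def random_crop(image, crop_h, crop_w, rng):
    """Cut a random crop_h x crop_w window out of an (H, W, C) array."""
    h, w = image.shape[:2]
    if crop_h > h or crop_w > w:
        raise ValueError("crop size exceeds image size")
    top = rng.integers(0, h - crop_h + 1)   # upper bound is exclusive
    left = rng.integers(0, w - crop_w + 1)
    return image[top:top + crop_h, left:left + crop_w]

rng = np.random.default_rng(0)
image = rng.random((900, 600, 4))

print(random_crop(image, 512, 512, rng).shape)  # (512, 512, 4)
print(random_crop(image, 256, 256, rng).shape)  # (256, 256, 4)
```

Because the crop size is an ordinary argument, nothing about 512x512 is baked into the function, which is the property the placeholder approach above preserves in the graph.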

Resize MNIST in Tensorflow

I have been working on MNIST dataset to learn how to use Tensorflow and Python for my deep learning course.
I want to resize the MNIST images to 22x22 using TensorFlow and then train on them, but I do not know how to do it.
Could you help me?
TheRevanchist's answer is correct. However, for the mnist dataset, you first need to reshape the mnist array before you send it to tf.image.resize_images():
import tensorflow as tf
import numpy as np
import cv2
mnist = tf.contrib.learn.datasets.load_dataset("mnist")
batch = mnist.train.next_batch(10)
X_batch = batch[0]
batch_tensor = tf.reshape(X_batch, [10, 28, 28, 1])
resized_images = tf.image.resize_images(batch_tensor, [22,22])
The code above takes a batch of 10 MNIST images, reshapes them into 28x28 single-channel images, and resizes them to 22x22 TensorFlow images.
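The reshape step can be checked on its own with NumPy. The sketch below assumes that each image in the batch arrives as a flat vector of 784 values, which is why a reshape is needed at all; the random array is only a stand-in for batch[0].

```python
import numpy as np

# Stand-in for batch[0]: 10 images, each flattened to 28 * 28 = 784 values (assumption)
flat_batch = np.random.rand(10, 784).astype(np.float32)

# [batch, height, width, channels], the layout image ops expect
images = flat_batch.reshape(-1, 28, 28, 1)
print(images.shape)  # (10, 28, 28, 1)

# The reshape only reinterprets the layout; pixel values are unchanged
assert np.array_equal(images[0, :, :, 0].ravel(), flat_batch[0])
```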
If you want to display the images, you can use OpenCV and the code below. Calling resized_images.eval() converts the TensorFlow images to a NumPy array:
with tf.Session() as sess:
    numpy_imgs = resized_images.eval(session=sess) # mnist images converted to numpy array
    for i in range(10):
        cv2.namedWindow('Resized image #%d' % i, cv2.WINDOW_NORMAL)
        cv2.imshow('Resized image #%d' % i, numpy_imgs[i])
        cv2.waitKey(0)
Did you try tf.image.resize_images?
The method:
resize_images(images, size, method=ResizeMethod.BILINEAR,
align_corners=False)
where images is a batch of images, and size is a vector tensor which determines the new height and width. You can look at the full documentation here: https://www.tensorflow.org/api_docs/python/tf/image/resize_images
Updated: TensorFlow 2.4.1
Short Answer
Use tf.image.resize (instead of resize_images). The link provided in the other answer no longer exists. Updated link.
Long Answer
MNIST in tf.keras.datasets.mnist has the following shape:
(batch_size, 28 , 28)
Here is the full implementation. Please read the comments in the code.
import numpy as np
import tensorflow as tf
(x_train, y_train), (_, _) = tf.keras.datasets.mnist.load_data()
# expand new axis, channel axis
x_train = np.expand_dims(x_train, axis=-1)
# [optional]: we may need 3 channel (instead of 1)
x_train = np.repeat(x_train, 3, axis=-1)
# it's always better to normalize
x_train = x_train.astype('float32') / 255
# resize the images, e.g. from 28x28 to 32x32
x_train = tf.image.resize(x_train, [32,32]) # if we want to resize
print(x_train.shape)
# (60000, 32, 32, 3)
You can also use OpenCV's cv2.resize() function.
Use a for loop to iterate over every image.
Inside the loop, call cv2.resize(source_image, (22, 22)) for each image:
def resize(mnist):
    train_data = []
    for img in mnist.train._images:
        resized_img = cv2.resize(img, (22, 22))
        train_data.append(resized_img)
    return train_data
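If neither TensorFlow nor OpenCV is at hand, a whole batch can also be resized with plain NumPy indexing. Note that this sketch uses nearest-neighbour sampling, which is a simpler technique than the bilinear interpolation used by default in the answers above; resize_nearest is a hypothetical helper and the random array stands in for MNIST images.

```python
import numpy as np

def resize_nearest(images, new_h, new_w):
    """Resize a batch of shape [N, H, W] by nearest-neighbour sampling."""
    _, h, w = images.shape
    rows = np.arange(new_h) * h // new_h  # source row for each output row
    cols = np.arange(new_w) * w // new_w  # source column for each output column
    return images[:, rows[:, None], cols[None, :]]

batch = np.random.rand(5, 28, 28)  # stand-in for five 28x28 images
small = resize_nearest(batch, 22, 22)
print(small.shape)  # (5, 22, 22)
```

This avoids the Python-level loop over images, at the cost of blockier results than an interpolating resize.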