Keras input shape ordering is (width, height, ch)? - tensorflow

I use Keras 2 with TensorFlow as the back-end and tried to feed a horizontal rectangular image (width:150 x height:100 x ch:3) into a network.
I use cv2 for pre-processing the images, and both cv2 and TensorFlow treat an image's shape as [height, width, ch] (in my case [100, 150, 3]; this is the reverse of (width:150 x height:100 x ch:3), but that is not a mistake).
So I defined the Keras functional API input as in the following code, but it raised an error.
img = cv2.imread('input/train/{}.jpg'.format(id))
img = cv2.resize(img, (100, 150))
inputs = Input(shape=(100, 150, 3))
x = Conv2D(8, (3, 3), padding='same', kernel_initializer='he_normal')(inputs)
The error message is below:
ValueError: Error when checking input: expected input_4 to have shape
(None, 100, 150, 3) but got array with shape (4, 150, 100, 3)
By the way, input = Input((150, 100, 3)) runs without error.
This apparent discrepancy between Keras and TensorFlow feels strange to me, so I suspect it merely avoids the error rather than actually working properly.
Can anybody explain this? I couldn't locate the input shape ordering in the Keras documentation.

You can change the dimension ordering as you prefer.
You can print and change the dimension ordering like this:
from keras import backend as K
print(K.image_data_format()) # print current format
K.set_image_data_format('channels_last') # set format
If you want to permanently change the dimension ordering, you should edit it in the keras.json file, usually located at ~/.keras/keras.json:
"image_data_format": "channels_last"

My problem came from the order of width and height in the argument to cv2.resize().
cv2.resize() takes its size argument as cv2.resize(img, (width, height)), whereas numpy stores the image array in (height, width) order.
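In other words, the fix is just to swap the resize arguments. Here is a minimal sketch of the corrected preprocessing (the file path is a hypothetical placeholder):
import cv2

img = cv2.imread('input/train/sample.jpg')  # hypothetical path for illustration

# cv2.resize takes (width, height), so pass (150, 100) ...
img = cv2.resize(img, (150, 100))

# ... which yields a numpy array in (height, width, ch) order,
# matching Input(shape=(100, 150, 3)) in Keras.
print(img.shape)  # (100, 150, 3)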

Taken from https://keras.io/api/layers/convolution_layers/convolution2d/
Arguments
...
data_format: A string, one of channels_last (default) or
channels_first. The ordering of the dimensions in the inputs.
channels_last corresponds to inputs with shape (batch_size, height,
width, channels) while channels_first corresponds to inputs with shape
(batch_size, channels, height, width). It defaults to the
image_data_format value found in your Keras config file at
~/.keras/keras.json. If you never set it, then it will be
channels_last.
...
So it is: (batch_size, height, width, channels) (by default)
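As a quick sanity check (a minimal sketch, not taken from the original post), a model built under the default channels_last ordering accepts batches shaped (batch_size, height, width, channels):
import numpy as np
from keras.layers import Input, Conv2D
from keras.models import Model

inputs = Input(shape=(100, 150, 3))                   # (height, width, channels)
x = Conv2D(8, (3, 3), padding='same')(inputs)
model = Model(inputs, x)

batch = np.zeros((4, 100, 150, 3), dtype='float32')  # (batch, height, width, channels)
print(model.predict(batch).shape)                    # (4, 100, 150, 8)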

Related

Tensorflow image shape ValueError

I have an image and I'd like to make some predictions with a pretrained model. I've prepared my model and everything looks fine in theory, but when I use model.predict(image) I get an error. First, here is my code:
import cv2
import tensorflow as tf
img_array = cv2.imread("hamburger.jpg")  # load the image
img_array = tf.cast(img_array, tf.int32) / 255
image = tf.image.resize(img_array, (513, 513))  # resize the image so its shape becomes (513, 513, 3)
Everything looks fine up to here, but when I try to make a prediction, I get this error:
ValueError: The argument 'images' (value Tensor("IteratorGetNext:0", shape=(None, 513, 3), dtype=float32)) is not compatible with the shape this function was traced with. Expected shape (None, 513, 513, 3), but got shape (None, 513, 3).
This is the first time I've seen a shape like (None, 513, 513, 3). How can I reshape my image to match it?
To predict a single image with model.predict, you need to add a batch dimension. To do that, use:
image=np.expand_dims(image, axis=0)
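Putting it together with the code from the question, a minimal sketch could look like this (the model-loading line is a hypothetical placeholder; the preprocessing mirrors the question):
import cv2
import numpy as np
import tensorflow as tf

model = tf.keras.models.load_model('my_model.h5')  # hypothetical: the pretrained model from the question

img_array = cv2.imread("hamburger.jpg")
img_array = tf.cast(img_array, tf.int32) / 255
image = tf.image.resize(img_array, (513, 513))  # shape (513, 513, 3)

image = np.expand_dims(image, axis=0)           # shape (1, 513, 513, 3)
prediction = model.predict(image)               # the model now sees a proper batch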

How does Conv2d work when the input's feature-map count differs from the kernel weight's input depth?

In this code, x is the input to the 2D convolutional layer, with a batch size of 1, height and width both 256, and 40 feature maps.
w is the weight for the convolutional layer, with a (3x3) kernel, an input depth of 10, and 80 filters.
x = tf.random.uniform((1, 256, 256, 40))
w = tf.random.uniform((3, 3, 10, 80))
out = tf.nn.conv2d(x, w, strides = (1, 1), padding = 'SAME', data_format='NHWC')
out.shape
##############
OUTPUT:
TensorShape([1, 256, 256, 80])
This particular code works fine, and the output shape is (1 x 256 x 256 x 80). But why?
Here, the input's feature-map count is not the same as the kernel's depth, which I think it should be.
In my view, the weight should have shape (3 x 3 x 40 x 80) instead of (3 x 3 x 10 x 80),
so that the weight's 3rd value (index 2, i.e. 40 in my proposed shape) equals the 4th value of the input's shape (index 3, which is 40).
Otherwise, how would the convolution operation happen?
It also works fine for weights with the shapes:
(3x3x1x80)
(3x3x2x80)
(3x3x4x80)
(3x3x5x80)
(3x3x8x80)
(3x3x20x80)
It seems that it works for any depth that divides 40.

Input 0 is incompatible with layer model_1: expected shape=(None, 244, 720, 3), found shape=(None, 720, 3)

I wanted to test my model by uploading an image, but I got this error. I think the error comes from somewhere in these lines; I'm just not sure how to fix it.
IMAGE_SIZE = [244,720]
inception = InceptionV3(input_shape=IMAGE_SIZE + [3], weights='imagenet',include_top=False)
Also here's the code of uploading my test image
picture = image.load_img('/content/DSC_0365.JPG', target_size=(244,720))
img = img_to_array(picture)
prediction = model.predict(img)
print (prediction)
I'm still a newbie in machine learning, so my knowledge isn't that deep yet.
This is mostly because you didn't prepare your input's dimensions for your Inception model. Here is one possible solution.
Model
from tensorflow.keras.applications import *
IMAGE_SIZE = [244,720]
inception = InceptionV3(input_shape=IMAGE_SIZE + [3],
                        weights='imagenet', include_top=False)
# check it's input shape
inception.input_shape
(None, 244, 720, 3)
Inference
Let's test a sample by passing it to the model.
from PIL import Image
a = Image.open('/content/1.png').convert('RGB')
display(a)
Check its basic properties.
a.mode, a.size, a.format
('RGB', (297, 308), None)
So it's already a (297 x 308 x 3) RGB image. But to be able to pass it to the model, we need an extra axis: the batch axis. To do that, we can do:
import tensorflow as tf
import numpy as np
a = tf.expand_dims(np.array(a), axis=0)
a.shape
TensorShape([1, 308, 297, 3])
Much better. Now, we may want to normalize our data and resize it according to the model input shape. To do that, we can do:
a = tf.divide(a, 255)
a = tf.image.resize(a, [244,720])
a.shape
TensorShape([1, 244, 720, 3])
And lastly, pass it to the model.
inception(a).shape
TensorShape([1, 6, 21, 2048])
# or, keep the prediction for later analysis
y_pred = inception(a)
Updated
If you're using the tf.keras image-processing utilities, which load the image in PIL format, then we can simply do:
image = tf.keras.preprocessing.image.load_img('/content/1.png',
                                               target_size=(244,720))
input_arr = tf.keras.preprocessing.image.img_to_array(image)
input_arr = np.array([input_arr]) # Convert single image to a batch.
inception(input_arr).shape
TensorShape([1, 6, 21, 2048])

Keras - How should I specify the input_shape of my training data? (The data are gray-scale images)

I'm using Conv2D in Keras to do some classification on gray-scale images. Each image is stored as a 240*300 matrix (namely a list [A_1, A_2, ..., A_240], where each A_k is a list of length 300).
How should I specify the input_shape of the first layer of my ConvNet?
Thanks
ValueError: Input 0 of layer conv2d is incompatible with the layer:
expected ndim=4, found ndim=3. Full shape received: [None, 240, 300]
First, you need to reshape your data, adding a dimension of size one at the end, which represents the single channel of a grayscale image. Assuming data has shape (samples, 240, 300):
data = data.reshape((-1, 240, 300, 1))
This will give data the shape (samples, 240, 300, 1). Then you should pass input_shape=(240, 300, 1) to your first layer.
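For completeness, a minimal sketch of such a model might look like this (the dummy data and all layer sizes other than input_shape are placeholders):
import numpy as np
from tensorflow.keras import layers, models

# dummy stand-in for the real grayscale images, shape (samples, 240, 300)
data = np.random.rand(8, 240, 300)
data = data.reshape((-1, 240, 300, 1))  # add the channel axis

model = models.Sequential([
    layers.Conv2D(32, (3, 3), activation='relu', input_shape=(240, 300, 1)),
    layers.MaxPooling2D((2, 2)),
    layers.Flatten(),
    layers.Dense(10, activation='softmax'),  # number of classes is a placeholder
])
model.summary()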

Tensorflow avoid shape information with crop

Once again I have an issue with TensorFlow. I am using an FCN model and need to apply a random crop due to memory constraints.
tf.random_crop(combined, size=[512, 512, 4])
Unfortunately, the new size now "sticks" to the tensor, and I cannot get rid of it.
The problem this causes is that the resulting model only accepts input of size 512x512, which, as far as I know, cannot be worked around in a nice way.
Is there any solution to either remove the shape information introduced by random_crop, or to easily adapt the size after obtaining a trained model?
Thank you in advance.
I don't know if it will completely suit your use-case, but the size parameter of tf.random_crop() can be a tensor, so you can for instance use a placeholder as shown in the example below.
import tensorflow as tf
import numpy as np
image = tf.placeholder(tf.float64, [None, None, 4])
cropped_size = tf.placeholder(tf.int32, [2])
cropped_image = tf.random_crop(image, size=[cropped_size[0], cropped_size[1], 4])
print(cropped_image.get_shape().as_list())
# [None, None, 4]
with tf.Session() as sess:
    res = sess.run(cropped_image,
                   feed_dict={image: np.random.rand(900, 600, 4),
                              cropped_size: [512, 512]})
    print(res.shape)
    # (512, 512, 4)
EDIT:
There may be different solutions to get the value of cropped_size assigned without using a feed_dict, depending on how the crop dimensions are stored; e.g. using TF file readers (the values would stay unknown until read).
Another simple hack otherwise: take advantage of tf.placeholder_with_default(default_val, shape) (doc), providing default_val with the crop dimensions acquired anyhow. Since the value of tf.placeholder_with_default() isn't actually assigned until runtime (in case you want to feed this placeholder with a different value), your dimensions would stay None in the graph:
import tensorflow as tf
image = tf.random_uniform((900, 600, 4)) # image tensor, acquired anyhow e.g. from tf.data
cropped_size_for_this_run = [512, 512] # crop dimensions, acquired anyhow
cropped_size = tf.placeholder_with_default(cropped_size_for_this_run, shape=[2])
cropped_image = tf.random_crop(image, size=[cropped_size[0], cropped_size[1], 4])
print(cropped_image.get_shape().as_list())
# [None, None, 4]
with tf.Session() as sess:
    # You can leave cropped_size with its default value assigned at runtime:
    res = sess.run(cropped_image)
    print(res.shape)
    # (512, 512, 4)

    # ... or you can specify a new one if you wish so:
    res = sess.run(cropped_image, feed_dict={cropped_size: [256, 256]})
    print(res.shape)
    # (256, 256, 4)

    # ... It would switch back to the default value if you don't feed one:
    res = sess.run(cropped_image)
    print(res.shape)
    # (512, 512, 4)