I have an image and I'd like to make some predictions on it using a pretrained model. I've prepared my model and everything looks fine in theory, but when I call model.predict(image) I get an error. First, here is my code:
import cv2
import tensorflow as tf
img_array = cv2.imread("hamburger.jpg")  # load the image
img_array = tf.cast(img_array, tf.int32) / 255
image = tf.image.resize(img_array, (513, 513))  # resize so its shape becomes (513, 513, 3)
Everything looks fine up to here, but when I try to make a prediction, I get this error:
ValueError: The argument 'images' (value Tensor("IteratorGetNext:0", shape=(None, 513, 3), dtype=float32)) is not compatible with the shape this function was traced with. Expected shape (None, 513, 513, 3), but got shape (None, 513, 3).
This is the first time I've seen a shape like (None, 513, 513, 3). How can I reshape my image to match it?
To predict a single image with model.predict, you need to add a dimension for the batch size. To do that, use:
import numpy as np

image = np.expand_dims(image, axis=0)
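Putting it together with the question's code, a minimal sketch (model is assumed to be your already-loaded pretrained model; the cast here is to float32 so the division produces the float input the model expects):

import cv2
import numpy as np
import tensorflow as tf

img_array = cv2.imread("hamburger.jpg")             # (H, W, 3), BGR, uint8
img_array = tf.cast(img_array, tf.float32) / 255.0  # scale to [0, 1]
image = tf.image.resize(img_array, (513, 513))      # (513, 513, 3)
image = np.expand_dims(image, axis=0)               # (1, 513, 513, 3)

prediction = model.predict(image)                   # model: your pretrained model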
Related
Reshaping tensor with tf.reshape during forward pass causes disconnect error in TensorFlow.
For example,
...
image = tf.keras.layers.Input(INPUT_SHAPE, name='image', dtype=tf.uint8)
image = tf.reshape(image, shape = (-1, 1344, 768, 1))
image_norm = normalize(image)
...
The above code causes the following error:
Graph disconnected: cannot obtain value for tensor
KerasTensor(type_spec=TensorSpec(shape=(None, 8, 1344, 768, 1),
dtype=tf.uint8, name='image'), name='image', description="created by
layer 'image'") at layer "tf.reshape". The following previous layers
were accessed without issue: []
Is there any way to reshape the tensor without a disconnect?
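The snippet doesn't show how the Model is constructed, but this error typically appears when the tensor passed as inputs= is not the original Input tensor; reassigning image to the reshape result would do exactly that. A hedged sketch of one way to keep the graph connected (INPUT_SHAPE and the normalization step are assumptions based on the snippet above):

import tensorflow as tf

INPUT_SHAPE = (8, 1344, 768, 1)  # assumed from the error message

# Keep the Input in its own variable; reassigning `image` to the reshape
# result means the real Input tensor is never passed to tf.keras.Model,
# which is one common cause of "Graph disconnected".
inputs = tf.keras.layers.Input(INPUT_SHAPE, name='image', dtype=tf.uint8)
x = tf.reshape(inputs, shape=(-1, 1344, 768, 1))  # auto-wrapped as a layer in TF 2.4+
x = tf.cast(x, tf.float32) / 255.0                # stand-in for normalize()
model = tf.keras.Model(inputs=inputs, outputs=x)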
I wanted to test my model by uploading an image, but I got this error. I think the problem is somewhere in these lines; I'm just not sure how to fix it.
IMAGE_SIZE = [244,720]
inception = InceptionV3(input_shape=IMAGE_SIZE + [3], weights='imagenet',include_top=False)
Also, here's the code that loads my test image:
picture = image.load_img('/content/DSC_0365.JPG', target_size=(244,720))
img = img_to_array(picture)
prediction = model.predict(img)
print (prediction)
I'm still a newbie in machine learning, so my knowledge right now is not that deep.
This is mostly because you didn't prepare your input (its dimensions) for the Inception model. Here is one possible solution.
Model
from tensorflow.keras.applications import InceptionV3
IMAGE_SIZE = [244,720]
inception = InceptionV3(input_shape=IMAGE_SIZE + [3],
weights='imagenet', include_top=False)
# check its input shape
inception.input_shape
(None, 244, 720, 3)
Inference
Let's test a sample by passing it to the model.
from PIL import Image
a = Image.open('/content/1.png').convert('RGB')
display(a)
Check its basic properties.
a.mode, a.size, a.format
('RGB', (297, 308), None)
So, it's a 297 x 308 (width x height) RGB image; as an array its shape will be (308, 297, 3). But to be able to pass it to the model, we need an extra axis, the batch axis. To do that, we can do:
import tensorflow as tf
import numpy as np
a = tf.expand_dims(np.array(a), axis=0)
a.shape
TensorShape([1, 308, 297, 3])
Much better. Now, we may want to normalize our data and resize it according to the model input shape. To do that, we can do:
a = tf.divide(a, 255)
a = tf.image.resize(a, [244,720])
a.shape
TensorShape([1, 244, 720, 3])
And lastly, pass it to the model.
inception(a).shape
TensorShape([1, 6, 21, 2048])
# or keep the prediction for later analysis
y_pred = inception(a)
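Note that with include_top=False the model returns a convolutional feature map rather than class probabilities; to get actual predictions you would typically attach a classification head. A minimal sketch (num_classes is a hypothetical placeholder, not from the question):

from tensorflow.keras import layers, models

num_classes = 10  # hypothetical; replace with your own number of classes

# pool the (6, 21, 2048) feature map and classify
x = layers.GlobalAveragePooling2D()(inception.output)
outputs = layers.Dense(num_classes, activation='softmax')(x)
clf = models.Model(inception.input, outputs)
clf(a).shape  # TensorShape([1, 10])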
Updated
If you're using the tf.keras image-preprocessing utilities, which load the image in PIL format, then we can simply do:
image = tf.keras.preprocessing.image.load_img('/content/1.png',
target_size=(244,720))
input_arr = tf.keras.preprocessing.image.img_to_array(image)
input_arr = np.array([input_arr]) # Convert single image to a batch.
inception(input_arr).shape
TensorShape([1, 6, 21, 2048])
I am using VGG16 for transfer learning. My images are grayscale, so I need to change the input channel shape of VGG16 from (224, 224, 3) to (224, 224, 1). I tried the following code and got an error:
TypeError: build() takes from 1 to 2 positional arguments but 4 were given
Can anyone tell me where I'm going wrong?
vgg16_model = load_model('Fetched_VGG.h5')
vgg16_model.summary()

# transform the model to Sequential
model = Sequential()
for layer in vgg16_model.layers[1:-1]:
    model.add(layer)

# freeze the layers (prevent the weights from being updated)
for layer in model.layers:
    layer.trainable = False

model.build(224, 224, 1)
model.add(Dense(2, activation='softmax', name='predictions'))
You can't. Even if you get rid of the input layer, this model's graph has already been compiled, and your first conv layer expects an input with 3 channels. I don't think there is an easy workaround to make it accept 1 channel, if there is one at all.
Instead, you need to repeat your data along the third dimension so that the same grayscale image fills all 3 bands in place of RGB; that works just fine.
If your image has the shape (224, 224, 1):
import numpy as np
gray_image_3band = np.repeat(gray_img, repeats = 3, axis = -1)
If your image has the shape (224, 224):
gray_image_3band = np.repeat(gray_img[..., np.newaxis], repeats = 3, axis = -1)
This way you don't need to call model.build() anymore; keep the input layer. But if you ever want to call it, you need to pass the shape as a single tuple, like this:
model.build((None, 224, 224, 1))  # correct: one tuple, including the batch dimension
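As an end-to-end illustration of the repeat approach, here is a hedged sketch using the stock keras.applications VGG16 rather than the question's Fetched_VGG.h5 (the random grayscale image is a stand-in for your data):

import numpy as np
from tensorflow.keras.applications import VGG16

# stand-in for one of your grayscale images, shape (224, 224, 1)
gray_img = np.random.rand(224, 224, 1).astype('float32')

# repeat the single channel into 3 bands so the unmodified VGG16 accepts it
rgb_like = np.repeat(gray_img, repeats=3, axis=-1)  # (224, 224, 3)
batch = np.expand_dims(rgb_like, axis=0)            # (1, 224, 224, 3)

model = VGG16(weights='imagenet', include_top=False)
print(model.predict(batch).shape)                   # (1, 7, 7, 512)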
I'm trying to do binary classification on labeled data for 300+ videos. The goal is to extract features using a ConvNet and feed them into an LSTM for sequencing, with a binary output after evaluating all the frames in the video. I've preprocessed each video to have exactly 200 frames, with each image being 256 x 256, so that it would be easier to feed into a DNN, and I've split the dataset into two folders as labels (e.g. dog and cat).
However, after searching Stack Overflow for hours, I'm still unsure how to reshape the dataset of video frames so that the model accounts for the number of frames. I'm trying to feed the video frames into 3D ConvNets and TimeDistributed(2D ConvNets) + LSTM (e.g. shape (300, 200, 256, 256, 3)) with no luck. I'm able to perform 2D ConvNet classification pretty easily (the data is a 4D tensor; I need to add a time-step dimension to make it a 5D tensor), but I'm now having issues wrangling the temporal aspect.
I've been using Keras ImageDataGenerator and train_datagen.flow_from_directory to read in the images, and I've been running into shape mismatch errors when I attempt to feed them to a TimeDistributed ConvNet. I know that, hypothetically, if I have an X_train dataset I can do X_train = X_train.reshape(...). Any example code would be very much appreciated.
I think you could use ConvLSTM2D in Keras for your purpose. ImageDataGenerator is very good for a CNN with images, but may not be convenient for a CRNN with videos.
You have already transformed your 300 videos into the same shape (200, 256, 256, 3): each video has 200 frames, each frame is 256x256 RGB. Next, you need to load them into a numpy array of shape (300, 200, 256, 256, 3). For reading videos into numpy arrays, see this answer.
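The linked answer has the details; as a rough sketch of that loading step (assuming a hypothetical layout where each video is a folder of 200 frame images):

import os
import cv2
import numpy as np

# assumed layout: frames/<video_name>/frame_000.jpg ... frame_199.jpg
root = 'frames'
videos = []
for name in sorted(os.listdir(root)):
    frame_dir = os.path.join(root, name)
    frames = [cv2.imread(os.path.join(frame_dir, f))   # (256, 256, 3), BGR
              for f in sorted(os.listdir(frame_dir))]
    videos.append(np.stack(frames))                    # (200, 256, 256, 3)

video_data = np.stack(videos)                          # (300, 200, 256, 256, 3)

Note that 300 videos x 200 frames at 256 x 256 x 3 is roughly 12 GB even as uint8, so in practice you may need a generator instead of one big array.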
Then you can feed the data into a CRNN. Its first ConvLSTM2D layer should have input_shape = (200, 256, 256, 3); Keras input_shape excludes the batch axis.
A sample according to your data (illustrative only, not tested):
from keras.models import Sequential
from keras.layers import Dense, ConvLSTM2D

model = Sequential()
model.add(ConvLSTM2D(filters=32, kernel_size=(5, 5),
                     input_shape=(200, 256, 256, 3)))  # (frames, h, w, ch)
### model.add(...more layers)
model.add(Dense(units=num_of_categories,  # number of your video categories
                kernel_initializer='Orthogonal', activation='softmax'))
model.compile(optimizer='adam', loss='categorical_crossentropy',
              metrics=['accuracy'])

# then train it
model.fit(video_data,            # shape (300, 200, 256, 256, 3)
          [list of categories],  # one label per video
          batch_size=20,
          epochs=50,
          validation_split=0.1)
I hope this could be a little helpful.
I use Keras 2 with TensorFlow as the back-end, and I tried to feed a horizontal rectangular image (width: 150 x height: 100 x ch: 3) into the network.
I use cv2 for pre-processing images, and cv2 & TensorFlow treat the shape of images in [height, width, ch] order (in my case [100, 150, 3]; this format is the opposite of (width: 150 x height: 100 x ch: 3), but that is not a mistake).
So I defined the Keras model API input as in the following code, but it raised an error.
img = cv2.imread('input/train/{}.jpg'.format(id))
img = cv2.resize(img, (100, 150))
inputs = Input(shape=(100, 150, 3))
x = Conv2D(8, (3, 3), padding='same', kernel_initializer='he_normal')(inputs)
...
The error message is below:
ValueError: Error when checking input: expected input_4 to have shape
(None, 100, 150, 3) but got array with shape (4, 150, 100, 3)
By the way, input = Input((150, 100, 3)) runs without an error.
This discrepancy between Keras & TensorFlow feels odd to me, so I suspect that it merely doesn't raise an error and is not actually working properly.
Can anybody explain this? I couldn't find the input shape ordering in the Keras documentation.
You can change the dimension ordering as you prefer. You can print and change it like this:
from keras import backend as K
print(K.image_data_format()) # print current format
K.set_image_data_format('channels_last') # set format
If you want to permanently change the dimension ordering, you should edit it in the keras.json file, usually located at ~/.keras/keras.json:
"image_data_format": "channels_last"
My problem came from the order of width & height in the argument to cv2.resize().
cv2.resize() takes its size argument as cv2.resize(img, (width, height)), whereas NumPy stores image arrays in (height, width, channels) order.
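A quick demonstration of the mismatch, as a minimal sketch with a synthetic image:

import cv2
import numpy as np

img = np.zeros((100, 150, 3), dtype=np.uint8)  # height=100, width=150

# cv2.resize takes (width, height), so this *swaps* the aspect ratio...
swapped = cv2.resize(img, (100, 150))
print(swapped.shape)   # (150, 100, 3) -- height=150, width=100

# ...whereas keeping (width, height) order preserves it
correct = cv2.resize(img, (150, 100))
print(correct.shape)   # (100, 150, 3)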
Taken from https://keras.io/api/layers/convolution_layers/convolution2d/
Arguments
...
data_format: A string, one of channels_last (default) or
channels_first. The ordering of the dimensions in the inputs.
channels_last corresponds to inputs with shape (batch_size, height,
width, channels) while channels_first corresponds to inputs with shape
(batch_size, channels, height, width). It defaults to the
image_data_format value found in your Keras config file at
~/.keras/keras.json. If you never set it, then it will be
channels_last.
...
So it is: (batch_size, height, width, channels) (by default)
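As a small sanity check (a sketch, not part of the quoted documentation), declaring the input as (height, width, channels) matches the question's [100, 150, 3] arrays:

import tensorflow as tf

# channels_last (the default): (batch, height, width, channels)
inputs = tf.keras.Input(shape=(100, 150, 3))  # height=100, width=150
x = tf.keras.layers.Conv2D(8, (3, 3), padding='same')(inputs)
print(x.shape)  # (None, 100, 150, 8)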