Keras - How should I specify the input_shape of my training data? (The data are gray-scale images) - tensorflow

I'm using Conv2D in Keras to classify gray-scale images. Each image is stored as a 240*300 matrix (namely a list [ A_1, A_2,..., A_240 ], where each A_k is a list of length 300).
How should I specify the input_shape of the first layer of my ConvNet?
Thanks
ValueError: Input 0 of layer conv2d is incompatible with the layer: expected ndim=4, found ndim=3. Full shape received: [None, 240, 300]

First, you need to reshape your data, adding a dimension at the end with size of one, which represents one channel (a grayscale image). Assuming data has shape (samples, 240, 300):
data = data.reshape((-1, 240, 300, 1))
This gives data the shape (samples, 240, 300, 1). Then pass input_shape=(240, 300, 1) to your first layer.
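As a minimal sketch of the reshape described above, assuming the images arrive as nested Python lists (the name `images` and the random sample values are illustrative and not from the question):

```python
import numpy as np

# Illustrative stand-in for the real data: 5 grayscale images,
# each a list of 240 rows holding 300 pixel values.
rng = np.random.default_rng(0)
images = rng.random((5, 240, 300)).tolist()

# Convert the nested lists to one array of shape (samples, 240, 300).
data = np.asarray(images, dtype=np.float32)

# Append a channel axis of size 1 for grayscale.
# np.expand_dims(data, axis=-1) would do the same thing.
data = data.reshape((-1, 240, 300, 1))
print(data.shape)  # (5, 240, 300, 1)

# The per-sample shape (everything except the batch axis) is what
# input_shape describes.
input_shape = data.shape[1:]
print(input_shape)  # (240, 300, 1)
```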

Related

Input 0 of layer "model" is incompatible with the layer: expected shape=(None, 250, 3), found shape=(None, 3) in trained transformer model

I have a Keras transformer model trained with TensorFlow 2.7.0 and Python 3.7 with input shape (None, 250, 3), and a 2D array input (not an image) with shape (250, 3).
When making a prediction with:
prediction = model.predict(state)
I get ValueError: Input 0 of layer "model" is incompatible with the layer: expected shape=(None, 250, 3), found shape=(None, 3)
project code: https://github.com/MikeSifanele/TT
This is what state looks like:
state = np.array([[-0.07714844,-0.06640625,-0.140625],[-0.140625,-0.1650391,-0.2265625]...[0.6376953,0.6005859,0.6083984],[0.7714844,0.7441406,0.7578125]], np.float32)
Some explanation:
In the model's input shape (None, 250, 3), the first axis (represented by None) is the "sample" axis, while the rest, (250, 3), is the shape of a single input. So when an array of shape (250, 3) is passed, the first axis is taken to be the "sample" axis and the rest, just 3, to be the input shape. To make the shapes consistent, add a dimension at the beginning:
state = np.expand_dims(state, axis=0)
The shape of state then becomes (1, 250, 3), which matches (None, 250, 3).
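A small NumPy-only sketch of that step, using a zero-filled placeholder for state (the real values come from the asker's project):

```python
import numpy as np

# Placeholder single sample with the shape from the question: (250, 3).
state = np.zeros((250, 3), dtype=np.float32)

# Add a leading batch axis so the array holds exactly one sample.
batched = np.expand_dims(state, axis=0)
print(batched.shape)  # (1, 250, 3)

# Indexing with np.newaxis is an equivalent spelling.
same = state[np.newaxis, ...]
print(same.shape)  # (1, 250, 3)
```

The batched array is what would be passed to model.predict in place of state.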

How to pass a list of lists to tensorflow?

I'm trying to build a model in TensorFlow that uses sentences to predict images. I transformed every sentence into a list of 300 values, so the data is a list of lists:
0 [-0.22607538080774248, 0.30380163341760635, 0....
1 [-0.10856867488473654, 0.17990960367023945, 0....
2 [-0.15721752890385687, 0.1608753204345703, 0.4...
3 [-0.12894394318573177, 0.13585415855050087, 0....
4 [-0.27382510248571634, 0.22385768964886665, 0....
40449 [-0.28715573996305466, 0.2722414545714855, 0.6...
40451 [-0.04035807272884995, 0.2275269404053688, 0.3...
40452 [-0.19741788890678436, 0.3378600552678108, 0.7...
40453 [-0.10771899553947151, 0.13040382787585258, 0....
40454 [-0.07718773453962058, 0.28313175216317177, 0....
Name: Text, Length: 31978, dtype: object
How can I give it to tensorflow as an input?
I tried
model = Sequential([
    Dense(2, activation="relu", input_shape = (300,)),
    Reshape((256, 256, 3), input_shape = (300,))
])
model.compile(loss='mse', optimizer='adam')
history = model.fit(x_ent, y_ent, epochs=3, batch_size=64)
But when I compile the model, it says
ValueError: Error when checking input: expected dense_2_input to have shape (300,) but got array with shape (1,)
Also, I used the Reshape layer in order to transform vectors to images, but I don't know if there is a better way to do that.
Does each image need 300 sentences for classification, or does each sentence have a feature vector of size 300? If each sentence is a list of length 300 and you have 40454 sentences, the input array as a whole must have shape 40454x300. Note that input_shape does not include the batch (sample) axis, so the Dense input layer should keep input_shape = (300,); what has to change is the data, which must be converted from a column of lists into a single 2D array before it is passed to fit.
I referred to the TensorFlow Keras documentation:
N-D tensor with shape: (batch_size, ..., input_dim). The most common situation would be a 2D input with shape (batch_size, input_dim).
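One way to get data into the (batch_size, input_dim) layout quoted above is sketched here. It assumes the Text column behaves like a 1-D object array whose elements are lists of 300 floats; the array `column` and its contents are made up for illustration. With a pandas Series, the same tolist() call should apply.

```python
import numpy as np

# Illustrative stand-in for the Text column: a 1-D object array
# in which every element is a list of 300 floats.
n_sentences = 8
column = np.empty(n_sentences, dtype=object)
for i in range(n_sentences):
    column[i] = [0.1 * i] * 300

# As stored, the array has one object per row and no feature axis.
print(column.shape)  # (8,)

# Turn the list of lists into one dense float array.
x_ent = np.array(column.tolist(), dtype=np.float32)
print(x_ent.shape)  # (8, 300)
```

The per-sample shape is then x_ent.shape[1:], that is (300,), which is the value input_shape describes.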

Correct input for TimeDistributed Convolution2D Keras

I have a sequence of 327 grayscale frames, each 480 rows by 640 columns.
print(X_train.shape)
gives: (327, 480, 640, 1)
I have the following model:
N = 2 #number of frames to distribute
model = Sequential()
model.add(TimeDistributed(Convolution2D(32, activation='relu'), input_shape = (N, 480,640,1)))
...
print(model.output_shape)
gives: (None, 2, 480, 640, 32)
I need one more dimension in the input I pass to the convolution. As it stands, I get the following error:
ValueError: Error when checking input: expected time_distributed_1_input to have 5 dimensions, but got array with shape (327, 480, 640, 1)
How can I solve this?
Thanks!
Edit: Fundamentally what I need is to transform the input (327, 480, 640, 1) into (x, 2, 480, 640, 1) (x=327/2 ?)
You are trying to perform a 2D convolution on 3D data (2x480x640). Use the Convolution3D.
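The reshape asked for in the question's edit can be sketched in NumPy. This is only one possible grouping (consecutive, non-overlapping groups of N frames), and the helper name `group_frames` is hypothetical. The spatial size is shrunk here to keep the example light on memory.

```python
import numpy as np

def group_frames(frames, n):
    """Split (num_frames, H, W, C) into (num_frames // n, n, H, W, C)."""
    # Drop trailing frames that do not fill a complete group of n.
    usable = (frames.shape[0] // n) * n
    return frames[:usable].reshape((-1, n) + frames.shape[1:])

# Small stand-in for the (327, 480, 640, 1) array from the question.
X_train = np.zeros((327, 48, 64, 1), dtype=np.float32)

X_grouped = group_frames(X_train, 2)
print(X_grouped.shape)  # (163, 2, 48, 64, 1)
```

With 327 frames and N = 2, the last frame is dropped, giving x = 163 groups. Any per-frame labels would need to be grouped in the same way.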

Data Preprocessing - Input Shape for TimeDistributed CNN (LRCN) & ConvLSTM2D for Video Classification

I'm trying to do binary classification on labeled data for 300+ videos. The goal is to extract features using a ConvNet and feed them into an LSTM for sequencing, with a binary output after evaluating all the frames in the video. I've preprocessed each video to have exactly 200 frames, with each image being 256 x 256, so that it would be easier to feed into a DNN, and split the dataset into two folders as labels (e.g. dog and cat).
However, after searching Stack Overflow for hours, I'm still unsure how to reshape the dataset of video frames so that the model accounts for the number of frames. I'm trying to feed the video frames (e.g. with shape (300, 200, 256, 256, 3)) into 3D ConvNets and TimeDistributed (2D ConvNets) + LSTM, with no luck. I'm able to perform 2D ConvNet classification pretty easily (the data is a 4D tensor; I need to add a time-step dimension to make it a 5D tensor), but I'm now having issues wrangling the temporal aspect.
I've been using Keras ImageDataGenerator and train_datagen.flow_from_directory to read in the images and have been running into shape mismatch errors when I attempt to feed them to a TimeDistributed ConvNet. I know that, hypothetically, if I have an X_train dataset I can do X_train = X_train.reshape(...). Any example code would be very much appreciated.
I think you could use ConvLSTM2D in Keras for your purpose. ImageDataGenerator is very good for CNNs with images, but may not be convenient for a CRNN with videos.
You have already transformed your 300 videos into the same shape (200, 256, 256, 3): each video has 200 frames, and each frame is 256x256 RGB. Next, you need to load them into a numpy array of shape (300, 200, 256, 256, 3). For reading videos into numpy arrays see this answer.
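Assuming each video has already been read into its own array of frames, the stacking step might look like the sketch below. The list `videos` and the tiny dimensions are placeholders so the example runs quickly; they are not the real data.

```python
import numpy as np

# Illustrative stand-ins: 3 videos of 4 frames at 8x8 RGB, instead of
# 300 videos of 200 frames at 256x256.
videos = [np.zeros((4, 8, 8, 3), dtype=np.uint8) for _ in range(3)]

# np.stack adds a new leading axis:
# (videos, frames, height, width, channels).
video_data = np.stack(videos, axis=0)
print(video_data.shape)  # (3, 4, 8, 8, 3)
```

At full size, (300, 200, 256, 256, 3) holds 11,796,480,000 values, so roughly 11.8 GB even as uint8. Loading in batches may be necessary in practice.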
Then you can feed the data into a CRNN. Its first ConvLSTM2D layer should have input_shape = (200, 256, 256, 3); input_shape leaves out the batch axis, so the corresponding batch shape is (None, 200, 256, 256, 3).
A sample according to your data (only illustrative and not tested):
from keras.models import Sequential
from keras.layers import ConvLSTM2D, Dense, Flatten
model = Sequential()
# input_shape is (frames, height, width, channels), without the batch axis
model.add(ConvLSTM2D(filters = 32, kernel_size = (5, 5), input_shape = (200, 256, 256, 3)))
### model.add(...more layers)
model.add(Flatten()) # Dense needs one flat feature vector per video
model.add(Dense(units = num_of_categories, # number of your video categories
                kernel_initializer = 'Orthogonal', activation = 'softmax'))
model.compile(optimizer = 'adam', loss = 'categorical_crossentropy', metrics = ['accuracy'])
# then train it
model.fit(video_data, # shape (300, 200, 256, 256, 3)
          labels, # one-hot categories, shape (300, num_of_categories)
          batch_size = 20,
          epochs = 50,
          validation_split = 0.1)
I hope this could be a little helpful.

Keras input shape ordering is (width, height, ch)?

I use Keras 2 with TensorFlow as the back-end and tried to feed a horizontal rectangular image (width:150 x height:100 x ch:3) into a network.
I use cv2 for pre-processing images, and cv2 and TensorFlow treat the shape of an image as [height, width, ch] (in my case, [100, 150, 3]; this is the reverse of (width:150 x height:100 x ch:3), but it is not a mistake).
So I defined the Keras model input with the following code, but it raised an error.
img = cv2.imread('input/train/{}.jpg'.format(id))
img = cv2.resize(img, (100, 150))
inputs = Input(shape=(100, 150, 3))
x = Conv2D(8, (3, 3), padding='same', kernel_initializer='he_normal')(inputs)
~~~
The error message is below:
ValueError: Error when checking input: expected input_4 to have shape (None, 100, 150, 3) but got array with shape (4, 150, 100, 3)
By the way, input = Input((150, 100, 3)) runs without error.
The discrepancy between Keras and TensorFlow seems odd to me, so I suspect that it merely runs without raising an error but does not actually work properly.
Can anybody explain that? I couldn't find the input shape ordering in the Keras documentation.
You can change the dimension ordering as you prefer.
You can print and change the dimension ordering like this:
from keras import backend as K
print(K.image_data_format()) # print current format
K.set_image_data_format('channels_last') # set format
If you want to permanently change the dimension ordering, you should edit it in the keras.json file, usually located at ~/.keras/keras.json:
"image_data_format": "channels_last"
My problem came from the order of width and height in the argument to cv2.resize().
cv2.resize() takes its size argument as cv2.resize(img, (width, height)), whereas numpy stores image arrays in (height, width) order.
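To keep the two orderings straight, one option is a tiny helper that derives the resize argument from the array shape the network expects. The helper name `dsize_for` is hypothetical, and the cv2 call appears only as a comment so the sketch runs without OpenCV installed.

```python
def dsize_for(array_shape):
    """Return (width, height) for an array shaped (height, width, ...)."""
    height, width = array_shape[:2]
    return (width, height)

# Array shape the network expects: height 100, width 150, 3 channels.
target_shape = (100, 150, 3)
print(dsize_for(target_shape))  # (150, 100)

# With OpenCV this would be used as:
#   img = cv2.resize(img, dsize_for(target_shape))
# and the resulting img.shape would be (100, 150, 3).
```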
Taken from https://keras.io/api/layers/convolution_layers/convolution2d/
Arguments
...
data_format: A string, one of channels_last (default) or
channels_first. The ordering of the dimensions in the inputs.
channels_last corresponds to inputs with shape (batch_size, height,
width, channels) while channels_first corresponds to inputs with shape
(batch_size, channels, height, width). It defaults to the
image_data_format value found in your Keras config file at
~/.keras/keras.json. If you never set it, then it will be
channels_last.
...
So it is: (batch_size, height, width, channels) (by default)
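For data stored in one layout but needed in the other, the axes can be reordered with NumPy. This is a sketch with a zero-filled placeholder batch and is independent of the Keras config setting.

```python
import numpy as np

# Placeholder batch in channels_last layout:
# (batch_size, height, width, channels).
batch_last = np.zeros((4, 100, 150, 3), dtype=np.float32)

# Move the channel axis to position 1 for channels_first:
# (batch_size, channels, height, width).
batch_first = np.transpose(batch_last, (0, 3, 1, 2))
print(batch_first.shape)  # (4, 3, 100, 150)

# And back to channels_last.
restored = np.transpose(batch_first, (0, 2, 3, 1))
print(restored.shape)  # (4, 100, 150, 3)
```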