Error: expected conv3d_1_input to have 5 dimensions, but got array with shape (10, 224, 224, 3) - tensorflow

I'm trying to train a neural network on a dataset for liveness anti-spoofing. I have some videos in two folders named genuine and fake. I extracted 10 frames from each video and saved them in two folders with the same names under a new directory, training.
----/genuine/ #contains 10 frames * 300 videos = 3000 images
----/fake/ #contains 10 frames * 800 videos = 8000 images
I designed the following 3D ConvNet using Keras as my first try, but running it throws an exception:
from keras.preprocessing.image import ImageDataGenerator
from keras import Model, optimizers, activations, losses, regularizers, backend, Sequential
from keras.layers import Dense, MaxPooling3D, AveragePooling3D, Conv3D, Input, Flatten, BatchNormalization
TARGET_SIZE = (224, 224)
train_datagen = ImageDataGenerator(rescale=1.0/255,
train_generator = train_datagen.flow_from_directory("./training/",
validation_generator = train_datagen.flow_from_directory("./training/",
SHAPE = (10, 224, 224, 3)
model = Sequential()
model.add(Conv3D(filters=128, kernel_size=(1, 3, 3), data_format='channels_last', activation='relu', input_shape=(10, 224, 224, 3)))
model.add(MaxPooling3D(data_format='channels_last', pool_size=(1, 2, 2)))
model.add(Conv3D(filters=64, kernel_size=(2, 3, 3), activation='relu'))
model.add(MaxPooling3D(pool_size=(1, 2, 2)))
model.add(Conv3D(filters=32, kernel_size=(2, 3, 3), activation='relu'))
model.add(Conv3D(filters=32, kernel_size=(2, 3, 3), activation='relu'))
model.add(MaxPooling3D(pool_size=(1, 2, 2)))
model.add(Conv3D(filters=16, kernel_size=(2, 3, 3), activation='relu'))
model.add(Conv3D(filters=16, kernel_size=(2, 3, 3), activation='relu'))
model.add(Dense(32, activation='relu'))
model.add(Dense(1, activation='sigmoid'))
model.compile(optimizer=optimizers.adam(), loss=losses.binary_crossentropy, metrics=['accuracy'])
model.fit_generator(train_generator, steps_per_epoch=train_generator.samples/train_generator.batch_size, epochs=5, validation_data=validation_generator, validation_steps=validation_generator.samples/validation_generator.batch_size)
model.save('3d.h5')
Here is the error:
ValueError: Error when checking input: expected conv3d_1_input to have 5 dimensions, but got array with shape (10, 224, 224, 3)
And this is the output of model.summary()
Model: "sequential_1"
Layer (type) Output Shape Param #
conv3d_1 (Conv3D) (None, 10, 222, 222, 128) 3584
max_pooling3d_1 (MaxPooling3 (None, 10, 111, 111, 128) 0
conv3d_2 (Conv3D) (None, 9, 109, 109, 64) 147520
max_pooling3d_2 (MaxPooling3 (None, 9, 54, 54, 64) 0
conv3d_3 (Conv3D) (None, 8, 52, 52, 32) 36896
conv3d_4 (Conv3D) (None, 7, 50, 50, 32) 18464
max_pooling3d_3 (MaxPooling3 (None, 7, 25, 25, 32) 0
conv3d_5 (Conv3D) (None, 6, 23, 23, 16) 9232
conv3d_6 (Conv3D) (None, 5, 21, 21, 16) 4624
average_pooling3d_1 (Average (None, 2, 10, 10, 16) 0
batch_normalization_1 (Batch (None, 2, 10, 10, 16) 64
dense_1 (Dense) (None, 2, 10, 10, 32) 544
dense_2 (Dense) (None, 2, 10, 10, 1) 33
Total params: 220,961
Trainable params: 220,929
Non-trainable params: 32
I'd appreciate any help fixing the exception. By the way, I'm using TensorFlow as the backend, in case that helps to solve the problem.

As @thushv89 mentioned in the comments, Keras has no built-in video generator, which causes a lot of problems for anyone working with large video datasets. Therefore, I wrote a simple VideoDataGenerator that is almost as easy to use as ImageDataGenerator. The script can be found here on my GitHub in case someone needs it in the future.
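For context, the exception arises because ImageDataGenerator yields 4-D batches of single images (batch, height, width, channels), while Conv3D expects 5-D batches (batch, frames, height, width, channels). Below is a minimal sketch of the grouping step a video generator has to perform. It assumes the frames are stored in order, so that every 10 consecutive paths belong to one video. The file names and both helper functions are hypothetical and are not taken from the linked script.

```python
from typing import List, Sequence


def group_into_clips(frame_paths: Sequence[str], frames_per_clip: int = 10) -> List[List[str]]:
    """Split an ordered list of frame paths into fixed-length clips.

    Assumes consecutive paths belong to the same video; a trailing
    incomplete clip is dropped.
    """
    n_clips = len(frame_paths) // frames_per_clip
    return [
        list(frame_paths[i * frames_per_clip:(i + 1) * frames_per_clip])
        for i in range(n_clips)
    ]


def batch_shape(batch_size: int, frames: int, height: int, width: int, channels: int) -> tuple:
    """Shape a channels_last Conv3D model expects for one batch."""
    return (batch_size, frames, height, width, channels)


# Hypothetical file names: 3 videos x 10 frames each.
paths = [f"video{v:03d}_frame{f:02d}.jpg" for v in range(3) for f in range(10)]
clips = group_into_clips(paths)
print(len(clips), len(clips[0]))                 # number of clips, frames per clip
print(batch_shape(len(clips), 10, 224, 224, 3))  # 5 dimensions, as conv3d_1 requires
```

Each clip's 10 images would then be loaded and stacked into one array of shape (10, 224, 224, 3), and a batch of such clips adds the leading batch axis.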


How to merge 2 trained model in keras?

Good evening everyone,
I have 5 classes with 2000 images each. I built 2 models with different names; here is the model code:
model = tf.keras.Sequential([
tf.keras.layers.Conv2D(32, (3, 3), activation='relu',
input_shape=(150, 150, 3)),
tf.keras.layers.MaxPooling2D(2, 2),
tf.keras.layers.Conv2D(32, (3, 3), activation='relu'),
tf.keras.layers.MaxPooling2D(2, 2),
tf.keras.layers.Conv2D(64, (3, 3), activation='relu'),
tf.keras.layers.MaxPooling2D(2, 2),
tf.keras.layers.Conv2D(128, (3, 3), activation='relu'),
tf.keras.layers.MaxPooling2D(2, 2),
tf.keras.layers.Dense(5, activation=tf.nn.softmax)
], name="Model1")
loss='sparse_categorical_crossentropy', metrics=['accuracy'])
history =, train_labels,
batch_size=128, epochs=30, validation_split=0.2)'f3_1st_model_seg.h5')
model = tf.keras.Sequential([
tf.keras.layers.Conv2D(32, (3, 3), activation='relu',
input_shape=(150, 150, 3)),
tf.keras.layers.MaxPooling2D(2, 2),
tf.keras.layers.Conv2D(32, (3, 3), activation='relu'),
tf.keras.layers.MaxPooling2D(2, 2),
tf.keras.layers.Conv2D(64, (3, 3), activation='relu'),
tf.keras.layers.MaxPooling2D(2, 2),
tf.keras.layers.Conv2D(128, (3, 3), activation='relu'),
tf.keras.layers.MaxPooling2D(2, 2),
tf.keras.layers.Dense(5, activation=tf.nn.softmax)
], name="Model2")
loss='sparse_categorical_crossentropy', metrics=['accuracy'])
history =, train_labels,
batch_size=128, epochs=30, validation_split=0.2)'f3_2nd_model_seg.h5')
then I used this code to merge the 2 models
input_shape = [150, 150, 3]
model = keras.models.load_model('1st_model_seg.h5')
Layer (type) Output Shape Param #
conv2d (Conv2D) (None, 148, 148, 32) 896
max_pooling2d (MaxPooling2D (None, 74, 74, 32) 0
conv2d_1 (Conv2D) (None, 72, 72, 32) 9248
max_pooling2d_1 (MaxPooling (None, 36, 36, 32) 0
conv2d_2 (Conv2D) (None, 34, 34, 64) 18496
max_pooling2d_2 (MaxPooling (None, 17, 17, 64) 0
conv2d_3 (Conv2D) (None, 15, 15, 128) 73856
max_pooling2d_3 (MaxPooling (None, 7, 7, 128) 0
flatten (Flatten) (None, 6272) 0
dense (Dense) (None, 5) 31365
Total params: 133,861
Trainable params: 133,861
Non-trainable params: 0
model2 = keras.models.load_model('2nd_model_seg.h5')
Layer (type) Output Shape Param #
conv2d (Conv2D) (None, 148, 148, 32) 896
max_pooling2d (MaxPooling2D (None, 74, 74, 32) 0
conv2d_1 (Conv2D) (None, 72, 72, 32) 9248
max_pooling2d_1 (MaxPooling (None, 36, 36, 32) 0
conv2d_2 (Conv2D) (None, 34, 34, 64) 18496
max_pooling2d_2 (MaxPooling (None, 17, 17, 64) 0
conv2d_3 (Conv2D) (None, 15, 15, 128) 73856
max_pooling2d_3 (MaxPooling (None, 7, 7, 128) 0
flatten (Flatten) (None, 6272) 0
dense (Dense) (None, 5) 31365
Total params: 133,861
Trainable params: 133,861
Non-trainable params: 0
def concat_horizontal(models, input_shape):
    models_count = len(models)
    hidden = []
    input = tf.keras.layers.Input(shape=input_shape)
    for i in range(models_count):
        # apply each model to the shared input and collect its output
        hidden.append(models[i](input))
    output = tf.keras.layers.concatenate(hidden)
    model = tf.keras.Model(inputs=input, outputs=output)
    return model
new_model = concat_horizontal(
    [model, model2], (input_shape))
new_model.save('f1_1st_merged_seg.h5')
Layer (type) Output Shape Param # Connected to
input_1 (InputLayer) [(None, 150, 150, 3 0 []
model1 (Sequential) (None, 5) 133861 ['input_1[0][0]']
model2 (Sequential) (None, 5) 133861 ['input_1[0][0]']
concatenate (Concatenate) (None, 10) 0 ['model1[0][0]',
Total params: 267,722
Trainable params: 267,722
Non-trainable params: 0
When I tested the merged model, some images were assigned classes 7 and 9, although I have only 5 classes. This is my prediction code:
class_names = ['A', 'B', 'C', 'D', 'E']
for img in os.listdir(path):
    # predicting images
    img2 = tf.keras.preprocessing.image.load_img(
        os.path.join(path, img), target_size=(150, 150))
    x = tf.keras.preprocessing.image.img_to_array(img2)
    x = np.expand_dims(x, axis=0)
    images = np.vstack([x])
    classes = np.argmax(model.predict(images), axis=-1)
    y_out = class_names[classes[0]]
I got this error
y_out = class_names[classes[0]]
IndexError: list index out of range
In this case it could even have been done with the Sequential API. You are concatenating two output layers with 5 units each, which increases the number of outputs from 5 to 10. Instead, define the two models only up to the layer before the output (so that the Flatten layer is the last layer of each), then build the final model from the input layer, the two models, a concatenate layer, and a single output layer with five units and a softmax activation.
So remove the output layer
tf.keras.layers.Dense(5, activation=tf.nn.softmax)
from both models, and add it once, after the concatenate layer in the function you defined:
def concat_horizontal(models, input_shape):
    models_count = len(models)
    hidden = []
    input = tf.keras.layers.Input(shape=input_shape)
    for i in range(models_count):
        # apply each branch (now ending at Flatten) to the shared input
        hidden.append(models[i](input))
    output = tf.keras.layers.concatenate(hidden)
    # single 5-way output layer on top of the concatenated features
    output = tf.keras.layers.Dense(5, activation=tf.nn.softmax)(output)
    model = tf.keras.Model(inputs=input, outputs=output)
    return model
Note, however, that for cases like this it would be better to define the branch models with the functional API.
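To see why the original merged model produced classes 7 and 9: concatenating two 5-way softmax outputs gives a vector of 10 values, so argmax can return any index from 0 to 9. The sketch below uses made-up probabilities and plain Python lists. The averaging at the end is a different technique (a simple ensemble) from the shared Dense(5) layer suggested above, and is included only for comparison.

```python
class_names = ['A', 'B', 'C', 'D', 'E']

# Made-up softmax outputs of the two models for one image.
probs_model1 = [0.10, 0.20, 0.25, 0.30, 0.15]
probs_model2 = [0.05, 0.05, 0.70, 0.10, 0.10]


def argmax(values):
    """Index of the largest value (first one on ties)."""
    return max(range(len(values)), key=lambda i: values[i])


# What the Concatenate layer outputs: 10 values, not 5.
concatenated = probs_model1 + probs_model2
idx = argmax(concatenated)
print(len(concatenated), idx)  # idx can exceed len(class_names) - 1

# Indices 5..9 come from the second model; taking the index modulo 5
# recovers the class that model voted for.
print(class_names[idx % len(class_names)])

# Alternative (not the answer's method): average the two outputs so the
# result is still a 5-way distribution.
averaged = [(a + b) / 2 for a, b in zip(probs_model1, probs_model2)]
print(class_names[argmax(averaged)])
```

With these numbers the concatenated argmax lands on an index outside the range of class_names, which is exactly the situation that raises the IndexError in the question.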

conv-autoencoder that val_loss doesn't decrease

I built an anomaly detection model using a convolutional autoencoder on the UCSD_ped2 dataset. What puzzles me is that after very few epochs the val_loss stops decreasing. It seems that the model can't learn any further. I have done some research to improve my model, but it isn't getting better. What should I do to fix it?
Here's my model's structure:
input_img = Input(shape = (x, y, inChannel))
bn1= BatchNormalization(axis=-1, momentum=0.99, epsilon=0.001)(input_img)
conv1 = Conv2D(256, (11, 11), strides=(4,4),activation='relu',
pool1 = MaxPooling2D(pool_size=(2, 2))(conv1)
bn2= BatchNormalization(axis=-1, momentum=0.99, epsilon=0.001)(pool1)
conv2 = Conv2D(128, (5, 5),activation='relu',
pool2 = MaxPooling2D(pool_size=(2, 2))(conv2)
bn3= BatchNormalization(axis=-1, momentum=0.99, epsilon=0.001)(pool2)
conv3 = Conv2D(64, (3, 3), activation='relu',
ubn3=BatchNormalization(axis=-1, momentum=0.99, epsilon=0.001)(conv3)
uconv3=Conv2DTranspose(128, (3,3),activation='relu',
upool3=UpSampling2D(size=(2, 2))(uconv3)
ubn2=BatchNormalization(axis=-1, momentum=0.99, epsilon=0.001)(upool3)
uconv2=Conv2DTranspose(256, (3, 3),activation='relu',
upool2=UpSampling2D(size=(2, 2))(uconv2)
ubn1=BatchNormalization(axis=-1, momentum=0.99, epsilon=0.001)(upool2)
decoded = Conv2DTranspose(1, (11, 11), strides=(4, 4),
activation='sigmoid', padding='same')(ubn1)
autoencoder = Model(input_img, decoded)
autoencoder.compile(loss = 'mean_squared_error', optimizer ='Adadelta',metrics=['accuracy']), Y_train,validation_split=0.3,
batch_size = batch_size, epochs = epochs, verbose = 0,
Layer (type) Output Shape Param #
input_1 (InputLayer) (None, 144, 224, 1) 0
batch_normalization_1 (Batch (None, 144, 224, 1) 4
conv2d_1 (Conv2D) (None, 36, 56, 256) 31232
max_pooling2d_1 (MaxPooling2 (None, 18, 28, 256) 0
batch_normalization_2 (Batch (None, 18, 28, 256) 1024
conv2d_2 (Conv2D) (None, 18, 28, 128) 819328
max_pooling2d_2 (MaxPooling2 (None, 9, 14, 128) 0
batch_normalization_3 (Batch (None, 9, 14, 128) 512
conv2d_3 (Conv2D) (None, 9, 14, 64) 73792
batch_normalization_4 (Batch (None, 9, 14, 64) 256
conv2d_transpose_1 (Conv2DTr (None, 9, 14, 128) 73856
up_sampling2d_1 (UpSampling2 (None, 18, 28, 128) 0
batch_normalization_5 (Batch (None, 18, 28, 128) 512
conv2d_transpose_2 (Conv2DTr (None, 18, 28, 256) 295168
up_sampling2d_2 (UpSampling2 (None, 36, 56, 256) 0
batch_normalization_6 (Batch (None, 36, 56, 256) 1024
conv2d_transpose_3 (Conv2DTr (None, 144, 224, 1) 30977
Total params: 1,327,685
Trainable params: 1,326,019
Non-trainable params: 1,666
The batch size is 30 and the number of epochs is 100. The training data has 1785 images; the validation data has 765 images.
I have tried:
deleting kernel_regularizer;
adding ReduceLROnPlateau,
but it only gave a small improvement.
Epoch 00043: ReduceLROnPlateau reducing learning rate to 9.99999874573554e-12.
Epoch 00044: val_loss did not improve from 0.00240
Epoch 00045: val_loss did not improve from 0.00240
Once the val_loss reached 0.00240, it stopped decreasing...
The following figure shows the loss per epoch.
The following figure shows the model's reconstruction results, which are truly poor. How can I make my model work better?
Based on your screenshot, it seems that this is not an issue of overfitting or underfitting.
As I understand it:
Underfitting – Validation and training error high
Overfitting – Validation error is high, training error low
Good fit – Validation error low, slightly higher than the training error
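The three cases above can be turned into a rough rule of thumb. The thresholds below (high_loss and gap) are arbitrary assumptions chosen for illustration, not values from Keras or from the question, and the example losses are made up apart from the 0.0024 validation loss reported in the question.

```python
def diagnose_fit(train_loss: float, val_loss: float,
                 high_loss: float = 0.1, gap: float = 0.05) -> str:
    """Rough classification of a training run from its final losses.

    high_loss: loss above which the error counts as "high" (assumed value).
    gap: how much larger val_loss may be than train_loss before the run
         counts as overfitting (assumed value).
    """
    if train_loss > high_loss and val_loss > high_loss:
        return "underfitting"
    if val_loss - train_loss > gap:
        return "overfitting"
    return "good fit"


print(diagnose_fit(0.50, 0.55))      # both errors high
print(diagnose_fit(0.01, 0.30))      # low training error, high validation error
print(diagnose_fit(0.0020, 0.0024))  # validation slightly above training
```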
Generally speaking, the dataset should be split properly for training and validation.
Typically the training set should be 4 times the size of the validation set (an 80/20 split).
My suggestion is to increase the size of your dataset through data augmentation and continue training.
Kindly refer to the documentation for data augmentation.
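As a quick check of the split sizes: the 1785 training and 765 validation images in the question correspond to a 70/30 split of 2550 images. The sketch below compares that with the 80/20 split, assuming the validation part is rounded down; split_sizes is a hypothetical helper, not a Keras function.

```python
def split_sizes(total: int, val_percent: int) -> tuple:
    """Return (n_train, n_val) for a given validation percentage.

    Integer arithmetic is used so the result does not depend on
    floating-point rounding; the validation part is rounded down.
    """
    n_val = total * val_percent // 100
    return total - n_val, n_val


total_images = 1785 + 765             # figures from the question
print(split_sizes(total_images, 30))  # the split used in the question
print(split_sizes(total_images, 20))  # the suggested 80/20 split
```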

How to fix the output shape in Keras 2.1.0

I get a dense layer shape error with Keras Version 2.1.0. This problem only happens with this version of Keras (2.1.0). I am in no position to upgrade the version since it's on a cluster so I am trying to find a fix for the time being. My model is defined as below.
model = Sequential()
model.add(Conv2D(32, kernel_size=(3, 3),
input_shape=(32, 32, 3)))
model.add(Conv2D(64, (3, 3), activation='relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Dense(128, activation='relu'))
model.add(Dense(10, activation='softmax'))
I have done one hot encoding as shown below.
y_train = keras.utils.to_categorical(y_train, 10)
y_test = keras.utils.to_categorical(y_test, 10)
The model summary is
Layer (type) Output Shape Param #
conv2d_1 (Conv2D) (None, 30, 30, 32) 896
conv2d_2 (Conv2D) (None, 28, 28, 64) 18496
max_pooling2d_1 (MaxPooling2 (None, 14, 14, 64) 0
dropout_1 (Dropout) (None, 14, 14, 64) 0
flatten_1 (Flatten) (None, 12544) 0
dense_1 (Dense) (None, 128) 1605760
dropout_2 (Dropout) (None, 128) 0
dense_2 (Dense) (None, 10) 1290
Total params: 1,626,442
Trainable params: 1,626,442
Non-trainable params: 0
The error I get is :
ValueError: Error when checking target: expected dense_2 to have 2
dimensions, but got array with shape (50000, 1, 10)
The exact same code works perfectly in Keras 2.2.4
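The target shape (50000, 1, 10) suggests that the labels had shape (50000, 1) before encoding, and that to_categorical in this Keras version appends the class axis without dropping the singleton dimension, whereas later versions drop it. Under that assumption, flattening the labels before encoding should give the expected (50000, 10). The pure-Python sketch below mimics that flatten-then-encode step on a tiny example; one_hot is a stand-in written for illustration, not the Keras function.

```python
def flatten_labels(labels):
    """Turn column-shaped labels [[3], [0], ...] into a flat list [3, 0, ...]."""
    return [row[0] if isinstance(row, (list, tuple)) else row for row in labels]


def one_hot(labels, num_classes):
    """One-hot encode a flat list of integer labels (stand-in for to_categorical)."""
    encoded = []
    for label in labels:
        row = [0.0] * num_classes
        row[label] = 1.0
        encoded.append(row)
    return encoded


y_column = [[3], [0], [9]]         # shape (3, 1), the layout assumed above
y_flat = flatten_labels(y_column)  # shape (3,)
y_encoded = one_hot(y_flat, 10)    # shape (3, 10): two dimensions, as dense_2 expects
print(len(y_encoded), len(y_encoded[0]))
```

If y_train is a NumPy array, the corresponding change in the question's code would be keras.utils.to_categorical(y_train.reshape(-1), 10), and likewise for y_test.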

maxpooling results not displaying in model.summary() output

I am a beginner in Keras. I am trying to build a model using the Sequential API. When I try to reduce the input size from 28 to 14 or less with max pooling, the max-pooling layer doesn't show up in the output of model.summary(). I am trying to achieve an accuracy of 0.99 or above after training, i.e. the accuracy reported by model.evaluate() should be 0.99 or above. The model I have built so far can be seen here:
from keras.layers import Activation, MaxPooling2D
model = Sequential()
model.add(Convolution2D(32, 3, 3, activation='relu', input_shape=(28,28,1)))
model.add(Convolution2D(32, 1, activation='relu'))
MaxPooling2D(pool_size=(2, 2))
model.add(Convolution2D(32, 26))
model.add(Convolution2D(10, 1))
Output -
Layer (type) Output Shape Param #
conv2d_29 (Conv2D) (None, 26, 26, 32) 320
conv2d_30 (Conv2D) (None, 26, 26, 32) 1056
conv2d_31 (Conv2D) (None, 1, 1, 32) 692256
conv2d_32 (Conv2D) (None, 1, 1, 10) 330
flatten_7 (Flatten) (None, 10) 0
activation_7 (Activation) (None, 10) 0
Total params: 693,962
Trainable params: 693,962
Non-trainable params: 0
The batch size I am using is 32 and the number of epochs is 10.
metrics=['accuracy']), Y_train, batch_size=32, nb_epoch=10, verbose=1)
score = model.evaluate(X_test, Y_test, verbose=0)
Output after training -
[0.09016687796734459, 0.9814]
You are not adding the MaxPooling2D layer to your model...
model.add(MaxPooling2D(pool_size=(2, 2)))
Also, the output of your max-pooling layer will have shape (None, 13, 13, 32), so the convolutional kernel in the next layer (26 in your case) can't be larger than the current spatial dimensions (13). Your code should be something like this:
from keras.layers import Activation, MaxPooling2D, Dense
model = Sequential()
model.add(Convolution2D(32, 3, 3, activation='relu', input_shape=(28,28,1)))
model.add(Convolution2D(32, 1, activation='relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Convolution2D(32, 8))
model.add(Convolution2D(10, 6))
Layer (type) Output Shape Param #
conv2d_1 (Conv2D) (None, 26, 26, 32) 320
conv2d_2 (Conv2D) (None, 26, 26, 32) 1056
max_pooling2d_1 (MaxPooling2 (None, 13, 13, 32) 0
conv2d_3 (Conv2D) (None, 6, 6, 32) 65568
conv2d_4 (Conv2D) (None, 1, 1, 10) 11530
flatten_1 (Flatten) (None, 10) 0
activation_1 (Activation) (None, 10) 0
Total params: 78,474
Trainable params: 78,474
Non-trainable params: 0
P.S.: I would consider using smaller kernel sizes and a fully connected (FC) layer at the output, as that is more practical in most cases than trying to match convolution output shapes.
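The shapes in the summary above can be checked by hand. The sketch below assumes 'valid' padding with stride 1 for the convolutions and a pooling stride equal to the pool size, which is what the summary implies; conv_out and pool_out are helper names introduced here.

```python
def conv_out(size: int, kernel: int, stride: int = 1) -> int:
    """Output length of a 'valid' (no padding) convolution along one axis."""
    return (size - kernel) // stride + 1


def pool_out(size: int, pool: int) -> int:
    """Output length of max pooling with stride equal to the pool size."""
    return size // pool


size = 28
trace = [size]
for kind, k in [("conv", 3), ("conv", 1), ("pool", 2), ("conv", 8), ("conv", 6)]:
    size = conv_out(size, k) if kind == "conv" else pool_out(size, k)
    trace.append(size)
print(trace)  # spatial size after each layer

# A 26x26 kernel on the 13x13 pooled map would give a non-positive size:
print(conv_out(13, 26))
```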

Keras (Tensorflow) Reshape Layer input error

I have a reshape input error and I don't know why.
The requested shape is 1058400, which is (1, 21168) multiplied by the batch size of 50.
What I do not understand is the apparent input size of 677376.
I don't know where this value is coming from. The layer before the reshape is a Flatten layer, and I use its shape directly when I define the target shape of the Reshape layer.
The model compiles just fine, and I use TensorFlow as the backend, so it is defined before runtime. But the error appears only when I feed data into it.
import numpy as np
import tensorflow as tf
import keras.backend as K
from keras import Model
from keras.layers import LSTM, Conv2D, Dense, Flatten, Input, Reshape
from keras.optimizers import Adam
config = tf.ConfigProto(allow_soft_placement=True)
sess = tf.Session(config=config)
input = Input(batch_shape=(50, 230, 230, 1))
conv1 = Conv2D(
filters=12, kernel_size=(7, 7), strides=(1, 1), padding="valid", activation="relu"
conv2 = Conv2D(
filters=24, kernel_size=(5, 5), strides=(1, 1), padding="valid", activation="relu"
conv3 = Conv2D(
filters=48, kernel_size=(3, 3), strides=(2, 2), padding="valid", activation="relu"
conv4 = Conv2D(
filters=48, kernel_size=(5, 5), strides=(5, 5), padding="valid", activation="relu"
conv_out = Flatten()(conv4)
conv_out = Reshape(target_shape=(1, int(conv_out.shape[1])))(conv_out)
conv_out = Dense(128, activation="relu")(conv_out)
rnn_1 = LSTM(128, stateful=True, return_sequences=True)(conv_out)
rnn_2 = LSTM(128, stateful=True, return_sequences=True)(rnn_1)
rnn_3 = LSTM(128, stateful=True, return_sequences=False)(rnn_2)
value = Dense(1, activation="linear")(rnn_3)
policy = Dense(5, activation="softmax")(rnn_3)
model = Model(inputs=input, outputs=[value, policy])
adam = Adam(lr=0.001)
model.compile(loss="mse", optimizer=adam)
out = model.predict(np.random.randint(1, 5, size=(50, 230, 230, 1)))
Layer (type) Output Shape Param # Connected to
input_1 (InputLayer) (50, 230, 230, 1) 0
conv2d (Conv2D) (50, 224, 224, 12) 600 input_1[0][0]
conv2d_1 (Conv2D) (50, 220, 220, 24) 7224 conv2d[0][0]
conv2d_2 (Conv2D) (50, 109, 109, 48) 10416 conv2d_1[0][0]
conv2d_3 (Conv2D) (50, 21, 21, 48) 57648 conv2d_2[0][0]
flatten (Flatten) (50, 21168) 0 conv2d_3[0][0]
reshape (Reshape) (50, 1, 21168) 0 flatten[0][0]
dense (Dense) (50, 1, 128) 2709632 reshape[0][0]
lstm (LSTM) (50, 1, 128) 131584 dense[0][0]
lstm_1 (LSTM) (50, 1, 128) 131584 lstm[0][0]
lstm_2 (LSTM) (50, 128) 131584 lstm_1[0][0]
dense_1 (Dense) (50, 1) 129 lstm_2[0][0]
dense_2 (Dense) (50, 5) 645 lstm_2[0][0]
Total params: 3,181,046
Trainable params: 3,181,046
Non-trainable params: 0
Error for the above code:
Traceback (most recent call last):
File "", line 45, in <module>
out = model.predict(np.random.randint(1, 5, size=(50, 230, 230, 1)))
File "/home/vyz/.conda/envs/stackoverflow/lib/python3.6/site-packages/keras/engine/", line 1157, in predict
'Batch size: ' + str(batch_size) + '.')
ValueError: In a stateful network, you should only pass inputs with a number of samples that can be divided by the batch size. Found: 50 samples. Batch size: 32.
Edit to the question
Important: I have edited your question so that the code actually runs and reproduces your problem. Input should take batch_shape, as it currently does. Next time please make sure your code works; it will make things easier.
The solution is quite simple: the batch passed to the network has the wrong size.
677376 / 21168 = 32, which is the default batch size used by predict. You are supposed to specify it if it's different (50 in your case), like this:
out = model.predict(np.random.randint(1, 5, size=(50, 230, 230, 1)), batch_size=50)
Everything should work fine now. Remember to specify the batch size if you want it hardcoded.
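The numbers in the question can be reproduced with a few lines of arithmetic. The sketch below assumes 'valid' padding, as in the model definition; conv_out is a helper name introduced here.

```python
def conv_out(size: int, kernel: int, stride: int) -> int:
    """Output length of a 'valid' convolution along one axis."""
    return (size - kernel) // stride + 1


# Spatial size through the four Conv2D layers, given as (kernel, stride).
size = 230
for kernel, stride in [(7, 1), (5, 1), (3, 2), (5, 5)]:
    size = conv_out(size, kernel, stride)

features = size * size * 48   # flattened features per sample (48 filters in conv4)
print(size, features)

print(50 * features)          # elements the Reshape layer was built for
print(677376 // features)     # samples actually received per batch
```

The last value is the number of samples predict fed per batch, which differs from the 50 fixed in batch_shape.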