I wanted to merge two sequential models into one using a Merge layer, but it shows me an error. I am working with RGB images of size 128x128, and the batch size is 32.
The error is:
ValueError: The model expects 3 input arrays, but only received one array. Found: array with shape (32, 3, 128, 128)
The model is defined as:
model = Sequential()
leftBranch = Sequential()
leftBranch.add(Reshape((3,128,128), input_shape=(3, img_width, img_height)))
leftBranch.add(Convolution2D(14, 3, 1, activation='relu'))
leftBranch.add(ZeroPadding2D((1, 1)))
leftBranch.add(Flatten())
rightBranch = Sequential()
rightBranch.add(Reshape((3,128,128), input_shape=(3, img_width, img_height)))
rightBranch.add(Convolution2D(14, 1, 3, activation='relu'))
rightBranch.add(MaxPooling2D((2, 2), strides=(2, 2)))
rightBranch.add(Flatten())
centralBranch = Sequential()
centralBranch.add(Reshape((3,128,128), input_shape=(3, img_width, img_height)))
centralBranch.add(Convolution2D(14, 5, 5, activation='relu'))
centralBranch.add(MaxPooling2D((2, 2), strides=(2, 2)))
centralBranch.add(Flatten())
merged = Merge([leftBranch, centralBranch, rightBranch], mode='concat')
model = Sequential()
model.add(merged)
model.add(Dense(64))
model.add(Activation('relu'))
model.add(Dropout(0.5))
model.add(Dense(1))
model.add(Activation('sigmoid'))
So, what is the proper way to concatenate two sequential models with convolutional layers? I just want to merge the outputs of the convolutional layers, as I did here.
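For reference, with this legacy Merge API the error usually means that the merged model must be fed one input array per branch at training time, even when every branch reads the same images. A minimal sketch, assuming hypothetical `X_train`/`y_train` arrays:

# The merged model has one input per branch (three here), so the same
# image array is passed three times; X_train and y_train stand in for
# the real data and are not from the original question.
model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])
model.fit([X_train, X_train, X_train], y_train, batch_size=32, nb_epoch=10)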
While trying to learn about Recurrent Neural Networks (RNNs), I am trying to train an automatic lip-reading model using a 3D CNN + LSTM. I tried out code I found for this on Kaggle.
model = Sequential()
# 1st layer group
model.add(Conv3D(32, (3, 3, 3), strides = 1, input_shape=(22, 100, 100, 1), activation='relu', padding='valid'))
model.add(MaxPooling3D(pool_size=(2, 2, 2), strides=2))
model.add(Conv3D(64, (3, 3, 3), activation='relu', strides=1))
model.add(MaxPooling3D(pool_size=(2, 2, 2), strides=2))
model.add(Conv3D(128, (3, 3, 3), activation='relu', strides=1))
model.add(MaxPooling3D(pool_size=(2, 2, 2), strides=2))
shape = model.get_output_shape_at(0)
model.add(Reshape((shape[-1],shape[1]*shape[2]*shape[3])))
# LSTMS - Recurrent Network Layer
model.add(LSTM(32, return_sequences=True))
model.add(Dropout(.5))
model.add((Flatten()))
# # FC layers group
model.add(Dense(2048, activation='relu'))
model.add(Dropout(.5))
model.add(Dense(1024, activation='relu'))
model.add(Dropout(.5))
model.add(Dense(10, activation='softmax'))
model.compile(loss='categorical_crossentropy', optimizer='Adam', metrics=['accuracy'])
model.summary()
However, it returns the following error:
11 model.add(MaxPooling3D(pool_size=(2, 2, 2), strides=2))
12
---> 13 shape = model.get_output_shape_at(0)
14 model.add(Reshape((shape[-1],shape[1]*shape[2]*shape[3])))
15
RuntimeError: The layer sequential_2 has never been called and thus has no defined output shape.
From my understanding, the author of the code was trying to get the output shape of the last pooling layer and reshape it so it can be forwarded to the LSTM layer.
I found a similar post, following which I made the change below, and the error was fixed:
shape = model.layers[-1].output_shape
# shape = model.get_output_shape_at(0)
Still, I am confused as to how the code forwards the output of the CNN layers to the LSTM layer. Any help in understanding the above is appreciated. Thank you!
As the code runs from top to bottom, the inputs flow through the graph in the same order. You are getting this error because `get_output_shape_at(0)` cannot be called before the model has been built by running data through it; since TensorFlow 2.0 runs in eager mode by default, that graph node does not exist yet. Once you have fit the model (even for a single epoch) you can use `model.get_output_shape_at(0)`; until then, use `model.layers[-1].output_shape` instead.
The CNN layers extract features locally, and the LSTM then learns from them sequentially. The `Reshape` line is what connects the two: it turns the pooled output of shape `(depth, height, width, channels)` into `(channels, depth*height*width)`, so the LSTM sees the 128 channel maps as 128 timesteps, each a flattened spatial vector. Using Conv with LSTM is a good approach, but I would recommend using tf.keras.layers.ConvLSTM3D directly. Check it here: https://www.tensorflow.org/api_docs/python/tf/keras/layers/ConvLSTM3D
Here is your code with the fix applied:
tf.keras.backend.clear_session()
model = Sequential()
# 1st layer group
model.add(Conv3D(32, (3, 3, 3), strides = 1, input_shape=(22, 100, 100, 1), activation='relu', padding='valid'))
model.add(MaxPooling3D(pool_size=(2, 2, 2), strides=2))
model.add(Conv3D(64, (3, 3, 3), activation='relu', strides=1))
model.add(MaxPooling3D(pool_size=(2, 2, 2), strides=2))
model.add(Conv3D(128, (3, 3, 3), activation='relu', strides=1))
model.add(MaxPooling3D(pool_size=(2, 2, 2), strides=2))
shape = model.layers[-1].output_shape
model.add(Reshape((shape[-1],shape[1]*shape[2]*shape[3])))
# LSTMS - Recurrent Network Layer
model.add(LSTM(32, return_sequences=True))
model.add(Dropout(.5))
model.add((Flatten()))
# # FC layers group
model.add(Dense(2048, activation='relu'))
model.add(Dropout(.5))
model.add(Dense(1024, activation='relu'))
model.add(Dropout(.5))
model.add(Dense(10, activation='softmax'))
model.compile(loss='categorical_crossentropy', optimizer='Adam', metrics=['accuracy'])
model.summary()
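If you want to try the recommended ConvLSTM3D instead, here is a minimal, illustrative sketch. The shapes are made up for the example, not taken from the lip-reading data; ConvLSTM3D expects 6D input of shape (batch, time, depth, height, width, channels):

import tensorflow as tf

# Illustrative shapes only: 2 samples, 5 timesteps, 8x8x8 volumes, 1 channel.
x = tf.random.normal((2, 5, 8, 8, 8, 1))
layer = tf.keras.layers.ConvLSTM3D(filters=4, kernel_size=(3, 3, 3), padding='same')
y = layer(x)
print(y.shape)  # (2, 8, 8, 8, 4) with the default return_sequences=False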
I am implementing my first CNN in TensorFlow and I am having trouble adding the dense layer to my CNN model. Here is the code:
batch_size = 4
sample_shape = (batch_size, 24, 30, 30, 5)
model = models.Sequential()
model.add(layers.Conv3D(96, kernel_size=(4, 4, 4), activation='relu', padding='same', input_shape=sample_shape))
model.add(layers.Conv3D(64, kernel_size=(3, 3, 3), activation='relu', padding='same'))
model.add(layers.Conv3D(64, kernel_size=(1, 1, 5), activation='relu', padding='same'))
model.add(layers.Flatten())
model.add(layers.Dense(256, activation='relu'))
model.summary()
I am getting the following output, and later my program crashes. What needs so much memory? It seems to be the dense layer, but I can't explain why.
2021-10-20 19:03:53.219849: W tensorflow/core/framework/cpu_allocator_impl.cc:80] Allocation of 5662310400 exceeds 10% of free system memory.
Your flattened feature vector has 4 * 24 * 30 * 30 * 64 = 5,529,600 entries (the leading 4 is there because sample_shape mistakenly includes the batch size, which input_shape should not), so the weight matrix of the final Dense(256) layer holds roughly
4 * (24 * 30 * 30 * 64) * 256 = 1,415,577,600
parameters. At 4 bytes per float32 weight, that is exactly the 5,662,310,400-byte allocation in the warning. That's an insane number of parameters. Use MaxPooling3D between convolutional layers, or GlobalAveragePooling3D instead of Flatten, to reduce the number of parameters.
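A minimal sketch of that suggestion (my own illustration, not the answerer's code; note that input_shape here excludes the batch dimension):

from tensorflow.keras import layers, models

model = models.Sequential()
# input_shape excludes the batch dimension
model.add(layers.Conv3D(96, kernel_size=(4, 4, 4), activation='relu', padding='same', input_shape=(24, 30, 30, 5)))
model.add(layers.MaxPooling3D(pool_size=(2, 2, 2)))  # halves each spatial dimension
model.add(layers.Conv3D(64, kernel_size=(3, 3, 3), activation='relu', padding='same'))
model.add(layers.GlobalAveragePooling3D())  # 64 features instead of millions
model.add(layers.Dense(256, activation='relu'))
model.summary()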
I have built the following object recognition model for my school assignment to predict classes from the CIFAR-10 dataset. The assignment requires that I use the VALID padding for all convolution and pooling layers.
def _build_cifar10_model(num_C1_channels=50, num_C2_channels=60, use_dropout=False):
    model = Sequential()

    # reshape 1D array of length 3072
    # to a matrix of shape 32x32x3
    model.add(Input(shape=(3072,)))
    model.add(Reshape(target_shape=(32, 32, 3), input_shape=(3072,)))

    # 24x24x3
    model.add(Conv2D(filters=num_C1_channels, kernel_size=(9, 9), padding='valid', activation='relu'))
    # 12x12x3
    model.add(MaxPooling2D(pool_size=(2, 2), strides=(2, 2), padding='valid'))

    # 8x8x3
    model.add(Conv2D(filters=num_C2_channels, kernel_size=(5, 5), padding='valid', activation='relu'))
    # 4x4x3
    model.add(MaxPooling2D(pool_size=(2, 2), strides=(2, 2), padding='valid'))

    model.add(Flatten())
    model.add(Dense(units=300))
    if use_dropout:
        model.add(Dropout(rate=0.5))

    model.add(Dense(units=10, activation='softmax'))
    if use_dropout:
        model.add(Dropout(rate=0.5))

    return model
However, building this model is throwing the following error:
InvalidArgumentError: Negative dimension size caused by subtracting 1 from 0 for '{{node max_pooling2d_15/MaxPool}} = MaxPool[T=DT_FLOAT, data_format="NHWC", ksize=[1, 2, 2, 1], padding="VALID", strides=[1, 2, 2, 1]](conv2d_23/Relu)' with input shapes: [?,24,24,0].
I figured that this is because, with VALID padding, a (32, 32, 3) image becomes (24, 24, 0) after the first Conv2D layer with kernel_size=9. I am not sure why the RGB channels are completely lost.
Is there a way to go around this issue while maintaining the VALID padding?
Sorry in advance as this is the first time I am building such a model.
Use InputLayer instead of Input at the start of your model; InputLayer adds an actual first layer to the model, whereas Input returns a symbolic tensor.
Replace the line with this:
model.add(InputLayer(input_shape=(3072,)))
The rest of the code executes for me and gives a 4x4 feature map after the final pooling layer, which you then feed into your dense and dropout layers.
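For reference, once InputLayer is used, the VALID-padding shape arithmetic works out cleanly; the channel counts come from the filters arguments (50, then 60), not from the RGB channels:

# conv 9x9 VALID: 32 - 9 + 1 = 24  -> (24, 24, 50)
# pool 2x2 /2:    24 // 2    = 12  -> (12, 12, 50)
# conv 5x5 VALID: 12 - 5 + 1 = 8   -> (8, 8, 60)
# pool 2x2 /2:    8 // 2     = 4   -> (4, 4, 60)
model = _build_cifar10_model()
model.summary()  # should report the shapes above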
I have built and trained a CNN, and I want to get the weights of the first dense layer as a numpy array. After I trained the model, I loaded it using this code:
f = Path("model_structure.json")
model_structure = f.read_text()
model_wieghts = model_from_json(model_structure)
model_wieghts.load_weights("model_weights.h5")
In order to get the weights of the first dense layer, I used:
wieghts_tf = model_wieghts.layers[9].output
wieghts_tf has this value:
<tf.Tensor 'dense_1/Relu:0' shape=(?, 496) dtype=float32>
The question is, I want to convert wieghts_tf from a tensor to a numpy array, so I created a session and used the eval() function to do so, as shown below:
sess = tf.Session()
with sess.as_default():
    vector = wieghts_tf.eval()
But I'm getting this error:
InvalidArgumentError: You must feed a value for placeholder tensor 'conv2d_1_input' with dtype float and shape [?,180,180,3]
How can I solve it?
Here is the code of the CNN model:
# creating neural network
model = Sequential()
conv1_2d = model.add(Conv2D(180, (3, 3), padding='same', input_shape=(180, 180, 3), activation="relu")) #180 is the number of filters
conv2_2d = model.add(Conv2D(180, (3, 3), activation="relu"))
max_pool1 = model.add(MaxPooling2D(pool_size=(3, 3)))
drop_1 = model.add(Dropout(0.25))
conv3_2d =model.add(Conv2D(360, (3, 3), padding='same', activation="relu"))
conv4_2d =model.add(Conv2D(360, (3, 3), activation="relu"))
max_pool2 = model.add(MaxPooling2D(pool_size=(3, 3)))
drop_2 = model.add(Dropout(0.25))
flat = model.add(Flatten())
dense_1 = model.add(Dense(496, activation="relu"))
drop_3 = model.add(Dropout(0.5))
dense_2 = dense_layer = model.add(Dense(376, activation="softmax"))
model.compile(
    loss='categorical_crossentropy',
    optimizer='adam',
    metrics=['accuracy']
)
model.fit(
    train_data,
    train_label,
    batch_size=32,
    epochs=40,
    verbose=2,
    validation_split=0.1,
    shuffle=True)
# Save neural network structure
model_structure = model.to_json()
f = Path("model_structure.json")
f.write_text(model_structure)
# Save neural network's trained weights
model.save_weights("model_weights.h5")
Found the solution:
x = np.frombuffer(layer.convolution.weights.float16Value, dtype=np.float16)
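For a plain Keras model like the one above, there is also a simpler route (my own note, not part of the original answer): `layers[9].output` is a symbolic tensor describing the layer's activations, which is why evaluating it demands an input feed, whereas `get_weights()` returns the trained parameters directly as numpy arrays:

# layers[9] is the Dense(496) layer; get_weights() returns [kernel, bias]
# as numpy arrays, with no session or placeholder feed required.
kernel, bias = model_wieghts.layers[9].get_weights()
print(type(kernel), kernel.shape, bias.shape)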
I want to build an end-to-end trainable model with the following properties:
CNN to extract features from image
The features are reshaped into a matrix
Each row of this matrix is then fed to LSTM1
Each column of this matrix is then fed to LSTM2
The output of LSTM1 and LSTM2 are concatenated for the final output
(it's more or less similar to Figure 2 in this paper: https://arxiv.org/pdf/1611.07890.pdf)
My problem now is: after the reshape, how can I feed the values of the feature matrix to an LSTM with Keras or TensorFlow?
This is my code so far with the VGG16 net (there is also a link to a Keras issue):
# VGG16
model = Sequential()
model.add(Conv2D(64, (3, 3), activation='relu', padding='same', input_shape=(224, 224, 3)))
model.add(Conv2D(64, (3, 3), activation='relu'))
model.add(MaxPooling2D((2, 2)))
# block 2
model.add(Conv2D(128, (3, 3), activation='relu', padding='same'))
model.add(Conv2D(128, (3, 3), activation='relu'))
model.add(MaxPooling2D((2, 2)))
# block 3
model.add(Conv2D(256, (3, 3), activation='relu', padding='same'))
model.add(Conv2D(256, (3, 3), activation='relu'))
model.add(Conv2D(256, (3, 3), activation='relu'))
model.add(MaxPooling2D((2, 2)))
# block 4
model.add(Conv2D(512, (3, 3), activation='relu', padding='same'))
model.add(Conv2D(512, (3, 3), activation='relu'))
model.add(Conv2D(512, (3, 3), activation='relu'))
model.add(MaxPooling2D((2, 2)))
# block 5
model.add(Conv2D(512, (3, 3), activation='relu', padding='same'))
model.add(Conv2D(512, (3, 3), activation='relu'))
model.add(Conv2D(512, (3, 3), activation='relu'))
model.add(MaxPooling2D((2, 2)))
# block 6
model.add(Flatten())
model.add(Dense(4096, activation='relu'))
model.add(Dense(4096, activation='relu'))
# reshape the feature 4096 = 64 * 64
model.add(Reshape((64, 64)))
# How to feed each row of this to LSTM?
# This is my first solution but it doesn’t look correct:
# model.add(LSTM(256, input_shape=(64, 1))) # 256 hidden units, sequence length = 64, feature dim = 1
Consider building your CNN model with Conv2D and MaxPooling2D layers until you reach your Flatten layer, because the vectorized output from the Flatten layer will be your input data to the LSTM part of your structure.
So, build your CNN model like this:
model_cnn = Sequential()
model_cnn.add(Conv2D...)
model_cnn.add(MaxPooling2D...)
...
model_cnn.add(Flatten())
Now, here is an interesting point: the current version of Keras has some incompatibilities with certain TensorFlow structures that will not let you stack all of your layers in a single Sequential object.
So it's time to use the Keras Model object to complete your neural network, with a trick:
input_lay = Input(shape=(None, ?, ?, ?)) #dimensions of your data
time_distribute = TimeDistributed(Lambda(lambda x: model_cnn(x)))(input_lay) # keras.layers.Lambda is essential to make our trick work :)
lstm_lay = LSTM(?)(time_distribute)
output_lay = Dense(?, activation='?')(lstm_lay)
And finally, it's time to put our two separate models together:
model = Model(inputs=[input_lay], outputs=[output_lay])
model.compile(...)
Note: you can substitute your VGG (without its top layers) for my model_cnn example, since the vectorized output from the VGG Flatten layer will be the input of the LSTM model.
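A minimal runnable sketch of this trick with the placeholders filled in; the 64x64 input size, filter counts, unit counts, and class count are illustrative assumptions, not values from the question:

from tensorflow.keras.layers import (Input, TimeDistributed, Lambda, LSTM,
                                     Dense, Conv2D, MaxPooling2D, Flatten,
                                     Permute, Concatenate)
from tensorflow.keras.models import Model, Sequential

# A toy CNN standing in for the VGG feature extractor.
model_cnn = Sequential([
    Conv2D(8, (3, 3), activation='relu', input_shape=(64, 64, 3)),
    MaxPooling2D((2, 2)),
    Flatten(),
])

# TimeDistributed applies the CNN to every frame of the input sequence;
# Lambda wraps the model call so it can be distributed over time.
input_lay = Input(shape=(None, 64, 64, 3))
time_distribute = TimeDistributed(Lambda(lambda x: model_cnn(x)))(input_lay)
lstm_lay = LSTM(128)(time_distribute)
output_lay = Dense(10, activation='softmax')(lstm_lay)

model = Model(inputs=[input_lay], outputs=[output_lay])
model.compile(loss='categorical_crossentropy', optimizer='adam')
model.summary()

For the row/column idea in the original question, a Permute layer can transpose the reshaped feature matrix so that a second LSTM reads columns as timesteps; again a sketch with illustrative sizes:

feat = Input(shape=(64, 64))                # the reshaped CNN feature matrix
rows = LSTM(256)(feat)                      # each of the 64 rows is a timestep
cols = LSTM(256)(Permute((2, 1))(feat))     # transpose: columns become timesteps
merged = Concatenate()([rows, cols])        # concatenated for the final output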