Problems with VALID padding for Conv2D layer - tensorflow

I have built the following object recognition model for my school assignment to predict classes from the CIFAR-10 dataset. The assignment requires that I use VALID padding for all convolution and pooling layers.
def _build_cifar10_model(num_C1_channels=50, num_C2_channels=60, use_dropout=False):
    model = Sequential()
    # reshape 1D array of length 3072
    # to a matrix of shape 32x32x3
    model.add(Input(shape=(3072,)))
    model.add(Reshape(target_shape=(32, 32, 3), input_shape=(3072,)))
    # 24x24x3
    model.add(Conv2D(filters=num_C1_channels, kernel_size=(9, 9), padding='valid', activation='relu'))
    # 12x12x3
    model.add(MaxPooling2D(pool_size=(2, 2), strides=(2, 2), padding='valid'))
    # 8x8x3
    model.add(Conv2D(filters=num_C2_channels, kernel_size=(5, 5), padding='valid', activation='relu'))
    # 4x4x3
    model.add(MaxPooling2D(pool_size=(2, 2), strides=(2, 2), padding='valid'))
    model.add(Flatten())
    model.add(Dense(units=300))
    if use_dropout:
        model.add(Dropout(rate=0.5))
    model.add(Dense(units=10, activation='softmax'))
    if use_dropout:
        model.add(Dropout(rate=0.5))
    return model
However, building this model is throwing the following error:
InvalidArgumentError: Negative dimension size caused by subtracting 1 from 0 for '{{node max_pooling2d_15/MaxPool}} = MaxPool[T=DT_FLOAT, data_format="NHWC", ksize=[1, 2, 2, 1], padding="VALID", strides=[1, 2, 2, 1]](conv2d_23/Relu)' with input shapes: [?,24,24,0].
I figured that this is because, with VALID padding, a (32, 32, 3) image becomes size (24, 24, 0) after the first Conv2D layer with kernel_size=9 is applied. I am not sure why the RGB channels are completely lost.
Is there a way to go around this issue while maintaining the VALID padding?
Sorry in advance as this is the first time I am building such a model.

Use InputLayer instead of Input at the start of your model. It will give you the first layer of the model instead of a symbolic tensor.
Replace the line with this:
model.add(InputLayer(input_shape=(3072,)))
The rest of the code executes for me and gives a 4x4 feature map after the last pooling layer, which you feed into your dropout/dense layers.
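For reference, a minimal sketch of the model with that fix applied. The shape comments are my own, based on the VALID-padding arithmetic out = in - kernel + 1; note that the channel count after a Conv2D comes from its number of filters, not from the RGB depth. I have also left out the second Dropout, since dropout after a softmax output is rarely intended; restore it if the assignment calls for it.
from tensorflow.keras import Sequential
from tensorflow.keras.layers import (InputLayer, Reshape, Conv2D, MaxPooling2D,
                                     Flatten, Dense, Dropout)

def _build_cifar10_model(num_C1_channels=50, num_C2_channels=60, use_dropout=False):
    model = Sequential()
    model.add(InputLayer(input_shape=(3072,)))  # a real first layer, not a symbolic tensor
    model.add(Reshape(target_shape=(32, 32, 3)))                                # 32x32x3
    model.add(Conv2D(filters=num_C1_channels, kernel_size=(9, 9),
                     padding='valid', activation='relu'))    # 32-9+1 = 24 -> 24x24x50
    model.add(MaxPooling2D(pool_size=(2, 2), strides=(2, 2), padding='valid'))  # 12x12x50
    model.add(Conv2D(filters=num_C2_channels, kernel_size=(5, 5),
                     padding='valid', activation='relu'))    # 12-5+1 = 8 -> 8x8x60
    model.add(MaxPooling2D(pool_size=(2, 2), strides=(2, 2), padding='valid'))  # 4x4x60
    model.add(Flatten())
    model.add(Dense(units=300))
    if use_dropout:
        model.add(Dropout(rate=0.5))
    model.add(Dense(units=10, activation='softmax'))
    return model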

Related

Passing output of 3DCNN layer to LSTM layer

While trying to learn Recurrent Neural Networks (RNNs), I am trying to train an Automatic Lip Reading model using a 3D CNN + LSTM. I tried out code I found for this on Kaggle.
model = Sequential()
# 1st layer group
model.add(Conv3D(32, (3, 3, 3), strides = 1, input_shape=(22, 100, 100, 1), activation='relu', padding='valid'))
model.add(MaxPooling3D(pool_size=(2, 2, 2), strides=2))
model.add(Conv3D(64, (3, 3, 3), activation='relu', strides=1))
model.add(MaxPooling3D(pool_size=(2, 2, 2), strides=2))
model.add(Conv3D(128, (3, 3, 3), activation='relu', strides=1))
model.add(MaxPooling3D(pool_size=(2, 2, 2), strides=2))
shape = model.get_output_shape_at(0)
model.add(Reshape((shape[-1],shape[1]*shape[2]*shape[3])))
# LSTMS - Recurrent Network Layer
model.add(LSTM(32, return_sequences=True))
model.add(Dropout(.5))
model.add(Flatten())
# # FC layers group
model.add(Dense(2048, activation='relu'))
model.add(Dropout(.5))
model.add(Dense(1024, activation='relu'))
model.add(Dropout(.5))
model.add(Dense(10, activation='softmax'))
model.compile(loss='categorical_crossentropy', optimizer='Adam', metrics=['accuracy'])
model.summary()
However, it returns the following error:
11 model.add(MaxPooling3D(pool_size=(2, 2, 2), strides=2))
12
---> 13 shape = model.get_output_shape_at(0)
14 model.add(Reshape((shape[-1],shape[1]*shape[2]*shape[3])))
15
RuntimeError: The layer sequential_2 has never been called and thus has no defined output shape.
From my understanding, the author of the code was trying to get the model's output shape and reshape it so it could be forwarded to the LSTM layer.
I found a similar post, and following it I made the change below, which fixed the error.
shape = model.layers[-1].output_shape
# shape = model.get_output_shape_at(0)
Still, I am confused as to what the code does to forward the input from the CNN layers to the LSTM layer. Any help in understanding the above is appreciated. Thank you!
You are getting this error because get_output_shape_at(0) cannot be called on a model that has never been built. TensorFlow 2.x runs fully in eager mode, so a Sequential model has no defined output shape until it has actually been called on data; once you fit the model (even for a single epoch) you can use model.get_output_shape_at(0), but before that you must use model.layers[-1].output_shape instead.
As for how the input is forwarded from the CNN layers to the LSTM: the last pooling layer produces a feature map of shape (d1, d2, d3, channels), and Reshape((shape[-1], shape[1]*shape[2]*shape[3])) turns it into a sequence of shape[-1] timesteps, one per channel, each a flattened spatial feature vector; that (batch, timesteps, features) tensor is what the LSTM consumes. The CNN layers extract features locally and the LSTM then learns from them sequentially. Combining convolutions with an LSTM is a good approach, but I would recommend using tf.keras.layers.ConvLSTM3D directly; see https://www.tensorflow.org/api_docs/python/tf/keras/layers/ConvLSTM3D and the sketch after the code below.
tf.keras.backend.clear_session()
model = Sequential()
# 1st layer group
model.add(Conv3D(32, (3, 3, 3), strides = 1, input_shape=(22, 100, 100, 1), activation='relu', padding='valid'))
model.add(MaxPooling3D(pool_size=(2, 2, 2), strides=2))
model.add(Conv3D(64, (3, 3, 3), activation='relu', strides=1))
model.add(MaxPooling3D(pool_size=(2, 2, 2), strides=2))
model.add(Conv3D(128, (3, 3, 3), activation='relu', strides=1))
model.add(MaxPooling3D(pool_size=(2, 2, 2), strides=2))
shape = model.layers[-1].output_shape
model.add(Reshape((shape[-1],shape[1]*shape[2]*shape[3])))
# LSTMS - Recurrent Network Layer
model.add(LSTM(32, return_sequences=True))
model.add(Dropout(.5))
model.add(Flatten())
# # FC layers group
model.add(Dense(2048, activation='relu'))
model.add(Dropout(.5))
model.add(Dense(1024, activation='relu'))
model.add(Dropout(.5))
model.add(Dense(10, activation='softmax'))
model.compile(loss='categorical_crossentropy', optimizer='Adam', metrics=['accuracy'])
model.summary()
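Since each of the 22 frames here is a 2D image, the 2D counterpart ConvLSTM2D is arguably the closer fit for this particular input shape; a minimal sketch, assuming the 22 frames form the time axis:
import tensorflow as tf
from tensorflow.keras import Sequential
from tensorflow.keras.layers import ConvLSTM2D, GlobalAveragePooling2D, Dense

model = Sequential()
# input: 22 timesteps of 100x100 single-channel frames
model.add(ConvLSTM2D(32, kernel_size=(3, 3),
                     input_shape=(22, 100, 100, 1)))  # last state: (98, 98, 32)
model.add(GlobalAveragePooling2D())                   # -> (32,)
model.add(Dense(10, activation='softmax'))
model.compile(loss='categorical_crossentropy', optimizer='adam',
              metrics=['accuracy'])
model.summary()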

Keras Shape errors when trying to use pre-trained model

I want to use a pre-trained model (from Keras Applications), with weights, and append my (very simple) CNN model at the end. To this end I am trying to loosely follow the tutorial here under the sub-header 'Fine-tune InceptionV3 on a new set of classes'.
My original simple CNN model was this:
model = Sequential()
model.add(Rescaling(1.0 / 255))
model.add(Conv2D(32, kernel_size=(3, 3), activation='relu', input_shape=(256,256,3)))
model.add(MaxPool2D(pool_size=(2, 2), strides=2))
model.add(Conv2D(64, kernel_size=(3, 3), activation='relu'))
model.add(MaxPool2D(pool_size=(2, 2), strides=2))
model.add(Flatten())
model.add(Dense(units=5, activation='softmax'))
As I'm following the tutorial, I've converted it as so:
x = base_model.output
x = Rescaling(1.0 / 255)(x)
x = Conv2D(32, kernel_size=(3, 3), activation='relu', input_shape=(256,256,3))(x)
x = MaxPool2D(pool_size=(2, 2), strides=2)(x)
x = Conv2D(64, kernel_size=(3, 3), activation='relu')(x)
x = MaxPool2D(pool_size=(2, 2), strides=2)(x)
x = GlobalAveragePooling2D()(x)
predictions = Dense(units=5, activation='softmax')(x)
As you can see, the difference is that the top model is Sequential while the bottom one is functional (I think?), and that the Flatten() layer has been replaced with GlobalAveragePooling2D(). I did this because I kept getting shape-related errors and it wasn't compiling. I thought I had it once I replaced Flatten() with GlobalAveragePooling2D(), as this part of the code finally compiled; however, now that I am trying to train the model, it gives me the following error:
ValueError: Exception encountered when calling layer "max_pooling2d_7" (type MaxPooling2D).
Negative dimension size caused by subtracting 2 from 1 for '{{node model/max_pooling2d_7/MaxPool}} = MaxPool[T=DT_FLOAT, data_format="NHWC", explicit_paddings=[], ksize=[1, 2, 2, 1], padding="VALID", strides=[1, 2, 2, 1]](model/conv2d_10/Relu)' with input shapes: [?,1,1,64].
Call arguments received:
• inputs=tf.Tensor(shape=(None, 1, 1, 64), dtype=float32)
I don't want to remove the MaxPooling layer as I want this fine-tuned model append to be as close to the 'simple CNN' model I originally had, so that I can compare the two results. But I keep getting hit with these shape errors, which I don't really understand, and it's coming to the end of the day.
Is there a nice quick-fix that can enable this VGG16+simple CNN to work?
The first and most important problem in your model structure is that you are rescaling the images after they have passed through the base_model; the rescaling should happen just before the base model.
The second is that you have defined input_shape on a convolution layer in the middle of the model, even though the data passes through the base model first; you should define an Input layer before the base model and then pass its output through base_model and the remaining layers.
Here is your code, edited:
inputs = Input(shape=(256, 256, 3))
x = Rescaling(1.0 / 255)(inputs)
x = base_model(x)
x = Conv2D(32, kernel_size=(3, 3), activation='relu')(x)
x = MaxPool2D(pool_size=(2, 2), strides=2)(x)
x = Conv2D(64, kernel_size=(3, 3), activation='relu')(x)
x = MaxPool2D(pool_size=(2, 2), strides=2)(x)
x = GlobalAveragePooling2D()(x)
predictions = Dense(units=5, activation='softmax')(x)
model = keras.Model(inputs=[inputs], outputs=[predictions])
As for the error raised: by the time the feature map reaches the second MaxPool2D it has already shrunk to 1x1, so there is nothing left to pool. You could set the convolution layers' padding parameter to 'same', or resize the images to a larger size, to work around the problem.
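To illustrate, a sketch of the 'same'-padding variant with the shapes traced; the VGG16 constructor call is my assumption about how base_model was created (with a 256x256 input, VGG16's convolutional base outputs an 8x8x512 map):
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras.layers import (Input, Rescaling, Conv2D, MaxPool2D,
                                     GlobalAveragePooling2D, Dense)

base_model = tf.keras.applications.VGG16(include_top=False,
                                         input_shape=(256, 256, 3))

inputs = Input(shape=(256, 256, 3))
x = Rescaling(1.0 / 255)(inputs)           # Rescaling is a built-in layer in TF 2.6+
x = base_model(x)                          # (8, 8, 512)
x = Conv2D(32, kernel_size=(3, 3), activation='relu', padding='same')(x)  # (8, 8, 32)
x = MaxPool2D(pool_size=(2, 2), strides=2)(x)                             # (4, 4, 32)
x = Conv2D(64, kernel_size=(3, 3), activation='relu', padding='same')(x)  # (4, 4, 64)
x = MaxPool2D(pool_size=(2, 2), strides=2)(x)                             # (2, 2, 64)
x = GlobalAveragePooling2D()(x)
predictions = Dense(units=5, activation='softmax')(x)
model = keras.Model(inputs=[inputs], outputs=[predictions])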

Memory problem when adding dense layer to CNN

I am implementing my first CNN in Tensorflow and I am having trouble when adding the dense layer to my CNN model. Here is the code:
batch_size = 4
sample_shape = (batch_size, 24, 30, 30, 5)
model = models.Sequential()
model.add(layers.Conv3D(96, kernel_size=(4, 4, 4), activation='relu', padding='same', input_shape=sample_shape))
model.add(layers.Conv3D(64, kernel_size=(3, 3, 3), activation='relu', padding='same'))
model.add(layers.Conv3D(64, kernel_size=(1, 1, 5), activation='relu', padding='same'))
model.add(layers.Flatten())
model.add(layers.Dense(256, activation='relu'))
model.summary()
I am getting the following output, and later my program crashes. What needs so much memory? It seems to be the Dense layer, but I can't explain it.
2021-10-20 19:03:53.219849: W tensorflow/core/framework/cpu_allocator_impl.cc:80] Allocation of 5662310400 exceeds 10% of free system memory.
The weight matrix of your final Dense layer has roughly
4 * (24 * 30 * 30 * 64) * 256 = 1,415,577,600
parameters. The factor of 4 appears because batch_size was included in input_shape, so it is treated as just another dimension and flattened along with everything else; input_shape should describe a single sample. That is an enormous number of parameters. Use MaxPooling3D between the convolutional layers, or GlobalAveragePooling3D instead of Flatten, to reduce the parameter count.
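A minimal sketch of that fix, with the batch dimension removed from input_shape (the layer choices here are illustrative):
from tensorflow.keras import models, layers

model = models.Sequential()
model.add(layers.Conv3D(96, kernel_size=(4, 4, 4), activation='relu',
                        padding='same', input_shape=(24, 30, 30, 5)))
model.add(layers.MaxPooling3D(pool_size=(2, 2, 2)))   # (12, 15, 15, 96)
model.add(layers.Conv3D(64, kernel_size=(3, 3, 3), activation='relu', padding='same'))
model.add(layers.GlobalAveragePooling3D())            # (64,) instead of a huge Flatten
model.add(layers.Dense(256, activation='relu'))
model.summary()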

How to add custom layers inside vgg16 when doing transfer learning?

I am trying to use transfer learning with VGG16. My main idea is to keep the first few layers of vgg16, add my own layers, then add the rest of the layers from vgg16, and finally add my own output layer at the end. To do this I follow this sequence: (1) load and freeze the layers, (2) add my layers, (3) load the rest of the layers (except the output layer) [THIS IS WHERE I ENCOUNTER THE FOLLOWING ERROR] and freeze them, (4) add the output layer. Is my approach OK? If not, where am I going wrong? Here's the error:
ValueError: Input 0 is incompatible with layer block3_conv1: expected axis -1 of input shape to have value 128 but got shape (None, 64, 56, 64)
The full code is here for better understanding:
vgg16_model = load_model('Fetched_VGG.h5')
vgg16_model.summary()

model = Sequential()
# add vgg layers (input layer, block1, block2)
for layer in vgg16_model.layers[0:6]:
    model.add(layer)

# Freeze the layers (prevent their weights from being updated)
for layer in model.layers:
    layer.trainable = False

# add custom layers
model.add(Conv2D(64, (3, 3), activation='relu', padding='same', name='block66_conv1_m'))
model.add(Conv2D(64, (3, 3), activation='relu', padding='same', name='block66_conv2_m'))
model.add(Conv2D(64, (3, 3), activation='relu', padding='same', name='block66_conv3_m'))
model.add(MaxPooling2D((2, 2), strides=(2, 2), name='block66_pool_m'))

# add vgg layers (block 3 up to, but not including, the output dense layer)
for layer in vgg16_model.layers[7:-1]:
    model.add(layer)

# Freeze the layers (prevent their weights from being updated)
for layer in model.layers:
    layer.trainable = False

# add the output layer
model.add(Dense(2, activation='softmax', name='predictions'))
model.summary()
As VGG16's layer 7 (block3_conv1) expects 128 input channels, you'll need to match this with your final custom Conv2D:
model.add(Conv2D(128, (3, 3), activation='relu', padding='same', name='block66_conv3_m'))
If the dimensions match, you should be able to build your model, but it's not clear what you're trying to achieve: inserting layers into the middle of the VGG16 model means that all the downstream layers will need to be retrained.
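If you are unsure what a splice point expects, it can help to print the shapes of the surrounding layers first; a small sketch using the same loaded model (indices as in the code above):
# inspect the layers around the splice point to see what block3_conv1 expects
for i, layer in enumerate(vgg16_model.layers[5:9], start=5):
    print(i, layer.name, layer.input_shape, '->', layer.output_shape)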

Merge layer keras with tensorflow backend

I wanted to merge two sequential models into one using a Merge layer, but it is showing me an error. I am working with 128x128 RGB images, and the batch size is 32.
The error is:
ValueError: The model expects 3 input arrays, but only received one array. Found: array with shape (32, 3, 128, 128)
The model is defined as:
model = Sequential()
leftBranch = Sequential()
leftBranch.add(Reshape((3,128,128), input_shape=(3, img_width, img_height)))
leftBranch.add(Convolution2D(14, 3, 1, activation='relu'))
leftBranch.add(ZeroPadding2D((1, 1)))
leftBranch.add(Flatten())
rightBranch = Sequential()
rightBranch.add(Reshape((3,128,128), input_shape=(3, img_width, img_height)))
rightBranch.add(Convolution2D(14, 1, 3, activation='relu'))
rightBranch.add(MaxPooling2D((2, 2), strides=(2, 2)))
rightBranch.add(Flatten())
centralBranch = Sequential()
centralBranch.add(Reshape((3,128,128), input_shape=(3, img_width, img_height)))
centralBranch.add(Convolution2D(14, 5, 5, activation='relu'))
centralBranch.add(MaxPooling2D((2, 2), strides=(2, 2)))
centralBranch.add(Flatten())
merged = Merge([leftBranch, centralBranch, rightBranch], mode='concat')
model = Sequential()
model.add(merged)
model.add(Dense(64))
model.add(Activation('relu'))
model.add(Dropout(0.5))
model.add(Dense(1))
model.add(Activation('sigmoid'))
So, what is the proper way to concatenate two Sequential models with convolutional layers? I just want to merge the convolutional layers' outputs, as I tried to do here.
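In current Keras, a multi-branch merge like this is typically written with the functional API, sharing a single Input so the model expects only one input array; a minimal sketch under that assumption (channels-last input, layer sizes as in the question):
from tensorflow.keras import Model
from tensorflow.keras.layers import (Input, Conv2D, MaxPooling2D, ZeroPadding2D,
                                     Flatten, Concatenate, Dense, Dropout)

inputs = Input(shape=(128, 128, 3))  # one shared input for all three branches

left = Conv2D(14, (3, 1), activation='relu')(inputs)
left = ZeroPadding2D((1, 1))(left)
left = Flatten()(left)

center = Conv2D(14, (5, 5), activation='relu')(inputs)
center = MaxPooling2D((2, 2), strides=(2, 2))(center)
center = Flatten()(center)

right = Conv2D(14, (1, 3), activation='relu')(inputs)
right = MaxPooling2D((2, 2), strides=(2, 2))(right)
right = Flatten()(right)

# concatenate the flattened branch outputs, then classify
merged = Concatenate()([left, center, right])
x = Dense(64, activation='relu')(merged)
x = Dropout(0.5)(x)
outputs = Dense(1, activation='sigmoid')(x)

model = Model(inputs=inputs, outputs=outputs)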