I am implementing my first CNN in TensorFlow and I am having trouble adding the dense layer to my CNN model. Here is the code:
from tensorflow.keras import layers, models

batch_size = 4
sample_shape = (batch_size, 24, 30, 30, 5)
model = models.Sequential()
model.add(layers.Conv3D(96, kernel_size=(4, 4, 4), activation='relu', padding='same', input_shape=sample_shape))
model.add(layers.Conv3D(64, kernel_size=(3, 3, 3), activation='relu', padding='same'))
model.add(layers.Conv3D(64, kernel_size=(1, 1, 5), activation='relu', padding='same'))
model.add(layers.Flatten())
model.add(layers.Dense(256, activation='relu'))
model.summary()
I am getting the following output, and later my program crashes. What needs so much memory? It seems to be the Dense layer, but I can't explain it.
2021-10-20 19:03:53.219849: W tensorflow/core/framework/cpu_allocator_impl.cc:80] Allocation of 5662310400 exceeds 10% of free system memory.
The weight matrix of your final Dense layer has roughly

4 * (24 * 30 * 30 * 64) * 256 = 1,415,577,600

parameters: the flattened Conv3D output (4 * 24 * 30 * 30 * 64 values) times the 256 Dense units. Note that the leading 4 is your batch_size, which was included in input_shape by mistake; input_shape should describe a single sample, without the batch dimension. That's an insane number of parameters. Use MaxPooling3D between the convolutional layers, or GlobalAveragePooling3D instead of Flatten, to reduce the number of parameters, as in the sketch below.
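A minimal sketch of that suggestion, keeping the filter counts and kernel sizes from the question (the pooling sizes are illustrative assumptions):

from tensorflow.keras import layers, models

sample_shape = (24, 30, 30, 5)  # a single sample; no batch dimension

model = models.Sequential()
model.add(layers.Conv3D(96, kernel_size=(4, 4, 4), activation='relu', padding='same', input_shape=sample_shape))
model.add(layers.MaxPooling3D(pool_size=(2, 2, 2)))  # halves each spatial dimension
model.add(layers.Conv3D(64, kernel_size=(3, 3, 3), activation='relu', padding='same'))
model.add(layers.MaxPooling3D(pool_size=(2, 2, 2)))
model.add(layers.Conv3D(64, kernel_size=(1, 1, 5), activation='relu', padding='same'))
model.add(layers.GlobalAveragePooling3D())  # outputs (batch, 64) instead of a huge flattened vector
model.add(layers.Dense(256, activation='relu'))
model.summary()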
Whilst trying to learn recurrent neural networks (RNNs), I am trying to train an automatic lip-reading model using a 3D CNN + LSTM. I tried out some code I found for this on Kaggle.
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import (Conv3D, MaxPooling3D, Reshape,
                                     LSTM, Dropout, Flatten, Dense)

model = Sequential()
# 1st layer group
model.add(Conv3D(32, (3, 3, 3), strides = 1, input_shape=(22, 100, 100, 1), activation='relu', padding='valid'))
model.add(MaxPooling3D(pool_size=(2, 2, 2), strides=2))
model.add(Conv3D(64, (3, 3, 3), activation='relu', strides=1))
model.add(MaxPooling3D(pool_size=(2, 2, 2), strides=2))
model.add(Conv3D(128, (3, 3, 3), activation='relu', strides=1))
model.add(MaxPooling3D(pool_size=(2, 2, 2), strides=2))
shape = model.get_output_shape_at(0)
model.add(Reshape((shape[-1],shape[1]*shape[2]*shape[3])))
# LSTMS - Recurrent Network Layer
model.add(LSTM(32, return_sequences=True))
model.add(Dropout(.5))
model.add(Flatten())
# FC layers group
model.add(Dense(2048, activation='relu'))
model.add(Dropout(.5))
model.add(Dense(1024, activation='relu'))
model.add(Dropout(.5))
model.add(Dense(10, activation='softmax'))
model.compile(loss='categorical_crossentropy', optimizer='Adam', metrics=['accuracy'])
model.summary()
However, it returns the following error:
11 model.add(MaxPooling3D(pool_size=(2, 2, 2), strides=2))
12
---> 13 shape = model.get_output_shape_at(0)
14 model.add(Reshape((shape[-1],shape[1]*shape[2]*shape[3])))
15
RuntimeError: The layer sequential_2 has never been called and thus has no defined output shape.
From my understanding, the author of the code was trying to get the output shape of the network at that point and reshape it so it could be forwarded to the LSTM layer.
I found a similar post, following which I made the changes below, and the error was fixed.
shape = model.layers[-1].output_shape
# shape = model.get_output_shape_at(0)
Still, I am confused as to what the code does to forward the input from the CNN layers to the LSTM layer. Any help in understanding the above is appreciated. Thank you!!
When the code runs from top to bottom, the inputs flow through the graph from top to bottom. You are getting this error because get_output_shape_at(0) can't be called in eager mode before the model has been called on data: TensorFlow 2.0 runs fully eagerly, so once you fit the model and train it for one epoch you can use model.get_output_shape_at(0); until then, use model.layers[-1].output_shape instead.
As for what the Reshape does: the output of the last MaxPooling3D has shape (batch, 1, 10, 10, 128), and Reshape((shape[-1], shape[1]*shape[2]*shape[3])) turns it into (batch, 128, 100), so the LSTM sees a sequence of 128 timesteps (one per channel), each a flattened 1x10x10 feature map.
The CNN layers extract features locally, and the LSTM then learns from them sequentially. Using convolutions together with an LSTM is a good approach, but I would recommend using tf.keras.layers.ConvLSTM3D directly (see the sketch after the fixed code below). Check it here: https://www.tensorflow.org/api_docs/python/tf/keras/layers/ConvLSTM3D
import tensorflow as tf

tf.keras.backend.clear_session()
model = Sequential()
# 1st layer group
model.add(Conv3D(32, (3, 3, 3), strides = 1, input_shape=(22, 100, 100, 1), activation='relu', padding='valid'))
model.add(MaxPooling3D(pool_size=(2, 2, 2), strides=2))
model.add(Conv3D(64, (3, 3, 3), activation='relu', strides=1))
model.add(MaxPooling3D(pool_size=(2, 2, 2), strides=2))
model.add(Conv3D(128, (3, 3, 3), activation='relu', strides=1))
model.add(MaxPooling3D(pool_size=(2, 2, 2), strides=2))
shape = model.layers[-1].output_shape
model.add(Reshape((shape[-1],shape[1]*shape[2]*shape[3])))
# LSTMS - Recurrent Network Layer
model.add(LSTM(32, return_sequences=True))
model.add(Dropout(.5))
model.add(Flatten())
# FC layers group
model.add(Dense(2048, activation='relu'))
model.add(Dropout(.5))
model.add(Dense(1024, activation='relu'))
model.add(Dropout(.5))
model.add(Dense(10, activation='softmax'))
model.compile(loss='categorical_crossentropy', optimizer='Adam', metrics=['accuracy'])
model.summary()
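And here is a minimal, hedged sketch of the ConvLSTM3D alternative mentioned above. ConvLSTM3D is available in recent TensorFlow releases (2.6+, if I recall correctly) and expects 6D input of shape (batch, time, d1, d2, d3, channels); the shapes below are illustrative assumptions, not taken from the lip-reading data:

import tensorflow as tf
from tensorflow.keras import layers, models

model = models.Sequential()
# 10 timesteps, each an 8x32x32 volume with 1 channel (illustrative shape)
model.add(layers.ConvLSTM3D(16, kernel_size=(3, 3, 3), padding='same',
                            input_shape=(10, 8, 32, 32, 1)))
# return_sequences defaults to False, so the output is (batch, 8, 32, 32, 16)
model.add(layers.GlobalAveragePooling3D())
model.add(layers.Dense(10, activation='softmax'))
model.summary()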
I want to use a pre-trained model (from Keras Applications), with weights, and append my (very simple) CNN model at the end. To this end I am trying to loosely follow the tutorial here under the sub-header 'Fine-tune InceptionV3 on a new set of classes'.
My original simple CNN model was this:
model = Sequential()
model.add(Rescaling(1.0 / 255))
model.add(Conv2D(32, kernel_size=(3, 3), activation='relu', input_shape=(256,256,3)))
model.add(MaxPool2D(pool_size=(2, 2), strides=2))
model.add(Conv2D(64, kernel_size=(3, 3), activation='relu'))
model.add(MaxPool2D(pool_size=(2, 2), strides=2))
model.add(Flatten())
model.add(Dense(units=5, activation='softmax'))
As I'm following the tutorial, I've converted it as so:
x = base_model.output
x = Rescaling(1.0 / 255)(x)
x = Conv2D(32, kernel_size=(3, 3), activation='relu', input_shape=(256,256,3))(x)
x = MaxPool2D(pool_size=(2, 2), strides=2)(x)
x = Conv2D(64, kernel_size=(3, 3), activation='relu')(x)
x = MaxPool2D(pool_size=(2, 2), strides=2)(x)
x = GlobalAveragePooling2D()(x)
predictions = Dense(units=5, activation='softmax')(x)
As you can see, the difference is that the top model is a Sequential() model while the bottom one is functional (I think?), and that the Flatten() layer has been replaced with GlobalAveragePooling2D(). I did this because I kept getting shape-related errors and it wasn't compiling. I thought I had it once I replaced Flatten() with GlobalAveragePooling2D(), as this part of the code finally did compile; however, now that I'm trying to train the model, it gives me the following error:
ValueError: Exception encountered when calling layer "max_pooling2d_7" (type MaxPooling2D).
Negative dimension size caused by subtracting 2 from 1 for '{{node model/max_pooling2d_7/MaxPool}} = MaxPool[T=DT_FLOAT, data_format="NHWC", explicit_paddings=[], ksize=[1, 2, 2, 1], padding="VALID", strides=[1, 2, 2, 1]](model/conv2d_10/Relu)' with input shapes: [?,1,1,64].
Call arguments received:
• inputs=tf.Tensor(shape=(None, 1, 1, 64), dtype=float32)
I don't want to remove the MaxPooling layers, as I want the appended part of this fine-tuned model to stay as close as possible to the 'simple CNN' model I originally had, so that I can compare the two results. But I keep getting hit with these shape errors, which I don't really understand, and it's coming to the end of the day.
Is there a nice quick-fix that can enable this VGG16+simple CNN to work?
The first and most important technical problem in your model structure is that you are rescaling the images after they have passed through the base_model; you should rescale them just before the base model.
The second is that you have defined input_shape on a convolution layer further down the model, while the data first passes through the base model; you should define an input layer before the base model and then pass its output through base_model and the other layers.
Here I've edited your code:
from tensorflow import keras
from tensorflow.keras.layers import (Input, Rescaling, Conv2D, MaxPool2D,
                                     GlobalAveragePooling2D, Dense)

inputs = Input(shape=(256, 256, 3))
x = Rescaling(1.0 / 255)(inputs)
x = base_model(x)
x = Conv2D(32, kernel_size=(3, 3), activation='relu')(x)
x = MaxPool2D(pool_size=(2, 2), strides=2)(x)
x = Conv2D(64, kernel_size=(3, 3), activation='relu')(x)
x = MaxPool2D(pool_size=(2, 2), strides=2)(x)
x = GlobalAveragePooling2D()(x)
predictions = Dense(units=5, activation='softmax')(x)
model = keras.Model(inputs = [inputs], outputs = [predictions])
As for the error raised: in this case you could set the convolution layers' padding parameter to 'same', or even resize the images to a larger size, to get around the problem.
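For instance, assuming a VGG16 base on 256x256 inputs (whose output feature map is 8x8), switching the two Conv2D layers to padding='same' leaves enough spatial size for both poolings, roughly like this:

x = base_model(x)                                # (batch, 8, 8, 512) for VGG16 on 256x256
x = Conv2D(32, kernel_size=(3, 3), activation='relu', padding='same')(x)
x = MaxPool2D(pool_size=(2, 2), strides=2)(x)    # 8x8 -> 4x4
x = Conv2D(64, kernel_size=(3, 3), activation='relu', padding='same')(x)
x = MaxPool2D(pool_size=(2, 2), strides=2)(x)    # 4x4 -> 2x2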
I was studying different CNN architectures to predict the CIFAR10 dataset, and I found this interesting Github repository:
https://gist.github.com/wielandbrendel/ccf1ff6f8f92139439be
I tried to run the model, but it was created 6 years ago and the following Keras command is no longer valid:
model.add(Convolution2D(32, 3, 3, 3, border_mode='full'))
How is this command translated into the modern Keras syntax for Conv2D?
I get an error in Keras when I try to pass this sequence of integers to Convolution2D(32, 3, 3, 3, ...).
I guess 32 is the number of channels, and then we specify a 3x3 kernel size, but I am not sure about the meaning of the last 3 (in 4th position).
P.S. Changing border_mode to padding='valid' or 'same' returns the following error:
model.add(Convolution2D(32, 3, 3, 3, padding='valid'))
TypeError: __init__() got multiple values for argument 'padding'
The gist you're following is outdated and also has some issues. You don't need to follow it now. Here is an updated version of it. Try this.
Imports and Dataset
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import (Dense, Dropout, Activation,
                                     Flatten, Conv2D, MaxPooling2D)
from tensorflow.keras.optimizers import SGD, Adadelta, Adagrad
import tensorflow as tf
import numpy as np
# parameters
batch_size = 32
nb_classes = 10
nb_epoch = 5
# the data, shuffled and split between train and test sets
(X_train, y_train), (X_test, y_test) = tf.keras.datasets.cifar10.load_data()
# convert class vectors to binary class matrices
Y_train = tf.keras.utils.to_categorical(y_train, nb_classes)
Y_test = tf.keras.utils.to_categorical(y_test, nb_classes)
# train model
X_train = X_train.astype("float32") / 255
X_test = X_test.astype("float32") / 255
X_train.shape, y_train.shape, X_test.shape, y_test.shape
((50000, 32, 32, 3), (50000, 1), (10000, 32, 32, 3), (10000, 1))
Modeling
model = Sequential()
# activation='relu' is already set on each Conv2D, so no separate Activation layers are needed
model.add(Conv2D(filters=32, kernel_size=(3, 3),
                 strides=(1, 1), activation='relu', padding="same"))
model.add(Conv2D(filters=32, kernel_size=(3, 3),
                 strides=(1, 1), activation='relu', padding="same"))
model.add(MaxPooling2D((2, 2)))
model.add(Dropout(0.25))
model.add(Conv2D(filters=32, kernel_size=(3, 3),
                 strides=(1, 1), activation='relu', padding="same"))
model.add(Conv2D(filters=32, kernel_size=(3, 3),
                 strides=(1, 1), activation='relu', padding="same"))
model.add(MaxPooling2D((2, 2)))
model.add(Dropout(0.25))
model.add(Flatten())
model.add(Dense(512))
model.add(Activation('relu'))
model.add(Dropout(0.5))
model.add(Dense(nb_classes))
model.add(Activation('softmax'))
# let's train the model using SGD + momentum (how original).
sgd = SGD(learning_rate=0.01, decay=1e-6, momentum=0.9, nesterov=True)
model.compile(loss='categorical_crossentropy', optimizer=sgd, metrics=['accuracy'])
Compile and Run
model.fit(X_train, Y_train, batch_size=batch_size, epochs=nb_epoch)
# test score & top 1 performance
score = model.evaluate(X_test, Y_test, batch_size=batch_size)
y_hat = model.predict(X_test)
yhat = np.argmax(y_hat, 1)
top1 = np.mean(yhat == np.squeeze(y_test))
print('Test score/Top1', score, top1)
Convolution2D is now named Conv2D, but there is still an alias for Convolution2D, so that's not a problem.
The border_mode argument is no longer available; the equivalent is padding, with options 'valid' or 'same'.
Try both to see which fits the output shapes and allows the code to work.
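As for the 4th-position 3 in the original call: in the very old Keras API the signature was, to my recollection, Convolution2D(nb_filter, stack_size, nb_row, nb_col), so the call meant 32 filters over 3 input channels with a 3x3 kernel. A rough modern equivalent (border_mode='full' has no exact counterpart today; 'same' is the closest option) would be:

from tensorflow.keras.layers import Conv2D

# 32 filters with a 3x3 kernel; the input channel count (the old stack_size=3)
# is now inferred from input_shape instead of being passed explicitly
Conv2D(filters=32, kernel_size=(3, 3), padding='same', input_shape=(32, 32, 3))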
I have built the following object recognition model for my school assignment to predict classes from the CIFAR-10 dataset. The assignment requires that I use the VALID padding for all convolution and pooling layers.
def _build_cifar10_model(num_C1_channels=50, num_C2_channels=60, use_dropout=False):
    model = Sequential()
    # reshape 1D array of length 3072
    # to a matrix of shape 32x32x3
    model.add(Input(shape=(3072,)))
    model.add(Reshape(target_shape=(32, 32, 3), input_shape=(3072,)))
    # 24x24x3
    model.add(Conv2D(filters=num_C1_channels, kernel_size=(9, 9), padding='valid', activation='relu'))
    # 12x12x3
    model.add(MaxPooling2D(pool_size=(2, 2), strides=(2, 2), padding='valid'))
    # 8x8x3
    model.add(Conv2D(filters=num_C2_channels, kernel_size=(5, 5), padding='valid', activation='relu'))
    # 4x4x3
    model.add(MaxPooling2D(pool_size=(2, 2), strides=(2, 2), padding='valid'))
    model.add(Flatten())
    model.add(Dense(units=300))
    if use_dropout:
        model.add(Dropout(rate=0.5))
    model.add(Dense(units=10, activation='softmax'))
    if use_dropout:
        model.add(Dropout(rate=0.5))
    return model
However, building this model is throwing the following error:
InvalidArgumentError: Negative dimension size caused by subtracting 1 from 0 for '{{node max_pooling2d_15/MaxPool}} = MaxPool[T=DT_FLOAT, data_format="NHWC", ksize=[1, 2, 2, 1], padding="VALID", strides=[1, 2, 2, 1]](conv2d_23/Relu)' with input shapes: [?,24,24,0].
I figured that it is because when padding is VALID, a (32, 32, 3) image becomes size (24, 24, 0) after applying that 1st Conv2D layer with kernel_size=9 on it. Not sure why the RGB channels are completely lost.
Is there a way to go around this issue while maintaining the VALID padding?
Sorry in advance as this is the first time I am building such a model.
Use InputLayer instead of Input at the start of your model. It will give you the first layer of the model instead of a symbolic tensor.
Replace the line with this:
model.add(InputLayer(input_shape=(3072,)))
The rest of the code executes for me and gives a 4x4 feature map after the second pooling, which you feed into your dense/dropout layers. See the shape walk-through below.
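As a side note on the shape comments in the question: with VALID padding, the channel dimension of a Conv2D output is the number of filters, not the input's 3 RGB channels, so nothing is 'lost'. Assuming the defaults num_C1_channels=50 and num_C2_channels=60, the progression is:

# Shape progression with padding='valid' (batch dimension omitted):
# Reshape                  -> (32, 32, 3)
# Conv2D 9x9, 50 filters   -> (24, 24, 50)   # 32 - 9 + 1 = 24; channels = filters
# MaxPooling2D 2x2, s=2    -> (12, 12, 50)
# Conv2D 5x5, 60 filters   -> (8, 8, 60)     # 12 - 5 + 1 = 8
# MaxPooling2D 2x2, s=2    -> (4, 4, 60)
# Flatten                  -> (960,)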
I wanted to merge two sequential models into one using a Merge layer, but it is showing me an error. I am working with images of size 128x128 (RGB), and the batch size is 32.
The error is:
ValueError: The model expects 3 input arrays, but only received one array. Found: array with shape (32, 3, 128, 128)
The model is defined as:
model = Sequential()

leftBranch = Sequential()
leftBranch.add(Reshape((3,128,128), input_shape=(3, img_width, img_height)))
leftBranch.add(Convolution2D(14, 3, 1, activation='relu'))
leftBranch.add(ZeroPadding2D((1, 1)))
leftBranch.add(Flatten())
rightBranch = Sequential()
rightBranch.add(Reshape((3,128,128), input_shape=(3, img_width, img_height)))
rightBranch.add(Convolution2D(14, 1, 3, activation='relu'))
rightBranch.add(MaxPooling2D((2, 2), strides=(2, 2)))
rightBranch.add(Flatten())
centralBranch = Sequential()
centralBranch.add(Reshape((3,128,128), input_shape=(3, img_width, img_height)))
centralBranch.add(Convolution2D(14, 5, 5, activation='relu'))
centralBranch.add(MaxPooling2D((2, 2), strides=(2, 2)))
centralBranch.add(Flatten())
merged = Merge([leftBranch, centralBranch, rightBranch], mode='concat')
model = Sequential()
model.add(merged)
model.add(Dense(64))
model.add(Activation('relu'))
model.add(Dropout(0.5))
model.add(Dense(1))
model.add(Activation('sigmoid'))
So, what is the proper way to concatenate two Sequential models with convolutional layers? I just want to merge the convolutional layers' outputs like I did here.
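One note on the immediate error: a Merge over three Sequential branches yields a model with three inputs, so fit() would need the same array passed three times, e.g. model.fit([X, X, X], y, ...). Also, the Merge layer has since been removed from Keras; a hedged sketch of the modern functional-API equivalent with a single shared input, assuming the images are converted to channels-last (128, 128, 3) ordering rather than the original channels-first layout, might look like this:

from tensorflow.keras import Input, Model
from tensorflow.keras.layers import (Conv2D, MaxPooling2D, ZeroPadding2D,
                                     Flatten, Dense, Dropout, concatenate)

inp = Input(shape=(128, 128, 3))  # one shared input instead of three

# left branch: 3x1 convolution, padded and flattened
left = Conv2D(14, (3, 1), activation='relu')(inp)
left = ZeroPadding2D((1, 1))(left)
left = Flatten()(left)

# central branch: 5x5 convolution + pooling
central = Conv2D(14, (5, 5), activation='relu')(inp)
central = MaxPooling2D((2, 2), strides=(2, 2))(central)
central = Flatten()(central)

# right branch: 1x3 convolution + pooling
right = Conv2D(14, (1, 3), activation='relu')(inp)
right = MaxPooling2D((2, 2), strides=(2, 2))(right)
right = Flatten()(right)

# concatenate replaces the old Merge(mode='concat')
merged = concatenate([left, central, right])
x = Dense(64, activation='relu')(merged)
x = Dropout(0.5)(x)
out = Dense(1, activation='sigmoid')(x)

model = Model(inputs=inp, outputs=out)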