Keras using too much memory - tensorflow

I have a Keras model (with the TensorFlow backend) which is defined like so:
INPUT_SHAPE = [4740, 3540, 1]
model = Sequential()
model.add(Conv2D(32, kernel_size=(3, 3),
model.add(Conv2D(2, (4, 4), strides=(1, 1), activation='relu'))
model.add(MaxPooling2D(pool_size=(4, 4)))
model.add(Conv2D(4, (4, 4), strides=(1, 1), activation='relu'))
model.add(MaxPooling2D(pool_size=(4, 4)))
model.add(Conv2D(8, (4, 4), strides=(1, 1), activation='relu'))
model.add(MaxPooling2D(pool_size=(4, 4)))
model.add(Conv2D(16, (4, 4), strides=(1, 1), activation='relu'))
model.add(MaxPooling2D(pool_size=(4, 4)))
model.add(Conv2D(32, (4, 4), strides=(1, 1), activation='relu'))
model.add(MaxPooling2D(pool_size=(4, 4)))
model.add(Dense(128, activation='relu'))
model.add(Dense(num_classes, activation='softmax'))
This model has only 37,506 trainable params. Yet somehow it exhausts the K80's 12 GB of VRAM if the batch size is more than 1.
Why does this model need so much memory?
And how do I calculate memory requirements properly?
The function from "How to determine needed memory of Keras model?" gives me 2.15 GB per batch element, so I should be able to fit a batch of at least 5.
EDIT: model.summary()
Layer (type) Output Shape Param #
conv2d_1 (Conv2D) (None, 4738, 3538, 32) 320
conv2d_2 (Conv2D) (None, 4735, 3535, 2) 1026
max_pooling2d_1 (MaxPooling2 (None, 1183, 883, 2) 0
conv2d_3 (Conv2D) (None, 1180, 880, 4) 132
max_pooling2d_2 (MaxPooling2 (None, 295, 220, 4) 0
conv2d_4 (Conv2D) (None, 292, 217, 8) 520
max_pooling2d_3 (MaxPooling2 (None, 73, 54, 8) 0
conv2d_5 (Conv2D) (None, 70, 51, 16) 2064
max_pooling2d_4 (MaxPooling2 (None, 17, 12, 16) 0
conv2d_6 (Conv2D) (None, 14, 9, 32) 8224
max_pooling2d_5 (MaxPooling2 (None, 3, 2, 32) 0
dropout_1 (Dropout) (None, 3, 2, 32) 0
flatten_1 (Flatten) (None, 192) 0
dense_1 (Dense) (None, 128) 24704
dropout_2 (Dropout) (None, 128) 0
dense_2 (Dense) (None, 4) 516
Total params: 37,506
Trainable params: 37,506
Non-trainable params: 0

The output shape of the first layer is B*4738*3538*32 (B is the batch size). That is about 536 million values per sample, which takes around 2.1 GB * B of memory in float32 (it would be around 1 GB * B only in float16). The gradients and the other activations will take some memory too. Increasing the stride of the first layer may help.
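To make the estimate concrete, here is a minimal sketch that adds up the activation sizes of the first few layers. The shapes are copied from the model.summary() output in the question; the 4 bytes per value (float32) and the helper names are my assumptions. Treat the result as a lower bound, since training also keeps gradients and backend workspace in memory.

```python
# Output shapes (height, width, channels) copied from model.summary() above.
LAYER_OUTPUT_SHAPES = [
    (4738, 3538, 32),  # conv2d_1
    (4735, 3535, 2),   # conv2d_2
    (1183, 883, 2),    # max_pooling2d_1
    (1180, 880, 4),    # conv2d_3
    (295, 220, 4),     # max_pooling2d_2
]

BYTES_PER_VALUE = 4  # assumption: activations are stored as float32


def activation_bytes(shape, batch_size=1, bytes_per_value=BYTES_PER_VALUE):
    """Bytes needed to store one layer's output for a whole batch."""
    count = batch_size
    for dim in shape:
        count *= dim
    return count * bytes_per_value


def total_activation_gb(shapes, batch_size=1):
    """Sum of all listed layer outputs, in GB (1 GB = 1e9 bytes)."""
    return sum(activation_bytes(s, batch_size) for s in shapes) / 1e9


first_layer_gb = activation_bytes(LAYER_OUTPUT_SHAPES[0]) / 1e9
print("first layer, batch of 1: %.2f GB" % first_layer_gb)
for batch_size in (1, 2, 5):
    print("batch of %d: at least %.2f GB of activations"
          % (batch_size, total_activation_gb(LAYER_OUTPUT_SHAPES, batch_size)))
```

The first layer dominates everything else, which is why a larger stride there (or fewer filters) changes the picture much more than anything done to the later layers.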


Why does building the same model in 2 different ways give different outputs?

I'm having a really weird problem.
I'm building the same model in 2 different ways.
I checked the summaries (number of parameters) and plotted the 2 models, and I see no difference.
Yet the models give different predictions (after training them on the same dataset).
What is the difference between the models? (I can't figure it out.)
How can I update the second model to be the same as the first model?
First model (the "source" model):
input_img = Input(shape=(dim_x, dim_y, dim_z))
x = Conv2D(16, (3, 3), activation='relu', padding='same')(input_img)
x = MaxPooling2D((2, 2), padding='same')(x)
x = Conv2D(8, (3, 3), activation='relu', padding='same')(x)
x = MaxPooling2D((2, 2), padding='same')(x)
x = Conv2D(8, (3, 3), activation='relu', padding='same')(x)
encoder = MaxPooling2D((2, 2), padding='same')(x)
x = Conv2D(8, (3, 3), activation='relu', padding='same')(encoder)
x = UpSampling2D((2, 2))(x)
x = Conv2D(8, (3, 3), activation='relu', padding='same')(x)
x = UpSampling2D((2, 2))(x)
x = Conv2D(16, (3, 3), activation='relu', padding='same')(x)
x = UpSampling2D((2, 2))(x)
decoder = Conv2D(3, (3, 3), activation='sigmoid', padding='same')(x)
autoencoder = Model(input_img, decoder)
autoencoder.compile(optimizer='adam', loss=loss_func)
Layer (type) Output Shape Param #
input_3 (InputLayer) [(None, 224, 224, 3)] 0
conv2d_28 (Conv2D) (None, 224, 224, 16) 448
max_pooling2d_12 (MaxPooling (None, 112, 112, 16) 0
conv2d_29 (Conv2D) (None, 112, 112, 8) 1160
max_pooling2d_13 (MaxPooling (None, 56, 56, 8) 0
conv2d_30 (Conv2D) (None, 56, 56, 8) 584
max_pooling2d_14 (MaxPooling (None, 28, 28, 8) 0
conv2d_31 (Conv2D) (None, 28, 28, 8) 584
up_sampling2d_12 (UpSampling (None, 56, 56, 8) 0
conv2d_32 (Conv2D) (None, 56, 56, 8) 584
up_sampling2d_13 (UpSampling (None, 112, 112, 8) 0
conv2d_33 (Conv2D) (None, 112, 112, 16) 1168
up_sampling2d_14 (UpSampling (None, 224, 224, 16) 0
conv2d_34 (Conv2D) (None, 224, 224, 3) 435
Total params: 4,963
Trainable params: 4,963
Non-trainable params: 0
Layer (type) Output Shape Param #
conv2d_21 (Conv2D) (None, 224, 224, 16) 448
max_pooling2d_9 (MaxPooling2 (None, 112, 112, 16) 0
conv2d_22 (Conv2D) (None, 112, 112, 8) 1160
max_pooling2d_10 (MaxPooling (None, 56, 56, 8) 0
conv2d_23 (Conv2D) (None, 56, 56, 8) 584
max_pooling2d_11 (MaxPooling (None, 28, 28, 8) 0
conv2d_24 (Conv2D) (None, 28, 28, 8) 584
up_sampling2d_9 (UpSampling2 (None, 56, 56, 8) 0
conv2d_25 (Conv2D) (None, 56, 56, 8) 584
up_sampling2d_10 (UpSampling (None, 112, 112, 8) 0
conv2d_26 (Conv2D) (None, 112, 112, 16) 1168
up_sampling2d_11 (UpSampling (None, 224, 224, 16) 0
conv2d_27 (Conv2D) (None, 224, 224, 3) 435
Total params: 4,963
Trainable params: 4,963
Non-trainable params: 0
Second model (the model I want to build the same as the first one, in a different way):
autoencoder = Sequential()
autoencoder.compile(optimizer='adam', loss=loss_func)
You should set a random seed using tensorflow.set_random_seed(0) (tensorflow.random.set_seed(0) in TensorFlow 2) and numpy.random.seed(0). The seed can be any int or 1D array_like, and should be set once in your code.
Also make sure that shuffling is disabled (shuffle=False).
After that, the random weight/parameter initialization and the data ordering will be reproducible across consecutive experiments and models.
There may still be some randomness that leads to different results after running the model. It can come from other libraries that use other randomness modules. (eg.: does not give reproducible results)
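A minimal sketch of such a seeding helper is below. The name set_global_seeds is made up for this example, and the NumPy and TensorFlow parts are skipped if those libraries are not installed. It only covers these three random number generators, so it does not guarantee bit-identical training runs.

```python
import random


def set_global_seeds(seed=0):
    """Seed the random number generators that weight init and shuffling may use.

    Hypothetical helper for illustration; call it once, before building the model.
    """
    random.seed(seed)  # Python's built-in RNG

    try:
        import numpy as np
        np.random.seed(seed)  # NumPy's global RNG
    except ImportError:
        pass  # NumPy not installed, nothing to seed

    try:
        import tensorflow as tf
        if hasattr(tf.random, "set_seed"):
            tf.random.set_seed(seed)   # TensorFlow 2.x
        else:
            tf.set_random_seed(seed)   # TensorFlow 1.x
    except ImportError:
        pass  # TensorFlow not installed, nothing to seed


# Usage: seed before building EACH model, so both start from the same state.
set_global_seeds(0)
# ... build and train the functional model here ...
set_global_seeds(0)
# ... build and train the Sequential model here ...
```

Re-seeding right before each model is built matters here: if the seed is set only once at the top of the script, the second model draws its initial weights from a later point in the random stream and still starts out different from the first.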

Conv-autoencoder whose val_loss doesn't decrease

I built an anomaly detection model using a conv-autoencoder on the UCSD_ped2 dataset. What puzzles me is that after very few epochs the val_loss stops decreasing; it seems the model can't learn any further. I have done some research to improve my model, but it isn't getting better. What should I do to fix it?
Here's my model's structure:
input_img = Input(shape = (x, y, inChannel))
bn1= BatchNormalization(axis=-1, momentum=0.99, epsilon=0.001)(input_img)
conv1 = Conv2D(256, (11, 11), strides=(4,4),activation='relu',
pool1 = MaxPooling2D(pool_size=(2, 2))(conv1)
bn2= BatchNormalization(axis=-1, momentum=0.99, epsilon=0.001)(pool1)
conv2 = Conv2D(128, (5, 5),activation='relu',
pool2 = MaxPooling2D(pool_size=(2, 2))(conv2)
bn3= BatchNormalization(axis=-1, momentum=0.99, epsilon=0.001)(pool2)
conv3 = Conv2D(64, (3, 3), activation='relu',
ubn3=BatchNormalization(axis=-1, momentum=0.99, epsilon=0.001)(conv3)
uconv3=Conv2DTranspose(128, (3,3),activation='relu',
upool3=UpSampling2D(size=(2, 2))(uconv3)
ubn2=BatchNormalization(axis=-1, momentum=0.99, epsilon=0.001)(upool3)
uconv2=Conv2DTranspose(256, (3, 3),activation='relu',
upool2=UpSampling2D(size=(2, 2))(uconv2)
ubn1=BatchNormalization(axis=-1, momentum=0.99, epsilon=0.001)(upool2)
decoded = Conv2DTranspose(1, (11, 11), strides=(4, 4),
activation='sigmoid', padding='same')(ubn1)
autoencoder = Model(input_img, decoded)
autoencoder.compile(loss = 'mean_squared_error', optimizer ='Adadelta',metrics=['accuracy']), Y_train,validation_split=0.3,
batch_size = batch_size, epochs = epochs, verbose = 0,
Layer (type) Output Shape Param #
input_1 (InputLayer) (None, 144, 224, 1) 0
batch_normalization_1 (Batch (None, 144, 224, 1) 4
conv2d_1 (Conv2D) (None, 36, 56, 256) 31232
max_pooling2d_1 (MaxPooling2 (None, 18, 28, 256) 0
batch_normalization_2 (Batch (None, 18, 28, 256) 1024
conv2d_2 (Conv2D) (None, 18, 28, 128) 819328
max_pooling2d_2 (MaxPooling2 (None, 9, 14, 128) 0
batch_normalization_3 (Batch (None, 9, 14, 128) 512
conv2d_3 (Conv2D) (None, 9, 14, 64) 73792
batch_normalization_4 (Batch (None, 9, 14, 64) 256
conv2d_transpose_1 (Conv2DTr (None, 9, 14, 128) 73856
up_sampling2d_1 (UpSampling2 (None, 18, 28, 128) 0
batch_normalization_5 (Batch (None, 18, 28, 128) 512
conv2d_transpose_2 (Conv2DTr (None, 18, 28, 256) 295168
up_sampling2d_2 (UpSampling2 (None, 36, 56, 256) 0
batch_normalization_6 (Batch (None, 36, 56, 256) 1024
conv2d_transpose_3 (Conv2DTr (None, 144, 224, 1) 30977
Total params: 1,327,685
Trainable params: 1,326,019
Non-trainable params: 1,666
The batch size is 30 and the number of epochs is 100. The training data has 1785 images; the validation data has 765 images.
I have tried:
deleting kernel_regularizer;
adding ReduceLROnPlateau;
but it only improved a little.
Epoch 00043: ReduceLROnPlateau reducing learning rate to 9.99999874573554e-12.
Epoch 00044: val_loss did not improve from 0.00240
Epoch 00045: val_loss did not improve from 0.00240
Once the val_loss reached 0.00240, it stopped decreasing...
The following figure shows the loss per epoch.
The following figure shows the model's reconstruction results, which are really poor. How can I make my model work better?
Based on your screenshot, it does not seem to be an issue of overfitting or underfitting.
As I understand it:
Underfitting – Validation and training error high
Overfitting – Validation error is high, training error low
Good fit – Validation error low, slightly higher than the training error
Generally speaking, the dataset should be split properly into training and validation sets.
Typically the training set should be about 4 times the size of the validation set (an 80/20 split).
My suggestion is to increase the size of your dataset with data augmentation and continue the training.
Kindly refer to the documentation for data augmentation.
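As a rough sketch of the 80/20 split mentioned above, assuming the 1785 + 765 = 2550 images from the question are pooled and indexed 0..2549. The function name split_indices and the fixed seed are illustrative, not something from the original post.

```python
import random


def split_indices(n_samples, val_fraction=0.2, seed=0):
    """Return (train_indices, val_indices) for a shuffled split.

    Illustrative helper: shuffles the sample indices with a fixed seed and
    takes the first val_fraction of them as the validation set.
    """
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    n_val = int(round(n_samples * val_fraction))
    return indices[n_val:], indices[:n_val]


n_total = 1785 + 765  # image counts taken from the question
train_idx, val_idx = split_indices(n_total, val_fraction=0.2)
print(len(train_idx), "training images,", len(val_idx), "validation images")
```

With Keras itself, passing validation_split=0.2 to fit should give the same proportions; note that it takes the last 20% of the samples as they are ordered, without shuffling them first.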

How to fix the output shape in Keras 2.1.0

I get a dense layer shape error with Keras Version 2.1.0. This problem only happens with this version of Keras (2.1.0). I am in no position to upgrade the version since it's on a cluster so I am trying to find a fix for the time being. My model is defined as below.
model = Sequential()
model.add(Conv2D(32, kernel_size=(3, 3),
input_shape=(32, 32, 3)))
model.add(Conv2D(64, (3, 3), activation='relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Dense(128, activation='relu'))
model.add(Dense(10, activation='softmax'))
I have done one hot encoding as shown below.
y_train = keras.utils.to_categorical(y_train, 10)
y_test = keras.utils.to_categorical(y_test, 10)
The model summary is
Layer (type) Output Shape Param #
conv2d_1 (Conv2D) (None, 30, 30, 32) 896
conv2d_2 (Conv2D) (None, 28, 28, 64) 18496
max_pooling2d_1 (MaxPooling2 (None, 14, 14, 64) 0
dropout_1 (Dropout) (None, 14, 14, 64) 0
flatten_1 (Flatten) (None, 12544) 0
dense_1 (Dense) (None, 128) 1605760
dropout_2 (Dropout) (None, 128) 0
dense_2 (Dense) (None, 10) 1290
Total params: 1,626,442
Trainable params: 1,626,442
Non-trainable params: 0
The error I get is :
ValueError: Error when checking target: expected dense_2 to have 2
dimensions, but got array with shape (50000, 1, 10)
The exact same code works perfectly in Keras 2.2.4
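The shape (50000, 1, 10) in the error suggests that the labels had shape (50000, 1) before encoding and that the extra axis was carried through into the one-hot array. One possible workaround, assuming the labels are an integer array of shape (n_samples, 1), is to flatten them before encoding. The sketch below does the encoding by hand with NumPy so that it does not depend on the Keras version; one_hot is a made-up helper name and I have not run this against Keras 2.1.0.

```python
import numpy as np


def one_hot(labels, num_classes):
    """One-hot encode integer labels into shape (n_samples, num_classes).

    Hypothetical stand-in for to_categorical that always drops extra axes.
    """
    flat = np.asarray(labels, dtype="int64").reshape(-1)  # (n, 1) -> (n,)
    encoded = np.zeros((flat.shape[0], num_classes), dtype="float32")
    encoded[np.arange(flat.shape[0]), flat] = 1.0  # set one column per row
    return encoded


# Small stand-in for labels shaped (n_samples, 1), as in the question.
y = np.array([[3], [0], [9]])
y_encoded = one_hot(y, 10)
print(y_encoded.shape)  # two dimensions, which is what dense_2 expects
```

If you would rather keep using the library function, flattening first, as in keras.utils.to_categorical(y_train.reshape(-1), 10), would presumably have the same effect.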

maxpooling results not displaying in model.summary() output

I am a beginner in Keras and am trying to build a model using the Sequential API. When I try to reduce the input size from 28 to 14 or less with a max-pooling layer, the max-pooling result doesn't show up in the model.summary() output. I am trying to reach an accuracy of 0.99 or above after training, i.e. model.evaluate() should report an accuracy of 0.99 or above. The model I have built so far is shown here:
from keras.layers import Activation, MaxPooling2D
model = Sequential()
model.add(Convolution2D(32, 3, 3, activation='relu', input_shape=(28,28,1)))
model.add(Convolution2D(32, 1, activation='relu'))
MaxPooling2D(pool_size=(2, 2))
model.add(Convolution2D(32, 26))
model.add(Convolution2D(10, 1))
Output -
Layer (type) Output Shape Param #
conv2d_29 (Conv2D) (None, 26, 26, 32) 320
conv2d_30 (Conv2D) (None, 26, 26, 32) 1056
conv2d_31 (Conv2D) (None, 1, 1, 32) 692256
conv2d_32 (Conv2D) (None, 1, 1, 10) 330
flatten_7 (Flatten) (None, 10) 0
activation_7 (Activation) (None, 10) 0
Total params: 693,962
Trainable params: 693,962
Non-trainable params: 0
The batch size I am using is 32 and the number of epochs is 10.
metrics=['accuracy']), Y_train, batch_size=32, nb_epoch=10, verbose=1)
score = model.evaluate(X_test, Y_test, verbose=0)
Output after training -
[0.09016687796734459, 0.9814]
You are not adding the MaxPooling2D layer to your model...
model.add(MaxPooling2D(pool_size=(2, 2)))
Also, the output of your max pooling will have shape (None, 13, 13, 32), so the convolution kernel in the next layer (26 in your case) can't be larger than the current spatial dimensions (13). Your code should be something like this:
from keras.layers import Activation, MaxPooling2D, Dense
model = Sequential()
model.add(Convolution2D(32, 3, 3, activation='relu', input_shape=(28,28,1)))
model.add(Convolution2D(32, 1, activation='relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Convolution2D(32, 8))
model.add(Convolution2D(10, 6))
Layer (type) Output Shape Param #
conv2d_1 (Conv2D) (None, 26, 26, 32) 320
conv2d_2 (Conv2D) (None, 26, 26, 32) 1056
max_pooling2d_1 (MaxPooling2 (None, 13, 13, 32) 0
conv2d_3 (Conv2D) (None, 6, 6, 32) 65568
conv2d_4 (Conv2D) (None, 1, 1, 10) 11530
flatten_1 (Flatten) (None, 10) 0
activation_1 (Activation) (None, 10) 0
Total params: 78,474
Trainable params: 78,474
Non-trainable params: 0
P.S.: I would consider using smaller kernel sizes and a FC layer at the output, as it is a more practical solution in most cases than trying to match convolution output shapes
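The shapes in the corrected summary can be checked by hand with the usual arithmetic for unpadded ("valid") layers. This is a small sketch with made-up helper names (conv_out, pool_out); raising an error for an oversized kernel is my own choice for the helper and only mirrors the problem described above.

```python
def conv_out(size, kernel, stride=1):
    """Output length along one axis of a 'valid' (unpadded) convolution."""
    if kernel > size:
        raise ValueError("kernel %d is larger than the input size %d" % (kernel, size))
    return (size - kernel) // stride + 1


def pool_out(size, pool):
    """Output length along one axis of non-overlapping max pooling."""
    return size // pool


size = 28
size = conv_out(size, 3)   # conv2d_1:        28 -> 26
size = conv_out(size, 1)   # conv2d_2:        26 -> 26
size = pool_out(size, 2)   # max_pooling2d_1: 26 -> 13
size = conv_out(size, 8)   # conv2d_3:        13 -> 6
size = conv_out(size, 6)   # conv2d_4:         6 -> 1
print("final spatial size:", size)

# The original kernel of 26 no longer fits once the pooling layer is added.
try:
    conv_out(13, 26)
except ValueError as err:
    print("error:", err)
```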

Improving accuracy of my CNN for pixel wise segmentation

I am trying to design a CNN that can do pixel wise segmentation of cell images. Such as these:
With segmentation masks such as this (except more than one segmentation mask for each raw image, eg: interior of cell, border of cell, background):
I have mostly copied the U-net design from here:
However, even with 10 annotated images (over 300 cells), I still get quite bad dice coefficient scores and not great predictions. According to the U-Net paper, this number of annotated cells should be sufficient for a good prediction.
This is the code for the model I am using.
def get_unet():
    inputs = Input((img_rows, img_cols, 1))
    conv1 = Conv2D(16, window_size, activation='relu', padding='same')(inputs)
    conv1 = Conv2D(16, window_size, activation='relu', padding='same')(conv1)
    pool1 = MaxPooling2D(pool_size=(2, 2))(conv1)
    conv2 = Conv2D(64, window_size, activation='relu', padding='same')(pool1)
    conv2 = Conv2D(64, window_size, activation='relu', padding='same')(conv2)
    pool2 = MaxPooling2D(pool_size=(2, 2))(conv2)
    conv3 = Conv2D(128, window_size, activation='relu', padding='same')(pool2)
    conv3 = Conv2D(128, window_size, activation='relu', padding='same')(conv3)
    pool3 = MaxPooling2D(pool_size=(2, 2))(conv3)
    conv4 = Conv2D(128, window_size, activation='relu', padding='same')(pool3)
    conv4 = Conv2D(128, window_size, activation='relu', padding='same')(conv4)
    pool4 = MaxPooling2D(pool_size=(2, 2))(conv4)
    conv5 = Conv2D(512, window_size, activation='relu', padding='same')(pool4)
    conv5 = Conv2D(512, window_size, activation='relu', padding='same')(conv5)
    up6 = concatenate([Conv2DTranspose(512, (2, 2), strides=(2, 2), padding='same')(conv5), conv4], axis=3)
    conv6 = Conv2D(128, window_size, activation='relu', padding='same')(up6)
    conv6 = Conv2D(128, window_size, activation='relu', padding='same')(conv6)
    up7 = concatenate([Conv2DTranspose(128, (2, 2), strides=(2, 2), padding='same')(conv6), conv3], axis=3)
    conv7 = Conv2D(128, window_size, activation='relu', padding='same')(up7)
    conv7 = Conv2D(128, window_size, activation='relu', padding='same')(conv7)
    up8 = concatenate([Conv2DTranspose(128, (2, 2), strides=(2, 2), padding='same')(conv7), conv2], axis=3)
    conv8 = Conv2D(64, window_size, activation='relu', padding='same')(up8)
    conv8 = Conv2D(64, window_size, activation='relu', padding='same')(conv8)
    up9 = concatenate([Conv2DTranspose(64, (2, 2), strides=(2, 2), padding='same')(conv8), conv1], axis=3)
    conv9 = Conv2D(16, window_size, activation='relu', padding='same')(up9)
    conv9 = Conv2D(16, window_size, activation='relu', padding='same')(conv9)
    conv10 = Conv2D(f_num, (1, 1), activation='softmax')(conv9) # change to N,(1,1) for more classes and softmax
    model = Model(inputs=[inputs], outputs=[conv10])
    model.compile(optimizer=Adam(lr=1e-5), loss=dice_coef_loss, metrics=[dice_coef])
    return model
I have tried many different hyper-parameters for the model all with no success. Dice scores hover around 0.25 and my loss barely decreases between epochs.
I feel I am doing something fundamentally wrong here. Any suggestions?
EDIT: Sigmoid activation improves the dice score from 0.25 to 0.33 (however, the first epoch already reaches this score, and subsequent epochs improve it only very slightly, from 0.33 to 0.331 etc.)
dice_coef_loss is defined as below
smooth = 1.
def dice_coef(y_true, y_pred):
    y_true_f = K.flatten(y_true)
    y_pred_f = K.flatten(y_pred)
    intersection = K.sum(y_true_f * y_pred_f)
    return (2. * intersection + smooth) / (K.sum(y_true_f) + K.sum(y_pred_f) + smooth)
def dice_coef_loss(y_true, y_pred):
    return -dice_coef(y_true, y_pred)
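For reference, the same formula can be evaluated on plain Python lists to see what values it produces. This is only a sketch for sanity-checking the metric by hand; dice_coef_py and the toy masks are made up for the example and are not part of the training code.

```python
SMOOTH = 1.0  # same smoothing constant as in the Keras version above


def dice_coef_py(y_true, y_pred, smooth=SMOOTH):
    """Dice coefficient on flat lists, mirroring the Keras definition."""
    intersection = sum(t * p for t, p in zip(y_true, y_pred))
    return (2.0 * intersection + smooth) / (sum(y_true) + sum(y_pred) + smooth)


mask = [1, 1, 0, 0]                            # toy ground-truth mask
perfect = dice_coef_py(mask, [1, 1, 0, 0])     # (2*2 + 1) / (2 + 2 + 1)
partial = dice_coef_py(mask, [1, 0, 0, 0])     # (2*1 + 1) / (2 + 1 + 1)
all_zero = dice_coef_py(mask, [0, 0, 0, 0])    # (0 + 1) / (2 + 0 + 1)
print(perfect, partial, all_zero)
```

Because of the smoothing term, a prediction of all zeros does not score 0, so a low but non-zero dice score on its own does not show that the network has learned anything.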
Also in case it's useful the model.summary output:
Layer (type) Output Shape Param #
input_2 (InputLayer) (None, 64, 64, 1) 0
conv2d_20 (Conv2D) (None, 64, 64, 16) 32
conv2d_21 (Conv2D) (None, 64, 64, 16) 272
max_pooling2d_5 (MaxPooling2 (None, 32, 32, 16) 0
conv2d_22 (Conv2D) (None, 32, 32, 64) 1088
conv2d_23 (Conv2D) (None, 32, 32, 64) 4160
max_pooling2d_6 (MaxPooling2 (None, 16, 16, 64) 0
conv2d_24 (Conv2D) (None, 16, 16, 128) 8320
conv2d_25 (Conv2D) (None, 16, 16, 128) 16512
max_pooling2d_7 (MaxPooling2 (None, 8, 8, 128) 0
conv2d_26 (Conv2D) (None, 8, 8, 128) 16512
conv2d_27 (Conv2D) (None, 8, 8, 128) 16512
max_pooling2d_8 (MaxPooling2 (None, 4, 4, 128) 0
conv2d_28 (Conv2D) (None, 4, 4, 512) 66048
conv2d_29 (Conv2D) (None, 4, 4, 512) 262656
conv2d_transpose_5 (Conv2DTr (None, 8, 8, 512) 1049088
concatenate_5 (Concatenate) (None, 8, 8, 640) 0
conv2d_30 (Conv2D) (None, 8, 8, 128) 82048
conv2d_31 (Conv2D) (None, 8, 8, 128) 16512
conv2d_transpose_6 (Conv2DTr (None, 16, 16, 128) 65664
concatenate_6 (Concatenate) (None, 16, 16, 256) 0
conv2d_32 (Conv2D) (None, 16, 16, 128) 32896
conv2d_33 (Conv2D) (None, 16, 16, 128) 16512
conv2d_transpose_7 (Conv2DTr (None, 32, 32, 128) 65664
concatenate_7 (Concatenate) (None, 32, 32, 192) 0
conv2d_34 (Conv2D) (None, 32, 32, 64) 12352
conv2d_35 (Conv2D) (None, 32, 32, 64) 4160
conv2d_transpose_8 (Conv2DTr (None, 64, 64, 64) 16448
concatenate_8 (Concatenate) (None, 64, 64, 80) 0
conv2d_36 (Conv2D) (None, 64, 64, 16) 1296
conv2d_37 (Conv2D) (None, 64, 64, 16) 272
conv2d_38 (Conv2D) (None, 64, 64, 4) 68
Total params: 1,755,092.0
Trainable params: 1,755,092.0
Non-trainable params: 0.0
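One thing that can be read off this summary is the kernel size, since a Conv2D layer with bias has kernel_height * kernel_width * in_channels * out_channels + out_channels parameters. The sketch below solves that for the kernel area, using rows copied from the summary. The helper name kernel_area is made up, and it assumes every layer has a bias term.

```python
def kernel_area(params, in_channels, out_channels):
    """Solve params = area * in_channels * out_channels + out_channels for area.

    Assumes the layer has one bias per output channel.
    """
    weights = params - out_channels
    area, remainder = divmod(weights, in_channels * out_channels)
    if remainder:
        raise ValueError("parameter count does not fit a conv layer with bias")
    return area


# (layer name, Param #, input channels, output channels) from the summary above
rows = [
    ("conv2d_20", 32, 1, 16),
    ("conv2d_21", 272, 16, 16),
    ("conv2d_22", 1088, 16, 64),
    ("conv2d_28", 66048, 128, 512),
    ("conv2d_transpose_5", 1049088, 512, 512),
]

areas = [kernel_area(params, c_in, c_out) for _, params, c_in, c_out in rows]
for (name, _, _, _), area in zip(rows, areas):
    print(name, "kernel area:", area)
```

The Conv2D rows come out with a kernel area of 1, which is consistent with window_size being 1x1 (the transposed convolution comes out as 4, matching its 2x2 kernel). If that is the case, the convolutions never look at neighbouring pixels, which might explain part of the poor segmentation; the U-Net paper uses 3x3 convolutions, so that may be worth trying.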