Keras using too much memory - tensorflow

I have a keras (with tensorflow backend) model which is defined like so:
INPUT_SHAPE = [4740, 3540, 1]
model = Sequential()
model.add(Conv2D(32, kernel_size=(3, 3),
                 activation='relu',
                 input_shape=INPUT_SHAPE))
model.add(Conv2D(2, (4, 4), strides=(1, 1), activation='relu'))
model.add(MaxPooling2D(pool_size=(4, 4)))
model.add(Conv2D(4, (4, 4), strides=(1, 1), activation='relu'))
model.add(MaxPooling2D(pool_size=(4, 4)))
model.add(Conv2D(8, (4, 4), strides=(1, 1), activation='relu'))
model.add(MaxPooling2D(pool_size=(4, 4)))
model.add(Conv2D(16, (4, 4), strides=(1, 1), activation='relu'))
model.add(MaxPooling2D(pool_size=(4, 4)))
model.add(Conv2D(32, (4, 4), strides=(1, 1), activation='relu'))
model.add(MaxPooling2D(pool_size=(4, 4)))
model.add(Dropout(0.25))
model.add(Flatten())
model.add(Dense(128, activation='relu'))
model.add(Dropout(0.5))
model.add(Dense(num_classes, activation='softmax'))
model.compile(loss=keras.losses.categorical_crossentropy,
              optimizer=keras.optimizers.Adadelta(),
              metrics=['accuracy'])
This model has only 37,506 trainable params, yet somehow it manages to exhaust the K80's 12 GB of VRAM on model.fit() if the batch size is more than 1.
Why does this model need so much memory?
And how do I calculate memory requirements properly?
The function from How to determine needed memory of Keras model? gives me 2.15 GB per element in a batch, so I should at least be able to fit a batch of 5.
EDIT: model.summary()
Layer (type) Output Shape Param #
=================================================================
conv2d_1 (Conv2D) (None, 4738, 3538, 32) 320
_________________________________________________________________
conv2d_2 (Conv2D) (None, 4735, 3535, 2) 1026
_________________________________________________________________
max_pooling2d_1 (MaxPooling2 (None, 1183, 883, 2) 0
_________________________________________________________________
conv2d_3 (Conv2D) (None, 1180, 880, 4) 132
_________________________________________________________________
max_pooling2d_2 (MaxPooling2 (None, 295, 220, 4) 0
_________________________________________________________________
conv2d_4 (Conv2D) (None, 292, 217, 8) 520
_________________________________________________________________
max_pooling2d_3 (MaxPooling2 (None, 73, 54, 8) 0
_________________________________________________________________
conv2d_5 (Conv2D) (None, 70, 51, 16) 2064
_________________________________________________________________
max_pooling2d_4 (MaxPooling2 (None, 17, 12, 16) 0
_________________________________________________________________
conv2d_6 (Conv2D) (None, 14, 9, 32) 8224
_________________________________________________________________
max_pooling2d_5 (MaxPooling2 (None, 3, 2, 32) 0
_________________________________________________________________
dropout_1 (Dropout) (None, 3, 2, 32) 0
_________________________________________________________________
flatten_1 (Flatten) (None, 192) 0
_________________________________________________________________
dense_1 (Dense) (None, 128) 24704
_________________________________________________________________
dropout_2 (Dropout) (None, 128) 0
_________________________________________________________________
dense_2 (Dense) (None, 4) 516
=================================================================
Total params: 37,506
Trainable params: 37,506
Non-trainable params: 0
_________________________________________________________________

The output shape of the first layer is B*4738*3538*32 (B is the batch size), which in float32 already takes roughly 2 GB * B of memory. The gradients and the other layers' activations need memory on top of that. Increasing the stride of the first layer (or pooling earlier) may help.
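As a rough sanity check you can sum the activation sizes reported by model.summary() yourself. Below is a minimal sketch (my own helper, not part of Keras) that uses the model defined in the question, assumes float32 activations, and ignores gradients, optimizer state, and cuDNN workspace, all of which add more on top:

import numpy as np

def estimate_activation_bytes(model, batch_size=1, bytes_per_element=4):
    # Sum the number of elements in every layer's output tensor
    total_elements = 0
    for layer in model.layers:
        shape = layer.output_shape  # e.g. (None, 4738, 3538, 32)
        total_elements += np.prod([d for d in shape if d is not None])
    return total_elements * batch_size * bytes_per_element

print(estimate_activation_bytes(model, batch_size=1) / 1024**3, 'GiB')

For this model the first Conv2D output alone contributes 4738 * 3538 * 32 * 4 bytes, roughly 2 GiB per sample, and backpropagation needs additional buffers for the gradients of those activations.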

Related

Why building same model in 2 different ways give different outputs?

I'm having a really weird problem.
I'm building the same model in 2 different ways.
I checked the summaries (number of parameters) and plotted the 2 models, and I see no difference.
Yet the models give different predictions (after training them on the same dataset).
What is the difference between the models? (I can't figure it out.)
How can I update the second model to be the same as the first model?
first model (the "source" model):
input_img = Input(shape=(dim_x, dim_y, dim_z))
x = Conv2D(16, (3, 3), activation='relu', padding='same')(input_img)
x = MaxPooling2D((2, 2), padding='same')(x)
x = Conv2D(8, (3, 3), activation='relu', padding='same')(x)
x = MaxPooling2D((2, 2), padding='same')(x)
x = Conv2D(8, (3, 3), activation='relu', padding='same')(x)
encoder = MaxPooling2D((2, 2), padding='same')(x)
x = Conv2D(8, (3, 3), activation='relu', padding='same')(encoder)
x = UpSampling2D((2, 2))(x)
x = Conv2D(8, (3, 3), activation='relu', padding='same')(x)
x = UpSampling2D((2, 2))(x)
x = Conv2D(16, (3, 3), activation='relu', padding='same')(x)
x = UpSampling2D((2, 2))(x)
decoder = Conv2D(3, (3, 3), activation='sigmoid', padding='same')(x)
autoencoder = Model(input_img, decoder)
autoencoder.compile(optimizer='adam', loss=loss_func)
summary:
Layer (type) Output Shape Param #
=================================================================
conv2d_21 (Conv2D) (None, 224, 224, 16) 448
_________________________________________________________________
max_pooling2d_9 (MaxPooling2 (None, 112, 112, 16) 0
_________________________________________________________________
conv2d_22 (Conv2D) (None, 112, 112, 8) 1160
_________________________________________________________________
max_pooling2d_10 (MaxPooling (None, 56, 56, 8) 0
_________________________________________________________________
conv2d_23 (Conv2D) (None, 56, 56, 8) 584
_________________________________________________________________
max_pooling2d_11 (MaxPooling (None, 28, 28, 8) 0
_________________________________________________________________
conv2d_24 (Conv2D) (None, 28, 28, 8) 584
_________________________________________________________________
up_sampling2d_9 (UpSampling2 (None, 56, 56, 8) 0
_________________________________________________________________
conv2d_25 (Conv2D) (None, 56, 56, 8) 584
_________________________________________________________________
up_sampling2d_10 (UpSampling (None, 112, 112, 8) 0
_________________________________________________________________
conv2d_26 (Conv2D) (None, 112, 112, 16) 1168
_________________________________________________________________
up_sampling2d_11 (UpSampling (None, 224, 224, 16) 0
_________________________________________________________________
conv2d_27 (Conv2D) (None, 224, 224, 3) 435
=================================================================
Total params: 4,963
Trainable params: 4,963
Non-trainable params: 0
Second model (the model I want to build, in a different way, to match the first):
autoencoder = Sequential()
autoencoder.add(el1)
autoencoder.add(el2)
autoencoder.add(el3)
autoencoder.add(el4)
autoencoder.add(el5)
autoencoder.add(el6)
autoencoder.add(dl1)
autoencoder.add(dl2)
autoencoder.add(dl3)
autoencoder.add(dl4)
autoencoder.add(dl5)
autoencoder.add(dl6)
autoencoder.add(output_layer)
autoencoder.compile(optimizer='adam', loss=loss_func)
summary:
Layer (type) Output Shape Param #
=================================================================
input_3 (InputLayer) [(None, 224, 224, 3)] 0
_________________________________________________________________
conv2d_28 (Conv2D) (None, 224, 224, 16) 448
_________________________________________________________________
max_pooling2d_12 (MaxPooling (None, 112, 112, 16) 0
_________________________________________________________________
conv2d_29 (Conv2D) (None, 112, 112, 8) 1160
_________________________________________________________________
max_pooling2d_13 (MaxPooling (None, 56, 56, 8) 0
_________________________________________________________________
conv2d_30 (Conv2D) (None, 56, 56, 8) 584
_________________________________________________________________
max_pooling2d_14 (MaxPooling (None, 28, 28, 8) 0
_________________________________________________________________
conv2d_31 (Conv2D) (None, 28, 28, 8) 584
_________________________________________________________________
up_sampling2d_12 (UpSampling (None, 56, 56, 8) 0
_________________________________________________________________
conv2d_32 (Conv2D) (None, 56, 56, 8) 584
_________________________________________________________________
up_sampling2d_13 (UpSampling (None, 112, 112, 8) 0
_________________________________________________________________
conv2d_33 (Conv2D) (None, 112, 112, 16) 1168
_________________________________________________________________
up_sampling2d_14 (UpSampling (None, 224, 224, 16) 0
_________________________________________________________________
conv2d_34 (Conv2D) (None, 224, 224, 3) 435
=================================================================
Total params: 4,963
Trainable params: 4,963
Non-trainable params: 0
You should set a random seed using tensorflow.set_random_seed(0) and numpy.random.seed(0). The seed can be any int or 1D array_like, and should be set once in your code.
Also make sure that shuffling is disabled: model.fit(data, shuffle=False).
After that, the random weight initialization and the data ordering will be reproducible across consecutive experiments and models.
There may still be some residual randomness that leads to different results between runs; it can come from other libraries that use their own random modules (e.g. mnist_cnn.py does not give reproducible results).
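A minimal sketch of that setup with the TF1-style API (the seed value 0 is arbitrary, and model/x_train/y_train stand in for your own objects):

import numpy as np
import tensorflow as tf

np.random.seed(0)        # seeds NumPy-based initializers and any NumPy shuffling
tf.set_random_seed(0)    # seeds TensorFlow's graph-level random ops

# ... build and compile the model as usual ...

model.fit(x_train, y_train, epochs=10, shuffle=False)  # keep the data order fixed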

conv-autoencoder that val_loss doesn't decrease

I built an anomaly detection model using a convolutional autoencoder on the UCSD_ped2 dataset. What puzzles me is that after very few epochs the val_loss doesn't decrease; it seems the model can't learn any further. I have done some research on improving my model, but it isn't getting better. What should I do to fix it?
Here's my model's structure:
x=144;y=224
input_img = Input(shape = (x, y, inChannel))
bn1 = BatchNormalization(axis=-1, momentum=0.99, epsilon=0.001)(input_img)
conv1 = Conv2D(256, (11, 11), strides=(4, 4), activation='relu',
               kernel_regularizer=regularizers.l2(0.0005),
               kernel_initializer=initializers.glorot_normal(seed=None),
               padding='same')(bn1)
pool1 = MaxPooling2D(pool_size=(2, 2))(conv1)
bn2 = BatchNormalization(axis=-1, momentum=0.99, epsilon=0.001)(pool1)
conv2 = Conv2D(128, (5, 5), activation='relu',
               kernel_regularizer=regularizers.l2(0.0005),
               kernel_initializer=initializers.glorot_normal(seed=None),
               padding='same')(bn2)
pool2 = MaxPooling2D(pool_size=(2, 2))(conv2)
bn3 = BatchNormalization(axis=-1, momentum=0.99, epsilon=0.001)(pool2)
conv3 = Conv2D(64, (3, 3), activation='relu',
               kernel_regularizer=regularizers.l2(0.0005),
               kernel_initializer=initializers.glorot_normal(seed=None),
               padding='same')(bn3)
ubn3 = BatchNormalization(axis=-1, momentum=0.99, epsilon=0.001)(conv3)
uconv3 = Conv2DTranspose(128, (3, 3), activation='relu',
                         kernel_regularizer=regularizers.l2(0.0005),
                         kernel_initializer=initializers.glorot_normal(seed=None),
                         padding='same')(ubn3)
upool3 = UpSampling2D(size=(2, 2))(uconv3)
ubn2 = BatchNormalization(axis=-1, momentum=0.99, epsilon=0.001)(upool3)
uconv2 = Conv2DTranspose(256, (3, 3), activation='relu',
                         kernel_regularizer=regularizers.l2(0.0005),
                         kernel_initializer=initializers.glorot_normal(seed=None),
                         padding='same')(ubn2)
upool2 = UpSampling2D(size=(2, 2))(uconv2)
ubn1 = BatchNormalization(axis=-1, momentum=0.99, epsilon=0.001)(upool2)
decoded = Conv2DTranspose(1, (11, 11), strides=(4, 4),
                          kernel_regularizer=regularizers.l2(0.0005),
                          kernel_initializer=initializers.glorot_normal(seed=None),
                          activation='sigmoid', padding='same')(ubn1)
autoencoder = Model(input_img, decoded)
autoencoder.compile(loss='mean_squared_error', optimizer='Adadelta', metrics=['accuracy'])
history = autoencoder.fit(X_train, Y_train, validation_split=0.3,
                          batch_size=batch_size, epochs=epochs, verbose=0,
                          shuffle=True,
                          callbacks=[earlystopping, checkpointer, reduce_lr])
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
input_1 (InputLayer) (None, 144, 224, 1) 0
_________________________________________________________________
batch_normalization_1 (Batch (None, 144, 224, 1) 4
_________________________________________________________________
conv2d_1 (Conv2D) (None, 36, 56, 256) 31232
_________________________________________________________________
max_pooling2d_1 (MaxPooling2 (None, 18, 28, 256) 0
_________________________________________________________________
batch_normalization_2 (Batch (None, 18, 28, 256) 1024
_________________________________________________________________
conv2d_2 (Conv2D) (None, 18, 28, 128) 819328
_________________________________________________________________
max_pooling2d_2 (MaxPooling2 (None, 9, 14, 128) 0
_________________________________________________________________
batch_normalization_3 (Batch (None, 9, 14, 128) 512
_________________________________________________________________
conv2d_3 (Conv2D) (None, 9, 14, 64) 73792
_________________________________________________________________
batch_normalization_4 (Batch (None, 9, 14, 64) 256
_________________________________________________________________
conv2d_transpose_1 (Conv2DTr (None, 9, 14, 128) 73856
_________________________________________________________________
up_sampling2d_1 (UpSampling2 (None, 18, 28, 128) 0
_________________________________________________________________
batch_normalization_5 (Batch (None, 18, 28, 128) 512
_________________________________________________________________
conv2d_transpose_2 (Conv2DTr (None, 18, 28, 256) 295168
_________________________________________________________________
up_sampling2d_2 (UpSampling2 (None, 36, 56, 256) 0
_________________________________________________________________
batch_normalization_6 (Batch (None, 36, 56, 256) 1024
_________________________________________________________________
conv2d_transpose_3 (Conv2DTr (None, 144, 224, 1) 30977
=================================================================
Total params: 1,327,685
Trainable params: 1,326,019
Non-trainable params: 1,666
The batch size is 30 and the number of epochs is 100. The training set has 1785 pictures; the validation set has 765.
I have tried:
deleting the kernel_regularizer;
adding ReduceLROnPlateau.
But it only gives a small improvement.
Epoch 00043: ReduceLROnPlateau reducing learning rate to 9.99999874573554e-12.
Epoch 00044: val_loss did not improve from 0.00240
Epoch 00045: val_loss did not improve from 0.00240
Once val_loss reached 0.00240, it didn't decrease any further.
The following figure shows the loss per epoch.
The next figure shows the model's reconstruction results, which are truly poor. How can I make my model work better?
Based on your screenshot, it does not seem to be an issue of overfitting or underfitting.
To my understanding:
Underfitting – validation and training error both high
Overfitting – validation error high, training error low
Good fit – validation error low, slightly higher than the training error
Generally speaking, the dataset should be split properly between training and validation.
Typically the training set should be about 4 times the size of your validation set (an 80/20 split).
My suggestion is to enlarge your dataset with data augmentation and continue training; see the sketch below.
Refer to the Keras documentation for data augmentation.
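For example, a minimal sketch with Keras's ImageDataGenerator; the transform ranges are placeholders to tune for your data, and the input batch is reused as the target since this is a reconstruction autoencoder:

from keras.preprocessing.image import ImageDataGenerator

datagen = ImageDataGenerator(rotation_range=10,
                             width_shift_range=0.1,
                             height_shift_range=0.1,
                             horizontal_flip=True)

# yield (input, target) pairs where the target is the augmented input itself
def augmented_pairs(x, batch_size):
    for batch in datagen.flow(x, batch_size=batch_size):
        yield batch, batch

autoencoder.fit_generator(augmented_pairs(X_train, batch_size),
                          steps_per_epoch=len(X_train) // batch_size,
                          epochs=epochs)

Note that for anomaly detection you may want to restrict the augmentation to transforms that still produce "normal" frames.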

How to fix the output shape in Keras 2.1.0

I get a dense-layer shape error with Keras version 2.1.0. This problem only happens with this version of Keras (2.1.0). I am in no position to upgrade, since it's on a cluster, so I am trying to find a fix for the time being. My model is defined as below.
model = Sequential()
model.add(Conv2D(32, kernel_size=(3, 3),
                 activation='relu',
                 input_shape=(32, 32, 3)))
model.add(Conv2D(64, (3, 3), activation='relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Dropout(0.25))
model.add(Flatten())
model.add(Dense(128, activation='relu'))
model.add(Dropout(0.5))
model.add(Dense(10, activation='softmax'))
model.compile(loss='categorical_crossentropy',
              optimizer=config["optimizer"],
              metrics=['accuracy'])
I have done one hot encoding as shown below.
y_train = keras.utils.to_categorical(y_train, 10)
y_test = keras.utils.to_categorical(y_test, 10)
The model summary is
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
conv2d_1 (Conv2D) (None, 30, 30, 32) 896
_________________________________________________________________
conv2d_2 (Conv2D) (None, 28, 28, 64) 18496
_________________________________________________________________
max_pooling2d_1 (MaxPooling2 (None, 14, 14, 64) 0
_________________________________________________________________
dropout_1 (Dropout) (None, 14, 14, 64) 0
_________________________________________________________________
flatten_1 (Flatten) (None, 12544) 0
_________________________________________________________________
dense_1 (Dense) (None, 128) 1605760
_________________________________________________________________
dropout_2 (Dropout) (None, 128) 0
_________________________________________________________________
dense_2 (Dense) (None, 10) 1290
=================================================================
Total params: 1,626,442
Trainable params: 1,626,442
Non-trainable params: 0
_____________________________________
The error I get is:
ValueError: Error when checking target: expected dense_2 to have 2
dimensions, but got array with shape (50000, 1, 10)
The exact same code works perfectly in Keras 2.2.4
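A possible workaround, sketched under the assumption that y_train had shape (50000, 1) before one-hot encoding (an extra axis that this version of to_categorical appears to preserve, giving the (50000, 1, 10) target in the error), is to flatten the labels before encoding or squeeze the result afterwards:

import numpy as np

# flatten the integer labels to shape (50000,) before one-hot encoding...
y_train = keras.utils.to_categorical(y_train.ravel(), 10)
y_test = keras.utils.to_categorical(y_test.ravel(), 10)

# ...or, equivalently, drop the singleton axis afterwards:
# y_train = np.squeeze(y_train, axis=1)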

maxpooling results not displaying in model.summary() output

I am a beginner in Keras. I am trying to build a model using the Sequential API. When I try to reduce the input size from 28 to 14 or less with a max-pooling layer, the max-pooling result doesn't show up in the model.summary() output. I am trying to achieve an accuracy of 0.99 or above after training, i.e. on the call to model.evaluate() the accuracy should be 0.99 or above. The model I have built so far can be seen here:
from keras.layers import Activation, MaxPooling2D
model = Sequential()
model.add(Convolution2D(32, 3, 3, activation='relu', input_shape=(28,28,1)))
model.add(Convolution2D(32, 1, activation='relu'))
MaxPooling2D(pool_size=(2, 2))
model.add(Convolution2D(32, 26))
model.add(Convolution2D(10, 1))
model.add(Flatten())
model.add(Activation('softmax'))
model.summary()
Output -
Layer (type) Output Shape Param #
=================================================================
conv2d_29 (Conv2D) (None, 26, 26, 32) 320
_________________________________________________________________
conv2d_30 (Conv2D) (None, 26, 26, 32) 1056
_________________________________________________________________
conv2d_31 (Conv2D) (None, 1, 1, 32) 692256
_________________________________________________________________
conv2d_32 (Conv2D) (None, 1, 1, 10) 330
_________________________________________________________________
flatten_7 (Flatten) (None, 10) 0
_________________________________________________________________
activation_7 (Activation) (None, 10) 0
=================================================================
Total params: 693,962
Trainable params: 693,962
Non-trainable params: 0
____________________________
Batch size i am using is 32 and number of epoch is 10.
model.compile(loss='categorical_crossentropy',
optimizer='adam',
metrics=['accuracy'])
model.fit(X_train, Y_train, batch_size=32, nb_epoch=10, verbose=1)
score = model.evaluate(X_test, Y_test, verbose=0)
print(score)
Output after training -
[0.09016687796734459, 0.9814]
You are not adding the MaxPooling2D layer to your model:
model.add(MaxPooling2D(pool_size=(2, 2)))
Also, the output of your max pooling will have shape (None, 13, 13, 32); the convolutional kernel size in the next layer (26 in your case) can't be larger than your current spatial dimensions (13). Your code should be something like this:
from keras.layers import Activation, MaxPooling2D, Dense
model = Sequential()
model.add(Convolution2D(32, 3, 3, activation='relu', input_shape=(28,28,1)))
model.add(Convolution2D(32, 1, activation='relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Convolution2D(32, 8))
model.add(Convolution2D(10, 6))
model.add(Flatten())
model.add(Activation('softmax'))
print(model.summary())
Output
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
conv2d_1 (Conv2D) (None, 26, 26, 32) 320
_________________________________________________________________
conv2d_2 (Conv2D) (None, 26, 26, 32) 1056
_________________________________________________________________
max_pooling2d_1 (MaxPooling2 (None, 13, 13, 32) 0
_________________________________________________________________
conv2d_3 (Conv2D) (None, 6, 6, 32) 65568
_________________________________________________________________
conv2d_4 (Conv2D) (None, 1, 1, 10) 11530
_________________________________________________________________
flatten_1 (Flatten) (None, 10) 0
_________________________________________________________________
activation_1 (Activation) (None, 10) 0
=================================================================
Total params: 78,474
Trainable params: 78,474
Non-trainable params: 0
___________________________________
P.S.: I would consider using smaller kernel sizes and an FC layer at the output, as in most cases that is a more practical solution than trying to match convolution output shapes.
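A rough sketch of that idea, reusing the imports above (the layer sizes here are illustrative, not tuned):

model = Sequential()
model.add(Convolution2D(32, 3, 3, activation='relu', input_shape=(28, 28, 1)))
model.add(Convolution2D(32, 3, 3, activation='relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Flatten())
model.add(Dense(128, activation='relu'))
model.add(Dense(10, activation='softmax'))

This keeps all kernels at 3x3 and lets the fully connected layers map to the 10 classes, instead of shrinking the feature map to 1x1 with large convolutions.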

Improving accuracy of my CNN for pixel wise segmentation

I am trying to design a CNN that can do pixel-wise segmentation of cell images, such as these:
with segmentation masks such as this (except with more than one segmentation mask per raw image, e.g. interior of cell, border of cell, background):
I have mostly copied the U-Net design from here: https://lmb.informatik.uni-freiburg.de/people/ronneber/u-net/
However, even with 10 annotated images (over 300 cells) I still get quite bad Dice coefficient scores and poor predictions. According to the U-Net paper, this number of annotated cells should be sufficient for a good prediction.
This is the code for the model I am using.
def get_unet():
    inputs = Input((img_rows, img_cols, 1))
    conv1 = Conv2D(16, window_size, activation='relu', padding='same')(inputs)
    conv1 = Conv2D(16, window_size, activation='relu', padding='same')(conv1)
    pool1 = MaxPooling2D(pool_size=(2, 2))(conv1)
    conv2 = Conv2D(64, window_size, activation='relu', padding='same')(pool1)
    conv2 = Conv2D(64, window_size, activation='relu', padding='same')(conv2)
    pool2 = MaxPooling2D(pool_size=(2, 2))(conv2)
    conv3 = Conv2D(128, window_size, activation='relu', padding='same')(pool2)
    conv3 = Conv2D(128, window_size, activation='relu', padding='same')(conv3)
    pool3 = MaxPooling2D(pool_size=(2, 2))(conv3)
    conv4 = Conv2D(128, window_size, activation='relu', padding='same')(pool3)
    conv4 = Conv2D(128, window_size, activation='relu', padding='same')(conv4)
    pool4 = MaxPooling2D(pool_size=(2, 2))(conv4)
    conv5 = Conv2D(512, window_size, activation='relu', padding='same')(pool4)
    conv5 = Conv2D(512, window_size, activation='relu', padding='same')(conv5)
    up6 = concatenate([Conv2DTranspose(512, (2, 2), strides=(2, 2), padding='same')(conv5), conv4], axis=3)
    conv6 = Conv2D(128, window_size, activation='relu', padding='same')(up6)
    conv6 = Conv2D(128, window_size, activation='relu', padding='same')(conv6)
    up7 = concatenate([Conv2DTranspose(128, (2, 2), strides=(2, 2), padding='same')(conv6), conv3], axis=3)
    conv7 = Conv2D(128, window_size, activation='relu', padding='same')(up7)
    conv7 = Conv2D(128, window_size, activation='relu', padding='same')(conv7)
    up8 = concatenate([Conv2DTranspose(128, (2, 2), strides=(2, 2), padding='same')(conv7), conv2], axis=3)
    conv8 = Conv2D(64, window_size, activation='relu', padding='same')(up8)
    conv8 = Conv2D(64, window_size, activation='relu', padding='same')(conv8)
    up9 = concatenate([Conv2DTranspose(64, (2, 2), strides=(2, 2), padding='same')(conv8), conv1], axis=3)
    conv9 = Conv2D(16, window_size, activation='relu', padding='same')(up9)
    conv9 = Conv2D(16, window_size, activation='relu', padding='same')(conv9)
    conv10 = Conv2D(f_num, (1, 1), activation='softmax')(conv9)  # change to N, (1, 1) for more classes and softmax
    model = Model(inputs=[inputs], outputs=[conv10])
    model.compile(optimizer=Adam(lr=1e-5), loss=dice_coef_loss, metrics=[dice_coef])
    return model
I have tried many different hyper-parameters for the model, all with no success. Dice scores hover around 0.25 and my loss barely decreases between epochs.
I feel I am doing something fundamentally wrong here. Any suggestions?
EDIT: Sigmoid activation improves the Dice score from 0.25 to 0.33 (again, however, the first epoch reaches this score and subsequent epochs only improve very slightly, from 0.33 to 0.331, etc.).
dice_coef_loss is defined as below (K is the Keras backend):
smooth = 1.

def dice_coef(y_true, y_pred):
    y_true_f = K.flatten(y_true)
    y_pred_f = K.flatten(y_pred)
    intersection = K.sum(y_true_f * y_pred_f)
    return (2. * intersection + smooth) / (K.sum(y_true_f) + K.sum(y_pred_f) + smooth)

def dice_coef_loss(y_true, y_pred):
    return -dice_coef(y_true, y_pred)
Also, in case it's useful, here is the model.summary() output:
Layer (type) Output Shape Param #
=================================================================
input_2 (InputLayer) (None, 64, 64, 1) 0
_________________________________________________________________
conv2d_20 (Conv2D) (None, 64, 64, 16) 32
_________________________________________________________________
conv2d_21 (Conv2D) (None, 64, 64, 16) 272
_________________________________________________________________
max_pooling2d_5 (MaxPooling2 (None, 32, 32, 16) 0
_________________________________________________________________
conv2d_22 (Conv2D) (None, 32, 32, 64) 1088
_________________________________________________________________
conv2d_23 (Conv2D) (None, 32, 32, 64) 4160
_________________________________________________________________
max_pooling2d_6 (MaxPooling2 (None, 16, 16, 64) 0
_________________________________________________________________
conv2d_24 (Conv2D) (None, 16, 16, 128) 8320
_________________________________________________________________
conv2d_25 (Conv2D) (None, 16, 16, 128) 16512
_________________________________________________________________
max_pooling2d_7 (MaxPooling2 (None, 8, 8, 128) 0
_________________________________________________________________
conv2d_26 (Conv2D) (None, 8, 8, 128) 16512
_________________________________________________________________
conv2d_27 (Conv2D) (None, 8, 8, 128) 16512
_________________________________________________________________
max_pooling2d_8 (MaxPooling2 (None, 4, 4, 128) 0
_________________________________________________________________
conv2d_28 (Conv2D) (None, 4, 4, 512) 66048
_________________________________________________________________
conv2d_29 (Conv2D) (None, 4, 4, 512) 262656
_________________________________________________________________
conv2d_transpose_5 (Conv2DTr (None, 8, 8, 512) 1049088
_________________________________________________________________
concatenate_5 (Concatenate) (None, 8, 8, 640) 0
_________________________________________________________________
conv2d_30 (Conv2D) (None, 8, 8, 128) 82048
_________________________________________________________________
conv2d_31 (Conv2D) (None, 8, 8, 128) 16512
_________________________________________________________________
conv2d_transpose_6 (Conv2DTr (None, 16, 16, 128) 65664
_________________________________________________________________
concatenate_6 (Concatenate) (None, 16, 16, 256) 0
_________________________________________________________________
conv2d_32 (Conv2D) (None, 16, 16, 128) 32896
_________________________________________________________________
conv2d_33 (Conv2D) (None, 16, 16, 128) 16512
_________________________________________________________________
conv2d_transpose_7 (Conv2DTr (None, 32, 32, 128) 65664
_________________________________________________________________
concatenate_7 (Concatenate) (None, 32, 32, 192) 0
_________________________________________________________________
conv2d_34 (Conv2D) (None, 32, 32, 64) 12352
_________________________________________________________________
conv2d_35 (Conv2D) (None, 32, 32, 64) 4160
_________________________________________________________________
conv2d_transpose_8 (Conv2DTr (None, 64, 64, 64) 16448
_________________________________________________________________
concatenate_8 (Concatenate) (None, 64, 64, 80) 0
_________________________________________________________________
conv2d_36 (Conv2D) (None, 64, 64, 16) 1296
_________________________________________________________________
conv2d_37 (Conv2D) (None, 64, 64, 16) 272
_________________________________________________________________
conv2d_38 (Conv2D) (None, 64, 64, 4) 68
=================================================================
Total params: 1,755,092.0
Trainable params: 1,755,092.0
Non-trainable params: 0.0