Improving accuracy of my CNN for pixel wise segmentation - tensorflow

I am trying to design a CNN that can do pixel-wise segmentation of cell images such as these:
with segmentation masks such as this (except with more than one segmentation mask per raw image, e.g. interior of cell, border of cell, background):
I have mostly copied the U-Net design from here: https://lmb.informatik.uni-freiburg.de/people/ronneber/u-net/
However, even with 10 annotated images (over 300 cells) I still get quite bad Dice coefficient scores and poor predictions. According to the U-Net paper this number of annotated cells should be sufficient for a good prediction.
This is the code for the model I am using.
from keras.models import Model
from keras.layers import Input, Conv2D, MaxPooling2D, Conv2DTranspose, concatenate
from keras.optimizers import Adam

def get_unet():
    inputs = Input((img_rows, img_cols, 1))
    # Contracting path
    conv1 = Conv2D(16, window_size, activation='relu', padding='same')(inputs)
    conv1 = Conv2D(16, window_size, activation='relu', padding='same')(conv1)
    pool1 = MaxPooling2D(pool_size=(2, 2))(conv1)
    conv2 = Conv2D(64, window_size, activation='relu', padding='same')(pool1)
    conv2 = Conv2D(64, window_size, activation='relu', padding='same')(conv2)
    pool2 = MaxPooling2D(pool_size=(2, 2))(conv2)
    conv3 = Conv2D(128, window_size, activation='relu', padding='same')(pool2)
    conv3 = Conv2D(128, window_size, activation='relu', padding='same')(conv3)
    pool3 = MaxPooling2D(pool_size=(2, 2))(conv3)
    conv4 = Conv2D(128, window_size, activation='relu', padding='same')(pool3)
    conv4 = Conv2D(128, window_size, activation='relu', padding='same')(conv4)
    pool4 = MaxPooling2D(pool_size=(2, 2))(conv4)
    # Bottleneck
    conv5 = Conv2D(512, window_size, activation='relu', padding='same')(pool4)
    conv5 = Conv2D(512, window_size, activation='relu', padding='same')(conv5)
    # Expanding path with skip connections
    up6 = concatenate([Conv2DTranspose(512, (2, 2), strides=(2, 2), padding='same')(conv5), conv4], axis=3)
    conv6 = Conv2D(128, window_size, activation='relu', padding='same')(up6)
    conv6 = Conv2D(128, window_size, activation='relu', padding='same')(conv6)
    up7 = concatenate([Conv2DTranspose(128, (2, 2), strides=(2, 2), padding='same')(conv6), conv3], axis=3)
    conv7 = Conv2D(128, window_size, activation='relu', padding='same')(up7)
    conv7 = Conv2D(128, window_size, activation='relu', padding='same')(conv7)
    up8 = concatenate([Conv2DTranspose(128, (2, 2), strides=(2, 2), padding='same')(conv7), conv2], axis=3)
    conv8 = Conv2D(64, window_size, activation='relu', padding='same')(up8)
    conv8 = Conv2D(64, window_size, activation='relu', padding='same')(conv8)
    up9 = concatenate([Conv2DTranspose(64, (2, 2), strides=(2, 2), padding='same')(conv8), conv1], axis=3)
    conv9 = Conv2D(16, window_size, activation='relu', padding='same')(up9)
    conv9 = Conv2D(16, window_size, activation='relu', padding='same')(conv9)
    conv10 = Conv2D(f_num, (1, 1), activation='softmax')(conv9)  # change to N,(1,1) for more classes and softmax
    model = Model(inputs=[inputs], outputs=[conv10])
    model.compile(optimizer=Adam(lr=1e-5), loss=dice_coef_loss, metrics=[dice_coef])
    return model
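For context, a minimal sketch of how this would be instantiated. img_rows, img_cols, window_size, and f_num are globals not shown in the post; the values below are my inference from the model.summary() output further down, and dice_coef_loss / dice_coef (defined later in the post) must already be in scope:

# Assumed globals, inferred from the model.summary() output below — not given in the post:
img_rows, img_cols = 64, 64  # input_2: (None, 64, 64, 1)
window_size = (1, 1)         # conv2d_20 has 32 params = (1*1*1 + 1) * 16
f_num = 4                    # conv2d_38: (None, 64, 64, 4)

model = get_unet()
model.summary()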
I have tried many different hyper-parameters for the model, all with no success. Dice scores hover around 0.25 and my loss barely decreases between epochs.
I feel I am doing something fundamentally wrong here. Any suggestions?
EDIT: Switching to a sigmoid activation improves the Dice score from 0.25 to 0.33. However, the first epoch already reaches this score, and subsequent epochs improve it only very slightly, from 0.33 to 0.331, etc.
dice_coef_loss is defined as below:

from keras import backend as K

smooth = 1.

def dice_coef(y_true, y_pred):
    y_true_f = K.flatten(y_true)
    y_pred_f = K.flatten(y_pred)
    intersection = K.sum(y_true_f * y_pred_f)
    return (2. * intersection + smooth) / (K.sum(y_true_f) + K.sum(y_pred_f) + smooth)

def dice_coef_loss(y_true, y_pred):
    return -dice_coef(y_true, y_pred)
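Note that this flattens all f_num channels together into one global Dice. A common multi-class variant instead computes Dice per class and averages over the class axis; a sketch of that idea (my illustration, not code from the post, assuming y_true is one-hot with the same shape as y_pred):

def dice_coef_multiclass(y_true, y_pred):
    # Dice per class: sum over batch, height and width, keep the class axis.
    axes = (0, 1, 2)
    intersection = K.sum(y_true * y_pred, axis=axes)
    denom = K.sum(y_true, axis=axes) + K.sum(y_pred, axis=axes)
    return K.mean((2. * intersection + smooth) / (denom + smooth))

def dice_coef_multiclass_loss(y_true, y_pred):
    return -dice_coef_multiclass(y_true, y_pred)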
Also, in case it's useful, here is the model.summary() output:
Layer (type) Output Shape Param #
=================================================================
input_2 (InputLayer) (None, 64, 64, 1) 0
_________________________________________________________________
conv2d_20 (Conv2D) (None, 64, 64, 16) 32
_________________________________________________________________
conv2d_21 (Conv2D) (None, 64, 64, 16) 272
_________________________________________________________________
max_pooling2d_5 (MaxPooling2 (None, 32, 32, 16) 0
_________________________________________________________________
conv2d_22 (Conv2D) (None, 32, 32, 64) 1088
_________________________________________________________________
conv2d_23 (Conv2D) (None, 32, 32, 64) 4160
_________________________________________________________________
max_pooling2d_6 (MaxPooling2 (None, 16, 16, 64) 0
_________________________________________________________________
conv2d_24 (Conv2D) (None, 16, 16, 128) 8320
_________________________________________________________________
conv2d_25 (Conv2D) (None, 16, 16, 128) 16512
_________________________________________________________________
max_pooling2d_7 (MaxPooling2 (None, 8, 8, 128) 0
_________________________________________________________________
conv2d_26 (Conv2D) (None, 8, 8, 128) 16512
_________________________________________________________________
conv2d_27 (Conv2D) (None, 8, 8, 128) 16512
_________________________________________________________________
max_pooling2d_8 (MaxPooling2 (None, 4, 4, 128) 0
_________________________________________________________________
conv2d_28 (Conv2D) (None, 4, 4, 512) 66048
_________________________________________________________________
conv2d_29 (Conv2D) (None, 4, 4, 512) 262656
_________________________________________________________________
conv2d_transpose_5 (Conv2DTr (None, 8, 8, 512) 1049088
_________________________________________________________________
concatenate_5 (Concatenate) (None, 8, 8, 640) 0
_________________________________________________________________
conv2d_30 (Conv2D) (None, 8, 8, 128) 82048
_________________________________________________________________
conv2d_31 (Conv2D) (None, 8, 8, 128) 16512
_________________________________________________________________
conv2d_transpose_6 (Conv2DTr (None, 16, 16, 128) 65664
_________________________________________________________________
concatenate_6 (Concatenate) (None, 16, 16, 256) 0
_________________________________________________________________
conv2d_32 (Conv2D) (None, 16, 16, 128) 32896
_________________________________________________________________
conv2d_33 (Conv2D) (None, 16, 16, 128) 16512
_________________________________________________________________
conv2d_transpose_7 (Conv2DTr (None, 32, 32, 128) 65664
_________________________________________________________________
concatenate_7 (Concatenate) (None, 32, 32, 192) 0
_________________________________________________________________
conv2d_34 (Conv2D) (None, 32, 32, 64) 12352
_________________________________________________________________
conv2d_35 (Conv2D) (None, 32, 32, 64) 4160
_________________________________________________________________
conv2d_transpose_8 (Conv2DTr (None, 64, 64, 64) 16448
_________________________________________________________________
concatenate_8 (Concatenate) (None, 64, 64, 80) 0
_________________________________________________________________
conv2d_36 (Conv2D) (None, 64, 64, 16) 1296
_________________________________________________________________
conv2d_37 (Conv2D) (None, 64, 64, 16) 272
_________________________________________________________________
conv2d_38 (Conv2D) (None, 64, 64, 4) 68
=================================================================
Total params: 1,755,092.0
Trainable params: 1,755,092.0
Non-trainable params: 0.0

Related

How to merge 2 trained models in Keras?

Good evening everyone,
I have 5 classes, each with 2000 images. I built 2 models with different model names; this is my model code:
model = tf.keras.Sequential([
    tf.keras.layers.Conv2D(32, (3, 3), activation='relu',
                           input_shape=(150, 150, 3)),
    tf.keras.layers.MaxPooling2D(2, 2),
    tf.keras.layers.Conv2D(32, (3, 3), activation='relu'),
    tf.keras.layers.MaxPooling2D(2, 2),
    tf.keras.layers.Conv2D(64, (3, 3), activation='relu'),
    tf.keras.layers.MaxPooling2D(2, 2),
    tf.keras.layers.Conv2D(128, (3, 3), activation='relu'),
    tf.keras.layers.MaxPooling2D(2, 2),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(5, activation=tf.nn.softmax)
], name="Model1")
model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy', metrics=['accuracy'])
history = model.fit(train_images, train_labels,
                    batch_size=128, epochs=30, validation_split=0.2)
model.save('f3_1st_model_seg.h5')

model = tf.keras.Sequential([
    tf.keras.layers.Conv2D(32, (3, 3), activation='relu',
                           input_shape=(150, 150, 3)),
    tf.keras.layers.MaxPooling2D(2, 2),
    tf.keras.layers.Conv2D(32, (3, 3), activation='relu'),
    tf.keras.layers.MaxPooling2D(2, 2),
    tf.keras.layers.Conv2D(64, (3, 3), activation='relu'),
    tf.keras.layers.MaxPooling2D(2, 2),
    tf.keras.layers.Conv2D(128, (3, 3), activation='relu'),
    tf.keras.layers.MaxPooling2D(2, 2),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(5, activation=tf.nn.softmax)
], name="Model2")
model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy', metrics=['accuracy'])
history = model.fit(train_images, train_labels,
                    batch_size=128, epochs=30, validation_split=0.2)
model.save('f3_2nd_model_seg.h5')
Then I used this code to merge the 2 models:
input_shape = [150, 150, 3]
model = keras.models.load_model('1st_model_seg.h5')
model.summary()
Layer (type)                    Output Shape              Param #
=================================================================
conv2d (Conv2D)                 (None, 148, 148, 32)      896
max_pooling2d (MaxPooling2D)    (None, 74, 74, 32)        0
conv2d_1 (Conv2D)               (None, 72, 72, 32)        9248
max_pooling2d_1 (MaxPooling2D)  (None, 36, 36, 32)        0
conv2d_2 (Conv2D)               (None, 34, 34, 64)        18496
max_pooling2d_2 (MaxPooling2D)  (None, 17, 17, 64)        0
conv2d_3 (Conv2D)               (None, 15, 15, 128)       73856
max_pooling2d_3 (MaxPooling2D)  (None, 7, 7, 128)         0
flatten (Flatten)               (None, 6272)              0
dense (Dense)                   (None, 5)                 31365
=================================================================
Total params: 133,861
Trainable params: 133,861
Non-trainable params: 0
model2 = keras.models.load_model('2nd_model_seg.h5')
model2.summary()
Layer (type)                    Output Shape              Param #
=================================================================
conv2d (Conv2D)                 (None, 148, 148, 32)      896
max_pooling2d (MaxPooling2D)    (None, 74, 74, 32)        0
conv2d_1 (Conv2D)               (None, 72, 72, 32)        9248
max_pooling2d_1 (MaxPooling2D)  (None, 36, 36, 32)        0
conv2d_2 (Conv2D)               (None, 34, 34, 64)        18496
max_pooling2d_2 (MaxPooling2D)  (None, 17, 17, 64)        0
conv2d_3 (Conv2D)               (None, 15, 15, 128)       73856
max_pooling2d_3 (MaxPooling2D)  (None, 7, 7, 128)         0
flatten (Flatten)               (None, 6272)              0
dense (Dense)                   (None, 5)                 31365
=================================================================
Total params: 133,861
Trainable params: 133,861
Non-trainable params: 0
def concat_horizontal(models, input_shape):
    models_count = len(models)
    hidden = []
    input = tf.keras.layers.Input(shape=input_shape)
    for i in range(models_count):
        hidden.append(models[i](input))
    output = tf.keras.layers.concatenate(hidden)
    model = tf.keras.Model(inputs=input, outputs=output)
    return model

new_model = concat_horizontal(
    [model, model2], (input_shape))
new_model.save('f1_1st_merged_seg.h5')
new_model.summary()
Layer (type)                 Output Shape           Param #    Connected to
==================================================================================================
input_1 (InputLayer)         [(None, 150, 150, 3)]  0          []
model1 (Sequential)          (None, 5)              133861     ['input_1[0][0]']
model2 (Sequential)          (None, 5)              133861     ['input_1[0][0]']
concatenate (Concatenate)    (None, 10)             0          ['model1[0][0]', 'model2[0][0]']
==================================================================================================
Total params: 267,722
Trainable params: 267,722
Non-trainable params: 0
After I tested the merged model I found some images getting classes 7 and 9, although I only have 5 classes. This is my code for prediction:
class_names = ['A', 'B', 'C', 'D', 'E']

for img in os.listdir(path):
    # predicting images
    img2 = tf.keras.preprocessing.image.load_img(
        os.path.join(path, img), target_size=(150, 150))
    x = tf.keras.preprocessing.image.img_to_array(img2)
    x = np.expand_dims(x, axis=0)
    images = np.vstack([x])
    classes = np.argmax(model.predict(images), axis=-1)
    y_out = class_names[classes[0]]
I got this error:
y_out = class_names[classes[0]]
IndexError: list index out of range
In this case it could have been done even with the sequential method. Look: you are concatenating two output layers with 5 columns each, so the number of output columns grows from 5 to 10, and argmax can then return indices up to 9. Instead, define these two models only up to the layer before the output (the Flatten layer as the last layer of both models), and then define the final model as: the input layer, these two branch models, the concatenate layer, and then one output layer with five units and a softmax activation.
So remove the output layer
tf.keras.layers.Dense(5, activation=tf.nn.softmax)
from those two models, and add it as a single layer after the concatenation in the function you defined:
def concat_horizontal(models, input_shape):
    models_count = len(models)
    hidden = []
    input = tf.keras.layers.Input(shape=input_shape)
    for i in range(models_count):
        hidden.append(models[i](input))
    output = tf.keras.layers.concatenate(hidden)
    output = tf.keras.layers.Dense(5, activation=tf.nn.softmax)(output)
    model = tf.keras.Model(inputs=input, outputs=output)
    return model
Note that for cases like this it is generally better to define the branch models with the functional API, along these lines:
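A minimal sketch of that functional-API style (my illustration; the layer sizes are copied from the models above, the names are made up):

import tensorflow as tf

def make_branch(x, prefix):
    # VGG-style branch matching the models above, stopped before the Dense head.
    for i, filters in enumerate([32, 32, 64, 128]):
        x = tf.keras.layers.Conv2D(filters, (3, 3), activation='relu',
                                   name=f'{prefix}_conv{i}')(x)
        x = tf.keras.layers.MaxPooling2D(2, 2, name=f'{prefix}_pool{i}')(x)
    return tf.keras.layers.Flatten(name=f'{prefix}_flatten')(x)

inputs = tf.keras.layers.Input(shape=(150, 150, 3))
merged = tf.keras.layers.concatenate(
    [make_branch(inputs, 'branch1'), make_branch(inputs, 'branch2')])
outputs = tf.keras.layers.Dense(5, activation='softmax')(merged)
model = tf.keras.Model(inputs=inputs, outputs=outputs)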

ZeroPadding2D pads twice when I set padding to 1

I've just started to learn TensorFlow (2.1.0) and Keras (2.3.7) with Python 3.7.7.
I'm trying an encoder-decoder network using VGG16.
I need to upsample a layer from (12, 12, ...) to (25, 25, ...) so that conv7_1 has the same shape as the conv4_3 layer. The layer with the 'problem' is upsp2:
conv4_3 (Conv2D) (None, 25, 25, 512) 2359808
_________________________________________________________________
pool_4 (MaxPooling2D) (None, 12, 12, 512) 0
_________________________________________________________________
conv5_1 (Conv2D) (None, 12, 12, 512) 2359808
_________________________________________________________________
conv5_2 (Conv2D) (None, 12, 12, 512) 2359808
_________________________________________________________________
conv5_3 (Conv2D) (None, 12, 12, 512) 2359808
_________________________________________________________________
pool_5 (MaxPooling2D) (None, 6, 6, 512) 0
_________________________________________________________________
upsp1 (UpSampling2D) (None, 12, 12, 512) 0
_________________________________________________________________
conv6_1 (Conv2D) (None, 12, 12, 512) 2359808
_________________________________________________________________
conv6_2 (Conv2D) (None, 12, 12, 512) 2359808
_________________________________________________________________
conv6_3 (Conv2D) (None, 12, 12, 512) 2359808
_________________________________________________________________
upsp2 (UpSampling2D) (None, 24, 24, 512) 0
_________________________________________________________________
conv7_1 (Conv2D) (None, 24, 24, 512) 2359808
I have tried this:
#################################
# Decoder
#################################
#conv1 = Conv2DTranspose(512, (2, 2), strides = 2, name = 'conv1')(pool5)
upsp1 = UpSampling2D(size = (2,2), name = 'upsp1')(pool5)
conv6 = Conv2D(512, 3, activation = 'relu', padding = 'same', name = 'conv6_1')(upsp1)
conv6 = Conv2D(512, 3, activation = 'relu', padding = 'same', name = 'conv6_2')(conv6)
conv6 = Conv2D(512, 3, activation = 'relu', padding = 'same', name = 'conv6_3')(conv6)
zero1 = ZeroPadding2D(padding = (1,1), data_format = 'channels_last', name='zero1')(conv6)
upsp2 = UpSampling2D(size = (2,2), name = 'upsp2')(zero1)
But the shape (12, 12, ...) becomes (14, 14, ...) at the zero1 layer:
conv6_3 (Conv2D) (None, 12, 12, 512) 2359808
_________________________________________________________________
zero1 (ZeroPadding2D) (None, 14, 14, 512) 0
_________________________________________________________________
upsp2 (UpSampling2D) (None, 28, 28, 512) 0
_________________________________________________________________
How can I upsample (12,12,512) to (25,25,512)?
I did it using padding as a tuple of 2 tuples of 2 ints, interpreted as ((top_pad, bottom_pad), (left_pad, right_pad)), and putting the ZeroPadding2D at the end of the convolution 7 block:
#################################
# Decoder
#################################
#conv1 = Conv2DTranspose(512, (2, 2), strides = 2, name = 'conv1')(pool5)
upsp1 = UpSampling2D(size = (2,2), name = 'upsp1')(pool5)
conv6 = Conv2D(512, 3, activation = 'relu', padding = 'same', name = 'conv6_1')(upsp1)
conv6 = Conv2D(512, 3, activation = 'relu', padding = 'same', name = 'conv6_2')(conv6)
conv6 = Conv2D(512, 3, activation = 'relu', padding = 'same', name = 'conv6_3')(conv6)
upsp2 = UpSampling2D(size = (2,2), name = 'upsp2')(conv6)
conv7 = Conv2D(512, 3, activation = 'relu', padding = 'same', name = 'conv7_1')(upsp2)
conv7 = Conv2D(512, 3, activation = 'relu', padding = 'same', name = 'conv7_2')(conv7)
conv7 = Conv2D(512, 3, activation = 'relu', padding = 'same', name = 'conv7_3')(conv7)
zero1 = ZeroPadding2D(padding = ((1, 0), (1, 0)), data_format = 'channels_last', name='zero1')(conv7)
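A quick sanity check of the asymmetric padding (my own sketch, not from the answer): ZeroPadding2D(padding=(1, 1)) pads both sides of each axis, so 12 becomes 14, while ((1, 0), (1, 0)) adds a single row and column, turning the upsampled 24 into the desired 25:

from tensorflow.keras.layers import Input, UpSampling2D, ZeroPadding2D
from tensorflow.keras.models import Model

x_in = Input((12, 12, 512))
x = UpSampling2D(size=(2, 2))(x_in)             # (24, 24, 512)
x = ZeroPadding2D(padding=((1, 0), (1, 0)))(x)  # (25, 25, 512)
print(Model(x_in, x).output_shape)              # (None, 25, 25, 512)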

Can I use VGG16 for one-channel images?

I've just started to learn TensorFlow (2.1.0) and Keras (2.3.7) with Python 3.7.7.
I want to use the VGG16 network to do semantic segmentation with black and white images (200x200x1).
I have used this network, whose original input_size was (224, 224, 3):
def vgg16_encoder_decoder(input_size = (200,200,1)):
    #################################
    # Encoder
    #################################
    inputs = Input(input_size, name = 'input')
    conv1 = Conv2D(64, (3, 3), activation = 'relu', padding = 'same', name ='conv1_1')(inputs)
    conv1 = Conv2D(64, (3, 3), activation = 'relu', padding = 'same', name ='conv1_2')(conv1)
    pool1 = MaxPooling2D(pool_size = (2,2), strides = (2,2), name = 'pool_1')(conv1)
    conv2 = Conv2D(128, (3, 3), activation = 'relu', padding = 'same', name ='conv2_1')(pool1)
    conv2 = Conv2D(128, (3, 3), activation = 'relu', padding = 'same', name ='conv2_2')(conv2)
    pool2 = MaxPooling2D(pool_size = (2,2), strides = (2,2), name = 'pool_2')(conv2)
    conv3 = Conv2D(256, (3, 3), activation = 'relu', padding = 'same', name ='conv3_1')(pool2)
    conv3 = Conv2D(256, (3, 3), activation = 'relu', padding = 'same', name ='conv3_2')(conv3)
    conv3 = Conv2D(256, (3, 3), activation = 'relu', padding = 'same', name ='conv3_3')(conv3)
    pool3 = MaxPooling2D(pool_size = (2,2), strides = (2,2), name = 'pool_3')(conv3)
    conv4 = Conv2D(512, (3, 3), activation = 'relu', padding = 'same', name ='conv4_1')(pool3)
    conv4 = Conv2D(512, (3, 3), activation = 'relu', padding = 'same', name ='conv4_2')(conv4)
    conv4 = Conv2D(512, (3, 3), activation = 'relu', padding = 'same', name ='conv4_3')(conv4)
    pool4 = MaxPooling2D(pool_size = (2,2), strides = (2,2), name = 'pool_4')(conv4)
    conv5 = Conv2D(512, (3, 3), activation = 'relu', padding = 'same', name ='conv5_1')(pool4)
    conv5 = Conv2D(512, (3, 3), activation = 'relu', padding = 'same', name ='conv5_2')(conv5)
    conv5 = Conv2D(512, (3, 3), activation = 'relu', padding = 'same', name ='conv5_3')(conv5)
    pool5 = MaxPooling2D(pool_size = (2,2), strides = (2,2), name = 'pool_5')(conv5)
    #################################
    # Decoder
    #################################
    #conv1 = Conv2DTranspose(512, (2, 2), strides = 2, name = 'conv1')(pool5)
    upsp1 = UpSampling2D(size = (2,2), name = 'upsp1')(pool5)
    conv6 = Conv2D(512, 3, activation = 'relu', padding = 'same', name = 'conv6_1')(upsp1)
    conv6 = Conv2D(512, 3, activation = 'relu', padding = 'same', name = 'conv6_2')(conv6)
    conv6 = Conv2D(512, 3, activation = 'relu', padding = 'same', name = 'conv6_3')(conv6)
    upsp2 = UpSampling2D(size = (2,2), name = 'upsp2')(conv6)
    conv7 = Conv2D(512, 3, activation = 'relu', padding = 'same', name = 'conv7_1')(upsp2)
    conv7 = Conv2D(512, 3, activation = 'relu', padding = 'same', name = 'conv7_2')(conv7)
    conv7 = Conv2D(512, 3, activation = 'relu', padding = 'same', name = 'conv7_3')(conv7)
    upsp3 = UpSampling2D(size = (2,2), name = 'upsp3')(conv7)
    conv8 = Conv2D(256, 3, activation = 'relu', padding = 'same', name = 'conv8_1')(upsp3)
    conv8 = Conv2D(256, 3, activation = 'relu', padding = 'same', name = 'conv8_2')(conv8)
    conv8 = Conv2D(256, 3, activation = 'relu', padding = 'same', name = 'conv8_3')(conv8)
    upsp4 = UpSampling2D(size = (2,2), name = 'upsp4')(conv8)
    conv9 = Conv2D(128, 3, activation = 'relu', padding = 'same', name = 'conv9_1')(upsp4)
    conv9 = Conv2D(128, 3, activation = 'relu', padding = 'same', name = 'conv9_2')(conv9)
    upsp5 = UpSampling2D(size = (2,2), name = 'upsp5')(conv9)
    conv10 = Conv2D(64, 3, activation = 'relu', padding = 'same', name = 'conv10_1')(upsp5)
    conv10 = Conv2D(64, 3, activation = 'relu', padding = 'same', name = 'conv10_2')(conv10)
    conv11 = Conv2D(3, 3, activation = 'relu', padding = 'same', name = 'conv11')(conv10)
    model = Model(inputs = inputs, outputs = conv11, name = 'vgg-16_encoder_decoder')
    return model
Model summary:
Model: "vgg-16_encoder_decoder"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
input (InputLayer) (None, 200, 200, 1) 0
_________________________________________________________________
conv1_1 (Conv2D) (None, 200, 200, 64) 640
_________________________________________________________________
conv1_2 (Conv2D) (None, 200, 200, 64) 36928
_________________________________________________________________
pool_1 (MaxPooling2D) (None, 100, 100, 64) 0
_________________________________________________________________
conv2_1 (Conv2D) (None, 100, 100, 128) 73856
_________________________________________________________________
conv2_2 (Conv2D) (None, 100, 100, 128) 147584
_________________________________________________________________
pool_2 (MaxPooling2D) (None, 50, 50, 128) 0
_________________________________________________________________
conv3_1 (Conv2D) (None, 50, 50, 256) 295168
_________________________________________________________________
conv3_2 (Conv2D) (None, 50, 50, 256) 590080
_________________________________________________________________
conv3_3 (Conv2D) (None, 50, 50, 256) 590080
_________________________________________________________________
pool_3 (MaxPooling2D) (None, 25, 25, 256) 0
_________________________________________________________________
conv4_1 (Conv2D) (None, 25, 25, 512) 1180160
_________________________________________________________________
conv4_2 (Conv2D) (None, 25, 25, 512) 2359808
_________________________________________________________________
conv4_3 (Conv2D) (None, 25, 25, 512) 2359808
_________________________________________________________________
pool_4 (MaxPooling2D) (None, 12, 12, 512) 0
_________________________________________________________________
conv5_1 (Conv2D) (None, 12, 12, 512) 2359808
_________________________________________________________________
conv5_2 (Conv2D) (None, 12, 12, 512) 2359808
_________________________________________________________________
conv5_3 (Conv2D) (None, 12, 12, 512) 2359808
_________________________________________________________________
pool_5 (MaxPooling2D) (None, 6, 6, 512) 0
_________________________________________________________________
upsp1 (UpSampling2D) (None, 12, 12, 512) 0
_________________________________________________________________
conv6_1 (Conv2D) (None, 12, 12, 512) 2359808
_________________________________________________________________
conv6_2 (Conv2D) (None, 12, 12, 512) 2359808
_________________________________________________________________
conv6_3 (Conv2D) (None, 12, 12, 512) 2359808
_________________________________________________________________
upsp2 (UpSampling2D) (None, 24, 24, 512) 0
_________________________________________________________________
conv7_1 (Conv2D) (None, 24, 24, 512) 2359808
_________________________________________________________________
conv7_2 (Conv2D) (None, 24, 24, 512) 2359808
_________________________________________________________________
conv7_3 (Conv2D) (None, 24, 24, 512) 2359808
_________________________________________________________________
upsp3 (UpSampling2D) (None, 48, 48, 512) 0
_________________________________________________________________
conv8_1 (Conv2D) (None, 48, 48, 256) 1179904
_________________________________________________________________
conv8_2 (Conv2D) (None, 48, 48, 256) 590080
_________________________________________________________________
conv8_3 (Conv2D) (None, 48, 48, 256) 590080
_________________________________________________________________
upsp4 (UpSampling2D) (None, 96, 96, 256) 0
_________________________________________________________________
conv9_1 (Conv2D) (None, 96, 96, 128) 295040
_________________________________________________________________
conv9_2 (Conv2D) (None, 96, 96, 128) 147584
_________________________________________________________________
upsp5 (UpSampling2D) (None, 192, 192, 128) 0
_________________________________________________________________
conv10_1 (Conv2D) (None, 192, 192, 64) 73792
_________________________________________________________________
conv10_2 (Conv2D) (None, 192, 192, 64) 36928
_________________________________________________________________
conv11 (Conv2D) (None, 192, 192, 3) 1731
=================================================================
Total params: 31,787,523
Trainable params: 31,787,523
Non-trainable params: 0
_________________________________________________________________
The last convolutional layer returns a shape of (192, 192, 3) but I need to return an image with shape (200, 200, 1).
I think I can replace the last convolutional layer with this one to get a 1-channel image:
conv11 = Conv2D(1, 3, activation = 'relu', padding = 'same', name = 'conv11')(conv10)
But I don't know if this is correct, because from what I've read the VGG16 network is for 3-channel images.
Can I use VGG16 for one-channel images?
What you read about VGG being for three-channel (RGB) images applies only to the pre-trained model, which is trained on the ImageNet dataset and therefore contains only color images. Since you are not using the pre-trained model, you are not bound by this limitation.
So you can use one, three, or any number of input or output channels, as the check below illustrates.
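A quick sketch (my own, using the vgg16_encoder_decoder function from the question): only the first convolution's parameter count depends on the number of input channels, so the same architecture builds for 1-channel or 3-channel inputs.

# Only the first conv's kernel depth depends on the input channels.
model_1ch = vgg16_encoder_decoder(input_size=(200, 200, 1))
model_3ch = vgg16_encoder_decoder(input_size=(200, 200, 3))
print(model_1ch.layers[1].count_params())  # 640  = (3*3*1 + 1) * 64
print(model_3ch.layers[1].count_params())  # 1792 = (3*3*3 + 1) * 64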

Conv-autoencoder whose val_loss doesn't decrease

I built an anomaly detection model using a conv-autoencoder on the UCSD_ped2 dataset. What puzzles me is that after very few epochs the val_loss doesn't decrease; it seems the model can't learn any further. I have done some research to improve my model, but it isn't getting better. What should I do to fix it?
Here's my model's structure:
x = 144; y = 224
input_img = Input(shape = (x, y, inChannel))
bn1 = BatchNormalization(axis=-1, momentum=0.99, epsilon=0.001)(input_img)
conv1 = Conv2D(256, (11, 11), strides=(4,4), activation='relu',
               kernel_regularizer=regularizers.l2(0.0005),
               kernel_initializer=initializers.glorot_normal(seed=None),
               padding='same')(bn1)
pool1 = MaxPooling2D(pool_size=(2, 2))(conv1)
bn2 = BatchNormalization(axis=-1, momentum=0.99, epsilon=0.001)(pool1)
conv2 = Conv2D(128, (5, 5), activation='relu',
               kernel_regularizer=regularizers.l2(0.0005),
               kernel_initializer=initializers.glorot_normal(seed=None),
               padding='same')(bn2)
pool2 = MaxPooling2D(pool_size=(2, 2))(conv2)
bn3 = BatchNormalization(axis=-1, momentum=0.99, epsilon=0.001)(pool2)
conv3 = Conv2D(64, (3, 3), activation='relu',
               kernel_regularizer=regularizers.l2(0.0005),
               kernel_initializer=initializers.glorot_normal(seed=None),
               padding='same')(bn3)
ubn3 = BatchNormalization(axis=-1, momentum=0.99, epsilon=0.001)(conv3)
uconv3 = Conv2DTranspose(128, (3,3), activation='relu',
                         kernel_regularizer=regularizers.l2(0.0005),
                         kernel_initializer=initializers.glorot_normal(seed=None),
                         padding='same')(ubn3)
upool3 = UpSampling2D(size=(2, 2))(uconv3)
ubn2 = BatchNormalization(axis=-1, momentum=0.99, epsilon=0.001)(upool3)
uconv2 = Conv2DTranspose(256, (3, 3), activation='relu',
                         kernel_regularizer=regularizers.l2(0.0005),
                         kernel_initializer=initializers.glorot_normal(seed=None),
                         padding='same')(ubn2)
upool2 = UpSampling2D(size=(2, 2))(uconv2)
ubn1 = BatchNormalization(axis=-1, momentum=0.99, epsilon=0.001)(upool2)
decoded = Conv2DTranspose(1, (11, 11), strides=(4, 4),
                          kernel_regularizer=regularizers.l2(0.0005),
                          kernel_initializer=initializers.glorot_normal(seed=None),
                          activation='sigmoid', padding='same')(ubn1)

autoencoder = Model(input_img, decoded)
autoencoder.compile(loss='mean_squared_error', optimizer='Adadelta', metrics=['accuracy'])

history = autoencoder.fit(X_train, Y_train, validation_split=0.3,
                          batch_size=batch_size, epochs=epochs, verbose=0,
                          shuffle=True,
                          callbacks=[earlystopping, checkpointer, reduce_lr])
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
input_1 (InputLayer) (None, 144, 224, 1) 0
_________________________________________________________________
batch_normalization_1 (Batch (None, 144, 224, 1) 4
_________________________________________________________________
conv2d_1 (Conv2D) (None, 36, 56, 256) 31232
_________________________________________________________________
max_pooling2d_1 (MaxPooling2 (None, 18, 28, 256) 0
_________________________________________________________________
batch_normalization_2 (Batch (None, 18, 28, 256) 1024
_________________________________________________________________
conv2d_2 (Conv2D) (None, 18, 28, 128) 819328
_________________________________________________________________
max_pooling2d_2 (MaxPooling2 (None, 9, 14, 128) 0
_________________________________________________________________
batch_normalization_3 (Batch (None, 9, 14, 128) 512
_________________________________________________________________
conv2d_3 (Conv2D) (None, 9, 14, 64) 73792
_________________________________________________________________
batch_normalization_4 (Batch (None, 9, 14, 64) 256
_________________________________________________________________
conv2d_transpose_1 (Conv2DTr (None, 9, 14, 128) 73856
_________________________________________________________________
up_sampling2d_1 (UpSampling2 (None, 18, 28, 128) 0
_________________________________________________________________
batch_normalization_5 (Batch (None, 18, 28, 128) 512
_________________________________________________________________
conv2d_transpose_2 (Conv2DTr (None, 18, 28, 256) 295168
_________________________________________________________________
up_sampling2d_2 (UpSampling2 (None, 36, 56, 256) 0
_________________________________________________________________
batch_normalization_6 (Batch (None, 36, 56, 256) 1024
_________________________________________________________________
conv2d_transpose_3 (Conv2DTr (None, 144, 224, 1) 30977
=================================================================
Total params: 1,327,685
Trainable params: 1,326,019
Non-trainable params: 1,666
The batch size is 30 and epochs is 100; the training data has 1785 pictures and the validation data has 765.
I have tried:
deleting the kernel_regularizer;
adding ReduceLROnPlateau.
But it only improved a little.
Epoch 00043: ReduceLROnPlateau reducing learning rate to 9.99999874573554e-12.
Epoch 00044: val_loss did not improve from 0.00240
Epoch 00045: val_loss did not improve from 0.00240
Once val_loss reached 0.00240, it stopped decreasing...
The following figure plots the loss over epochs.
The following figure shows the model's reconstruction results, which are truly poor. How can I make my model work better?
Based on your screenshot, it seems that this is not an issue of overfitting or underfitting.
To my understanding:
Underfitting – validation and training error both high
Overfitting – validation error high, training error low
Good fit – validation error low, slightly higher than the training error
Generally speaking, the dataset should be split properly between training and validation. Typically the training set should be about 4 times the size of the validation set (an 80/20 split).
My suggestion is to increase the size of your dataset with data augmentation and continue training.
Kindly refer to the Keras documentation for data augmentation; a minimal sketch follows.
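A sketch of that suggestion with Keras' ImageDataGenerator (my illustration, not the only option; for an autoencoder the target is the input itself, so the generator can yield the same batch twice):

from keras.preprocessing.image import ImageDataGenerator

# Mild geometric augmentation; shifts/flips are plausible for surveillance frames.
datagen = ImageDataGenerator(width_shift_range=0.1,
                             height_shift_range=0.1,
                             horizontal_flip=True)

def augmented_pairs(X, batch_size=30):
    # Yield (input, target) pairs where the target is the augmented input itself.
    for batch in datagen.flow(X, batch_size=batch_size, shuffle=True):
        yield batch, batch

history = autoencoder.fit_generator(augmented_pairs(X_train, batch_size=30),
                                    steps_per_epoch=len(X_train) // 30,
                                    epochs=100)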

Keras using too much memory

I have a keras (with tensorflow backend) model which is defined like so:
INPUT_SHAPE = [4740, 3540, 1]

model = Sequential()
model.add(Conv2D(32, kernel_size=(3, 3),
                 activation='relu',
                 input_shape=INPUT_SHAPE))
model.add(Conv2D(2, (4, 4), strides=(1, 1), activation='relu'))
model.add(MaxPooling2D(pool_size=(4, 4)))
model.add(Conv2D(4, (4, 4), strides=(1, 1), activation='relu'))
model.add(MaxPooling2D(pool_size=(4, 4)))
model.add(Conv2D(8, (4, 4), strides=(1, 1), activation='relu'))
model.add(MaxPooling2D(pool_size=(4, 4)))
model.add(Conv2D(16, (4, 4), strides=(1, 1), activation='relu'))
model.add(MaxPooling2D(pool_size=(4, 4)))
model.add(Conv2D(32, (4, 4), strides=(1, 1), activation='relu'))
model.add(MaxPooling2D(pool_size=(4, 4)))
model.add(Dropout(0.25))
model.add(Flatten())
model.add(Dense(128, activation='relu'))
model.add(Dropout(0.5))
model.add(Dense(num_classes, activation='softmax'))

model.compile(loss=keras.losses.categorical_crossentropy,
              optimizer=keras.optimizers.Adadelta(),
              metrics=['accuracy'])
This model has only 37,506 trainable params, yet somehow it is able to deplete a K80's 12 GB of VRAM on model.fit() if the batch size is more than 1.
Why does this model need so much memory?
And how do I calculate memory requirements properly?
The function from "How to determine needed memory of Keras model?" gives me 2.15 GB per element in a batch, so I should at least be able to run a batch of 5.
EDIT: model.summary()
Layer (type) Output Shape Param #
=================================================================
conv2d_1 (Conv2D) (None, 4738, 3538, 32) 320
_________________________________________________________________
conv2d_2 (Conv2D) (None, 4735, 3535, 2) 1026
_________________________________________________________________
max_pooling2d_1 (MaxPooling2 (None, 1183, 883, 2) 0
_________________________________________________________________
conv2d_3 (Conv2D) (None, 1180, 880, 4) 132
_________________________________________________________________
max_pooling2d_2 (MaxPooling2 (None, 295, 220, 4) 0
_________________________________________________________________
conv2d_4 (Conv2D) (None, 292, 217, 8) 520
_________________________________________________________________
max_pooling2d_3 (MaxPooling2 (None, 73, 54, 8) 0
_________________________________________________________________
conv2d_5 (Conv2D) (None, 70, 51, 16) 2064
_________________________________________________________________
max_pooling2d_4 (MaxPooling2 (None, 17, 12, 16) 0
_________________________________________________________________
conv2d_6 (Conv2D) (None, 14, 9, 32) 8224
_________________________________________________________________
max_pooling2d_5 (MaxPooling2 (None, 3, 2, 32) 0
_________________________________________________________________
dropout_1 (Dropout) (None, 3, 2, 32) 0
_________________________________________________________________
flatten_1 (Flatten) (None, 192) 0
_________________________________________________________________
dense_1 (Dense) (None, 128) 24704
_________________________________________________________________
dropout_2 (Dropout) (None, 128) 0
_________________________________________________________________
dense_2 (Dense) (None, 4) 516
=================================================================
Total params: 37,506
Trainable params: 37,506
Non-trainable params: 0
_________________________________________________________________
The output shape of the first layer is B × 4738 × 3538 × 32 (B is the batch size), which at 4 bytes per float32 takes around 2 GB × B of memory for that one activation alone. The gradients and the other activations will take additional memory too. Maybe increasing the stride for the first layer will help.
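To estimate this yourself, a rough sketch of the usual back-of-the-envelope calculation (forward activations only; gradients, optimizer state, and framework workspace add more on top):

import numpy as np

def activation_memory_gb(model, batch_size=1, bytes_per_value=4):
    # Sum the sizes of every layer's output tensor (the forward activations).
    total = 0
    for layer in model.layers:
        shape = layer.output_shape           # e.g. (None, 4738, 3538, 32)
        total += np.prod([d for d in shape if d is not None])
    return total * batch_size * bytes_per_value / 1024**3

print(activation_memory_gb(model, batch_size=1))  # roughly 2 GB, dominated by conv2d_1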