Loss and accuracy don't change during the training phase - TensorFlow

I built a model to colorize grayscale images. During the training phase I feed the network 100 RGB images of a forest, convert the images to the LAB color space, and split the training set into L and AB channels.
Based on the trained AB data, the model predicts these two channels for the grayscale input image during the testing phase.
Now I have a problem. I trained a model with a different architecture than this one on 10 images; the loss decreased to 0.0035 and it worked well. Because of that, I wanted to increase the size of the dataset to get a better result, but instead the loss and the accuracy stay constant and the model output is a mess.
My code is below. I hope someone can point out what I am doing wrong. Is it the optimizer? The loss function? The batch size? Or anything else I'm not aware of?
Thank you in advance.
# Import images
MODEL_NAME = 'forest'
X = []
Y = []
for filename in os.listdir('forest/'):
    if (filename != '.DS_Store'):
        image = img_to_array(load_img("/Users/moos/Desktop/Project-Master/forest/" + filename))
        image = np.array(image, dtype=float)
        imL = rgb2lab(1.0 / 255 * image)[:, :, 0]
        X.append(imL)
        imAB = rgb2lab(1.0 / 255 * image)[:, :, 1:]
        imAB = imAB / 128
        Y.append(imAB)
X = np.array(X)
Y = np.array(Y)
X = X.reshape(1, 256, np.size(X) / 256, 1)
Y = Y.reshape(1, 256, np.size(Y) / 256 / 2, 2)
# Building the neural network
model = Sequential()
model.add(InputLayer(input_shape=(256, np.size(X)/256, 1)))
model.add(Conv2D(8, (3, 3), activation='relu', padding='same', strides=2))
model.add(Conv2D(32, (3, 3), activation='relu', padding='same'))
model.add(Conv2D(32, (3, 3), activation='relu', padding='same', strides=2))
model.add(Conv2D(64, (3, 3), activation='relu', padding='same', strides=1))
model.add(Conv2D(64, (3, 3), activation='relu', padding='same', strides=2))
model.add(Conv2D(128, (3, 3), activation='relu', padding='same', strides=1))
model.add(Conv2D(128, (3, 3), activation='relu', padding='same', strides=1))
model.add(Conv2D(64, (3, 3), activation='relu', padding='same', strides=1))
model.add(Conv2D(32, (3, 3), activation='relu', padding='same', strides=1))
model.add(UpSampling2D((2, 2)))
model.add(Conv2D(8, (3, 3), activation='relu', padding='same'))
model.add(Conv2D(8, (3, 3), activation='relu', padding='same'))
model.add(UpSampling2D((2, 2)))
model.add(Conv2D(2, (3, 3), activation='relu', padding='same'))
model.add(Conv2D(2, (3, 3), activation='tanh', padding='same'))
model.add(UpSampling2D((2, 2)))
# Finish model
model.compile(optimizer='rmsprop', loss='mse', metrics=['acc'])
# Train the neural network
model.fit(x=X, y=Y, batch_size=100, epochs=1000)
print(model.evaluate(X, Y, batch_size=100))
Output:
Epoch 1/1000
1/1 [==============================] - 7s 7s/step - loss: 0.0214 - acc: 0.4987
Epoch 2/1000
1/1 [==============================] - 7s 7s/step - loss: 0.0214 - acc: 0.4987
Epoch 3/1000
1/1 [==============================] - 9s 9s/step - loss: 0.0214 - acc: 0.4987
Epoch 4/1000
1/1 [==============================] - 8s 8s/step - loss: 0.0214 - acc: 0.4987
...

First of all, I simplified the image-loading code and normalized all channels (L, A, B) separately (subtract the mean, divide by the standard deviation); I also renamed the variables, which usually helps a lot. (There's a 5-minute free Coursera video about normalizing inputs; it will bug you to subscribe, but just click that away.) So the loading part now looks like this:
# Import images
MODEL_NAME = 'forest'
imgLABs = []
for filename in os.listdir('./forest/'):
    if (filename != '.DS_Store'):
        image = img_to_array(load_img("./forest/" + filename))
        imgLABs.append(rgb2lab(image / 255.0))
imgLABs_arr = np.array(imgLABs)

L, A, B = imgLABs_arr[:, :, :, 0:1], imgLABs_arr[:, :, :, 1:2], imgLABs_arr[:, :, :, 2:3]
L_mean, L_std = np.mean(L), np.std(L)
A_mean, A_std = np.mean(A), np.std(A)
B_mean, B_std = np.mean(B), np.std(B)
L, A, B = (L - L_mean) / L_std, (A - A_mean) / A_std, (B - B_mean) / B_std
AB = np.concatenate((A, B), axis=3)
I also changed the model around: added more feature depth and a few max-pooling layers (don't forget to include them in the imports; not shown). Note that the activation of the final few layers is set to None to allow negative values, since we're expecting normalized results:
# Building the neural network
model = Sequential()
model.add(InputLayer(input_shape=L.shape[1:]))
model.add(Conv2D(64, (3, 3), activation='relu', padding='same', strides=2,
                 kernel_initializer='truncated_normal'))
model.add(MaxPooling2D((3, 3), strides=1, padding='same'))
model.add(Conv2D(64, (3, 3), activation='relu', padding='same',
                 kernel_initializer='truncated_normal'))
model.add(Conv2D(64, (3, 3), activation='relu', padding='same', strides=2,
                 kernel_initializer='truncated_normal'))
model.add(Conv2D(64, (3, 3), activation='relu', padding='same', strides=1,
                 kernel_initializer='truncated_normal'))
model.add(Conv2D(64, (3, 3), activation='relu', padding='same', strides=2,
                 kernel_initializer='truncated_normal'))
model.add(MaxPooling2D((3, 3), strides=1, padding='same'))
model.add(Conv2D(64, (3, 3), activation='relu', padding='same', strides=1,
                 kernel_initializer='truncated_normal'))
model.add(Conv2D(64, (3, 3), activation='relu', padding='same', strides=1,
                 kernel_initializer='truncated_normal'))
model.add(Conv2D(64, (3, 3), activation='relu', padding='same', strides=1,
                 kernel_initializer='truncated_normal'))
model.add(Conv2D(64, (3, 3), activation='relu', padding='same', strides=1,
                 kernel_initializer='truncated_normal'))
model.add(UpSampling2D((2, 2)))
model.add(Conv2D(32, (3, 3), activation='relu', padding='same',
                 kernel_initializer='truncated_normal'))
model.add(Conv2D(32, (3, 3), activation='relu', padding='same',
                 kernel_initializer='truncated_normal'))
model.add(UpSampling2D((2, 2)))
model.add(Conv2D(32, (3, 3), activation=None, padding='same',
                 kernel_initializer='truncated_normal'))
model.add(Conv2D(2, (3, 3), activation=None, padding='same',
                 kernel_initializer='truncated_normal'))
model.add(UpSampling2D((2, 2)))
# Finish model
optimizer = optimizers.RMSprop(lr=0.0005, decay=1e-5)
model.compile(optimizer=optimizer, loss='mse', metrics=['acc'])
# Train the neural network
model.fit(x=L, y=AB, batch_size=1, epochs=1800)
model.save("forest-model-v2.h5")
Note the learning rate of 0.0005; I've experimented with a few values and this looked best. The learning-rate decay then helps later in the training, reducing the learning rate as we go along. I've also changed batch_size to 1 - this is very specific to this network and is not generally recommended. But here you mostly have straight convolutions, so it makes sense to update the kernels after each exemplar, as every exemplar itself affects the weights through each pixel. If you change the architecture, though, this might not make sense any more, and you should change the batch size back. I've also increased the epochs to 1,800, because it runs fairly quickly on my machine and I had the time to run it. It reaches its maximum around 1,000, though.
With all that, here's the output from the training (first and last few lines only):
Epoch 1/1800
100/100 [==============================] - 6s 63ms/step - loss: 1.0554 - acc: 0.5217
Epoch 2/1800
100/100 [==============================] - 1s 13ms/step - loss: 1.1097 - acc: 0.5703
...
Epoch 1000/1800
100/100 [==============================] - 1s 13ms/step - loss: 0.0533 - acc: 0.9338
...
Epoch 1800/1800
100/100 [==============================] - 1s 13ms/step - loss: 0.0404 - acc: 0.9422
To print the re-colored image I used the following code. Note that 5 is just an arbitrary index of the image I picked from the 100. Also, we need to add back the means and standard deviations for L, A and B. You have to treat these six numbers as part of your network: when you want to use it for actual recoloring, you need to preprocess the input with L_mean and L_std, and then postprocess the output with the A and B means and standard deviations (a small sketch of one way to store them is at the end of this answer):
predicted = model.predict(x=L[5:6], batch_size=1, verbose=1)

img_pred = lab2rgb(np.concatenate(
    ((L[5] * L_std) + L_mean,
     (predicted[0, :, :, 0:1] * A_std) + A_mean,
     (predicted[0, :, :, 1:2] * B_std) + B_mean),
    axis=2))
plt.imshow(img_pred)

img_orig = lab2rgb(np.concatenate(
    ((L[5] * L_std) + L_mean,
     (A[5] * A_std) + A_mean,
     (B[5] * B_std) + B_mean),
    axis=2))

diff = img_orig - img_pred
plt.imshow(diff * 10)
And with all that, here are the images (original; grayscale network input; network output with the colors restored; difference between original and restored):
Pretty neat! :) Mostly it's just some of the detail on the mountains that's lost. Since it's only 100 training images it might be seriously overfitted, though. Still, I hope this gives you a good start!
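P.S. Since those six normalization constants are effectively part of the model, here's a minimal sketch of one way to persist them alongside the saved model file (the forest-model-stats.npz filename is purely illustrative, not something the code above produces):
# Store the channel statistics next to forest-model-v2.h5 so inference
# code can reproduce the exact pre- and post-processing.
np.savez("forest-model-stats.npz",
         L_mean=L_mean, L_std=L_std,
         A_mean=A_mean, A_std=A_std,
         B_mean=B_mean, B_std=B_std)

# At recoloring time, load them back before calling model.predict:
stats = np.load("forest-model-stats.npz")
L_mean, L_std = stats["L_mean"], stats["L_std"]
A_mean, A_std = stats["A_mean"], stats["A_std"]
B_mean, B_std = stats["B_mean"], stats["B_std"]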

Related

Model performance on test set fluctuates highly from epoch to epoch

I have been trying to train a binary classifier for photos with the following architecture:
class PatchNDepthBasedCNN:
    @staticmethod
    def build(width=256, height=256, depth=3):
        # initialize the model along with the input shape to be
        # "channels last" and the channels dimension itself
        model = Sequential()
        inputShape = (height, width, depth)
        chanDim = -1
        # if we are using "channels first", update the input shape
        # and channels dimension
        if K.image_data_format() == "channels_first":
            inputShape = (depth, height, width)
            chanDim = 1
        # 1
        model.add(Conv2D(32, (3, 3), strides=1, padding="same", input_shape=inputShape))
        model.add(BatchNormalization(axis=chanDim))
        model.add(Activation("relu"))
        model.add(MaxPooling2D(pool_size=(3, 3), strides=2, padding="same"))
        # 2
        model.add(Conv2D(32, (3, 3), strides=1, padding="same"))
        model.add(BatchNormalization(axis=chanDim))
        model.add(Activation("relu"))
        model.add(Conv2D(25, (3, 3), strides=1, padding="same"))
        model.add(BatchNormalization(axis=chanDim))
        model.add(Activation("relu"))
        model.add(Conv2D(32, (3, 3), strides=1, padding="same"))
        model.add(BatchNormalization(axis=chanDim))
        model.add(Activation("relu"))
        x1 = MaxPooling2D(pool_size=(3, 3), strides=2, padding="same")
        model.add(x1)
        # the reshaped tensor will be used later in concatenate
        print(x1.output.shape)
        x1r = Reshape((int(width / 8), int(height / 8), 128))(x1.output)
        # 3
        model.add(Conv2D(32, (3, 3), strides=1, padding="same"))
        model.add(BatchNormalization(axis=chanDim))
        model.add(Activation("relu"))
        model.add(Conv2D(25, (3, 3), strides=1, padding="same"))
        model.add(BatchNormalization(axis=chanDim))
        model.add(Activation("relu"))
        model.add(Conv2D(32, (3, 3), strides=1, padding="same"))
        model.add(BatchNormalization(axis=chanDim))
        model.add(Activation("relu"))
        x2 = MaxPooling2D(pool_size=(3, 3), strides=2, padding="same")
        model.add(x2)
        # 4
        model.add(Conv2D(32, (3, 3), strides=1, padding="same"))
        model.add(BatchNormalization(axis=chanDim))
        model.add(Activation("relu"))
        model.add(Conv2D(25, (3, 3), strides=1, padding="same"))
        model.add(BatchNormalization(axis=chanDim))
        model.add(Activation("relu"))
        x3 = Conv2D(32, (3, 3), strides=1, padding="same")
        model.add(x3)
        model.add(BatchNormalization(axis=chanDim))
        model.add(Activation("relu"))
        # 5 - concat
        c = Concatenate()([x1r, x2.output, x3.output])
        model.add(InputLayer(input_tensor=c))
        # 6
        model.add(Conv2D(32, (3, 3), strides=1, padding="same"))
        model.add(BatchNormalization(axis=chanDim))
        model.add(Activation("relu"))
        model.add(Conv2D(25, (3, 3), strides=1, padding="same"))
        model.add(BatchNormalization(axis=chanDim))
        model.add(Activation("relu"))
        # 7
        final_layer = Conv2D(2, (3, 3), strides=1, padding="same")
        model.add(final_layer)
        model.add(BatchNormalization(axis=chanDim))
        model.add(Activation("relu"))
        # * first (and only) set of FC => RELU layers
        model.add(Flatten())
        model.add(Dense(64))
        model.add(Activation("relu"))
        model.add(BatchNormalization())
        # model.add(Dropout(0.5))
        # softmax classifier
        model.add(Dense(2))
        model.add(Activation("softmax"))
        return model

new_model = PatchNDepthBasedCNN.build(width=IMG_DIM, height=IMG_DIM, depth=3)
new_model.compile(
    optimizer="rmsprop",
    loss="categorical_crossentropy",
    metrics=["accuracy"],
)
During training, I save the model at each epoch for the purpose of the experiment (a minimal sketch of that checkpointing setup is below). I always thought that the latest model (the latest epoch) must be the preferred one (in case the model hasn't started to overfit). Still, when I evaluate each variant (epoch) of the trained model on the test set (from another data distribution), I get randomly fluctuating results from epoch to epoch. Say, the test accuracy at epoch 60 can be around 72%, at epoch 61 - 97%, at epoch 63 - 80%.
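For reference, per-epoch saving like this is typically done with a Keras ModelCheckpoint callback; the following is a minimal sketch of that pattern (the filepath and train_data are illustrative, not taken from my actual training script):
from tensorflow.keras.callbacks import ModelCheckpoint

# Save the full model after every epoch so each epoch's variant can be
# evaluated on the test set later; {epoch:03d} embeds the epoch number.
checkpoint = ModelCheckpoint(
    filepath="checkpoints/model-epoch-{epoch:03d}.h5",
    save_best_only=False,  # keep every epoch, not only the best one
    verbose=1,
)
new_model.fit(train_data, epochs=100, callbacks=[checkpoint])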
At the same time, if I substitute the last two layers of the model and change the loss function to simulate an SVM, I get overall worse results, but the tendency is clearly visible from epoch to epoch (test accuracy slowly rises from the base 50% to around 78%, and then fluctuates within a small margin):
class PatchNDepthBasedCNN:
    @staticmethod
    def build(width=256, height=256, depth=3):
        # ... identical to the model above up to and including block 7 ...
        model.add(Flatten())
        model.add(Dense(256))
        model.add(Activation("relu"))
        model.add(Dense(2, kernel_regularizer=l2(0.0001)))
        model.add(Activation('linear'))
        return model

new_model = PatchNDepthBasedCNN.build(width=IMG_DIM, height=IMG_DIM, depth=3)
new_model.compile(loss='hinge',
                  optimizer='adadelta',
                  metrics=['accuracy'])
What are the possible reasons/explanations for this behavior?
What advice could you give me to try to achieve better results (if it is possible to infer anything from the provided data)?
Thank you for considering my question!
EDIT: Removed unused code (the LR and other unused parameters of the model; they weren't actually taken into account while training, I just forgot to remove them).

Tensorflow NIH Chest X-ray CNN validation accuracy not improving even with regularization

I've been working on a CNN that takes in a 224x224 grayscale x-ray image and outputs either 0 or 1 based on whether it detects an abnormality.
This is the dataset I am using. I split the dataset in two, with 106496 images for training and the remaining 5624 for validation. Since they're both from the same dataset, they should come from the same distribution.
I tried training the model described above using the pretrained InceptionV3 and VGG19 architectures, without success. I then tried making my own model, similar to the VGG19 architecture.
I simplified the model as much as possible so that the training accuracy was above 90%, and added various regularizers such as dropout and l2. I also tried different hyperparameters and image augmentation, but the validation accuracy wouldn't exceed 70% after 5-10 epochs. The validation loss doesn't seem to drop at all either.
Here are my accuracy-vs-epoch and loss-vs-epoch curves (pink is training, green is validation):
And here is my code:
def create_model(settings):
    """
    Create a basic model
    """
    # create model
    model = tf.keras.models.Sequential()
    model.add(layers.Input((224, 224, 1)))
    # block 1
    model.add(layers.Conv2D(64, (3, 3), activation='relu', padding='same', kernel_initializer='he_uniform', use_bias=True, name='block1_conv'))
    model.add(layers.MaxPool2D((2, 2), strides=(2, 2), name='block1_pool'))
    # block 2
    model.add(layers.Conv2D(96, (3, 3), activation='relu', padding='same', kernel_initializer='he_uniform', use_bias=True, name='block2_conv'))
    model.add(layers.MaxPool2D((2, 2), strides=(2, 2), name='block2_pool'))
    # block 3
    model.add(layers.Conv2D(192, (3, 3), activation='relu', padding='same', kernel_initializer='he_uniform', use_bias=True, name='block3_conv1'))
    model.add(layers.Conv2D(192, (3, 3), activation='relu', padding='same', kernel_initializer='he_uniform', use_bias=True, name='block3_conv2'))
    model.add(layers.MaxPool2D((2, 2), strides=(2, 2), name='block3_pool'))
    # block 4
    model.add(layers.Conv2D(384, (3, 3), activation='relu', padding='same', kernel_initializer='he_uniform', use_bias=True, name='block4_conv1'))
    model.add(layers.Conv2D(384, (3, 3), activation='relu', padding='same', kernel_initializer='he_uniform', use_bias=True, name='block4_conv2'))
    model.add(layers.Conv2D(384, (3, 3), activation='relu', padding='same', kernel_initializer='he_uniform', use_bias=True, name='block4_conv3'))
    model.add(layers.MaxPool2D((2, 2), strides=(2, 2), name='block4_pool'))
    # block 5
    model.add(layers.Conv2D(512, (3, 3), activation='relu', padding='same', kernel_initializer='he_uniform', use_bias=True, name='block5_conv1'))
    model.add(layers.Conv2D(512, (3, 3), activation='relu', padding='same', kernel_initializer='he_uniform', use_bias=True, name='block5_conv2'))
    model.add(layers.Conv2D(512, (3, 3), activation='relu', padding='same', kernel_initializer='he_uniform', use_bias=True, name='block5_conv3'))
    model.add(layers.MaxPool2D((2, 2), strides=(2, 2), name='block5_pool'))
    # fully connected
    model.add(layers.GlobalAveragePooling2D(name='fc_pool'))
    model.add(layers.Dropout(0.3, name='fc_dropout'))
    model.add(layers.Dense(1, activation='sigmoid', name='fc_output'))
    # compile model
    model.compile(
        optimizers.SGD(
            learning_rate=settings["lr_init"],
            momentum=settings["momentum"],
        ),
        loss='binary_crossentropy',
        metrics=[
            'accuracy',
            metrics.Precision(),
            metrics.Recall(),
            metrics.AUC()
        ]
    )
    model.summary()
    return model

def configure_callbacks(settings):
    """
    Create a list of callback objects
    """
    # tensorboard
    log_dir = "logs/fit/" + datetime.datetime.now().strftime("%Y%m%d-%H%M%S")
    tensorboard_callback = callbacks.TensorBoard(log_dir=log_dir, histogram_freq=1)
    # learning rate reduction on plateau
    lrreduce_callback = callbacks.ReduceLROnPlateau(
        monitor='val_loss',
        factor=settings["lr_factor"],
        patience=settings["lr_patience"],
        min_lr=settings["lr_min"],
        verbose=1,
    )
    # save model
    checkpoint_callback = callbacks.ModelCheckpoint(
        filepath="saves/" + settings["modelname"] + "/cp-{epoch:03d}",
        monitor='val_accuracy',
        mode='max',
        save_weights_only=True,
        save_best_only=True,
        verbose=1,
    )
    return [tensorboard_callback, lrreduce_callback, checkpoint_callback]

def get_data(settings):
    """
    Create a generator that will be used for training
    """
    df = pd.read_csv("dataset/y_train_binary.csv")
    columns = [
        "Abnormal"
    ]
    datagen = ImageDataGenerator(
        rescale=1./255.,
        rotation_range=5,
        brightness_range=(0.9, 1.1),
        zoom_range=(1, 1.1),
    )
    # 94.983% for training (106496 = 64*6656)
    traindata = datagen.flow_from_dataframe(
        dataframe=df[:NTRAIN],
        directory="dataset/images",
        x_col="Image Index",
        y_col=columns,
        color_mode='grayscale',
        batch_size=settings["batchsize"],
        class_mode="raw",
        target_size=(224, 224),
        shuffle=True,
    )
    # 5.017% for testing (5624)
    testdata = datagen.flow_from_dataframe(
        dataframe=df[NTRAIN:],
        directory="dataset/images",
        x_col="Image Index",
        y_col=columns,
        color_mode='grayscale',
        batch_size=settings["batchsize"],
        class_mode="raw",
        target_size=(224, 224),
        shuffle=True,
    )
    return (traindata, testdata)

def newtrain(settings):
    """
    Create a new model "(modelname)" and train for (epoch) epochs
    """
    model = create_model(settings)
    callbacks = configure_callbacks(settings)
    traindata, testdata = get_data(settings)
    # train
    model.fit(
        x=traindata,
        epochs=settings["epoch"],
        validation_data=testdata,
        callbacks=callbacks,
        verbose=1,
    )
    model.save_weights(f"saves/{settings['modelname']}/cp-{settings['epoch']:03d}")
I'm running out of ideas, and it takes half a day to train 50 epochs, so I would appreciate it if anyone knows how I can solve this issue. Thanks.
If you do some EDA on the NIH Chest X-rays, you may also see that there is a significant class-imbalance issue among the 14 classes. From your model definition, I can assume that you put normal images on one side and abnormal ones (13 conditions) on the other. First of all, if this is true, I would say it's better to classify all cases - all of them are important in clinical practice.
Shift to 14-class classification.
You're using a model of your own design, but you should first start with a pretrained model. It works better, and you can then gradually integrate your own ideas.
Use a pretrained model, e.g. DenseNet, EfficientNet, NFNet, etc. (a minimal sketch is at the end of this answer).
In your data generator, you use shuffle=True for the test set, which is wrong; it should rather be False.
testdata = datagen.flow_from_dataframe(
    ....
    target_size=(224, 224),
    shuffle=False,
)
For better control of your input pipeline, IMO, you should write your own custom data generator and experiment with advanced augmentation to prevent overfitting.
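To make the pretrained-model suggestion concrete, here is a minimal sketch of what a DenseNet121-based 14-class multi-label setup could look like. The layer choices and the sigmoid head are my assumptions for illustration, not something taken from the question (and since the ImageNet weights expect 3 channels, the grayscale x-rays would need to be stacked into 3 identical channels):
import tensorflow as tf

# Pretrained DenseNet121 backbone with ImageNet weights, no classifier head.
base = tf.keras.applications.DenseNet121(
    include_top=False, weights="imagenet", input_shape=(224, 224, 3))
x = tf.keras.layers.GlobalAveragePooling2D()(base.output)
# 14 independent sigmoid outputs: each finding is its own yes/no label.
out = tf.keras.layers.Dense(14, activation="sigmoid")(x)
model = tf.keras.Model(base.input, out)
model.compile(optimizer="adam", loss="binary_crossentropy",
              metrics=[tf.keras.metrics.AUC(multi_label=True)])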

Keras loss is negative and accuracy is going down, but predictions are good?

I'm training a model in Keras with the Tensorflow-gpu backend.
The task is to detect buildings in satellite images.
The loss is going down (which is good), but in the negative direction, and the accuracy is going down too. The good part is that the model's predictions are improving. My concern is why the loss is negative. Moreover, why is the model improving while the accuracy goes down?
from tensorflow.keras.layers import Conv2D
from tensorflow.keras.layers import BatchNormalization
from tensorflow.keras.layers import Activation
from tensorflow.keras.layers import MaxPool2D as MaxPooling2D
from tensorflow.keras.layers import UpSampling2D
from tensorflow.keras.layers import concatenate
from tensorflow.keras.layers import Input
from tensorflow.keras import Model
from tensorflow.keras.optimizers import RMSprop
# LAYERS
inputs = Input(shape=(300, 300, 3))
# 300
down0 = Conv2D(32, (3, 3), padding='same')(inputs)
down0 = BatchNormalization()(down0)
down0 = Activation('relu')(down0)
down0 = Conv2D(32, (3, 3), padding='same')(down0)
down0 = BatchNormalization()(down0)
down0 = Activation('relu')(down0)
down0_pool = MaxPooling2D((2, 2), strides=(2, 2))(down0)
# 150
down1 = Conv2D(64, (3, 3), padding='same')(down0_pool)
down1 = BatchNormalization()(down1)
down1 = Activation('relu')(down1)
down1 = Conv2D(64, (3, 3), padding='same')(down1)
down1 = BatchNormalization()(down1)
down1 = Activation('relu')(down1)
down1_pool = MaxPooling2D((2, 2), strides=(2, 2))(down1)
# 75
center = Conv2D(1024, (3, 3), padding='same')(down1_pool)
center = BatchNormalization()(center)
center = Activation('relu')(center)
center = Conv2D(1024, (3, 3), padding='same')(center)
center = BatchNormalization()(center)
center = Activation('relu')(center)
# center
up1 = UpSampling2D((2, 2))(center)
up1 = concatenate([down1, up1], axis=3)
up1 = Conv2D(64, (3, 3), padding='same')(up1)
up1 = BatchNormalization()(up1)
up1 = Activation('relu')(up1)
up1 = Conv2D(64, (3, 3), padding='same')(up1)
up1 = BatchNormalization()(up1)
up1 = Activation('relu')(up1)
up1 = Conv2D(64, (3, 3), padding='same')(up1)
up1 = BatchNormalization()(up1)
up1 = Activation('relu')(up1)
# 150
up0 = UpSampling2D((2, 2))(up1)
up0 = concatenate([down0, up0], axis=3)
up0 = Conv2D(32, (3, 3), padding='same')(up0)
up0 = BatchNormalization()(up0)
up0 = Activation('relu')(up0)
up0 = Conv2D(32, (3, 3), padding='same')(up0)
up0 = BatchNormalization()(up0)
up0 = Activation('relu')(up0)
up0 = Conv2D(32, (3, 3), padding='same')(up0)
up0 = BatchNormalization()(up0)
up0 = Activation('relu')(up0)
# 300x300x3
classify = Conv2D(1, (1, 1), activation='sigmoid')(up0)
# 300x300x1
model = Model(inputs=inputs, outputs=classify)
model.compile(optimizer=RMSprop(lr=0.0001),
              loss='binary_crossentropy',
              metrics=[dice_coeff, 'accuracy'])
history = model.fit(sample_input, sample_target, batch_size=4, epochs=5)
OUTPUT:
Epoch 6/10
500/500 [==============================] - 76s 153ms/step - loss: -293.6920 - dice_coeff: 1.8607 - acc: 0.2653
Epoch 7/10
500/500 [==============================] - 75s 150ms/step - loss: -309.2504 - dice_coeff: 1.8730 - acc: 0.2618
Epoch 8/10
500/500 [==============================] - 75s 150ms/step - loss: -324.4123 - dice_coeff: 1.8810 - acc: 0.2659
Epoch 9/10
136/500 [=======>......................] - ETA: 55s - loss: -329.0757 - dice_coeff: 1.8940 - acc: 0.2757
[Predicted output image]
[Target image]
Where is the problem? (Leave dice_coeff aside; it's custom.)
Your output is not normalized for binary classification. (The data is probably not normalized either.)
If you loaded an image, it's probably in the range 0 to 255, or even 0 to 65535.
You should normalize y_train (divide by y_train.max()) and use a 'sigmoid' activation function at the end of your model.
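As a quick sketch of that suggestion, applied to the fit call from the question (assuming sample_target holds the raw masks):
import numpy as np

# Scale the target masks into [0, 1] so they match the model's sigmoid output;
# e.g. 8-bit masks in [0, 255] become floats in [0, 1].
sample_target = sample_target.astype(np.float32) / sample_target.max()
history = model.fit(sample_input, sample_target, batch_size=4, epochs=5)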

Tensorflow returns 10% validation accuracy for VGG model (irrespective of number of epochs)?

I am trying to train a neural network on CIFAR-10 using the keras package in tensorflow. The network considered is VGG-16, which I directly borrowed from the official keras models.
The definition is:
def cnn_model(nb_classes=10):
    # VGG-16 official keras model
    img_input = Input(shape=(32, 32, 3))
    vgg_layer = Conv2D(64, (3, 3), activation='relu', padding='same', name='block1_conv1')(img_input)
    vgg_layer = Conv2D(64, (3, 3), activation='relu', padding='same', name='block1_conv2')(vgg_layer)
    vgg_layer = MaxPooling2D((2, 2), strides=(2, 2), name='block1_pool')(vgg_layer)
    # Block 2
    vgg_layer = Conv2D(64, (3, 3), activation='relu', padding='same', name='block2_conv1')(vgg_layer)
    vgg_layer = Conv2D(64, (3, 3), activation='relu', padding='same', name='block2_conv2')(vgg_layer)
    vgg_layer = MaxPooling2D((2, 2), strides=(2, 2), name='block2_pool')(vgg_layer)
    # Block 3
    vgg_layer = Conv2D(128, (3, 3), activation='relu', padding='same', name='block3_conv1')(vgg_layer)
    vgg_layer = Conv2D(128, (3, 3), activation='relu', padding='same', name='block3_conv2')(vgg_layer)
    vgg_layer = Conv2D(128, (3, 3), activation='relu', padding='same', name='block3_conv3')(vgg_layer)
    vgg_layer = MaxPooling2D((2, 2), strides=(2, 2), name='block3_pool')(vgg_layer)
    # Block 4
    vgg_layer = Conv2D(256, (3, 3), activation='relu', padding='same', name='block4_conv1')(vgg_layer)
    vgg_layer = Conv2D(256, (3, 3), activation='relu', padding='same', name='block4_conv2')(vgg_layer)
    vgg_layer = Conv2D(256, (3, 3), activation='relu', padding='same', name='block4_conv3')(vgg_layer)
    vgg_layer = MaxPooling2D((2, 2), strides=(2, 2), name='block4_pool')(vgg_layer)
    # Classification block
    vgg_layer = Flatten(name='flatten')(vgg_layer)
    vgg_layer = Dense(1024, activation='relu', name='fc1')(vgg_layer)
    vgg_layer = Dense(1024, activation='relu', name='fc2')(vgg_layer)
    vgg_layer = Dense(nb_classes, activation='softmax', name='predictions')(vgg_layer)
    return Model(inputs=img_input, outputs=vgg_layer)
However, during training I always get both train and validation accuracy of 0.1, i.e. 10%.
validation accuracy for adv. training of model for epoch 1= 0.1
validation accuracy for adv. training of model for epoch 2= 0.1
validation accuracy for adv. training of model for epoch 3= 0.1
validation accuracy for adv. training of model for epoch 4= 0.1
validation accuracy for adv. training of model for epoch 5= 0.1
As a debugging step, whenever I replace it with any other model (e.g., any simple CNN model), it works perfectly well. This shows that the rest of the script works.
For example, the following CNN model works perfectly well and achieves an accuracy of 75% after 30 epochs.
def cnn_model(nb_classes=10, num_hidden=1024, weight_decay=0.0001, cap_factor=4):
    model = Sequential()
    input_shape = (32, 32, 3)
    model.add(Conv2D(32*cap_factor, kernel_size=(3, 3), strides=(1, 1), kernel_regularizer=keras.regularizers.l2(weight_decay), kernel_initializer="he_normal", activation='relu', padding='same', input_shape=input_shape))
    model.add(Conv2D(32*cap_factor, kernel_size=(3, 3), strides=(1, 1), kernel_regularizer=keras.regularizers.l2(weight_decay), kernel_initializer="he_normal", activation="relu", padding="same"))
    model.add(MaxPooling2D(pool_size=(2, 2), strides=(2, 2)))
    model.add(BatchNormalization())
    model.add(Dropout(0.25))
    model.add(Conv2D(64*cap_factor, kernel_size=(3, 3), strides=(1, 1), kernel_regularizer=keras.regularizers.l2(weight_decay), kernel_initializer="he_normal", activation="relu", padding="same"))
    model.add(Conv2D(64*cap_factor, kernel_size=(3, 3), strides=(1, 1), kernel_regularizer=keras.regularizers.l2(weight_decay), kernel_initializer="he_normal", activation="relu", padding="same"))
    model.add(MaxPooling2D(pool_size=(2, 2), strides=(2, 2)))
    model.add(BatchNormalization())
    model.add(Dropout(0.25))
    model.add(Flatten())
    model.add(Dense(num_hidden, activation='relu'))
    model.add(Dropout(0.5))
    model.add(Dense(nb_classes, activation='softmax'))
    return model
Both of these models appear to be correctly defined. However, one works perfectly while the other doesn't learn at all. I also tried writing the VGG model as a Sequential structure (i.e., similar to the second one), but it still gave me 10% accuracy.
Even if the model didn't update any weights at all, the "he_normal" initializer should still yield an accuracy noticeably better than pure chance. It appears that somehow tensorflow computes output logits from the model that result in accuracy at the level of pure chance.
I would be really grateful if someone could point out my mistake.
Your 10% corresponds suspiciously well with the number of classes (10). That makes me think that, regardless of the training, your model always gives the same answer for all inputs, which constantly yields 10% accuracy on 10 classes.
Check the output of the untrained model to see whether it always predicts the same class (a minimal sketch of this check is below).
If so, check the initial weights of the model; it is probably wrongly initialized, the gradients are zero, and it can't converge.
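A minimal sketch of that check (x_test here stands for whatever batch of test images you have at hand; it is not from the question):
import numpy as np

model = cnn_model(nb_classes=10)
# Predict on a small batch BEFORE any training and inspect the distribution.
probs = model.predict(x_test[:256])
print(probs[:5])  # raw softmax outputs per class
print(np.bincount(probs.argmax(axis=1), minlength=10))
# If one class absorbs (almost) all of the argmax votes, the network is stuck
# predicting a single label - consistent with the constant 10% accuracy.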

Training Convolutional Autoencoder with Keras

I'm training a convolutional autoencoder on IR faces; this is my first time building an autoencoder. I have about 1300 training images, and I didn't use any regularization. Here's what I got after 800 epochs:
Top: test images; bottom: output from the autoencoder.
And this is my training curve: top, training loss; bottom, validation loss. The validation loss is computed on test-set images held out from the training set. At the end, the training loss is about 0.006, but the validation loss is 0.009.
My model is defined below, with input images of size 110x150 and output images of size 88x120. I simply resize the source images to make the training labels. Each sample/label is normalized by dividing by 255.
As for the architecture of this network, I read one paper using a similar layout for RGB face features, and I halved each layer's depth (channels) for my purpose.
So my question is: is there something wrong? The training curve looks quite weird to me. And how do I improve this autoencoder? More epochs? Regularization? A different activation function (I've heard leaky ReLU is better)? Any feedback and suggestions are appreciated, thanks!
def create_models():
    input_img = Input(shape=(150, 110, 1))  # adapt this if using `channels_first` image data format
    x = Conv2D(128, (3, 3), activation='relu', padding='same')(input_img)
    x = MaxPooling2D((2, 2), padding='same')(x)
    x = Conv2D(128, (3, 3), activation='relu', padding='same')(x)
    x = MaxPooling2D((2, 2), padding='same')(x)
    x = Conv2D(256, (3, 3), activation='relu')(x)
    x = MaxPooling2D((2, 2), padding='same')(x)
    x = Conv2D(512, (3, 3), activation='relu')(x)
    encoded = MaxPooling2D((2, 2), padding='same')(x)
    # at this point the representation is (8, 6, 512)
    x = Conv2D(512, (3, 3), activation='relu', padding='same')(encoded)
    x = UpSampling2D((2, 2))(x)
    x = Conv2D(256, (3, 3), activation='relu', padding='same')(x)
    x = UpSampling2D((2, 2))(x)
    x = Conv2D(128, (3, 3), activation='relu')(x)
    x = UpSampling2D((2, 2))(x)
    x = Conv2D(128, (3, 3), activation='relu', padding='same')(x)
    x = UpSampling2D((2, 2))(x)
    decoded = Conv2D(1, (3, 3), activation='tanh', padding='same')(x)
    autoencoder = Model(input_img, decoded)
    autoencoder.compile(optimizer='adadelta', loss='mean_squared_error')
    return autoencoder
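On the two concrete ideas raised in the question, here is a hedged sketch of how they could be tried in this model; both are experiments to run, not fixes confirmed by the post. Since the labels are scaled to [0, 1] by dividing by 255, a sigmoid output matches that range better than tanh (whose range is [-1, 1]), and LeakyReLU can replace the inner relu activations:
from tensorflow.keras.layers import Input, Conv2D, LeakyReLU
from tensorflow.keras.models import Model

input_img = Input(shape=(150, 110, 1))
# Inner layers: a linear conv followed by LeakyReLU instead of activation='relu'.
x = Conv2D(128, (3, 3), padding='same')(input_img)
x = LeakyReLU(alpha=0.1)(x)
# Output layer: sigmoid keeps predictions in [0, 1], matching labels that
# were normalized by dividing by 255 (tanh produces values in [-1, 1]).
decoded = Conv2D(1, (3, 3), activation='sigmoid', padding='same')(x)
model = Model(input_img, decoded)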