Large variation in loss and accuracy validation values during training Resnet50 on binary class image classification - tensorflow

I am using Resnet50 for binary image classification and the model shows a high variation in loss and accuracy on validation data during the epochs.
This is what I get after 40 epochs.
Here is my model code:
def build_model():
model = ResNet50(include_top=True ,input_shape=(224,224,3) , weights="imagenet")
for layer in model.layers:
layer.trainable=True
base_input= model.layers[0].input
base_output= model.layers[-2].output
l =Dense(units = 512 ,activation='sigmoid')(l)
l=BatchNormalization()(l)
l=Dropout(0.4)(l)
final_output= Dense(units = 1 ,activation='sigmoid')(l)
new_model= Model(inputs=base_input,outputs= final_output)
return new_model
def train_model(model, train_generator, valid_generator):
model.compile(optimizer = Adam(), loss = 'binary_crossentropy', metrics = ['accuracy'])
history= model.fit(train_generator ,validation_data=(valid_generator) ,epochs=40)
return history
I need to know what's the problem & how to fix it
thanks in advance

Related

I am training a deepfake image detection model, but why the validation accuracy is not changing?

I am training deepfake image detection using Tensorflow, but the validation accuracy is stuck at 67. I have tried to use different optimizers, but it's not decreasing and only floating around the same score.
Here is my step to creating the model.
Importing data from the image folder
Create an ImageDataGenerator object to do some augmentation.
datagen = ImageDataGenerator(
horizontal_flip=True,
validation_split=0.2,
rescale=1./255,
)
Creating the model
image dimension: 299, 299, 3
input_layer = Input(shape = (image_dimensions['height'], image_dimensions['width'], image_dimensions['channels']))
base_model = keras.applications.EfficientNetB5(
weights='imagenet',
input_shape=(image_dimensions['height'], image_dimensions['width'], image_dimensions['channels']),
include_top=False)
base_model.trainable = False
x = base_model(input_layer, training=False)
# Add pooling layer or flatten layer
y = GlobalAveragePooling2D()(x)
y = Dense(512, activation='relu')(y)
y = Dropout(0.4)(y)
y = Dense(256)(y)
# Add final dense layer
output_layer = Dense(1, activation='sigmoid')(y)
model = Model(inputs=input_layer, outputs=output_layer)
Training
efficientNet = EfficientNet(learning_rate = 0.001)
efficientNet.summary()
history = efficientNet.fit(datagen.flow(X_train, y_train, batch_size=64, subset='training'),
epochs=10,
validation_data=datagen.flow(X_train, y_train, batch_size=64, subset='validation'))
Result
Here is the result of the model training
Is there anyway I can fix this problem?

Getting constant accuracies for training and validation sets despite their losses are changing during CNN training?

As the title clearly describes the issue I've been experiencing during the training of my CNN model, the accuracies of training and validation sets are constant despite the losses of them are changing. I have included the detail regarding the model and its training setup below. What may cause this issue?
Here is the data that was used by training (X_train & y_train), validation, and test sets (X_test and y_test):
df = pd.read_csv(CSV_PATH, sep=',', header=None)
print(f'Shape of all data: {df.shape}')
y = df.iloc[:, -1].values
X = df.iloc[:, :-1].values
encoder = LabelEncoder()
encoder.fit(y)
encoded_Y = encoder.transform(y)
dummy_y = to_categorical(encoded_Y)
X_train, X_test, y_train, y_test = train_test_split(X, dummy_y, test_size=0.3, random_state=RANDOM_STATE)
X_train = X_train.reshape((X_train.shape[0], X_train.shape[1], 1))
X_test = X_test.reshape((X_test.shape[0], X_test.shape[1], 1))
Here are the shapes of training and test sets:
Shape of X_train: (1322, 10800, 1)
Shape of Y_train: (1322, 3)
Shape of X_test: (567, 10800, 1)
Shape of y_test: (567, 3)
Here is my CNN model:
# Model hyper-parameters
activation_fn = 'relu'
n_lr = 1e-4
weight_decay = 1e-4
batch_size = 64
num_epochs = 200*10*10
num_classes = 3
n_dropout = 0.6
n_momentum = 0.5
n_kernel = 5
n_reg = 1e-5
# the sequential model
model = Sequential()
model.add(Conv1D(128, n_kernel, input_shape=(10800, 1)))
model.add(BatchNormalization())
model.add(Activation(activation_fn))
model.add(MaxPooling1D(pool_size=2, strides=2))
model.add(Dropout(n_dropout))
model.add(Conv1D(256, n_kernel))
model.add(BatchNormalization())
model.add(Activation(activation_fn))
model.add(MaxPooling1D(pool_size=2, strides=2))
model.add(Dropout(n_dropout))
model.add(GlobalAveragePooling1D()) # have tried model.add(Flatten()) as well
model.add(Dense(256, activation=activation_fn))
model.add(Dropout(n_dropout))
model.add(Dense(64, activation=activation_fn))
model.add(Dropout(n_dropout))
model.add(Dense(num_classes, activation='softmax'))
adam = Adam(lr=n_lr, beta_1=0.9, beta_2=0.999, epsilon=1e-08, decay=weight_decay)
model.compile(loss='categorical_crossentropy', optimizer=adam, metrics=['acc'])
Here is how I have evaluated the model:
Y_pred = model.predict(X_test, verbose=0)
y_pred = np.argmax(Y_pred, axis=1)
y_test_int = np.argmax(y_test, axis=1)
And, my model always predicts the same class of three classes during the model evaluation as you can see from the classification result below (via classification_result(y_test_int, y_pred) function):
precision recall f1-score support
normal 0.743 1.000 0.852 421
apb 0.000 0.000 0.000 45
pvc 0.000 0.000 0.000 101
The model was trained using the EarlyStopping callback of Keras. Thus, the training has continued for 4,173 epochs. Here is the obtained losses during the training for training and validation sets:
Here are the obtained accuracies during the training for training and validation sets:
The model was implemented using Keras and hosted on Google Colab.
Although such issues are difficult to resolve without the data, there are a couple of general rules applicable.
The very first thing we do when the model does not seem to learn anything, like here (despite the mild drop in the loss), is to remove all dropout.
In fact, dropout is not supposed to be used by default; its nominal function is to guard against overfitting - but of course, before starting to worry about overfitting, you must first have some success with fitting, something that is clearly not happening here. The fact that, with a dropout rate of n_dropout = 0.6, you also seem to be rather too aggressive in its use, does not help, either.

CNN with imbalanced data stuck with 70% testing accuracy

I'm working on image classification task for diabetic retinopathy with fundus image data. There are 5 classes. The data distribution is 1805 images (class 1), 370 images (class 2), 999 images (class 3), 193 images (class 4), 295 images (class 5).
Here are the steps that I have tried to run:
Preprocessing (resized 224 * 224)
The divide of train and test data is 85% : 15%
x_train, xtest, y_train, ytest = train_test_split(
x_train, y_train,
test_size = 0.15,
random_state=SEED,
stratify = y_train
)
Data agumentation
ImageDataGenerator(
zoom_range=0.15,
fill_mode='constant',
cval=0.,
horizontal_flip=True,
vertical_flip=True,
)
Training with the ResNet-50 model and cross-validation
def getResNet():
modelres = ResNet50(weights=None, include_top=False, input_shape= (IMAGE_HEIGHT,IMAGE_HEIGHT, 3))
x = modelres.output
x = GlobalAveragePooling2D()(x)
x = Dense(5, activation= 'softmax')(x)
model = Model(inputs = modelres.input, outputs = x)
return model
num_folds = 5
skf = StratifiedKFold(n_splits = 5, shuffle=True, random_state=2021)
cvscores = []
fold = 1
for train, val in skf.split(x_train, y_train.argmax(1)):
print('Fold: ', fold)
Xtrain = x_train[train]
Xval = x_train[val]
Ytrain = y_train[train]
Yval = y_train[val]
data_generator = create_datagen().flow(Xtrain, Ytrain, batch_size=32, seed=2021)
model = getResNet()
model.compile(loss='categorical_crossentropy',
optimizer=Adam(lr=0.0001),
metrics=['accuracy'])
with tf.compat.v1.device('/device:GPU:0'):
model_train = model.fit(data_generator,
validation_data=(Xval, Yval),
epochs=30, batch_size = 32, verbose=1)
model_name = 'cnn_keras_aug_Fold_'+str(fold)+'.h5'
model.save(model_name)
scores = model.evaluate(xtest, ytest, verbose=0)
print("%s: %.2f%%" % (model.metrics_names[1], scores[1]*100))
cvscores.append(scores[1] * 100)
fold = fold +1
The maximum results I got from this method were training accuracy of 81.2%, validation accuracy of 72.2%, and test accuracy of 70.73%.
Can anyone give me an idea to improve the model so that I can get the test accuracy above 90% as possible?
Later, I will use this model as a pre-trained model to train diabetic retinopathy data as well but from other sources.
BTW, I've tried replacing my preprocessing with this method:
def preprocessing(path):
image = cv2.imread(path)
image = crop_image_from_gray(image)
green = image[:,:,1]
clahe = cv2.createCLAHE(clipLimit=2.0, tileGridSize=(8,8))
cl = clahe.apply(green)
image[:,:,0] = image[:,:,0]
image[:,:,2] = image[:,:,2]
image[:,:,1] = cl
image = cv2.resize(image, (224,224))
return image
I've also tried to replace my model with VGG16, EfficientNetB0. However, none of that had much effect on my results. I'm still stucked with about 70% accuracy.
Please help me come up with ideas to improve my modeling results. I hope.
Your training accuracy is 81.2%. It is generally impossible to have testing accuracy higher that training accuracy, i.e. with current setup you will not achieve 90%.
However, your validation (and also testing) accuracy is about 70-72%. I can suggest that on your small dataset your model is overfitting. So if you add model regularization (e.g. dropout), it is possible that the gap between your training and your validation (and test) will decrease. This way you can improve your validation score.
To further increase the score, you need to check your data manually and try to understand which classes contribute the most to the errors and figure out how those errors can be reduced (e.g. updating your preprocessing pipeline).

Different results from Tensorflow and Keras

I get different results from Tensorflow and Keras with the same network structure.
The loss function looks like
class MaskedMultiCrossEntropy(object):
def loss(self, y_true, y_pred):
vec = tf.nn.softmax_cross_entropy_with_logits(logits=y_pred, labels=y_true, dim=1)
mask = tf.equal(y_true[:,0,:], -1)
zer = tf.zeros_like(vec)
loss = tf.where(mask, x=zer, y=vec)
return loss
The network layer I used is called CrowdsClassification, which is implemented by Keras. Then I build the network by
x = Dense(128, input_shape=(input_dim,), activation='relu')(inputs)
x = Dropout(0.5)(x)
x = Dense(N_CLASSES)(x)
x = Activation("softmax")(x)
crowd = CrowdsClassification(num_classes, num_oracles, conn_type="MW")
x = crowd(x)
Train the model with Keras
model = Model(inputs=inputs, outputs=x)
model.compile(optimizer='adam', loss=loss)
model.fit(inputs,
true_class, epochs=100, shuffle=False, verbose=2, validation_split=0.1))
Train the model with tensorflow
optimizer = tf.train.AdamOptimizer(learning_rate=0.01, beta1=0.9, beta2=0.999)
opt_op = optimizer.minimize(loss, global_step=global_step)
sess.run(tf.global_variables_initializer())
for epoch in range(100):
sess.run([loss, opt_op], feed_dict=train_feed_dict)
The Tensorflow will get a wrong prediction. It seems that the issue comes from the loss function, that Tensorflow cannot backproporgate the masked loss. Anyone can give some advices? Thx a lot.

My loss is "nan" and accuracy is " 0.0000e+00 " in Transfer learning: InceptionV3

I am working on transfer learning. My use case is to classify two categories of images. I used InceptionV3 to classify images. When training my model, I am getting nan as loss and 0.0000e+00 as accuracy in every epoch. I am using 20 epochs because my data amount is small: I got 1000 images for training and 100 for testing and per batch 5 records.
from keras.applications.inception_v3 import InceptionV3
from keras.preprocessing import image
from keras.models import Model
from keras.layers import Dense, GlobalAveragePooling2D
from keras import backend as K
# create the base pre-trained model
base_model = InceptionV3(weights='imagenet', include_top=False)
# add a global spatial average pooling layer
x = base_model.output
x = GlobalAveragePooling2D()(x)
# let's add a fully-connected layer
x = Dense(1024, activation='relu')(x)
x = Dense(512, activation='relu')(x)
x = Dense(32, activation='relu')(x)
# and a logistic layer -- we have 2 classes
predictions = Dense(1, activation='softmax')(x)
# this is the model we will train
model = Model(inputs=base_model.input, outputs=predictions)
for layer in base_model.layers:
layer.trainable = False
# we chose to train the top 2 inception blocks, i.e. we will freeze
# the first 249 layers and unfreeze the rest:
for layer in model.layers[:249]:
layer.trainable = False
for layer in model.layers[249:]:
layer.trainable = True
model.compile(loss="binary_crossentropy", optimizer="adam", metrics=["accuracy"])
from keras.preprocessing.image import ImageDataGenerator
train_datagen = ImageDataGenerator(
rescale=1./255,
shear_range=0.2,
zoom_range=0.2,
horizontal_flip=True)
test_datagen = ImageDataGenerator(rescale=1./255)
training_set = train_datagen.flow_from_directory(
'C:/Users/Desktop/Transfer/train/',
target_size=(64, 64),
batch_size=5,
class_mode='binary')
test_set = test_datagen.flow_from_directory(
'C:/Users/Desktop/Transfer/test/',
target_size=(64, 64),
batch_size=5,
class_mode='binary')
model.fit_generator(
training_set,
steps_per_epoch=1000,
epochs=20,
validation_data=test_set,
validation_steps=100)
It sounds like your gradient is exploding. There could be a few reasons for that:
Check that your input is generated correctly. For example use the save_to_dir parameter of flow_from_directory
Since you have a batch size of 5, fix the steps_per_epoch from 1000 to 1000/5=200
Use sigmoid activation instead of softmax
Set a lower learning rate in Adam; to do that you need to create the optimizer separately like adam = Adam(0.0001) and pass it in model.compile(..., optimizer=adam)
Try VGG16 instead of InceptionV3
Let us know when you tried all of the above.
Using Softmax for the activation does not make sense in case of single class. Your output value will always be normed by itself, thus equals to 1. The purpose of softmax is to make the values sum up to 1. In case of single value you will get it == 1. I believe at some moment in time you got 0 as predicted value, which resulted in zero division and NaN loss value.
You should either change the number of classes to 2 by:
predictions = Dense(2, activation='softmax')(x)
class_mode='categorical' in flow_from_directory
loss="categorical_crossentropy"
or use the sigmoid activation function for the last layer.