How to prevent overfitting in transfer learning with VGG16 - TensorFlow

I'm trying to train a model to recognize facial expressions, so basically a classification problem with 7 classes:
img_size = 48
batch_size = 64

datagen_train = ImageDataGenerator(
    rotation_range=15,
    width_shift_range=0.15,
    height_shift_range=0.15,
    shear_range=0.15,
    zoom_range=0.15,
    horizontal_flip=True,
    preprocessing_function=preprocess_input
)
train_generator = datagen_train.flow_from_directory(
    train_path,
    target_size=(img_size, img_size),
    # color_mode='grayscale',
    batch_size=batch_size,
    class_mode='categorical',
    shuffle=True
)

datagen_validation = ImageDataGenerator(
    horizontal_flip=True,
    preprocessing_function=preprocess_input
)
validation_generator = datagen_train.flow_from_directory(
    valid_path,
    target_size=(img_size, img_size),
    # color_mode='grayscale',
    batch_size=batch_size,
    class_mode='categorical',
    shuffle=True,
)
I'm using ImageDataGenerator, and I built the model with transfer learning from VGG16 without its classification head (include_top=False), like so:
ptm = PretrainedModel(
    input_shape=[48, 48, 3],
    weights='imagenet',
    include_top=False
)
x = Flatten()(ptm.output)
x = Dropout(0.5)(x)
x = Dense(512, activation='relu')(x)
x = BatchNormalization()(x)
x = Dropout(0.5)(x)
x = Dense(512, activation='relu')(x)
x = BatchNormalization()(x)
x = Dense(7, activation='softmax',
          kernel_initializer='random_uniform',
          bias_initializer='random_uniform',
          bias_regularizer=regularizers.l2(0.01),
          name='predictions')(x)

opt = optimizers.RMSprop(learning_rate=0.0001)
model = Model(inputs=ptm.input, outputs=x)
model.compile(
    loss='categorical_crossentropy',
    optimizer=opt,
    metrics=['accuracy']
)
model.summary()
I used early stopping and a learning-rate scheduler as callbacks and ran 100 epochs:
early_stopping = EarlyStopping(
    monitor='val_accuracy',
    min_delta=0.00005,
    patience=11,
    verbose=1,
    restore_best_weights=True,
)
lr_scheduler = ReduceLROnPlateau(
    monitor='val_accuracy',
    factor=0.5,
    patience=7,
    min_lr=1e-7,
    verbose=1,
)
callbacks = [
    early_stopping,
    lr_scheduler,
]
Training stopped early after 61 epochs. I got decent training accuracy, but val_accuracy was much lower:
loss: 0.6081 - accuracy: 0.7910 - val_loss: 1.4658 - val_accuracy: 0.5608
Any suggestions on how I can fix this overfitting? Thanks!

In your validation generator, remove horizontal_flip=True and set shuffle=False. Also, you currently have
validation_generator = datagen_train.flow_from_directory(...)
You want to change it to
validation_generator = datagen_validation.flow_from_directory(...)
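Putting those changes together, a minimal sketch of the corrected validation pipeline could look like this (valid_path, img_size, batch_size and preprocess_input are the names already used in the question; the only changes are the ones described above):

# No augmentation for validation data, and no shuffling, so metrics and
# predictions stay aligned with the directory order.
datagen_validation = ImageDataGenerator(preprocessing_function=preprocess_input)

validation_generator = datagen_validation.flow_from_directory(
    valid_path,
    target_size=(img_size, img_size),
    batch_size=batch_size,
    class_mode='categorical',
    shuffle=False
)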

Related

Ways to decrease validation loss % and increase validation accuracy %?

I'm working with an image classification model for gravity wave detection.
I want to check if there is something I could do to lower the validation loss or, more importantly, increase the validation accuracy.
The dataset is about 460 images in total, split into:
300 images belonging to 2 classes (training)
60 images belonging to 2 classes (validation)
100 images belonging to 2 classes (test)
For context, this is the preprocessing code:
batch_size = 32

# Data augmentation:
train_datagen = ImageDataGenerator(
    rescale=1. / 255,
    shear_range=0.2,
    horizontal_flip=True,
)
test_datagen = ImageDataGenerator(rescale=1. / 255)
The generators that read images from the directories and yield batches of (augmented) image data:
train_generator = train_datagen.flow_from_directory(
    './train',                # target directory
    target_size=(256, 256),   # images resized to 256x256
    batch_size=batch_size,
    class_mode='binary')

validation_generator = train_datagen.flow_from_directory(
    './validation',
    target_size=(256, 256),
    batch_size=batch_size,
    class_mode='binary')

test_generator = test_datagen.flow_from_directory(
    './test',
    target_size=(256, 256),
    batch_size=batch_size,
    class_mode=None,
    shuffle=False)
And this is the model used:
from tensorflow.keras.applications.inception_v3 import InceptionV3
from tensorflow.keras import layers
from tensorflow.keras.models import Model
import tensorflow as tf
from keras import regularizers

base_model = InceptionV3(input_shape=(256, 256, 3), include_top=False, weights='imagenet')

x = layers.Flatten()(base_model.output)
x = layers.Dense(1024, activation='relu')(x)
x = layers.Dropout(0.2)(x)
x = layers.Dense(1, activation='sigmoid')(x)

model = Model(base_model.input, x)
model.compile(optimizer=tf.keras.optimizers.SGD(learning_rate=0.001),
              loss='binary_crossentropy',
              metrics=['accuracy'])

fitness = model.fit(
    train_generator,
    steps_per_epoch=120,
    epochs=100,
    validation_data=validation_generator,
    validation_steps=64)
So far the accuracy and loss have been around:
Average training accuracy: 0.9237500047683715
Average training loss: 0.17309745135484264
Average validation accuracy: 0.6489999979734421
Average validation loss: 0.9121886402368545
The predictions have been around:
validation predictions: (24, 36)
test predictions: (45, 55)
And the confusion matrix:
Confusion Matrix:
array([[12, 18],
       [12, 18]])
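One thing worth double-checking here, as a hedged suggestion along the same lines as the answer to the first question above: validation_generator is built from train_datagen, which applies shear and flip augmentation, so the validation metrics are computed on augmented images. A common pattern is to build it from the non-augmenting test_datagen instead:

# Sketch: validate on unaugmented, rescaled-only images
validation_generator = test_datagen.flow_from_directory(
    './validation',
    target_size=(256, 256),
    batch_size=batch_size,
    class_mode='binary',
    shuffle=False)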

Getting wrong y_pred values from model.predict

First of all, I am very new to deep learning. I want to create a confusion matrix, so I need y_pred and y_true. I calculated them the following way:
y_true = test_gen.classes
y_pred = (model.predict(test_gen)>0.5).astype("int32")
My confusion matrix code is:
from sklearn.metrics import classification_report, confusion_matrix

print('Confusion Matrix')
print(confusion_matrix(y_true, y_pred))
mat = confusion_matrix(y_true, y_pred)
I set metrics=['accuracy','TruePositives', 'TrueNegatives', 'FalsePositives', 'FalseNegatives'] in my model.compile
best_model = model
best_model.load_weights('./classify_model.h5')
best_model.evaluate(test_gen)
The values that I get for TruePositives, TrueNegatives, FalsePositives, and FalseNegatives from best_model.evaluate(test_gen) don't match my confusion matrix values.
(Screenshots of my train and test datasets are omitted here.)
target_size = (224, 224)
batch_size = 64

train_datagen = ImageDataGenerator(
    preprocessing_function=tf.keras.applications.resnet_v2.preprocess_input,
    horizontal_flip=True,
    zoom_range=0.1
)
test_datagen = ImageDataGenerator(
    preprocessing_function=tf.keras.applications.resnet_v2.preprocess_input
)

train_gen = train_datagen.flow_from_dataframe(
    train_df,
    directory=train_path,
    x_col='file_paths',
    y_col='labels',
    target_size=target_size,
    batch_size=batch_size,
    color_mode='rgb',
    class_mode='binary'
)
valid_gen = test_datagen.flow_from_dataframe(
    valid_df,
    directory=train_path,
    x_col='file_paths',
    y_col='labels',
    target_size=target_size,
    batch_size=batch_size,
    color_mode='rgb',
    class_mode='binary'
)
test_gen = test_datagen.flow_from_dataframe(
    test_df,
    directory=test_path,
    x_col='file_paths',
    y_col='labels',
    target_size=target_size,
    batch_size=batch_size,
    color_mode='rgb',
    class_mode='binary'
)
base_model = tf.keras.applications.ResNet50V2(include_top=False, input_shape=(224, 224, 3))

model = tf.keras.Sequential([
    base_model,
    tf.keras.layers.GlobalAveragePooling2D(),
    tf.keras.layers.Dense(128, activation='relu'),
    tf.keras.layers.BatchNormalization(),
    tf.keras.layers.Dropout(0.2),
    tf.keras.layers.Dense(1, activation='sigmoid')
])

lr = 0.001
model.compile(loss='binary_crossentropy',
              optimizer=Adam(learning_rate=lr),
              metrics=['accuracy', 'TruePositives', 'TrueNegatives',
                       'FalsePositives', 'FalseNegatives'])
I am having trouble calculating y_true and y_pred correctly. Please help me construct the confusion matrix for this code.
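For reference, a minimal sketch of how y_true and y_pred are usually aligned before building the confusion matrix (a hedged suggestion, not an answer from this thread): it assumes test_gen is created with shuffle=False, since flow_from_dataframe shuffles by default, and test_gen.classes would then no longer match the order in which predict() sees the samples.

from sklearn.metrics import confusion_matrix

# Assumes test_gen was built with shuffle=False so that the label order in
# test_gen.classes matches the prediction order.
y_true = test_gen.classes
y_prob = best_model.predict(test_gen)            # shape: (num_samples, 1)
y_pred = (y_prob > 0.5).astype("int32").ravel()  # 1-D array of 0/1 labels

print(confusion_matrix(y_true, y_pred))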

Stuck in first epoch when training CNN model in Google Colab

I created a model to identify plant diseases; it is expected to identify 10 diseases. It worked fine in a Jupyter notebook, but it was slow due to GPU constraints. So I decided to run the model in Google Colab, but it did not run; it gets stuck at the first epoch.
The code I use to construct the model is given below
BATCH_SIZE = 64
IMAGE_SIZE = 256
CHANNELS = 3
EPOCHS = 10

dataset = tf.keras.preprocessing.image_dataset_from_directory(
    "/content/drive/MyDrive/google-colab-files/PlantVillage",
    seed=123,
    shuffle=True,
    image_size=(IMAGE_SIZE, IMAGE_SIZE),
    batch_size=BATCH_SIZE
)
def get_dataset_partisions_tf(ds, trains_split=0.8, val_split=0.1, test_split=0.1,
                              shuffle=True, shuffle_size=10000):
    ds_size = len(ds)
    if shuffle:
        ds = ds.shuffle(shuffle_size, seed=12)
    train_size = int(trains_split * ds_size)
    val_size = int(val_split * ds_size)
    train_ds = ds.take(train_size)
    val_ds = ds.skip(train_size).take(val_size)
    test_ds = ds.skip(train_size).skip(val_size)
    return train_ds, val_ds, test_ds

train_ds, val_ds, test_ds = get_dataset_partisions_tf(dataset)

train_ds = train_ds.cache().shuffle(1000).prefetch(buffer_size=tf.data.AUTOTUNE)
val_ds = val_ds.cache().shuffle(1000).prefetch(buffer_size=tf.data.AUTOTUNE)
test_ds = test_ds.cache().shuffle(1000).prefetch(buffer_size=tf.data.AUTOTUNE)
resize_and_rescales = Sequential([
    layers.experimental.preprocessing.Resizing(IMAGE_SIZE, IMAGE_SIZE),
    layers.experimental.preprocessing.Rescaling(1.0 / 255)
])

data_agmetation = Sequential([
    layers.experimental.preprocessing.RandomFlip('horizontal_and_vertical'),
    layers.experimental.preprocessing.RandomRotation(0.2),
])
input_shape = (BATCH_SIZE, IMAGE_SIZE, IMAGE_SIZE, CHANNELS)
n_classes = 10

model = Sequential([
    resize_and_rescales,
    data_agmetation,
    layers.Conv2D(32, (3, 3), activation='relu', input_shape=input_shape),
    layers.MaxPooling2D((2, 2)),
    layers.Conv2D(64, kernel_size=(3, 3), activation='relu'),
    layers.MaxPooling2D((2, 2)),
    layers.Conv2D(64, kernel_size=(3, 3), activation='relu'),
    layers.MaxPooling2D((2, 2)),
    layers.Conv2D(64, (3, 3), activation='relu'),
    layers.MaxPooling2D((2, 2)),
    layers.Conv2D(64, (3, 3), activation='relu'),
    layers.MaxPooling2D((2, 2)),
    layers.Conv2D(64, (3, 3), activation='relu'),
    layers.MaxPooling2D((2, 2)),
    layers.Flatten(),
    layers.Dense(64, activation='relu'),
    layers.Dense(n_classes, activation='softmax'),
])
model.build(input_shape=input_shape)
model.summary()
(A screenshot of the model summary is omitted here.)
model.compile(
    optimizer='adam',
    loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=False),
    metrics=['accuracy']
)
When I use the following code to train the model:
model.fit(
    train_ds,
    epochs=EPOCHS,
    batch_size=BATCH_SIZE,
    verbose=2,
    validation_data=val_ds
)
it stays stuck in the first epoch.
Check whether TensorFlow is actually using a GPU, and try reducing the batch size.
My assumption is that this is because of your verbose setting; set verbose=1 to see which step of the epoch you are on.
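For the first suggestion, a quick way to check from inside the Colab notebook is the following (a minimal check; the Colab runtime type also has to be set to GPU under Runtime > Change runtime type):

import tensorflow as tf

# An empty list means TensorFlow only sees the CPU and training will be slow.
print(tf.config.list_physical_devices('GPU'))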

TensorFlow 2 Quantization Aware Training (QAT) with tf.GradientTape

Can anyone point to references where one can learn how to perform Quantization Aware Training (QAT) with tf.GradientTape on TensorFlow 2?
I only see this done with the tf.keras API. I don't use tf.keras; I always build a customized training loop with tf.GradientTape, since it provides more control over the training process. I now need to quantize a model, but I only see references on how to do it using the tf.keras API.
The official examples show QAT training with model.fit. Here is a demonstration of Quantization Aware Training using tf.GradientTape(); for complete reference, let's do both here.
Base model training. This is directly from the official doc. For more details, please check there.
import os
import tensorflow as tf
from tensorflow import keras
import tensorflow_model_optimization as tfmot

# Load MNIST dataset
mnist = keras.datasets.mnist
(train_images, train_labels), (test_images, test_labels) = mnist.load_data()

# Normalize the input images so that each pixel value is between 0 and 1.
train_images = train_images / 255.0
test_images = test_images / 255.0

# Define the model architecture.
model = keras.Sequential([
    keras.layers.InputLayer(input_shape=(28, 28)),
    keras.layers.Reshape(target_shape=(28, 28, 1)),
    keras.layers.Conv2D(filters=12, kernel_size=(3, 3), activation='relu'),
    keras.layers.MaxPooling2D(pool_size=(2, 2)),
    keras.layers.Flatten(),
    keras.layers.Dense(10)
])

# Train the digit classification model
model.compile(optimizer='adam',
              loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
              metrics=['accuracy'])
model.summary()
model.fit(
    train_images,
    train_labels,
    epochs=1,
    validation_split=0.1,
)
10ms/step - loss: 0.5411 - accuracy: 0.8507 - val_loss: 0.1142 - val_accuracy: 0.9705
<tensorflow.python.keras.callbacks.History at 0x7f9ee970ab90>
QAT with .fit.
Now, performing QAT over the base model.
# ------------- Quantization Aware Training -------------
import tensorflow_model_optimization as tfmot

quantize_model = tfmot.quantization.keras.quantize_model

# q_aware stands for quantization aware.
q_aware_model = quantize_model(model)

# `quantize_model` requires a recompile.
q_aware_model.compile(optimizer='adam',
                      loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
                      metrics=['accuracy'])
q_aware_model.summary()

train_images_subset = train_images[0:1000]
train_labels_subset = train_labels[0:1000]

q_aware_model.fit(train_images_subset, train_labels_subset,
                  batch_size=500, epochs=1, validation_split=0.1)
356ms/step - loss: 0.1431 - accuracy: 0.9629 - val_loss: 0.1626 - val_accuracy: 0.9500
<tensorflow.python.keras.callbacks.History at 0x7f9edf0aef90>
Checking performance
_, baseline_model_accuracy = model.evaluate(
    test_images, test_labels, verbose=0)
_, q_aware_model_accuracy = q_aware_model.evaluate(
    test_images, test_labels, verbose=0)

print('Baseline test accuracy:', baseline_model_accuracy)
print('Quant test accuracy:', q_aware_model_accuracy)
Baseline test accuracy: 0.9660999774932861
Quant test accuracy: 0.9660000205039978
QAT with tf.GradientTape().
Here is the QAT training part with a custom loop. Note that we could also perform custom training over the base model in the same way.
batch_size = 500

train_dataset = tf.data.Dataset.from_tensor_slices((train_images_subset,
                                                    train_labels_subset))
train_dataset = train_dataset.batch(batch_size=batch_size,
                                    drop_remainder=False)

loss_fn = tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True)
optimizer = tf.keras.optimizers.Adam()

for epoch in range(1):
    for x, y in train_dataset:
        with tf.GradientTape() as tape:
            preds = q_aware_model(x, training=True)
            loss = loss_fn(y, preds)
        grads = tape.gradient(loss, q_aware_model.trainable_variables)
        optimizer.apply_gradients(zip(grads, q_aware_model.trainable_variables))
_, baseline_model_accuracy = model.evaluate(
    test_images, test_labels, verbose=0)
_, q_aware_model_accuracy = q_aware_model.evaluate(
    test_images, test_labels, verbose=0)

print('Baseline test accuracy:', baseline_model_accuracy)
print('Quant test accuracy:', q_aware_model_accuracy)
Baseline test accuracy: 0.9660999774932861
Quant test accuracy: 0.9645000100135803
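As a small extension of the custom loop above (a hedged sketch, not part of the original answer), a running metric can be tracked inside the tf.GradientTape() loop to reproduce the per-epoch accuracy reporting that model.fit gives for free:

train_acc = tf.keras.metrics.SparseCategoricalAccuracy()

for epoch in range(1):
    train_acc.reset_states()
    for x, y in train_dataset:
        with tf.GradientTape() as tape:
            preds = q_aware_model(x, training=True)
            loss = loss_fn(y, preds)
        grads = tape.gradient(loss, q_aware_model.trainable_variables)
        optimizer.apply_gradients(zip(grads, q_aware_model.trainable_variables))
        train_acc.update_state(y, preds)  # works with logits: argmax is unchanged
    print('epoch', epoch, 'train accuracy:', float(train_acc.result()))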

CoreMLtools and Keras ValueError: need more than 1 value to unpack

I'm fine-tuning the Inception V3 model with Keras, in order to convert it with coremltools into a .mlmodel file.
However, when converting the model coremltools throws an error saying the following when the converter reaches the last layer of the model:
coremltools/models/neural_network.py", line 2501, in set_pre_processing_parameters
    channels, height, width = array_shape
ValueError: need more than 1 value to unpack
I used the code from the Keras documentation on applications found here: https://keras.io/applications/#fine-tune-inceptionv3-on-a-new-set-of-classes
And added a piece of code loading my dataset from the VGG example found here: https://blog.keras.io/building-powerful-image-classification-models-using-very-little-data.html
My final script looks like this, using TensorFlow as the backend:
# LOAD THE DATA
from keras.preprocessing.image import ImageDataGenerator

img_width, img_height = 299, 299

train_data_dir = 'data/train'
validation_data_dir = 'data/validation'
nb_train_samples = 358
nb_validation_samples = 21
epochs = 1
batch_size = 15

train_datagen = ImageDataGenerator(
    rescale=1. / 255,
    shear_range=0.2,
    zoom_range=0.2,
    horizontal_flip=True)

test_datagen = ImageDataGenerator(rescale=1. / 255)

train_generator = train_datagen.flow_from_directory(
    train_data_dir,
    target_size=(img_width, img_height),
    batch_size=batch_size,
    class_mode='categorical')

validation_generator = test_datagen.flow_from_directory(
    validation_data_dir,
    target_size=(img_width, img_height),
    batch_size=batch_size,
    class_mode='categorical')
# TRAIN THE MODEL
base_model = InceptionV3(weights='imagenet', include_top=False)

x = base_model.output
x = GlobalAveragePooling2D()(x)
x = Dense(1024, activation='relu')(x)
predictions = Dense(7, activation='softmax')(x)

model = Model(inputs=base_model.input, outputs=predictions)

for layer in base_model.layers:
    layer.trainable = False

model.compile(optimizer='rmsprop', loss='categorical_crossentropy', metrics=['accuracy'])

model.fit_generator(
    train_generator,
    steps_per_epoch=nb_train_samples // batch_size,
    epochs=epochs,
    validation_data=validation_generator,
    validation_steps=nb_validation_samples // batch_size)

for i, layer in enumerate(base_model.layers):
    print(i, layer.name)

for layer in model.layers[:249]:
    layer.trainable = False
for layer in model.layers[249:]:
    layer.trainable = True

from keras.optimizers import SGD
model.compile(optimizer=SGD(lr=0.0001, momentum=0.9),
              loss='categorical_crossentropy',
              metrics=['accuracy'])

model.fit_generator(
    train_generator,
    steps_per_epoch=nb_train_samples // batch_size,
    epochs=epochs,
    validation_data=validation_generator,
    validation_steps=nb_validation_samples // batch_size)

model.save('finetuned_inception.h5')
I'm writing here in response to @SwimBikeRun's request (as I need a bit more space).
I was converting YOLO to Keras and then Keras to CoreML. For conversion I was using this script https://github.com/qqwweee/keras-yolo3/blob/master/convert.py
In the conversion process, the model was eventually created like this:
input_layer = Input(shape=(None, None, 3))
...
model = Model(inputs=input_layer, outputs=[all_layers[i] for i in out_index])
Those None inputs were what made the CoreML conversion fail. For CoreML, the input size of your model must be known. So I changed it to this:
input_layer = Input(shape=(416, 416, 3))
Your input size will probably vary.
For your original question:
Maybe check your base_model.input size for the same problem.
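Applied to the script in the question, that check amounts to giving the base model a fixed input size rather than the (None, None, 3) it gets by default when include_top=False and no input_shape is passed. A minimal sketch (299x299 matches the img_width/img_height used above):

# Fix the input size so coremltools can unpack (channels, height, width)
base_model = InceptionV3(weights='imagenet',
                         include_top=False,
                         input_shape=(299, 299, 3))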