VGG 16 model training with tensorflow - tensorflow

I'm trying to use VGG16 from keras to train a model for image detection.
Based on these articles (https://www.pyimagesearch.com/2019/06/03/fine-tuning-with-keras-and-deep-learning/ and https://learnopencv.com/keras-tutorial-fine-tuning-using-pre-trained-models/), I've put some addition Dense layer to the VGG 16 model. However, the training accuracy with 20 epoche is around 35% to 41% which doesn't match the result of these articles (above 90%).
Due to this, I would like to know, did I do something wrong with my code below.
Basic setting
url='/content/drive/My Drive/fer2013.csv'
batch_size = 64
img_width,img_height = 48,48
# 0=Angry, 1=Disgust, 2=Fear, 3=Happy, 4=Sad, 5=Surprise, 6=Neutral
num_classes = 7
model_path = '/content/drive/My Drive/Af/cnn.h5'
df=pd.read_csv(url)
def _load_fer():
# Load training and eval data
df = pd.read_csv(url, sep=',')
train_df = df[df['Usage'] == 'Training']
eval_df = df[df['Usage'] == 'PublicTest']
return train_df, eval_df
def _preprocess_fer(df,label_col='emotion',feature_col='pixels'):
labels, features = df.loc[:, label_col].values.astype(np.int32), [
np.fromstring(image, np.float32, sep=' ')
for image in df.loc[:, feature_col].values]
labels = [to_categorical(l, num_classes=num_classes) for l in labels]
features = np.stack((features,) * 3, axis=-1)
features /= 255
features = features.reshape(features.shape[0], img_width, img_height,3)
return features, labels
# Load fer data
train_df, eval_df = _load_fer()
# preprocess fer data
x_train, y_train = _preprocess_fer(train_df)
x_valid, y_valid = _preprocess_fer(eval_df)
gen = ImageDataGenerator(
rotation_range=40,
width_shift_range=0.2,
height_shift_range=0.2,
shear_range=0.2,
zoom_range=0.2,
horizontal_flip=True,
fill_mode='nearest')
train_generator = gen.flow(x_train, y_train, batch_size=batch_size)
predict_size_train = int(np.math.ceil(len(x_train) / batch_size))
input_tensor = Input(shape=(img_width, img_height, 3))
Now comes the model training part
baseModel = VGG16(
include_top=False, weights='imagenet',
input_tensor=input_tensor
)
# Construct the head of the model that will be placed on top of the base model (fine tuning)
headModel = baseModel.output
headModel = Flatten()(headModel)
headModel = Dense(1024, activation="relu")(headModel)
#headModel = Dropout(0.5)(headModel)
headModel = BatchNormalization()(headModel)
headModel = Dense(num_classes, activation="softmax")(headModel)
model = Model(inputs=baseModel.input, outputs=headModel)
for layer in baseModel.layers:
layer.trainable = False
model summary
model.compile(loss='categorical_crossentropy',
optimizer=tf.keras.optimizers.Adam(lr=0.001),
metrics=['accuracy'])
history = model.fit(train_generator,
steps_per_epoch=predict_size_train * 1,
epochs=20,
validation_data=valid_generator,
validation_steps=predict_size_valid)
Result:
Result after training
It will be very thankful for you advice.
Best Regards.

Since you are freezing all layers, only one dense layer might not give you desired accuracy. Also if you are not in hurry, you may not set up the validation_steps and steps_per_epochs parameters. Also in this tutorial, model is having fluctuations, which do not want.
I suggest:
for layer in baseModel.layers:
layer.trainable = False
base_out = baseModel.get_layer('block3_pool').output // layer name may be different,
check with model baseModel.summary
With that you can get spefic layer's output. After got the output, you can add some convolutions. After convolutions try stacking more dense layers like:
x = tf.keras.layers.Flatten()(x)
x = Dense(512, activation= 'relu')(x)
x = Dropout(0.3)(x)
x = Dense(256, activation= 'relu')(x)
x = Dropout(0.2)(x)
output_model = Dense(num_classes, activation = 'softmax')(x)
If you don't want to add convolutions and use baseModel completely, that's also fine however you can do something like this:
for layer in baseModel.layers[:12]: // 12 is random you can try different ones. Not
all layers are frozen this time.
layer.trainable = False
for i, layer in enumerate(baseModel.layers):
print(i, layer.name, layer.trainable)
// check frozen layers
After that, you can try to set:
headModel = baseModel.output
headModel = Flatten()(headModel)
headModel = Dense(1024, activation="relu")(headModel)
headModel = Dropout(0.5)(headModel)
headModel = Dense(512, activation="relu")(headModel)
headModel = Dense(num_classes, activation="softmax")(headModel)
If you see your model is learning, but your loss having fluctuations then you can reduce learning rate. Or you can use ReduceLROnPlateau callback:
rd_lr = ReduceLROnPlateau(monitor='val_loss', factor = np.sqrt(0.1), patience= 4, verbose = 1, min_lr = 5e-8)
Parameters are totally up to your model. For more details you can see docs

what is the form of the content of y_train. If they are integer values then you need to convert them to one hot vectors with
y_train=tf.keras.utils.to_categorical(train, num_classes)
since you are using loss='categorical_crossentropy' in model.compile. In addition VGG16 requires that the pixels be scaled between -1 and +1 so in include
gen = ImageDataGenerator(tf.keras.applications.vgg16.preprocess_input, etc
When you are training you have
for layer in baseModel.layers:
layer.trainable = False
so you are only training the dense layer which is OK but may not give you high accuracy. You might want to leave VGG as trainable but of course this will take longer. Or after you train with VGG not trainable, then change it back to trainable and run a few more epochs to fine tune the model.

Related

sklearn classification_report ValueError: Found input variables with inconsistent numbers of samples: [18, 576]

I'm working on a CNN classification problem. I used keras and a pre-trained model. Now I want to evaluate my model and need the precision, recall and f1-Score. When I use sklearn.metrics classification_report I get above error. I know where the numbers are coming from, first is the length of my test dataset in batches and second are the number of actual sampels (predictions) in there. However I don't know how to "convert" them.
See my code down below:
# load train_ds
train_ds = tf.keras.utils.image_dataset_from_directory(
directory ='/gdrive/My Drive/Flies_dt/224x224',
image_size = (224, 224),
validation_split = 0.40,
subset = "training",
seed = 123,
shuffle = True)
# load val_ds
val_ds = tf.keras.utils.image_dataset_from_directory(
directory ='/gdrive/My Drive/Flies_dt/224x224',
image_size = (224, 224),
validation_split = 0.40,
subset = "validation",
seed = 123,
shuffle = True)
# move some batches of val_ds to test_ds
test_ds = val_ds.take((1*len(val_ds)) // 2)
print('test_ds =', len(test_ds))
val_ds = val_ds.skip((1*len(val_ds)) // 2)
print('val_ds =', len(val_ds)) #test_ds = 18 val_ds = 18
# Load Model
base_model = keras.applications.vgg19.VGG19(
include_top=False,
weights='imagenet',
input_shape=(224,224,3)
)
# Freeze base_model
base_model.trainable = False
#
inputs = keras.Input(shape=(224,224,3))
x = data_augmentation(inputs) #apply data augmentation
# Preprocessing
x = tf.keras.applications.vgg19.preprocess_input(x)
# The base model contains batchnorm layers. We want to keep them in inference mode
# when we unfreeze the base model for fine-tuning, so we make sure that the
# base_model is running in inference mode here.
x = base_model(x, training=False)
x = keras.layers.GlobalAveragePooling2D()(x)
x = keras.layers.Dropout(0.2)(x) # Regularize with dropout
outputs = keras.layers.Dense(5, activation="softmax")(x)
model = keras.Model(inputs, outputs)
model.compile(
loss="sparse_categorical_crossentropy",
optimizer="Adam",
metrics=['acc']
)
model.fit(train_ds, epochs=8, validation_data=val_ds, callbacks=[tensorboard_callback])
# Unfreeze the base_model. Note that it keeps running in inference mode
# since we passed `training=False` when calling it. This means that
# the batchnorm layers will not update their batch statistics.
# This prevents the batchnorm layers from undoing all the training
# we've done so far.
base_model.trainable = True
model.summary()
model.compile(
optimizer=keras.optimizers.Adam(learning_rate=0.000001), # Low learning rate
loss="sparse_categorical_crossentropy",
metrics=['acc']
)
model.fit(train_ds, epochs=5, validation_data=val_ds)
#Evaluate
from sklearn.metrics import classification_report
y_pred = model.predict(test_ds, batch_size=64, verbose=1)
y_pred_bool = np.argmax(y_pred, axis=1)
print(classification_report(test_ds, y_pred_bool))
I also tried something like this, but I'm not sure if this gives me the correct values for multiclass classification.
from keras import backend as K
def recall_m(y_true, y_pred):
true_positives = K.sum(K.round(K.clip(y_true * y_pred, 0, 1)))
possible_positives = K.sum(K.round(K.clip(y_true, 0, 1)))
recall = true_positives / (possible_positives + K.epsilon())
return recall
def precision_m(y_true, y_pred):
true_positives = K.sum(K.round(K.clip(y_true * y_pred, 0, 1)))
predicted_positives = K.sum(K.round(K.clip(y_pred, 0, 1)))
precision = true_positives / (predicted_positives + K.epsilon())
return precision
def f1_m(y_true, y_pred):
precision = precision_m(y_true, y_pred)
recall = recall_m(y_true, y_pred)
return 2*((precision*recall)/(precision+recall+K.epsilon()))
# compile the model
model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['acc',f1_m,precision_m, recall_m])
# fit the model
history = model.fit(Xtrain, ytrain, validation_split=0.3, epochs=10, verbose=0)
# evaluate the model
loss, accuracy, f1_score, precision, recall = model.evaluate(Xtest, ytest, verbose=0)
This is a lot, Sorry. Hope somebody can help.

I am training a deepfake image detection model, but why the validation accuracy is not changing?

I am training deepfake image detection using Tensorflow, but the validation accuracy is stuck at 67. I have tried to use different optimizers, but it's not decreasing and only floating around the same score.
Here is my step to creating the model.
Importing data from the image folder
Create an ImageDataGenerator object to do some augmentation.
datagen = ImageDataGenerator(
horizontal_flip=True,
validation_split=0.2,
rescale=1./255,
)
Creating the model
image dimension: 299, 299, 3
input_layer = Input(shape = (image_dimensions['height'], image_dimensions['width'], image_dimensions['channels']))
base_model = keras.applications.EfficientNetB5(
weights='imagenet',
input_shape=(image_dimensions['height'], image_dimensions['width'], image_dimensions['channels']),
include_top=False)
base_model.trainable = False
x = base_model(input_layer, training=False)
# Add pooling layer or flatten layer
y = GlobalAveragePooling2D()(x)
y = Dense(512, activation='relu')(y)
y = Dropout(0.4)(y)
y = Dense(256)(y)
# Add final dense layer
output_layer = Dense(1, activation='sigmoid')(y)
model = Model(inputs=input_layer, outputs=output_layer)
Training
efficientNet = EfficientNet(learning_rate = 0.001)
efficientNet.summary()
history = efficientNet.fit(datagen.flow(X_train, y_train, batch_size=64, subset='training'),
epochs=10,
validation_data=datagen.flow(X_train, y_train, batch_size=64, subset='validation'))
Result
Here is the result of the model training
Is there anyway I can fix this problem?

i am building my keras transfer learning model like shown below One thing i am unable to do is set training=False for batchNorm layers in xception

As it you can see I dont wanna change the way I built this model there is a way to change but that converts xception model into some functional model and In model summary it just shows Xception instead of all its layers and also I cant apply grad cam. So please help someone.
code
def build_model():
# use imagenet - pre-trainined weights for images
baseModel =Xception(weights= 'imagenet', include_top = False, input_shape=(224, 224, 3))
for layer in baseModel.layers:
layer.trainable = False
bn_layer.trainable=False
headModel =baseModel.output
headModel = Flatten()(headModel)
headModel = Dense(64,activation="LeakyReLU")(headModel)
headModel = Dropout(0.5)(headModel)
headModel = Dense(32,activation="LeakyReLU")(headModel)
headModel = Dropout(0.4)(headModel)
headModel = Dense(16, activation="LeakyReLU")(headModel)
headModel = Dropout(0.3)(headModel)
headModel = Dense(8, activation="LeakyReLU")(headModel)
headModel = Dropout(0.2)(headModel)
headModel = Dense(3, activation="softmax")(headModel)
x = Model(baseModel.inputs,outputs=headModel)
optimizers = Adam(learning_rate= 0.001)
x.compile(loss = 'categorical_crossentropy', optimizer = optimizers, metrics = ['accuracy'])
return x
x= build_model()
sum=x.summary()
As far as I understood is that you want to freeze all the layer in the base model, right? You can do the following:
for layer in baseModel.layers:
layer.trainable = False
In this case, your code is right, except that you do not need the second line.
Instead if you want to train all the layers but the BatchNormalization layers then you can do the following:
for layer in model.layers:
if not isinstance(layer, layers.BatchNormalization):
layer.trainable = True
For more read: https://keras.io/guides/transfer_learning/ Especially, the section Example: the BatchNormalization layer.

How to freeze/unfreeze a pretrained Model as part of a subclassed Model in Tensorflow?

I am trying to build a subclassed Model which consists of a pretrained convolutional Base and some Dense Layers on top, using Tensorflow >= 2.4.
However freezing/unfreezing of the subclassed Model has no effect once it was trained before. When I do the same with the Functional API everything works as expected. I would really appreciate some Hint to what im missing here: Following Code should specify my problem further. Pardon me the amount of Code:
#Setup
import tensorflow as tf
tf.config.run_functions_eagerly(False)
import numpy as np
from tensorflow.keras.regularizers import l1
import matplotlib.pyplot as plt
#tf.function
def create_images_and_labels(img,label, height = 70, width = 70): #Image augmentation
label = tf.cast(label, 'float32')
label = tf.squeeze(label)
img = tf.image.convert_image_dtype(img, tf.float32)
img = tf.image.resize(img, (height, width))
# img = preprocess_input(img)
return img, label
cifar = tf.keras.datasets.cifar10
(x_train, y_train), (x_test, y_test) = cifar.load_data()
num_classes = len(np.unique(y_train))
ds_train = tf.data.Dataset.from_tensor_slices((x_train, tf.one_hot(y_train, depth = len(np.unique(y_train)))))
ds_train = ds_train.map(lambda img, label: create_images_and_labels(img, label, height = 70, width = 70))
ds_train = ds_train.shuffle(50000)
ds_train = ds_train.batch(50, drop_remainder = True)
ds_val = tf.data.Dataset.from_tensor_slices((x_test, tf.one_hot(y_test, depth = len(np.unique(y_train)))))
ds_val = ds_val.map(lambda img, label: create_images_and_labels(img, label, height = 70, width = 70))
ds_val = ds_val.batch(50, drop_remainder=True)
# for i in ds_train.take(1):
# x, y = i
# for ind in range(x.shape[0]):
# plt.imshow(x[ind,:,:])
# plt.show()
# print(y[ind])
'''
Defining simple subclassed Model consisting of
VGG16
Flatten
Dense Layers
customized what happens in model.fit and model.evaluate (Actually its the standard Keras procedure with custom Metrics)
customized metrics: Loss and Accuracy for Training and Validation Step
added unfreezing Method
'set_trainable_layers'
Arguments:
num_head (How many dense Layers)
num_base (How many VGG Layers)
'''
class Test_Model(tf.keras.models.Model):
def __init__(
self,
num_unfrozen_head_layers,
num_unfrozen_base_layers,
num_classes,
conv_base = tf.keras.applications.VGG16(include_top = False, weights = 'imagenet', input_shape = (70,70,3)),
):
super(Test_Model, self).__init__(name = "Test_Model")
self.base = conv_base
self.flatten = tf.keras.layers.Flatten()
self.dense1 = tf.keras.layers.Dense(2048, activation = 'relu')
self.dense2 = tf.keras.layers.Dense(1024, activation = 'relu')
self.dense3 = tf.keras.layers.Dense(128, activation = 'relu')
self.out = tf.keras.layers.Dense(num_classes, activation = 'softmax')
self.out._name = 'out'
self.train_loss_metric = tf.keras.metrics.Mean('Supervised Training Loss')
self.train_acc_metric = tf.keras.metrics.CategoricalAccuracy('Supervised Training Accuracy')
self.val_loss_metric = tf.keras.metrics.Mean('Supervised Validation Loss')
self.val_acc_metric = tf.keras.metrics.CategoricalAccuracy('Supervised Validation Accuracy')
self.loss_fn = tf.keras.losses.categorical_crossentropy
self.learning_rate = 1e-4
# self.build((None, 32,32,3))
self.set_trainable_layers(num_unfrozen_head_layers, num_unfrozen_base_layers)
#tf.function
def call(self, inputs, training = False):
x = self.base(inputs)
x = self.flatten(x)
x = self.dense1(x)
x = self.dense2(x)
x = self.dense3(x)
x = self.out(x)
return x
#tf.function
def train_step(self, input_data):
x_batch, y_batch = input_data
with tf.GradientTape() as tape:
tape.watch(x_batch)
y_pred = self(x_batch, training = True)
loss = self.loss_fn(y_batch, y_pred)
trainable_vars = self.trainable_weights
gradients = tape.gradient(loss, trainable_vars)
self.optimizer.apply_gradients(zip(gradients, trainable_vars))
self.train_loss_metric.update_state(loss)
self.train_acc_metric.update_state(y_batch, y_pred)
return {"Supervised Loss": self.train_loss_metric.result(),
"Supervised Accuracy":self.train_acc_metric.result()}
#tf.function
def test_step(self, input_data):
x_batch,y_batch = input_data
y_pred = self(x_batch, training = False)
loss = self.loss_fn(y_batch, y_pred)
self.val_loss_metric.update_state(loss)
self.val_acc_metric.update_state(y_batch, y_pred)
return {"Val Supervised Loss": self.val_loss_metric.result(),
"Val Supervised Accuracy":self.val_acc_metric.result()}
#property
def metrics(self):
# We list our `Metric` objects here so that `reset_states()` can be
# called automatically at the start of each epoch
# or at the start of `evaluate()`.
# If you don't implement this property, you have to call
# `reset_states()` yourself at the time of your choosing.
return [self.train_loss_metric,
self.train_acc_metric,
self.val_loss_metric,
self.val_acc_metric]
def set_trainable_layers(self, num_head, num_base):
for layer in [lay for lay in self.layers if not isinstance(lay , tf.keras.models.Model)]:
layer.trainable = False
print(layer.name, layer.trainable)
for block in self.layers:
if isinstance(block, tf.keras.models.Model):
print('Found Submodel', block.name)
for layer in block.layers:
layer.trainable = False
print(layer.name, layer.trainable)
if num_base > 0:
for layer in block.layers[-num_base:]:
layer.trainable = True
print(layer.name, layer.trainable)
if num_head > 0:
for layer in [lay for lay in self.layers if not isinstance(lay, tf.keras.models.Model)][-num_head:]:
layer.trainable = True
print(layer.name, layer.trainable)
'''
Showcase1: First training completely frozen Model, then unfreezing:
unfreezed model doesnt learn
'''
model = Test_Model(num_unfrozen_head_layers= 0, num_unfrozen_base_layers = 0, num_classes = num_classes) # Should NOT learn -> doesnt learn
model.build((None, 70,70,3))
model.summary()
model.compile(optimizer = tf.keras.optimizers.Adam(1e-5))
model.fit(ds_train, validation_data = ds_val)
model.set_trainable_layers(10,20) # SHOULD LEARN -> Doesnt learn
model.summary()
model.compile(optimizer = tf.keras.optimizers.Adam(1e-5))
model.fit(ds_train, validation_data = ds_val)
#DOESNT LEARN
'''
Showcase2: when first training the Model with more trainable Layers than in the second step:
AssertionError occurs
'''
model = Test_Model(num_unfrozen_head_layers= 10, num_unfrozen_base_layers = 2, num_classes = num_classes) # SHOULD LEARN -> learns
model.build((None, 70,70,3))
model.summary()
model.compile(optimizer = tf.keras.optimizers.Adam(1e-5))
model.fit(ds_train, validation_data = ds_val)
model.set_trainable_layers(1,1) # SHOULD NOT LEARN -> AssertionError
model.summary()
model.compile(optimizer = tf.keras.optimizers.Adam(1e-5))
model.fit(ds_train, validation_data = ds_val)
'''
Showcase3: same Procedure as in Showcase2 but optimizer State is transferred to recompiled Model:
Cant set Weigthts because optimizer expects List of Length 0
'''
model = Test_Model(num_unfrozen_head_layers= 10, num_unfrozen_base_layers = 20, num_classes = num_classes) # SHOULD LEARN -> learns
model.build((None, 70,70,3))
model.summary()
model.compile(optimizer = tf.keras.optimizers.Adam(1e-5))
model.fit(ds_train, validation_data = ds_val)
opti_state = model.optimizer.get_weights()
model.set_trainable_layers(0,0) # SHOULD NOT LEARN -> Learns
model.summary()
model.compile(optimizer = tf.keras.optimizers.Adam(1e-5))
model.optimizer.set_weights(opti_state)
model.fit(ds_train, validation_data = ds_val)
#%%%
'''
Constructing same Architecture with Functional API and running Experiments
'''
import tensorflow as tf
conv_base = tf.keras.applications.VGG16(include_top = False, weights = 'imagenet', input_shape = (70,70,3))
inputs = tf.keras.layers.Input((70,70,3))
x = conv_base(inputs)
x = tf.keras.layers.Flatten()(x)
x = tf.keras.layers.Dense(2048, activation = 'relu') (x)
x = tf.keras.layers.Dense(1024,activation = 'relu') (x)
x = tf.keras.layers.Dense(128,activation = 'relu') (x)
out = tf.keras.layers.Dense(num_classes,activation = 'softmax') (x)
isinstance(tf.keras.layers.Flatten(), tf.keras.models.Model)
isinstance(conv_base, tf.keras.models.Model)
def set_trainable_layers(mod, num_head, num_base):
import time
for layer in [lay for lay in mod.layers if not isinstance(lay , tf.keras.models.Model)]:
layer.trainable = False
print(layer.name, layer.trainable)
for block in mod.layers:
if isinstance(block, tf.keras.models.Model):
print('Found Submodel')
for layer in block.layers:
layer.trainable = False
print(layer.name, layer.trainable)
if num_base > 0:
for layer in block.layers[-num_base:]:
layer.trainable = True
print(layer.name, layer.trainable)
if num_head > 0:
for layer in [lay for lay in mod.layers if not isinstance(lay, tf.keras.models.Model)][-num_head:]:
layer.trainable = True
print(layer.name, layer.trainable)
'''
Showcase1: First training frozen Model, then unfreezing, recomiling and retraining:
model behaves as expected
'''
mod = tf.keras.models.Model(inputs,out, name = 'TestModel')
set_trainable_layers(mod, 0 ,0)
mod.summary()
mod.compile(optimizer = tf.keras.optimizers.Adam(1e-5), loss = 'categorical_crossentropy', metrics = ['accuracy'])
mod.fit(ds_train, validation_data = ds_val) # Model should NOT learn
set_trainable_layers(mod, 10,20)
mod.summary()
mod.compile(optimizer = tf.keras.optimizers.Adam(1e-5), loss = 'categorical_crossentropy', metrics = ['accuracy'])
mod.fit(ds_train, validation_data = ds_val) #Model SHOULD learn
'''
Showcase2: First training unfrozen Model, then reducing number of trainable Layers:
Model behaves as Expected
'''
mod = tf.keras.models.Model(inputs,out, name = 'TestModel')
set_trainable_layers(mod, 10 ,20)
mod.summary()
mod.compile(optimizer = tf.keras.optimizers.Adam(1e-5), loss = 'categorical_crossentropy', metrics = ['accuracy'])
mod.fit(ds_train, validation_data = ds_val) # Model SHOULD learn
set_trainable_layers(mod, 0,0)
mod.summary()
mod.compile(optimizer = tf.keras.optimizers.Adam(1e-5), loss = 'categorical_crossentropy', metrics = ['accuracy'])
mod.fit(ds_train, validation_data = ds_val) #Model should NOT learn
'''
Showcase3: First training unfrozen Model, then reducing number of trainable Layers but also trying to trasnfer Optimizer States:
Behaves as subclassed Model: New Optimizer shouldnt have Weights
'''
mod = tf.keras.models.Model(inputs,out, name = 'TestModel')
set_trainable_layers(mod, 1 ,3)
mod.summary()
mod.compile(optimizer = tf.keras.optimizers.Adam(1e-5), loss = 'categorical_crossentropy', metrics = ['accuracy'])
mod.fit(ds_train, validation_data = ds_val) # Model SHOULD learn
opti_state = mod.optimizer.get_weights()
set_trainable_layers(mod, 4,8)
mod.summary()
mod.compile(optimizer = tf.keras.optimizers.Adam(1e-5), loss = 'categorical_crossentropy', metrics = ['accuracy'])
mod.optimizer.set_weights(opti_state)
mod.fit(ds_train, validation_data = ds_val) #Model should NOT learn
This is happening because one of the fundamental differences between the Subclassing API and the Functional or Sequential APIs in Tensorflow2.
While the Functional or Sequential APIs build a graph of Layers (think of it as a separate data structure), the Subclassing model builds a whole object and stores it as bytecode.
This means that with Subclassing you lose access to the internal connectivity graph and the normal behaviour that allows you to freeze/unfreeze layers or reuse them in other models starts to get weird. Seeing your implementation I would say that the Subclassed model is correct and it SHOULD be working if we were dealing with a library other than Tensorflow that is.
Francois Chollet explains it better than I will ever do in one of his Tweettorials
After some more experiments i have found a workaround for this Problem:
While the model itself cannot be unfrozen/frozen after the first compilation and training, it is however possible to save the model weights to a temporary file model.save_weights('temp.h5') and afterwards reconstructing the model class (Creating a new instance of model class for example) and loading the previous weights with model.load_weights('temp.h5').
However this can also lead to errors occuring when the previous model has both unfrozen and frozen weights. To prevent them you have to either set all layers trainable after the training and before saving weights, or copy the exact trainability structure of the model, and reconstructing the new model such that its layers have the same trainability state as the previous. this is possible with the following functions:
def get_trainability(model): # Takes Keras model and returns dictionary with layer names of Model as key, and its trainability as value/item
train_dict = {}
for layer in model.layers:
if isinstance(layer, tf.keras.models.Model):
train_dict.update(get_trainability(layer))
else:
train_dict[layer.name] = layer.trainable
return train_dict
def set_trainability(model, train_dict): # Takes keras Model and dictionary with layer names and booleans indicating the desired trainability of the layer.
# modifies model so that every Layer in the Model, whose name matches dict key will get trainable = boolean
for layer in model.layers:
if isinstance(layer, tf.keras.models.Model):
set_trainability(layer, train_dict)
else:
for name in train_dict.keys():
if name == layer.name:
layer.trainable = train_dict[name]
print(layer.name)
Hope this helps for simmilar problems in the Future

Keras Why binary classification isn't as accurate as categorical calssification

I am trying to create a model which can tell whether there are birds in an image or not.
I was using categorical classification to train the model to recognize Bird v.s. Flowers, the results turned to be very successful in terms of recognizing these 2 classes.
BUT, when I change it to Binary Classification to detect the existence of birds in an images, the accuracy dropped dramatically.
The reason why I changed to use Binary classification is that if I
provided a dog to my Categorical Classification trained model, it
recognized the dog as a bird.
btw, here is my data set structure:
Training:
5000 images for birds and 2000 images for not-birds
Validating:
1000 images for birds and 500 images for not-birds
Someone said, the inblanced dataset will also cause problems. Is it true?
Could someone please point out where I get wrong in the following code?
def get_num_files(path):
if not os.path.exists(path):
return 0
return sum([len(files) for r, d, files in os.walk(path)])
def get_num_subfolders(path):
if not os.path.exists(path):
return 0
return sum([len(d) for r, d, files in os.walk(path)])
def create_img_generator():
return ImageDataGenerator(
preprocessing_function=preprocess_input,
rotation_range=30,
width_shift_range=0.2,
height_shift_range=0.2,
shear_range=0.2,
zoom_range=0.2,
horizontal_flip=True
)
INIT_LT = 1e-3
Image_width, Image_height = 299, 299
Training_Epochs = 30
Batch_Size = 32
Number_FC_Neurons = 1024
Num_Classes = 2
train_dir = 'to my train folder'
validate_dir = 'to my validation folder'
num_train_samples = get_num_files(train_dir)
num_classes = get_num_subfolders(train_dir)
num_validate_samples = get_num_files(validate_dir)
num_epoch = Training_Epochs
batch_size = Batch_Size
train_image_gen = create_img_generator()
test_image_gen = create_img_generator()
train_generator = train_image_gen.flow_from_directory(
train_dir,
target_size=(Image_width, Image_height),
batch_size = batch_size,
seed = 42
)
validation_generator = test_image_gen.flow_from_directory(
validate_dir,
target_size=(Image_width, Image_height),
batch_size=batch_size,
seed=42
)
Inceptionv3_model = InceptionV3(weights='imagenet', include_top=False)
print('Inception v3 model without last FC loaded')
x = Inceptionv3_model.output
x = GlobalAveragePooling2D()(x)
x = Dense(Number_FC_Neurons, activation='relu')(x)
predictions = Dense(num_classes, activation='softmax')(x)
# model = Model(inputs=Inceptionv3_model.input, outputs=predictions)
v3model = Model(inputs=Inceptionv3_model.input, outputs=predictions)
# Use new Sequential model to add v3model and add a bath normalization layer after
model = Sequential()
model.add(v3model)
model.add(BatchNormalization()) # added normalization
print(model.summary())
print('\nFine tuning existing model')
Layers_To_Freeze = 172
for layer in model.layers[:Layers_To_Freeze]:
layer.trainable = False
for layer in model.layers[Layers_To_Freeze:]:
layer.trainable = True
optizer = Adam(lr=INIT_LT, decay=INIT_LT / Training_Epochs)
# optizer = SGD(lr=0.0001, momentum=0.9)
model.compile(optimizer=optizer, loss='binary_crossentropy', metrics=['accuracy'])
cbk_early_stopping = EarlyStopping(monitor='val_acc', mode='max')
history_transfer_learning = model.fit_generator(
train_generator,
steps_per_epoch = num_train_samples,
epochs=num_epoch,
validation_data=validation_generator,
validation_steps = num_validate_samples,
class_weight='auto',
callbacks=[cbk_early_stopping]
)
model.save('incepv3_transfer_mini_binary.h5', overwrite=True, include_optimizer=True)
Categorical
Use Num_Classes = 2
Use one-hot-encoded targets (example: Bird = [1, 0], Flower = [0, 1])
Use 'softmax' activation
Use 'categorical_crossentropy'
Binary
Use Num_Classes = 1
Use binary targets (example: is flower = 1 | not flower = 0)
Use 'sigmoid' activation
Use 'binary_crossentropy'
Details here: Using categorical_crossentropy for only two classes