Tensorflow: train multiple models in parallel with the same ImageDataGenerator - tensorflow

I'm doing HPO on a small custom CNN. During training the GPU is under-utilised and I'm finding a bottleneck in the CPU: the data augmentation process is too slow. Looking online, I found that I could use multiple CPU cores for the generator and speedup the process. I set up workers=n_cores and this did improve things, but not as much as I'd like.
So I though that I could train multiple models simultaneously on the GPU, and feed the same augmented data to the models. However, I can't come up with some idea on how to do this and I couldn't find any similar question.
Here's a minimal example (I'm leaving out imports for brevity):
# load model and set only last layer as trainable
def create_model(learning_rate, alpha, dropout):
model_path = '/content/drive/My Drive/Progetto Advanced Machine Learning/Model Checkpoints/Custom Model 1 2020-06-01 10:56:21.010759.hdf5'
model = tf.keras.models.load_model(model_path)
x = model.layers[-2].output
x = Dropout(dropout)(x)
predictions = Dense(120, activation='softmax', name='prediction', kernel_regularizer=tf.keras.regularizers.l2(alpha))(x)
model = Model(inputs=model.inputs, outputs=predictions)
for layer in model.layers[:-2]:
layer.trainable = False
model.compile(loss='categorical_crossentropy', optimizer=Adam(learning_rate), metrics=['accuracy'])
return model
#declare the search space
SEARCH_SPACE = [skopt.space.Real(0.0001, 0.1, name='learning_rate', prior='log-uniform'),
skopt.space.Real(1e-9, 1, name='alpha', prior='log-uniform'),
skopt.space.Real(0.0001, 0.95, name='dropout', prior='log-uniform')]
# declare generator
train_datagenerator = ImageDataGenerator(rescale=1. / 255, rotation_range=30, zoom_range=0.2, horizontal_flip=True, validation_split=0.2, data_format='channels_last')
# training function to be called by the optimiser
#use_named_args(SEARCH_SPACE)
def fitness(learning_rate, alpha, dropout):
model = create_model(learning_rate, alpha, dropout)
#compile generators
train_batches = train_datagenerator.flow_from_directory(train_out_path, target_size=image_size, color_mode="rgb", class_mode="categorical" , batch_size=32, subset='training', seed = 20052020)
val_batches = train_datagenerator.flow_from_directory(directory=train_out_path, target_size=image_size, color_mode="rgb", class_mode="categorical" , batch_size=32, subset='validation', shuffle=False, seed = 20052020)
#train
early_stopping = EarlyStopping(monitor='val_loss', patience=3, restore_best_weights=True)
training_results = model.fit(train_batches, epochs=5, verbose=1, shuffle=True, validation_data=val_batches, workers=2)
history[hyperpars] = training_results.history
with open(dict_save_path, 'wb') as f:
pickle.dump(history, f)
return training_results.history['val_accuracy'][-1]
# HPO
result = skopt.forest_minimize(fitness, SEARCH_SPACE, n_calls=10, callback=checkpoint_saver)

Related

sklearn classification_report ValueError: Found input variables with inconsistent numbers of samples: [18, 576]

I'm working on a CNN classification problem. I used keras and a pre-trained model. Now I want to evaluate my model and need the precision, recall and f1-Score. When I use sklearn.metrics classification_report I get above error. I know where the numbers are coming from, first is the length of my test dataset in batches and second are the number of actual sampels (predictions) in there. However I don't know how to "convert" them.
See my code down below:
# load train_ds
train_ds = tf.keras.utils.image_dataset_from_directory(
directory ='/gdrive/My Drive/Flies_dt/224x224',
image_size = (224, 224),
validation_split = 0.40,
subset = "training",
seed = 123,
shuffle = True)
# load val_ds
val_ds = tf.keras.utils.image_dataset_from_directory(
directory ='/gdrive/My Drive/Flies_dt/224x224',
image_size = (224, 224),
validation_split = 0.40,
subset = "validation",
seed = 123,
shuffle = True)
# move some batches of val_ds to test_ds
test_ds = val_ds.take((1*len(val_ds)) // 2)
print('test_ds =', len(test_ds))
val_ds = val_ds.skip((1*len(val_ds)) // 2)
print('val_ds =', len(val_ds)) #test_ds = 18 val_ds = 18
# Load Model
base_model = keras.applications.vgg19.VGG19(
include_top=False,
weights='imagenet',
input_shape=(224,224,3)
)
# Freeze base_model
base_model.trainable = False
#
inputs = keras.Input(shape=(224,224,3))
x = data_augmentation(inputs) #apply data augmentation
# Preprocessing
x = tf.keras.applications.vgg19.preprocess_input(x)
# The base model contains batchnorm layers. We want to keep them in inference mode
# when we unfreeze the base model for fine-tuning, so we make sure that the
# base_model is running in inference mode here.
x = base_model(x, training=False)
x = keras.layers.GlobalAveragePooling2D()(x)
x = keras.layers.Dropout(0.2)(x) # Regularize with dropout
outputs = keras.layers.Dense(5, activation="softmax")(x)
model = keras.Model(inputs, outputs)
model.compile(
loss="sparse_categorical_crossentropy",
optimizer="Adam",
metrics=['acc']
)
model.fit(train_ds, epochs=8, validation_data=val_ds, callbacks=[tensorboard_callback])
# Unfreeze the base_model. Note that it keeps running in inference mode
# since we passed `training=False` when calling it. This means that
# the batchnorm layers will not update their batch statistics.
# This prevents the batchnorm layers from undoing all the training
# we've done so far.
base_model.trainable = True
model.summary()
model.compile(
optimizer=keras.optimizers.Adam(learning_rate=0.000001), # Low learning rate
loss="sparse_categorical_crossentropy",
metrics=['acc']
)
model.fit(train_ds, epochs=5, validation_data=val_ds)
#Evaluate
from sklearn.metrics import classification_report
y_pred = model.predict(test_ds, batch_size=64, verbose=1)
y_pred_bool = np.argmax(y_pred, axis=1)
print(classification_report(test_ds, y_pred_bool))
I also tried something like this, but I'm not sure if this gives me the correct values for multiclass classification.
from keras import backend as K
def recall_m(y_true, y_pred):
true_positives = K.sum(K.round(K.clip(y_true * y_pred, 0, 1)))
possible_positives = K.sum(K.round(K.clip(y_true, 0, 1)))
recall = true_positives / (possible_positives + K.epsilon())
return recall
def precision_m(y_true, y_pred):
true_positives = K.sum(K.round(K.clip(y_true * y_pred, 0, 1)))
predicted_positives = K.sum(K.round(K.clip(y_pred, 0, 1)))
precision = true_positives / (predicted_positives + K.epsilon())
return precision
def f1_m(y_true, y_pred):
precision = precision_m(y_true, y_pred)
recall = recall_m(y_true, y_pred)
return 2*((precision*recall)/(precision+recall+K.epsilon()))
# compile the model
model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['acc',f1_m,precision_m, recall_m])
# fit the model
history = model.fit(Xtrain, ytrain, validation_split=0.3, epochs=10, verbose=0)
# evaluate the model
loss, accuracy, f1_score, precision, recall = model.evaluate(Xtest, ytest, verbose=0)
This is a lot, Sorry. Hope somebody can help.

I am training a deepfake image detection model, but why the validation accuracy is not changing?

I am training deepfake image detection using Tensorflow, but the validation accuracy is stuck at 67. I have tried to use different optimizers, but it's not decreasing and only floating around the same score.
Here is my step to creating the model.
Importing data from the image folder
Create an ImageDataGenerator object to do some augmentation.
datagen = ImageDataGenerator(
horizontal_flip=True,
validation_split=0.2,
rescale=1./255,
)
Creating the model
image dimension: 299, 299, 3
input_layer = Input(shape = (image_dimensions['height'], image_dimensions['width'], image_dimensions['channels']))
base_model = keras.applications.EfficientNetB5(
weights='imagenet',
input_shape=(image_dimensions['height'], image_dimensions['width'], image_dimensions['channels']),
include_top=False)
base_model.trainable = False
x = base_model(input_layer, training=False)
# Add pooling layer or flatten layer
y = GlobalAveragePooling2D()(x)
y = Dense(512, activation='relu')(y)
y = Dropout(0.4)(y)
y = Dense(256)(y)
# Add final dense layer
output_layer = Dense(1, activation='sigmoid')(y)
model = Model(inputs=input_layer, outputs=output_layer)
Training
efficientNet = EfficientNet(learning_rate = 0.001)
efficientNet.summary()
history = efficientNet.fit(datagen.flow(X_train, y_train, batch_size=64, subset='training'),
epochs=10,
validation_data=datagen.flow(X_train, y_train, batch_size=64, subset='validation'))
Result
Here is the result of the model training
Is there anyway I can fix this problem?

Autoencoder Custom Dataset tensorflow 2.3 ValueError: `y` argument is not supported when using dataset as input

I'm trying to implement an Autoencoder in Tensorflow 2.3. I am taking my own Image dataset stored on disk as input.can someone explain to me how this can be done in a correct way?
I tried loading the data in directory using tf.keras.preprocessing.image_dataset_from_directory() but when I use start training with the data taken from above method I am getting following error.
"ValueError: y argument is not supported when using dataset as input."
PFB the code that I am running
'''
import tensorflow as tf
from convautoencoder import ConvAutoencoder
from tensorflow.keras.optimizers import Adam
import matplotlib.pyplot as plt
import numpy as np
EPOCHS = 25
batch_size = 1
img_height = 180
img_width = 180
data_dir = "/media/aniruddha/FE47-91B8/Laptop_Backup/Auto-Encoders/Basic/data"
train_ds = tf.keras.preprocessing.image_dataset_from_directory(
data_dir,
validation_split=0.2,
subset="training",
seed=123,
image_size=(img_height, img_width),
batch_size=batch_size)
val_ds = tf.keras.preprocessing.image_dataset_from_directory(
data_dir,
validation_split=0.2,
subset="validation",
seed=123,
image_size=(img_height, img_width),
batch_size=batch_size)
(encoder, decoder, autoencoder) = ConvAutoencoder.build(224, 224, 3)
opt = Adam(lr=1e-3)
autoencoder.compile(loss="mse", optimizer=opt)
H = autoencoder.fit( train_ds, train_ds, validation_data=(val_ds, val_ds), epochs=EPOCHS, batch_size=batch_size)
'''
I resolved this. I was not feeding the input dataset as a tuple to the model for training. Once I corrected that the training started.
I used generators to feed the input data as tuple to the autoencoder.
Please find my code below.
# initialize the training training data augmentation object
trainAug = ImageDataGenerator(rescale=1. / 255)
valAug = ImageDataGenerator(rescale=1. / 255)
# initialize the training generator
trainGen = trainAug.flow_from_directory(
config.TRAIN_PATH,
class_mode="input",
classes=None,
target_size=(64, 64),
color_mode="grayscale",
shuffle=True,
batch_size=BS)
# initialize the validation generator
valGen = valAug.flow_from_directory(
config.TRAIN_PATH,
class_mode="input",
classes=None,
target_size=(64, 64),
color_mode="grayscale",
shuffle=False,
batch_size=BS)
# initialize the testing generator
testGen = valAug.flow_from_directory(
config.TRAIN_PATH,
class_mode="input",
classes=None,
target_size=(64, 64),
color_mode="grayscale",
shuffle=False,
batch_size=BS)
early_stop = EarlyStopping(monitor='val_loss', patience=20)
mc = ModelCheckpoint('best_model_1.h5', monitor='val_loss', mode='min', save_best_only=True)
# construct our convolutional autoencoder
print("[INFO] building autoencoder...")
(encoder, decoder, autoencoder) = ConvAutoencoder.build(64, 64, 1)
opt = Adam(learning_rate= 0.0001, beta_1=0.9, beta_2=0.999, epsilon=1e-04, amsgrad=False)
autoencoder.compile(loss="mse", optimizer=opt)
# train the convolutional autoencoder
H = autoencoder.fit( trainGen, validation_data=valGen, epochs=EPOCHS, batch_size=BS ,callbacks=[ mc , early_stop])
fit is expecting data and labels, but it only accepts a single tf.data.Dataset. To use data as labels for the autoencoder you should provide it twice to the dataset constructor, e.g. :
dataset = tf.data.Dataset.from_tensor_slices((images, images))

Keras Why binary classification isn't as accurate as categorical calssification

I am trying to create a model which can tell whether there are birds in an image or not.
I was using categorical classification to train the model to recognize Bird v.s. Flowers, the results turned to be very successful in terms of recognizing these 2 classes.
BUT, when I change it to Binary Classification to detect the existence of birds in an images, the accuracy dropped dramatically.
The reason why I changed to use Binary classification is that if I
provided a dog to my Categorical Classification trained model, it
recognized the dog as a bird.
btw, here is my data set structure:
Training:
5000 images for birds and 2000 images for not-birds
Validating:
1000 images for birds and 500 images for not-birds
Someone said, the inblanced dataset will also cause problems. Is it true?
Could someone please point out where I get wrong in the following code?
def get_num_files(path):
if not os.path.exists(path):
return 0
return sum([len(files) for r, d, files in os.walk(path)])
def get_num_subfolders(path):
if not os.path.exists(path):
return 0
return sum([len(d) for r, d, files in os.walk(path)])
def create_img_generator():
return ImageDataGenerator(
preprocessing_function=preprocess_input,
rotation_range=30,
width_shift_range=0.2,
height_shift_range=0.2,
shear_range=0.2,
zoom_range=0.2,
horizontal_flip=True
)
INIT_LT = 1e-3
Image_width, Image_height = 299, 299
Training_Epochs = 30
Batch_Size = 32
Number_FC_Neurons = 1024
Num_Classes = 2
train_dir = 'to my train folder'
validate_dir = 'to my validation folder'
num_train_samples = get_num_files(train_dir)
num_classes = get_num_subfolders(train_dir)
num_validate_samples = get_num_files(validate_dir)
num_epoch = Training_Epochs
batch_size = Batch_Size
train_image_gen = create_img_generator()
test_image_gen = create_img_generator()
train_generator = train_image_gen.flow_from_directory(
train_dir,
target_size=(Image_width, Image_height),
batch_size = batch_size,
seed = 42
)
validation_generator = test_image_gen.flow_from_directory(
validate_dir,
target_size=(Image_width, Image_height),
batch_size=batch_size,
seed=42
)
Inceptionv3_model = InceptionV3(weights='imagenet', include_top=False)
print('Inception v3 model without last FC loaded')
x = Inceptionv3_model.output
x = GlobalAveragePooling2D()(x)
x = Dense(Number_FC_Neurons, activation='relu')(x)
predictions = Dense(num_classes, activation='softmax')(x)
# model = Model(inputs=Inceptionv3_model.input, outputs=predictions)
v3model = Model(inputs=Inceptionv3_model.input, outputs=predictions)
# Use new Sequential model to add v3model and add a bath normalization layer after
model = Sequential()
model.add(v3model)
model.add(BatchNormalization()) # added normalization
print(model.summary())
print('\nFine tuning existing model')
Layers_To_Freeze = 172
for layer in model.layers[:Layers_To_Freeze]:
layer.trainable = False
for layer in model.layers[Layers_To_Freeze:]:
layer.trainable = True
optizer = Adam(lr=INIT_LT, decay=INIT_LT / Training_Epochs)
# optizer = SGD(lr=0.0001, momentum=0.9)
model.compile(optimizer=optizer, loss='binary_crossentropy', metrics=['accuracy'])
cbk_early_stopping = EarlyStopping(monitor='val_acc', mode='max')
history_transfer_learning = model.fit_generator(
train_generator,
steps_per_epoch = num_train_samples,
epochs=num_epoch,
validation_data=validation_generator,
validation_steps = num_validate_samples,
class_weight='auto',
callbacks=[cbk_early_stopping]
)
model.save('incepv3_transfer_mini_binary.h5', overwrite=True, include_optimizer=True)
Categorical
Use Num_Classes = 2
Use one-hot-encoded targets (example: Bird = [1, 0], Flower = [0, 1])
Use 'softmax' activation
Use 'categorical_crossentropy'
Binary
Use Num_Classes = 1
Use binary targets (example: is flower = 1 | not flower = 0)
Use 'sigmoid' activation
Use 'binary_crossentropy'
Details here: Using categorical_crossentropy for only two classes

Low accurcacy - Transfer learning + bottle-neck keras-tensorflow (resnet50)

I'm trying to do transfer learning / bottle neck with keras/tensorflow on a google Colaboratory notebook. My problem is that the accuracy doesn't go over 6% (Kaggle's dog breed challenge, 120 classes, data generated with datagen.flow_from_directory)
Below is my code, is there something I'm missing?
tr_model=ResNet50(include_top=False,
weights='imagenet',
input_shape = (224, 224, 3),)
datagen = ImageDataGenerator(rescale=1. / 255)
#### Training ####
train_generator = datagen.flow_from_directory(train_data_dir,
target_size=(image_size,image_size),
class_mode=None,
batch_size=batch_size,
shuffle=False)
bottleneck_features_train = tr_model.predict_generator(train_generator)
train_labels = to_categorical(train_generator.classes , num_classes=num_classes)
#### Validation ####
validation_generator = datagen.flow_from_directory(validation_data_dir,
target_size=(image_size,image_size),
class_mode=None,
batch_size=batch_size,
shuffle=False)
bottleneck_features_validation = tr_model.predict_generator(validation_generator)
validation_labels = to_categorical(validation_generator.classes, num_classes=num_classes)
#### Model creation ####
model = Sequential()
model.add(Flatten(input_shape=bottleneck_features_train.shape[1:]))
model.add(Dense(num_class, activation='softmax'))
model.compile(optimizer='rmsprop',
loss='categorical_crossentropy',
metrics=['accuracy'])
history = model.fit(bottleneck_features_train, train_labels,
epochs=30,
batch_size=batch_size,
validation_data=(bottleneck_features_validation, validation_labels))
I get a val_acc = 0.0592
When I use ResNet50 with the last layer, I get a score of 82%.
Can anyone spot what's wrong with my code.
Suppress the rescale and add the preprocessing helped a lot.
Those modifications help immensely:
from keras.applications.resnet50 import preprocess_input
datagen = ImageDataGenerator(preprocessing_function=preprocess_input)
I now have an accuracy of 80%