I'm currently working on a CNN model that classifies food images. So far, I have managed to build a functioning CNN, but I would like to improve the accuracy. For the dataset, I have used some images from Kaggle and a few from my own collection.
Here is some information about the dataset:
There are 91 classes of food images.
Each class has around 500 to 650 images.
The dataset has been manually cleaned and checked for unrelated or bad quality images (the photos are of different sizes).
Here is my CNN model:
from keras.models import Sequential
from keras.layers import InputLayer, Conv2D, MaxPooling2D, Dropout, Flatten, Dense

classifier = Sequential()

def cnn_layer_creation(classifier):
    classifier.add(InputLayer(input_shape=[224, 224, 3]))
    classifier.add(Conv2D(filters=32, kernel_size=5, strides=1, padding='same', activation='relu', data_format='channels_first'))
    classifier.add(MaxPooling2D(pool_size=5, padding='same'))
    classifier.add(Conv2D(filters=50, kernel_size=5, strides=1, padding='same', activation='relu'))
    classifier.add(MaxPooling2D(pool_size=5, padding='same'))
    classifier.add(Conv2D(filters=80, kernel_size=5, strides=1, padding='same', activation='relu', data_format='channels_last'))
    classifier.add(MaxPooling2D(pool_size=5, padding='same'))
    classifier.add(Dropout(0.25))
    classifier.add(Flatten())
    classifier.add(Dense(64, activation='relu'))
    classifier.add(Dropout(rate=0.5))
    classifier.add(Dense(91, activation='softmax'))
    # Compiling the CNN
    classifier.compile(optimizer='RMSprop', loss='categorical_crossentropy', metrics=['accuracy'])
    data_initialization(classifier)

def data_initialization(classifier):
    # Part 2 - Fitting the CNN to the images
    from keras.preprocessing.image import ImageDataGenerator
    train_datagen = ImageDataGenerator(rescale=1./255,
                                       shear_range=0.2,
                                       zoom_range=0.2,
                                       horizontal_flip=True)
    test_datagen = ImageDataGenerator(rescale=1./255,
                                      shear_range=0.2,
                                      zoom_range=0.2,
                                      horizontal_flip=True)
    training_set = train_datagen.flow_from_directory('food_image/train',
                                                     target_size=(224, 224),
                                                     batch_size=100,
                                                     class_mode='categorical')
    test_set = test_datagen.flow_from_directory('food_image/test',
                                                target_size=(224, 224),
                                                batch_size=100,
                                                class_mode='categorical')
    classifier.fit_generator(training_set,
                             steps_per_epoch=100,
                             epochs=100,
                             validation_data=test_set,
                             validation_steps=100)
    classifier.save("brynModelGPULite.h5")
    classifier.summary()

def main():
    cnn_layer_creation(classifier)
Training is done on a GPU (NVIDIA 980M).
Unfortunately, the accuracy has not exceeded 10%. Things I've tried are:
Increase the number of epochs.
Change the optimizer (ADAM, RMSPROP).
Change the activation function.
Reduce the image input size.
Increase the batch size.
Change the number of filters to 32, 64, 128.
None of these have improved the accuracy.
Could anyone shed some light on how I might improve my model's accuracy?
You should augment only the training data; the test data should only be rescaled.
The following code
test_datagen = ImageDataGenerator(rescale = 1./255,
shear_range = 0.2,
zoom_range = 0.2,
horizontal_flip = True)
should be
test_datagen = ImageDataGenerator(rescale = 1./255)
Firstly, I assume you are building your model from scratch. In that case, training for relatively few epochs (I assume you have not trained for more than 1000) will not help, because the network cannot learn good representations in so few epochs when it is trained from scratch. You can try increasing the number of epochs to around 10000 and see. Better, though, would be to use transfer learning: you can do feature extraction, fine-tuning, or both with a pre-trained convnet. For reference, have a look at chapter 5 of Deep Learning with Python by Francois Chollet.
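For instance, here is a minimal sketch of the feature-extraction approach, using the 224x224 RGB inputs and 91 classes from the question; the choice of VGG16 and the head layers are illustrative assumptions, not prescriptions:
# Minimal transfer-learning sketch (feature extraction). The 224x224 input
# and 91 classes come from the question; VGG16 and the head are illustrative.
from tensorflow.keras.applications import VGG16
from tensorflow.keras.layers import GlobalAveragePooling2D, Dropout, Dense
from tensorflow.keras.models import Model

base = VGG16(weights='imagenet', include_top=False, input_shape=(224, 224, 3))
base.trainable = False  # freeze the pre-trained convolutional base

x = GlobalAveragePooling2D()(base.output)
x = Dropout(0.5)(x)
outputs = Dense(91, activation='softmax')(x)

model = Model(inputs=base.input, outputs=outputs)
model.compile(optimizer='rmsprop', loss='categorical_crossentropy', metrics=['accuracy'])
# Once this converges, unfreeze the last few base layers and fine-tune
# with a low learning rate.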
I had the same problem with another dataset, and replacing the Flatten layer with GlobalAveragePooling2D solved it.
I'm not sure this will work for you, but since my model has a structure similar to yours, I think it can help. The difference is that I trained my model on 3 classes.
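In the question's model, that swap would look roughly like this (a sketch; the surrounding layers stay unchanged):
from tensorflow.keras.layers import GlobalAveragePooling2D

# Instead of:
#   classifier.add(Flatten())
# use:
classifier.add(GlobalAveragePooling2D())
# GlobalAveragePooling2D averages each feature map down to a single value,
# drastically shrinking the parameter count of the following Dense layer.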
I'm working on an image classification task for diabetic retinopathy with fundus image data. There are 5 classes. The data distribution is 1805 images (class 1), 370 images (class 2), 999 images (class 3), 193 images (class 4), and 295 images (class 5).
Here are the steps that I have tried to run:
Preprocessing (resized to 224 × 224)
Train and test data are split 85% : 15%
x_train, xtest, y_train, ytest = train_test_split(
x_train, y_train,
test_size = 0.15,
random_state=SEED,
stratify = y_train
)
Data augmentation
ImageDataGenerator(
zoom_range=0.15,
fill_mode='constant',
cval=0.,
horizontal_flip=True,
vertical_flip=True,
)
Training with the ResNet-50 model and cross-validation
def getResNet():
    modelres = ResNet50(weights=None, include_top=False, input_shape=(IMAGE_HEIGHT, IMAGE_HEIGHT, 3))
    x = modelres.output
    x = GlobalAveragePooling2D()(x)
    x = Dense(5, activation='softmax')(x)
    model = Model(inputs=modelres.input, outputs=x)
    return model

num_folds = 5
skf = StratifiedKFold(n_splits=5, shuffle=True, random_state=2021)
cvscores = []
fold = 1
for train, val in skf.split(x_train, y_train.argmax(1)):
    print('Fold: ', fold)
    Xtrain = x_train[train]
    Xval = x_train[val]
    Ytrain = y_train[train]
    Yval = y_train[val]
    data_generator = create_datagen().flow(Xtrain, Ytrain, batch_size=32, seed=2021)
    model = getResNet()
    model.compile(loss='categorical_crossentropy',
                  optimizer=Adam(lr=0.0001),
                  metrics=['accuracy'])
    with tf.compat.v1.device('/device:GPU:0'):
        model_train = model.fit(data_generator,
                                validation_data=(Xval, Yval),
                                epochs=30, batch_size=32, verbose=1)
    model_name = 'cnn_keras_aug_Fold_' + str(fold) + '.h5'
    model.save(model_name)
    scores = model.evaluate(xtest, ytest, verbose=0)
    print("%s: %.2f%%" % (model.metrics_names[1], scores[1]*100))
    cvscores.append(scores[1] * 100)
    fold = fold + 1
The maximum results I got from this method were training accuracy of 81.2%, validation accuracy of 72.2%, and test accuracy of 70.73%.
Can anyone give me an idea to improve the model so that I can get the test accuracy above 90%?
Later, I will use this model as a pre-trained model to train on diabetic retinopathy data from other sources as well.
BTW, I've tried replacing my preprocessing with this method:
def preprocessing(path):
    image = cv2.imread(path)
    image = crop_image_from_gray(image)
    green = image[:, :, 1]
    # CLAHE (contrast-limited adaptive histogram equalisation) on the green channel
    clahe = cv2.createCLAHE(clipLimit=2.0, tileGridSize=(8, 8))
    cl = clahe.apply(green)
    image[:, :, 1] = cl  # replace the green channel with its equalised version
    image = cv2.resize(image, (224, 224))
    return image
I've also tried replacing my model with VGG16 and EfficientNetB0. However, none of that had much effect on my results; I'm still stuck at about 70% accuracy.
Please help me come up with ideas to improve my modeling results.
Your training accuracy is 81.2%. It is generally impossible to have testing accuracy higher than training accuracy, i.e. with the current setup you will not achieve 90%.
However, your validation (and testing) accuracy is about 70-72%, which suggests that your model is overfitting on this small dataset. If you add regularization (e.g. dropout), the gap between your training and your validation (and test) accuracy should decrease, and this way you can improve your validation score.
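As an illustration, dropout could go in the classification head of the getResNet() model above; this is only a sketch, and the 0.5 rate is an assumption to be tuned on the validation folds:
# Sketch: regularizing the head of the question's ResNet model with dropout.
# The 0.5 rate is illustrative; IMAGE_HEIGHT is the constant from the question.
from tensorflow.keras.applications import ResNet50
from tensorflow.keras.layers import GlobalAveragePooling2D, Dropout, Dense
from tensorflow.keras.models import Model

def getResNet():
    modelres = ResNet50(weights=None, include_top=False, input_shape=(IMAGE_HEIGHT, IMAGE_HEIGHT, 3))
    x = GlobalAveragePooling2D()(modelres.output)
    x = Dropout(0.5)(x)  # randomly drops activations to reduce overfitting
    x = Dense(5, activation='softmax')(x)
    return Model(inputs=modelres.input, outputs=x)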
To further increase the score, you need to check your data manually and try to understand which classes contribute the most to the errors and figure out how those errors can be reduced (e.g. updating your preprocessing pipeline).
I'm trying to do binary image classification, but the two classes (~590 and ~5900 instances for class 1 and class 2, respectively) are heavily skewed, though still quite distinct.
Is there any way I can fix this? I want to try SMOTE/random weighted oversampling.
I've tried a lot of different things but I'm stuck. I've tried using class_weights=[10,1],[5900,590], and [1/5900,1/590] and my model still only predicts class 2.
I've tried using tf.data.experimental.sample_from_datasets but I couldn't get it to work. I've even tried using sigmoid focal cross-entropy loss, which helped a lot but not enough.
I want to be able to oversample class 1 by a factor of 10; the only thing I have tried that has kind of worked is manual oversampling, i.e. copying the train dir's class 1 instances to match the number of instances in class 2.
Is there not an easier way of doing this? I'm using Google Colab, so this approach is extremely inefficient.
Is there a way to specify SMOTE params / oversampling within the data generator or similar?
data/
...class_1/
........image_1.jpg
........image_2.jpg
...class_2/
........image_1.jpg
........image_2.jpg
My data is in the form shown above.
TRAIN_DATAGEN = ImageDataGenerator(rescale = 1./255.,
rotation_range = 40,
width_shift_range = 0.2,
height_shift_range = 0.2,
shear_range = 0.2,
zoom_range = 0.2,
horizontal_flip = True)
TEST_DATAGEN = ImageDataGenerator(rescale = 1.0/255.)
TRAIN_GENERATOR = TRAIN_DATAGEN.flow_from_directory(directory = TRAIN_DIR,
batch_size = BATCH_SIZE,
class_mode = 'binary',
target_size = (IMG_HEIGHT, IMG_WIDTH),
subset = 'training',
seed = DATA_GENERATOR_SEED)
VALIDATION_GENERATOR = TEST_DATAGEN.flow_from_directory(directory = VALIDATION_DIR,
batch_size = BATCH_SIZE,
class_mode = 'binary',
target_size = (IMG_HEIGHT, IMG_WIDTH),
subset = 'validation',
seed = DATA_GENERATOR_SEED)
...
...
...
HISTORY = MODEL.fit(TRAIN_GENERATOR,
validation_data = VALIDATION_GENERATOR,
epochs = EPOCHS,
verbose = 2,
callbacks = [EARLY_STOPPING],
class_weight = CLASS_WEIGHT)
I'm relatively new to TensorFlow, but I have some experience with ML as a whole. I've been tempted to switch to PyTorch several times, as its data loaders have params that automatically (over/under)sample with sampler=WeightedRandomSampler.
Note: I've looked at many tutorials about how to oversample, but none of them are image classification problems. I want to stick with TF/Keras as it allows for easy transfer learning. Could you guys help out?
You can use this strategy to calculate weights based on the imbalance:
from sklearn.utils import class_weight
import numpy as np

class_weights = class_weight.compute_class_weight(
    class_weight='balanced',
    classes=np.unique(train_generator.classes),
    y=train_generator.classes)
train_class_weights = dict(enumerate(class_weights))
model.fit_generator(..., class_weight=train_class_weights)
In Python you can implement SMOTE using the imblearn library as follows:
from imblearn.over_sampling import SMOTE
oversample = SMOTE()
X, y = oversample.fit_resample(X, y)
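Note that SMOTE expects a 2-D feature matrix, so for images you would flatten each sample before resampling and reshape afterwards. A rough sketch, assuming the images are already loaded into a NumPy array X of shape (n, height, width, channels) with labels y:
# Sketch: SMOTE on image data via flattening. Assumes X is a NumPy array
# of shape (n, height, width, channels) and y holds the binary labels.
import numpy as np
from imblearn.over_sampling import SMOTE

n, h, w, c = X.shape
X_flat = X.reshape(n, h * w * c)      # SMOTE needs a 2-D feature matrix
X_res, y_res = SMOTE().fit_resample(X_flat, y)
X_res = X_res.reshape(-1, h, w, c)    # back to image tensors for the CNN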
Since you already define your class_weight as a dictionary, e.g. {0: 10, 1: 1}, you might also try augmenting the minority class. See balancing an imbalanced dataset with keras image generator and the tutorial (mentioned there) at https://blog.keras.io/building-powerful-image-classification-models-using-very-little-data.html
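A rough sketch of that idea, writing augmented copies of the minority-class images to disk before training; the data/class_1 layout comes from the question, while the output directory, image size, and the factor of 10 are assumptions:
# Sketch: offline augmentation of the minority class (class_1). The 'data'
# layout comes from the question; out_dir and the target size are assumptions.
import os
from tensorflow.keras.preprocessing.image import ImageDataGenerator

out_dir = 'data_augmented/class_1'  # merge these into data/class_1 afterwards
os.makedirs(out_dir, exist_ok=True)

augmenter = ImageDataGenerator(rotation_range=40, zoom_range=0.2, horizontal_flip=True)
minority_gen = augmenter.flow_from_directory('data',
                                             classes=['class_1'],  # read only the minority class
                                             target_size=(224, 224),
                                             batch_size=32,
                                             save_to_dir=out_dir,
                                             save_format='jpeg')

# Each next() call writes one augmented batch to out_dir; nine full passes
# over class_1 gives roughly the 10x oversampling mentioned above.
for _ in range(9 * len(minority_gen)):
    next(minority_gen)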
I am working on a dataset of gray images that are saved under RGB format. I trained VGG16 on this dataset, and preprocessed them this way:
train_data_gen = ImageDataGenerator(rescale=1./255,rotation_range = 20,
width_shift_range = 0.2,
height_shift_range = 0.2,
horizontal_flip = True)
validation_data_gen = ImageDataGenerator(rescale=1./255)
train_gen = train_data_gen.flow_from_directory(trainPath,
                                               target_size=(224, 224),
                                               batch_size=64,
                                               class_mode='categorical')
validation_gen = validation_data_gen.flow_from_directory(validationPath,
                                                         target_size=(224, 224),
                                                         batch_size=64,
                                                         class_mode='categorical')
When the training was done, both training and validation accuracy were high (92%).
In the prediction phase, I first tried to preprocess images as indicated in https://keras.io/applications/ :
img = image.load_img(img_path, target_size=(image_size,image_size))
x = image.img_to_array(img)
x = np.expand_dims(x, axis=0)
x = preprocess_input(x)
However, the test accuracy was very low: around 50%. I tried the prediction again on train and validation samples and also got a low accuracy, around 50%, which means that the problem is in the prediction phase.
Instead, I preprocessed the images using the OpenCV library, and the accuracy was better, but still not as expected. I made predictions on train samples (where accuracy during training was 92%) and got 82%. Here is the code:
img = cv2.imread(imagePath)
# np.flip(img, axis=-1)
img = cv2.resize(img, (224, 224), interpolation=cv2.INTER_AREA)
img = np.reshape(img, (1, img.shape[0], img.shape[1], img.shape[2]))
img = img / 255.
The result is the same with/without flipping the image. What's wrong with the preprocessing step?
Thanks
The error was in the interpolation parameter of the resize function: it should be cv2.INTER_NEAREST instead of cv2.INTER_AREA.
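With that fix, the prediction-time preprocessing mirrors the training pipeline (the same 224x224 size and the generators' 1/255 rescaling); a sketch, with imagePath and model taken from the question:
# Sketch: prediction preprocessing that matches the training generators
# (224x224 resize, 1/255 rescale). imagePath and model are from the question.
import cv2
import numpy as np

img = cv2.imread(imagePath)
img = cv2.resize(img, (224, 224), interpolation=cv2.INTER_NEAREST)
img = np.expand_dims(img, axis=0)  # add the batch dimension
img = img / 255.                   # match the generators' rescale=1./255
pred = model.predict(img)
# OpenCV loads BGR rather than RGB, but since these images are grayscale
# saved as RGB (per the question), the channel order does not matter here.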
I am training a model for optical character recognition of the Gujarati language. The input is a character image, and I have taken 37 classes. There are 22200 training images (600 per class) and 5920 testing images (160 per class). My input images are 32x32.
Below is my code:
model = tf.keras.applications.DenseNet121(include_top=False, weights='imagenet', pooling='max')
base_inputs = model.layers[0].input
base_outputs = model.layers[-1].output # NOTICE -1 not -2
prefinal_outputs = layers.Dense(1024)(base_outputs)
final_outputs = layers.Dense(37)(prefinal_outputs)
new_model = keras.Model(inputs=base_inputs, outputs=base_outputs)
from tensorflow.keras.preprocessing.image import ImageDataGenerator

train_datagen = ImageDataGenerator(shear_range=0.2,
                                   zoom_range=0.2,
                                   horizontal_flip=False)
test_datagen = ImageDataGenerator(horizontal_flip=False)

training_set = train_datagen.flow_from_directory('C:/Users/shweta/Desktop/characters/train',
                                                 target_size=(32, 32),
                                                 batch_size=64,
                                                 class_mode='categorical')
test_set = test_datagen.flow_from_directory('C:/Users/shweta/Desktop/characters/test',
                                            target_size=(32, 32),
                                            batch_size=64,
                                            class_mode='categorical')

new_model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])
new_model.fit_generator(training_set,
                        epochs=25,
                        validation_data=test_set, shuffle=True)
new_model.save('alphanumeric.mod')
I am getting the following output:
Thanks in advance!
First of all, very well written code.
Here are some things I noticed while going through the code and the tf.keras docs.
What kind of labels do you have? categorical_crossentropy expects one-hot encoded labels (check this), so if your labels are integers, use sparse_categorical_crossentropy.
Similar issue: there was a post where someone was trying to classify into 2 classes and used categorical instead of binary crossentropy, if you want to look at it.
Cheers
Let me know how it goes!
PS: @gerry made a very good point: if your labels are one-hot encoded, use categorical_crossentropy!
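A quick sketch of the distinction; the tiny model here is purely illustrative:
# Sketch: matching the loss to the label format (illustrative model).
import tensorflow as tf

model = tf.keras.Sequential([tf.keras.layers.Dense(37, activation='softmax', input_shape=(1024,))])

# One-hot labels of shape (batch, 37), which is what
# flow_from_directory(class_mode='categorical') yields:
model.compile(optimizer='adam', loss='categorical_crossentropy')

# Integer labels of shape (batch,) with values 0..36:
model.compile(optimizer='adam', loss='sparse_categorical_crossentropy')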
The code should be:
model = tf.keras.applications.DenseNet121(include_top=False, weights='imagenet', pooling='max', input_shape=(32,32,3))
base_outputs = model.layers[-1].output
prefinal_outputs = layers.Dense(1024)(base_outputs)
final_outputs = layers.Dense(37, activation='softmax')(prefinal_outputs)
new_model = keras.Model(inputs=model.input, outputs=final_outputs)
new_model.compile(Adam(), loss='categorical_crossentropy', metrics=['accuracy'])
Also, you should use model.fit in the future. model.fit can now work with generators, and model.fit_generator will be deprecated in future versions of TensorFlow. I ran against your dataset and got accurate results in about 10 epochs. Here is some additional advice: it is best to use an adjustable learning rate. The Keras callback ReduceLROnPlateau makes this easy to do; documentation is here. Set it to monitor the validation loss. My use is shown below.
lr_adjust=tf.keras.callbacks.ReduceLROnPlateau( monitor="val_loss", factor=0.5, patience=1, verbose=1, mode="auto",
min_delta=0.00001, cooldown=0, min_lr=0)
Also I recommend using the callback ModelCheckpoint. Documentation is here. Set it up to monitor validation loss and it will save the weights that achieved the lowest validation loss. My implementation is shown below.
save_loc=r'c:\Temp' # set this to the path where you want to save the weights
checkpoint=tf.keras.callbacks.ModelCheckpoint(filepath=save_loc, monitor='val_loss', verbose=1, save_best_only=True,
save_weights_only=True, mode='auto', save_freq='epoch', options=None)
callbacks=[checkpoint, lr_adjust]
In model.fit include callbacks=callbacks. When training is completed you want to load these saved weights into the model, then save the model. You can use the saved model to make predictions. Code is below.
model.load_weights(save_loc)
model.save(save_loc)
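For completeness, the training call then looks roughly like this, reusing the question's generator names; a sketch under those assumptions, not the answerer's exact code:
# Sketch: passing both callbacks to model.fit, with the generators
# (training_set, test_set) defined in the question.
history = new_model.fit(training_set,
                        epochs=25,
                        validation_data=test_set,
                        callbacks=[checkpoint, lr_adjust])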
I'm doing HPO on a small custom CNN. During training the GPU is under-utilised and I'm finding a bottleneck in the CPU: the data augmentation process is too slow. Looking online, I found that I could use multiple CPU cores for the generator to speed up the process. I set workers=n_cores and this did improve things, but not as much as I'd like.
So I thought that I could train multiple models simultaneously on the GPU and feed the same augmented data to all of them. However, I can't come up with a way to do this and I couldn't find any similar question.
Here's a minimal example (I'm leaving out imports for brevity):
# load model and set only last layer as trainable
def create_model(learning_rate, alpha, dropout):
    model_path = '/content/drive/My Drive/Progetto Advanced Machine Learning/Model Checkpoints/Custom Model 1 2020-06-01 10:56:21.010759.hdf5'
    model = tf.keras.models.load_model(model_path)
    x = model.layers[-2].output
    x = Dropout(dropout)(x)
    predictions = Dense(120, activation='softmax', name='prediction', kernel_regularizer=tf.keras.regularizers.l2(alpha))(x)
    model = Model(inputs=model.inputs, outputs=predictions)
    for layer in model.layers[:-2]:
        layer.trainable = False
    model.compile(loss='categorical_crossentropy', optimizer=Adam(learning_rate), metrics=['accuracy'])
    return model
#declare the search space
SEARCH_SPACE = [skopt.space.Real(0.0001, 0.1, name='learning_rate', prior='log-uniform'),
skopt.space.Real(1e-9, 1, name='alpha', prior='log-uniform'),
skopt.space.Real(0.0001, 0.95, name='dropout', prior='log-uniform')]
# declare generator
train_datagenerator = ImageDataGenerator(rescale=1. / 255, rotation_range=30, zoom_range=0.2, horizontal_flip=True, validation_split=0.2, data_format='channels_last')
# training function to be called by the optimiser
@use_named_args(SEARCH_SPACE)
def fitness(learning_rate, alpha, dropout):
    model = create_model(learning_rate, alpha, dropout)

    # compile generators
    train_batches = train_datagenerator.flow_from_directory(train_out_path, target_size=image_size, color_mode="rgb", class_mode="categorical", batch_size=32, subset='training', seed=20052020)
    val_batches = train_datagenerator.flow_from_directory(directory=train_out_path, target_size=image_size, color_mode="rgb", class_mode="categorical", batch_size=32, subset='validation', shuffle=False, seed=20052020)

    # train
    early_stopping = EarlyStopping(monitor='val_loss', patience=3, restore_best_weights=True)
    training_results = model.fit(train_batches, epochs=5, verbose=1, shuffle=True, validation_data=val_batches, workers=2)
    history[hyperpars] = training_results.history
    with open(dict_save_path, 'wb') as f:
        pickle.dump(history, f)
    return training_results.history['val_accuracy'][-1]
# HPO
result = skopt.forest_minimize(fitness, SEARCH_SPACE, n_calls=10, callback=checkpoint_saver)