I am (very) new to deep learning and I am trying to train a dog breed classifier using Tensorflow/Keras. I have selected a subset of 10 breeds to speed up calculations, and I am using all the images available in the Stanford dataset for those breeds, which I have placed in train/test/val directories. I have 1338 images for training, 379 images for validation and 200 images for test.
I first tried building a simple CNN from scratch without data augmentation. It quickly reached 99% accuracy on the training set but got stuck at 30% on the validation set (which I assume is quite normal without augmentation?).
Then I applied data augmentation and tried two approaches: building a CNN from scratch and using transfer learning. With the "home-made" CNN I can't get above roughly 30% accuracy even on the training set, and I can't figure out what the problem is. With transfer learning I am stuck around 80%, which I guess is not that good either?
Here is the code for data augmentation:
`
# Creating image generator steps
train_datagen = ImageDataGenerator(rescale=1.0/255.0,
rotation_range=60,
width_shift_range=0.3,
height_shift_range=0.3,
shear_range=0.2,
zoom_range=[0.5, 1.5],
brightness_range=[0.5, 1.5],
horizontal_flip=True
)
val_datagen = ImageDataGenerator(rescale=1.0/255.0)
test_datagen = ImageDataGenerator(rescale=1.0/255.0)
train_generator = train_datagen.flow_from_directory(
directory="split_output/train",
target_size=(224,224),
color_mode="rgb",
batch_size=8,
class_mode='sparse',
shuffle='True',
seed=42
)
val_generator = val_datagen.flow_from_directory(
directory="split_output/val",
target_size=(224,224),
color_mode="rgb",
batch_size=8,
class_mode='sparse',
shuffle='True',
seed=42
)
test_generator = test_datagen.flow_from_directory(
directory="split_output/test",
target_size=(224,224),
color_mode="rgb",
batch_size=8,
class_mode='sparse',
shuffle='False',
seed=42
)
`
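One thing worth checking in the generators above: `shuffle` is passed as the strings `'True'` and `'False'` rather than as booleans. In Python any non-empty string is truthy, so if the generator only tests the value's truthiness (an assumption about Keras internals that is worth verifying for your version), `shuffle='False'` would still shuffle the test set. A minimal illustration in plain Python, where `will_shuffle` is a hypothetical stand-in for the library's check:

```python
# Non-empty strings are truthy in Python, even the string 'False'.
string_flag = 'False'   # what the test generator above receives
bool_flag = False       # what was presumably intended

def will_shuffle(flag):
    """Hypothetical stand-in for a library check such as `if shuffle:`."""
    return bool(flag)

print(will_shuffle(string_flag))  # True
print(will_shuffle(bool_flag))    # False
```

Passing `shuffle=False` (no quotes) for the test generator avoids the ambiguity.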
Here is the first CNN I tried (for which both accuracies are stuck around 25%):
`
# The CNN architecture
model = Sequential()
model.add(Conv2D(32,(3,3), padding="same", activation='relu',input_shape = (224,224,3)))
model.add(MaxPooling2D((2,2)))
# 32 = number of filters
# (3, 3) = kernel size
model.add(Conv2D(64,(3,3), padding="same", activation='relu'))
model.add(MaxPooling2D((2,2)))
model.add(Conv2D(64,(3,3), padding="same", activation='relu'))
model.add(MaxPooling2D((2,2)))
model.add(Flatten())
model.add(Dense(64,activation='relu'))
model.add(Dense(10,activation='softmax'))
# Fitting the model
model.compile(optimizer='adam',
loss='sparse_categorical_crossentropy',
metrics=['accuracy']
)
history = model.fit(train_generator,  # fit_generator is deprecated; fit accepts generators
# steps_per_epoch=1000,
epochs=50,
validation_data=val_generator,
# validation_steps=250,
verbose=1
)
`
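To get a feel for the size of this first network relative to the 1338 training images, the layer shapes can be traced by hand. The sketch below is plain Python arithmetic (the helper names are made up for illustration) and should agree with what `model.summary()` reports:

```python
def conv_out(size, kernel=3, padding="same"):
    """Spatial size after a stride-1 convolution."""
    return size if padding == "same" else size - kernel + 1

def pool_out(size, pool=2):
    """Spatial size after non-overlapping max pooling (floor division)."""
    return size // pool

size = 224
for _ in range(3):  # three Conv2D(padding="same") + MaxPooling2D((2, 2)) blocks
    size = pool_out(conv_out(size))

flat_features = size * size * 64          # 64 filters in the last conv layer
dense_params = flat_features * 64 + 64    # weights + biases of Dense(64)

print(size, flat_features, dense_params)  # 28 50176 3211328
```

Most of the parameters sit in the first Dense layer, which may be part of why the network memorizes such a small training set so easily without augmentation.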
And the second one, a bit deeper and including BatchNorm and Dropout (accuracies are stuck around 35%):
`
# The CNN architecture
model = Sequential()
model.add(Conv2D(32,(3,3), padding="same", activation='relu',input_shape = (224,224,3)))
model.add(MaxPooling2D((2,2)))
# 32 = number of filters
# (3, 3) = kernel size
model.add(Conv2D(32,(3,3),activation='relu'))
model.add(BatchNormalization())
model.add(MaxPooling2D((2,2)))
model.add(Conv2D(64,(3,3), padding="same", activation='relu'))
model.add(BatchNormalization())
model.add(MaxPooling2D((2,2)))
model.add(Conv2D(128,(3,3), padding="same", activation='relu'))
model.add(BatchNormalization())
model.add(MaxPooling2D((2,2)))
model.add(Flatten())
model.add(Dropout(0.5))
model.add(Dense(512,activation='relu'))
model.add(Dense(10,activation='softmax'))
model.summary()
opt = Adam(learning_rate=0.0001)  # `lr` is a deprecated alias
# Fitting the model
model.compile(optimizer=opt,
loss='sparse_categorical_crossentropy',
metrics=['accuracy']
)
history = model.fit(train_generator,
# steps_per_epoch=1000,
epochs=50,
validation_data=val_generator,
# validation_steps=250,
verbose=1
)
`
Here is the history for that second CNN:
[Image: accuracies for 2nd CNN]
And finally I tried with a resnet, which gets stuck around 90% for train and 80% for val:
`
model = Sequential()
model.add(ResNet50(include_top=False, pooling='avg', weights="imagenet"))
model.add(Flatten())
model.add(BatchNormalization())
model.add(Dense(2048, activation='relu'))
model.add(BatchNormalization())
model.add(Dense(1024, activation='relu'))
model.add(BatchNormalization())
model.add(Dense(10, activation='softmax'))
opt = Adam(learning_rate=0.0001)  # `lr` is a deprecated alias
model.compile(optimizer=opt, loss='sparse_categorical_crossentropy', metrics=['accuracy'])
history = model.fit(train_generator,
# steps_per_epoch=1000,
epochs=150,
validation_data=val_generator,
# validation_steps=250,
verbose=1
)
`
And the history for this last one:
[Image: resnet history]
I'm a bit surprised at how the accuracies (especially val) get stuck so fast at a nearly constant value...
Again I'm very new at this so there could be very basic mistakes!
Related
I am training my first ML model. I am working on a 10-class classification problem. From what I can see, the model is overfitting since there is a significant difference between the training and validation accuracy.
This is the relevant code for the model
model = keras.Sequential()
model.add(keras.Input(shape=(x_train[0].shape)))
model.add(tf.keras.layers.Conv2D(filters=32,kernel_size=3, strides = (3, 3), padding = "same", activation = "relu", kernel_regularizer=tf.keras.regularizers.l1_l2(0.01)))
model.add(tf.keras.layers.MaxPool2D(strides=2))
model.add(tf.keras.layers.Conv2D(filters=32, kernel_size=(3,3), padding='valid', activation='relu', kernel_regularizer=tf.keras.regularizers.l1_l2(0.01)))
model.add(tf.keras.layers.MaxPool2D(strides=2))
model.add(tf.keras.layers.Dropout(0.5))
model.add(tf.keras.layers.Flatten())
model.add(tf.keras.layers.Dense(128, activation='relu'))
model.add(tf.keras.layers.Dropout(0.5))
model.add(tf.keras.layers.BatchNormalization())
model.add(tf.keras.layers.Dense(10))
optimizer = tf.keras.optimizers.Adam(learning_rate=0.001/2)
model.summary()
model.compile(optimizer=optimizer,
loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
metrics=['accuracy'])
history = model.fit(x_train, y_train, epochs = 30, validation_data = (x_val, y_val), callbacks=tf.keras.callbacks.EarlyStopping(verbose=1, patience=4))
There are large fluctuations in the validation accuracy and I am not sure why.
I have tried augmenting the data and have also injected noise into the training data. (This is an audio classification problem with 10 different classes)
https://i.stack.imgur.com/TXe50.png
I am training a CNN model to classify grayscale images into 6 classes. While my code works well on RGB images, it gives an error when I apply it to grayscale images. Here is part of the code:
input_shape=(256, 256,1) # assign "1" to the last channel to account for grayscale.
target_size = (256, 256) # To use it in the flow_from_directory package
model_name='Test1'
model_filename = (model_name+'.hdf5')
optimizer = Adam(learning_rate=1e-3)
loss=['categorical_crossentropy']
metrics = ['accuracy']
## Here is the model:
model = Sequential()
model.add(Conv2D(32, (3, 3), input_shape=input_shape))
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Conv2D(64, (3, 3)))
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Flatten())
model.add(Dense(64))
model.add(Activation('relu'))
model.add(Dropout(0.5))
model.add(Dense(6)) # To account for 6 classes
model.add(Activation('softmax'))
model.summary()
train_datagen = ImageDataGenerator(
shear_range=0.2,
zoom_range=0.2,
horizontal_flip=True)
vaidation_datagen = ImageDataGenerator()
train_generator = train_datagen.flow_from_directory(
train_path, # points to the folder containing all training images
target_size=target_size,
color_mode='grayscale', # to specify the grayscale
batch_size=batch_size,
shuffle=True,
class_mode='categorical',
interpolation='nearest')
validation_generator = vaidation_datagen.flow_from_directory(
validation_path, # points to the folder containing all validation images
target_size=target_size,
color_mode='grayscale', # to specify the grayscale
batch_size=batch_size,
shuffle=True,
class_mode='categorical',
interpolation='nearest')
model.compile(optimizer, loss , metrics)
model_checkpoint = tf.keras.callbacks.ModelCheckpoint((model_path+model_filename), monitor='loss',verbose=1, save_best_only=True)
model.summary()
history = model.fit(
train_generator,
steps_per_epoch = num_of_train_img_raw//batch_size,
epochs = epochs,
validation_data = validation_generator,
validation_steps = num_of_val_img_raw//batch_size,
callbacks=[model_checkpoint],
use_multiprocessing = False)
Here is the error I receive:
"input depth must be evenly divisible by filter depth: 1 vs 3"
Then the IDE kernel freezes!
It is wasteful and slower, but why not just convert the grayscale images to RGB? That is the quickest fix unless you need top performance or really want to rework the model, both of which take time.
Use tf.image.grayscale_to_rgb (already built into TensorFlow).
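Conceptually, the conversion just repeats the single channel three times, so a `(height, width, 1)` image becomes `(height, width, 3)`. The plain-Python sketch below illustrates that idea on a tiny nested-list image; the helper name is hypothetical, and in a real pipeline `tf.image.grayscale_to_rgb` would be applied to tensors instead:

```python
def gray_to_rgb(image):
    """Repeat the single channel of an H x W x 1 nested-list image three times."""
    return [[[pixel[0]] * 3 for pixel in row] for row in image]

# A 2 x 2 grayscale image with one channel per pixel.
gray = [[[0], [128]],
        [[255], [64]]]

rgb = gray_to_rgb(gray)
print(rgb[0][1])                              # [128, 128, 128]
print(len(rgb), len(rgb[0]), len(rgb[0][0]))  # 2 2 3
```

The model's `input_shape` would then need a last dimension of 3 to match.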
Recently I have been trying to build an automated car in GTA V using a CNN model. I started out by collecting about 30k images from the game by driving around while capturing the scene and the key pressed at that moment. I also kept the dataset balanced by limiting each label to the same number of samples.
An example of a random image in the dataset: IMAGE.
The labels are the basic driving inputs -LABELS.
With this dataset, accuracy never went above 50-60% on the validation set for any of the models I tried (test accuracy is even lower). To fix this, I tried cropping the images to only the center, which contains the road, and dropping the outlying content (scenery, buildings, etc.). I also tried RGB images instead of grayscale, collecting data from a specific location and testing in the same place, different model architectures, and different parameters, but still no luck.
All the models were tested out in-game by constantly capturing the image of the road in-game and using it as an input to the model, then the output of the model would be the input for the game. All models seem to behave in the same general way which is basically outputting the same label – mostly ‘WA’, until it crashes into a wall.
I would love to get some tips on what I may be doing wrong or on what I can do to improve performance and let me know if you need any more information regarding this project to help out.
Thanks in advance.
TWO OF THE MODELS I TRIED:
model = Sequential()
model.add(Conv2D(filters=96, kernel_size=11, strides=4, activation='relu', input_shape=(IMAGE_HEIGHT, IMAGE_WIDTH, IMAGE_CHANNELS), padding='same'))
model.add(MaxPooling2D(pool_size=(3, 3), strides=2))
model.add(Conv2D(filters=256, kernel_size=5, activation='relu', padding='same'))
model.add(MaxPooling2D(pool_size=(3, 3), strides=2))
model.add(Conv2D(filters=384, kernel_size=3, activation='relu', padding='same'))
model.add(Conv2D(filters=384, kernel_size=3, activation='relu', padding='same'))
model.add(Conv2D(filters=256, kernel_size=3, activation='relu', padding='same'))
model.add(MaxPooling2D(pool_size=(3, 3), strides=2))
model.add(Dense(4096, activation='tanh'))
model.add(Dropout(0.5))
model.add(Dense(4096, activation='tanh'))
model.add(Dropout(0.5))
model.add(Flatten())
model.add(Dense(7, activation='sigmoid'))
model = Sequential()
model.add(Conv2D(filters=12, kernel_size=11, activation='relu', input_shape=(IMAGE_HEIGHT, IMAGE_WIDTH, IMAGE_CHANNELS), padding='same'))
model.add(MaxPooling2D(pool_size=(3,3)))
model.add(Conv2D(filters=256, kernel_size=5, activation='relu', padding='same'))
model.add(MaxPooling2D(pool_size=(3,3)))
model.add(Conv2D(filters=384, kernel_size=3, activation='relu', padding='same'))
model.add(Conv2D(filters=256, kernel_size=5, activation='relu', padding='same'))
model.add(MaxPooling2D(pool_size=(3,3)))
model.add(Flatten())
model.add(Dense(7, activation='sigmoid'))
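Both models above end in `Dense(7, activation='sigmoid')`, while the training code compiles with `categorical_crossentropy` and one-hot (`class_mode='categorical'`) labels. For mutually exclusive labels, softmax is the more usual output activation. Whether this explains the behaviour described is an assumption, not something the post confirms, but the difference between the two activations is easy to see in plain Python (the logits are made up):

```python
import math

def sigmoid(logits):
    """Element-wise sigmoid: each output is an independent probability."""
    return [1.0 / (1.0 + math.exp(-z)) for z in logits]

def softmax(logits):
    """Softmax: outputs are non-negative and sum to 1 across classes."""
    shifted = [z - max(logits) for z in logits]  # subtract max for numerical stability
    exps = [math.exp(z) for z in shifted]
    total = sum(exps)
    return [e / total for e in exps]

logits = [2.0, 1.0, 0.5, 0.0, -0.5, -1.0, -2.0]  # made-up scores for 7 labels

print(round(sum(sigmoid(logits)), 3))  # 3.5 here, not 1
print(round(sum(softmax(logits)), 3))  # 1.0
```

With softmax the seven outputs compete with each other, which matches a setup where exactly one label is correct per frame.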
The code:
filenames = os.listdir("dataset")
labels = []
for filename in filenames:
label = filename.split('.')[1]
labels.append(label)
df = pd.DataFrame({
'filename': filenames,
'category': labels
})
model = model1()
print(model.summary())
model.compile(loss='categorical_crossentropy', optimizer=Adam(learning_rate=0.001), metrics=['accuracy'])
train_df, validate_df = train_test_split(df, test_size=0.40, random_state=42)
train_df = train_df.reset_index(drop=True)
validate_df = validate_df.reset_index(drop=True)
total_train = train_df.shape[0]
total_validate = validate_df.shape[0]
batch_size=32
train_datagen = ImageDataGenerator(rescale=1./255)
train_generator = train_datagen.flow_from_dataframe(
train_df,
"dataset",
x_col='filename',
y_col='category',
target_size=IMAGE_SIZE,
class_mode='categorical',
batch_size=batch_size
)
validation_datagen = ImageDataGenerator( rescale=1./255,)
validation_generator = validation_datagen.flow_from_dataframe(
validate_df,
"dataset",
x_col='filename',
y_col='category',
target_size=IMAGE_SIZE,
class_mode='categorical',
batch_size=batch_size
)
epochs = 25
log_dir = "logs/fit/" + datetime.datetime.now().strftime("%Y%m%d-%H%M%S")
tensorboard_callback = tf.keras.callbacks.TensorBoard(log_dir=log_dir, histogram_freq=1)
history = model.fit(
train_generator,
epochs=epochs,
validation_data=validation_generator,
validation_steps=total_validate // batch_size,
steps_per_epoch=total_train // batch_size, shuffle=True, callbacks=[tensorboard_callback])
model.save("model.h5")
I have a dataset of images ( EEG spectrograms ) as given below
Image dimensions are 669X1026. I am using the following code for binary classification of the spectrograms.
from keras.preprocessing.image import ImageDataGenerator
from keras.models import Sequential
from keras.layers import Conv2D, MaxPooling2D
from keras.layers import Activation, Dropout, Flatten, Dense
from keras import backend as K
# dimensions of our images.
img_width, img_height = 669, 1026
train_data_dir = '/home/spectrograms/train'
validation_data_dir = '/home/spectrograms/test'
nb_train_samples = 791
nb_validation_samples = 198
epochs = 100
batch_size = 3
if K.image_data_format() == 'channels_first':
input_shape = (3, img_width, img_height)
else:
input_shape = (img_width, img_height,3)
model = Sequential()
model.add(Conv2D(128, (3, 3), input_shape=input_shape))
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Conv2D(256, (3, 3)))
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Conv2D(512, (3, 3)))
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Conv2D(256, (3, 3)))
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Conv2D(128, (3, 3)))
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Conv2D(64, (3, 3)))
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Conv2D(32, (3, 3)))
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Flatten())
model.add(Dense(16))
model.add(Activation('relu'))
# model.add(Dropout(0.5))
model.add(Dense(1))
model.add(Activation('sigmoid'))
model.compile(loss='binary_crossentropy',
optimizer='adam',
metrics=['accuracy'])
model.summary()
# this is the augmentation configuration we will use for training
train_datagen = ImageDataGenerator(
rescale=1. / 255,
shear_range=0,
zoom_range=0,
horizontal_flip=False)
# this is the augmentation configuration we will use for testing:
# only rescaling
test_datagen = ImageDataGenerator(rescale=1. / 255)
train_generator = train_datagen.flow_from_directory(
train_data_dir,
target_size=(img_width, img_height),
batch_size=batch_size,
class_mode='binary')
validation_generator = test_datagen.flow_from_directory(
validation_data_dir,
target_size=(img_width, img_height),
batch_size=batch_size,
class_mode='binary')
model.fit_generator(
train_generator,
steps_per_epoch=nb_train_samples // batch_size,
epochs=epochs,
validation_data=validation_generator,
validation_steps=nb_validation_samples // batch_size)
model.save_weights('CNN_model.h5')
But I am not able to get a training accuracy greater than 0.53. I have only a limited amount of data ( 790 training samples and 198 testing samples ). So increasing number of input images is not an option. What else can I do to improve the accuracy?
Your code
train_datagen = ImageDataGenerator(
rescale=1. / 255,
shear_range=0,
zoom_range=0,
horizontal_flip=False)
is not doing any image augmentation, only rescaling. I am not sure what type of augmentation would help. Your images do not appear to rely on color, so converting them to grayscale probably will not help accuracy, but it would reduce computational expense. You might get some improvement by using the Keras callbacks ReduceLROnPlateau and EarlyStopping. Documentation is here. My suggested code for these callbacks is shown below
rlronp=tf.keras.callbacks.ReduceLROnPlateau(monitor="val_loss", factor=0.5, patience=1,
verbose=1, mode="auto", min_delta=0.0001, cooldown=0, min_lr=0)
estop=tf.keras.callbacks.EarlyStopping(monitor="val_loss", min_delta=0,patience=4,
verbose=1, mode="auto", baseline=None, restore_best_weights=True)
callbacks=[rlronp, estop]
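To make the effect of `factor=0.5` and `patience=1` concrete, here is a simplified plain-Python model of a reduce-on-plateau schedule. It is a sketch of the idea only: it ignores cooldown, the starting learning rate and validation losses are made up, and it is not the Keras implementation.

```python
def simulate_reduce_lr(val_losses, lr=0.001, factor=0.5, patience=1,
                       min_delta=0.0001, min_lr=0.0):
    """Simplified model of a reduce-on-plateau schedule (no cooldown handling)."""
    best = float("inf")
    wait = 0
    history = []
    for loss in val_losses:
        if loss < best - min_delta:      # counts as an improvement
            best = loss
            wait = 0
        else:
            wait += 1
            if wait >= patience:         # plateau: shrink the learning rate
                lr = max(lr * factor, min_lr)
                wait = 0
        history.append(lr)
    return history

# Made-up validation losses, one per epoch.
print(simulate_reduce_lr([1.0, 0.9, 0.95, 0.85, 0.86, 0.87]))
# [0.001, 0.001, 0.0005, 0.0005, 0.00025, 0.000125]
```

With a patience of 1, every epoch without improvement halves the learning rate, so the schedule reacts quickly on noisy validation curves.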
You can try using transfer learning. Most of those models are trained on the ImageNet dataset, which is dissimilar to the type of images you are using, but it might be worth a try. I recommend you use the MobileNet model. Code for that is shown below
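import tensorflow as tf
from tensorflow.keras.layers import Dense, Dropout
from tensorflow.keras.models import Model
from tensorflow.keras.optimizers import Adamax

# input_shape is the 3-channel shape tuple defined in the question's code
base_model = tf.keras.applications.mobilenet.MobileNet(include_top=False,
    input_shape=input_shape, pooling='max', weights='imagenet', dropout=.4)
x = base_model.output
x = Dense(64, activation='relu')(x)
x = Dropout(.3, seed=123)(x)
output = Dense(1, activation='sigmoid')(x)  # single sigmoid unit for binary classification
model = Model(inputs=base_model.input, outputs=output)
model.compile(Adamax(learning_rate=.001), loss='binary_crossentropy', metrics=['accuracy'])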
Use the callbacks defined above in model.fit. You may get a warning that MobileNet was trained with an image shape of 224 x 224 x 3, but it should still load the ImageNet weights and work.
Y_train = to_categorical(Y_train, num_classes = 10)#
random_seed = 2
X_train,X_val,Y_train,Y_val = train_test_split(X_train, Y_train, test_size = 0.1, random_state=random_seed)
Y_train.shape
model = Sequential()
model.add(Flatten())
model.add(Dense(64, activation='relu'))
model.add(Dense(128, activation='relu'))
model.add(Dense(10, activation='softmax'))
model.compile(optimizer = 'adam', loss = 'sparse_categorical_crossentropy',metrics=['accuracy'])
model.fit(X_train, Y_train, batch_size = 86, epochs = 3,validation_data = (X_val, Y_val), verbose =2)
I have to classify the MNIST data into 10 classes. I am converting Y_train into a one-hot encoded array. I have gone through a number of answers but none have helped. Kindly guide me, as I am a novice in ML and neural networks.
It seems there is no need to use model.add(Flatten()) as your first layer. Instead, you can use a dense layer with a specific input size, like: model.add(Dense(64, input_shape=your_input_shape, activation="relu")).
To confirm that the issue comes from the layers, you can check whether the to_categorical() function works on its own in a Jupyter notebook.
Updated Answer
Before building the model, you should reshape your input data, in this case from 28*28 to 784.
train_images = train_images.reshape((-1, 784))
test_images = test_images.reshape((-1, 784))
I also suggest normalizing the data, which can be done by simply dividing the images by 255.
After that step you should create your model.
model = Sequential([
Dense(64, activation='relu', input_shape=(784,)),
Dense(64, activation='relu'),
Dense(10, activation='softmax'),
])
Notice input_shape=(784,): that is the shape of your flattened input.
Last step, compiling and fitting.
model.compile(
optimizer='adam',
loss='categorical_crossentropy',
metrics=['accuracy'],
)
model.fit(
train_images,
train_labels,
epochs=10,
batch_size=16,
)
In your code, you flatten the input without telling the network the input shape, which is why you run into an issue. The point is that you should manually reshape your inputs and feed them to the Dense() layers with the input_shape parameter.
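One change in the answer's final code that is easy to miss: it compiles with `categorical_crossentropy`, whereas the question pairs one-hot labels from `to_categorical` with `sparse_categorical_crossentropy`. The sparse variant expects integer class indices, so that mismatch may well be the actual source of the problem (the question does not show the error, so this is an assumption). The plain-Python sketch below, with hypothetical helper names and made-up probabilities, shows that the two losses compute the same value once each is given labels in the format it expects:

```python
import math

def to_one_hot(label, num_classes):
    """Hypothetical helper mirroring what to_categorical does for one label."""
    return [1.0 if i == label else 0.0 for i in range(num_classes)]

def categorical_ce(one_hot, probs):
    """Cross-entropy for a one-hot target vector."""
    return -sum(t * math.log(p) for t, p in zip(one_hot, probs))

def sparse_categorical_ce(label, probs):
    """Cross-entropy for an integer class index."""
    return -math.log(probs[label])

probs = [0.05, 0.05, 0.6, 0.05, 0.05, 0.05, 0.05, 0.04, 0.03, 0.03]  # made-up softmax output
label = 2

print(to_one_hot(label, 10))
print(math.isclose(categorical_ce(to_one_hot(label, 10), probs),
                   sparse_categorical_ce(label, probs)))  # True
```

So either keep the integer labels and use the sparse loss, or one-hot encode them and use `categorical_crossentropy`, but not a mix of the two.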