Why is VGG-16 performing poorly on the CIFAR-10 dataset? - tensorflow

I am trying to implement the VGG-16 convolutional neural network for the CIFAR-10 dataset with TensorFlow, but I am only getting about 10% training accuracy. What is wrong with my code?
import tensorflow as tf
from tensorflow.keras import datasets
(X_train, y_train), (X_test, y_test) = datasets.cifar10.load_data()
X_train.shape, y_train.shape, X_test.shape, y_test.shape
X_train = X_train/255
X_test = X_test/255
y_train = y_train.reshape(-1,)
model = tf.keras.Sequential([
tf.keras.layers.Conv2D(filters=64, kernel_size=(3,3), activation="relu", input_shape=
(32,32,3),padding="same"),
tf.keras.layers.Conv2D(filters=64, kernel_size=(3,3), activation="relu",
padding="same"),
tf.keras.layers.MaxPool2D(pool_size=(2,2), strides=(2,2)),
tf.keras.layers.Conv2D(filters=128, kernel_size=(3,3), activation="relu",
padding="same"),
tf.keras.layers.Conv2D(filters=128, kernel_size=(3,3), activation="relu",
padding="same"),
tf.keras.layers.MaxPool2D(pool_size=(2,2), strides=(2,2)),
tf.keras.layers.Conv2D(filters=256, kernel_size=(3,3), activation="relu",
padding="same"),
tf.keras.layers.Conv2D(filters=256, kernel_size=(3,3), activation="relu",
padding="same"),
tf.keras.layers.Conv2D(filters=256, kernel_size=(3,3), activation="relu",
padding="same"),
tf.keras.layers.MaxPool2D(pool_size=(2,2), strides=(2,2)),
tf.keras.layers.Conv2D(filters=512, kernel_size=(3,3), activation="relu",
padding="same"),
tf.keras.layers.Conv2D(filters=512, kernel_size=(3,3), activation="relu",
padding="same"),
tf.keras.layers.Conv2D(filters=512, kernel_size=(3,3), activation="relu",
padding="same"),
tf.keras.layers.MaxPool2D(pool_size=(2,2), strides=(2,2)),
tf.keras.layers.Conv2D(filters=512, kernel_size=(3,3), activation="relu",
padding="same"),
tf.keras.layers.Conv2D(filters=512, kernel_size=(3,3), activation="relu",
padding="same"),
tf.keras.layers.Conv2D(filters=512, kernel_size=(3,3), activation="relu",
padding="same"),
tf.keras.layers.MaxPool2D(pool_size=(2,2), strides=(2,2)),
tf.keras.layers.Flatten(),
tf.keras.layers.Dense(4096, activation="relu"),
tf.keras.layers.Dense(4096, activation="relu"),
tf.keras.layers.Dense(10, activation="softmax")
])
model.summary()
model.compile(loss=tf.keras.losses.sparse_categorical_crossentropy,
optimizer=tf.keras.optimizers.Adam(),
metrics=["accuracy"])
X_train[0].shape, y_train[0].shape
model.fit(X_train, y_train, epochs = 100)

It looks like you have not found the right training schedule yet.
If you don't mind changing the model a bit, I recommend adding batch normalization (BatchNorm) after each convolution layer. In general, a model is easier to train with BatchNorm.
Furthermore, did you decrease the learning rate after a certain number of iterations? At some point, a learning rate that is too large may no longer reduce your training error. ResNet, for example, is trained with an initial learning rate of 0.1 for 100 epochs, then 0.01 for another 50 epochs, and 0.001 for another 50 epochs.
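The step schedule described above can be written as a plain function of the epoch index. This is only a sketch: the boundaries (100 and 150 epochs) and the rates (0.1, 0.01, 0.001) are the ones quoted in the answer and are typical for SGD with momentum; with Adam, as used in the question, a much smaller starting rate would be the usual assumption. The name step_decay is made up for illustration.

```python
def step_decay(epoch, lr=None):
    """Return the learning rate for a 0-based epoch index.

    The second argument is unused; it is only there so the function matches
    the (epoch, lr) signature that tf.keras.callbacks.LearningRateScheduler
    passes to its schedule function.
    """
    if epoch < 100:
        return 0.1
    if epoch < 150:
        return 0.01
    return 0.001


# Learning rate at a few representative epochs.
for epoch in (0, 99, 100, 149, 150, 199):
    print(epoch, step_decay(epoch))
```

In Keras, a function like this could be passed to training as callbacks=[tf.keras.callbacks.LearningRateScheduler(step_decay)] in model.fit.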

Related

get VGG16 model to acceptable accuracy

I'm trying to get a VGG16 model to an acceptable accuracy, but I can't get it above 0.3.
Here's the model:
def VGG16():
    model = Sequential()
    model.add(Conv2D(input_shape=(224,224,3),filters=64,kernel_size=(3,3),padding='same', activation='relu'))
    model.add(Conv2D(filters=64,kernel_size=(3,3),padding='same', activation='relu'))
    model.add(MaxPool2D(pool_size=(2,2),strides=(2,2)))
    model.add(Conv2D(filters=128, kernel_size=(3,3), padding='same', activation='relu'))
    model.add(Conv2D(filters=128, kernel_size=(3,3), padding='same', activation='relu'))
    model.add(MaxPool2D(pool_size=(2,2),strides=(2,2)))
    model.add(Conv2D(filters=256, kernel_size=(3,3), padding='same', activation='relu'))
    model.add(Conv2D(filters=256, kernel_size=(3,3), padding='same', activation='relu'))
    model.add(Conv2D(filters=256, kernel_size=(3,3), padding='same', activation='relu'))
    model.add(MaxPool2D(pool_size=(2,2),strides=(2,2)))
    model.add(Conv2D(filters=512, kernel_size=(3,3), padding='same', activation='relu'))
    model.add(Conv2D(filters=512, kernel_size=(3,3), padding='same', activation='relu'))
    model.add(Conv2D(filters=512, kernel_size=(3,3), padding='same', activation='relu'))
    model.add(MaxPool2D(pool_size=(2,2),strides=(2,2)))
    model.add(Conv2D(filters=512, kernel_size=(3,3), padding='same', activation='relu'))
    model.add(Conv2D(filters=512, kernel_size=(3,3), padding='same', activation='relu'))
    model.add(Conv2D(filters=512, kernel_size=(3,3), padding='same', activation='relu'))
    model.add(MaxPool2D(pool_size=(2,2),strides=(2,2),name='vgg16'))
    model.add(Flatten(name='flatten'))
    model.add(Dense(4096, activation='relu', name='fc1'))
    model.add(Dense(4096, activation='relu', name='fc2'))
    model.add(Dense(9, activation='softmax', name='output'))
    return model
opt = SGD(learning_rate=1e-6, momentum=0.9)
model.compile(loss='categorical_crossentropy', optimizer=opt, metrics=['accuracy'])
Some answers here suggested changing the number of neurons in the fully connected layers to 4096 (I originally used 256 and 128), using SGD instead of Adam, and increasing the number of epochs (I tried 50, 100 and 200) and the batch size (I tried 64 and 128), but I can't get the accuracy above 0.3, and usually it's 0.2.
The parameters I used for the best result are:
fully connected neurons: 4096
optimizer: SGD
learning rate: 1e-6
epochs: 100
batch size: 128
Edit: the dataset used is https://www.kaggle.com/datasets/nodoubttome/skin-cancer9-classesisic
You did not show the data used for model training, but I suspect your model will be very prone to overfitting. You need to add some dropout layers and some regularization.
After your flatten layer, add the following:
model.add(BatchNormalization(axis=-1, momentum=0.99, epsilon=0.001 ))
model.add(Dense(256, kernel_regularizer = regularizers.l2(l = 0.016),activity_regularizer=regularizers.l1(0.006),
bias_regularizer=regularizers.l1(0.006) ,activation='relu') )
model.add(Dropout(rate=.4, seed=123, name='dropout'))
model.add(Dense(9, activation='softmax', name='output'))
It would be helpful if you provided the model training data as well.
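To make the regularizers in the answer more concrete, here is a small framework-free sketch of the extra terms they add to the loss. It assumes the usual Keras definitions, where regularizers.l2(l) contributes l times the sum of squared values and regularizers.l1(l) contributes l times the sum of absolute values. The kernel and bias numbers are made-up toy values, not taken from the thread.

```python
def l1_penalty(values, l1):
    """L1 penalty: l1 * sum(|v|), the form used by Keras regularizers.l1."""
    return l1 * sum(abs(v) for v in values)


def l2_penalty(values, l2):
    """L2 penalty: l2 * sum(v**2), the form used by Keras regularizers.l2."""
    return l2 * sum(v * v for v in values)


# Hypothetical toy values standing in for one layer's kernel and bias.
kernel = [0.5, -1.0, 2.0]
bias = [0.1, -0.2]

# Extra loss contributed by the kernel and bias regularizers from the answer.
extra_loss = l2_penalty(kernel, 0.016) + l1_penalty(bias, 0.006)
print(extra_loss)
```

Larger weights cost more, so the optimizer is pushed toward smaller weights, which is the mechanism that is meant to reduce overfitting here.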

The val_accuracy is higher than training accuracy, and the test accuracy is very low compared to both val_accuracy and train_accuracy

I am training a CNN model with the following data: training = 687 samples, validation = 102 samples, testing = 79 samples.
The validation accuracy is higher than the training accuracy.
The test accuracy is very low compared to both the validation accuracy and the training accuracy.
The validation loss is lower than the training loss.
Code snippet:
train_datagen = ImageDataGenerator(
rescale=1./255,
# rotation_range=30,
zoom_range=0.1,
horizontal_flip=True,
# vertical_flip=True,
# fill_mode='nearest',
validation_split=.15) # set validation split
val_datagen = ImageDataGenerator(rescale=1./255, validation_split=0.15)
train_generator = train_datagen.flow_from_directory(
train_dir,
target_size=(height, width),
batch_size=batch_size,
class_mode='categorical',
seed=13
)
validation_generator = val_datagen.flow_from_directory(
train_dir, # same directory as training data
target_size=(height, width),
batch_size=batch_size,
class_mode='categorical',
seed=13,
subset='validation'
)
model = Sequential()
model.add(Conv2D(16,3,padding="same", activation="relu", input_shape=(height, width, 3)))
model.add(AveragePooling2D(strides=(2,2), padding="same"))
# model.add(Dropout(0.2))
model.add(Conv2D(32,3,padding="same", activation="relu"))
model.add(AveragePooling2D(strides=(2,2), padding="same"))
# model.add(Dropout(0.2))
model.add(Conv2D(32, 3, padding="same", activation="relu"))
model.add(AveragePooling2D(strides=(2,2), padding="same"))
# model.add(Dropout(0.2))
# model.add(Conv2D(32, 3, padding="same", activation="relu", kernel_regularizer=l2(0.0001)))
# model.add(AveragePooling2D(strides=(2,2), padding="same"))
# model.add(Dropout(0.4))
model.add(Conv2D(64, 3, padding="same", activation="relu"))
model.add(AveragePooling2D(strides=(2,2), padding="same",))
# model.add(Dropout(0.2))
model.add(Conv2D(64, 3, padding="same", activation="relu"))
model.add(AveragePooling2D(strides=(2,2), padding="same"))
# model.add(Dropout(0.2))
model.add(Conv2D(128, 3, padding="same", activation="relu"))
model.add(AveragePooling2D(strides=(2,2), padding="same"))
# model.add(Dropout(0.2))
model.add(Conv2D(128, 3, padding="same", activation="relu"))
model.add(AveragePooling2D(strides=(2,2), padding="same"))
# model.add(Dropout(0.2))
model.add(Conv2D(256, 3, padding="same", activation="relu"))
model.add(AveragePooling2D(strides=(2,2), padding="same"))
# model.add(Dropout(0.2))
model.add(Flatten())
model.add(Dense(256,activation="relu"))
model.add(Dropout(.5))
# model.add(Dense(256,activation="relu"))
# model.add(Dropout(.5))
model.add(Dense(4, activation="softmax"))
model.summary()
Adam(learning_rate=0.0001, name='Adam')
model.compile(optimizer = 'Adam',loss = 'categorical_crossentropy',metrics = ['accuracy'])
I have tried a few things to solve this problem:
I did not use dropout layers between the conv layers.
I decreased the range of the data augmentation.
I trained for a longer period (the test accuracy drops to 62% and val_acc eventually reaches 100%).
What could be the cause of this issue, and how can it be resolved?
How can I display test images in Python that have high levels of inaccuracy?
You need to add subset='training' in train_generator. Right now, you are training on both training and validation data.
train_generator = train_datagen.flow_from_directory(
train_dir,
target_size=(height, width),
batch_size=batch_size,
class_mode='categorical',
seed=13,
subset='training'
)
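The effect of the missing subset argument can be illustrated without Keras. The sketch below assumes a simplified version of how the generator slices each class's file list: 'validation' takes the first validation_split fraction, 'training' takes the remainder, and no subset takes every file. The function and file names are hypothetical.

```python
def select_files(files, validation_split, subset=None):
    """Pick the files a generator would read for a given subset.

    Simplified slicing rule: 'validation' takes the first fraction of the
    list, 'training' takes the rest, and no subset takes everything.
    """
    cut = int(validation_split * len(files))
    if subset == "validation":
        return files[:cut]
    if subset == "training":
        return files[cut:]
    return list(files)


# Hypothetical file names for one class directory.
files = [f"img_{i:03d}.png" for i in range(20)]

val = select_files(files, 0.15, subset="validation")
train_without_subset = select_files(files, 0.15)  # as in the question
train_with_subset = select_files(files, 0.15, subset="training")  # as in the fix

# Files that are seen both during training and during validation.
leaked = set(train_without_subset) & set(val)
print(len(leaked), len(set(train_with_subset) & set(val)))
```

Without the subset argument, every validation file is also a training file, which would explain why validation accuracy climbs toward 100% while test accuracy stays low.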

Use different activation functions on forward- and backward pass in Keras

I have a non-differentiable activation function that I want to use on the forward pass. On the backward pass I want to use the ReLU activation function instead. I started with a small example, but unfortunately I cannot find a way to incorporate the second activation function into my Keras code. Perhaps something like this isn't even possible with Keras? I would be very happy if someone has an idea how to approach this and can give me a hint. Thank you!
mnist = tf.keras.datasets.mnist
(train_images, train_labels), (test_images, test_labels) = mnist.load_data()
test_images_discrete = test_images
train_images = train_images / 255.0
test_images = test_images / 255.0
#my_test_model
model = tf.keras.Sequential([
tf.keras.layers.Conv2D(filters=16, kernel_size=5, strides=2, padding="valid",
activation="relu", input_shape=(28,28,1), use_bias=True),
tf.keras.layers.Conv2D(filters=32, kernel_size=3, strides=2, padding="valid",
activation="relu", use_bias=True),
tf.keras.layers.Flatten(),
tf.keras.layers.Dense(32, activation="relu"),
tf.keras.layers.Dense(10)
])
model.compile(optimizer='adam',
loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
metrics=["accuracy"])
model.fit(train_images,
train_labels,
epochs=5,
validation_data=(test_images, test_labels))
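One way to think about this is as a custom gradient: the forward pass computes the non-differentiable function, while the backward pass substitutes the derivative of ReLU. TensorFlow exposes this pattern through the tf.custom_gradient decorator, where the decorated function returns both the output and a function that maps the upstream gradient to the input gradient. The framework-free sketch below only illustrates the idea for a single scalar. The hard step used as the forward activation and all function names are assumptions, since the question does not say which function is meant.

```python
def step_forward(x):
    """Forward pass: a hard step, standing in for the non-differentiable activation."""
    return 1.0 if x > 0 else 0.0


def relu_backward(x, upstream):
    """Backward pass: use the derivative of ReLU (1 for x > 0, else 0)."""
    return upstream * (1.0 if x > 0 else 0.0)


def loss_and_grad(w, inp, target):
    """Squared error of step_forward(w * inp) and its surrogate gradient w.r.t. w."""
    pre = w * inp
    out = step_forward(pre)
    loss = (out - target) ** 2
    d_out = 2.0 * (out - target)       # dLoss/dOut
    d_pre = relu_backward(pre, d_out)  # surrogate for dOut/dPre
    return loss, d_pre * inp           # chain rule: dPre/dW = inp


loss, grad = loss_and_grad(w=0.5, inp=2.0, target=0.0)
print(loss, grad)
```

The true derivative of the step is zero almost everywhere, so without the substitution no gradient would reach w at all.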

ValueError: Shapes (None, 7) and (None, 8) are incompatible

I keep getting the same error. I think the problem is with the input shapes. Please help me.
X = Features.iloc[: ,:-1].values
Y = Features['labels'].values
# As this is a multiclass classification problem, one-hot encode Y.
encoder = OneHotEncoder()
Y = encoder.fit_transform(np.array(Y).reshape(-1,1)).toarray()
# splitting data
x_train, x_test, y_train, y_test = train_test_split(X, Y, random_state=0, shuffle=True)
x_train.shape, y_train.shape, x_test.shape, y_test.shape
# scaling our data with sklearn's Standard scaler
scaler = StandardScaler()
x_train = scaler.fit_transform(x_train)
x_test = scaler.transform(x_test)
x_train.shape, y_train.shape, x_test.shape, y_test.shape
# making our data compatible to model.
x_train = np.expand_dims(x_train, axis=2)
x_test = np.expand_dims(x_test, axis=2)
x_train.shape, y_train.shape, x_test.shape, y_test.shape, x_train.shape[1]
The model
model=Sequential()
model.add(Conv1D(256, kernel_size=5, strides=1, padding='same', activation='relu', input_shape=(x_train.shape[1], 1)))
model.add(MaxPooling1D(pool_size=5, strides = 2, padding = 'same'))
model.add(Conv1D(256, kernel_size=5, strides=1, padding='same', activation='relu'))
model.add(MaxPooling1D(pool_size=5, strides = 2, padding = 'same'))
model.add(Conv1D(128, kernel_size=5, strides=1, padding='same', activation='relu'))
model.add(MaxPooling1D(pool_size=5, strides = 2, padding = 'same'))
model.add(Dropout(0.2))
model.add(Conv1D(64, kernel_size=5, strides=1, padding='same', activation='relu'))
model.add(MaxPooling1D(pool_size=5, strides = 2, padding = 'same'))
model.add(Flatten())
model.add(Dense(units=32, activation='relu'))
model.add(Dropout(0.3))
model.add(Dense(units=8, activation='softmax'))
model.compile(optimizer = 'adam' , loss = 'categorical_crossentropy' , metrics = ['accuracy'])
model.summary()
rlrp = ReduceLROnPlateau(monitor='val_loss', factor=0.4, verbose=0, patience=2, min_lr=0.0000001)
reduce_lr = ReduceLROnPlateau(monitor='val_loss', factor=0.1, patience=10, verbose=0, mode='auto', min_delta=0.0001, cooldown=0, min_lr=0)
model.fit(x_train, y_train, batch_size=64, epochs=50, validation_data=(x_test, y_test), callbacks=[rlrp])
#history=model.fit(x_train, y_train, callbacks=[rlrp])
I'm getting the error when trying to fit the model.
Here is your code running.
Since you did not provide any sample data, I had to fake some data, and I will explain what the issue is.
Your y_train must have a depth of 8, since your softmax layer has 8 units.
If you want to reproduce the same error with my code, change the depth in this line from 8 to 7:
y_train = tf.one_hot(tf.random.uniform(shape=[1000],minval=0, maxval=2, dtype=tf.int32),8) # change the depth to 7 and you will see your error
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers
from matplotlib import pyplot as plt
import numpy as np
x_train = tf.random.normal(shape=(1000,10,1), dtype = tf.float32)
x_test = tf.random.normal(shape=(100,10,1), dtype = tf.float32)
y_train = tf.one_hot(tf.random.uniform(shape=[1000],minval=0, maxval=2, dtype=tf.int32),8)
y_test = tf.one_hot(tf.random.uniform(shape=[100],minval=0, maxval=2, dtype=tf.int32),8)
tf.print(y_train)
model=tf.keras.Sequential()
model.add(layers.Conv1D(256, kernel_size=5, strides=1, padding='same', activation='relu', input_shape=(x_train.shape[1], 1)))
model.add(layers.MaxPooling1D(pool_size=5, strides = 2, padding = 'same'))
model.add(layers.Conv1D(256, kernel_size=5, strides=1, padding='same', activation='relu'))
model.add(layers.MaxPooling1D(pool_size=5, strides = 2, padding = 'same'))
model.add(layers.Conv1D(128, kernel_size=5, strides=1, padding='same', activation='relu'))
model.add(layers.MaxPooling1D(pool_size=5, strides = 2, padding = 'same'))
model.add(layers.Dropout(0.2))
model.add(layers.Conv1D(64, kernel_size=5, strides=1, padding='same', activation='relu'))
model.add(layers.MaxPooling1D(pool_size=5, strides = 2, padding = 'same'))
model.add(layers.Flatten())
model.add(layers.Dense(units=32, activation='relu'))
model.add(layers.Dropout(0.3))
model.add(layers.Dense(units=8, activation='softmax'))
model.compile(optimizer = 'adam' , loss = 'categorical_crossentropy' , metrics = ['accuracy'])
model.summary()
rlrp = tf.keras.callbacks.ReduceLROnPlateau(monitor='val_loss', factor=0.4, verbose=0, patience=2, min_lr=0.0000001)
reduce_lr = tf.keras.callbacks.ReduceLROnPlateau(monitor='val_loss', factor=0.1, patience=10, verbose=0, mode='auto', min_delta=0.0001, cooldown=0, min_lr=0)
model.fit(x_train, y_train, batch_size=64, epochs=50, validation_data=(x_test, y_test), callbacks=[rlrp])
#history=model.fit(x_train, y_train, callbacks=[rlrp])
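The mismatch can be checked before training. The sketch below assumes the usual behaviour of a one-hot encoder, which produces one column per distinct label value it sees, and that the label column in the question presumably contains only 7 distinct values. The label names are hypothetical.

```python
def one_hot_width(labels):
    """Number of columns a one-hot encoder produces: one per distinct label."""
    return len(set(labels))


# Hypothetical label column with only 7 distinct classes.
labels = ["angry", "calm", "disgust", "fear", "happy", "sad", "surprise", "calm"]

n_classes = one_hot_width(labels)
output_units = 8  # units in the final Dense layer of the question's model

if n_classes != output_units:
    print(f"labels have {n_classes} columns, model outputs {output_units}")
    # Deriving the layer size from the data avoids the mismatch.
    output_units = n_classes

print(output_units)
```

In the Keras model, the equivalent would be to size the last layer from the data, for example Dense(units=y_train.shape[1], activation='softmax').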

Keras at TF2 metrics not added

I'm using the TensorFlow 2.0 nightly build on Google Colab.
I made a simple CNN model, then compiled and fit it.
Here's the code:
model = tf.keras.models.Sequential([
tf.keras.layers.Reshape((28, 28, 1)),
tf.keras.layers.Conv2D(filters=32, kernel_size=(3, 3), padding='SAME',
activation=tf.nn.relu),
tf.keras.layers.MaxPooling2D(pool_size=(2, 2)),
tf.keras.layers.Conv2D(filters=64, kernel_size=(3, 3), padding='SAME',
activation=tf.nn.relu),
tf.keras.layers.MaxPooling2D(pool_size=(2, 2)),
tf.keras.layers.Flatten(),
tf.keras.layers.Dense(256, activation=tf.nn.relu),
tf.keras.layers.Dropout(0.3),
tf.keras.layers.Dense(10),
tf.keras.layers.Softmax()
])
optimizer = tf.keras.optimizers.Adam(0.001)
model.compile(optimizer=optimizer,
loss=tf.keras.losses.CategoricalCrossentropy(),
matrics=['accuracy'])
log_dir='./logs'
tensorboard_callback = tf.keras.callbacks.TensorBoard(log_dir=log_dir,
histogram_freq=2,
write_images=True,
update_freq='batch',
profile_batch=0)
model.fit(x=x_train, y=y_train, batch_size=100, epochs=15,
callbacks=[tensorboard_callback], validation_data=(x_test, y_test))
But it doesn't give me accuracy information.
I evaluated the model, which is supposed to give me accuracy information, but it only gives me the loss.
I printed model.metrics, and it was just [].
Is this a bug, or did I miss something?
You misspelled metrics as matrics. Change matrics=['accuracy'] to metrics=['accuracy'].
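A plausible reason no error was raised is that compile in that build accepted extra keyword arguments, so the misspelled name was silently ignored. This is an assumption about that nightly build, not something confirmed in the thread. The toy functions below are hypothetical stand-ins that only show the difference between a signature with and without **kwargs.

```python
def compile_like(optimizer, loss=None, metrics=None, **kwargs):
    """Toy stand-in for a compile method that tolerates unknown keywords."""
    # Anything not matching a named parameter lands in kwargs and is ignored.
    return {"optimizer": optimizer, "loss": loss, "metrics": metrics or []}


def strict_compile(optimizer, loss=None, metrics=None):
    """Same signature without **kwargs: a misspelled keyword raises TypeError."""
    return {"optimizer": optimizer, "loss": loss, "metrics": metrics or []}


# The typo is swallowed and the metrics list stays empty.
config = compile_like("adam", loss="categorical_crossentropy", matrics=["accuracy"])
print(config["metrics"])

# With a strict signature the same typo is reported immediately.
try:
    strict_compile("adam", loss="categorical_crossentropy", matrics=["accuracy"])
    typo_caught = False
except TypeError:
    typo_caught = True
print(typo_caught)
```

Printing model.metrics after compile, as done in the question, is a quick way to spot this kind of silent typo.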