Saving and loading of Keras model not working - tensorflow

I've been trying to save and reload a model, and whenever I do, the accuracy goes down.
model = tf.keras.Sequential()
model.add(tf.keras.layers.Conv2D(64, kernel_size=3, activation='relu', input_shape=(IMG_SIZE,IMG_SIZE,3)))
model.add(tf.keras.layers.Conv2D(32, kernel_size=3, activation='relu'))
model.add(tf.keras.layers.Flatten())
model.add(tf.keras.layers.Dense(len(SURFACE_TYPES), activation='softmax'))
model.compile(loss='sparse_categorical_crossentropy',
optimizer='adam',
metrics=['acc'])
history = model.fit(
train_ds,
validation_data=val_ds,
epochs=EPOCHS,
validation_steps=10)
Output:
Epoch 1/3
84/84 [==============================] - 2s 19ms/step - loss: 1.9663 - acc: 0.6258 - val_loss: 0.8703 - val_acc: 0.6867
Epoch 2/3
84/84 [==============================] - 1s 18ms/step - loss: 0.2865 - acc: 0.9105 - val_loss: 0.4494 - val_acc: 0.8667
Epoch 3/3
84/84 [==============================] - 1s 18ms/step - loss: 0.1409 - acc: 0.9574 - val_loss: 0.3614 - val_acc: 0.9000
Running the following commands afterwards gives the same training loss but different training accuracies, even though the weights and structure of the two models are identical.
model.save("my_model2.h5")
model2 = tf.keras.models.load_model("my_model2.h5")
model2.evaluate(train_ds)
model.evaluate(train_ds)
Output:
84/84 [==============================] - 1s 9ms/step - loss: 0.0854 - acc: 0.0877
84/84 [==============================] - 1s 9ms/step - loss: 0.0854 - acc: 0.9862
[0.08536089956760406, 0.9861862063407898]

I have shared a reference link that covers all the formats for saving and loading your model.
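A plausible explanation, which the thread itself does not confirm, is that the metric string 'acc' is resolved to a different accuracy function once the model has been reloaded from the .h5 file. The loss is unaffected, which would match the identical loss values above. The NumPy sketch below uses made-up toy data to show how integer labels scored with a one-hot style accuracy collapse to "fraction of samples predicted as class 0", even when every prediction is correct.

```python
import numpy as np

def sparse_categorical_accuracy(y_true, y_pred):
    """Accuracy for integer class ids (shape (N,) or (N, 1)) vs. (N, C) probabilities."""
    return float(np.mean(y_true.reshape(-1) == np.argmax(y_pred, axis=-1)))

def categorical_accuracy(y_true, y_pred):
    """Accuracy that assumes y_true is one-hot with shape (N, C)."""
    return float(np.mean(np.argmax(y_true, axis=-1) == np.argmax(y_pred, axis=-1)))

# Hypothetical toy data: 4 samples, 3 classes, every prediction is correct.
y_true = np.array([[0], [1], [2], [1]])      # integer labels, shape (4, 1)
y_pred = np.array([[0.90, 0.05, 0.05],
                   [0.10, 0.80, 0.10],
                   [0.10, 0.10, 0.80],
                   [0.20, 0.70, 0.10]])

right = sparse_categorical_accuracy(y_true, y_pred)
# argmax over a length-1 axis is always 0, so only "class 0" predictions count as hits.
wrong = categorical_accuracy(y_true, y_pred)
print(right, wrong)
```

If this is what is happening, compiling with metrics=['sparse_categorical_accuracy'] instead of 'acc', or calling model2.compile(...) again with that explicit metric after loading, should make the two evaluate calls agree.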

Related

Fine tuning in CNN using Tensor Flow - 2.0

I am currently working on a defect classification problem for solar panels. It is a multi-class classification problem, currently with 3 classes. I have done the coding part, but my accuracy is very low. How can I improve it?
Total training images - 900
Testing/validation - 300
Class - 3
My code is given below -
import tensorflow as tf
import keras_preprocessing
from keras_preprocessing import image
from keras_preprocessing.image import ImageDataGenerator
TRAINING_DIR = "/content/drive/My Drive/solar_images/solar_images/train/"
training_datagen = ImageDataGenerator(
rescale = 1./255,
rotation_range=40,
width_shift_range=0.2,
height_shift_range=0.2,
shear_range=0.2,
zoom_range=0.2,
horizontal_flip=True,
fill_mode='nearest')
VALIDATION_DIR = "/content/drive/My Drive/solar_images/solar_images/test/"
validation_datagen = ImageDataGenerator(rescale = 1./255)
train_generator = training_datagen.flow_from_directory(
TRAINING_DIR,
target_size=(150,150),
class_mode='categorical',
batch_size=64
)
validation_generator = validation_datagen.flow_from_directory(
VALIDATION_DIR,
target_size=(150,150),
class_mode='categorical',
batch_size=64
)
model = tf.keras.models.Sequential([
# Note the input shape is the desired size of the image 150x150 with 3 bytes color
# This is the first convolution
tf.keras.layers.Conv2D(64, (3,3), activation='relu', input_shape=(150, 150, 3)),
tf.keras.layers.MaxPooling2D(2, 2),
# The second convolution
tf.keras.layers.Conv2D(64, (3,3), activation='relu'),
tf.keras.layers.MaxPooling2D(2,2),
# The third convolution
tf.keras.layers.Conv2D(128, (3,3), activation='relu'),
tf.keras.layers.MaxPooling2D(2,2),
# The fourth convolution
tf.keras.layers.Conv2D(128, (3,3), activation='relu'),
tf.keras.layers.MaxPooling2D(2,2),
# Flatten the results to feed into a DNN
tf.keras.layers.Flatten(),
tf.keras.layers.Dropout(0.5),
# 512 neuron hidden layer
tf.keras.layers.Dense(512, activation='relu'),
tf.keras.layers.Dense(3, activation='softmax')
])
model.summary()
model.compile(loss = 'categorical_crossentropy', optimizer='rmsprop', metrics=['accuracy'])
batch_size=64
history = model.fit(train_generator,
epochs=20,
steps_per_epoch=int(894/batch_size),
validation_data = validation_generator,
verbose = 1,
validation_steps=int(289/batch_size))
model.save("solar_images_weight.h5")
My accuracy is -
Epoch 1/20
13/13 [==============================] - 1107s 85s/step - loss: 1.2893 - accuracy: 0.3470 - val_loss: 1.0926 - val_accuracy: 0.3594
Epoch 2/20
13/13 [==============================] - 1239s 95s/step - loss: 1.1037 - accuracy: 0.3566 - val_loss: 1.0954 - val_accuracy: 0.3125
Epoch 3/20
13/13 [==============================] - 1203s 93s/step - loss: 1.0964 - accuracy: 0.3904 - val_loss: 1.0841 - val_accuracy: 0.5625
Epoch 4/20
13/13 [==============================] - 1182s 91s/step - loss: 1.0980 - accuracy: 0.3750 - val_loss: 1.0894 - val_accuracy: 0.3633
Epoch 5/20
13/13 [==============================] - 1218s 94s/step - loss: 1.1086 - accuracy: 0.3386 - val_loss: 1.0874 - val_accuracy: 0.3125
Epoch 6/20
13/13 [==============================] - 1214s 93s/step - loss: 1.0953 - accuracy: 0.3257 - val_loss: 1.0763 - val_accuracy: 0.6094
Epoch 7/20
13/13 [==============================] - 1136s 87s/step - loss: 1.0851 - accuracy: 0.3831 - val_loss: 1.0754 - val_accuracy: 0.3164
Epoch 8/20
13/13 [==============================] - 1170s 90s/step - loss: 1.1005 - accuracy: 0.3940 - val_loss: 1.0545 - val_accuracy: 0.5039
Epoch 9/20
13/13 [==============================] - 1138s 88s/step - loss: 1.1294 - accuracy: 0.4337 - val_loss: 1.0130 - val_accuracy: 0.5703
Epoch 10/20
13/13 [==============================] - 1131s 87s/step - loss: 1.0250 - accuracy: 0.4531 - val_loss: 0.8911 - val_accuracy: 0.6055
Epoch 11/20
13/13 [==============================] - 1162s 89s/step - loss: 1.0243 - accuracy: 0.4735 - val_loss: 0.9160 - val_accuracy: 0.4727
Epoch 12/20
13/13 [==============================] - 1153s 89s/step - loss: 0.9978 - accuracy: 0.4783 - val_loss: 0.7754 - val_accuracy: 0.6406
Epoch 13/20
13/13 [==============================] - 1187s 91s/step - loss: 1.0080 - accuracy: 0.4687 - val_loss: 0.7701 - val_accuracy: 0.6602
Epoch 14/20
13/13 [==============================] - 1204s 93s/step - loss: 0.9851 - accuracy: 0.5048 - val_loss: 0.7450 - val_accuracy: 0.6367
Epoch 15/20
13/13 [==============================] - 1181s 91s/step - loss: 0.9699 - accuracy: 0.4892 - val_loss: 0.7409 - val_accuracy: 0.6289
Epoch 16/20
13/13 [==============================] - 1187s 91s/step - loss: 0.8884 - accuracy: 0.5241 - val_loss: 0.7169 - val_accuracy: 0.6133
Epoch 17/20
13/13 [==============================] - 1197s 92s/step - loss: 0.9372 - accuracy: 0.5084 - val_loss: 0.7464 - val_accuracy: 0.5859
Epoch 18/20
13/13 [==============================] - 1224s 94s/step - loss: 0.9230 - accuracy: 0.5229 - val_loss: 0.9198 - val_accuracy: 0.5156
Epoch 19/20
13/13 [==============================] - 1270s 98s/step - loss: 0.9161 - accuracy: 0.5192 - val_loss: 0.6785 - val_accuracy: 0.6289
Epoch 20/20
13/13 [==============================] - 1173s 90s/step - loss: 0.8728 - accuracy: 0.5193 - val_loss: 0.6674 - val_accuracy: 0.5781
Training and validation accuracy plot is given below -
You could use transfer learning: take a pre-trained model such as MobileNet or Inception and train it on your dataset. With only about 900 training images, this usually improves accuracy considerably.
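A minimal sketch of that suggestion, assuming MobileNetV2 as the pre-trained base (the answer only says "mobilenet or inception"). The function name build_transfer_model, the dropout rate and the learning rate are illustrative choices, not something from the question.

```python
import tensorflow as tf

def build_transfer_model(num_classes=3, input_shape=(150, 150, 3), weights="imagenet"):
    """Frozen MobileNetV2 feature extractor with a small trainable softmax head."""
    base = tf.keras.applications.MobileNetV2(
        input_shape=input_shape, include_top=False, weights=weights)
    base.trainable = False  # keep the pre-trained filters fixed at first

    inputs = tf.keras.Input(shape=input_shape)
    x = base(inputs, training=False)  # run BatchNorm layers in inference mode
    x = tf.keras.layers.GlobalAveragePooling2D()(x)
    x = tf.keras.layers.Dropout(0.2)(x)
    outputs = tf.keras.layers.Dense(num_classes, activation="softmax")(x)

    model = tf.keras.Model(inputs, outputs)
    model.compile(
        optimizer=tf.keras.optimizers.Adam(learning_rate=1e-3),
        loss="categorical_crossentropy",
        metrics=["accuracy"],
    )
    return model

# Usage (downloads the ImageNet weights on first call):
# model = build_transfer_model()
# model.fit(train_generator, validation_data=validation_generator, epochs=20)
```

MobileNetV2 expects pixel values scaled to [-1, 1], so with the generators from the question it would make sense to replace rescale=1./255 with preprocessing_function=tf.keras.applications.mobilenet_v2.preprocess_input. Keras may also warn that 150x150 is not one of the sizes the ImageNet weights were trained on; the weights still load.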

Validation Accuracy Does Not Improve in CNN

I have an AlexNet-like CNN that tries to predict the class of an ornament. The training accuracy and loss monotonically increase and decrease, respectively, but the test accuracy fluctuates around 0.50.
I've tried changing various hyperparameters: I changed the batch size, used data augmentation, converted the data to grayscale because it's just pictures of stone, added dropout, regularization and Gaussian noise, and changed the unit count in the dense layers, but the validation accuracy still does not change.
I don't know what to do or how to improve my model. Please help me.
from keras.preprocessing.image import ImageDataGenerator
train_datagen=ImageDataGenerator (rescale = 1/255,
featurewise_center =True,
shear_range= 0.2,
zoom_range=0.2,
rotation_range=90,
width_shift_range=0.1,
height_shift_range=0.1,
fill_mode = 'nearest',
vertical_flip = True,
horizontal_flip=True)
training_set=train_datagen.flow_from_directory('/content/drive/My Drive/DATASET1/train',
target_size= (224,224),
batch_size= 128,
color_mode='grayscale',
class_mode='categorical')
test_datagen=ImageDataGenerator ( rescale = 1/255,
featurewise_center =True,
#shear_range= 0.2,
#zoom_range=0.2,
#horizontal_flip=True
)
test_set=test_datagen.flow_from_directory('/content/drive/My Drive/DATASET1/val',
target_size= (224,224),
batch_size= 48,
color_mode='grayscale',
class_mode='categorical')
model = Sequential()
# 1st Convolutional Layer
model.add(Conv2D(filters=96, input_shape=(224,224,1), kernel_size=(11,11), strides=(4,4), padding="same", activation = "relu"))
# Max Pooling
model.add(MaxPooling2D(pool_size=(3,3), strides=(2,2), padding="valid"))
# Batch Normalisation before passing it to the next layer
model.add(BatchNormalization())
# 2nd Convolutional Layer
model.add(Conv2D(filters=256, kernel_size=(11,11), strides=(1,1), padding="same", activation = "relu"))
# Max Pooling
model.add(MaxPooling2D(pool_size=(2,2), strides=(2,2), padding="valid"))
# Batch Normalisation
model.add(BatchNormalization())
# 3rd Convolutional Layer
model.add(Conv2D(filters=384, kernel_size=(3,3), strides=(1,1), padding="same", activation = "relu"))
# Batch Normalisation
model.add(BatchNormalization())
# 4th Convolutional Layer
model.add(Conv2D(filters=384, kernel_size=(3,3), strides=(1,1), padding="same", activation = "relu"))
# Batch Normalisation
model.add(BatchNormalization())
# 5th Convolutional Layer
model.add(Conv2D(filters=256, kernel_size=(3,3), strides=(1,1), padding="same", activation = "relu"))
# Max Pooling
model.add(MaxPooling2D(pool_size=(2,2), strides=(2,2), padding="valid"))
# Batch Normalisation
model.add(BatchNormalization())
# Passing it to a Fully Connected layer
model.add(Flatten())
# 1st Fully Connected Layer
regularizer =keras.regularizers.l2(l=0.0005)
model.add(GaussianNoise(0.1))
model.add(Dense(units = 4096, activation = "relu", kernel_regularizer = regularizer))
# Add Dropout to prevent overfitting
model.add(Dropout(0.4))
# Batch Normalisation
model.add(BatchNormalization(axis=-1, momentum=0.99, epsilon=0.001, center=True, scale=True, beta_initializer='zeros', gamma_initializer='ones', moving_mean_initializer='zeros', moving_variance_initializer='ones', beta_regularizer=None))
# 2nd Fully Connected Layer
regularizer =keras.regularizers.l2(l=0.0005)
model.add(GaussianNoise(0.1))
model.add(Dense(units = 2048, activation = "relu", kernel_regularizer = regularizer ))
# Add Dropout
model.add(Dropout(0.4))
# Batch Normalisation
model.add(BatchNormalization())
# 3rd Fully Connected Layer
regularizer =keras.regularizers.l2(l=0.0005)
model.add(GaussianNoise(0.1))
model.add(Dense(2048, activation = "relu", kernel_regularizer = regularizer))
# Add Dropout
model.add(Dropout(0.4))
# Batch Normalisation
model.add(BatchNormalization())
# Output Layer
model.add(Dense(2, activation = "softmax")) #As we have two classes
Epoch 1/20
/usr/local/lib/python3.6/dist-packages/keras_preprocessing/image/image_data_generator.py:716: UserWarning: This ImageDataGenerator specifies `featurewise_center`, but it hasn't been fit on any training data. Fit it first by calling `.fit(numpy_data)`.
warnings.warn('This ImageDataGenerator specifies ')
5/5 [==============================] - 9s 2s/step - loss: 6.2275 - accuracy: 0.5244 - val_loss: 5.9162 - val_accuracy: 0.4985
Epoch 00001: val_accuracy improved from -inf to 0.49853, saving model to alexnet_1.h5
Epoch 2/20
5/5 [==============================] - 7s 1s/step - loss: 6.1302 - accuracy: 0.6031 - val_loss: 5.9220 - val_accuracy: 0.5103
Epoch 00002: val_accuracy improved from 0.49853 to 0.51032, saving model to alexnet_1.h5
Epoch 3/20
5/5 [==============================] - 5s 1s/step - loss: 6.1390 - accuracy: 0.6250 - val_loss: 6.0433 - val_accuracy: 0.4932
Epoch 00003: val_accuracy did not improve from 0.51032
Epoch 4/20
5/5 [==============================] - 6s 1s/step - loss: 6.0528 - accuracy: 0.6429 - val_loss: 5.9255 - val_accuracy: 0.4985
Epoch 00004: val_accuracy did not improve from 0.51032
Epoch 5/20
5/5 [==============================] - 7s 1s/step - loss: 6.0935 - accuracy: 0.6094 - val_loss: 5.9714 - val_accuracy: 0.4926
Epoch 00005: val_accuracy did not improve from 0.51032
Epoch 6/20
5/5 [==============================] - 5s 1s/step - loss: 6.0139 - accuracy: 0.6447 - val_loss: 5.5711 - val_accuracy: 0.4932
Epoch 00006: val_accuracy did not improve from 0.51032
Epoch 7/20
5/5 [==============================] - 5s 1s/step - loss: 6.0250 - accuracy: 0.6353 - val_loss: 5.9171 - val_accuracy: 0.5133
Epoch 00007: val_accuracy improved from 0.51032 to 0.51327, saving model to alexnet_1.h5
Epoch 8/20
5/5 [==============================] - 7s 1s/step - loss: 6.0012 - accuracy: 0.6422 - val_loss: 6.0526 - val_accuracy: 0.4749
Epoch 00008: val_accuracy did not improve from 0.51327
Epoch 9/20
5/5 [==============================] - 6s 1s/step - loss: 5.9814 - accuracy: 0.6635 - val_loss: 5.4898 - val_accuracy: 0.4966
Epoch 00009: val_accuracy did not improve from 0.51327
Epoch 10/20
5/5 [==============================] - 5s 906ms/step - loss: 5.9613 - accuracy: 0.6769 - val_loss: 6.1255 - val_accuracy: 0.4956
Epoch 00010: val_accuracy did not improve from 0.51327
Epoch 11/20
5/5 [==============================] - 6s 1s/step - loss: 5.9888 - accuracy: 0.6484 - val_loss: 6.2377 - val_accuracy: 0.4956
Epoch 00011: val_accuracy did not improve from 0.51327
Epoch 12/20
5/5 [==============================] - 5s 1s/step - loss: 6.0045 - accuracy: 0.6767 - val_loss: 5.4328 - val_accuracy: 0.4932
Epoch 00012: val_accuracy did not improve from 0.51327
Epoch 13/20
5/5 [==============================] - 5s 1s/step - loss: 5.9569 - accuracy: 0.6654 - val_loss: 5.9874 - val_accuracy: 0.4985
Epoch 00013: val_accuracy did not improve from 0.51327
Epoch 14/20
5/5 [==============================] - 7s 1s/step - loss: 5.8978 - accuracy: 0.6859 - val_loss: 6.2074 - val_accuracy: 0.4897
Epoch 00014: val_accuracy did not improve from 0.51327
Epoch 15/20
5/5 [==============================] - 5s 1s/step - loss: 6.0063 - accuracy: 0.6792 - val_loss: 5.3235 - val_accuracy: 0.4966
Epoch 00015: val_accuracy did not improve from 0.51327
Epoch 16/20
5/5 [==============================] - 6s 1s/step - loss: 5.8966 - accuracy: 0.7068 - val_loss: 6.1324 - val_accuracy: 0.5015
Epoch 00016: val_accuracy did not improve from 0.51327
Epoch 17/20
5/5 [==============================] - 7s 1s/step - loss: 5.9352 - accuracy: 0.6562 - val_loss: 6.2356 - val_accuracy: 0.4867
Epoch 00017: val_accuracy did not improve from 0.51327
Epoch 18/20
5/5 [==============================] - 6s 1s/step - loss: 5.9475 - accuracy: 0.6391 - val_loss: 7.9573 - val_accuracy: 0.4966
Epoch 00018: val_accuracy did not improve from 0.51327
Epoch 19/20
5/5 [==============================] - 5s 1s/step - loss: 5.9627 - accuracy: 0.6898 - val_loss: 6.0916 - val_accuracy: 0.4985
Epoch 00019: val_accuracy did not improve from 0.51327
Epoch 20/20
5/5 [==============================] - 6s 1s/step - loss: 5.8621 - accuracy: 0.6974 - val_loss: 6.3277 - val_accuracy: 0.4926
Epoch 00020: val_accuracy did not improve from 0.51327
As you noted at the output layer, you have 2 classes, so this is a binary classification problem (labels 0 and 1). For this, the output layer is usually defined as below:
model.add(Dense(1, activation = "sigmoid"))
Use it along with class_mode='binary' in the generators and the binary_crossentropy loss function in your model.
I have removed BatchNormalization() and Dropout() for this test because we are already using data augmentation with a large dataset.
model = Sequential()
model.add(Conv2D(16, input_shape=(224,224,1), kernel_size=(3,3), activation = "relu"))
model.add(MaxPooling2D(pool_size=(3,3), ))
model.add(Conv2D(32, kernel_size=(3,3), activation = "relu"))
model.add(MaxPooling2D(pool_size=(2,2), ))
model.add(Conv2D(64, kernel_size=(3,3), activation = "relu"))
model.add(MaxPooling2D(pool_size=(2,2)))
model.add(Conv2D(128, kernel_size=(3,3), activation = "relu"))
model.add(MaxPooling2D(pool_size=(2,2)))
model.add(Conv2D(256, kernel_size=(3,3), activation = "relu"))
model.add(MaxPooling2D(pool_size=(2,2)))
model.add(Flatten())
model.add(Dense(128, activation = "relu"))
model.add(Dense(64, activation = "relu" ))
model.add(Dense(32, activation = "relu"))
model.add(Dense(1, activation = "sigmoid")) #As we have two classes
model.compile(loss='binary_crossentropy',
optimizer='adam',
metrics=['accuracy'])
model.fit(training_set, epochs =50, validation_data=test_set)
Output:
Epoch 45/50
16/16 [==============================] - 23s 1s/step - loss: 0.5665 - accuracy: 0.7180 - val_loss: 0.5676 - val_accuracy: 0.6960
Epoch 46/50
16/16 [==============================] - 22s 1s/step - loss: 0.5678 - accuracy: 0.7120 - val_loss: 0.5528 - val_accuracy: 0.7160
Epoch 47/50
16/16 [==============================] - 22s 1s/step - loss: 0.5524 - accuracy: 0.7305 - val_loss: 0.5584 - val_accuracy: 0.7060
Epoch 48/50
16/16 [==============================] - 23s 1s/step - loss: 0.5651 - accuracy: 0.7100 - val_loss: 0.5554 - val_accuracy: 0.7120
Epoch 49/50
16/16 [==============================] - 22s 1s/step - loss: 0.5587 - accuracy: 0.7145 - val_loss: 0.5604 - val_accuracy: 0.7120
Epoch 50/50
16/16 [==============================] - 22s 1s/step - loss: 0.5522 - accuracy: 0.7265 - val_loss: 0.5281 - val_accuracy: 0.7260
You can find details on overcoming overfitting and optimizing the model further in this reference.
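Separately, the training log in the question carries a UserWarning saying that featurewise_center was requested but the generator was never fit, so the centering has no statistics to work with. As a rough illustration of what that option is meant to do, here is a NumPy sketch with random stand-in images; the helper names are hypothetical. The mean is estimated on the training images only and then applied to both splits.

```python
import numpy as np

def fit_featurewise_mean(train_images):
    """Per-channel mean over the whole training set, shaped to broadcast over NHWC."""
    return train_images.mean(axis=(0, 1, 2), keepdims=True)

def center(images, mean):
    """Subtract the training-set mean from any batch (train or validation)."""
    return images - mean

rng = np.random.default_rng(0)
train = rng.random((8, 224, 224, 1))  # hypothetical grayscale training images
val = rng.random((4, 224, 224, 1))    # hypothetical validation images

mean = fit_featurewise_mean(train)
train_centered = center(train, mean)
val_centered = center(val, mean)      # same statistics, never re-estimated on val

print(mean.shape, abs(train_centered.mean()) < 1e-9)
```

With the real generators, that would mean calling train_datagen.fit(...) on an array of training images as the warning suggests, or simply dropping featurewise_center=True from both generators.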

Keras VGG16 low validation accuracy

I built a very simple Convolutional Neural Network using the pre-trained VGG16.
I am using the Pokemon generation one dataset, which contains 10,000 images belonging to 149 different classes. I manually split the dataset into separate directories, 0.7 for training and 0.3 for validation.
The problem is that I am getting high training accuracy, but the validation accuracy is not very high.
The code below is the best configuration I have found, using the Adam optimizer with a learning rate of 0.0001.
Can someone suggest how I can improve the performance and avoid overfitting?
Code:
import tensorflow as tf
import numpy as np
vgg_model = tf.keras.applications.VGG16(weights='imagenet', include_top=False, input_shape = (224,224,3))
vgg_model.trainable = False
model = tf.keras.models.Sequential()
model.add(vgg_model)
model.add(tf.keras.layers.Flatten(input_shape=vgg_model.output_shape[1:]))
model.add(tf.keras.layers.Dense(256, activation='relu'))
model.add(tf.keras.layers.Dense(149, activation='softmax'))
model.compile(optimizer=tf.keras.optimizers.Adam(lr=0.0001, decay=0.0001/100), loss='categorical_crossentropy', metrics=['accuracy'])
train= tf.keras.preprocessing.image.ImageDataGenerator(rescale=1./255, shear_range=0.2, zoom_range=0.2, horizontal_flip=True)
test= tf.keras.preprocessing.image.ImageDataGenerator(rescale=1./255)
training_set = train.flow_from_directory('datasets/generation/train', target_size=(224,224), class_mode = 'categorical')
val_set = train.flow_from_directory('datasets/generation/test', target_size=(224,224), class_mode = 'categorical')
history = model.fit_generator(training_set, steps_per_epoch = 64, epochs = 100, validation_data = val_set, validation_steps = 64)
Here is the output every 10 epochs:
Epoch 1/100
64/64 [====================] - 57s 891ms/step - loss: 4.8707 - acc: 0.0654 - val_loss: 4.7281 - val_acc: 0.0718
Epoch 10/100
64/64 [====================] - 53s 821ms/step - loss: 2.9540 - acc: 0.4141 - val_loss: 3.2206 - val_acc: 0.3447
Epoch 20/100
64/64 [====================] - 56s 869ms/step - loss: 1.9040 - acc: 0.6279 - val_loss: 2.6155 - val_acc: 0.4577
Epoch 30/100
64/64 [====================] - 50s 781ms/step - loss: 1.2899 - acc: 0.7658 - val_loss: 2.3345 - val_acc: 0.4897
Epoch 40/100
64/64 [====================] - 53s 832ms/step - loss: 1.0192 - acc: 0.8096 - val_loss: 2.1765 - val_acc: 0.5149
Epoch 50/100
64/64 [====================] - 55s 854ms/step - loss: 0.7948 - acc: 0.8672 - val_loss: 2.1082 - val_acc: 0.5359
Epoch 60/100
64/64 [====================] - 52s 816ms/step - loss: 0.5774 - acc: 0.9106 - val_loss: 2.0673 - val_acc: 0.5435
Epoch 70/100
64/64 [====================] - 52s 811ms/step - loss: 0.4383 - acc: 0.9385 - val_loss: 2.0499 - val_acc: 0.5454
Epoch 80/100
64/64 [====================] - 56s 881ms/step - loss: 0.3638 - acc: 0.9473 - val_loss: 1.9849 - val_acc: 0.5501
Epoch 90/100
64/64 [====================] - 55s 860ms/step - loss: 0.2860 - acc: 0.9609 - val_loss: 1.9564 - val_acc: 0.5531
Epoch 100/100
64/64 [====================] - 52s 815ms/step - loss: 0.2328 - acc: 0.9697 - val_loss: 2.0334 - val_acc: 0.5615
As I can see in your output above, the validation loss has not started climbing, so you are not overfitting in the strict sense yet, but there is a large gap between the training and validation scores. There are plenty of things you can try to improve your validation score:
Add more training data (not always possible).
Use heavier augmentation.
Use test-time augmentation (TTA).
Add dropout layers.
Add a dropout layer like this:
model.add(tf.keras.layers.Dense(256, activation='relu'))
model.add(tf.keras.layers.Dropout(0.5))
model.add(tf.keras.layers.Dense(149, activation='softmax'))
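For the TTA item, a minimal sketch of the idea: predict on the original batch and on a horizontally flipped copy, then average the class probabilities. predict_with_tta and toy_predict are hypothetical names; toy_predict only stands in for model.predict so the averaging can be seen without a trained network.

```python
import numpy as np

def predict_with_tta(predict_fn, images):
    """Average class probabilities over the original and horizontally flipped images."""
    views = [images, images[:, :, ::-1, :]]  # flip the width axis of an NHWC batch
    probs = [predict_fn(view) for view in views]
    return np.mean(probs, axis=0)

def toy_predict(batch):
    """Stand-in for model.predict: scores class 1 by the brightness of the left half."""
    left = batch[:, :, : batch.shape[2] // 2, :].mean(axis=(1, 2, 3))
    return np.stack([1.0 - left, left], axis=1)

batch = np.zeros((1, 4, 4, 1))
batch[:, :, :2, :] = 1.0  # bright left half, dark right half

tta_probs = predict_with_tta(toy_predict, batch)
print(tta_probs)
```

With the real model, the call would look like predict_with_tta(model.predict, x_val), where x_val is an array of preprocessed validation images. More views (small shifts, zooms) can be added to the list in the same way.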

How to increase the accuracy of my cnn model?

I have 4 different types of images (bicycle, car, airplane and musical instrument) and I tried to build an image classifier with this dataset. When I train the model, I get an accuracy of 0.62.
What should I do to increase the accuracy?
from tensorflow import keras
from keras.models import Sequential
from keras.layers import Dense, Dropout, Flatten, Conv2D, MaxPool2D
from keras.optimizers import RMSprop,Adam
#build the model
#set up the layers
model = Sequential()
model.add(Conv2D(filters = 8, kernel_size = (5,5),padding = 'Same',
activation ='relu', input_shape = (IMG_SIZE,IMG_SIZE,1)))
model.add(MaxPool2D(pool_size=(2,2)))
model.add(Dropout(0.25))
model.add(Conv2D(filters = 16, kernel_size = (3,3),padding = 'Same',
activation ='relu'))
model.add(MaxPool2D(pool_size=(2,2), strides=(2,2)))
model.add(Dropout(0.25))
model.add(Flatten())
model.add(Dense(128, activation = "relu"))
model.add(Dropout(0.5))
model.add(Dense(10, activation = "softmax"))
# Define the optimizer
#Adam optimizer: Change the learning rate
optimizer = Adam(lr=0.001, beta_1=0.9, beta_2=0.999)
# Compile the model
model.compile(optimizer = optimizer , loss = "categorical_crossentropy",
metrics=["accuracy"])
#train the model
model.fit(train_images, train_labels, epochs=10,batch_size=250)
#evaluate accuracy
val_loss, val_acc = model.evaluate(val_images, val_labels)
print('validation accuracy:', val_acc)
print('validation loss:' , val_loss)
Here is the result:
Epoch 1/10
18620/18620 [==============================] - 987s 53ms/step - loss: 2.0487 - acc: 0.3938
Epoch 2/10
18620/18620 [==============================] - 985s 53ms/step - loss: 1.1145 - acc: 0.5409
Epoch 3/10
18620/18620 [==============================] - 978s 53ms/step - loss: 1.0331 - acc: 0.5830
Epoch 4/10
18620/18620 [==============================] - 975s 52ms/step - loss: 1.0032 - acc: 0.5942
Epoch 5/10
18620/18620 [==============================] - 973s 52ms/step - loss: 0.9680 - acc: 0.6137
Epoch 6/10
18620/18620 [==============================] - 979s 53ms/step - loss: 0.9308 - acc: 0.6296
Epoch 7/10
18620/18620 [==============================] - 976s 52ms/step - loss: 0.9052 - acc: 0.6386
Epoch 8/10
18620/18620 [==============================] - 1008s 54ms/step - loss: 0.8755 - acc: 0.6507
Epoch 9/10
18620/18620 [==============================] - 994s 53ms/step - loss: 0.8479 - acc: 0.6614
Epoch 10/10
18620/18620 [==============================] - 997s 54ms/step - loss: 0.8108 - acc: 0.6749
4656/4656 [==============================] - 40s 9ms/step
validation accuracy: 0.6265034364261168
validation loss: 0.964772748373628
First, you can increase the number of CNN filters (8 or 16 is not enough) and then use BatchNormalization layers after the MaxPool layers. If that does not work, change the optimizer (to SGD, for example).
You may need to tune the exact number of filters depending on:
The complexity of your dataset.
The depth of your neural network. I recommend starting with filters in the range [32, 64, 128] in the earlier layers and increasing up to [256, 512, 1024] in the deeper layers.
If your input images are larger than 128×128, you may choose to use a kernel size > 3 to learn larger spatial filters and to help reduce volume size.
If your images are smaller than 128×128, you may want to consider sticking strictly with 1×1 and 3×3 filters.
Also use the Adam optimizer with its default values.
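As a rough sketch of these suggestions applied to the model in the question: more filters (32, 64, 128), BatchNormalization after each pooling layer, and Adam with its defaults. The function name build_cnn and the default img_size=64 are placeholders, since IMG_SIZE is not given in the question. The output layer has 4 units to match the four image types, which assumes the labels are one-hot encoded with four columns (the original code uses Dense(10)).

```python
import tensorflow as tf

def build_cnn(img_size=64, num_classes=4):
    """Small CNN with growing filter counts and BatchNormalization after each pool."""
    model = tf.keras.Sequential([
        tf.keras.Input(shape=(img_size, img_size, 1)),
        tf.keras.layers.Conv2D(32, (3, 3), padding="same", activation="relu"),
        tf.keras.layers.MaxPooling2D((2, 2)),
        tf.keras.layers.BatchNormalization(),
        tf.keras.layers.Conv2D(64, (3, 3), padding="same", activation="relu"),
        tf.keras.layers.MaxPooling2D((2, 2)),
        tf.keras.layers.BatchNormalization(),
        tf.keras.layers.Conv2D(128, (3, 3), padding="same", activation="relu"),
        tf.keras.layers.MaxPooling2D((2, 2)),
        tf.keras.layers.BatchNormalization(),
        tf.keras.layers.Flatten(),
        tf.keras.layers.Dense(128, activation="relu"),
        tf.keras.layers.Dropout(0.5),
        tf.keras.layers.Dense(num_classes, activation="softmax"),
    ])
    model.compile(
        optimizer=tf.keras.optimizers.Adam(),  # default hyperparameters
        loss="categorical_crossentropy",
        metrics=["accuracy"],
    )
    return model

model = build_cnn()
model.summary()
```

Training then works as in the question, for example model.fit(train_images, train_labels, epochs=10, batch_size=250), provided train_images matches the chosen img_size.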

Why does a cnn with keras not learn?

I am fairly new to deep learning, and especially to Keras, and I have a university assignment to train a CNN with Keras and learn about it. I am using the MURA dataset (musculoskeletal radiographs).
What I have done until now is to go over all images from the dataset and split the training set into train and validation (90/10).
I am using a CNN that has been given in the paper and I am not allowed to modify it until the second task. The first task is to observe and understand the CNN.
def run():
    train_datagen = ImageDataGenerator(rescale=1./255)
    val_datagen = ImageDataGenerator(rescale=1./255)
    train_generator = train_datagen.flow_from_directory('train_data',
        target_size=(227,227),
        batch_size=BATCH_SIZE,
        class_mode='binary',
        color_mode='grayscale'
    )
    val_generator = val_datagen.flow_from_directory('test_data',
        target_size=(227,227),
        batch_size=BATCH_SIZE,
        class_mode='binary',
        color_mode='grayscale'
    )
    classifier = Sequential()
    classifier.add(Conv2D(64,(7,7),strides=2, input_shape=(227,227,1)))
    classifier.add(Activation('relu'))
    classifier.add(MaxPooling2D(pool_size=(2,2), strides=2))
    classifier.add(Conv2D(128, (5,5), strides=2 ))
    classifier.add(Activation('relu'))
    classifier.add(MaxPooling2D(pool_size=(2,2),strides=2))
    classifier.add(Conv2D(256, (3,3), strides=1))
    classifier.add(Activation('relu'))
    classifier.add(Conv2D(384, (3,3), strides=1))
    classifier.add(Activation('relu'))
    classifier.add(Conv2D(256, (3,3), strides=1))
    classifier.add(Activation('relu'))
    classifier.add(Conv2D(256, (3,3), strides=1))
    classifier.add(Activation('relu'))
    classifier.add(MaxPooling2D(pool_size=(2,2),strides=2))
    classifier.add(Flatten())
    classifier.add(Dropout(0.5))
    classifier.add(Dense(units=2048))
    classifier.add(Activation('relu'))
    classifier.add(Dropout(0.5))
    classifier.add(Dense(units=1))
    classifier.add(Activation('sigmoid'))
    classifier.summary()
    # from keras.optimizers import SGD
    # sg = SGD(lr=0.1, decay=1e-6, momentum=0.9, nesterov=True)
    classifier.compile(optimizer=keras.optimizers.SGD(),loss='binary_crossentropy', metrics=['accuracy'])
    classifier.fit_generator(train_generator,
        steps_per_epoch= training_len//BATCH_SIZE,
        epochs=10,
        validation_data=val_generator,
        validation_steps= valid_len//BATCH_SIZE,
        shuffle=True,
        verbose=1)
    classifier.save_weights('first_model_weights.h5')
    classifier.save('first_model.h5')
The problem I am having is that if I run this, it just does not learn. Or at least I think it doesn't.
The output looks like this:
Epoch 1/10
575/575 [==============================] - 693s 1s/step - loss: 0.6767 - acc: 0.5958 - val_loss: 0.6751 - val_acc: 0.5966
Epoch 2/10
575/575 [==============================] - 207s 359ms/step - loss: 0.6760 - acc: 0.5948 - val_loss: 0.6752 - val_acc: 0.5958
Epoch 3/10
575/575 [==============================] - 258s 448ms/step - loss: 0.6745 - acc: 0.5983 - val_loss: 0.6748 - val_acc: 0.5958
Epoch 4/10
575/575 [==============================] - 165s 287ms/step - loss: 0.6760 - acc: 0.5950 - val_loss: 0.6757 - val_acc: 0.5947
Epoch 5/10
575/575 [==============================] - 166s 288ms/step - loss: 0.6761 - acc: 0.5948 - val_loss: 0.6731 - val_acc: 0.6016
Epoch 6/10
575/575 [==============================] - 167s 290ms/step - loss: 0.6742 - acc: 0.5990 - val_loss: 0.6778 - val_acc: 0.5875
Epoch 7/10
575/575 [==============================] - 206s 359ms/step - loss: 0.6762 - acc: 0.5938 - val_loss: 0.6721 - val_acc: 0.6038
Epoch 8/10
575/575 [==============================] - 165s 286ms/step - loss: 0.6762 - acc: 0.5938 - val_loss: 0.6763 - val_acc: 0.5947
Epoch 9/10
575/575 [==============================] - 164s 286ms/step - loss: 0.6751 - acc: 0.5972 - val_loss: 0.6787 - val_acc: 0.5897
Epoch 10/10
575/575 [==============================] - 168s 292ms/step - loss: 0.6750 - acc: 0.5971 - val_loss: 0.6722 - val_acc: 0.6022
Am I doing something wrong in the code? Is it the dataset splitting? I am currently in a dark spot and I can't seem to figure it out.
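One way to read this log, offered as a hypothesis rather than a diagnosis: a loss stuck near 0.675 with accuracy near 0.595 is what a network would produce if it ignored the images and always output the same probability, equal to the share of the larger class. The small calculation below checks that arithmetic. The 0.595 figure is read off the logged accuracy and is only an assumption about the class balance, which should be confirmed by counting the files in each class directory.

```python
import math

def prior_entropy(p_majority):
    """Binary cross-entropy of a model that always predicts the class prior."""
    q = 1.0 - p_majority
    return -(p_majority * math.log(p_majority) + q * math.log(q))

# Assumed share of the larger class, taken from the logged accuracy (about 0.595).
majority_fraction = 0.595
baseline_loss = prior_entropy(majority_fraction)
print(round(baseline_loss, 4))
```

The result is about 0.675, which matches the logged loss. If the real class split is close to 59.5/40.5, the numbers above are the "predict the prior" baseline, so the network has not yet learned anything from the pixels. That would point at optimization (the default SGD learning rate is 0.01, with no momentum) rather than at the train/validation split.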