how to improve the performance of the CNN model (Machine Learning - Deep Learning) - tensorflow

I am trying to train a model based on CNN,
Training data is the image of lunar lander, but the performance of model is not good, the accuracy is about 45%, I try to add more layer to improve it, but it still doesn't work well, could any one provide some ideas about how to improve it.
Label: up down left right (0,1,2,3)
Sample rate: 0.1
The Training data has converted from image to data
Label:
here is some of my code:
model = Sequential()
model.add(Conv2D(32, (3, 3), padding='same', input_shape=(CHANNELS, ROWS, COLS), activation='relu'))
model.add(Conv2D(32, (3, 3), padding='same', activation='relu'))
model.add(Conv2D(32, (3, 3), padding='same', activation='relu'))
model.add(Conv2D(32, (3, 3), padding='same', activation='relu'))
model.add(Conv2D(32, (3, 3), padding='same', activation='relu'))
model.add(MaxPooling2D(pool_size=(2,2)))
model.add(Flatten())
model.add(Dense(128, activation='relu'))
model.add(Dropout(0.5))
model.add(Dense(num_classes, activation = 'softmax'))
.summary()

I would experiment a lower dropout rate. Dropouts are used to prevent overfitting. It looks like your model isn't even fitting in the first place.

Related

Why is testing accuracy marginally higher then the training accuracy?

I am running a 3 class classification problem with around 7200 total images using CNN. I have used 80:20 split and the accuracy and loss curves are attached along with the code.
Can someone explain why the validation accuracy is higher then the training and similarly training loss is higher than the validation loss. Although the difference is marginal (around 1%), I am not sure if its acceptable or I need to improve the model.
Here is the model I am running with batch_size = 64
# define cnn model
def define_model():
model = Sequential()
model.add(Conv2D(32, (3, 3), activation='relu', kernel_initializer='he_uniform', padding='same', input_shape=(64, 64, 3)))
model.add(BatchNormalization())
model.add(MaxPooling2D((2, 2)))
model.add(Dropout(0.2))
model.add(Conv2D(64, (3, 3), activation='relu', kernel_initializer='he_uniform', padding='same'))
model.add(BatchNormalization())
model.add(MaxPooling2D((2, 2)))
model.add(Dropout(0.3))
model.add(Conv2D(128, (3, 3), activation='relu', kernel_initializer='he_uniform', padding='same'))
model.add(BatchNormalization())
model.add(MaxPooling2D((2, 2)))
model.add(Dropout(0.4))
model.add(Flatten())
model.add(Dense(128, activation='relu', kernel_initializer='he_uniform'))
model.add(BatchNormalization())
model.add(Dropout(0.5))
model.add(Dense(3, activation='softmax'))
# compile model
opt = SGD(lr=0.001, momentum=0.9)
model.compile(optimizer=opt, loss='categorical_crossentropy', metrics=['accuracy'])
return model
I tried various other models but the accuracy and loss curves didn't seem good. This one seems good but I was expecting the training accuracy to be higher than the validation accuracy

How to predict multiple properties with same classes?

The data looks like this:
X is a 100x150 RGB image
Y is a list of 6 One-Hot-Encoded Features. Each feature can have 6 labels (same for each feature)
This is how my model currenty looks (and works):
model = models.Sequential()
model.add(layers.Conv2D(32, (3, 3), activation='relu', input_shape=(100, 150, 3)))
model.add(layers.MaxPooling2D((2, 2)))
model.add(layers.Conv2D(64, (3, 3), activation='relu'))
model.add(layers.MaxPooling2D((2, 2)))
model.add(layers.Flatten())
model.add(layers.Dense(64, activation='relu'))
model.add(layers.Dense(36, activation='sigmoid'))
model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])
Sigmoid is working ok-ish, but I would rather tell the model that every 6 units of the output is actually one feature. Ideally, I would want to apply the softmax function to every group of 6 neurons, to properly cover the probability for each label of a feature.
Thanks for the help!

ValueError: Shapes (None, 1) and (None, 5) are incompatible in keras

model = Sequential()
model.add(Conv2D(128, (3, 3), activation='relu', input_shape=(64, 64, 3), padding='same'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Conv2D(64, (3, 3), activation='relu', padding='same'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Conv2D(32, (3, 3), activation='relu', padding='same'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Flatten())
model.add(Dense(128, activation = 'relu'))
model.add(Dropout(0.5))
model.add(Dense(5, activation = 'softmax'))
model.compile(optimizer='adam', loss='sparse_categorical_crossentropy', metrics=[tf.keras.metrics.Recall()])
This code works fine for metrics=['accuracy']), but it shows ValueError: Shapes (None, 1) and (None, 5) are incompatible for metrics=[tf.keras.metrics.Recall()])
Please help me. Thanks in advance.
Recall makes sense only for binary classification. Your final layer has 5 nodes, which essentially means you have 5 classes. You should change recall to another metric. Documentation should help you choose an appropriate metric for your model. Categorical accuracy should be good enough to get started.

Conv neural network to tell standard 52-card deck apart

I'm using the below keras model to train a neural network to tell 52 game cards 23456789TJQA each with Club, Diamond, Heart and Spade apart.
The model is working quite well but occasionally has problems telling Club and Diamond apart, as they are the most similar (and the difference is quite granular). I was wondering if anybody has some suggestions in what way I can improve the below model?
I've tried different things, like converting everything to black and white, grayscale, smoothing, augmentation etc, but nothing seems to solve that problem.
The pictures are all 15x50 pixels, with 1 channel, so the input shape is (15,50,1)
model = Sequential()
model.add(Conv2D(32, (3, 3), input_shape=input_shape, activation='relu', padding='same'))
model.add(Dropout(0.2))
model.add(Conv2D(32, (3, 3), activation='relu', padding='same'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Conv2D(64, (3, 3), activation='relu', padding='same'))
model.add(Dropout(0.2))
model.add(Conv2D(64, (3, 3), activation='relu', padding='same'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Conv2D(128, (3, 3), activation='relu', padding='same'))
model.add(Dropout(0.2))
model.add(Conv2D(128, (3, 3), activation='relu', padding='same'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Flatten())
model.add(Dropout(0.2))
model.add(Dense(1024, activation='relu', kernel_constraint=maxnorm(3)))
model.add(Dropout(0.2))
model.add(Dense(512, activation='relu', kernel_constraint=maxnorm(3)))
model.add(Dropout(0.2))
model.add(Dense(num_classes, activation='softmax'))

Feed CNN features to LSTM

I want to build an end-to-end trainable model with the following proprieties:
CNN to extract features from image
The features is reshaped to a matrix
Each row of this matrix is then fed to LSTM1
Each column of this matrix is then fed to LSTM2
The output of LSTM1 and LSTM2 are concatenated for the final output
(it's more or less similar to Figure 2 in this paper: https://arxiv.org/pdf/1611.07890.pdf)
My problem now is after the reshape, how can I feed the values of feature matrix to LSTM with Keras or Tensorflow?
This is my code so far with VGG16 net (also a link to Keras issues):
# VGG16
model = Sequential()
model.add(Conv2D(64, (3, 3), activation='relu', padding='same', input_shape=(224, 224, 3)))
model.add(Conv2D(64, (3, 3), activation='relu'))
model.add(MaxPooling2D((2, 2)))
# block 2
model.add(Conv2D(128, (3, 3), activation='relu', padding='same'))
model.add(Conv2D(128, (3, 3), activation='relu'))
model.add(MaxPooling2D((2, 2)))
# block 3
model.add(Conv2D(256, (3, 3), activation='relu', padding='same'))
model.add(Conv2D(256, (3, 3), activation='relu'))
model.add(Conv2D(256, (3, 3), activation='relu'))
model.add(MaxPooling2D((2, 2)))
# block 4
model.add(Conv2D(512, (3, 3), activation='relu', padding='same'))
model.add(Conv2D(512, (3, 3), activation='relu'))
model.add(Conv2D(512, (3, 3), activation='relu'))
model.add(MaxPooling2D((2, 2)))
# block 5
model.add(Conv2D(512, (3, 3), activation='relu', padding='same'))
model.add(Conv2D(512, (3, 3), activation='relu'))
model.add(Conv2D(512, (3, 3), activation='relu'))
model.add(MaxPooling2D((2, 2)))
# block 6
model.add(Flatten())
model.add(Dense(4096, activation='relu'))
model.add(Dense(4096, activation='relu'))
# reshape the feature 4096 = 64 * 64
model.add(Reshape((64, 64)))
# How to feed each row of this to LSTM?
# This is my first solution but it doesn’t look correct:
# model.add(LSTM(256, input_shape=(64, 1))) # 256 hidden units, sequence length = 64, feature dim = 1
Consider building your CNN model with Conv2D and MaxPool2D layers, until you reach your Flatten layer, because the vectorized output from the Flatten layer will be you input data to the LSTM part of your structure.
So, build your CNN model like this:
model_cnn = Sequential()
model_cnn.add(Conv2D...)
model_cnn.add(MaxPooling2D...)
...
model_cnn.add(Flatten())
Now, this is an interesting point, the current version of Keras has some incompatibility with some TensorFlow structures that will not let you stack your entire layers in just one Sequential object.
So it's time to use the Keras Model Object to complete you neural network with a trick:
input_lay = Input(shape=(None, ?, ?, ?)) #dimensions of your data
time_distribute = TimeDistributed(Lambda(lambda x: model_cnn(x)))(input_lay) # keras.layers.Lambda is essential to make our trick work :)
lstm_lay = LSTM(?)(time_distribute)
output_lay = Dense(?, activation='?')(lstm_lay)
And finally, now it's time to put together our 2 separated models:
model = Model(inputs=[input_lay], outputs=[output_lay])
model.compile(...)
OBS: Note that you can substitute my model_cnn example by your VGG without including the top layers, once the vectorized output from the VGG Flatten layer will be the input of the LSTM model.