How to build a pretrained CNN-LSTM network with Keras - TensorFlow

I'm trying to use a CNN-LSTM network with Keras in order to analyze videos. I read about it and ran into the TimeDistributed wrapper and some examples.
I actually tried the network described below, which is composed of convolutional and pooling layers followed by recurrent and dense layers.
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import TimeDistributed, Conv2D, MaxPooling2D, Flatten, LSTM, Dense

model = Sequential()
# apply the same convolution to every frame of the (frames, height, width, channels) input
model.add(TimeDistributed(Conv2D(2, (2, 2), activation='relu'), input_shape=(None, IMG_SIZE, IMG_SIZE, 3)))
model.add(TimeDistributed(MaxPooling2D(pool_size=(2, 2))))
model.add(TimeDistributed(Flatten()))
model.add(LSTM(50))
model.add(Dense(50, activation='softmax'))
model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['acc'])
I haven't properly tested the model, since my dataset is too small. However, during training the network reaches an accuracy of 0.98 within 4-5 epochs (perhaps it is overfitting, but that isn't a problem yet, because I hope to get a suitable dataset later).
Then I read about using a pretrained convolutional network (MobileNet, ResNet or Inception) as a feature extractor for the LSTM network, so I use the following code:
from tensorflow.keras.applications import InceptionV3
from tensorflow.keras.layers import Input, TimeDistributed, GlobalAveragePooling2D, LSTM, Dense
from tensorflow.keras.models import Model

inputs = Input(shape=(frames, IMG_SIZE, IMG_SIZE, 3))
cnn_base = InceptionV3(include_top=False, weights='imagenet', input_shape=(IMG_SIZE, IMG_SIZE, 3))
cnn_out = GlobalAveragePooling2D()(cnn_base.output)
cnn = Model(inputs=cnn_base.input, outputs=cnn_out)
encoded_frames = TimeDistributed(cnn)(inputs)  # run the CNN on every frame
encoded_sequence = LSTM(256)(encoded_frames)
hidden_layer = Dense(1024, activation="relu")(encoded_sequence)
outputs = Dense(50, activation="softmax")(hidden_layer)
model = Model([inputs], outputs)
In this case, when training the model it always shows an accuracy of ~0.02 (the 1/50 baseline).
Since the first model at least learned something, I am wondering whether there is an error in the way the network is built in the second case.
Has anybody faced this situation? Any advice?
Thank you.

The reason is that you have a very small amount of data and are retraining the complete InceptionV3 weights. You either have to train the model with more data, or train it for more epochs with hyperparameter tuning. You can find more about hyperparameter tuning here.
The ideal way is to freeze the base model with base_model.trainable = False and train only the new layers that you have added on top of the InceptionV3 layers.
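For the question's model, that first option could look like the minimal sketch below (it reuses frames, IMG_SIZE and the 50-class head from the question; everything else is an assumption, not the asker's exact code):

cnn_base = InceptionV3(include_top=False, weights='imagenet', input_shape=(IMG_SIZE, IMG_SIZE, 3))
cnn_base.trainable = False  # freeze every InceptionV3 weight
cnn_out = GlobalAveragePooling2D()(cnn_base.output)
cnn = Model(inputs=cnn_base.input, outputs=cnn_out)
inputs = Input(shape=(frames, IMG_SIZE, IMG_SIZE, 3))
encoded_frames = TimeDistributed(cnn)(inputs)
encoded_sequence = LSTM(256)(encoded_frames)  # only the layers from here on are trained
hidden_layer = Dense(1024, activation="relu")(encoded_sequence)
outputs = Dense(50, activation="softmax")(hidden_layer)
model = Model(inputs, outputs)
model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['acc'])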
OR
Unfreeze the top layers of the base model (the InceptionV3 layers) and set the bottom layers to be untrainable. You can do it as below:
# Let's take a look to see how many layers are in the base model
print("Number of layers in the base model: ", len(base_model.layers))

# Fine-tune from this layer onwards
fine_tune_at = 100

# Freeze all the layers before the `fine_tune_at` layer
for layer in base_model.layers[:fine_tune_at]:
    layer.trainable = False
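Either way, note that in Keras a change to layer.trainable only takes effect after the model is compiled again, so call model.compile(...) once more after freezing or unfreezing layers.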

Related

How to reduce the size of a neural network model file in Keras? (To deploy to OpenMV)

I need to train an image classification model with 15 categories. Because it needs to be deployed to OpenMV, the model size cannot exceed 1 MB, otherwise OpenMV cannot load it. I used the pretrained MobileNetV2 model: as in the example on Keras's website, I placed the convolutional base of MobileNetV2 at the bottom of my model and then added a softmax-activated dense layer on top. Training works well and the accuracy reaches 90%, but the problem is that the exported H5 model is 3 MB in size.
I also tried a point-and-click transfer learning website, https://studio.edgeimpulse.com/
The classification model exported from that site is only 600 KB (after INT8 quantization). What makes my model so large?
Here is the structure of my network:
from tensorflow import keras
from tensorflow.keras import layers

def mobile_net_v2(data_augmentation, input_shape):
    # pretrained convolutional base (ImageNet weights), used frozen
    base_model = keras.applications.mobilenet_v2.MobileNetV2(
        weights='imagenet',
        input_shape=input_shape,
        alpha=0.35,
        include_top=False)
    inputs = keras.Input(shape=input_shape)
    base_model.trainable = False
    x = data_augmentation(inputs)
    x = layers.Rescaling(1. / 255)(x)
    x = base_model(x)
    x = layers.Flatten()(x)
    outputs = layers.Dense(15, activation='softmax')(x)
    return keras.Model(inputs, outputs)
The input size is (96, 96).
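For reference, the 600 KB figure quoted above comes after INT8 quantization. A minimal, hypothetical sketch of post-training INT8 quantization with the TensorFlow Lite converter would look like this (model is the trained Keras model; representative_images and the output filename are placeholders; OpenMV's firmware loads .tflite models via TensorFlow Lite for Microcontrollers):

import numpy as np
import tensorflow as tf

def representative_dataset():
    # a few hundred typical inputs let the converter calibrate activation ranges
    for img in representative_images[:200]:
        yield [np.expand_dims(img.astype(np.float32), axis=0)]

converter = tf.lite.TFLiteConverter.from_keras_model(model)
converter.optimizations = [tf.lite.Optimize.DEFAULT]
converter.representative_dataset = representative_dataset
converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS_INT8]
converter.inference_input_type = tf.uint8
converter.inference_output_type = tf.uint8

tflite_model = converter.convert()
with open('model_int8.tflite', 'wb') as f:
    f.write(tflite_model)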

Stopping training at maximum validation accuracy. Is this a good practice?

I am training my model with a dataset of 200 images. I have created a binary classification CNN that looks like this:
import keras
from keras.models import Sequential
from keras.layers import Convolution2D, MaxPooling2D, Flatten, Dense, Dropout

classifier = Sequential()
# Adding a first convolutional layer
classifier.add(Convolution2D(48, 3, input_shape=(320, 320, 3), activation='relu'))
classifier.add(MaxPooling2D())
# Adding a second convolutional layer
classifier.add(Convolution2D(48, 3, activation='relu'))
classifier.add(MaxPooling2D())
# Adding a third convolutional layer
classifier.add(Convolution2D(48, 3, activation='relu'))
classifier.add(MaxPooling2D())
# Flattening
classifier.add(Flatten())
# Fully connected
classifier.add(Dense(256, activation='relu'))
# Fully connected
classifier.add(Dense(256, activation='sigmoid'))
# Dropout
classifier.add(Dropout(0.5))
# Output layer
classifier.add(Dense(1, activation='sigmoid'))
# Compiling the CNN
opt = keras.optimizers.Adam(learning_rate=0.001)
classifier.compile(optimizer=opt, loss='binary_crossentropy', metrics=['accuracy'])
classifier.summary()
I am also using Image Data Augmentation and Early Stopping based on val_accuracy with a patience of 10.
My results are the following:
[Plot: validation accuracy per epoch]
The best validation accuracy I get is 0.9231, at the 21st epoch. Should I stop the training with a custom callback once I surpass 92%, or is that bad practice?
The best practice here is to save the model every time the validation accuracy hits a new maximum, but to keep training. Alternatively, you could save the model after each epoch and choose the best one by checking the validation graph (I'd suggest epoch 11 here; after epoch 11 the validation curve is just oscillating, which is mostly noise).
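In Keras, the first option is what the ModelCheckpoint callback does. A minimal sketch (the filename, epoch count, and data variables are placeholders; on older Keras versions the metric is named 'val_acc' rather than 'val_accuracy'):

from keras.callbacks import ModelCheckpoint

# writes a new copy of the model only when val_accuracy improves; training continues
checkpoint = ModelCheckpoint('best_model.h5', monitor='val_accuracy',
                             save_best_only=True, mode='max')
classifier.fit(x_train, y_train, epochs=50,
               validation_data=(x_val, y_val),
               callbacks=[checkpoint])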
Finally, 200 images is rarely enough to get good results. You want thousands, or tens of thousands, at least. Even your validation set should have at least 100 images, so that even minor changes to the model show up as smooth changes in the validation curve. You should also consider adding some data augmentation if you aren't doing so already.

Keras model not learning and predicting only one class out of three classes

I am new to the field of deep learning and am currently working on this competition for predicting earthquake damage to buildings.
The model I created starts at an accuracy of 0.56 and stays there for any number of epochs I let it run. When finished, the model predicts only one of the three classes (which I one-hot encoded into a dataframe with three columns). Changing the number of layers, the optimizer, the data preparation, or the dropout doesn't change anything. Even trying to overfit the model by over-parameterizing the network still gives the same accuracy and a non-learning model.
What am I doing wrong?
This is my code:
import keras

model = keras.models.Sequential()
model.add(keras.layers.Dense(64, input_dim=85, activation="relu"))
keras.layers.Dropout(0.3)  # note: this creates a Dropout layer but never adds it to the model
model.add(keras.layers.Dense(128, activation="relu"))
keras.layers.Dropout(0.3)  # same here
model.add(keras.layers.Dense(256, activation="relu"))
keras.layers.Dropout(0.3)  # and here
model.add(keras.layers.Dense(512, activation="relu"))
model.add(keras.layers.Dense(3, activation="softmax"))
adam = keras.optimizers.Adam(lr=0.001, beta_1=0.9, beta_2=0.999, epsilon=1e-08, decay=0.0)
model.compile(optimizer=adam,
              loss='categorical_crossentropy',
              metrics=['accuracy'])
history = model.fit(traindata, trainlabels,
                    epochs=5,
                    validation_split=0.2,
                    verbose=1)
There's nothing visually wrong with your model, but it may be too heavy to learn any useful features.
Try normalizing your input with https://scikit-learn.org/stable/modules/generated/sklearn.preprocessing.StandardScaler.html (see the sketch after this list).
Start with only 2 layers and a small number of neurons.
Increase the batch_size and try learning_rate scheduling.
Observe the validation accuracy and stop when it starts to overfit.
Also, for a 3-class classification, 56% accuracy is better than the baseline; remember that this is a competition, so the data is not dummy playground data on which you can expect 90% accuracy from an MLP on the first try.
Finally, try hyperparameter optimization with a tuner.
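As a minimal sketch of the normalization step (assuming traindata is the 2-D feature array from the question):

from sklearn.preprocessing import StandardScaler

scaler = StandardScaler()
traindata = scaler.fit_transform(traindata)  # learn mean/std on the training features only
# any held-out or test features must reuse the same statistics:
# testdata = scaler.transform(testdata)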

How is my model working if all the base layers' trainable flags are set to False?

This is the model I built for my deep learning project, and I am getting decent accuracy out of it. My question is: if I froze the weights of the initial model (my VGG19 base model), how did I manage to train the whole model? Also, after adding the VGG19 layers with their weights frozen, I got better results than I achieved with only a few CNN layers. Could it be because the VGG19 weights were used to initialize my CNN layers?
from keras import applications
from keras.layers import Conv2D, Flatten, Dense, Dropout
from keras.models import Model

img_h = 224
img_w = 224
initial_model = applications.vgg19.VGG19(weights='imagenet', include_top=False, input_shape=(img_h, img_w, 3))
last = initial_model.output
for layer in initial_model.layers:
    layer.trainable = False  # freeze every VGG19 layer
x = Conv2D(128, kernel_size=3, strides=1, activation='relu')(last)
x = Conv2D(64, kernel_size=3, strides=1, activation='relu')(x)
x = Flatten()(x)
x = Dense(512, activation='relu')(x)
x = Dense(256, activation='relu')(x)
x = Dense(128, activation='relu')(x)
x = Dropout(0.1)(x)
preds = Dense(2, activation='sigmoid')(x)
model = Model(initial_model.input, preds)  # implied but missing from the original snippet
"Freezing the layers" just means you don't update the weights on those layers when you backpropagate the error. Therefore, you'll just update the weights on those layers that are not frozen, which enables your neural net to learn.
You are adding some layers after VGG. I don't know if this is a common approach, but it totally makes sense that it kind of works, assuming you are interpreting your metrics right.
Your VGG has already been pre-trained on ImageNet, so it's a pretty good baseline for many use-cases. You are basically using VGG as your encoder. Then, on the output of this encoder (which we can call latent representation of your input), you train a neural net.
I would also try out more mainstream transfer learning techniques, where you gradually unfreeze layers starting from the end, or you have gradually smaller learning rate.
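A minimal sketch of that gradual unfreezing, assuming the initial_model and model objects from the question (the 'block5' prefix matches the layer names keras.applications assigns to VGG19; the optimizer and loss settings are assumptions):

from keras import optimizers

# unfreeze only the last VGG19 convolutional block; keep everything below frozen
for layer in initial_model.layers:
    layer.trainable = layer.name.startswith('block5')

# recompile with a small learning rate so the pretrained weights change slowly
model.compile(optimizer=optimizers.Adam(learning_rate=1e-5),
              loss='categorical_crossentropy', metrics=['accuracy'])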

Training ResNet-50 on the CIFAR-100 dataset in TensorFlow, can't get good accuracy

I am trying to train a ResNet-50 model in TensorFlow on the CIFAR-100 dataset. I used the built-in resnet_v1_50 to create the model, with two fully connected layers on its head, but my validation accuracy is stuck at nearly 37%. What is the problem? Am I defining and configuring resnet_v1_50 wrongly? My model creation code is given below.
import tensorflow as tf
from tensorflow.contrib.slim.python.slim.nets import resnet_v1

num_classes = 100  # CIFAR-100

X = tf.placeholder(dtype=tf.float32, shape=[None, 32, 32, 3])
Y = tf.placeholder(dtype=tf.float32, shape=[None, 100])
net, end_points = resnet_v1.resnet_v1_50(X, global_pool=False, is_training=True)
flattened = tf.contrib.layers.flatten(net)
dense_fc1 = tf.layers.dense(inputs=flattened, units=625, activation=tf.nn.relu,
                            kernel_initializer=tf.contrib.layers.xavier_initializer())
# the original snippet referenced self.training; a plain boolean works outside a class
dropout_fc1 = tf.layers.dropout(inputs=dense_fc1, rate=0.5, training=True)
logits = tf.layers.dense(inputs=dropout_fc1, units=num_classes,
                         kernel_initializer=tf.contrib.layers.xavier_initializer())
cost = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(logits=logits, labels=Y))
optimizer = tf.train.AdamOptimizer(learning_rate=0.001).minimize(cost)
I think you have an extra dense layer: ResNet uses a single fully connected layer, with softmax and size=num_classes.
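Following that suggestion, a hypothetical sketch of the head with a single fully connected layer (global_pool=True makes resnet_v1_50 end with global average pooling, as the reference ResNet does, instead of flattening a spatial map):

net, end_points = resnet_v1.resnet_v1_50(X, global_pool=True, is_training=True)
flattened = tf.contrib.layers.flatten(net)  # [batch, 2048]
logits = tf.layers.dense(inputs=flattened, units=100,
                         kernel_initializer=tf.contrib.layers.xavier_initializer())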
You might also want to make sure that your hyperparameters are set correctly, like the learning_rate and weight_decay, and that your input processing pipeline is correct as well.
Here is an extra link to check whether your pipeline is similar to a working solution.