Retraining InceptionV3: Does not converge with input dimensions 299x299 - tensorflow

I'm using Keras with pretrained weights for the convolutional layers and want to train only the dense layer.
The model performs as expected when I use input dimensions of 150x150 or 224x224, but it does not converge with 299x299 (the training loss increases, and the training and validation accuracy stay flat at the level of a random guess).
Why does this happen?

Solved - the learning rate needs to be adjusted depending on the input size of a CNN.
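For illustration, here is a hedged tf.keras sketch of that fix, assuming a frozen InceptionV3 base with a single trainable dense head; the 1e-5 learning rate and the 2-class output are illustrative values, not the asker's exact configuration:

import tensorflow as tf
from tensorflow.keras import layers, Model

# Frozen pretrained convolutional base; only the dense head trains
base = tf.keras.applications.InceptionV3(
    include_top=False, weights='imagenet', input_shape=(299, 299, 3), pooling='avg')
base.trainable = False
outputs = layers.Dense(2, activation='softmax')(base.output)    # hypothetical 2 classes
model = Model(base.input, outputs)
# The key point: use a smaller learning rate for the larger 299x299 input
model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=1e-5),
              loss='categorical_crossentropy', metrics=['accuracy'])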

Related

SegNet for CT images with pretrained weights

I'm trying to train a SegNet for a segmentation task on CT images (with Keras/TF).
I'm using VGG16 pretrained weights, but I had a problem with the first convolutional layer because my images are grayscale while VGG was trained on RGB ones.
I solved that using the second method of this (I can't use the first method because it requires too much memory).
However, it didn't help; the results are really bad (trained for 100 epochs).
Should I train the first convolutional layer from scratch?
You can try adding a Conv2D before the VGG. Something like:
inputs = Input(shape=(height, width, 1))    # your grayscale input
x = Conv2D(filters=3, kernel_size=1, padding='same', activation='relu')(inputs)
x = vgg(x)    # the VGG pretrained network (input = (height, width, 3))
The 1x1 convolution is interesting in your case because it is usually employed to change the depth (number of channels) of a tensor.
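A self-contained sketch of this idea, assuming tf.keras, ImageNet weights, and an illustrative 256x256 slice size:

import tensorflow as tf
from tensorflow.keras import layers, Model

height, width = 256, 256    # hypothetical CT slice size
inputs = layers.Input(shape=(height, width, 1))    # grayscale input
x = layers.Conv2D(filters=3, kernel_size=1, padding='same', activation='relu')(inputs)    # 1x1 conv: 1 -> 3 channels
vgg = tf.keras.applications.VGG16(include_top=False, weights='imagenet',
                                  input_shape=(height, width, 3))
x = vgg(x)    # pretrained VGG16 feature extractor
model = Model(inputs, x)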

Higher training set accuracy, lower test set accuracy

I'm using a CNN to classify wireless signals.
Meanwhile I've met a strange problem - when the training set accuracy is 80%, I get 79% test set accuracy, but when the training set accuracy reaches 93%, the test set accuracy falls to 71%. Has anyone had the same problem before?
My net is based on Keras + TensorFlow.
The details of the net are:
CNN(512, (2,2), tanh)
BatchNormalization
Flatten()
DNN(512, elu)
DNN(256, elu)
DNN(128, softmax)
opt = adam
loss = mse
Thanks!
This would appear to be a case of overfitting. How did you get the training accuracy to go from 80% to 93%? Was it just by running more epochs?
If overfitting is what is happening, add dropout layers to the model. This should improve the validation accuracy, though it may take more epochs to reach the desired training accuracy. Another alternative is to use regularizers in the dense layers; a sketch of both is shown below.
The more complex your model is, the more prone it is to overfitting, so you might try running the model with just two dense layers, or alternatively reduce the number of nodes in the hidden layers.
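For concreteness, a hedged tf.keras sketch of those two suggestions applied to the dense head described in the question (the dropout rate and L2 factor are illustrative, not tuned):

import tensorflow as tf
from tensorflow.keras import layers, regularizers

head = tf.keras.Sequential([
    layers.Flatten(),
    layers.Dense(512, activation='elu', kernel_regularizer=regularizers.l2(1e-4)),    # L2 on the dense weights
    layers.Dropout(0.5),    # randomly drop units to reduce overfitting
    layers.Dense(256, activation='elu', kernel_regularizer=regularizers.l2(1e-4)),
    layers.Dropout(0.5),
    layers.Dense(128, activation='softmax'),
])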

Which loss function will converge well in a multi-label image classification task?

I've trained a multi-label, multi-class image classifier using sigmoid as the output activation function and binary_crossentropy as the loss function.
The validation accuracy curve fluctuates up and down, while the loss curve shows weird (very high) values at a few epochs.
Following are the accuracy and loss curves for the fine-tuned (last block) VGG19 model with Dropout and BatchNormalization:
[figure: accuracy curve]
[figure: loss curve]
And the accuracy and loss curves for the fine-tuned (last block) VGG19 model with Dropout, BatchNormalization, and Data Augmentation:
[figure: accuracy curve with data augmentation]
[figure: loss curve with data augmentation]
I've trained the classifier with 1800 training images (5 labels) and 100 validation images. The optimizer I used is SGD(lr=0.001, momentum=0.99).
Can anyone explain why the loss curve shows such weird or high values at some epochs?
Should I use a different loss function? If yes, which one?
Don't worry - all is well. Your loss curve doesn't say much by itself, and spikes in the loss curve in particular are perfectly allowed; your model is still training. You should look at your accuracy curve instead, and that one goes up quite normally, I think.
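For reference, sigmoid plus binary_crossentropy is the standard pairing for multi-label classification, since each label is an independent yes/no decision. A minimal tf.keras sketch of such a head (the 5 labels match the question; the 4096-dim feature input is a hypothetical stand-in for the base network's features):

import tensorflow as tf
from tensorflow.keras import layers

inputs = layers.Input(shape=(4096,))    # hypothetical feature vector from the base network
outputs = layers.Dense(5, activation='sigmoid')(inputs)    # one independent probability per label
model = tf.keras.Model(inputs, outputs)
model.compile(optimizer=tf.keras.optimizers.SGD(learning_rate=0.001, momentum=0.99),
              loss='binary_crossentropy', metrics=['accuracy'])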

Expected validation accuracy for Keras MobileNet V1 on CIFAR-10 (training from scratch)

Has anybody trained MobileNet V1 from scratch on CIFAR-10? What was the maximum accuracy you got? I am getting stuck at 70% after 110 epochs. Here is how I am creating the model. My training accuracy, however, is above 99%.
# Imports needed by this snippet
import tensorflow as tf
from tensorflow.keras.layers import Input, Flatten, Dense
from tensorflow.keras.models import Model

# Create the MobileNet base (no pretrained weights, no classification top)
MobileNet_model = tf.keras.applications.MobileNet(include_top=False, weights=None)
# Must define the input shape in the first layer of the neural network
x = Input(shape=(32, 32, 3), name='input')
# Create the custom model
model = MobileNet_model(x)
model = Flatten(name='flatten')(model)
model = Dense(1024, activation='relu', name='dense_1')(model)
output = Dense(10, activation=tf.nn.softmax, name='output')(model)
model_regular = Model(x, output, name='model_regular')
I used the Adam optimizer with lr = 0.001, amsgrad = True, and a batch size of 64. I also normalized the pixel data by dividing by 255.0. I am not using any data augmentation.
optimizer1 = tf.keras.optimizers.Adam(lr=0.001, amsgrad=True)
model_regular.compile(optimizer=optimizer1, loss='categorical_crossentropy', metrics=['accuracy'])
# Train the model
history = model_regular.fit(x_train, y_train_one_hot, validation_data=(x_test, y_test_one_hot),
                            batch_size=64, epochs=100)
I think I am supposed to get at least 75% according to https://arxiv.org/abs/1712.04698.
Am I doing anything wrong, or is this the expected accuracy after 100 epochs? Here is a plot of my validation accuracy.
MobileNet was designed to be trained on ImageNet, which is much larger, so training it on CIFAR-10 will inevitably result in overfitting. I would suggest you plot the loss (not accuracy) for both training and validation/evaluation, and try to train hard enough to reach 99% training accuracy, then observe the validation loss. If the model is overfitting, you will see the validation loss actually increase after reaching a minimum.
A few things to try to reduce overfitting (a sketch of some of these follows below):
- add dropout before the fully connected layer
- data augmentation - random shifts, crops, and rotations should be enough
- use a smaller width multiplier (read the original paper; basically, just reduce the number of filters per layer), e.g. 0.75 or 0.5, to make the layers thinner
- use L2 weight regularization and weight decay
Then there are the usual training tricks:
- use learning rate decay, e.g. reduce the learning rate from 1e-2 to 1e-4, stepwise or exponentially
With some hyperparameter search, I got an evaluation loss of 0.85. I didn't use Keras; I wrote the MobileNet myself using TensorFlow.
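For illustration, a hedged tf.keras sketch of the augmentation and learning-rate-decay suggestions above (the shift/rotation ranges and the decay schedule are illustrative, not tuned):

import tensorflow as tf

# Random shift / rotation / flip augmentation for CIFAR-10-sized images
datagen = tf.keras.preprocessing.image.ImageDataGenerator(
    width_shift_range=0.1, height_shift_range=0.1,
    rotation_range=15, horizontal_flip=True)

# Stepwise learning-rate decay from 1e-2 down toward 1e-4
def schedule(epoch):
    return 1e-2 * (0.1 ** (epoch // 30))

lr_decay = tf.keras.callbacks.LearningRateScheduler(schedule)
# Usage: model.fit(datagen.flow(x_train, y_train, batch_size=64),
#                  epochs=100, callbacks=[lr_decay])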
The OP asked about MobileNet V1. Since MobileNetV2 has been published, here is an update on training MobileNetV2 on CIFAR-10:
1) MobileNetV2 is tuned primarily to work on ImageNet, with an initial image resolution of 224x224. It has 5 convolution operations with stride 2, so the GlobalAvgPool2D (penultimate layer) receives a feature map of size Cx7x7, where C is the number of filters (1280 for MobileNetV2).
2) For CIFAR-10, I changed the stride in the first three of these layers to 1, so the GlobalAvgPool2D receives a Cx8x8 feature map. Secondly, I trained with a width parameter of 0.25 (which scales the number of filters per layer). I trained with mixup in MXNet (https://gluon-cv.mxnet.io/model_zoo/classification.html). This gets me a validation accuracy of 93.27.
3) Another MobileNetV2 implementation that seems to work well for CIFAR-10 is available here - PyTorch-CIFAR.
The reported accuracy is 94.43. This implementation changes the stride to 1 in the first two of the original downsampling layers, and it uses the full channel width as used for ImageNet.
4) Further, I trained a MobileNetV2 on CIFAR-10 with mixup while only altering the stride in the first conv layer from 2 to 1, using the complete width (width parameter == 1.0), so the GlobalAvgPool2D (penultimate layer) receives a Cx2x2 feature map. This gets me an accuracy of 92.31.
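For reference, the width parameter is exposed as alpha in tf.keras. A minimal sketch of a thin MobileNetV2 on CIFAR-10, trained from scratch (note the stock application does not expose the per-layer stride changes described above; those require editing the architecture itself):

import tensorflow as tf
from tensorflow.keras import layers, Model

# MobileNetV2 at width multiplier 0.25, built for 32x32 CIFAR-10 inputs
base = tf.keras.applications.MobileNetV2(
    input_shape=(32, 32, 3), alpha=0.25, include_top=False, weights=None)
x = layers.GlobalAveragePooling2D()(base.output)
outputs = layers.Dense(10, activation='softmax')(x)
model = Model(base.input, outputs)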

Possible to train Tensorflow Inception V3 models with images greater than 299x299?

Is it possible to train a TensorFlow Inception V3 model with an image size greater than 299x299? It seems that the Inception V3 CNN is designed for this image size only.
As long as you do not include the fully connected (Dense) layers at the top, it should be fine to use a different image size.
You can do that by adding this argument while loading the model:
base_model = InceptionV3(weights=weights, include_top=False)
The convolutional layer weights are, in general, independent of the image size, so you can reuse them. The FC layer of the pre-trained network with n fully connected nodes would have a weight matrix of size [m x n], and that layer expects its input to be of size m. However, due to the change in image size, you will end up with a different value of m when you feed an image from the new dataset (the convolution filters convolve over a different image size).
After adding new dense layers, you can fine-tune the network by training the top layers alone (keeping the weights of the conv blocks below them fixed).
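Putting this together as a hedged tf.keras sketch (the 512x512 input size and the 10-class head are illustrative assumptions):

import tensorflow as tf
from tensorflow.keras import layers, Model

# Pretrained convolutional base, no fully connected top, larger input size
base_model = tf.keras.applications.InceptionV3(
    weights='imagenet', include_top=False, input_shape=(512, 512, 3), pooling='avg')
base_model.trainable = False    # keep the conv-block weights fixed
x = layers.Dense(256, activation='relu')(base_model.output)    # new dense layers
outputs = layers.Dense(10, activation='softmax')(x)    # hypothetical 10 classes
model = Model(base_model.input, outputs)
model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])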