Convolutional neural network doesn't classify test set keras - tensorflow

I have a 3D convolutional neural network (Keras, TensorFlow) and 3D brain images of people with advanced Alzheimer's, people with early Alzheimer's, and healthy people (3 classes). I have a training set of 324 images and a test set of 74 images. When I trained my CNN, I got about 65-70% training accuracy, but only 30-40% on the test set. When I used the test data as validation data, the training accuracy did not exceed 37% either, and the loss stayed at the same level the whole time. No matter which parameters I change, the result is the same. I load my prepared and normalized data from an .h5 file into Python, and the input has shape (None, 90, 120, 80, 1). I have no idea what may be wrong; I have checked the code many times and everything seems to be correct.
My CNN has 4 Conv3D layers, 3 max-pooling layers, ReLU activations with batch normalization, 3 dense layers with dropout, and a softmax output.
I appreciate any help or ideas.

If you only have 65-70% accuracy on your training data, that is really poor and indicates your neural network is not converging properly. Your network should be capable of at least overfitting the training data if the structure is complex enough, by effectively learning to hardcode the outputs from the small input sample. By the sound of it, your structure is complex enough.
The first thing to try is to reduce the learning rate by a factor of 10, and turn off validation/early stopping/normalisation/regularisation and any other ways to prevent overfitting. Then rinse, repeat - more iterations, each reducing the LR by a factor of 10 - until you can overfit the training data to where it gets close to 100% on the training data.
You can then work on putting in the proper early stopping, dropout, normalisation, regularisation etc to prevent overfitting with a learning rate you know works.
If the network still cannot overfit the training data no matter how small the learning rate is, then you have some issue with your NN structure.
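The "divide the learning rate by 10 until the training data is fit" loop can be sketched as follows. This is only an illustration under loud assumptions: a two-parameter linear model and a four-point dataset stand in for the 3D CNN and the brain images, and the names `train_loss_after` and `first_lr_that_fits` are hypothetical helpers, not Keras functions.

```python
import math

# Tiny stand-in training set (y = 2x + 1). A real model and dataset would go here.
XS = [0.0, 1.0, 2.0, 3.0]
YS = [1.0, 3.0, 5.0, 7.0]


def train_loss_after(lr, steps=2000):
    """Run plain gradient descent on MSE and return the final training loss."""
    w, b = 0.0, 0.0
    n = len(XS)
    for _ in range(steps):
        errors = [(w * x + b) - y for x, y in zip(XS, YS)]
        loss = sum(e * e for e in errors) / n
        if not math.isfinite(loss) or loss > 1e12:
            return math.inf  # diverged: the learning rate is too high
        grad_w = 2.0 * sum(e * x for e, x in zip(errors, XS)) / n
        grad_b = 2.0 * sum(errors) / n
        w -= lr * grad_w
        b -= lr * grad_b
    errors = [(w * x + b) - y for x, y in zip(XS, YS)]
    return sum(e * e for e in errors) / n


def first_lr_that_fits(start_lr=1.0, max_tries=6, target_loss=1e-6):
    """Divide the learning rate by 10 until the training set is fit."""
    lr = start_lr
    for _ in range(max_tries):
        if train_loss_after(lr) < target_loss:
            return lr
        lr /= 10.0
    return None  # nothing fit: suspect the model or the data, not the LR


print(first_lr_that_fits())
```

With these toy numbers the sweep rejects 1.0, where gradient descent diverges, and settles on 0.1. A `None` result corresponds to the last case above: no learning rate helps, so the structure or the data pipeline deserves a closer look.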

Related

Higher train set accuracy, Lower test set accuracy

I'm using a CNN to classify wireless signals.
I have run into a strange problem: when the training set accuracy is 80%, I get 79% test set accuracy, but when the training set accuracy is 93%, the test set accuracy falls to 71%. Has anyone had the same problem before?
My net is based on Keras + TensorFlow.
The details of the net are:
CNN(512,(2,2),tanh)
Batch_normalization
flatten()
DNN(512,elu)
DNN(256,elu)
DNN(128,softmax)
opt=adam
loss = mse
THANKS
This would appear to be a case of overfitting. How did you get the training accuracy to go from 80% to 93%? Was it just by running more epochs?
If overfitting is what is happening, add dropout layers to the model. This should improve the validation accuracy, but it may take more epochs to achieve the desired training accuracy. Another alternative is to use regularizers in the dense layers.
The more complex your model is, the more prone it is to overfitting, so you might try running the model with just two dense layers, or alternatively reduce the number of nodes in the hidden layers.
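For readers unsure what a dropout layer does during training, here is a minimal pure-Python sketch of "inverted" dropout. It is an illustration rather than the Keras implementation: the function name `dropout`, the rate of 0.5, and the all-ones activations are assumptions chosen for the example.

```python
import random


def dropout(activations, rate, rng):
    """Zero each activation with probability `rate`; rescale the survivors."""
    if not 0.0 <= rate < 1.0:
        raise ValueError("rate must be in [0, 1)")
    keep_prob = 1.0 - rate
    # Dividing by keep_prob keeps the expected activation unchanged.
    return [a / keep_prob if rng.random() < keep_prob else 0.0 for a in activations]


rng = random.Random(0)          # fixed seed so the run is repeatable
hidden = [1.0] * 10000          # hypothetical hidden-layer activations
dropped = dropout(hidden, rate=0.5, rng=rng)

kept_fraction = sum(1 for v in dropped if v != 0.0) / len(dropped)
mean_activation = sum(dropped) / len(dropped)
print(f"kept {kept_fraction:.2f} of units, mean activation {mean_activation:.2f}")
```

Because a different random subset of units is silenced on every step, the network cannot rely on any single unit, which is why training accuracy tends to rise more slowly once dropout is added.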

Small gap between train and test error, does that mean overfitting?

I am working on a dataset of 368609 samples and 34 features, and I want to use a neural network in Keras to predict latency (a real value). The model has 3 hidden layers with 1024 neurons each, and I have used dropout (50%) and L2 regularization (0.001) on each hidden layer. The problem is that I am getting a test mean absolute error of 3.5505 ms and a train mean absolute error of 3.4528 ms. The train error is smaller than the test error by a small gap. Does this mean that we have an overfitting problem here?
Not really, yet it is still always a good idea to see how your model is generalizing to new data.
Keep something between 10%-20% of your original dataset as a test set and try to predict the output for each record in the test set.
When we reuse the same validation set across many attempts at improving a model, we tend to overfit that validation set as well.
Keeping 3 separate datasets for training, validation and testing usually guards against this.
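A three-way split along these lines might look like the sketch below. The 15% validation and 15% test fractions, the fixed seed, and the helper name `three_way_split` are assumptions for illustration; the function returns index lists, so it works with any array-like dataset.

```python
import random


def three_way_split(n_samples, val_frac=0.15, test_frac=0.15, seed=0):
    """Shuffle sample indices once and cut them into train/val/test parts."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    n_test = round(n_samples * test_frac)
    n_val = round(n_samples * val_frac)
    test_idx = indices[:n_test]
    val_idx = indices[n_test:n_test + n_val]
    train_idx = indices[n_test + n_val:]
    return train_idx, val_idx, test_idx


train_idx, val_idx, test_idx = three_way_split(1000)
print(len(train_idx), len(val_idx), len(test_idx))
```

The validation part is used for tuning and early stopping; the test part is touched only once, at the end, so its error stays an honest estimate.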
If you get a high accuracy on your training set and a low accuracy on your test set, it often means that you are overfitting. So in your case: no, you are probably not overfitting.
Normally you would also have a validation set, so that you don't fit your model to the test set.
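To put the gap from the question in proportion, it can be expressed relative to the training error. The helper name `relative_gap` is hypothetical, and there is no universal threshold for what counts as "small"; this is just the arithmetic.

```python
def relative_gap(train_error, test_error):
    """Test-minus-train error, as a fraction of the train error."""
    return (test_error - train_error) / train_error


# Mean absolute errors (ms) quoted in the question.
gap = relative_gap(train_error=3.4528, test_error=3.5505)
print(f"relative gap: {gap:.1%}")
```

The test error is only about 2.8% higher than the train error, which is consistent with the "probably not overfitting" reading above.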

Overfitting DL model?

I am trying to build a Deep Learning model to pick out Tropical Cyclones in weather model data. I have collected the data, normalized it to the range [0, 1] and passed it to my early model. Then I plotted my loss and accuracy curves. I am getting weird curves: the validation loss starts increasing after ~50 epochs, indicating overfitting, but the validation accuracy is still increasing. Is my model overfitting (at around 50 epochs) or not?
These are classic overfitting graphs! You can recognize overfitting because even though training accuracy keeps increasing, validation accuracy does not. There are several approaches to prevent overfitting, too numerous to name in one answer. You could apply L1/L2 regularization, dropout, or artificially expand your training data (amongst others).
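One way validation loss can rise while validation accuracy also rises is a model that grows very confident on a few wrong predictions. The numbers below are invented purely to illustrate that mechanism for a binary classifier; they are not taken from the asker's curves, and the helper names are hypothetical.

```python
import math


def mean_cross_entropy(true_class_probs):
    """Average of -log(p), where p is the probability given to the true class."""
    return sum(-math.log(p) for p in true_class_probs) / len(true_class_probs)


def accuracy(true_class_probs, threshold=0.5):
    """Fraction of samples whose true class gets more than `threshold`."""
    return sum(1 for p in true_class_probs if p > threshold) / len(true_class_probs)


# Hypothetical probabilities assigned to the true class of four validation samples.
earlier_epoch = [0.6, 0.6, 0.4, 0.4]
later_epoch = [0.9, 0.9, 0.9, 0.01]   # one sample is now confidently wrong

for name, probs in (("earlier", earlier_epoch), ("later", later_epoch)):
    print(f"{name}: acc={accuracy(probs):.2f} loss={mean_cross_entropy(probs):.2f}")
```

Here accuracy goes from 0.50 to 0.75 while the mean loss goes from about 0.71 to about 1.23, because the single confidently wrong sample dominates the loss.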

diagnosis on training process of neural network

I am training an autoencoder DNN for a regression problem and need suggestions on how to improve the training process.
The total number of training samples is about 100,000. I use Keras to fit the model, setting validation_split = 0.1. After training, I plotted how the loss changed during training. As the plot shows, the validation loss is unstable and its mean value is very close to the training loss.
My question is: based on this, what is the next step I should try to improve the training process?
[Edit on 1/26/2019]
The details of network architecture are as follows:
It has 1 latent layer of 50 nodes. The input and output layers have 1000 nodes each. The activation of the hidden layer is ReLU. The loss function is MSE. For the optimizer, I use Adadelta with default parameter settings. I also tried setting lr=0.5, but got very similar results. The features of the data have been scaled to between -10 and 10, with a mean of 0.
Judging by the graph provided, the network could not approximate the function that relates the input to the output.
If your features differ widely in scale, with one of them large and the others very small, then you should normalize the feature vector.
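A minimal sketch of per-feature normalization (z-scoring each column) is given below. The helper name `standardize_columns` and the two-feature sample are assumptions for illustration; in practice the mean and standard deviation would normally be computed on the training data only and reused for validation and test data.

```python
import statistics


def standardize_columns(rows):
    """Rescale every feature column to zero mean and unit standard deviation."""
    columns = list(zip(*rows))
    means = [statistics.mean(c) for c in columns]
    # Fall back to 1.0 for constant columns to avoid dividing by zero.
    stds = [statistics.pstdev(c) or 1.0 for c in columns]
    return [[(v - m) / s for v, m, s in zip(row, means, stds)] for row in rows]


# Hypothetical data: one large-valued feature next to one tiny-valued feature.
raw = [[1000.0, 0.001], [2000.0, 0.002], [3000.0, 0.003]]
scaled = standardize_columns(raw)
for row in scaled:
    print([round(v, 3) for v in row])
```

After scaling, both columns cover the same range, so neither feature dominates the gradient just because of its units.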
For better training and testing results, you can follow these tips:
Use a small network. A network with one hidden layer is enough.
Apply an activation function in the input layer as well as the hidden layers; ReLU is a good choice. The output layer must have a linear activation.
Prefer a small learning rate such as 0.001. Use the RMSProp optimizer. It works fine on most regression problems.
Use mean squared error as the loss function if you are not already doing so.
Aim for slow and steady learning rather than fast learning.

Tensorflow: loss decreasing, but accuracy stable

My team is training a CNN in Tensorflow for binary classification of damaged/acceptable parts. We created our code by modifying the cifar10 example code. In my prior experience with Neural Networks, I always trained until the loss was very close to 0 (well below 1). However, we are now evaluating our model with a validation set during training (on a separate GPU), and it seems like the precision stopped increasing after about 6.7k steps, while the loss is still dropping steadily after over 40k steps. Is this due to overfitting? Should we expect to see another spike in accuracy once the loss is very close to zero? The current max accuracy is not acceptable. Should we kill it and keep tuning? What do you recommend? Here is our modified code and graphs of the training process.
https://gist.github.com/justineyster/6226535a8ee3f567e759c2ff2ae3776b
Precision and Loss Images
A decrease in binary cross-entropy loss does not imply an increase in accuracy. Consider a sample with label 1, predictions of 0.2, 0.4 and 0.6 at timesteps 1, 2 and 3, and a classification threshold of 0.5. Between timesteps 1 and 2 the loss decreases but the accuracy does not increase, because both predictions are still below the threshold.
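The numbers in this example can be checked directly. The sketch below uses the standard binary cross-entropy formula; the helper name `binary_cross_entropy` and the `results` list are just scaffolding for the illustration.

```python
import math


def binary_cross_entropy(label, prob):
    """Loss for one sample, with `prob` the predicted probability of class 1."""
    return -(label * math.log(prob) + (1 - label) * math.log(1 - prob))


label = 1
threshold = 0.5
results = []
for step, prob in enumerate([0.2, 0.4, 0.6], start=1):
    loss = binary_cross_entropy(label, prob)
    correct = (prob >= threshold) == bool(label)
    results.append((step, loss, correct))
    print(f"step {step}: p={prob:.1f} loss={loss:.3f} correct={correct}")
```

The loss falls at every step (about 1.609, 0.916, 0.511), but the prediction only becomes correct at step 3, when it crosses the threshold.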
Ensure that your model has enough capacity by overfitting the training data. If the model is overfitting the training data, avoid overfitting by using regularization techniques such as dropout, L1 and L2 regularization and data augmentation.
Last, confirm your validation data and training data come from the same distribution.
Here are my suggestions. One possible problem is that your network is starting to memorize the data, so yes, you should increase regularization.
Update:
Here I want to mention one more problem that may cause this:
The class balance in the validation set is far from what you have in the training set. As a first step, I would recommend trying to understand what your test data (the real-world data your model will face at inference time) looks like descriptively: its class balance and other similar characteristics. Then try to build a train/validation set with almost the same characteristics as the real data.
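Checking the class balance of each split takes only a few lines. The label lists below are made up to show a mismatch; the helper name `class_ratios` is hypothetical, and real labels would come from your own data loader.

```python
from collections import Counter


def class_ratios(labels):
    """Return the fraction of samples belonging to each class."""
    counts = Counter(labels)
    total = len(labels)
    return {cls: counts[cls] / total for cls in sorted(counts)}


# Hypothetical label lists for the two splits.
train_labels = ["acceptable"] * 80 + ["damaged"] * 20
val_labels = ["acceptable"] * 50 + ["damaged"] * 50

print("train:", class_ratios(train_labels))
print("val:  ", class_ratios(val_labels))
```

If the two printed ratios differ noticeably, as they do in this made-up case, resampling or re-splitting so that both sets resemble the real-world data is worth trying before further tuning.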
I faced a similar situation when I used a softmax function in the last layer instead of a sigmoid for binary classification.
My validation loss and training loss were decreasing, but the accuracy of both remained constant. This taught me why sigmoid is used for binary classification.
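The answer does not say how many units the last layer had. Assuming it was a single unit, which is one plausible reading, the constant accuracy has a simple explanation: softmax over one value always returns 1.0, whatever the input. The sketch below shows this next to a sigmoid; both functions are written out by hand for illustration.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    shift = max(logits)
    exps = [math.exp(z - shift) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def sigmoid(z):
    """Logistic function, mapping a single logit to a probability."""
    return 1.0 / (1.0 + math.exp(-z))


for z in (-3.0, 0.0, 3.0):
    # Assumption: the output layer has exactly one unit.
    print(f"logit={z:+.1f} softmax={softmax([z])[0]:.3f} sigmoid={sigmoid(z):.3f}")
```

With a single output unit the softmax prediction is stuck at 1.0, so the predicted class never changes, while the sigmoid output moves with the logit. A softmax layer with two units and one-hot labels would not have this problem.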