What is the difference between loss, accuracy, validation loss, and validation accuracy? - tensorflow

At the end of each epoch, I am getting for example the following output:
Epoch 1/25
2018-08-06 14:54:12.555511:
2/2 [==============================] - 86s 43s/step - loss: 6.0767 - acc: 0.0469 - val_loss: 4.1037 - val_acc: 0.2000
Epoch 2/25
2/2 [==============================] - 26s 13s/step - loss: 3.6901 - acc: 0.0938 - val_loss: 2.5610 - val_acc: 0.0000e+00
Epoch 3/25
2/2 [==============================] - 66s 33s/step - loss: 3.1491 - acc: 0.1406 - val_loss: 2.4793 - val_acc: 0.0500
Epoch 4/25
2/2 [==============================] - 44s 22s/step - loss: 3.0686 - acc: 0.0694 - val_loss: 2.3159 - val_acc: 0.0500
Epoch 5/25
2/2 [==============================] - 62s 31s/step - loss: 2.5884 - acc: 0.1094 - val_loss: 2.4601 - val_acc: 0.1500
Epoch 6/25
2/2 [==============================] - 41s 20s/step - loss: 2.7708 - acc: 0.1493 - val_loss: 2.2542 - val_acc: 0.4000
...
Can anyone explain the difference between loss, accuracy, validation loss, and validation accuracy?

When you pass validation_split as a fit parameter while fitting a DL model, Keras splits the data once, before training starts, into two parts: training data and validation data.
It trains the model on the training data and evaluates it on the validation data at the end of every epoch, reporting the validation loss and accuracy.
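For illustration, a minimal sketch of using validation_split (assuming model is an already compiled Keras model and x, y are NumPy arrays):
# a minimal sketch: 20% of the data is held out once, before training
# starts, and reused as the validation set at the end of every epoch
history = model.fit(x, y, epochs=25, batch_size=64, validation_split=0.2)
# history.history contains per-epoch 'loss', 'acc', 'val_loss', 'val_acc'
# (exact metric names vary between Keras versions)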
Usually, as epochs go by, the training loss decreases and the training accuracy increases. With val_loss and val_acc, several cases are possible:
val_loss starts increasing and val_acc starts decreasing: the model is memorizing values rather than learning.
val_loss starts increasing while val_acc also increases: this could be overfitting, or diverse probability values in cases where softmax is used in the output layer.
val_loss starts decreasing and val_acc starts increasing: correct; the model you built is learning and working fine.
For more detail, this related answer is also worth reading: How to interpret "loss" and "accuracy" for a machine learning model
I have tried to explain this at https://www.javacodemonk.com/difference-between-loss-accuracy-validation-loss-validation-accuracy-when-training-deep-learning-model-with-keras-ff358faa

In your model.compile call you have defined a loss function and a metrics function.
Your "loss" is the value of your loss function (unknown, as you do not show your code).
Your "acc" is the value of your metrics (in this case, accuracy).
The val_* values simply mean that the corresponding metric was computed on your validation data.
Only the loss function is used to update your model's parameters; the accuracy is only there for you to see how well your model is doing.
You should seek to minimize your loss and maximize your accuracy.
Ideally, your validation results and your training results should be similar (although some difference is expected).
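As a hedged illustration (the OP's actual loss and metrics are unknown), the compile step being referred to might look like this:
# a sketch of a typical compile call; loss/optimizer/metrics are placeholders
model.compile(optimizer='adam',
              loss='categorical_crossentropy',  # drives the weight updates
              metrics=['accuracy'])             # reported only, never optimized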

Here is another answer worth noting:
val_loss is the value of the cost function for your cross-validation data, and loss is the value of the cost function for your training data.
https://datascience.stackexchange.com/a/25269

Related

Is the validation dataset initialized/created every epoch during the training process?

Setup:
U-Net network is trained to process small patches (e.g. 64x64 pixels).
The network is fed with a training dataset and validation dataset using Tensorflow Dataset API.
Small patches are generated by randomly sampling much larger images.
The sampling of image patches takes place during the training process (both training and validation image patches are cropped on the fly).
Tensorflow 2.1 (eager execution mode)
Both training and validation datasets are the same:
dataset = tf.data.Dataset.from_tensor_slices((large_images, large_targets))
dataset = dataset.shuffle(buffer_size=num_large_samples)
dataset = dataset.map(get_patches_from_large_images, num_parallel_calls=num_parallel_calls)
dataset = dataset.unbatch()
dataset = dataset.shuffle(buffer_size=num_small_patches)
dataset = dataset.batch(patches_batch_size)
dataset = dataset.prefetch(1)
dataset = dataset.repeat()
Function get_patches_from_large_images samples a predefined number of small patches from a single large image using tf.image.random_crop. There are two nested loops, for and while. The outer for loop is responsible for generating the predefined number of small patches, and the while loop checks whether a patch randomly generated by tf.image.random_crop meets some predefined criteria (e.g. patches containing only background should be discarded). The while loop gives up if it is not able to generate a proper patch within some predefined number of iterations, so we do not get stuck in the loop. This approach is based on the solution presented here.
# accumulate patches across loop iterations
patches = []
for i in range(number_of_patches_from_one_large_image):
    num_tries = 0
    # retry random cropping until a patch meets the criteria, or give up
    while num_tries < max_num_tries_before_giving_up:
        patch = tf.image.random_crop(
            large_input_and_target_image, [patch_size, patch_size, 2])
        if patch_meets_some_criteria:
            break
        num_tries = num_tries + 1
    patches.append(patch)
Experiment:
The training and validation datasets feeding the model are the same (5 large pairs of input-target images), and both datasets produce exactly the same number of small patches from a single large image.
batch_size for training and validation is the same and equals 50 image patches.
steps_per_epoch and validation_steps are equal (20 batches).
When training is run with validation_freq=5:
unet_model.fit(dataset_train, epochs=10, steps_per_epoch=20, validation_data = dataset_val, validation_steps=20, validation_freq=5)
Train for 20 steps, validate for 20 steps
Epoch 1/10
20/20 [==============================] - 44s 2s/step - loss: 0.6771 - accuracy: 0.9038
Epoch 2/10
20/20 [==============================] - 4s 176ms/step - loss: 0.4952 - accuracy: 0.9820
Epoch 3/10
20/20 [==============================] - 4s 196ms/step - loss: 0.0532 - accuracy: 0.9916
Epoch 4/10
20/20 [==============================] - 4s 194ms/step - loss: 0.0162 - accuracy: 0.9942
Epoch 5/10
20/20 [==============================] - 42s 2s/step - loss: 0.0108 - accuracy: 0.9966 - val_loss: 0.0081 - val_accuracy: 0.9975
Epoch 6/10
20/20 [==============================] - 1s 36ms/step - loss: 0.0074 - accuracy: 0.9978
Epoch 7/10
20/20 [==============================] - 4s 175ms/step - loss: 0.0053 - accuracy: 0.9985
Epoch 8/10
20/20 [==============================] - 3s 169ms/step - loss: 0.0034 - accuracy: 0.9992
Epoch 9/10
20/20 [==============================] - 3s 171ms/step - loss: 0.0023 - accuracy: 0.9995
Epoch 10/10
20/20 [==============================] - 43s 2s/step - loss: 0.0016 - accuracy: 0.9997 - val_loss: 0.0013 - val_accuracy: 0.9998
We can see that the first epoch and the epochs with validation (every 5th epoch) took much more time than the epochs without validation. The same experiment, but this time with validation run every epoch, gives the following result:
history = unet_model.fit(dataset_train, epochs=10, steps_per_epoch=20, validation_data = dataset_val, validation_steps=20)
Train for 20 steps, validate for 20 steps
Epoch 1/10
20/20 [==============================] - 84s 4s/step - loss: 0.6775 - accuracy: 0.8971 - val_loss: 0.6552 - val_accuracy: 0.9542
Epoch 2/10
20/20 [==============================] - 41s 2s/step - loss: 0.5985 - accuracy: 0.9833 - val_loss: 0.4677 - val_accuracy: 0.9951
Epoch 3/10
20/20 [==============================] - 43s 2s/step - loss: 0.1884 - accuracy: 0.9950 - val_loss: 0.0173 - val_accuracy: 0.9948
Epoch 4/10
20/20 [==============================] - 44s 2s/step - loss: 0.0116 - accuracy: 0.9962 - val_loss: 0.0087 - val_accuracy: 0.9969
Epoch 5/10
20/20 [==============================] - 44s 2s/step - loss: 0.0062 - accuracy: 0.9979 - val_loss: 0.0051 - val_accuracy: 0.9983
Epoch 6/10
20/20 [==============================] - 45s 2s/step - loss: 0.0039 - accuracy: 0.9989 - val_loss: 0.0033 - val_accuracy: 0.9991
Epoch 7/10
20/20 [==============================] - 44s 2s/step - loss: 0.0025 - accuracy: 0.9994 - val_loss: 0.0023 - val_accuracy: 0.9995
Epoch 8/10
20/20 [==============================] - 44s 2s/step - loss: 0.0019 - accuracy: 0.9996 - val_loss: 0.0017 - val_accuracy: 0.9996
Epoch 9/10
20/20 [==============================] - 44s 2s/step - loss: 0.0014 - accuracy: 0.9997 - val_loss: 0.0013 - val_accuracy: 0.9997
Epoch 10/10
20/20 [==============================] - 45s 2s/step - loss: 0.0012 - accuracy: 0.9998 - val_loss: 0.0011 - val_accuracy: 0.9998
Question:
In the first example, we can see that the initialization/creation of the training dataset (dataset_train) took about 40s. However, subsequent epochs (without validation) were shorter and took about 4s. Nevertheless, the duration was extended again to about 40 seconds for the epochs with the validation step. The validation dataset (dataset_val) is exactly the same as the training dataset (dataset_train), so the procedure of its creation/initialization took about 40s. However, I am surprised that every validation step is this time-consuming. I expected the first validation to take 40s, but the subsequent ones should take about 4s. I thought that the validation dataset would behave like the training dataset, so the first fetch would take long but subsequent ones would be much shorter. Am I right, or am I missing something?
Update:
I have checked that creating the iterator from the dataset takes about 40s:
dataset_val_it = iter(dataset_val) #40s
If we look inside the fit function, we will see that a data_handler object is created once for the whole training, and it returns the data iterator that is used in the main loop of the training process. The iterator is created by calling the function enumerate_epochs. When the fit function wants to perform validation, it calls the evaluate function, and whenever evaluate is called it creates a new data_handler object. It then calls the enumerate_epochs function, which in turn creates the iterator from the dataset. Unfortunately, in the case of complicated datasets, this process is time-consuming.
If you just want a quick fix to speed up your input pipeline, you can try caching the elements of the validation dataset, for example:
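A hedged sketch of that idea (validation_steps is the number of validation batches, 20 in this experiment; cache() must come before repeat(), and the cached part must be finite):
# hypothetical quick fix: cache the validation batches so the expensive
# patch-generation pipeline only runs during the first validation pass
dataset_val = dataset_val.take(validation_steps)  # make the dataset finite
dataset_val = dataset_val.cache()                 # reuse batches after the first pass
dataset_val = dataset_val.repeat()
# note: validation then sees the same cached patches every time,
# instead of freshly sampled ones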
I've never dug very deeply into the tf.data code, but you seem to have a point here. I think it could be interesting to open an issue on GitHub for this.

Keras: Validation accuracy stays the exact same but validation loss decreases

I know that the problem can't be with the dataset because I've seen other projects use the same dataset.
Here is my data preprocessing code:
import pandas as pd
from sklearn.model_selection import train_test_split
from tensorflow.keras.preprocessing.text import Tokenizer
from tensorflow.keras.preprocessing.sequence import pad_sequences

dataset = pd.read_csv('political_tweets.csv')
dataset.head()
dataset = pd.read_csv('political_tweets.csv')["tweet"].values
y_train = pd.read_csv('political_tweets.csv')["dem_or_rep"].values

x_train, x_test, y_train, y_test = train_test_split(dataset, y_train, test_size=0.1)

max_words = 10000
print(max_words)
max_len = 25

tokenizer = Tokenizer(num_words=max_words, filters='!"#$%&()*+,-./:;<=>?#[\\]^_`{|}~\t\n1234567890', lower=False, oov_token="<OOV>")
tokenizer.fit_on_texts(x_train)
x_train = tokenizer.texts_to_sequences(x_train)
x_train = pad_sequences(x_train, max_len, padding='post', truncating='post')

# note: calling fit_on_texts again on the test set extends the vocabulary
# that was learned from the training set
tokenizer.fit_on_texts(x_test)
x_test = tokenizer.texts_to_sequences(x_test)
x_test = pad_sequences(x_test, max_len, padding='post', truncating='post')
And my model:
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Embedding, Bidirectional, GRU, GlobalMaxPooling1D, Dense, Dropout
from tensorflow.keras.optimizers import RMSprop
from tensorflow.keras import regularizers

model = Sequential([
    Embedding(max_words + 1, 64, input_length=max_len),
    Bidirectional(GRU(64, return_sequences=True), merge_mode='concat'),
    GlobalMaxPooling1D(),
    Dense(64, kernel_regularizer=regularizers.l2(0.02)),
    Dropout(0.5),
    Dense(1, activation='sigmoid'),
])
model.summary()
model.compile(loss='binary_crossentropy', optimizer=RMSprop(learning_rate=0.0001), metrics=['accuracy'])
model.fit(x_train, y_train, batch_size=128, epochs=500, verbose=1, shuffle=True, validation_data=(x_test, y_test))
Both of my losses decrease and my training accuracy increases, but the validation accuracy stays at 50% (which is awful, considering I am building a binary classification model).
Epoch 1/500
546/546 [==============================] - 35s 64ms/step - loss: 1.7385 - accuracy: 0.5102 - val_loss: 1.2458 - val_accuracy: 0.5102
Epoch 2/500
546/546 [==============================] - 34s 62ms/step - loss: 0.9746 - accuracy: 0.5137 - val_loss: 0.7886 - val_accuracy: 0.5102
Epoch 3/500
546/546 [==============================] - 34s 62ms/step - loss: 0.7235 - accuracy: 0.5135 - val_loss: 0.6943 - val_accuracy: 0.5102
Epoch 4/500
546/546 [==============================] - 34s 62ms/step - loss: 0.6929 - accuracy: 0.5135 - val_loss: 0.6930 - val_accuracy: 0.5102
Epoch 5/500
546/546 [==============================] - 34s 62ms/step - loss: 0.6928 - accuracy: 0.5135 - val_loss: 0.6931 - val_accuracy: 0.5102
Epoch 6/500
546/546 [==============================] - 34s 62ms/step - loss: 0.6927 - accuracy: 0.5135 - val_loss: 0.6931 - val_accuracy: 0.5102
Epoch 7/500
546/546 [==============================] - 37s 68ms/step - loss: 0.6925 - accuracy: 0.5136 - val_loss: 0.6932 - val_accuracy: 0.5106
Epoch 8/500
546/546 [==============================] - 34s 63ms/step - loss: 0.6892 - accuracy: 0.5403 - val_loss: 0.6958 - val_accuracy: 0.5097
Epoch 9/500
546/546 [==============================] - 35s 63ms/step - loss: 0.6815 - accuracy: 0.5633 - val_loss: 0.7013 - val_accuracy: 0.5116
Epoch 10/500
546/546 [==============================] - 34s 63ms/step - loss: 0.6747 - accuracy: 0.5799 - val_loss: 0.7096 - val_accuracy: 0.5055
I've seen other posts on this topic, and they say to add dropout, use cross-entropy, decrease the learning rate, etc. I have done all of this and none of it works.
Any help is greatly appreciated.
Thanks in advance!
A couple of observations for your problem:
Though not particularly familiar with the dataset, I trust that it is used in many circumstances without problems. However, you could check its balance. train_test_split() has a parameter called stratify which, if fed the ys, ensures that each class is proportionally represented in both the training and test sets (see the first sketch after this list).
Your phenomenon with validation loss and validation accuracy is not out of the ordinary. Imagine that in the first epochs, the neural network predicts some ground-truth positive examples (GT == 1) with 55% confidence. As training advances, the network learns better and becomes 90% confident for a ground-truth positive example. Since the threshold for calculating accuracy is 50%, in both situations the accuracy is the same; nevertheless, the loss has changed significantly, since 90% >> 55% (see the second sketch after this list).
Your training seems to advance (slowly but surely). Have you considered using Adam as an off-the-shelf optimizer?
If the low accuracy persists over some epochs, you may very well be suffering from a well-known phenomenon called underfitting, in which your model is unable to capture the dependencies in your data. To mitigate or avoid underfitting, you may want to use a more complex model, e.g. two stacked LSTMs/GRUs.
At this stage, remove the Dropout() layer, since you have underfitting, not overfitting.
Decrease the batch_size. A very big batch_size can lead to local minima, rendering your network unable to learn and generalize properly.
If none of these work, try starting with a lower learning rate, say 0.00001 instead of 0.0001.
Reiterate over the dataset preprocessing steps. Ensure the sentences are converted properly.
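For the first observation, a minimal sketch of the stratified split (the same call as in the question, plus stratify):
# stratified split: class proportions are preserved in both subsets
x_train, x_test, y_train, y_test = train_test_split(
    dataset, y_train, test_size=0.1, stratify=y_train)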
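For the second observation, a small worked example, assuming binary cross-entropy:
import numpy as np
# binary cross-entropy for a single positive example (y_true = 1) is -log(p)
print(-np.log(0.55))  # ~0.598: prediction 0.55 counts as correct, loss is high
print(-np.log(0.90))  # ~0.105: same accuracy, much lower loss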
I have had a similar issue, and I think it might be because dropout is directly before the output layer. Try moving it one layer earlier.

Discrepancy in validation accuracy and validation loss during training

I'm training my CNN model on 14k+ images for 30 epochs, and in the 28th epoch I see an abnormal validation accuracy and loss, as shown below:
- 67s - loss: 0.0767 - acc: 0.9750 - val_loss: 0.6755 - val_acc: 0.8412
Epoch 27/30
- 67s - loss: 0.1039 - acc: 0.9630 - val_loss: 0.3671 - val_acc: 0.9018
Epoch 28/30
- 67s - loss: 0.0639 - acc: 0.9775 - val_loss: 1.1921e-07 - val_acc: 0.1190
Epoch 29/30
- 66s - loss: 0.0767 - acc: 0.9744 - val_loss: 0.8091 - val_acc: 0.8306
Epoch 30/30
- 66s - loss: 0.0534 - acc: 0.9815 - val_loss: 0.2091 - val_acc: 0.9433
Can anyone explain why this happened?
To me it looks like overfitting. Your training loss is approaching zero and your training accuracy approaches 100%, whereas the validation loss and accuracy jump around.
I would recommend you tune your regularization (dropout, l1/l2, data augmentation, ...) or your model capacity.
Usually it's good practice to have a high-capacity model with tuned regularization; a sketch of what that can look like follows.
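A hedged sketch in Keras (all layer sizes, rates, and the input shape below are placeholder values, not taken from the question):
from tensorflow.keras import layers, models, regularizers
# placeholder high-capacity CNN with l2 weight decay and dropout
model = models.Sequential([
    layers.Conv2D(64, 3, activation='relu', input_shape=(64, 64, 3),
                  kernel_regularizer=regularizers.l2(1e-4)),
    layers.MaxPooling2D(),
    layers.Conv2D(128, 3, activation='relu',
                  kernel_regularizer=regularizers.l2(1e-4)),
    layers.MaxPooling2D(),
    layers.Flatten(),
    layers.Dropout(0.5),
    layers.Dense(10, activation='softmax'),
])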
Like Arkady A. said, it strongly looks like overfitting. It means your model memorized the images, so your training accuracy rises and rises, but on the validation data you get bad results.
Example: you memorize that 2*8=16 without understanding how multiplication really works. So for the question 2*8 you answer 16, but for 2*9=? you don't know what the answer is.
How to avoid it:
Use strong image augmentation, e.g. imgaug or Augmentor.
Use a Dropout layer.
Plot two accuracy graphs per training run, one for the training data and one for the validation data. Normally both graphs go up at the beginning, and after some epoch X the validation graph starts to jump around or decrease; this epoch (or epoch X-1) is your last good state.
Use more metrics, like ROC AUC.
Use an EarlyStopping callback monitoring val_acc, as in the sketch below.
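A minimal sketch of that callback (model, x_train, y_train, x_val, and y_val are assumed to exist; the metric name may be 'val_acc' or 'val_accuracy' depending on your Keras version, so match whatever your training log prints):
from tensorflow.keras.callbacks import EarlyStopping
# stop when validation accuracy stops improving, roll back to the best epoch
early_stop = EarlyStopping(monitor='val_acc', patience=5,
                           restore_best_weights=True)
model.fit(x_train, y_train, validation_data=(x_val, y_val),
          epochs=30, callbacks=[early_stop])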

Keras BatchNormalization, differing results in training and evaluation on the training dataset

I am training a CNN; for the sake of debugging my problem, I am working on a small subset of the actual training data.
During training, the loss and accuracy seem very reasonable and pretty good. (In the example I used the same small subset for validation; the problem already shows up there.)
Fit on x_train and validate on x_train, using batch_size=32
Epoch 10/10
1/10 [==>...........................] - ETA: 2s - loss: 0.5126 - acc: 0.7778
2/10 [=====>........................] - ETA: 1s - loss: 0.3873 - acc: 0.8576
3/10 [========>.....................] - ETA: 1s - loss: 0.3447 - acc: 0.8634
4/10 [===========>..................] - ETA: 1s - loss: 0.3320 - acc: 0.8741
5/10 [==============>...............] - ETA: 0s - loss: 0.3291 - acc: 0.8868
6/10 [=================>............] - ETA: 0s - loss: 0.3485 - acc: 0.8848
7/10 [====================>.........] - ETA: 0s - loss: 0.3358 - acc: 0.8879
8/10 [=======================>......] - ETA: 0s - loss: 0.3315 - acc: 0.8863
9/10 [==========================>...] - ETA: 0s - loss: 0.3215 - acc: 0.8885
10/10 [==============================] - 3s - loss: 0.3106 - acc: 0.8863 - val_loss: 1.5021 - val_acc: 0.2707
When I evaluate on the same training dataset, however, the accuracy is far off from what I saw during training (I would expect it to be at least as good as during training on the same dataset).
When evaluating straightforwardly, or using
K.set_learning_phase(0)
I get results similar to the validation (evaluating on x_train using batch_size=32):
Evaluation Accuracy: 0.266318537392, Loss: 1.50756853772
If I set the backend to the learning phase, the results get pretty good again, so the per-batch normalization seems to work well. I suspect that the accumulated moving mean and variance are not being used properly.
So after
K.set_learning_phase(1)
I get (Evaluating on x_train using batch_size=32):
Evaluation Accuracy: 0.887728457507, Loss: 0.335956037511
I added the BatchNormalization layer after the first convolutional layer, like this:
model = models.Sequential()
model.add(Conv2D(80, first_conv_size, strides=2, activation="relu", input_shape=input_shape, padding=padding_name))
model.add(BatchNormalization(axis=-1))
model.add(MaxPooling2D(first_max_pool_size, strides=4, padding=padding_name))
...
Further down the line I would also have some dropout layers, which I removed to investigate the BatchNormalization behavior. My intent is to use the model in the non-training phase for normal prediction.
Shouldn't it work like that, or am I missing some additional configuration?
Thanks!
I'm using Keras 2.0.8 with TensorFlow 1.1.0 (Anaconda).
This is really annoying. When you set learning_phase to True, a BatchNormalization layer computes its normalization statistics directly from the current batch, which might be a problem when you have a small batch_size. I came across a similar issue some time ago; here is my solution:
When building the model, add an option for whether it will be used in the learning phase or not, and in the variant used in the learning phase use the following class instead of BatchNormalization:
class NonTrainableBatchNormalization(BatchNormalization):
    """
    This class makes it possible to freeze batch normalization
    while Keras is in the training phase.
    """
    def call(self, inputs, training=None):
        # always use the accumulated moving statistics, never batch statistics
        return super(NonTrainableBatchNormalization, self).call(
            inputs, training=False)
Once you have trained your model, copy its weights into the NonTrainable variant:
learning_phase_model.set_weights(learned_model.get_weights())
Now you can fully enjoy using BatchNormalization in a learning_phase.

Possible explanations for loss increasing?

I've got a dataset of 40k images from four different countries. The images contain diverse subjects: outdoor scenes, city scenes, menus, etc. I wanted to use deep learning to geotag images.
I started with a small network of 3 conv->relu->pool layers and then added 3 more to deepen the network, since the learning task is not straightforward.
My loss is doing this (with both the 3- and 6-layer networks):
The loss actually starts kind of smooth and declines for a few hundred steps, but then starts creeping up.
What are the possible explanations for my loss increasing like this?
My initial learning rate is set very low, 1e-6, but I've tried 1e-3, 1e-4, and 1e-5 as well. I have sanity-checked the network design on a tiny dataset of two classes with class-distinct subject matter, and the loss continually declines as desired. Training accuracy hovers at ~40%.
I would normally say your learning rate is too high; however, it looks like you have ruled that out. You should check the magnitude of the numbers coming into and out of the layers. You can use tf.Print to do so, as in the sketch below. Maybe you are somehow inputting a black image by accident, or you can find the layer where the numbers go crazy.
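A hedged sketch of such a probe (TF1-style, matching the tf.Print suggestion; activations is a placeholder for whatever tensor you want to inspect):
import tensorflow as tf
# log the mean absolute value of a tensor every time it is evaluated
activations = tf.Print(activations,
                       [tf.reduce_mean(tf.abs(activations))],
                       message='mean |activation|: ')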
Also, how are you calculating the cross-entropy? You might want to add a small epsilon inside the log, since its value will go to infinity as its input approaches zero. Better yet, use the tf.nn.sparse_softmax_cross_entropy_with_logits(...) function, which takes care of numerical stability for you:
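A minimal sketch under assumed names (logits are the raw, un-softmaxed network outputs of shape [batch, num_classes]; labels are integer class ids of shape [batch]):
import tensorflow as tf
# softmax and log are fused inside the op for numerical stability,
# so no hand-written log(softmax + epsilon) is needed
loss = tf.reduce_mean(
    tf.nn.sparse_softmax_cross_entropy_with_logits(
        labels=labels, logits=logits))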
Since the cost is so high for your cross-entropy, it sounds like the network is outputting almost all zeros (or values close to zero). Since you did not post any code, I cannot say why; I think you may just be zeroing something out in the cost-function calculation by accident.
I was also facing this problem; I was using the Keras library (TensorFlow backend).
Epoch 00034: saving model to /home/ubuntu/temp/trained_data1/final_dev/final_weights-improvement-34-0.627.hdf50
Epoch 35/150
226160/226160 [==============================] - 65s 287us/step - loss: 0.2870 - acc: 0.9331 - val_loss: 2.7904 - val_acc: 0.6193
Epoch 36/150
226160/226160 [==============================] - 65s 288us/step - loss: 0.2813 - acc: 0.9331 - val_loss: 2.7907 - val_acc: 0.6268
Epoch 00036: saving model to /home/ubuntu/temp/trained_data1/final_dev/final_weights-improvement-36-0.627.hdf50
Epoch 37/150
226160/226160 [==============================] - 65s 286us/step - loss: 0.2910 - acc: 0.9330 - val_loss: 2.5704 - val_acc: 0.6327
Epoch 38/150
226160/226160 [==============================] - 65s 287us/step - loss: 0.2982 - acc: 0.9321 - val_loss: 2.5147 - val_acc: 0.6415
Epoch 00038: saving model to /home/ubuntu/temp/trained_data1/final_dev/final_weights-improvement-38-0.642.hdf50
Epoch 39/150
226160/226160 [==============================] - 68s 301us/step - loss: 0.2968 - acc: 0.9318 - val_loss: 2.7375 - val_acc: 0.6409
Epoch 40/150
226160/226160 [==============================] - 68s 299us/step - loss: 0.3124 - acc: 0.9298 - val_loss: 2.8359 - val_acc: 0.6047
Epoch 00040: saving model to /home/ubuntu/temp/trained_data1/final_dev/final_weights-improvement-40-0.605.hdf50
Epoch 41/150
226160/226160 [==============================] - 65s 287us/step - loss: 0.2945 - acc: 0.9315 - val_loss: 3.5825 - val_acc: 0.5321
Epoch 42/150
226160/226160 [==============================] - 65s 287us/step - loss: 0.3214 - acc: 0.9278 - val_loss: 2.5816 - val_acc: 0.6444
When I inspected my model, it consisted of too many neurons.
In short, the model was overfitting.
I decreased the number of neurons in the two dense layers (from 300 neurons to 200 neurons).
This may be useful for somebody out there who is facing similar issues. The OP is using TensorFlow, in which case this is irrelevant and not the answer, but if you ever implement neural networks and backpropagation by hand (for example, for coursework or assignments), make sure that you are subtracting the gradient from the parameters and not adding it. It is an easy mistake to make: adding the gradient moves away from the nearest minimum and thus increases the loss. I had this very problem only about 30 minutes before writing this.
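For the record, a minimal sketch of the hand-written update rule (params, grads, and lr are hypothetical names):
# gradient descent must step *against* the gradient
for p, g in zip(params, grads):
    p -= lr * g    # correct: decreases the loss
    # p += lr * g  # wrong: climbs the loss surface, so the loss increases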