Accuracy in history dictionary different from what is printed on screen - tensorflow

When training a model in Keras, the accuracies printed on screen at every epoch are different from those saved in the history object. For example (minimal test, output compacted):
history = model.fit(...)
Epoch 1/5
156/156 [===] - loss: 0.6325 - accuracy: 0.7700 - val_loss: 0.4330 - val_accuracy: 0.8156
Epoch 2/5
156/156 [===] - loss: 0.3855 - accuracy: 0.8538 - val_loss: 0.4692 - val_accuracy: 0.8050
Epoch 3/5
156/156 [===] - loss: 0.3918 - accuracy: 0.8427 - val_loss: 0.4666 - val_accuracy: 0.7861
Epoch 4/5
156/156 [===] - loss: 0.3820 - accuracy: 0.8461 - val_loss: 0.4101 - val_accuracy: 0.8014
Epoch 5/5
156/156 [===] - loss: 0.3927 - accuracy: 0.8492 - val_loss: 0.4092 - val_accuracy: 0.7979
Then (rounding to match the printed values for convenience):
>>> [round(x, 4) for x in history.history['accuracy']]
[0.8184, 0.8474, 0.8484, 0.8488, 0.8476]
>>> [round(x, 4) for x in history.history['val_accuracy']]
[0.8156, 0.805, 0.7861, 0.8014, 0.7979]
As you can see, while the validation accuracies match the printed values, the training accuracies do not (tested both in Colab with a GPU and on a local PC with a CPU, using Keras 2.4.0 and TensorFlow 2.4.1).
This is a problem if, for example, you want to save the results of multiple runs to a file. What am I getting wrong?
EDIT: here is an example that reproduces the problem, slightly modified from the TF MNIST quickstart. See the block right after the call to model.fit().
https://colab.research.google.com/drive/14Uogeq8wRlZlinaKLbkFr_Bl2aLzUJuy?usp=sharing
EDIT 2: as suggested by another user, I submitted a bug issue here: https://github.com/tensorflow/tensorflow/issues/48408

I used your Colab and was able to reproduce your issue. Yes, this looks like a genuine bug. I tested the code in both CPU and GPU mode with TF 2.0, 2.1, and 2.3 without any issue, but the problem does occur in TF 2.4 and tf-nightly.
I would suggest you raise a bug issue on the TensorFlow GitHub and share cross-links here and there so that others can follow the updates. In the meantime, you can roll back to TF 2.3. However, I didn't check whether callbacks.CSVLogger is also affected in the latest release; you may want to check that too, for example by logging the per-epoch metrics to a file and comparing them with history.history, as sketched below.
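A minimal sketch of that comparison, reusing the model and data names from the question (the file name metrics.csv and the x_train/y_train/x_val/y_val names are only placeholders):
import tensorflow as tf
# Write the per-epoch metrics to a CSV file in addition to the progress bar,
# so they can be cross-checked against history.history after training.
csv_logger = tf.keras.callbacks.CSVLogger("metrics.csv", append=False)
history = model.fit(
    x_train, y_train,
    validation_data=(x_val, y_val),
    epochs=5,
    callbacks=[csv_logger],
)
# Values to compare against both the progress bar and metrics.csv.
print([round(x, 4) for x in history.history["accuracy"]])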

Related

Tensorflow binary classification training loss won't decrease, accuracy stuck at around 50%

I'm fairly new to this, and could use some advice on where to go from here.
I'm using tensorflow 2.3.0 with keras to build a binary classification model. I am unable to share the dataset since it's proprietary data owned by my company, but the features are all numerical financial data, representing a sort of histogram for a customer.
I've tried two models, one with 300 features and one with 600, the one with 600 simply representing a longer history. The features are normalized first, and the labels are all 0 or 1, to indicate whether the account should be flagged or not.
I have 500,000 training samples, and 60,000 test samples. The 0/1 label split is roughly half.
This is the code I have currently:
import pandas as pd
import numpy as np
# Make numpy values easier to read.
np.set_printoptions(precision=3, suppress=True)
import tensorflow as tf
from tensorflow.keras import layers
from tensorflow.keras import utils
features = pd.read_csv('train.csv')
labels = np.array(features.pop('target'))
features = np.array(features)
num_features = features.shape[1]
features = utils.normalize(features)
model = tf.keras.Sequential([
    layers.Dense(512, activation='relu', input_shape=(num_features,)),
    layers.Dropout(0.5),
    layers.Dense(512, activation='relu'),
    layers.Dropout(0.5),
    layers.Dense(512, activation='relu'),
    layers.Dropout(0.5),
    layers.Dense(1, activation='sigmoid')
])
model.compile(loss=tf.losses.BinaryCrossentropy(),
              optimizer=tf.optimizers.Adam(learning_rate=0.001),
              metrics=['accuracy'])
model.fit(features, labels, epochs=100)
This is likely the wrong topology; it's just my most recent attempt. I have tried a few different topologies, ranging from tiny single-layer networks with a small number of units to what you see here. I have tried different learning rates and epochs, with and without dropout. All of them give basically this same pattern:
Epoch 1/100
15625/15625 [==============================] - 46s 3ms/step - loss: 0.6932 - accuracy: 0.5113
Epoch 2/100
15625/15625 [==============================] - 46s 3ms/step - loss: 0.6929 - accuracy: 0.5127
Epoch 3/100
15625/15625 [==============================] - 46s 3ms/step - loss: 0.6929 - accuracy: 0.5135
Epoch 4/100
15625/15625 [==============================] - 47s 3ms/step - loss: 0.6928 - accuracy: 0.5142
Epoch 5/100
15625/15625 [==============================] - 48s 3ms/step - loss: 0.6928 - accuracy: 0.5138
The loss essentially flatlines here and the accuracy hovers around this point. If I use a very high learning rate the loss starts high but eventually flatlines around this same point.
To test if the model is working at all, I tried with a very small subset of the data (only 5 rows or so), and it quickly reduces the loss to near zero with 100% accuracy, which of course is greatly overfit but was only meant to test the code/data.
What are some next steps I can try to improve this? Does this look like maybe just poorly designed features that the NN can't figure out how to correlate, or is this perhaps not the right choice of algorithm?
EDIT:
Based on comments and responses (thanks!), I've tried a few more tweaks and I'm making some progress. I've adjusted the batch size, tweaked the topology, and lowered the learning rate. I also didn't really understand how validation data fit into the picture, so I am now running a training session with validation_split=0.2. My new problem is that the training loss is decreasing and accuracy increasing, but the opposite is happening for the validation loss and accuracy. Here are some epoch snapshots:
Epoch 1/1000
1563/1563 [==============================] - 25s 16ms/step - loss: 0.6926 - accuracy: 0.5150 - val_loss: 0.6927 - val_accuracy: 0.5134
Epoch 20/1000
1563/1563 [==============================] - 24s 15ms/step - loss: 0.6746 - accuracy: 0.5760 - val_loss: 0.7070 - val_accuracy: 0.5103
Epoch 50/1000
1563/1563 [==============================] - 24s 15ms/step - loss: 0.5684 - accuracy: 0.7015 - val_loss: 0.8222 - val_accuracy: 0.5043
I assume this is overfitting in action?
I would change the dense layer units to 512, 128, 64, 1. Remove all dropout layers except the last one and set its rate to something like 0.3. Use your test samples as validation data so you can see whether the model is over- or under-fitting. I also recommend using an adjustable learning rate with the Keras callback ReduceLROnPlateau, and early stopping with the Keras callback EarlyStopping (see the Keras callbacks documentation). Set each callback to monitor validation loss. My suggested code is shown below:
reduce_lr = tf.keras.callbacks.ReduceLROnPlateau(
    monitor="val_loss", factor=0.5, patience=2, verbose=1)
e_stop = tf.keras.callbacks.EarlyStopping(
    monitor="val_loss", patience=5, verbose=0, restore_best_weights=True)
callbacks = [reduce_lr, e_stop]
In model.fit, include callbacks=callbacks, as in the sketch below.
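A sketch of how the pieces could fit together, combining the suggested layer sizes with the question's own variables (features, labels, num_features); x_test and y_test stand in for your 60,000 test samples:
model = tf.keras.Sequential([
    layers.Dense(512, activation='relu', input_shape=(num_features,)),
    layers.Dense(128, activation='relu'),
    layers.Dense(64, activation='relu'),
    layers.Dropout(0.3),                   # single dropout, rate 0.3
    layers.Dense(1, activation='sigmoid')
])
model.compile(loss=tf.losses.BinaryCrossentropy(),
              optimizer=tf.optimizers.Adam(learning_rate=0.001),
              metrics=['accuracy'])
model.fit(features, labels,
          validation_data=(x_test, y_test),  # test set used as validation data
          epochs=100,
          callbacks=callbacks)               # reduce_lr and e_stop from above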

Discrepancy in validation accuracy and validation loss during training

I'm training my CNN model on 14k+ images for 30 epochs, and in the 28th epoch I see an abnormal validation accuracy and loss, as shown below:
- 67s - loss: 0.0767 - acc: 0.9750 - val_loss: 0.6755 - val_acc: 0.8412
Epoch 27/30
- 67s - loss: 0.1039 - acc: 0.9630 - val_loss: 0.3671 - val_acc: 0.9018
Epoch 28/30
- 67s - loss: 0.0639 - acc: 0.9775 - val_loss: 1.1921e-07 - val_acc: 0.1190
Epoch 29/30
- 66s - loss: 0.0767 - acc: 0.9744 - val_loss: 0.8091 - val_acc: 0.8306
Epoch 30/30
- 66s - loss: 0.0534 - acc: 0.9815 - val_loss: 0.2091 - val_acc: 0.9433
Using TensorFlow backend.
Can anyone explain why this happened?
To me it looks like overfitting. Your training loss is approaching zero and your training accuracy approaches 100%, whereas the validation loss and accuracy jump around.
I would recommend you tune your regularization (dropout, L2/L1, data augmentation, ...) or your model capacity.
Usually it's good practice to have a high-capacity model with well-tuned regularization.
As Arkady. A said, it strongly looks like overfitting. It means your model has memorized the images, so your training accuracy rises and rises, but on the validation data you get poor results.
Example: you memorize that 2*8=16 without understanding how multiplication really works. So for the question 2*8 you answer 16, but for 2*9 you don't know the answer.
How to avoid it:
Use strong image augmentation, e.g. imgaug or Augmentor.
Use dropout layers.
Calculate and save two curves per epoch, one for training accuracy and one for validation accuracy (see the sketch after this list). Normally both curves go up at the beginning, and after some epoch X the validation curve starts jumping around or decreasing. That epoch, or the one before it, is your last good state.
Use more metrics, such as ROC AUC.
Use the EarlyStopping callback, monitoring val_acc.
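A minimal sketch of the curve-plotting and early-stopping points, assuming model, x_train, y_train, x_val, y_val already exist and using the older "acc"/"val_acc" metric names from the question (newer TF versions use "accuracy"/"val_accuracy"):
import matplotlib.pyplot as plt
import tensorflow as tf
# Stop once val_acc stops improving and keep the best weights seen so far.
early_stop = tf.keras.callbacks.EarlyStopping(
    monitor="val_acc", patience=5, restore_best_weights=True)
history = model.fit(x_train, y_train,
                    validation_data=(x_val, y_val),
                    epochs=30,
                    callbacks=[early_stop])
# Plot both accuracy curves; the epoch where they start to diverge is where
# overfitting begins.
plt.plot(history.history["acc"], label="train accuracy")
plt.plot(history.history["val_acc"], label="validation accuracy")
plt.xlabel("epoch")
plt.legend()
plt.savefig("accuracy_curves.png")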

What is the difference between loss, accuracy, validation loss, and validation accuracy?

At the end of each epoch, I am getting for example the following output:
Epoch 1/25
2018-08-06 14:54:12.555511:
2/2 [==============================] - 86s 43s/step - loss: 6.0767 - acc: 0.0469 - val_loss: 4.1037 - val_acc: 0.2000
Epoch 2/25
2/2 [==============================] - 26s 13s/step - loss: 3.6901 - acc: 0.0938 - val_loss: 2.5610 - val_acc: 0.0000e+00
Epoch 3/25
2/2 [==============================] - 66s 33s/step - loss: 3.1491 - acc: 0.1406 - val_loss: 2.4793 - val_acc: 0.0500
Epoch 4/25
2/2 [==============================] - 44s 22s/step - loss: 3.0686 - acc: 0.0694 - val_loss: 2.3159 - val_acc: 0.0500
Epoch 5/25
2/2 [==============================] - 62s 31s/step - loss: 2.5884 - acc: 0.1094 - val_loss: 2.4601 - val_acc: 0.1500
Epoch 6/25
2/2 [==============================] - 41s 20s/step - loss: 2.7708 - acc: 0.1493 - val_loss: 2.2542 - val_acc: 0.4000
.
.
.
.
Can anyone explain the difference between loss, accuracy, validation loss, and validation accuracy?
When we pass validation_split as a fit parameter while fitting a DL model, it splits the data into two parts: training data and validation data.
It trains the model on the training data and validates it on the validation data by checking its loss and accuracy.
Usually, as the epochs go by, the loss goes down and the accuracy goes up. But with val_loss and val_acc, several cases are possible:
val_loss starts increasing, val_acc starts decreasing: the model is memorizing values, not learning.
val_loss starts increasing, val_acc also increases: could be a case of overfitting, or of diverse probability values when softmax is used in the output layer.
val_loss starts decreasing, val_acc starts increasing: correct; the model is learning and working fine.
Here is a link with more detail as well: How to interpret "loss" and "accuracy" for a machine learning model. Thanks.
I have tried to explain at https://www.javacodemonk.com/difference-between-loss-accuracy-validation-loss-validation-accuracy-when-training-deep-learning-model-with-keras-ff358faa
In your model.compile call you have defined a loss function and a metric.
Your "loss" is the value of the loss function (unknown, since you do not show your code).
Your "acc" is the value of the metric (in this case, accuracy).
The val_* values simply mean that the corresponding quantity was computed on your validation data.
Only the loss function is used to update your model's parameters; the accuracy is only there for you to see how well your model is doing.
You should seek to minimize your loss and maximize your accuracy.
Ideally your validation results should be close to your training results (although some difference is expected). A minimal sketch of where each of the four values comes from is shown below.
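The model and data names here are placeholders; only the compile/fit arguments matter:
model.compile(loss="categorical_crossentropy",  # produces "loss" and "val_loss"
              optimizer="adam",
              metrics=["accuracy"])             # produces "acc" and "val_acc"
history = model.fit(x_train, y_train,
                    validation_split=0.2,       # the held-out 20% yields the val_* numbers
                    epochs=25)
# history.history holds the same per-epoch series that are printed on screen.
print(history.history.keys())  # e.g. dict_keys(['loss', 'acc', 'val_loss', 'val_acc'])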
I think another answer is worth noting here:
val_loss is the value of the cost function for your cross-validation data, and loss is the value of the cost function for your training data.
https://datascience.stackexchange.com/a/25269

Keras BatchNormalization, differing results in training and evaluation on the training dataset

I am training a CNN; for the sake of debugging my problem I am working on a small subset of the actual training data.
During training, the loss and accuracy seem very reasonable and pretty good. (In the example I used the same small subset for validation; the problem already shows up there.)
Fit on x_train and validate on x_train, using batch_size=32
Epoch 10/10
1/10 [==>...........................] - ETA: 2s - loss: 0.5126 - acc: 0.7778
2/10 [=====>........................] - ETA: 1s - loss: 0.3873 - acc: 0.8576
3/10 [========>.....................] - ETA: 1s - loss: 0.3447 - acc: 0.8634
4/10 [===========>..................] - ETA: 1s - loss: 0.3320 - acc: 0.8741
5/10 [==============>...............] - ETA: 0s - loss: 0.3291 - acc: 0.8868
6/10 [=================>............] - ETA: 0s - loss: 0.3485 - acc: 0.8848
7/10 [====================>.........] - ETA: 0s - loss: 0.3358 - acc: 0.8879
8/10 [=======================>......] - ETA: 0s - loss: 0.3315 - acc: 0.8863
9/10 [==========================>...] - ETA: 0s - loss: 0.3215 - acc: 0.8885
10/10 [==============================] - 3s - loss: 0.3106 - acc: 0.8863 - val_loss: 1.5021 - val_acc: 0.2707
When I evaluate on the same training dataset, however, the accuracy is far off from what I saw during training (I would expect it to be at least as good as during training on the same dataset).
When evaluating straightforwardly, or using
K.set_learning_phase(0)
I get, similar to the validation (Evaluating on x_train using batch_size=32):
Evaluation Accuracy: 0.266318537392, Loss: 1.50756853772
If I set the backend to the learning phase, the results become pretty good again, so the per-batch normalization seems to work well. I suspect that the accumulated mean and variance are not being used properly.
So after
K.set_learning_phase(1)
I get (Evaluating on x_train using batch_size=32):
Evaluation Accuracy: 0.887728457507, Loss: 0.335956037511
I added the batch normalization layer after the first convolutional layer, like this:
model = models.Sequential()
model.add(Conv2D(80, first_conv_size, strides=2, activation="relu", input_shape=input_shape, padding=padding_name))
model.add(BatchNormalization(axis=-1))
model.add(MaxPooling2D(first_max_pool_size, strides=4, padding=padding_name))
...
Further down the line I would also have some dropout layers, which I removed to investigate the BatchNormalization behavior. My intent is to use the model in the non-training phase for normal prediction.
Shouldn't it work like that, or am I missing some additional configuration?
Thanks!
I'm using keras 2.0.8 with tensorflow 1.1.0 (anaconda)
This is really annoying. When you set the learning_phase to True, a BatchNormalization layer computes its normalization statistics straight from the current batch, which can be a problem when you have a small batch_size. I came across a similar issue some time ago, and here is my solution:
When building the model, add an option for whether it will be used in the learning phase or not, and in the learning-phase variant use the following class instead of BatchNormalization:
from keras.layers import BatchNormalization

class NonTrainableBatchNormalization(BatchNormalization):
    """
    This class makes it possible to freeze batch normalization while Keras
    is in the training phase: the layer always uses its moving statistics.
    """
    def call(self, inputs, training=None):
        # Force inference behaviour regardless of the global learning phase.
        return super(
            NonTrainableBatchNormalization, self).call(inputs, training=False)
Once you have trained your model, copy its weights into the non-trainable copy:
learning_phase_model.set_weights(learned_model.get_weights())
Now you can fully enjoy using BatchNormalization in the learning_phase. A sketch of the whole flow is shown below.
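This reuses the layer names and variables from the question's snippet (Conv2D, first_conv_size, input_shape, padding_name, ...); build_model is an illustrative helper, not part of Keras:
def build_model(freeze_bn=False):
    # Choose the frozen variant for the copy that will run in the learning phase.
    bn_layer = NonTrainableBatchNormalization if freeze_bn else BatchNormalization
    model = models.Sequential()
    model.add(Conv2D(80, first_conv_size, strides=2, activation="relu",
                     input_shape=input_shape, padding=padding_name))
    model.add(bn_layer(axis=-1))
    model.add(MaxPooling2D(first_max_pool_size, strides=4, padding=padding_name))
    # ... rest of the architecture ...
    return model

learned_model = build_model(freeze_bn=False)
# learned_model.compile(...) and learned_model.fit(...) as usual

learning_phase_model = build_model(freeze_bn=True)
learning_phase_model.set_weights(learned_model.get_weights())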

Possible explanations for loss increasing?

I've got a 40k image dataset of images from four different countries. The images contain diverse subjects: outdoor scenes, city scenes, menus, etc. I wanted to use deep learning to geotag images.
I started with a small network of 3 conv->relu->pool layers and then added 3 more to deepen the network since the learning task is not straightforward.
My loss is doing this (with both the 3-layer and 6-layer networks):
The loss actually starts kind of smooth and declines for a few hundred steps, but then starts creeping up.
What are the possible explanations for my loss increasing like this?
My initial learning rate is set very low, 1e-6, but I've tried 1e-3, 1e-4, and 1e-5 as well. I have sanity-checked the network design on a tiny dataset of two classes with class-distinct subject matter, and the loss continually declines as desired. Train accuracy hovers at ~40%.
I would normally say your learning rate is too high, but it looks like you have ruled that out. You should check the magnitude of the numbers coming into and out of the layers. You can use tf.Print to do so. Maybe you are somehow inputting a black image by accident, or you can find the layer where the numbers go crazy.
Also, how are you calculating the cross-entropy? You might want to add a small epsilon inside the log, since its value will go to infinity as its input approaches zero. Or, better yet, use the tf.nn.sparse_softmax_cross_entropy_with_logits(...) function, which takes care of numerical stability for you; a sketch contrasting the two approaches follows this answer.
Since the cost is so high for your cross-entropy, it sounds like the network is outputting almost all zeros (or values close to zero). Since you did not post any code, I cannot say why; I think you may just be zeroing something out in the cost-function calculation by accident.
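A minimal TF 2-style sketch of the two formulations, with toy logits and labels (the epsilon value 1e-7 is just an example):
import tensorflow as tf
# Toy logits and integer labels, only to contrast the two options.
logits = tf.constant([[2.0, 0.5, -1.0], [0.1, 0.2, 3.0]])
labels = tf.constant([0, 2])
# Hand-rolled cross-entropy: offset the probabilities by a small epsilon so
# log() never sees an exact zero.
eps = 1e-7
probs = tf.nn.softmax(logits)
one_hot = tf.one_hot(labels, depth=3)
manual_xent = -tf.reduce_sum(one_hot * tf.math.log(probs + eps), axis=-1)
# Built-in, numerically stable version that works on raw logits directly.
stable_xent = tf.nn.sparse_softmax_cross_entropy_with_logits(
    labels=labels, logits=logits)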
I was also facing this problem; I was using the Keras library (TensorFlow backend).
Epoch 00034: saving model to /home/ubuntu/temp/trained_data1/final_dev/final_weights-improvement-34-0.627.hdf50
Epoch 35/150
226160/226160 [==============================] - 65s 287us/step - loss: 0.2870 - acc: 0.9331 - val_loss: 2.7904 - val_acc: 0.6193
Epoch 36/150
226160/226160 [==============================] - 65s 288us/step - loss: 0.2813 - acc: 0.9331 - val_loss: 2.7907 - val_acc: 0.6268
Epoch 00036: saving model to /home/ubuntu/temp/trained_data1/final_dev/final_weights-improvement-36-0.627.hdf50
Epoch 37/150
226160/226160 [==============================] - 65s 286us/step - loss: 0.2910 - acc: 0.9330 - val_loss: 2.5704 - val_acc: 0.6327
Epoch 38/150
226160/226160 [==============================] - 65s 287us/step - loss: 0.2982 - acc: 0.9321 - val_loss: 2.5147 - val_acc: 0.6415
Epoch 00038: saving model to /home/ubuntu/temp/trained_data1/final_dev/final_weights-improvement-38-0.642.hdf50
Epoch 39/150
226160/226160 [==============================] - 68s 301us/step - loss: 0.2968 - acc: 0.9318 - val_loss: 2.7375 - val_acc: 0.6409
Epoch 40/150
226160/226160 [==============================] - 68s 299us/step - loss: 0.3124 - acc: 0.9298 - val_loss: 2.8359 - val_acc: 0.6047
Epoch 00040: saving model to /home/ubuntu/temp/trained_data1/final_dev/final_weights-improvement-40-0.605.hdf50
Epoch 41/150
226160/226160 [==============================] - 65s 287us/step - loss: 0.2945 - acc: 0.9315 - val_loss: 3.5825 - val_acc: 0.5321
Epoch 42/150
226160/226160 [==============================] - 65s 287us/step - loss: 0.3214 - acc: 0.9278 - val_loss: 2.5816 - val_acc: 0.6444
When I looked at my model, it consisted of too many neurons.
In short, the model was overfitting.
I decreased the number of neurons in two dense layers (from 300 neurons to 200 neurons).
This may be useful for somebody out there who is facing similar issues to the above. OP is using TensorFlow, in which case this is irrelevant and not the answer, but if you ever implement neural networks and backpropagation by hand (for example for coursework or assignments), make sure that you are subtracting the gradient from the parameters and not adding it. It's an easy mistake to make: adding the gradient moves you away from the nearest minimum and thus increases the loss. I had this very problem only about 30 minutes before writing this. A tiny sketch of the correct update is below.
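A toy NumPy sketch of the update direction (the vectors are made up, and grad stands for dL/dw):
import numpy as np
w = np.array([0.5, -1.2, 3.0])        # parameters
grad = np.array([0.1, -0.4, 0.02])    # gradient of the loss w.r.t. w
learning_rate = 0.01
w = w - learning_rate * grad          # correct: step against the gradient, loss decreases
# w = w + learning_rate * grad        # wrong sign: steps uphill, loss increases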