Why are accuracy and validation_accuracy drastically different on the same dataset (no normalization or dropout)? - tensorflow

I'm new to tensorflow 2.0 and I'm running a very simple model that classifies a 1d time series of fixed size (100 values) into one of two classes:
model = tf.keras.models.Sequential([
    tf.keras.layers.Dense(512, activation='relu', input_shape=(100, 1)),
    tf.keras.layers.Dense(1, activation='sigmoid')
])
model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['acc'])
I have a dataset of ~660,000 labeled examples that I feed into the model with batch_size=256. When I train the NN for 10 epochs, using the same data as the validation dataset:
history = model.fit(training_dataset,
                    epochs=10,
                    verbose=1,
                    validation_data=training_dataset)
I get the following output:
Epoch 1/10
2573/2573 [==============================] - 55s 21ms/step - loss: 0.5271 - acc: 0.7433 - val_loss: 3.4160 - val_acc: 0.4282
Epoch 2/10
2573/2573 [==============================] - 55s 21ms/step - loss: 0.5673 - acc: 0.7318 - val_loss: 3.3634 - val_acc: 0.4282
Epoch 3/10
2573/2573 [==============================] - 55s 21ms/step - loss: 0.5628 - acc: 0.7348 - val_loss: 2.6422 - val_acc: 0.4282
Epoch 4/10
2573/2573 [==============================] - 57s 22ms/step - loss: 0.5589 - acc: 0.7314 - val_loss: 2.6799 - val_acc: 0.4282
Epoch 5/10
2573/2573 [==============================] - 56s 22ms/step - loss: 0.5683 - acc: 0.7278 - val_loss: 2.3266 - val_acc: 0.4282
Epoch 6/10
2573/2573 [==============================] - 55s 21ms/step - loss: 0.5644 - acc: 0.7276 - val_loss: 2.3177 - val_acc: 0.4282
Epoch 7/10
2573/2573 [==============================] - 56s 22ms/step - loss: 0.5664 - acc: 0.7255 - val_loss: 2.3848 - val_acc: 0.4282
Epoch 8/10
2573/2573 [==============================] - 55s 21ms/step - loss: 0.5711 - acc: 0.7237 - val_loss: 2.2369 - val_acc: 0.4282
Epoch 9/10
2573/2573 [==============================] - 55s 22ms/step - loss: 0.5739 - acc: 0.7189 - val_loss: 2.6969 - val_acc: 0.4282
Epoch 10/10
2573/2573 [==============================] - 219s 85ms/step - loss: 0.5778 - acc: 0.7213 - val_loss: 2.5662 - val_acc: 0.4282
How come the accuracy during the training is so different from the validation step, when run on the same dataset? I tried to find some explanation but it seems that such problems usually arise when people use BatchNormalization or Dropout layers, which is not the case here.

Based on the information above, I assume your data has strong dependencies between examples that are close to each other in the time series.
If so, the data flow through the NN is probably like this:
The NN takes the first batch, calculates the loss, and updates the weights and biases.
The cycle repeats for each subsequent batch.
Because the examples in neighbouring batches are close to each other in the time series, it is easy for the NN to adjust its weights to them, which keeps the loss reasonably low for every next batch.
When it is time for validation, the NN just calculates the loss without updating the weights,
so you end up with an NN that has learned to infer on a small portion of the data but does not generalize well to the whole dataset.
That is why the validation error differs from the training error even on the same dataset.
This is only one possible explanation; there may be other reasons.
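The effect described above can be sketched without TensorFlow. This is a toy illustration under stated assumptions, not the asker's model: the "network" is reduced to a single constant prediction that adapts to the most recent batch, and `ordered_labels`, `majority_label` and `run_epoch` are hypothetical names made up for this example. The training accuracy is accumulated batch by batch while the model keeps changing, whereas the validation accuracy is computed once with the final model.

```python
def majority_label(labels):
    """Return the most common label in a batch (ties go to class 1)."""
    return int(2 * sum(labels) >= len(labels))


def run_epoch(labels, batch_size):
    """Score each batch with the current model, then adapt to that batch.

    The "model" is one constant prediction, a stand-in for a network that
    adapts quickly to whatever it saw most recently.
    """
    prediction = 0  # initial constant prediction
    correct = 0
    for start in range(0, len(labels), batch_size):
        batch = labels[start:start + batch_size]
        # Training metric: scored with the model as it is right now.
        correct += sum(1 for y in batch if y == prediction)
        # "Weight update": adapt to the batch that was just seen.
        prediction = majority_label(batch)
    train_acc = correct / len(labels)
    # Validation: the final, frozen model scored on the whole dataset.
    val_acc = sum(1 for y in labels if y == prediction) / len(labels)
    return train_acc, val_acc


# Hypothetical ordered data: long runs of one class, as in a time series.
ordered_labels = [0] * 300 + [1] * 200 + [0] * 300 + [1] * 200
train_acc, val_acc = run_epoch(ordered_labels, batch_size=50)
print(train_acc, val_acc)  # prints 0.85 0.4
```

In this toy setting the running training accuracy is 0.85, while the frozen model reaches only 0.4 on the same data, because it ends the epoch tuned to the last run of labels.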

Related

Keras model gets worse when fine-tuning

I'm trying to follow the fine-tuning steps described in https://www.tensorflow.org/tutorials/images/transfer_learning#create_the_base_model_from_the_pre-trained_convnets to get a trained model for binary segmentation.
I create an encoder-decoder with the weights of the encoder being the ones of the MobileNetV2 and fixed as encoder.trainable = False. Then, I define my decoder as said in the tutorial and I train the network for 300 epochs using a learning rate of 0.005. I get the following loss value and Jaccard index during the last epochs:
Epoch 297/300
55/55 [==============================] - 85s 2s/step - loss: 0.2443 - jaccard_sparse3D: 0.5556 - accuracy: 0.9923 - val_loss: 0.0440 - val_jaccard_sparse3D: 0.3172 - val_accuracy: 0.9768
Epoch 298/300
55/55 [==============================] - 75s 1s/step - loss: 0.2437 - jaccard_sparse3D: 0.5190 - accuracy: 0.9932 - val_loss: 0.0422 - val_jaccard_sparse3D: 0.3281 - val_accuracy: 0.9776
Epoch 299/300
55/55 [==============================] - 78s 1s/step - loss: 0.2465 - jaccard_sparse3D: 0.4557 - accuracy: 0.9936 - val_loss: 0.0431 - val_jaccard_sparse3D: 0.3327 - val_accuracy: 0.9769
Epoch 300/300
55/55 [==============================] - 85s 2s/step - loss: 0.2467 - jaccard_sparse3D: 0.5030 - accuracy: 0.9923 - val_loss: 0.0463 - val_jaccard_sparse3D: 0.3315 - val_accuracy: 0.9740
I store all the weights of this model and then run the fine-tuning with the following steps:
model.load_weights('my_pretrained_weights.h5')
model.trainable = True
model.compile(optimizer=Adam(learning_rate=0.00001, name='adam'),
              loss=SparseCategoricalCrossentropy(from_logits=True),
              metrics=[jaccard, "accuracy"])
model.fit(training_generator, validation_data=(val_x, val_y), epochs=5,
          validation_batch_size=2, callbacks=callbacks)
Suddenly the performance of my model is much worse than during the training of the decoder:
Epoch 1/5
55/55 [==============================] - 89s 2s/step - loss: 0.2417 - jaccard_sparse3D: 0.0843 - accuracy: 0.9946 - val_loss: 0.0079 - val_jaccard_sparse3D: 0.0312 - val_accuracy: 0.9992
Epoch 2/5
55/55 [==============================] - 90s 2s/step - loss: 0.1920 - jaccard_sparse3D: 0.1179 - accuracy: 0.9927 - val_loss: 0.0138 - val_jaccard_sparse3D: 7.1138e-05 - val_accuracy: 0.9998
Epoch 3/5
55/55 [==============================] - 95s 2s/step - loss: 0.2173 - jaccard_sparse3D: 0.1227 - accuracy: 0.9932 - val_loss: 0.0171 - val_jaccard_sparse3D: 0.0000e+00 - val_accuracy: 0.9999
Epoch 4/5
55/55 [==============================] - 94s 2s/step - loss: 0.2428 - jaccard_sparse3D: 0.1319 - accuracy: 0.9927 - val_loss: 0.0190 - val_jaccard_sparse3D: 0.0000e+00 - val_accuracy: 1.0000
Epoch 5/5
55/55 [==============================] - 97s 2s/step - loss: 0.1920 - jaccard_sparse3D: 0.1107 - accuracy: 0.9926 - val_loss: 0.0215 - val_jaccard_sparse3D: 0.0000e+00 - val_accuracy: 1.0000
Is there any known reason why this is happening? Is it normal?
Thank you in advance!
OK, I found out what I do differently that makes it unnecessary to recompile: I do not set encoder.trainable = False. What I do in the code below is equivalent:
for layer in encoder.layers:
    layer.trainable = False
Then train your model. Afterwards you can unfreeze the encoder weights with
for layer in encoder.layers:
    layer.trainable = True
You do not need to recompile the model. I tested this and it works as expected. You can verify it by printing the model summary before and after and comparing the number of trainable parameters.
As for changing the learning rate, I find it best to use the Keras callback ReduceLROnPlateau to automatically adjust the learning rate based on the validation loss. I also recommend the EarlyStopping callback, which monitors the validation loss and halts training if it fails to decrease for 'patience' consecutive epochs. Setting restore_best_weights=True loads the weights from the epoch with the lowest validation loss, so you don't have to save and then reload the weights. Set epochs to a large number to ensure this callback activates. The code I use is shown below:
es = tf.keras.callbacks.EarlyStopping(monitor="val_loss", patience=3,
                                      verbose=1, restore_best_weights=True)
rlronp = tf.keras.callbacks.ReduceLROnPlateau(monitor="val_loss", factor=0.5, patience=1,
                                              verbose=1)
callbacks = [es, rlronp]
In model.fit set callbacks=callbacks
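To make the role of 'patience' and restore_best_weights concrete, here is a simplified sketch of patience-based stopping in plain Python. It is not the Keras implementation: `early_stopping` and the `val_losses` values are hypothetical, and the function only reports at which epoch training would stop and which epoch's weights would be restored.

```python
def early_stopping(val_losses, patience):
    """Return (stop_epoch, best_epoch) for a sequence of validation losses.

    Training stops once the loss has failed to improve for `patience`
    consecutive epochs; `best_epoch` is the epoch whose weights would be
    restored. Epochs are counted from 0.
    """
    best_loss = float("inf")
    best_epoch = None
    wait = 0  # consecutive epochs without improvement
    for epoch, loss in enumerate(val_losses):
        if loss < best_loss:
            best_loss, best_epoch, wait = loss, epoch, 0
        else:
            wait += 1
            if wait >= patience:
                return epoch, best_epoch
    # Patience never ran out: training used every epoch.
    return len(val_losses) - 1, best_epoch


# Hypothetical validation losses; the last value is never reached.
val_losses = [0.50, 0.40, 0.35, 0.36, 0.37, 0.38, 0.30]
stop_epoch, best_epoch = early_stopping(val_losses, patience=3)
print(stop_epoch, best_epoch)  # prints 5 2
```

With patience=3 the run stops at epoch 5 and falls back to epoch 2, so the later improvement at epoch 6 is never seen. This is the trade-off to keep in mind when choosing the patience value.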

Tensorflow 2: Customized Loss Function works differently from the original Keras SparseCategoricalCrossentropy

I just started to work with tensorflow 2.0 and followed the simple example from its official website.
import tensorflow as tf
import tensorflow.keras.layers as layers
mnist = tf.keras.datasets.mnist
(t_x, t_y), (v_x, v_y) = mnist.load_data()
model = tf.keras.Sequential()
model.add(layers.Flatten())
model.add(layers.Dense(128, activation="relu"))
model.add(layers.Dropout(0.2))
model.add(layers.Dense(10))
lossFunc = tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True)
model.compile(optimizer='adam', loss=lossFunc,
              metrics=['accuracy'])
model.fit(t_x, t_y, epochs=5)
The output for the above code is:
Train on 60000 samples
Epoch 1/5
60000/60000 [==============================] - 4s 60us/sample - loss: 2.5368 - accuracy: 0.7455
Epoch 2/5
60000/60000 [==============================] - 3s 51us/sample - loss: 0.5846 - accuracy: 0.8446
Epoch 3/5
60000/60000 [==============================] - 3s 51us/sample - loss: 0.4751 - accuracy: 0.8757
Epoch 4/5
60000/60000 [==============================] - 3s 51us/sample - loss: 0.4112 - accuracy: 0.8915
Epoch 5/5
60000/60000 [==============================] - 3s 51us/sample - loss: 0.3732 - accuracy: 0.9018
However, if I change the lossFunc to the following:
def myfunc(y_true, y_pred):
    return lossFunc(y_true, y_pred)
which simply wraps the previous function, it performs totally differently. The output is:
Train on 60000 samples
Epoch 1/5
60000/60000 [==============================] - 4s 60us/sample - loss: 2.4444 - accuracy: 0.0889
Epoch 2/5
60000/60000 [==============================] - 3s 51us/sample - loss: 0.5696 - accuracy: 0.0933
Epoch 3/5
60000/60000 [==============================] - 3s 51us/sample - loss: 0.4493 - accuracy: 0.0947
Epoch 4/5
60000/60000 [==============================] - 3s 51us/sample - loss: 0.4046 - accuracy: 0.0947
Epoch 5/5
60000/60000 [==============================] - 3s 51us/sample - loss: 0.3805 - accuracy: 0.0943
The loss values are very similar, but the accuracy values are totally different. Does anyone know what the magic is here, and what is the correct way to write your own loss function?
When you use a built-in loss function, you can use 'accuracy' as the metric. Under the hood, TensorFlow selects the appropriate accuracy function (in your case it is tf.keras.metrics.SparseCategoricalAccuracy()).
When you define a custom loss function, TensorFlow doesn't know which accuracy function to use. In this case, you need to specify explicitly that it is tf.keras.metrics.SparseCategoricalAccuracy(). Please check the GitHub gist here.
The code modification and the output are as follows:
model2 = tf.keras.Sequential()
model2.add(layers.Flatten())
model2.add(layers.Dense(128, activation="relu"))
model2.add(layers.Dropout(0.2))
model2.add(layers.Dense(10))
lossFunc = tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True)
model2.compile(optimizer='adam', loss=myfunc,
               metrics=['accuracy', tf.keras.metrics.SparseCategoricalAccuracy()])
model2.fit(t_x, t_y, epochs=5)
Output:
Train on 60000 samples
Epoch 1/5
60000/60000 [==============================] - 5s 81us/sample - loss: 2.2295 - accuracy: 0.0917 - sparse_categorical_accuracy: 0.7483
Epoch 2/5
60000/60000 [==============================] - 5s 76us/sample - loss: 0.5827 - accuracy: 0.0922 - sparse_categorical_accuracy: 0.8450
Epoch 3/5
60000/60000 [==============================] - 5s 76us/sample - loss: 0.4602 - accuracy: 0.0933 - sparse_categorical_accuracy: 0.8760
Epoch 4/5
60000/60000 [==============================] - 5s 76us/sample - loss: 0.4197 - accuracy: 0.0946 - sparse_categorical_accuracy: 0.8910
Epoch 5/5
60000/60000 [==============================] - 5s 76us/sample - loss: 0.3965 - accuracy: 0.0937 - sparse_categorical_accuracy: 0.8979
<tensorflow.python.keras.callbacks.History at 0x7f5095286780>
Hope this helps
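Why the two accuracy columns disagree can be illustrated in plain Python. The sketch assumes that, when the loss is not recognised, the generic 'accuracy' falls back to a categorical-style comparison (argmax of y_true against argmax of y_pred); that fallback is an assumption about Keras internals, not something stated in the answer. The labels and logits are made up, and the helper functions are hypothetical stand-ins for the Keras metrics.

```python
def argmax(values):
    """Index of the largest value (first one wins on ties)."""
    return max(range(len(values)), key=values.__getitem__)


def sparse_categorical_accuracy(labels, logits):
    """Compare each integer label with the argmax of its logits row."""
    hits = sum(1 for y, row in zip(labels, logits) if y == argmax(row))
    return hits / len(labels)


def categorical_accuracy(y_true_rows, logits):
    """Compare argmax of each y_true row with argmax of its logits row."""
    hits = sum(1 for t, row in zip(y_true_rows, logits)
               if argmax(t) == argmax(row))
    return hits / len(logits)


# Made-up batch: every prediction is correct.
labels = [0, 1, 2, 1]
logits = [
    [2.0, 0.1, 0.3],
    [0.2, 1.5, 0.1],
    [0.1, 0.2, 3.0],
    [0.3, 2.2, 0.4],
]

sparse_acc = sparse_categorical_accuracy(labels, logits)
# Integer labels treated as one-element rows: their argmax is always 0,
# so only samples predicted as class 0 count as hits.
mismatched_acc = categorical_accuracy([[y] for y in labels], logits)
print(sparse_acc, mismatched_acc)  # prints 1.0 0.25
```

Under that assumption the mismatched metric only measures how often class 0 is predicted. That would be consistent with the roughly 9% 'accuracy' in the log, since about one digit in ten is a zero.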

Keras categorical crossentropy learning stuck by putting all in one category

I was following the TensorFlow tutorial on classification but got stuck: the learning stagnates with my trained network in a suboptimal solution that puts all pictures in just one category. My first thought was that this was due to an unbalanced distribution of training pictures across the categories (as also suggested here), so I deleted enough training pictures that the same number of pictures remained in each category. However, the problem did not change. Next I tried different loss functions, different metrics, different optimizers and different layer structures for my model, without any improvement. My model still puts all pictures in just one category after training. Any idea is highly welcome.
Here is one of the models I tried:
model = keras.Sequential([
    keras.layers.Flatten(input_shape=(PicHeight, PicWidth, 3)),
    keras.layers.Dense(64, activation='relu'),
    keras.layers.Dense(64, activation='relu'),
    keras.layers.Dense(32, activation='relu'),
    keras.layers.Dense(32, activation='relu'),
    keras.layers.Dense(8, activation='relu'),
    keras.layers.Dense(number_of_categories, activation='softmax')
])
model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])
And this is the training:
Train on 101 samples
Epoch 1/16
101/101 [==============================] - 1s 11ms/sample - loss: 55.8119 - accuracy: 0.1584
Epoch 2/16
101/101 [==============================] - 1s 6ms/sample - loss: 232.9768 - accuracy: 0.1485
Epoch 3/16
101/101 [==============================] - 1s 6ms/sample - loss: 111.9690 - accuracy: 0.1584
Epoch 4/16
101/101 [==============================] - 1s 6ms/sample - loss: 72.1569 - accuracy: 0.1782
Epoch 5/16
101/101 [==============================] - 1s 6ms/sample - loss: 39.3051 - accuracy: 0.1386
Epoch 6/16
101/101 [==============================] - 1s 6ms/sample - loss: 2.6347 - accuracy: 0.0990
Epoch 7/16
101/101 [==============================] - 1s 6ms/sample - loss: 2.3318 - accuracy: 0.1683
Epoch 8/16
101/101 [==============================] - 1s 6ms/sample - loss: 2.5922 - accuracy: 0.2277
Epoch 9/16
101/101 [==============================] - 1s 6ms/sample - loss: 2.0848 - accuracy: 0.1485
Epoch 10/16
101/101 [==============================] - 1s 6ms/sample - loss: 1.9453 - accuracy: 0.1386
Epoch 11/16
101/101 [==============================] - 1s 6ms/sample - loss: 1.9453 - accuracy: 0.1386
Epoch 12/16
101/101 [==============================] - 1s 6ms/sample - loss: 1.9453 - accuracy: 0.1386
Epoch 13/16
101/101 [==============================] - 1s 6ms/sample - loss: 1.9452 - accuracy: 0.1386
Epoch 14/16
101/101 [==============================] - 1s 6ms/sample - loss: 1.9452 - accuracy: 0.1485
Epoch 15/16
101/101 [==============================] - 1s 6ms/sample - loss: 1.9452 - accuracy: 0.1485
Epoch 16/16
101/101 [==============================] - 1s 7ms/sample - loss: 1.9451 - accuracy: 0.1485
25/25 - 0s - loss: 1.9494 - accuracy: 0.1200
The training data has 7 categories with 18 pictures each.
Don't use so many FC layers. They aren't very good at dealing with pictures.
Your dataset is too small for deep learning. Add more training data or try traditional machine learning methods such as SVM or LR.
Imbalanced training data won't necessarily have that effect on model performance; it depends on how imbalanced your data are. If the imbalance is less than 15%, it will be fine. You can still improve things with a weighted loss, oversampling, or preprocessing to generate more images.
If you have enough training data and the pictures are bigger than 20*20, you should try a CNN.
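A quick arithmetic check on the training log in the question supports the "everything in one category" reading. The loss settles at about 1.945, which is close to ln 7, the cross-entropy of a model that assigns the same probability to each of the 7 categories regardless of the input. The sketch below only uses the category count from the question; the variable names are illustrative.

```python
import math

number_of_categories = 7  # from the question: 7 categories

# Cross-entropy of a prediction that gives every category probability 1/7.
uniform_loss = -math.log(1.0 / number_of_categories)
print(round(uniform_loss, 4))  # prints 1.9459
```

A loss stuck at this value suggests that the network output no longer depends on the input image, which fits the advice above to simplify the model and gather more data.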

Understanding output during training - what do the durations mean & what does TF do between two epochs?

I am quite new to TensorFlow and Keras, and mighty Google couldn't help me with the following question so far.
Below you can see the TF/Keras output from training a pre-trained CNN in Spyder (using Anaconda).
What are those (bold) timings about? As far as I could measure, it is the total time needed for the epoch. Am I correct?
The italicized numbers are the seconds for the complete batch (steps*batch-size) and the time per batch.
What is TF/Keras doing in the significant time span between two training batches?
Let's look at Epoch 2:
The whole epoch took 41 seconds, the training itself only 7 seconds. What's going on in the remaining 41-7 = 34 seconds?
From my understanding, the training time includes:
+everything about learning (fwd prop, calculating gradients, backwards prop)
Is the remaining time purely loading and re-scaling images?
Epoch 1/50
50/50 [==============================] - *9s 186ms/step* - loss: 0.6557 - acc: 0.9076
- **53s** - loss: 0.8610 - acc: 0.8472 - val_loss: 0.6557 - val_acc: 0.9076
Epoch 2/50
50/50 [==============================] - *7s 147ms/step* - loss: 0.4148 - acc: 0.9478
- **41s** - loss: 0.2432 - acc: 0.9097 - val_loss: 0.4148 - val_acc: 0.9478
Epoch 3/50
50/50 [==============================] - *8s 158ms/step* - loss: 0.5873 - acc: 0.9384
- **42s** - loss: 0.1696 - acc: 0.9335 - val_loss: 0.5873 - val_acc: 0.9384
Epoch 4/50
50/50 [==============================] - *7s 149ms/step* - loss: 0.5356 - acc: 0.9492
- **41s** - loss: 0.1274 - acc: 0.9548 - val_loss: 0.5356 - val_acc: 0.9492
.....
If it matters: I am using an image generator (see code below) and augmentation. The small (usually <500kb) pics are loaded from an SSD (Samsung 960 1TB).
train_datagen = ImageDataGenerator(rescale=1./255.)
train_generator = train_datagen.flow_from_directory(train_dir,
                                                    batch_size=20,
                                                    class_mode='binary',
                                                    target_size=(IMAGE_WIDTH, IMAGE_HEIGHT))
Thanks a lot guys.
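One reading of the log, offered as an observation rather than a confirmed answer: in every epoch shown, the loss and acc on the progress-bar line equal val_loss and val_acc on the summary line. That would be consistent with the progress bar timing a validation pass and the bold number timing the whole epoch, though the log alone cannot confirm it. The sketch below checks this for the Epoch 2 lines, copied from the log with the emphasis markers removed; `parse_metrics` and `parse_seconds` are hypothetical helpers.

```python
import re

# Epoch 2 lines from the log above, without the * and ** markers.
progress_line = ("50/50 [==============================] - 7s 147ms/step"
                 " - loss: 0.4148 - acc: 0.9478")
summary_line = ("- 41s - loss: 0.2432 - acc: 0.9097"
                " - val_loss: 0.4148 - val_acc: 0.9478")


def parse_metrics(line):
    """Extract 'name: value' pairs from a log line as a dict of floats."""
    return {name: float(value)
            for name, value in re.findall(r"(\w+): ([0-9.]+)", line)}


def parse_seconds(line):
    """Extract the first whole-second duration, such as '- 7s'."""
    return int(re.search(r"- (\d+)s", line).group(1))


progress = parse_metrics(progress_line)
summary = parse_metrics(summary_line)

# Do the progress-bar metrics coincide with the validation metrics?
matches_validation = (progress["loss"] == summary["val_loss"]
                      and progress["acc"] == summary["val_acc"])
remaining = parse_seconds(summary_line) - parse_seconds(progress_line)
print(matches_validation, remaining)  # prints True 34
```

If that reading holds, the 34 seconds not covered by the progress bar would be the training pass itself, including image loading and rescaling, rather than idle time between batches.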

Keras - training loss vs validation loss

Just for the sake of argument, I am using the same data for both training and validation, like this:
model.fit_generator(
    generator=train_generator,
    epochs=EPOCHS,
    steps_per_epoch=train_generator.n // BATCH_SIZE,
    validation_data=train_generator,
    validation_steps=train_generator.n // BATCH_SIZE
)
So I would expect that the loss and the accuracy of training and validation at the end of each epoch would be pretty much the same? Still it looks like this:
Epoch 1/150
26/26 [==============================] - 55s 2s/step - loss: 1.5520 - acc: 0.3171 - val_loss: 1.6646 - val_acc: 0.2796
Epoch 2/150
26/26 [==============================] - 46s 2s/step - loss: 1.2924 - acc: 0.4996 - val_loss: 1.5895 - val_acc: 0.3508
Epoch 3/150
26/26 [==============================] - 46s 2s/step - loss: 1.1624 - acc: 0.5873 - val_loss: 1.6197 - val_acc: 0.3262
Epoch 4/150
26/26 [==============================] - 46s 2s/step - loss: 1.0601 - acc: 0.6265 - val_loss: 1.9420 - val_acc: 0.3150
Epoch 5/150
26/26 [==============================] - 46s 2s/step - loss: 0.9790 - acc: 0.6640 - val_loss: 1.9667 - val_acc: 0.2823
Epoch 6/150
26/26 [==============================] - 46s 2s/step - loss: 0.9191 - acc: 0.6951 - val_loss: 1.8594 - val_acc: 0.3342
Epoch 7/150
26/26 [==============================] - 46s 2s/step - loss: 0.8811 - acc: 0.7087 - val_loss: 2.3223 - val_acc: 0.2869
Epoch 8/150
26/26 [==============================] - 46s 2s/step - loss: 0.8148 - acc: 0.7379 - val_loss: 1.9683 - val_acc: 0.3358
Epoch 9/150
26/26 [==============================] - 46s 2s/step - loss: 0.8068 - acc: 0.7307 - val_loss: 2.1053 - val_acc: 0.3312
Why does the accuracy in particular differ so much although it's from the same data source? Is there something about the way this is calculated that I am missing?
The generator is created like this:
train_images = keras.preprocessing.image.ImageDataGenerator(
    rescale=1./255
)
train_generator = train_images.flow_from_directory(
    directory="data/superheros/images/train",
    target_size=(299, 299),
    batch_size=BATCH_SIZE,
    shuffle=True
)
Yes, it shuffles the images, but as it iterates over all images also for validation, shouldn't the accuracy at least be close?
So the model looks like this:
inceptionV3 = keras.applications.inception_v3.InceptionV3(include_top=False)
features = inceptionV3.output
net = keras.layers.GlobalAveragePooling2D()(features)
predictions = keras.layers.Dense(units=2, activation="softmax")(net)
for layer in inceptionV3.layers:
    layer.trainable = False
model = keras.Model(inputs=inceptionV3.input, outputs=predictions)
optimizer = keras.optimizers.RMSprop()
model.compile(
    optimizer=optimizer,
    loss="categorical_crossentropy",
    metrics=['accuracy']
)
So no dropout or anything, just the inceptionv3 with a softmax layer on top. I would expect that the accuracy differs a bit, but not in this magnitude.
Are you sure your train_generator returns the same data when Keras retrieves training data and validation data, if it's a generator?
The name being generator, I'd expect it not to :)
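One way to test that suspicion is to collect what the generator yields on two separate passes and compare the results. The sketch below uses a hypothetical shuffling generator, `shuffled_batches`, as a stand-in; it is not the Keras ImageDataGenerator, and with real image batches you would compare file names or sample indices instead of integers.

```python
import random


def shuffled_batches(items, batch_size, rng):
    """Yield the items once, in a freshly shuffled order, batch by batch."""
    order = list(items)
    rng.shuffle(order)
    for start in range(0, len(order), batch_size):
        yield order[start:start + batch_size]


rng = random.Random(0)
items = list(range(10))  # stand-ins for sample indices

# Two full passes, as training and validation would make.
first_pass = [x for batch in shuffled_batches(items, 4, rng) for x in batch]
second_pass = [x for batch in shuffled_batches(items, 4, rng) for x in batch]

same_samples = sorted(first_pass) == sorted(second_pass)  # same set of data?
same_order = first_pass == second_pass                    # same sequence?
print(same_samples, same_order)
```

If same_samples is true for the real generator, both passes cover the same data and the gap must come from somewhere else. Note that a step count of n // BATCH_SIZE drops the final partial batch, so with shuffling each pass can leave out a few different samples.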