Loss in the convolution layer - tensorflow

So I'm working on image registration to align an image to a template image. I am performing this convolution operation on a Unet output (100x100x100x32) to get a 3D deformation field (100x100x100x3). For clarification, 100x100x100 is the dimension of the image.
# transform unet output into a flow field
name = 'vxm_dense'
Conv = getattr(KL, 'Conv%dD' % ndims)
flow_mean = Conv(filters=3, kernel_size=3, padding='same',
                 kernel_initializer=KI.RandomNormal(mean=0.0, stddev=1e-5),
                 name='%s_flow' % name)(unet_model.output)
So far everything is clear, but when I run model.fit(), I notice in the logs that in addition to the actual loss, an additional loss called vxm_dense_flow_loss is logged.
Epoch 9/60
560/560 [==============================] - 734s 1s/step - loss: 0.0023 - vxm_dense_flow_loss: 0.0216
I don't understand:
1. Why is this loss calculated?
2. What loss function is used? I don't have any loss function (e.g. MSE, NCC or MI) configured for it.
3. In order for a loss to be calculated, there must be a ground truth. Which ground truth is used here?
PS: The actual loss is calculated as the mean square error between the registered image and the template image.
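For context (hedged, since the question does not show how the model was compiled): when a Keras model has several outputs and model.compile is given a list or dict of losses, the logged loss is their weighted sum and each component is additionally logged as <output_layer_name>_loss. In VoxelMorph-style registration the extra loss on the flow output is typically a smoothness/regularization term evaluated against a dummy zero tensor as "ground truth", but that depends on the loss configuration, which is not shown here. A minimal sketch of the naming behaviour only:
import tensorflow as tf

# Hypothetical two-output toy model, only to illustrate Keras' per-output loss naming;
# this is NOT the VoxelMorph architecture from the question.
inp = tf.keras.Input(shape=(8,))
moved = tf.keras.layers.Dense(1, name='vxm_dense_transformer')(inp)
flow = tf.keras.layers.Dense(3, name='vxm_dense_flow')(inp)
model = tf.keras.Model(inp, [moved, flow])

# One loss per output: the fit() logs will then show
# 'loss', 'vxm_dense_transformer_loss' and 'vxm_dense_flow_loss'.
model.compile(optimizer='adam',
              loss=['mse', 'mse'],          # placeholder losses for this sketch
              loss_weights=[1.0, 0.01])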

Training model in Keras [duplicate]

How is Accuracy defined when the loss function is mean square error? Is it mean absolute percentage error?
The model I use has a linear output activation and is compiled with loss='mean_squared_error':
model.add(Dense(1))
model.add(Activation('linear')) # number
model.compile(loss='mean_squared_error', optimizer='adam', metrics=['accuracy'])
and the output looks like this:
Epoch 99/100
1000/1000 [==============================] - 687s 687ms/step - loss: 0.0463 - acc: 0.9689 - val_loss: 3.7303 - val_acc: 0.3250
Epoch 100/100
1000/1000 [==============================] - 688s 688ms/step - loss: 0.0424 - acc: 0.9740 - val_loss: 3.4221 - val_acc: 0.3701
So what does e.g. val_acc: 0.3250 mean? Mean squared error should be a scalar, not a percentage - shouldn't it? So is val_acc the mean squared error, the mean percentage error, or some other function?
From the definition of MSE on Wikipedia: https://en.wikipedia.org/wiki/Mean_squared_error
The MSE is a measure of the quality of an estimator—it is always
non-negative, and values closer to zero are better.
Does that mean a value of val_acc: 0.0 is better than val_acc: 0.325?
Edit: more examples of the output of the accuracy metric when I train - the accuracy increases as I train more, while the loss function (MSE) should decrease. Is accuracy well defined for MSE, and how is it defined in Keras?
lAllocator: After 14014 get requests, put_count=14032 evicted_count=1000 eviction_rate=0.0712657 and unsatisfied allocation rate=0.071714
1000/1000 [==============================] - 453s 453ms/step - loss: 17.4875 - acc: 0.1443 - val_loss: 98.0973 - val_acc: 0.0333
Epoch 2/100
1000/1000 [==============================] - 443s 443ms/step - loss: 6.6793 - acc: 0.1973 - val_loss: 11.9101 - val_acc: 0.1500
Epoch 3/100
1000/1000 [==============================] - 444s 444ms/step - loss: 6.3867 - acc: 0.1980 - val_loss: 6.8647 - val_acc: 0.1667
Epoch 4/100
1000/1000 [==============================] - 445s 445ms/step - loss: 5.4062 - acc: 0.2255 - val_loss: 5.6029 - val_acc: 0.1600
Epoch 5/100
783/1000 [======================>.......] - ETA: 1:36 - loss: 5.0148 - acc: 0.2306
There are at least two separate issues with your question.
The first one should be clear by now from the comments by Dr. Snoopy and the other answer: accuracy is meaningless in a regression problem, such as yours; see also the comment by patyork in this Keras thread. For good or bad, the fact is that Keras will not "protect" you or any other user from putting not-meaningful requests in your code, i.e. you will not get any error, or even a warning, that you are attempting something that does not make sense, such as requesting the accuracy in a regression setting.
Having clarified that, the other issue is:
Since Keras does indeed return an "accuracy", even in a regression setting, what exactly is it and how is it calculated?
To shed some light here, let's revert to a public dataset (since you do not provide any details about your data), namely the Boston house price dataset (saved locally as housing.csv), and run a simple experiment as follows:
import numpy as np
import pandas
import keras
from keras.models import Sequential
from keras.layers import Dense
# load dataset
dataframe = pandas.read_csv("housing.csv", delim_whitespace=True, header=None)
dataset = dataframe.values
# split into input (X) and output (Y) variables
X = dataset[:,0:13]
Y = dataset[:,13]
model = Sequential()
model.add(Dense(13, input_dim=13, kernel_initializer='normal', activation='relu'))
model.add(Dense(1, kernel_initializer='normal'))
# Compile model asking for accuracy, too:
model.compile(loss='mean_squared_error', optimizer='adam', metrics=['accuracy'])
model.fit(X, Y,
          batch_size=5,
          epochs=100,
          verbose=1)
As in your case, the model fitting history (not shown here) shows a decreasing loss, and an accuracy roughly increasing. Let's evaluate now the model performance in the same training set, using the appropriate Keras built-in function:
score = model.evaluate(X, Y, verbose=0)
score
# [16.863721372581754, 0.013833992168483997]
The exact contents of the score array depend on what exactly we have requested during model compilation; in our case here, the first element is the loss (MSE), and the second one is the "accuracy".
At this point, let us have a look at the definition of Keras binary_accuracy in the metrics.py file:
def binary_accuracy(y_true, y_pred):
    return K.mean(K.equal(y_true, K.round(y_pred)), axis=-1)
So, after Keras has generated the predictions y_pred, it first rounds them, and then checks to see how many of them are equal to the true labels y_true, before getting the mean.
Let's replicate this operation using plain Python & Numpy code in our case, where the true labels are Y:
y_pred = model.predict(X)
l = len(Y)
acc = sum([np.round(y_pred[i])==Y[i] for i in range(l)])/l
acc
# array([0.01383399])
Well, bingo! This is actually the same value returned by score[1] above...
To make a long story short: since you (erroneously) request metrics=['accuracy'] in your model compilation, Keras will do its best to satisfy you, and will return some "accuracy" indeed, calculated as shown above, despite this being completely meaningless in your setting.
There are quite a few settings where Keras, under the hood, performs rather meaningless operations without giving any hint or warning to the user; two of them I have happened to encounter are:
Giving meaningless results when, in a multi-class setting, one happens to request loss='binary_crossentropy' (instead of categorical_crossentropy) with metrics=['accuracy'] - see my answers in Keras binary_crossentropy vs categorical_crossentropy performance? and Why is binary_crossentropy more accurate than categorical_crossentropy for multiclass classification in Keras?
Disabling completely Dropout, in the extreme case when one requests a dropout rate of 1.0 - see my answer in Dropout behavior in Keras with rate=1 (dropping all input units) not as expected
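As a side note, in a regression setting like this one a more meaningful request at compile time is an actual regression metric, such as mean absolute error, rather than accuracy; a minimal sketch:
# Request a regression metric (MAE) instead of the meaningless 'accuracy':
model.compile(loss='mean_squared_error', optimizer='adam', metrics=['mae'])
score = model.evaluate(X, Y, verbose=0)   # now returns [MSE loss, MAE]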
The loss function (mean squared error in this case) is used to indicate how far your predictions deviate from the target values. In the training phase, the weights are updated based on this quantity. If you are dealing with a classification problem, it is quite common to define an additional metric called accuracy, which monitors in how many cases the correct class was predicted. It is expressed as a fraction: a value of 0.0 means no correct predictions and 1.0 means only correct predictions.
While your network is training, the loss decreases and the accuracy usually increases.
Note that, in contrast to the loss, the accuracy is usually not used to update the parameters of your network. It only helps to monitor the learning progress and the current performance of the network.
@desertnaut has said it very clearly.
Consider the following two pieces of code: the compile code, and the binary_accuracy code:
def binary_accuracy(y_true, y_pred):
    return K.mean(K.equal(y_true, K.round(y_pred)), axis=-1)
Your labels would have to be integers, because Keras does not round y_true; only then can the rounded predictions match them and give you a high "accuracy".
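A tiny NumPy sketch of why integer labels matter here (the numbers are made up for illustration; most Boston house-price targets are not integers):
import numpy as np

# Hypothetical regression targets and predictions, just to illustrate the rounding effect.
y_true = np.array([24.0, 21.6, 34.7, 15.0])
y_pred = np.array([23.7, 21.9, 34.2, 15.4])

# Keras-style "binary accuracy": round the predictions, compare to the raw targets.
acc = np.mean(np.round(y_pred) == y_true)
print(acc)  # 0.5 here: only the integer-valued targets (24.0 and 15.0) can ever match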

Neural Network has accuracy of 75% but has bad predictions

I'm building a convolutional neural network in order to predict 5 emotions from a dataset of faces.
After working on the construction of the weights I could get an accuracy of 75%:
score = model_2_emotion.evaluate(test_datagen.flow(X_test, Y_test, batch_size = 4))
print('Accuracy: {}'.format(score[1]))
308/308 [==============================] - 17s 56ms/step - loss: 0.6139 - accuracy: 0.7575
Accuracy: 0.7575264573097229
But model_2_emotion.predict(X_test) returns me this array
array([[0.6594997 , 0.00083318, 0.19473663, 0.08065161, 0.06427888],
[0.6610887 , 0.0008383 , 0.19332188, 0.08035047, 0.06440066],
[0.66172844, 0.00082645, 0.19264877, 0.08032911, 0.06446711],
...,
[0.66067713, 0.00084266, 0.19318439, 0.08052441, 0.06477145],
[0.66050553, 0.00085838, 0.19319515, 0.08056776, 0.06487323],
[0.6602842 , 0.00084602, 0.19372217, 0.08054546, 0.06460217]],
dtype=float32)
Here we can see it's just predicting "correctly" the first emotion (the first column) with an accuracy of about 60%, and from this array it produces this heat map:
Heat map
I think there is something wrong here, since it's always favoring the first emotion. Since I got 75% accuracy but bad predictions, does someone know what's going on?
Looking at your confusion matrix (this is not called a heat map), it seems like your model is only predicting a single class and that your data is unbalanced.
How many samples do you have for each class (is it unbalanced)?
For how many epochs does your model train?
How many neurons does your neural network have in the last layer (it is supposed to have 5)?
Only by looking closer at the data/problem (and at the train/test accuracy curves over epochs) could a better suggestion be made, but your problem seems to be under/overfitting, and you could benefit from a better theoretical basis.
Take a look at any source about the bias-variance trade-off, for example:
https://quantdare.com/mitigating-overfitting-neural-networks/
Here are some generic tips: get more data, improve preprocessing, improve the model (more layers, different kernel sizes, skip connections, batch normalization, different optimizers/learning rates, etc.).
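As one hedged, concrete example for the class-imbalance point above (the names X_train, Y_train, y_train_int and train_datagen are assumptions; the question only shows the test side), class weights can be passed to model.fit so under-represented emotions contribute more to the loss:
import numpy as np
from sklearn.utils.class_weight import compute_class_weight

# y_train_int: hypothetical integer labels in {0, ..., 4};
# if Y_train is one-hot encoded, use y_train_int = np.argmax(Y_train, axis=1) first.
classes = np.unique(y_train_int)
weights = compute_class_weight(class_weight='balanced', classes=classes, y=y_train_int)
class_weight = {int(c): w for c, w in zip(classes, weights)}

# Keras scales each sample's loss by the weight of its class.
model_2_emotion.fit(train_datagen.flow(X_train, Y_train, batch_size=4),
                    epochs=20,                 # illustrative value
                    class_weight=class_weight)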

What do these numbers mean when training in TensorFlow

Taking the following example:
import tensorflow as tf
data = tf.keras.datasets.mnist
(training_images, training_labels), (val_images, val_labels) = data.load_data()
training_images = training_images / 255.0
val_images = val_images / 255.0
model = tf.keras.models.Sequential([tf.keras.layers.Flatten(input_shape=(28,28)),
                                    tf.keras.layers.Dense(20, activation=tf.nn.relu),
                                    tf.keras.layers.Dense(10, activation=tf.nn.softmax)])
model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])
model.fit(training_images, training_labels, epochs=20, validation_data=(val_images, val_labels))
The result is something like this:
Epoch 1/20
1875/1875 [==============================] - 4s 2ms/step - loss: 0.4104 - accuracy: 0.8838 -
val_loss: 0.2347 - val_accuracy: 0.9304
Where does 1875 come from? What does that number represent? I am unable to see where it is coming from. training_images has a shape of 60000x28x28 when I look at it.
1875 is the number of iterations the training needs to complete the entire dataset with a batch size of 32.
1875 * 32 = 60k
Epoch: An epoch describes the number of times the algorithm sees the entire data set. So, each time the algorithm has seen all samples in the dataset, an epoch has completed.
Iteration: An iteration describes the number of times a batch of data passed through the algorithm. In the case of neural networks, that means the forward pass and backward pass. So, every time you pass a batch of data through the NN, you completed an iteration.
For more, you can refer to link-1 and link-2.
1875 is the number of steps/batches trained on. For example, with the default batch size of 32, this tells us that you have 60 000 images (plus or minus 31, as the last batch may or may not be full).
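A quick check of that arithmetic (model.fit above was called without an explicit batch_size, so the Keras default of 32 applies):
import math

steps_per_epoch = math.ceil(60000 / 32)   # 60000 MNIST training images / default batch size 32
print(steps_per_epoch)                    # 1875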

Defining model to reduce overfitting effect of batch normalization

I'm trying to train my model using transfer learning from a pretrained model, with 30 classes and 7200 images (80% train, 10% validation, 10% test). My model is always overfitting despite changing various parameters. After reading this link https://www.tensorflow.org/tutorials/images/transfer_learning#create_the_base_model_from_the_pre-trained_convnets, I understood that batch normalization keeps updating its variance statistics even though the convolutional base is frozen.
So I set training=False when calling base_model. But I'm still confused: is my code correct? My images are augmented using ImageDataGenerator, unlike the example, where augmentation and preprocessing are used as the base model input.
This is my code
#Create the model
inputs = keras.Input(shape=(224, 224, 3))
x = base_model(inputs, training=False)
x = tf.keras.layers.GlobalAveragePooling2D()(x)
outputs = tf.keras.layers.Dense((len(CLASS_NAMES)), activation='softmax')(x)
model = tf.keras.Model(inputs, outputs)
history = model.fit_generator(train_data_gen,
                              epochs=epochs,
                              steps_per_epoch=int(np.ceil(total_train / float(BATCH_SIZE))),
                              validation_data=val_data_gen,
                              validation_steps=int(np.ceil(total_val / float(BATCH_SIZE))),
                              callbacks=[cm_callback, tensorboard_callback])
Output
576/576 [==============================] - 157s 273ms/step - loss: 0.0075 - accuracy: 0.9996
144/144 [==============================] - 26s 181ms/step - loss: 0.0092 - accuracy: 1.0000
[0.007482105916197825, 0.99956596]
[0.009182391463279297, 1.0]
If my code is correct, is it good that the validation accuracy = 1 (too accurate)?
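For comparison, here is a hedged sketch of the pattern from the linked tutorial (MobileNetV2 is assumed as the base model, since the question does not say which one is used). Freezing the base and calling it with training=False keeps its BatchNormalization layers in inference mode, so their mean/variance statistics are not updated:
import tensorflow as tf

# Assumed base model; the question's base_model may differ.
base_model = tf.keras.applications.MobileNetV2(input_shape=(224, 224, 3),
                                               include_top=False,
                                               weights='imagenet')
base_model.trainable = False              # freeze the convolutional base

inputs = tf.keras.Input(shape=(224, 224, 3))
x = base_model(inputs, training=False)    # BatchNormalization stays in inference mode
x = tf.keras.layers.GlobalAveragePooling2D()(x)
outputs = tf.keras.layers.Dense(len(CLASS_NAMES), activation='softmax')(x)  # CLASS_NAMES as in the question
model = tf.keras.Model(inputs, outputs)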

Understanding epoch, batch size, accuracy and performance gain in lstm forecasting model

I am new to machine learning and LSTMs. I am referring to this link, LSTM for multistep forecasting, specifically the Encoder-Decoder LSTM Model With Multivariate Input section.
Here is my dataset description after reshaping the train and test set.
print(dataset.shape)
print(train_x.shape, train_y.shape)
print(test.shape)
(2192, 15)
(1806, 14, 14) (1806, 7, 1)
(364, 15)
In the above I have n_input=14, n_out=7.
Here is my LSTM model description:
def build_model(train, n_input):
    # prepare data
    train_x, train_y = to_supervised(train, n_input)
    # define parameters
    verbose, epochs, batch_size = 2, 100, 16
    n_timesteps, n_features, n_outputs = train_x.shape[1], train_x.shape[2], train_y.shape[1]
    # reshape output into [samples, timesteps, features]
    train_y = train_y.reshape((train_y.shape[0], train_y.shape[1], 1))
    # define model
    model = Sequential()
    model.add(LSTM(200, activation='relu', input_shape=(n_timesteps, n_features)))
    model.add(RepeatVector(n_outputs))
    model.add(LSTM(200, activation='relu', return_sequences=True))
    model.add(TimeDistributed(Dense(100, activation='relu')))
    model.add(TimeDistributed(Dense(1)))
    model.compile(loss='mse', optimizer='adam')
    # fit network
    model.fit(train_x, train_y, epochs=epochs, batch_size=batch_size, verbose=verbose)
    return model
On evaluating the model, I am getting the output as:
Epoch 98/100
- 8s - loss: 64.6554
Epoch 99/100
- 7s - loss: 64.4012
Epoch 100/100
- 7s - loss: 63.9625
According to my understanding (please correct me if I am wrong):
Here my model accuracy is 63.9625 (looking at the last epoch, 100). Also, this is not stable, since there is a gap between epoch 99 and epoch 100.
Here are my questions:
How are the epochs and batch size defined above related to model accuracy? How does increasing or decreasing them affect model accuracy?
Are my above-defined epochs, batch size and n_input correct for the model?
How can I increase my model accuracy? Is the above dataset size good enough for this model?
I am not able to connect all these parameters, so kindly help me understand how to achieve more accuracy through the above factors.
Having a very large number of epochs will not necessarily improve your accuracy. More epochs can increase the accuracy up to a certain limit, beyond which you begin to overfit your model. Having very few will also result in underfitting. See this. So looking at the huge difference between epoch 99 and epoch 100, you can already tell that you are overfitting the model. As a rule of thumb, when you notice the accuracy stops increasing, that is the ideal number of epochs you should have, usually between 1 and 10; 100 seems too much already (see the EarlyStopping sketch after the list below).
Batch size does not affect your accuracy. This is just used to control the speed or performance based on the memory in your GPU. If you have huge memory, you can have a huge batch size so training will be faster.
What you can do to increase your accuracy is:
1. Increase your dataset for training.
2. Try using convolutional networks instead. Find out more about convolutional networks on this YouTube channel; in a nutshell, CNNs help you identify which features to focus on when training your model.
3. Try other algorithms.
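Following the rule of thumb above, a hedged sketch using Keras' EarlyStopping callback (the monitored quantity, patience and validation split are illustrative choices, not from the question):
from tensorflow.keras.callbacks import EarlyStopping

# Stop once the validation loss has not improved for 5 epochs
# and roll back to the best weights seen so far.
early_stop = EarlyStopping(monitor='val_loss', patience=5, restore_best_weights=True)

model.fit(train_x, train_y,
          epochs=100,
          batch_size=16,
          validation_split=0.1,       # hold out part of the training data for validation
          callbacks=[early_stop],
          verbose=2)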
There is no well defined formula for batch size. Typically a larger batch size will run faster, but may compromise your accuracy. You will have to play around with the number.
However, one component with regard to epochs that you are missing is validation. It is normal to have a validation dataset and observe whether the accuracy over this dataset goes up or down. If the accuracy over this dataset stops improving, you can multiply your learning rate by 0.8. See this link: https://machinelearningmastery.com/difference-test-validation-datasets/
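That schedule maps naturally onto Keras' ReduceLROnPlateau callback; a hedged sketch (factor 0.8 as suggested above, while the monitored metric and patience are illustrative):
from tensorflow.keras.callbacks import ReduceLROnPlateau

# Multiply the learning rate by 0.8 whenever the validation loss
# has stopped improving for 3 consecutive epochs.
reduce_lr = ReduceLROnPlateau(monitor='val_loss', factor=0.8, patience=3, min_lr=1e-6)
It would be passed to model.fit via callbacks=[reduce_lr], together with validation data, just like the EarlyStopping sketch above.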