How to extract the loss function of a trained model? - tensorflow

Let's say that I have a model trained with TF2 (e.g., a model from the TF model zoo). How can I get the loss function of this model?
Note that I do not want the value of the loss for a given input (that can be obtained via the model.evaluate method); I want the loss function itself, such that:
I can take its gradient with respect to the input or any desired parameter
I can pass the labels and logits and it provides the loss value
Note that I am using an already-trained model (inheriting from tf.keras.Model).
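A minimal sketch of what this could look like, assuming the model was compiled with a standard Keras loss (so the loss identifier is reachable via model.loss); x_batch and y_true are placeholder names for your own data:
import tensorflow as tf
loss_fn = tf.keras.losses.get(model.loss)   # resolves a string or callable identifier to a loss function
x = tf.convert_to_tensor(x_batch, dtype=tf.float32)
with tf.GradientTape() as tape:
    tape.watch(x)                            # needed because x is not a Variable
    logits = model(x, training=False)
    loss_value = loss_fn(y_true, logits)     # the loss function applied to labels and logits
grad_wrt_input = tape.gradient(loss_value, x)   # gradient of the loss w.r.t. the input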

Related

Keras custom loss function with multiple output model

In a segmentation task I wanted my model to have two outputs, because I implemented weight maps as suggested in the original U-Net paper https://arxiv.org/pdf/1505.04597.pdf.
As per the suggestion, I created weight maps that give some regions of the ground-truth mask higher weights. Now I have a model with:
weightmap = layers.Lambda(lambda x: x)(weight_map)  # a non-trainable layer that outputs the weight map as a tensor for the loss function
model = Model(inputs=[input, weight_map], outputs=[output, weightmap])
Now I need to compute the binary cross-entropy loss for this model:
def custom_loss(target, outputs):
    loss = K.binary_crossentropy(target, outputs[0])  # outputs[0] should be the model output
    loss = loss * outputs[1]                          # outputs[1] should be the weight map
    return loss
This outputs[0] and outputs[1] slicing of the output tensor from the model doesn't work.
Is there anything I can do to compute the loss from both outputs of the model in a single loss function?
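One possible workaround (a sketch, not the asker's code; layer names and shapes are illustrative) is to feed the target and the weight map as extra inputs and compute the weighted cross-entropy inside the graph with add_loss, so no slicing of a merged output is needed:
import tensorflow as tf
from tensorflow.keras import layers, Model
import tensorflow.keras.backend as K
inp = layers.Input(shape=(128, 128, 1))
weight_map = layers.Input(shape=(128, 128, 1))
target = layers.Input(shape=(128, 128, 1))
x = layers.Conv2D(16, 3, padding="same", activation="relu")(inp)
output = layers.Conv2D(1, 1, activation="sigmoid")(x)
# weighted binary cross-entropy computed inside the model
weighted_bce = layers.Lambda(
    lambda t: K.mean(K.binary_crossentropy(t[0], t[1]) * t[2])
)([target, output, weight_map])
model = Model(inputs=[inp, weight_map, target], outputs=output)
model.add_loss(weighted_bce)
model.compile(optimizer="adam")  # no loss= needed; add_loss already provides it
# model.fit([images, weight_maps, masks], epochs=...)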

Keras: Custom loss function with training data not directly related to model

I am trying to convert my CNN written with TensorFlow layers to use the Keras API in TensorFlow (I am using the Keras API provided by TF 1.x), and am having trouble writing a custom loss function to train the model.
According to this guide, a loss function is expected to take the arguments (y_true, y_pred):
https://www.tensorflow.org/guide/keras/train_and_evaluate#custom_losses
def basic_loss_function(y_true, y_pred):
    return ...
However, in every example I have seen, y_true is somehow directly related to the model (in the simple case it is compared directly against the output of the network). In my problem, this is not the case. How do I implement this if my loss function depends on some training data that is unrelated to the tensors of the model?
To be concrete, here is my problem:
I am trying to learn an image embedding trained on pairs of images. My training data consists of image pairs and annotations of matching points between them (image coordinates). The input features are only the image pairs, and the network is trained in a siamese configuration.
I was able to implement this with TensorFlow layers and train it successfully with TensorFlow estimators.
My current implementation builds a tf.data Dataset from a large database of TFRecords, where the features are a dictionary containing the images and arrays of matching points. Previously I could easily feed these arrays of image coordinates to the loss function, but here it is unclear how to do so.
There is a hack I often use, which is to calculate the loss within the model by means of Lambda layers (for instance, when the loss is independent of the true data and the model doesn't really have an output to be compared).
In a functional API model:
def loss_calc(x):
    loss_input_1, loss_input_2 = x  # arbitrary inputs, you choose,
                                    # according to what you gave to the Lambda layer
    # here you use some external data that doesn't relate to the samples
    externalData = K.constant(external_numpy_data)
    # calculate the loss, for example:
    loss = K.mean(K.square(loss_input_1 - loss_input_2) * externalData)
    return loss
Use the outputs of the model itself (the tensor(s) that are used in your loss):
loss = Lambda(loss_calc)([model_output_1, model_output_2])
Create the model outputting the loss instead of the outputs:
model = Model(inputs, loss)
Create a dummy Keras loss function for compilation:
def dummy_loss(y_true, y_pred):
    return y_pred  # y_pred is the loss itself, the output of the model above
model.compile(loss=dummy_loss, ...)
Use any dummy array correctly sized with respect to the number of samples for training; it will be ignored:
model.fit(your_inputs, np.zeros((number_of_samples,)), ...)
Another way of doing it is using a custom training loop. This is much more work, though.
Although you're using TF1, you can still turn eager execution on at the very beginning of your code and do stuff like it's done in TF2. (tf.enable_eager_execution())
Follow the tutorial for custom training loops: https://www.tensorflow.org/tutorials/customization/custom_training_walkthrough
Here, you calculate the gradients yourself, of any result with respect to whatever you want. This means you don't need to follow Keras standards of training.
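The core of such a loop, sketched under the assumption that model, optimizer, loss_fn and dataset already exist (any extra terms, e.g. penalties based on the matching coordinates, can be added to loss inside the tape):
import tensorflow as tf
for x_batch, y_batch in dataset:
    with tf.GradientTape() as tape:
        predictions = model(x_batch, training=True)
        loss = loss_fn(y_batch, predictions)
    grads = tape.gradient(loss, model.trainable_variables)
    optimizer.apply_gradients(zip(grads, model.trainable_variables))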
Finally, you can use the approach you suggested of model.add_loss.
In this case, you calculate the loss exactly the same way as in the first approach above, and pass this loss tensor to add_loss.
You can probably compile a model with loss=None then (not sure), because you're going to use other losses, not the standard one.
In this case, your model's output will probably be None too, and you should fit with y=None.
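Roughly (a sketch reusing loss_calc from above; the exact compile/fit arguments may vary by version):
loss_tensor = Lambda(loss_calc)([model_output_1, model_output_2])
model = Model(inputs, outputs)  # keep the real outputs here if you still want predictions
model.add_loss(loss_tensor)
model.compile(optimizer='adam', loss=None)
model.fit(your_inputs, y=None, ...)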

What does `training=True` mean when calling a TensorFlow Keras model?

In TensorFlow's official documentation, they always pass training=True when calling a Keras model in a training loop, for example, logits = mnist_model(images, training=True).
I tried help(tf.keras.Model.call) and it shows that
Help on function call in module tensorflow.python.keras.engine.network:
call(self, inputs, training=None, mask=None)
Calls the model on new inputs.
In this case `call` just reapplies
all ops in the graph to the new inputs
(e.g. build a new computational graph from the provided inputs).
Arguments:
inputs: A tensor or list of tensors.
training: Boolean or boolean scalar tensor, indicating whether to run
the `Network` in training mode or inference mode.
mask: A mask or list of masks. A mask can be
either a tensor or None (no mask).
Returns:
A tensor if there is a single output, or
a list of tensors if there are more than one outputs.
It says that training is a Boolean or boolean scalar tensor, indicating whether to run the Network in training mode or inference mode. But I didn't find any information about these two modes.
In a nutshell, I don't know what influence this argument has, and what happens if I omit it when training.
Some neural network layers behave differently during training and inference, for example Dropout and BatchNormalization layers. For example
During training, dropout will randomly drop out units and correspondingly scale up activations of the remaining units.
During inference, it does nothing (since you usually don't want the randomness of dropping out units here).
The training argument lets the layer know which of the two "paths" it should take. If you set this incorrectly, your network might not behave as expected.
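A small demonstration of the difference (illustrative values):
import tensorflow as tf
layer = tf.keras.layers.Dropout(0.5)
x = tf.ones((1, 4))
print(layer(x, training=False).numpy())  # [[1. 1. 1. 1.]] -- dropout is a no-op at inference
print(layer(x, training=True).numpy())   # units randomly zeroed; survivors scaled by 1/(1-0.5)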
For BatchNormalization, the training argument indicates whether the layer should behave in training mode or in inference mode:
training=True: The layer will normalize its inputs using the mean and variance of the current batch of inputs.
training=False: The layer will normalize its inputs using the mean and variance of its moving statistics, learned during training.
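For example (a minimal illustration):
import tensorflow as tf
bn = tf.keras.layers.BatchNormalization()
x = tf.random.normal((8, 4))
y_train = bn(x, training=True)   # normalizes with the batch mean/variance and updates the moving statistics
y_infer = bn(x, training=False)  # normalizes with the moving mean/variance learned so far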
Usually training=False in inference mode, but in some networks, such as the pix2pix cGAN, training=True is used both at training and inference time.

How to get gradients during fit or fit_generator in Keras

I need to monitor the gradients in real time during training when using the fit or fit_generator methods. This should be achievable with a custom callback function; however, I don't know how to access the gradients correctly. The attribute model.optimizer.update returns gradient tensors, but they need to be fed with data. What I want is the value of the gradients that were applied in the last batch during training.
The following answer does not give the corresponding solution, because it just defines a function to calculate the gradients by feeding extra data:
Getting gradient of model output w.r.t weights using Keras
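One possible approach in newer tf.keras versions (TF 2.2+, so it may not apply to the fit_generator-era API the question mentions) is to override train_step so the exact per-batch gradients can be inspected before they are applied; this is a sketch, not an official recipe:
import tensorflow as tf
class GradientLoggingModel(tf.keras.Model):
    def train_step(self, data):
        x, y = data
        with tf.GradientTape() as tape:
            y_pred = self(x, training=True)
            loss = self.compiled_loss(y, y_pred, regularization_losses=self.losses)
        grads = tape.gradient(loss, self.trainable_variables)
        tf.print('global grad norm:', tf.linalg.global_norm(grads))  # or log per-variable norms
        self.optimizer.apply_gradients(zip(grads, self.trainable_variables))
        self.compiled_metrics.update_state(y, y_pred)
        return {m.name: m.result() for m in self.metrics}
# usage: model = GradientLoggingModel(inputs, outputs); model.compile(...); model.fit(...)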

Obtaining Logits of the output from deeplab model

I'm using a pre-trained DeepLab model (from here) to obtain segmentations for an input image. I'm able to obtain the semantic labels (i.e. SemanticPredictions), which is argmax applied to the logits (link).
I was wondering if there is an easy way to obtain the logits before the argmax? I was hoping to find the output tensor name and simply pass it to my TF session, as in the following:
tf_session.run(
    self.OUTPUT_TENSOR_NAME,
    feed_dict={self.INPUT_TENSOR_NAME: [np.asarray(input_image)]})
But I have not been able to locate such a tensor name in the code, i.e. one that reveals the logits or softmax outputs.
For a model trained from MobileNet_V2, setting self.OUTPUT_TENSOR_NAME = 'ResizeBilinear_2:0' retrieves the logits before the argmax is performed.
I suspect the same holds for Xception, but I have not verified it.
I arrived at this answer by loading my model in TensorFlow, then printing the names of all layers in the loaded graph. Finally, I took the name of the final output layer before the last 'ArgMax' layer and ran some inference using that.
Here is a link to a stackoverflow question on printing the names of the layers in a graph. I found the answer by Ted to be most helpful.
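A rough sketch of that process (TF1-style graph inspection; the file name 'frozen_inference_graph.pb' and the input tensor name 'ImageTensor:0' are illustrative and should be checked against your own export):
import numpy as np
import tensorflow as tf
graph = tf.Graph()
with graph.as_default():
    graph_def = tf.compat.v1.GraphDef()
    with tf.io.gfile.GFile('frozen_inference_graph.pb', 'rb') as f:
        graph_def.ParseFromString(f.read())
    tf.import_graph_def(graph_def, name='')
for op in graph.get_operations():
    print(op.name)  # look for the op just before the final 'ArgMax'
with tf.compat.v1.Session(graph=graph) as sess:
    logits = sess.run('ResizeBilinear_2:0',
                      feed_dict={'ImageTensor:0': [np.asarray(input_image)]})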
By the way, the output layers of DeepLabV3 models do not apply softmax, so you cannot simply take the raw values of the output vectors as confidences.