Keras: Multiple outputs, loss only a function of one? - tensorflow

I have a setup like this:
model = keras.Model(input,[output1,output2])
My loss function is only a function of output1. How do I tell Keras to ignore output2 for the purposes of computing loss? The best I have come up with is to generate a bogus loss function which always returns 0.0:
model.compile(optimizer=..., loss=[realLossFunction, zeroLossFunction])
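For instance, a minimal sketch of such a placeholder loss (assuming tf.keras; multiplying by a sum over y_pred keeps the graph connected while the value is always 0.0):

import tensorflow as tf

def zeroLossFunction(y_true, y_pred):
    # always evaluates to 0.0, so output2 never contributes to the total loss
    return 0.0 * tf.reduce_sum(y_pred)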
I can live with this, but I have to see the statistics and progress of this loss function all over the place and would like to know if there is a more elegant way.

You could simply leave this output out of the model you train, and then reuse the weights (or share the layers via the functional API) to add the additional output in a separate, full model.
But using a zero loss is also fine.
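A sketch of the layer-sharing idea, plus a variant that newer tf.keras versions may accept, where outputs without a loss are simply omitted from a loss dict (verify against your version; "output1_layer" below is a hypothetical layer name):

# Variant 1: train a model that only exposes output1 ...
train_model = keras.Model(input, output1)
train_model.compile(optimizer="adam", loss=realLossFunction)

# ... and build a second model that shares the same layers/weights
# and also returns output2, for prediction only.
full_model = keras.Model(input, [output1, output2])

# Variant 2: give a loss only for output1 via a dict keyed by layer name,
# so output2 is ignored for loss purposes.
model = keras.Model(input, [output1, output2])
model.compile(optimizer="adam", loss={"output1_layer": realLossFunction})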

Related

Loss reduction in tf.distribute.MirroredStrategy()

I'm confused regarding using distributed strategy and the correct way of reduction in loss functions.
I implemented a U-Net using tf.distribute.MirroredStrategy(). Everything works fine using default loss BinaryCrossentropy as follows:
with strategy.scope():
    model = build_network((size, size, 3), num_classes)
    model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=args.learning_rate),
                  loss=tf.keras.losses.BinaryCrossentropy())
However, I want to create custom loss functions. To start with, I wrote a wrapper containing BinaryCrossentropy, to get familiar with the correct way of using the reduction methods. I followed the instructions in https://www.tensorflow.org/tutorials/distribute/custom_training#define_the_loss_function
and used tf.nn.compute_average_loss in order to divide by the global batch_size.
def loss_functions(loss_spec):
    if loss_spec == 'cross_entropy':
        def c_loss(truth, pred):
            my_loss = tf.keras.losses.BinaryCrossentropy(reduction=tf.keras.losses.Reduction.NONE)(truth, pred)
            my_loss = tf.math.reduce_mean(my_loss, axis=[1, 2])  # to compute average across the two image dimensions
            my_loss = tf.nn.compute_average_loss(my_loss)  # sums up all items and divides by batch size
            return my_loss
        return c_loss
which is called in the following way:
model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=args.learning_rate),
              loss=utils.loss_functions('cross_entropy'))
It also works, but I noticed a difference of a factor equal to the number of replicas compared to using tf.keras.losses.BinaryCrossentropy() directly. I.e., when using two replicas, BinaryCrossentropy() yields a loss twice as large as my custom loss. Thus, to get the same result, I would need to divide by the per-replica batch size instead of the global batch size, i.e., the way it should NOT be done according to the documentation.
However, the documentation refers to writing your own training routine, whereas I am using the model.compile() and model.fit() methods.
Can anybody explain this behaviour to me?
UPDATE:
Neither tf.nn.compute_average_loss nor any other reduction over the batch axis is needed when using model.compile() and model.fit() - the reduction and scaling are done automatically. However, I still do not know how model.fit() works internally.
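In other words, with model.compile()/model.fit() a custom loss can simply return a per-sample value and leave the batch reduction and distribution scaling to Keras. A minimal sketch under that assumption (TF 2.x):

def c_loss(truth, pred):
    # per-pixel binary cross-entropy, shape (batch, H, W)
    per_pixel = tf.keras.losses.binary_crossentropy(truth, pred)
    # average over the image dimensions only; keep the batch axis and let
    # Keras / MirroredStrategy handle the remaining reduction and scaling
    return tf.math.reduce_mean(per_pixel, axis=[1, 2])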
Thanks and cheers, everybody

How to specify custom weight updates in tensorflow custom optimizer

In a custom optimizer I would like to update weights with random values if the loss function has not decreased.
However, I can not see how to do that in the methods you can override (resource_apply_dense, resource_apply_sparse, create_slots, get_config). None of them are passed the loss function.
I have tried overriding minimize(), but that is not called in a standard training loop.
Any ideas?
If you are writing a custom optimizer, I think the easiest way to apply it is to also define the layers explicitly. In a standard feedforward neural network, if x is the input, then h = tf.tanh(tf.matmul(x, W) + b) is an example of the first hidden layer. Similarly, you can add more layers. Then W and b are the variables you need to update. The training loop would look something like this:
trainable_variables = [W, b]
for i in range(1000):
    optimizer.minimize(loss, trainable_variables)  # in TF 2.x eager mode, loss should be a callable returning the loss value
but with your own optimizer instead of the one from keras.
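As a rough sketch of how the "random update when the loss does not decrease" idea could be done with a plain training loop instead of a custom optimizer (TF 2.x eager; build_model, loss_fn and dataset are hypothetical placeholders):

import tensorflow as tf

model = build_model()                      # hypothetical model-building helper
loss_fn = tf.keras.losses.MeanSquaredError()
optimizer = tf.keras.optimizers.SGD(learning_rate=0.01)
best_loss = float("inf")

for x, y in dataset:                       # dataset is assumed to yield (inputs, targets) batches
    with tf.GradientTape() as tape:
        loss = loss_fn(y, model(x, training=True))
    if loss < best_loss:
        # loss improved: take a normal gradient step
        best_loss = float(loss)
        grads = tape.gradient(loss, model.trainable_variables)
        optimizer.apply_gradients(zip(grads, model.trainable_variables))
    else:
        # loss did not improve: perturb each weight with small random values
        for v in model.trainable_variables:
            v.assign_add(tf.random.normal(v.shape, stddev=1e-3))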

Keras: Custom loss function with training data not directly related to model

I am trying to convert my CNN, written with tensorflow layers, to use the keras api in tensorflow (I am using the keras api provided by TF 1.x), and am having trouble writing a custom loss function to train the model.
According to this guide, when defining a loss function it expects the arguments (y_true, y_pred)
https://www.tensorflow.org/guide/keras/train_and_evaluate#custom_losses
def basic_loss_function(y_true, y_pred):
return ...
However, in every example I have seen, y_true is somehow directly related to the model (in the simple case it is the output of the network). In my problem, this is not the case. How do I implement this if my loss function depends on some training data that is unrelated to the tensors of the model?
To be concrete, here is my problem:
I am trying to learn an image embedding trained on pairs of images. My training data includes image pairs and annotations of matching points between the image pairs (image coordinates). The input feature is only the image pairs, and the network is trained in a siamese configuration.
I am able to implement this successfully with tensorflow layers and train it successfully with tensorflow estimators.
My current implementation builds a tf Dataset from a large database of tf Records, where the features are a dictionary containing the images and arrays of matching points. Previously I could easily feed these arrays of image coordinates to the loss function, but here it is unclear how to do so.
There is a hack I often use, which is to calculate the loss within the model by means of Lambda layers. (This is useful when the loss is independent of the true data, for instance, and the model doesn't really have an output to be compared.)
In a functional API model:
def loss_calc(x):
    loss_input_1, loss_input_2 = x  # arbitrary inputs, you choose
                                    # according to what you gave to the Lambda layer

    # here you use some external data that doesn't relate to the samples
    externalData = K.constant(external_numpy_data)

    # combine the inputs (and the external data) however your loss requires, e.g.:
    the_loss = K.mean(K.square(loss_input_1 - loss_input_2))
    return the_loss
Using the outputs of the model itself (the tensor(s) that are used in your loss):
loss = Lambda(loss_calc)([model_output_1, model_output_2])
Create the model outputting the loss instead of the outputs:
model = Model(inputs, loss)
Create a dummy keras loss function for compilation:
def dummy_loss(y_true, y_pred):
    return y_pred  # where y_pred is the loss itself, the output of the model above
model.compile(loss = dummy_loss, ....)
For training, use any dummy array with the correct number of samples; it will be ignored:
model.fit(your_inputs, np.zeros((number_of_samples,)), ...)
Another way of doing it is to use a custom training loop.
This is much more work, though.
Although you're using TF1, you can still turn eager execution on at the very beginning of your code and do stuff like it's done in TF2. (tf.enable_eager_execution())
Follow the tutorial for custom training loops: https://www.tensorflow.org/tutorials/customization/custom_training_walkthrough
Here, you calculate the gradients yourself, of whatever result you want, with respect to whatever variables you want. This means you don't need to follow Keras' standard way of training.
Finally, you can use the model.add_loss approach you suggested.
In this case, you calculate the loss exactly the same way I did in the first answer, and pass this loss tensor to add_loss.
You can probably compile the model with loss=None then (not sure), because you're going to use other losses, not the standard one.
In this case, your model's outputs will probably have no loss associated with them, and you should fit with y=None.
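A rough sketch of that add_loss route (functional API; it reuses the hypothetical loss_calc and tensor names from the first approach, and exact behaviour may vary across Keras versions):

# compute the loss as a symbolic tensor inside the graph
loss_tensor = Lambda(loss_calc)([model_output_1, model_output_2])

# keep the real outputs, but register the loss via add_loss
model = Model(inputs, [model_output_1, model_output_2])
model.add_loss(K.mean(loss_tensor))

# no standard per-output loss; add_loss supplies everything
model.compile(optimizer='adam', loss=None)
model.fit(your_inputs, y=None, epochs=10)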

How to create a tensor from a float value?

I'm using keras to build NN.
I need to make a custom evaluation function (which I did, but not completely).
Here is part of the code:
from keras import backend as K
...
def customized_loss(y_true, y_pred):
    loss = K.square(y_true - y_pred)
    return K.sum(loss)
...
model.compile(loss=customized_loss, optimizer='adam')
This part works perfectly but that's not really what I need.
I'd like to be able to return e.g. 0.123
Currently, what is being returned is K.sum(loss), which is also just a value (wrapped in some tensor).
However you write it down - 0.123 or [0.123] or [[0.123]] or ...
So the question is: how do I transform a float into a tensor that will be accepted as the return value of this custom loss function?
E.g. something like return K.new_tensor(0.123)
(also fyi I'm using tensorflow as backend)
EDIT:
I just realized that my idea makes no sense conceptually. I wanted to apply a loss (amount) without providing any info on what the loss function is (i.e. what its derivative is), which is crucial for backpropagation so it knows how to update the weights.
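For the literal question of turning a float into a tensor, a constant can be made with the backend (e.g. K.constant) or with tf.constant; but, as the edit above explains, a constant loss carries no gradient, so the weights would never update:

from keras import backend as K

def constant_loss(y_true, y_pred):
    return K.constant(0.123)  # a valid tensor, but useless for training: no gradient w.r.t. the weights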

How to wrap a custom TensorFlow loss function in Keras?

This is my third attempt to get a deep learning project off the ground. I'm working with protein sequences. First I tried TFLearn, then raw TensorFlow, and now I'm trying Keras.
The previous two attempts taught me a lot, and gave me some code and concepts that I can re-use. However, there has always been an obstacle, and I've asked questions that the developers can't answer (in the case of TFLearn), or I've simply gotten bogged down (TensorFlow object introspection is tedious).
I have written this TensorFlow loss function, and I know it works:
def l2_angle_distance(pred, tgt):
    with tf.name_scope("L2AngleDistance"):
        # Scaling factor
        count = tgt[..., 0, 0]
        scale = tf.to_float(tf.count_nonzero(tf.is_finite(count)))
        # Mask NaN in tgt
        tgt = tf.where(tf.is_nan(tgt), pred, tgt)
        # Calculate L1 losses
        losses = tf.losses.cosine_distance(pred, tgt, -1, reduction=tf.losses.Reduction.NONE)
        # Square the losses, then sum, to get L2 scalar loss.
        # Divide the loss result by the scaling factor.
        return tf.reduce_sum(losses * losses) / scale
My target values (tgt) can include NaN, because my protein sequences are passed in a 4D Tensor, despite the fact that the individual sequences differ in length. Before you ask, the data can't be resampled like an image. So I use NaN in the tgt Tensor to indicate "no prediction needed here." Before I calculate the L2 cosine loss, I replace every NaN with the matching values in the prediction (pred) so the loss for every NaN is always zero.
Now, how can I re-use this function in Keras? It appears that the Keras Lambda core layer is not a good choice, because a Lambda only takes a single argument, and a loss function needs two arguments.
Alternately, can I rewrite this function in Keras? I shouldn't ever need to use the Theano or CNTK backend, so it isn't necessary for me to rewrite my function in Keras. I'll use whatever works.
I just looked at the Keras losses.py file to get some clues. I imported keras.backend and had a look around. I also found https://keras.io/backend/. I don't seem to find wrappers for ANY of the TensorFlow function calls I happen to use: to_float(), count_nonzero(), is_finite(), where(), is_nan(), cosine_distance(), or reduce_sum().
Thanks for your suggestions!
I answered my own question. I'm posting the solution for anyone who may come across this same problem.
I tried using my TF loss function directly in Keras, as was independently suggested by Matias Valdenegro. I did not provoke any errors from Keras by doing so; however, the loss value went immediately to NaN.
Eventually I identified the problem. The calling convention for a Keras loss function is first y_true (which I called tgt), then y_pred (my pred). But the calling convention for a TensorFlow loss function is pred first, then tgt. So if you want to keep a Tensorflow-native version of the loss function around, this fix works:
def keras_l2_angle_distance(tgt, pred):
    return l2_angle_distance(pred, tgt)

<snip>

model.compile(loss=keras_l2_angle_distance, optimizer="something")
Maybe Theano or CNTK uses the same parameter order as Keras, I don't know. But I'm back in business.
You don't need to use keras.backend; since your loss is written directly in TensorFlow, you can use it directly in Keras. The backend functions are an abstraction layer so you can code a loss/layer that will work with the multiple backends available in Keras.
You just have to put your loss in the model.compile call:
model.compile(loss = l2_angle_distance, optimizer = "something")