I need a custom weighted MSE loss function. I defined it in keras.backend
from keras import backend as K

def weighted_loss(y_true, y_pred):
    return K.mean(K.square(y_pred - y_true) *
                  K.exp(-K.log(1.7) * (K.log(1. + K.exp((y_true - 3) / 5)))),
                  axis=-1)
However, a test run returns
weighted_loss(1,2)
ValueError: Tensor conversion requested dtype int32 for Tensor with dtype float32: 'Tensor("Exp_37:0", shape=(), dtype=float32)'
or
weighted_loss(1.,2.)
ZeroDivisionError: integer division or modulo by zero
I wonder what mistakes I am making here.
Whether you are using TensorFlow or Theano is irrelevant to your question. Google the meaning of 'tensor' if the term confuses you.
Take a look at how Keras' own loss function tests are implemented here:
def test_metrics():
    y_a = K.variable(np.random.random((6, 7)))
    y_b = K.variable(np.random.random((6, 7)))
    for metric in all_metrics:
        output = metric(y_a, y_b)
        print(metric.__name__)
        assert K.eval(output).shape == (6,)
You can't simply feed plain floats or ints into tensor calculations. Note also the use of K.eval to obtain the result you're looking for.
So try something similar with your function:
from keras import backend as K
import numpy as np

y_a = K.variable(np.random.random((6, 7)))
y_b = K.variable(np.random.random((6, 7)))
output = weighted_loss(y_a, y_b)
result = K.eval(output)  # a NumPy array of shape (6,), one loss value per sample
There is also no need to define your custom function in keras.backend - what if you decide to update Keras later on? Instead, define the loss function in your own code:
def weighted_loss(y_true, y_pred):
    return K.mean(K.square(y_pred - y_true) *
                  K.exp(-K.log(1.7) * (K.log(1. + K.exp((y_true - 3) / 5)))),
                  axis=-1)
Then when you want to compile your model with your loss function, you can do:
model.compile(loss=weighted_loss)
In case you want to define a more general loss function, where the weighting depends on some input, you'll need to wrap the function in a closure: Keras always calls a loss with exactly (y_true, y_pred), so any extra parameter has to be captured from the enclosing scope. For example:
def get_weighted_loss(my_input):
    def weighted_loss(y_true, y_pred):
        return K.mean(K.square(y_pred - y_true) *
                      K.exp(-K.log(1.7) * (K.log(1. + K.exp((y_true - 3) / my_input)))),
                      axis=-1)
    return weighted_loss
Then when you want to compile your model with your loss function, you can do:
model.compile(loss=get_weighted_loss(5))
The sample code below shows that all of the following give the same (correct) results when writing a custom loss function (calculating mean_squared_error) for a simple linear regression model:
Not using tf.reduce_mean() (so returning a loss for each example)
Using tf.reduce_mean() (so returning a single loss)
Using tf.reduce_mean(..., axis=-1)
Is there any reason to prefer one approach over another, and are there any circumstances where it makes a difference?
(There is, for example, sample code at Make a custom loss function in keras that suggests axis=-1 should be used.)
import numpy as np
import tensorflow as tf

# Create a simple dataset to do linear regression on.
# The mean squared error (~ best achievable MSE loss after fitting linear regression)
# for this dataset is 0.01.
xtrain = np.random.randn(5000)                  # Already normalized
ytrain = xtrain + np.random.randn(5000) * 0.1   # Close enough to being normalized

# Function to create a model, fit the linear regression, and report the final loss
def cre_and_fit(loss="mean_squared_error", lossdescription="", epochs=20):
    model = tf.keras.models.Sequential([tf.keras.layers.Dense(1, input_shape=(1,))])
    model.compile(loss=loss, optimizer="RMSProp")
    history = model.fit(xtrain, ytrain, epochs=epochs, verbose=False)
    print(f"Final loss value for {lossdescription}: {history.history['loss'][-1]:.4f}")

# Result from standard MSE loss ~ 0.01
cre_and_fit("mean_squared_error", "Keras standard MSE")

# This gives the right result, not reducing. Return shape = (batch_size,)
cre_and_fit(lambda y_true, y_pred: (y_true - y_pred) * (y_true - y_pred),
            "custom loss, not reducing over batch items")

# This also gives the right result, reducing over batch items. Return shape = ()
cre_and_fit(lambda y_true, y_pred: tf.reduce_mean((y_true - y_pred) * (y_true - y_pred)),
            "custom loss, reducing over batch items")

# How about using axis=-1? Also gives the same result
cre_and_fit(lambda y_true, y_pred: tf.reduce_mean((y_true - y_pred) * (y_true - y_pred), axis=-1),
            "custom loss, reducing with axis=-1")
When you pass a lambda (or a callable in general) to compile and call fit, TF will wrap it inside a LossFunctionWrapper, which is a subclass of Loss, with a default reduction type of ReductionV2.AUTO. Note that a Loss object always has a reduction type representing how it will reduce the loss tensor to a single scalar.
Under most circumstances, ReductionV2.AUTO translates to ReductionV2.SUM_OVER_BATCH_SIZE which, despite its name, actually takes the mean over all axes of the underlying lambda's output.
import tensorflow as tf
from keras import losses as losses_mod
from keras.utils import losses_utils

a = tf.random.uniform((10, 2))
b = tf.random.uniform((10, 2))

l_auto = losses_mod.LossFunctionWrapper(fn=lambda y_true, y_pred: tf.square(y_true - y_pred),
                                        reduction=losses_utils.ReductionV2.AUTO)
l_sum = losses_mod.LossFunctionWrapper(fn=lambda y_true, y_pred: tf.square(y_true - y_pred),
                                       reduction=losses_utils.ReductionV2.SUM_OVER_BATCH_SIZE)

l_auto(a, b).shape.rank == l_sum(a, b).shape.rank == 0  # rank 0 means scalar
l_auto(a, b) == tf.reduce_mean(tf.square(a - b))        # True
l_sum(a, b) == tf.reduce_mean(tf.square(a - b))         # True
So to answer your question, the three options are equivalent since they all eventually result in a single scalar that is the mean of all elements in the raw tf.square(a - b) loss tensor. However, should you wish to perform an operation other than reduce_mean e.g., reduce_sum, in the lambda, then the three will yield different results:
l1 = losses_mod.LossFunctionWrapper(fn=lambda y_true, y_pred: tf.square(y_true - y_pred),
                                    reduction=losses_utils.ReductionV2.AUTO)
l2 = losses_mod.LossFunctionWrapper(fn=lambda y_true, y_pred: tf.reduce_sum(tf.square(y_true - y_pred)),
                                    reduction=losses_utils.ReductionV2.AUTO)
l3 = losses_mod.LossFunctionWrapper(fn=lambda y_true, y_pred: tf.reduce_sum(tf.square(y_true - y_pred), axis=-1),
                                    reduction=losses_utils.ReductionV2.AUTO)

l1(a, b) == tf.reduce_mean(tf.square(a - b))                          # True
l2(a, b) == tf.reduce_sum(tf.square(a - b))                           # True
l3(a, b) == tf.reduce_mean(tf.reduce_sum(tf.square(a - b), axis=-1))  # True
Concretely, l2(a,b) == tf.reduce_mean(tf.reduce_sum(tf.square(a-b))), but that is just tf.reduce_sum(tf.square(a-b)), since the mean of a scalar is the scalar itself.
I am using TensorFlow 2. I am trying to optimize a function which uses the loss of a trained TensorFlow model (poison).
@tf.function
def totalloss(x):
    xt = tf.multiply(x, (1.0 - m)) + tf.multiply(m, d)
    label = targetlabel * np.ones(xt.shape[0])
    loss1 = poison.evaluate(xt, label, steps=1)
    loss2 = tf.linalg.norm(m, 1)
    return loss1 + loss2
I am not able to execute this function; however, when I comment out the @tf.function line, the function works!
I need to use this function as a TensorFlow op so as to optimize 'm' and 'd'.
ValueError: Unknown graph. Aborting.
This is how I am defining the model and variables:
# mask
m = tf.Variable(tf.zeros(shape=(1, 784)), name="m")
d = tf.Variable(tf.zeros(shape=(1, 784)), name="d")
# target
targetlabel = 6
poison = fcn()
poison.load_weights("MNISTP.h5")
adam = tf.keras.optimizers.Adam(lr=.002, decay=1e-6)
poison.compile(optimizer=adam, loss=tf.losses.sparse_categorical_crossentropy)
This is how I am calling the function later (executing this line results in the error listed below; however, if I comment out the @tf.function line, this command works!):
loss = totalloss(ptestdata)
This is the entire traceback call:
ValueError: in converted code:

    <ipython-input-52-4841ad87022f>:5 totalloss *
        loss1 = poison.evaluate(xt, label, steps=1)
    /usr/local/lib/python3.6/dist-packages/tensorflow/python/keras/engine/training.py:746 evaluate
        use_multiprocessing=use_multiprocessing)
    /usr/local/lib/python3.6/dist-packages/tensorflow/python/keras/engine/training_arrays.py:693 evaluate
        callbacks=callbacks)
    /usr/local/lib/python3.6/dist-packages/tensorflow/python/keras/engine/training_arrays.py:187 model_iteration
        f = _make_execution_function(model, mode)
    /usr/local/lib/python3.6/dist-packages/tensorflow/python/keras/engine/training_arrays.py:555 _make_execution_function
        return model._make_execution_function(mode)
    /usr/local/lib/python3.6/dist-packages/tensorflow/python/keras/engine/training.py:2034 _make_execution_function
        self._make_test_function()
    /usr/local/lib/python3.6/dist-packages/tensorflow/python/keras/engine/training.py:2010 _make_test_function
        **self._function_kwargs)
    /usr/local/lib/python3.6/dist-packages/tensorflow/python/keras/backend.py:3544 function
        return EagerExecutionFunction(inputs, outputs, updates=updates, name=name)
    /usr/local/lib/python3.6/dist-packages/tensorflow/python/keras/backend.py:3429 __init__
        raise ValueError('Unknown graph. Aborting.')

    ValueError: Unknown graph. Aborting.
The purpose of the @tf.function decorator is to convert TensorFlow operations written in Python into a TensorFlow graph to achieve better performance. The error can occur when you try to use a pre-trained model with a serialized graph inside such a function: the decorator cannot perform that graph-to-graph conversion.
I've reported this error here: https://github.com/tensorflow/tensorflow/issues/33997
A (temporary) solution is to split your loss function into two smaller functions. The decorator should only be used on the function that does not include the pre-trained model. This way, you still achieve better performance for the other operations, just not for the part that uses the pre-trained model.
For example:
@tf.function
def _other_ops(x):
    xt = tf.multiply(x, (1.0 - m)) + tf.multiply(m, d)
    label = targetlabel * np.ones(xt.shape[0])
    loss2 = tf.linalg.norm(m, 1)
    return xt, label, loss2

def total_loss(x):
    xt, label, loss2 = _other_ops(x)
    loss1 = poison.evaluate(xt, label, steps=1)
    return loss1 + loss2
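The undecorated total_loss can then be called exactly as before (e.g. loss = total_loss(ptestdata), as in the question), while _other_ops still benefits from graph compilation.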
Update:
According to the discussion in the TF issue linked above, a more elegant solution is to manually pass the input through each layer of the model. You can get the list of layers in your model by calling your_model.layers.
In your case, the loss is calculated from the prediction of the last layer and the label, so I think you should skip the last layer in the loop and calculate the loss yourself afterwards:
@tf.function
def totalloss(x):
    xt = tf.multiply(x, (1.0 - m)) + tf.multiply(m, d)
    label = targetlabel * np.ones(xt.shape[0])
    feat = xt
    # Skip the last layer, which calculates loss1
    for i in range(len(poison.layers) - 1):
        layer = poison.layers[i]
        feat = layer(feat)
    # Now, calculate the loss yourself (note the (y_true, y_pred) argument order)
    loss1 = tf.keras.losses.sparse_categorical_crossentropy(label, feat)
    loss2 = tf.linalg.norm(m, 1)
    return loss1 + loss2
The explanation from the TF engineers is that a model may wrap high-level processing whose behavior is not guaranteed under @tf.function. So putting a model inside a function decorated with @tf.function is not recommended; instead, we need to break the computation into smaller pieces to bypass it.
I want to include Precision@K as a custom metric in Keras. According to the documentation:
import keras.backend as K

def mean_pred(y_true, y_pred):
    return K.mean(y_pred)

model.compile(optimizer='rmsprop',
              loss='binary_crossentropy',
              metrics=['accuracy', mean_pred])
I just need to calculate a function using Keras backend and pass it at the compilation step.
With NumPy only, precision@k can be calculated as in the following example:
import numpy as np

def precisionatk(y_true, y_pred, k):
    precision_average = []
    idx = (-y_pred).argsort(axis=-1)[:, :k]
    for i in range(idx.shape[0]):
        precision_sample = 0
        for j in idx[i, :]:
            if y_true[i, j] == 1:
                precision_sample += 1
        precision_sample = precision_sample / k
        precision_average.append(precision_sample)
    return np.mean(precision_average)
y_true = np.array([[0,0,1,0],[1,0,1,0]])
y_pred = np.array([[0.1,0.4,0.8,0.2],[0.3,0.2,0.5,0.1]])
print(precisionatk(y_true,y_pred,2))
0.75
So, how do I translate this to the Keras backend?
EDIT: I'm working with a multilabel problem; y_true is always an array of ones and zeros, and y_pred contains the probability of each class.
You can use the Keras-native implementation of Precision@k: tf.keras.metrics.TopKCategoricalAccuracy
https://www.tensorflow.org/api_docs/python/tf/keras/metrics/TopKCategoricalAccuracy
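A minimal sketch of how it could be passed at compile time (assuming the same model as in the documentation snippet above; k=2 mirrors the NumPy example):

import tensorflow as tf

model.compile(optimizer='rmsprop',
              loss='binary_crossentropy',
              metrics=[tf.keras.metrics.TopKCategoricalAccuracy(k=2)])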
I am trying to use some_model.predict(x) within a custom loss function.
I found this custom loss function:
_EPSILON = K.epsilon()

def _loss_tensor(y_true, y_pred):
    y_pred = K.clip(y_pred, _EPSILON, 1.0 - _EPSILON)
    out = -(y_true * K.log(y_pred) + (1.0 - y_true) * K.log(1.0 - y_pred))
    return K.mean(out, axis=-1)
But the problem is that model.predict() expects a NumPy array.
So I looked for how to convert a tensor (y_pred) to a numpy array.
I found tmp = K.tf.round(y_true) but this returns a tensor.
I have also found: x = K.eval(y_true) which takes a Keras variable and returns a numpy array.
This produces the error: You must feed a value for placeholder tensor 'dense_78_target' with dtype float.....
Some people suggested setting the learning phase to true. I did that, but it did not help.
What I just want to do is this:
def _loss_tensor(y_true, y_pred):
    y_tmp_true = first_decoder.predict(y_true)
    y_tmp_pred = first_decoder.predict(y_pred)
    return keras.losses.binary_crossentropy(y_tmp_true, y_tmp_pred)
Any help would be appreciated.
This works:
sess = K.get_session()
with sess.as_default():
    tmp = K.tf.constant([1, 2, 3]).eval()
    print(tmp)
I also tried this now:
tmp = first_decoder(y_true)
This fails the assertion:
assert input_shape[-1]
Maybe someone knows how to resolve this?
Update:
I can now feed it through the model with:
y_t = first_decoder(K.reshape(y_true, (1,512)))
y_p = first_decoder(K.reshape(y_pred, (1,512)))
But when I try to return the binary cross entropy, the shape is not right:
Input to reshape is a tensor with 131072 values, but the requested shape has 512
I figured out that 131072 was the product of my batch size and input size (256 * 512). I then adapted my code to reshape to size (256, 512). The first batch ran fine, but then I got another error saying that the passed size was (96, 512): the final batch of an epoch can be smaller than the rest.
[SOLVED] Update:
It works now; taking the batch size dynamically from K.shape(y_true)[0] instead of hard-coding 256 handles the smaller final batch:
def _loss_tensor(y_true, y_pred):
    num_ex = K.shape(y_true)[0]
    y_t = first_decoder(K.reshape(y_true, (num_ex, 512)))
    y_p = first_decoder(K.reshape(y_pred, (num_ex, 512)))
    return keras.losses.binary_crossentropy(y_t, y_p)
I want to write a custom loss function that penalizes underestimation of positive target values, using weights. It works like mean squared error, with the only difference that the squared errors in that case are multiplied by a weight greater than 1.
I wrote it like this:
import numpy as np

def wmse(ground_truth, predictions):
    square_errors = np.square(np.subtract(ground_truth, predictions))
    weights = np.ones_like(square_errors)
    weights[np.logical_and(predictions < ground_truth, np.sign(ground_truth) > 0)] = 100
    weighted_mse = np.mean(np.multiply(square_errors, weights))
    return weighted_mse
However, when I supply it to my Sequential model in Keras with TensorFlow as the backend:
model.compile(loss=wmse, optimizer='rmsprop')
I get the following error:
raise TypeError("Using a `tf.Tensor` as a Python `bool` is not allowed.
TypeError: Using a `tf.Tensor` as a Python `bool` is not allowed. Use `if t is not None:` instead of `if t:` to test if a tensor is defined, and use TensorFlow ops such as tf.cond to execute subgraphs conditioned on the value of a tensor.
The traceback points to this line in wmse:
weights[np.logical_and(predictions < ground_truth, np.sign(ground_truth) > 0)] = 100
I have never worked with Keras or TensorFlow until now, so I'd appreciate it if someone helped me adapt this loss function to the Keras/TensorFlow framework. I tried replacing np.logical_and with tensorflow.logical_and, but to no avail; the error is still there.
As @nuric mentioned, you have to implement your loss using only Keras / TensorFlow operations that have derivatives, as these frameworks won't be able to back-propagate through other operations (like NumPy ones).
A Keras-only implementation could look like this:

from keras import backend as K

def wmse(ground_truth, predictions):
    square_errors = (ground_truth - predictions) ** 2
    weights = K.ones_like(square_errors)
    mask = K.less(predictions, ground_truth) & K.greater(K.sign(ground_truth), 0)
    weights = K.switch(mask, weights * 100, weights)
    weighted_mse = K.mean(square_errors * weights)
    return weighted_mse

gt = K.constant([-2, 2, 1, -1, 3], dtype="int32")
pred = K.constant([-2, 1, 1, -1, 1], dtype="int32")
loss = wmse(gt, pred)

sess = K.get_session()
print(loss.eval(session=sess))
# 100
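With the NumPy operations replaced by backend ops, the compile call from the question should then work unchanged; a quick sketch, assuming the same Sequential model as in the question:

model.compile(loss=wmse, optimizer='rmsprop')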