`None` in Keras loss function - tensorflow

I have a problem working with TensorFlow and Keras. The problem can be described as follows:
I have a model (a convolutional neural network) whose output has the shape [None, 7, 7, 6]. I also have a function 'custom_loss' that takes y_true and y_pred parameters, each of shape [7, 7, 6]. When I compile the model, I get the error: TypeError: must be real number, not Tensor. I suspect the mistake is in the way I index y_pred[k][l][m] and y_true[k][l][m], but I don't know how to fix this to account for the None in [None, 7, 7, 6]. Please help.
Update: Here is the code
def custom_loss(y_true, y_pred):
    loss = 0
    for i in range(S*S):
        k, l = i % S, i // S
        first_part = 5 * sum([(y_pred[k][l][m] - y_true[k][l][m])**2 for m in range(1, 3)])
        second_part = 5 * sum([(math.sqrt(y_pred[k][l][m]) - math.sqrt(y_true[k][l][m])) ** 2 for m in range(3, 5)])
        third_part = 5 * sum([(y_pred[k][l][m] - y_true[k][l][m])**2 for m in [0, 5]])
        if y_true[k][l][0] > 0.5:
            loss += first_part + second_part + third_part
        else:
            loss += 0.5 * (y_pred[k][l][0] - y_true[k][l][0])**2
    return loss

In Keras (and in TensorFlow without eager execution) you cannot access the contents of a tensor. Therefore, lines such as
loss += 0.5 * (y_pred[k][l][0] - y_true[k][l][0])**2
will fail. You can try to use TensorFlow's eager execution mode together with Keras, as explained here.
In general, you should try to express these things using only built-in functions of the Keras backend or TensorFlow operations. Try to write your loss function in matrix/vector notation first; then it is easier (and maybe we can also help you) to express it in Keras.

When you write a loss function in Keras (with the TensorFlow backend), it is used to build the execution graph, not for direct execution.
You have to use TensorFlow or Keras backend functions to define your loss. When you compile your model, Keras (with TensorFlow as the backend) builds the execution graph and therefore sends tensors through your loss function; the math package does not support tensors. It is also not possible to use a Python if in your loss function, since it is not differentiable. Instead you could use a sigmoid function, which is very close to a step function.
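For illustration, the loop and the Python if above can be replaced by tensor slicing and a 0/1 mask built from Keras backend operations. This is only a sketch under stated assumptions (the [None, 7, 7, 6] output shape and the weights from the original loop); the threshold is applied to y_true, so no gradient needs to flow through it and a hard mask is fine:
from keras import backend as K

def custom_loss(y_true, y_pred):
    # 1.0 where a grid cell contains an object (confidence channel > 0.5), else 0.0
    obj_mask = K.cast(y_true[..., 0] > 0.5, K.floatx())            # shape (batch, S, S)

    # channels 1-2: coordinate error
    coord_err = K.sum(K.square(y_pred[..., 1:3] - y_true[..., 1:3]), axis=-1)

    # channels 3-4: size error on square roots (clipped to avoid NaN gradients at 0)
    size_err = K.sum(K.square(K.sqrt(K.maximum(y_pred[..., 3:5], K.epsilon()))
                              - K.sqrt(K.maximum(y_true[..., 3:5], K.epsilon()))), axis=-1)

    # channels 0 and 5: confidence/class error
    conf_err = (K.square(y_pred[..., 0] - y_true[..., 0])
                + K.square(y_pred[..., 5] - y_true[..., 5]))

    # cells without an object only penalise the confidence channel
    noobj_err = 0.5 * K.square(y_pred[..., 0] - y_true[..., 0])

    per_cell = obj_mask * 5.0 * (coord_err + size_err + conf_err) + (1.0 - obj_mask) * noobj_err
    return K.sum(per_cell, axis=[1, 2])    # sum over the S x S grid, keep the batch dimension
Everything here operates on whole tensors, so the leading None (batch) dimension is handled automatically.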

Related

maximizing binary cross_entropy in a keras model

I don't know how to create a model that maximizes the binary cross_entropy loss in a Keras model.
research:
1. https://intellipaat.com/community/17707/how-to-maximize-loss-function-in-keras
which says:
Simply multiply the loss by -1 to maximize the loss function while trying to minimize it:
new_loss = -loss
but using:
model.compile(loss=-1 * 'binary_crossentropy', optimizer=adam_optimizer())
resulted in this error:
ValueError: The model cannot be compiled because it has no loss to optimize.
2. https://stats.stackexchange.com/questions/303229/why-does-keras-binary-crossentropy-loss-function-return-wrong-values
gave me a custom function that approximates the Keras binary_crossentropy loss:
import math
import numpy as np
import keras.backend as K

def binary_crossentropy(y_true, y_pred):
    result = []
    for i in range(len(y_pred)):
        y_pred[i] = [max(min(x, 1 - K.epsilon()), K.epsilon()) for x in y_pred[i]]
        result.append(-np.mean([y_true[i][j] * math.log(y_pred[i][j]) + (1 - y_true[i][j]) * math.log(1 - y_pred[i][j]) for j in range(len(y_pred[i]))]))
    return np.mean(result)
but I cannot use it since it results in the error:
len is not well defined for symbolic Tensors. (43_54/Sigmoid:0) Please call `x.shape` rather than `len(x)` for shape information.
When I replace len with .shape[0],
I get another error:
__index__ returned non-int (type NoneType)
I have tinkered with the syntax in several more ways, but nothing seems to work.
Any ideas?
python 3.6
tensorflow 1.15
keras 2.3.1
You just need to define a new loss, based on the keras implementation:
def neg_binary_crossentropy(y_true, y_pred):
    return -1.0 * keras.losses.binary_crossentropy(y_true, y_pred)
And then use it in model.compile:
model.compile(loss=neg_binary_crossentropy, optimizer="adam")

Custom gradient in tensorflow attempts to convert model to tensor

I am trying to use the output of one neural network to compute the loss value for another network. As the first network approximates another function (the L2 distance), I would like to provide the gradients myself, as if they had come from an L2 function.
An example of my loss function in simplified code is:
@tf.custom_gradient
def loss_function(model_1_output):
    def grad(dy, variables=None):
        gradients = 2 * pred
        return gradients
    pred = model_2(model_1_output)
    loss = pred ** 2
    return loss, grad
This is called in a standard TensorFlow 2.0 custom training loop, such as:
with tf.GradientTape() as tape:
    model_1_output = model_1(training_data)
    loss = loss_function(model_1_output)
gradients = tape.gradient(loss, model_1.trainable_variables)
optimizer.apply_gradients(zip(gradients, model_1.trainable_variables))
However, whenever I try to run this I keep getting the error:
ValueError: Attempt to convert a value (<model.model_2 object at 0x7f41982e3240>) with an unsupported type (<class 'model.model_2'>) to a Tensor.
The whole point of using the custom_gradient decorator is that I don't want model_2 inside the loss function to be included in the backpropagation, since I supply its gradients manually.
How can I make TensorFlow completely ignore anything inside the loss function, so that I could, for example, perform non-differentiable operations? I have tried using with tape.stop_recording(), but it always results in a "no gradients found" error.
Using:
OS: Ubuntu 18.04
tensorflow: 2.0.0
python: 3.7
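For reference, the documented usage pattern of tf.custom_gradient passes only tensors into the decorated function and returns the hand-written gradient from the inner function. The following is a minimal, self-contained sketch of that pattern; it does not use the asker's model_1/model_2 and is not a fix for the error above:
import tensorflow as tf

@tf.custom_gradient
def squared_l2(x):
    # forward pass: plain tensor ops only, no Keras model objects captured here
    y = tf.reduce_sum(tf.square(x))
    def grad(dy):
        # hand-written gradient of sum(x**2) with respect to x
        return dy * 2.0 * x
    return y, grad

x = tf.constant([1.0, 2.0, 3.0])
with tf.GradientTape() as tape:
    tape.watch(x)
    loss = squared_l2(x)
print(tape.gradient(loss, x))  # -> [2. 4. 6.]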

Keras: How to access step in model

Is there a way to access the current Keras training step as a tensor in the tensorflow graph?
I am trying to build a model that has an 'epsilon' parameter which decays as a function of the current training step.
epsilon = some_fn_of(K.global_step)  # <- Something like this?

self.q = K.Sequential([
    K.layers.InputLayer(input_shape),
    K.layers.Dense(n, name='q'),
    K.layers.Lambda(lambda x: tf.cond(tf.random.uniform((), 0, 1) < epsilon,
                                      lambda _: tf.constant(0.0),
                                      lambda ac: ac))
], name='q')
FYI: I'm using the Tensorflow bundled Keras.
I don't know if this will work for all purposes, but it looks like you can find the next training step number using model.optimizer.iterations. The variable name appears to have the format "<optimizer name>/iter:0". You can find the iterations property in the Optimizer documentation. Example value:
<tf.Variable 'Adam/iter:0' shape=() dtype=int64, numpy=5978>
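For example, a decayed epsilon could be derived from that variable. A small sketch, assuming tf.keras, a hypothetical toy model, and an exponential decay schedule chosen purely for illustration:
import tensorflow as tf
from tensorflow import keras

# toy model; the point is only reading the optimizer's step counter
model = keras.Sequential([keras.layers.Dense(1, input_shape=(4,))])
model.compile(optimizer='adam', loss='mse')

step = model.optimizer.iterations                  # tf.Variable, e.g. 'Adam/iter:0'
epsilon = 0.9 ** tf.cast(step, tf.float32)         # decays as training progresses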
I suspect that Keras does not keep any such tensor in the graph and that the only way to access the step is through callbacks (Keras Docs, TensorFlow Docs), especially since Keras is meant to be backend-agnostic and so would likely maintain the step outside of TensorFlow.

Use TensorFlow loss Global Objectives (recall_at_precision_loss) with Keras (not metrics)

Background
I have a multi-label classification problem with 5 labels (e.g. [1 0 1 1 0]). Therefore, I want my model to improve on metrics such as recall at a fixed precision, precision-recall AUC, or ROC AUC.
It doesn't make sense to use a loss function (e.g. binary_crossentropy) that is not directly related to the performance measurement I want to optimize. Therefore, I want to use TensorFlow's global_objectives.recall_at_precision_loss() or similar as loss function.
Relevant GitHub:
https://github.com/tensorflow/models/tree/master/research/global_objectives
Relevant paper (Scalable Learning of Non-Decomposable Objectives): https://arxiv.org/abs/1608.04802
Not a metric
I'm not looking to implement a tf.metrics metric. I already succeeded at that by following: https://stackoverflow.com/a/50566908/3399066
Problem
I think my issue can be divided into 2 problems:
How to use global_objectives.recall_at_precision_loss() or similar?
How to use it in a Keras model with TF backend?
Problem 1
There is a file called loss_layers_example.py on the Global Objectives GitHub page (same as above). However, since I don't have much experience with TF, I don't really understand how to use it. Also, Googling for "TensorFlow recall_at_precision_loss example" or "TensorFlow Global Objectives example" doesn't give me any clearer examples.
How do I use global_objectives.recall_at_precision_loss() in a simple TF example?
Problem 2
Would something like (in Keras): model.compile(loss = ??.recall_at_precision_loss, ...) be enough?
My feeling tells me it is more complex than that, due to the global variables used in loss_layers_example.py.
How to use loss functions similar to global_objectives.recall_at_precision_loss() in Keras?
Similar to Martino's answer, but this version infers the shape from the input (setting it to a fixed batch size did not work for me).
The outer function isn't strictly necessary, but it feels a bit more natural to pass parameters while configuring the loss function, especially when your wrapper is defined in an external module.
import keras.backend as K
from global_objectives.loss_layers import precision_at_recall_loss

def get_precision_at_recall_loss(target_recall):
    def precision_at_recall_loss_wrapper(y_true, y_pred):
        y_true = K.reshape(y_true, (-1, 1))
        y_pred = K.reshape(y_pred, (-1, 1))
        return precision_at_recall_loss(y_true, y_pred, target_recall)[0]
    return precision_at_recall_loss_wrapper
Then, when compiling the model:
TARGET_RECALL = 0.9
model.compile(optimizer='adam', loss=get_precision_at_recall_loss(TARGET_RECALL))
I managed to make it work by:
- explicitly reshaping the tensors to BATCH_SIZE length (see the code below)
- cutting the dataset size to a multiple of BATCH_SIZE
import keras
from global_objectives import loss_layers, util

def precision_recall_auc_loss(y_true, y_pred):
    y_true = keras.backend.reshape(y_true, (BATCH_SIZE, 1))
    y_pred = keras.backend.reshape(y_pred, (BATCH_SIZE, 1))
    util.get_num_labels = lambda labels: 1
    return loss_layers.precision_recall_auc_loss(y_true, y_pred)[0]
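Then, assuming every batch fed to the model really has BATCH_SIZE samples, the function can be passed to compile like any other Keras loss:
model.compile(optimizer='adam', loss=precision_recall_auc_loss)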

Tensorflow: optimize over input with gradient descent

I have a TensorFlow model (a convolutional neural network) which I successfully trained using gradient descent (GD) on some input data.
Now, in a second step, I would like to provide an input image as initialization and then optimize over this input image, with fixed network parameters, using GD. The loss function will be a different one, but that is a detail.
So, my main question is how to tell the gradient descent algorithm to
stop optimizing the network parameters
optimize over the input image instead
The first can probably be done with this:
Holding variables constant during optimizer
Do you have any ideas about the second point?
I guess I could recode the gradient descent algorithm myself using the TF gradient function, but my gut feeling tells me that there should be an easier way, one which also lets me benefit from more complex GD variants (Adam etc.).
No need for your own SGD implementation. TensorFlow provides all the necessary functions:
import tensorflow as tf
import numpy as np

# some input
data_pldhr = tf.placeholder(tf.float32)
img_op = tf.get_variable('input_image', [1, 4, 4, 1], dtype=tf.float32, trainable=True)
img_assign = img_op.assign(data_pldhr)

# your starting image
start_value = (np.ones((4, 4), dtype=np.float32) + np.eye(4))[None, :, :, None]

# override variable_getter
def nontrainable_getter(getter, *args, **kwargs):
    kwargs['trainable'] = False
    return getter(*args, **kwargs)

# all variables in this scope are not trainable
with tf.variable_scope('myscope', custom_getter=nontrainable_getter):
    x = tf.layers.dense(img_op, 10)
    y = tf.layers.dense(x, 10)

# the usual stuff
cost_op = tf.losses.mean_squared_error(x, y)
train_op = tf.train.AdamOptimizer(0.1).minimize(cost_op)

# fire up the training process
with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    sess.run(img_assign, {data_pldhr: start_value})
    print(sess.run(img_op))
    for i in range(10):
        _, c = sess.run([train_op, cost_op])
        print(c)
    print(sess.run(img_op))
- represent the image as a tf.Variable with trainable=True
- initialise this variable with the starting image (initial guess)
- recreate the NN graph using TF variables with trainable=False and copy the weights from the trained NN graph using tf.assign
- calculate the loss function
- plug the loss into any TF optimiser algorithm you want
Another alternative is to use ScipyOptimizerInterface, which lets you use scipy's minimizers. These support constrained minimization.
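A rough sketch of that route, assuming TF 1.x with tf.contrib available and reusing cost_op, img_op, img_assign, data_pldhr and start_value from the example above:
from tensorflow.contrib.opt import ScipyOptimizerInterface

# optimize only the input image, optionally with box constraints on the pixel values
scipy_opt = ScipyOptimizerInterface(cost_op,
                                    var_list=[img_op],
                                    var_to_bounds={img_op: (0.0, 1.0)},
                                    method='L-BFGS-B')

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    sess.run(img_assign, {data_pldhr: start_value})
    scipy_opt.minimize(sess)
    print(sess.run(img_op))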
I'm looking for a solution to the same problem, but my model is not an easy one: I have an LSTM network with cells created with MultiRNNCell, and I don't think it is possible to get the weights and clone the network. Is there any workaround so that I can compute the gradient with respect to the input?