How to save a neural network model in h5 with loss function "balanced categorical entropy"? - tensorflow

I am using VGG16 for image segmentation with the loss function "balanced categorical entropy" using the code
beta = 0.5
def balanced_cross_entropy(beta):
    def loss(y_true, y_pred):
        weight_a = beta * tf.cast(y_true, tf.float32)
        weight_b = (1 - beta) * tf.cast(1 - y_true, tf.float32)
        o = (tf.math.log1p(tf.exp(-tf.abs(y_pred))) + tf.nn.relu(-y_pred)) * (weight_a + weight_b) + y_pred * weight_b
        return tf.reduce_mean(o)
    return loss
Everything works fine. Now I save this model in the h5 file using the code.
vgg.save('vgg.h5')
But when I use the load_model from Keras
model = load_model('vgg.h5', custom_objects={'balanced_cross_entropy(beta)': balanced_cross_entropy(beta)})
I encounter an error.
Unknown loss function: loss. Please ensure this object is passed to the `custom_objects` argument.
Can anybody help? I suspect the problem may be due to beta.

If you want to only perform inference, you can avoid this problem by specifying
model = load_model('vgg.h5',compile=False)
Otherwise, you need to load the model in the following way:
model = load_model("vgg.h5", custom_objects={'loss': balanced_cross_entropy(beta)})
In your code you used 'balanced_cross_entropy(beta)' as the key instead of 'loss'.
Short explanation:
The key in custom_objects must be the name of the inner function (the one returned by balanced_cross_entropy(beta)), because that is the name the error message asks for. The value of the <key,value> pair is the function object itself, which you get by calling the outer function.
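To see why the key has to be 'loss', it can help to inspect the name of the function that the factory returns. The sketch below is plain Python with a simplified stand-in loss body (an assumption, so that it runs without TensorFlow); only the closure structure matches the code in the question.

```python
def balanced_cross_entropy(beta):
    # Outer factory: captures beta and returns the inner function.
    def loss(y_true, y_pred):
        # Stand-in body (assumption): the real loss uses TensorFlow ops.
        return beta * y_true + (1 - beta) * y_pred
    return loss

beta = 0.5
loss_fn = balanced_cross_entropy(beta)

# The returned function is named after the inner def, not the factory.
print(loss_fn.__name__)

# Building the dictionary from __name__ avoids typing the key by hand.
custom_objects = {loss_fn.__name__: loss_fn}
print(list(custom_objects))
```

A dictionary built this way could then be passed as custom_objects to load_model.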

Related

For loop in tensorflow/ keras

I am trying to use a for loop within a model definition (and attempting to recreate TabNet in keras).
class TabNet(keras.Model):
    def __init__(self, input_dim, output_dim, steps, n_d, n_a, gamma=1.3):
        super().__init__()
        self.n_d, self.n_a, self.steps = n_d, n_a, steps
        self.shared = SharedBlock(n_d+n_a)
        self.first_block = SharedBlock(n_a)
        self.decision_blocks = [DecisionBlock(n_d+n_a)] * steps
        self.prior_scale = Prior(input_dim, gamma)
        self.bn = layers.BatchNormalization()
        self.attention = [AttentiveTransformer(input_dim)] * steps
        self.final = layers.Dense(output_dim)
        self.eps = 1e-8
    @tf.function
    def call(self, x):
        self.prior_scale.reset()
        final_out = 0
        M_loss = 0
        x = self.bn(x)
        attention = self.first_block(self.shared(x))
        for i in range(self.steps):
            mask = self.attention[i](attention, self.prior_scale.P)
            M_loss += tf.reduce_sum(mask * tf.math.log(mask + self.eps), axis=-1) / self.steps
            prior = self.prior_scale(mask)
            out = self.decision_blocks[i](self.shared(x * prior))
            attention, output = out[:,:self.n_a], out[:,self.n_a:]
            final_out += tf.nn.relu(output)
        return self.final(final_out), M_loss
If you're unaware of what those individual blocks are, simply assume that they are linear layers. I have a colab notebook with the full code if you wish to see what they actually are.
However, I cannot train it, as I am getting the error: iterating over tf.Tensor is not allowed: AutoGraph did not convert this function. Try decorating it directly with @tf.function. I have decorated it, and it still does not help.
I am fairly certain it is the for loop that is causing me the error when I do model.fit(train_x, train_y). Would appreciate any thoughts on how to implement the above for loop in the tensorflow way. tf.while_loop is all I have seen so far and the examples given are fairly simplistic compared to what I want to do.
this is my proposal...
I don't know exactly what your network does, but as far as I can see you want to produce two outputs and combine them inside your loss. One of the outputs is also the result of some hidden operation inside the network (M_loss).
If you want to return two outputs, Keras needs two targets in order to fit. In the code I provide below, the first target is the real labels and the other is a dummy target (an array of zeros).
As said before, you are trying to build a combined loss of the form sparse_entropy(y_true, y_pred) - reg_sparse * M_loss. To make this possible I split the loss into two pieces (one for each output): the sparse part and the M_loss part. The sparse loss is simply SparseCategoricalCrossentropy(from_logits=True) from Keras, while for M_loss I wrote this function following your code:
def m_loss(y_true, y_pred):
    m = tf.reduce_mean(y_pred, keepdims=True)
    return m
m_loss uses only y_pred, which here is the hidden quantity computed inside your network. y_true does not matter for this operation, which is why we pass an array of zeros when fitting.
At this point we have to combine the two losses, which is possible in Keras in this way:
reg_sparse = 0.1
model.compile('Adam', loss=[sce, m_loss], loss_weights=[1,-reg_sparse])
model.fit(train_x, [train_y, np.zeros(train_y.shape[0])], epochs=3)
In this case, the final loss is the weighted combination 1*sce + (-reg_sparse)*m_loss.
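The arithmetic behind loss_weights can be sketched in plain Python. The numbers below are made-up values for illustration, and combine_losses is a hypothetical helper, not a Keras function; it only mirrors the weighted sum described above.

```python
def combine_losses(losses, loss_weights):
    # Weighted sum of the per-output losses.
    return sum(w * l for w, l in zip(loss_weights, losses))

reg_sparse = 0.1
sce_value = 0.5       # made-up cross-entropy value for the first output
m_loss_value = -2.0   # made-up mean of M_loss for the second output

# Same weights as in the compile call: 1 for sce, -reg_sparse for m_loss.
total = combine_losses([sce_value, m_loss_value], [1, -reg_sparse])
print(total)
```

Because the second weight is negative, a more negative m_loss value increases the total, which is how the subtraction in sparse_entropy - reg_sparse * M_loss is expressed.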
this is the full running code: https://colab.research.google.com/drive/152q1rmqTJ0dWLbFN8PqzCBhWkVKirkU5?usp=sharing
I also made some small changes to TabNet, for example in the way final_out and M_loss are created.
No, actually the for loop is not the problem. I checked your code: the problem was that you forgot to call the superclass constructor in your SharedBlock, DecisionBlock and Prior.
For example, your code should look like this:
class SharedBlock(layers.Layer):
    def __init__(self, units, mult=tf.sqrt(0.5)):
        super().__init__()
        self.layer1 = FCBlock(units)
        self.layer2 = FCBlock(units)
        self.mult = mult
After doing these changes you will not see that error again but something else comes up.
TypeError: in user code:
/usr/local/lib/python3.6/dist-packages/tensorflow/python/keras/engine/training.py:1147 predict_function *
outputs = self.distribute_strategy.run(
<ipython-input-46-f609cb1acdfa>:15 call *
self.prior_scale.reset()
TypeError: tf__reset() missing 1 required positional argument: 'len_x'
To resolve this issue, make the following change in class Prior(layers.Layer):
def reset(self, len_x=1.0):
    self.P = 1.0
Then you will get another issue.
AttributeError: in user code:
/usr/local/lib/python3.6/dist-packages/tensorflow/python/keras/engine/training.py:1147 predict_function *
outputs = self.distribute_strategy.run(
<ipython-input-46-f609cb1acdfa>:26 call *
out = self.decision[i](self.shared(x * prior))
AttributeError: 'TabNet' object has no attribute 'decision'
For this issue I would ask you to open another question, as I think your main issue is resolved.
UPDATE:
See the comment section of this answer, where a solution has been provided for the issue AttributeError: 'TabNet' object has no attribute 'decision'
UPDATE: 21/07
I have to disappoint you again that the issue is not with the for loop.
If you look closely at the error log you will see that the issue is due to the full_loss function.
<ipython-input-10-07e59f23d230>:7 full_loss *
logits, M_loss = y_pred
/usr/local/lib/python3.6/dist-packages/tensorflow/python/framework/ops.py:561 __iter__
self._disallow_iteration()
/usr/local/lib/python3.6/dist-packages/tensorflow/python/framework/ops.py:554 _disallow_iteration
self._disallow_when_autograph_enabled("iterating over `tf.Tensor`")
/usr/local/lib/python3.6/dist-packages/tensorflow/python/framework/ops.py:532 _disallow_when_autograph_enabled
" decorating it directly with @tf.function.".format(task))
OperatorNotAllowedInGraphError: iterating over `tf.Tensor` is not allowed: AutoGraph did not convert this function. Try decorating it directly with @tf.function.
The exact problem is caused by the below statement.
logits, M_loss = y_pred
If you use the below code that does not use your loss function you will see a different result.
model.compile('Adam', loss='sparse_categorical_crossentropy')
model.fit(train_x, train_y, batch_size=1)
Received a label value of 1 which is outside the valid range of [0, 1). Label values: 1
[[node sparse_categorical_crossentropy_1/SparseSoftmaxCrossEntropyWithLogits/SparseSoftmaxCrossEntropyWithLogits (defined at <ipython-input-26-d39f533b7a69>:2) ]] [Op:__inference_train_function_18003]
I do not understand the model code completely, and model.summary() is not that helpful in your case. There is some problem with your last layer; at least the error message suggests that it does not have enough output units (one for each class).
I suggest looking into the last layer and the loss function.
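The range in that message follows from the width of the last layer: with sparse labels, every label must be an integer in [0, num_outputs), where num_outputs is the number of output units. A minimal plain-Python sketch of that check (labels_in_range is a hypothetical helper, not part of TensorFlow):

```python
def labels_in_range(labels, num_outputs):
    # Sparse labels are class indices, so each must satisfy 0 <= label < num_outputs.
    return all(0 <= label < num_outputs for label in labels)

labels = [0, 1, 1, 0]

# One output unit gives the valid range [0, 1), so label 1 is rejected.
print(labels_in_range(labels, num_outputs=1))

# Two output units give the valid range [0, 2), which covers both labels.
print(labels_in_range(labels, num_outputs=2))
```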
The reason I am sure it is not due to the for loop is that even if you comment out the for loop, you still receive the same error.
I hope I have helped you further, took me a few hours to figure it out.

Keras Model - Get input in custom loss function

I am having trouble with Keras Custom loss function. I want to be able to access truth as a numpy array.
Because it is a callback function, I think I am not in eager execution, which means I can't access it using the backend.get_value() function. I also tried different methods, but it always comes back to the fact that this 'Tensor' object doesn't exist.
Do I need to create a session inside the custom loss function?
I am using Tensorflow 2.2, which is up to date.
def custom_loss(y_true, y_pred):
    # 4D array that has the label (0) and a multiplier input dependant
    truth = backend.get_value(y_true)
    loss = backend.square((y_pred - truth[:,:,0]) * truth[:,:,1])
    loss = backend.mean(loss, axis=-1)
    return loss
model.compile(loss=custom_loss, optimizer='Adam')
model.fit(X, np.stack(labels, X[:, 0], axis=3), batch_size = 16)
I want to be able to access truth. It has two components (a label and a multiplier that is different for each item). I saw a solution that is input dependent, but I am not sure how to access the value: Custom loss function in Keras based on the input data
I think you can do this by enabling run_eagerly=True in model.compile as shown below.
model.compile(loss=custom_loss(weight_building, weight_space),optimizer=keras.optimizers.Adam(), metrics=['accuracy'],run_eagerly=True)
I think you also need to update custom_loss as shown below.
def custom_loss(weight_building, weight_space):
    def loss(y_true, y_pred):
        truth = backend.get_value(y_true)
        error = backend.square((y_pred - y_true))
        mse_error = backend.mean(error, axis=-1)
        return mse_error
    return loss
I am demonstrating the idea with a simple mnist data. Please take a look at the code here.

maximizing binary cross_entropy in a keras model

I don't know how to create a model that maximizes the binary cross_entropy loss in Keras.
research:
1.https://intellipaat.com/community/17707/how-to-maximize-loss-function-in-keras
that said:
Simply multiply the loss by -1 to maximize the loss function while trying to minimize it:
new_loss = -loss
but using:
model.compile(loss=-1 * 'binary_crossentropy', optimizer=adam_optimizer())
resulted in this error:
ValueError: The model cannot be compiled because it has no loss to optimize.
https://stats.stackexchange.com/questions/303229/why-does-keras-binary-crossentropy-loss-function-return-wrong-values
gave me a custom function that approximates the keras binary_crossentropy loss:
import keras.backend as K
def binary_crossentropy(y_true, y_pred):
    result = []
    for i in range(len(y_pred)):
        y_pred[i] = [max(min(x, 1 - K.epsilon()), K.epsilon()) for x in y_pred[i]]
        result.append(-np.mean([y_true[i][j] * math.log(y_pred[i][j]) + (1 - y_true[i][j]) * math.log(1 - y_pred[i][j]) for j in range(len(y_pred[i]))]))
    return np.mean(result)
but I can not use it since it results in the error:
len is not well defined for symbolic Tensors. (43_54/Sigmoid:0) Please call `x.shape` rather than `len(x)` for shape information.
When I replace len with .shape[0], I get another error:
__index__ returned non-int (type NoneType)
I tinkered with the syntax in several more ways but nothing seems to work.
any ideas?
python 3.6
tensorflow 1.15
keras 2.3.1
You just need to define a new loss, based on the keras implementation:
def neg_binary_crossentropy(y_true, y_pred):
    return -1.0 * keras.losses.binary_crossentropy(y_true, y_pred)
And then use it in model.compile:
model.compile(loss=neg_binary_crossentropy, optimizer="adam")
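Two things may be worth spelling out here. First, in Python, multiplying a string by -1 yields an empty string, which is probably why Keras reported that there was no loss to optimize. Second, negating a loss value turns minimization into maximization. The sketch below uses a plain-Python stand-in for binary cross-entropy (an assumption for illustration, not the Keras implementation), with made-up predictions.

```python
import math

# Multiplying a string by a non-positive integer gives an empty string.
print(repr(-1 * 'binary_crossentropy'))

def binary_crossentropy(y_true, y_pred, eps=1e-7):
    # Plain-Python stand-in: mean cross-entropy over a flat list of probabilities.
    total = 0.0
    for t, p in zip(y_true, y_pred):
        p = min(max(p, eps), 1 - eps)  # clip to avoid log(0)
        total -= t * math.log(p) + (1 - t) * math.log(1 - p)
    return total / len(y_true)

def neg_binary_crossentropy(y_true, y_pred):
    # Minimizing this value maximizes the original cross-entropy.
    return -1.0 * binary_crossentropy(y_true, y_pred)

y_true = [1, 0, 1]
good = [0.9, 0.1, 0.8]  # made-up predictions close to the labels
bad = [0.1, 0.9, 0.2]   # made-up predictions far from the labels

# Worse predictions have higher cross-entropy, hence a lower negated loss.
print(neg_binary_crossentropy(y_true, bad) < neg_binary_crossentropy(y_true, good))
```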

How to debug custom metric values in tf.keras

I have defined a very simple custom metric, in tf.keras, for tracking number of pixels predicted as '1' for a segmentation problem. Since the output from the last layer has sigmoid activation, I'm rounding y_pred and then summing. I expect to see a whole integer value (>= 0) (because of the rounding) but the output shows floating point numbers like 0.28. How is that possible? How can I debug this to figure out where the problem is?
I tried switching from tf.keras.backend.sum & tf.keras.backend.round to tf.reduce_sum & tf.round, but that didn't solve the issue.
def num_ones(y_true, y_pred):
    return tf.keras.backend.sum(tf.keras.backend.flatten(tf.keras.backend.round(y_pred)))
model.compile(optimizer = tf.train.AdamOptimizer(learning_rate = 1e-4), loss = 'binary_crossentropy', metrics = ['accuracy', num_ones])
output-
INFO:tensorflow:Saving dict for global step 3408: accuracy = 0.9551756, global_step = 3408, loss = 0.7224839, num_ones = 0.28
Function
tf.config.run_functions_eagerly(True)
works fine with Tensorflow >2.3 but you have to define your custom metric function as tensorflow function (add the decorator):
@tf.function
def num_ones(y_true, y_pred):
    return tf.keras.backend.sum(tf.keras.backend.flatten(tf.keras.backend.round(y_pred)))
To answer how you should debug the custom metrics, call the following function at the top of your python script:
tf.config.experimental_run_functions_eagerly(True)
This will force tensorflow to run all functions eagerly (including custom metrics) so you can then just set a breakpoint and check the values of everything like you would normally in your debugger.
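As for the non-integer value itself, one plausible explanation (an assumption here, not something the answers above confirm) is that the reported number is an average of the per-batch results rather than a single sum. A plain-Python sketch with made-up per-batch counts:

```python
# Made-up per-batch results of num_ones: each one is a whole number.
per_batch_counts = [0, 1, 0, 0, 1, 0, 0]

# Averaging whole numbers across batches generally gives a fraction.
reported = sum(per_batch_counts) / len(per_batch_counts)
print(reported)
```

Stepping through the metric in eager mode, as described above, would show whether the per-batch values are indeed whole numbers.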

Using a keras model in a custom keras loss

I have a regular keras model called e and I would like to compare its output for both y_pred and y_true in my custom loss function.
from keras import backend as K
def custom_loss(y_true, y_pred):
    return K.mean(K.square(e.predict(y_pred)-e.predict(y_true)), axis=-1)
I am getting the error: AttributeError: 'Tensor' object has no attribute 'ndim'
This is because y_true and y_pred are both tensor object and keras.model.predict expects to be passed a numpy.array.
Any idea how I may succeed in using my keras.model in my custom loss function?
I am open to getting the output of a specified layer if need be or to converting my keras.model to a tf.estimator object (or anything else).
First, let's try to understand the error message you're getting:
AttributeError: 'Tensor' object has no attribute 'ndim'
Let's take a look at the Keras documentation and find the predict method of Keras model. We can see the description of the function parameters:
x: the input data, as a Numpy array.
So, the model is trying to read the ndim attribute of a NumPy array, because it expects an array as input. On the other hand, a custom loss function in Keras receives tensors as inputs. So don't rely on ordinary Python code inside it: it is not executed during evaluation. The function is only called to construct the computational graph.
Okay, now that we found out the meaning behind that error message, how can we use a Keras model inside custom loss function? Simple! We just need to get the evaluation graph of the model.
Update
Using the global keyword is bad coding practice. Also, now in 2020 we have a better functional API in Keras that makes hacks with layers unnecessary. Better to use something like this:
from keras import backend as K
def make_custom_loss(model):
    """Creates a loss function that uses `model` for evaluation
    """
    def custom_loss(y_true, y_pred):
        return K.mean(K.square(model(y_pred) - model(y_true)), axis=-1)
    return custom_loss
custom_loss = make_custom_loss(e)
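The factory pattern above can be tried without Keras. In the sketch below the model is a plain Python callable and the tensors are lists of floats (both are stand-ins assumed for illustration); the point is only that custom_loss reaches the model through the closure rather than through a global.

```python
def make_custom_loss(model):
    """Creates a loss function that uses `model` for evaluation."""
    def custom_loss(y_true, y_pred):
        # Mean squared difference between the model outputs.
        diffs = [(p - t) ** 2 for p, t in zip(model(y_pred), model(y_true))]
        return sum(diffs) / len(diffs)
    return custom_loss

def e(values):
    # Stand-in "model" (assumption): doubles every input value.
    return [2.0 * v for v in values]

custom_loss = make_custom_loss(e)
print(custom_loss([1.0, 2.0], [1.0, 3.0]))
```

Swapping e for a different callable only requires another call to make_custom_loss; nothing inside custom_loss refers to a global name.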
Deprecated
Try something like this (only for Sequential models and very old API):
def custom_loss(y_true, y_pred):
    # Your model exists in global scope
    global e
    # Get the layers of your model
    layers = [l for l in e.layers]
    # Construct a graph to evaluate your other model on y_pred
    eval_pred = y_pred
    for i in range(len(layers)):
        eval_pred = layers[i](eval_pred)
    # Construct a graph to evaluate your other model on y_true
    eval_true = y_true
    for i in range(len(layers)):
        eval_true = layers[i](eval_true)
    # Now do what you wanted to do with outputs.
    # Note that we are not returning the values, but a tensor.
    return K.mean(K.square(eval_pred - eval_true), axis=-1)
Please note that the code above is not tested. However, the general idea will stay the same regardless of the implementation: you need to construct a graph, in which the y_true and y_pred will flow through it to the final operations.