Using Cohen's kappa loss function in keras models - tensorflow

Trying to use non-Keras-backend functions for custom loss calculation in Keras models.
I am trying to make my Keras CNN model use a custom loss function (Kappa score). However, since kappa is not defined in the Keras backend, I need to use the scikit-learn based kappa implementation. This sklearn function takes arrays of labels as its arguments, unlike Keras backend functions, which take tensors. The loss function call within Keras passes the tensors y_true and y_pred. I did the implementation below using a guide I found online, but I get errors.
import tensorflow as tf
import keras.backend as K
from sklearn.metrics import cohen_kappa_score

def cohen_kappa_score_func(y_true, y_pred):
    sess = tf.Session()
    with sess.as_default():
        # idea is to convert the tensors to arrays by evaluating them
        score = cohen_kappa_score(y_true.eval(), y_pred.eval(), weights='linear')
    sess.close()
    return score
# use this later to compile the keras model with the custom loss function as
model.compile(optimizer=optimizers.SGD(lr=0.001, momentum=0.9),
              loss=cohen_kappa_score_func,
              metrics=['categorical_crossentropy', 'mae', 'categorical_accuracy'])
This doesn't work and I get the following error:
"InvalidArgumentError (see above for traceback): You must feed a value for placeholder tensor 'dense_15_target' with dtype float and shape [?,?]
[[node dense_15_target "
Please give me suggestions to solve this.
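For reference, one commonly suggested workaround (essentially what the related answer below arrives at with tf.numpy_function) is to wrap the sklearn call with tf.py_func so it receives NumPy arrays inside the graph. A rough sketch, assuming one-hot encoded targets; note that no gradient flows through py_func, so this works as a metric rather than as a trainable loss:
import tensorflow as tf
import numpy as np
from sklearn.metrics import cohen_kappa_score

def kappa_np(y_true, y_pred):
    # Both arguments arrive here as NumPy arrays; convert one-hot rows to label indices
    return np.float32(cohen_kappa_score(y_true.argmax(axis=1),
                                        y_pred.argmax(axis=1),
                                        weights='linear'))

def cohen_kappa_metric(y_true, y_pred):
    # tf.py_func (TF 1.x) runs the NumPy-based function as an op inside the graph
    return tf.py_func(kappa_np, [y_true, y_pred], tf.float32)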

Related

Simple way to convert tensor to numpy array without eager mode in TF 2.2

I can't find a simple way to convert a tensor to a NumPy array without enabling eager mode, which gives a nice .numpy() method, but also slows down my model training.
I'd be super grateful for your suggestions. For context, I'm writing a custom metric for my TensorFlow model that relies on a scikit learn function, which only takes numpy arrays.
I've tried wrapping the tensors with np.array(), which throws a not implemented error. Also gave sessions and .eval() a go, but didn't get it to work either and seemed like too much for this simple job.
My specific error:
NotImplementedError: Cannot convert a symbolic Tensor (model_17/dense_17/Sigmoid:0) to a numpy array.
# Custom metric
def accuracy_ml(y_true, y_pred):
    return accuracy_score(y_true, np.round(y_pred))  # ERROR here: feeding a tensor to the sklearn function
# Model
cnn = simple_model(input_shape=(224, 224, 3),
                   num_classes=10,
                   base_model=base_ResNet101)

lr = 1e-2
loss_fn = tf.keras.losses.BinaryCrossentropy()
metrics = [accuracy_ml]

cnn.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=lr),
            loss=loss_fn,
            metrics=metrics)

# Simple baseline eval that fails
validation_steps = 17
loss0, accuracy0 = cnn.evaluate(validation_batches, steps=validation_steps)
Wrapping my NumPy metric with tf.numpy_function() solved it. https://www.tensorflow.org/api_docs/python/tf/numpy_function
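A minimal sketch of that kind of wrapper (assuming binary 0/1 labels so that accuracy_score accepts the arrays directly):
import tensorflow as tf
import numpy as np
from sklearn.metrics import accuracy_score

def accuracy_ml_np(y_true, y_pred):
    # Runs on NumPy arrays, so the sklearn call works unchanged
    return np.float32(accuracy_score(y_true, np.round(y_pred)))

def accuracy_ml(y_true, y_pred):
    # tf.numpy_function wraps the NumPy-based metric as an op in the graph
    return tf.numpy_function(accuracy_ml_np, [y_true, y_pred], tf.float32)

metrics = [accuracy_ml]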

Using a keras model in a custom keras loss

I have a regular keras model called e and I would like to compare its output for both y_pred and y_true in my custom loss function.
from keras import backend as K

def custom_loss(y_true, y_pred):
    return K.mean(K.square(e.predict(y_pred) - e.predict(y_true)), axis=-1)
I am getting the error: AttributeError: 'Tensor' object has no attribute 'ndim'
This is because y_true and y_pred are both Tensor objects, while keras.model.predict expects to be passed a numpy.array.
Any idea how I may succeed in using my keras.model in my custom loss function?
I am open to getting the output of a specified layer if need be or to converting my keras.model to a tf.estimator object (or anything else).
First, let's try to understand the error message you're getting:
AttributeError: 'Tensor' object has no attribute 'ndim'
Let's take a look at the Keras documentation and find the predict method of the Keras model. We can see the description of the function's parameters:
x: the input data, as a Numpy array.
So the model is trying to read the ndim attribute of what it assumes is a NumPy array, because it expects an array as input. On the other hand, a custom loss function in the Keras framework gets tensors as inputs. So don't write plain Python/NumPy code inside it - it will never be executed during evaluation. This function is only called once, to construct the computational graph.
Okay, now that we found out the meaning behind that error message, how can we use a Keras model inside custom loss function? Simple! We just need to get the evaluation graph of the model.
Update
The use of the global keyword is bad coding practice. Also, now in 2020 we have a better functional API in Keras that makes hacks with layers unnecessary. It's better to use something like this:
from keras import backend as K

def make_custom_loss(model):
    """Creates a loss function that uses `model` for evaluation."""
    def custom_loss(y_true, y_pred):
        return K.mean(K.square(model(y_pred) - model(y_true)), axis=-1)
    return custom_loss

custom_loss = make_custom_loss(e)
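For completeness, the returned loss is then passed to compile as usual (model here stands in for the network being trained, which isn't shown in the question):
model.compile(optimizer='adam', loss=custom_loss)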
Deprecated
Try something like this (only for Sequential models and very old API):
def custom_loss(y_true, y_pred):
    # Your model exists in global scope
    global e

    # Get the layers of your model
    layers = [l for l in e.layers]

    # Construct a graph to evaluate your other model on y_pred
    eval_pred = y_pred
    for i in range(len(layers)):
        eval_pred = layers[i](eval_pred)

    # Construct a graph to evaluate your other model on y_true
    eval_true = y_true
    for i in range(len(layers)):
        eval_true = layers[i](eval_true)

    # Now do what you wanted to do with the outputs.
    # Note that we are not returning the values, but a tensor.
    return K.mean(K.square(eval_pred - eval_true), axis=-1)
Please note that the code above is not tested. However, the general idea stays the same regardless of the implementation: you need to construct a graph through which y_true and y_pred flow to the final operations.

How can I use TensorFlow's sampled softmax loss function in a Keras model?

I'm training a language model in Keras and would like to speed up training by using sampled softmax as the final activation function in my network. From the TF docs, it looks like I need to supply arguments for weights and biases, but I'm unsure of what is expected as input for these. It seems like I could write a custom function in Keras as follows:
import keras.backend as K

def sampled_softmax(weights, biases, y_true, y_pred, num_sampled, num_classes):
    return K.sampled_softmax(weights, biases, y_true, y_pred, num_sampled, num_classes)
However, I'm unsure of how to "plug this in" to my existing network. The architecture for the LM is pretty dead-simple:
model = Sequential()
model.add(Embedding(input_dim=len(vocab), output_dim=256))
model.add(LSTM(1024, return_sequences=True))
model.add(Dense(output_dim=len(vocab), activation='softmax'))
model.compile(loss='categorical_crossentropy', optimizer='adam')
Given this architecture, could I pass the sampled_softmax function as the loss argument when calling the compile method on the model? Or does this need to be written as a layer that comes after the final fully-connected layer? Any guidance here would be greatly appreciated. Thanks.
The key observation here is that the TensorFlow sampled softmax function returns actual losses, not a set of predictions over the set of possible labels to compare with the ground truth data to then compute losses as a separate step. This makes the model setup a little bit weird.
First, we add a second input layer to the model that encodes the target (training) data a second time as an input, in addition to being the target output. This is used for the labels argument of the sampled_softmax_loss function. It needs to be a Keras input, because it's treated as an input when we go to instantiate and set up the model.
Second, we construct a new custom Keras layer that calls the sampled_softmax_loss function with two Keras layers as its inputs: the output of the dense layer that predicts our classes, and then the second input that contains a copy of the training data. Note that we're doing some serious hackery accessing the _keras_history instance variable to fetch the weight and bias tensors from the output tensor of the original fully-connected layer.
Finally, we have to construct a new "dumb" loss function that ignores the training data and just uses the loss reported by the sampled_softmax_loss function.
Note that because the sampled softmax function returns losses, not class predictions, you can't use this model specification for validation or inference. You'll need to re-use the trained layers from this "training version" in a new specification that applies a standard softmax function to the original dense layer which has the default activation function applied.
There is definitely a more elegant way to do this, but I believe this works, so I figured I'd post it here now as-is rather than wait until I have something that's a little bit neater. For example, you'd probably want to make the number of classes an argument of the SampledSoftmax layer, or better yet, condense this all into the loss function as in the original question and avoid passing in the training data twice.
from keras.models import Model
from keras.layers import Input, Dense, Layer
from keras import backend as K

class SampledSoftmax(Layer):
    def __init__(self, **kwargs):
        super(SampledSoftmax, self).__init__(**kwargs)

    def call(self, inputs):
        """
        The first input should be the model as it were, and the second the
        target (i.e., a repeat of the training data) to compute the labels
        argument
        """
        # the labels input to this function is batch size by 1, where the
        # value at position (i, 1) is the index that is true (not zero)
        # e.g., (0, 0, 1) => (2) or (0, 1, 0, 0) => (1)
        return K.tf.nn.sampled_softmax_loss(
            weights=inputs[0]._keras_history[0].weights[0],
            biases=inputs[0]._keras_history[0].bias,
            inputs=inputs[0],
            labels=K.tf.reshape(K.tf.argmax(inputs[1], 1), [-1, 1]),
            num_sampled=1000,
            num_classes=200000)

def custom_loss(y_true, y_pred):
    return K.tf.reduce_mean(y_pred)

num_classes = 200000
input = Input(shape=(300,))
target_input = Input(shape=(num_classes,))

dense = Dense(num_classes)

outputs = dense(input)
outputs = SampledSoftmax()([outputs, target_input])

model = Model([input, target_input], outputs)
model.compile(optimizer=u'adam', loss=custom_loss)
# train as desired
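As a rough sketch of the "inference version" described above, re-using the dense layer from the code block with a plain softmax in place of the SampledSoftmax layer:
from keras.layers import Activation

# Re-use the trained `dense` layer and apply a standard softmax for prediction
inference_outputs = Activation('softmax')(dense(input))
inference_model = Model(input, inference_outputs)
# inference_model shares weights with the training model above, so it can be
# used for validation and inference after training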

Keras fails to set dynamic shape of layer properly

I am using keras==2.0.8 with tensorflow==1.3.0 backend.
Here is the example which I am confused with:
from keras.layers import Input, Reshape, Conv2DTranspose
x = Input((5000,))
y = Reshape((25, 25, 8))(x)
y = Conv2DTranspose(10, 5, padding='same', strides=2)(y)
print(y)
It's just part of my model, and after these lines I use y in some tensorflow operations, but the code above prints a node of shape (?, ?, ?, 10). I have no idea why TF cannot deduce the height and width of the resulting tensor statically. (I know that Keras can, but I want a TF node with the proper shape.)
If you intend to use these tensorflow operations in a keras model, you have to use them inside Lambda layers.
In the function you create for the Lambda layer, you can use the given tensor normally. Unless you have a very specific reason for needing this fixed size to be explicit in tensorflow, there won't be any problem. Is there any special need that demands the tensorflow tensor to have an explicit shape?
In Keras, you can always use K.shape() on a keras tensor to get its shape. Many keras backend functions can take this shape (mostly with tensorflow) as input. If you can use keras backend functions instead of pure tensorflow functions, your code may be portable to other backends later.
Example of function:
def tensorflowPart(x):
    # do tensorflow operations with the tensor x
    shape = K.shape(x)  # use the shape of the tensor, as a tensor
    # more tensorflow operations
    return result
Use the lambda layer in your model:
y = Lambda(tensorflowPart)(y)
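Put together with the model from the question, a sketch might look like this (the reshape inside the Lambda is just an arbitrary example of an operation that needs the runtime shape):
from keras.layers import Input, Reshape, Conv2DTranspose, Lambda
from keras.models import Model
from keras import backend as K

def tensorflowPart(x):
    shape = K.shape(x)                   # dynamic shape, as a tensor
    return K.reshape(x, [shape[0], -1])  # example op that uses the runtime shape

x = Input((5000,))
y = Reshape((25, 25, 8))(x)
y = Conv2DTranspose(10, 5, padding='same', strides=2)(y)
y = Lambda(tensorflowPart)(y)
model = Model(x, y)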

Keras weights and get_weights() show different values

I am using Keras with Tensorflow. A Keras layer has a method get_weights() and an attribute weights. My understanding is that weights outputs the Tensorflow tensors of the weights, and get_weights() evaluates those weight tensors and outputs the values as numpy arrays. However, the two actually show me different values. Here is the code to replicate it.
from keras.applications.vgg19 import VGG19
import tensorflow as tf
vgg19 = VGG19(weights='imagenet', include_top=False)
vgg19.get_layer('block5_conv1').get_weights()[0][0,0,0,0]
#result is 0.0028906602, this is actually the pretrained weight
sess = tf.Session()
sess.run(tf.global_variables_initializer())
#I have to run the initializer here. Otherwise, the next line will give me an error
sess.run(vgg19.get_layer('block5_conv1').weights[0][0,0,0,0])
#The result here is -0.017039195 for me. It seems to be a random number each time.
My Keras version is 2.0.6. My Tensorflow is 1.3.0. Thank you!
The method get_weights() is indeed just evaluating the values of the Tensorflow tensors given by the attribute weights. The reason I got different values between get_weights() and sess.run(weight) is that I was referring to the variables in two different sessions. When I ran vgg19 = VGG19(weights='imagenet', include_top=False), Keras had already created a Tensorflow session and initialized the weights with pre-trained values in that session. Then I created another Tensorflow session called sess by running sess = tf.Session(). In this session, the weights were not initialized yet. So when I ran sess.run(tf.global_variables_initializer()), random numbers were assigned to the weights in that session. The key is to make sure that you are working with the same session when using Tensorflow and Keras. The following code shows that get_weights() and sess.run(weight) give the same value.
import tensorflow as tf
from keras import backend as K
from keras.applications.vgg19 import VGG19
sess = tf.Session()
K.set_session(sess)
vgg19 = VGG19(weights='imagenet', include_top=False)
vgg19.get_layer('block5_conv1').get_weights()[0][0,0,0,0]
#result is 0.0028906602, this is actually the pretrained weight
sess.run(vgg19.get_layer('block5_conv1').weights[0][0,0,0,0])
#The result here is also 0.0028906602