I am trying to pass custom metrics to Keras's model.compile. I am also learning OOP and trying to apply it to machine learning. What I want to do is track f1, precision and recall per epoch.
I can pass for example f1, recall and precision to the function as separate functions but not as a class with an init method.
Here is what I have been trying to do:
class Metrics:

    def __init__(self, y_true, y_pred):
        self.y_true = y_true
        self.y_pred = y_pred
        self.tp = K.sum(K.cast(y_true * y_pred, 'float'), axis=0)
        self.fp = K.sum(K.cast((1 - y_true) * y_pred, 'float'), axis=0)
        self.fn = K.sum(K.cast(y_true*(1 - y_pred), 'float'), axis=0)

    def precision_score(self):
        precision = self.tp / (self.tp + self.fp + K.epsilon())
        return precision

    def recall_score(self):
        recall = self.tp / (self.tp + self.fn + K.epsilon())
        return recall

    def f1_score(self):
        precision = precision_score(self.y_true, self.y_pred)
        recall = recall_score(self.y_true, self.y_pred)
        f1 = 2 * precision * recall / (precision + recall + K.epsilon())
        f1 = tf.where(tf.is_nan(f1), tf.zeros_like(f1), f1)
        f1 = K.mean(f1)
        return f1
if __name__ == '__main__':
    # Some images
    train_generator = DataGenerator().create_data()
    validation_generator = DataGenerator().create_data()

    model = create_model(
        input_shape = INPUT_SHAPE,
        n_out = N_CLASSES)

    model.compile(
        loss = 'binary_crossentropy',
        optimizer = Adam(0.03),
        # This is the part in question:
        metrics = ['acc', Metrics.f1_score, Metrics.recall_score, Metrics.precision_score]
    )

    history = model.fit_generator(
        train_generator,
        steps_per_epoch = 5,
        epochs = 5,
        validation_data = next(validation_generator),
        validation_steps = 7,
        verbose = 1
    )
It also works without the __init__ method when I pass in Metrics.f1_score, so why does it not work once the class has an __init__?
If I pass in Metrics.f1_score I get:
TypeError: f1_score() takes 1 positional argument but 2 were given
If I pass in Metrics.f1_score() I get:
TypeError: f1_score() missing 1 required positional argument: 'self'
If I pass in Metrics().f1_score I get:
TypeError: __init__() missing 2 required positional arguments: 'y_true' and 'y_pred'
If I pass in Metrics().f1_score() I get:
TypeError: __init__() missing 2 required positional arguments: 'y_true' and 'y_pred'
I'm afraid you can't do that. Keras is expecting a function that takes 2 arguments (y_true, y_pred). You are passing a function that takes 1 argument (self), so it's never going to be compatible. You can't change this behaviour because it's keras that gets to define this interface. This is why you get all the errors:
TypeError: f1_score() takes 1 positional argument but 2 were given
You passed a function that takes 1 argument (self) but Keras passed 2 (y_true,y_pred).
TypeError: f1_score() missing 1 required positional argument: 'self'
By passing it with () you are not really passing the function but calling it. You called it without arguments, but it expects 1 (self).
TypeError: __init__() missing 2 required positional arguments: 'y_true' and 'y_pred'
You are instantiating a Metrics object with 0 arguments, but your constructor (__init__) expects 2: y_true and y_pred.
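The three errors can be reproduced without Keras. Here is a minimal sketch in plain Python, assuming a hypothetical `call_metric` helper that stands in for the way Keras invokes a metric with `(y_true, y_pred)`. The class names and the toy `score` metric are illustrative only.

```python
def call_metric(fn, y_true, y_pred):
    # Stand-in for Keras (assumption): a metric is always called as fn(y_true, y_pred).
    return fn(y_true, y_pred)


class InstanceMetrics:
    def __init__(self, y_true, y_pred):
        self.y_true = y_true
        self.y_pred = y_pred

    def score(self):  # accepts only self
        return sum(t == p for t, p in zip(self.y_true, self.y_pred))


class StaticMetrics:
    @staticmethod
    def score(y_true, y_pred):  # matches the (y_true, y_pred) interface
        return sum(t == p for t, p in zip(y_true, y_pred))


def error_of(thunk):
    # Run thunk and return the TypeError message it raises, or None.
    try:
        thunk()
    except TypeError as exc:
        return str(exc)
    return None


y_true, y_pred = [1, 0, 1], [1, 1, 1]

# Passing the plain function: it receives 2 arguments but accepts only self.
err_two_given = error_of(lambda: call_metric(InstanceMetrics.score, y_true, y_pred))
# Calling it instead of passing it: no self is supplied.
err_missing_self = error_of(lambda: InstanceMetrics.score())
# Instantiating without the constructor arguments.
err_missing_init = error_of(lambda: InstanceMetrics())

# The static method fits the interface and can simply be called.
matches = call_metric(StaticMetrics.score, y_true, y_pred)
```

Printing the three `err_*` strings shows the same kinds of messages as in the question, while `matches` is computed without any error.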
If you wanted to group all your custom metrics in a Class, they would have to be static methods. A static method cannot access instance variables, because it does not receive a self argument. It would look like this:
class Metrics:

    @staticmethod
    def precision_score(tp, fp):
        precision = tp / (tp + fp + K.epsilon())
        return precision

    @staticmethod
    def recall_score(tp, fn):
        recall = tp / (tp + fn + K.epsilon())
        return recall

    @staticmethod
    def f1_score(y_true,y_pred):
        tp = K.sum(K.cast(y_true * y_pred, 'float'), axis=0)
        fp = K.sum(K.cast((1 - y_true) * y_pred, 'float'), axis=0)
        fn = K.sum(K.cast(y_true*(1 - y_pred), 'float'), axis=0)
        precision = Metrics.precision_score(tp,fp)
        recall = Metrics.recall_score(tp, fn)
        f1 = 2 * precision * recall / (precision + recall + K.epsilon())
        f1 = tf.where(tf.is_nan(f1), tf.zeros_like(f1), f1)
        f1 = K.mean(f1)
        return f1
This way you can pass Metrics.f1_score to Keras. There is almost no difference between this Metrics class and having all these 3 static methods as module level functions, it's just a different way to group related functionality together. There is even a third way: use nested functions and drop the class altogether:
def f1_score(y_true,y_pred):

    def precision_score(tp, fp):
        precision = tp / (tp + fp + K.epsilon())
        return precision

    def recall_score(tp, fn):
        recall = tp / (tp + fn + K.epsilon())
        return recall

    tp = K.sum(K.cast(y_true * y_pred, 'float'), axis=0)
    fp = K.sum(K.cast((1 - y_true) * y_pred, 'float'), axis=0)
    fn = K.sum(K.cast(y_true*(1 - y_pred), 'float'), axis=0)
    precision = precision_score(tp,fp)
    recall = recall_score(tp, fn)
    f1 = 2 * precision * recall / (precision + recall + K.epsilon())
    f1 = tf.where(tf.is_nan(f1), tf.zeros_like(f1), f1)
    f1 = K.mean(f1)
    return f1
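To sanity-check the arithmetic outside of Keras, the same formulas can be written with NumPy. This is only a sketch, assuming multi-label targets and already-binarized predictions of shape (samples, classes). `EPS` stands in for `K.epsilon()` and the function name `f1_score_np` is hypothetical.

```python
import numpy as np

EPS = 1e-7  # assumption: plays the role of K.epsilon()


def f1_score_np(y_true, y_pred):
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    # Per-class counts, summed over the sample axis.
    tp = np.sum(y_true * y_pred, axis=0)
    fp = np.sum((1 - y_true) * y_pred, axis=0)
    fn = np.sum(y_true * (1 - y_pred), axis=0)
    precision = tp / (tp + fp + EPS)
    recall = tp / (tp + fn + EPS)
    f1 = 2 * precision * recall / (precision + recall + EPS)
    # Average the per-class F1 values.
    return float(np.mean(f1))


# Two classes, four samples: class 0 has tp=1, fp=1, fn=1; class 1 is perfect.
y_true = [[1, 0], [1, 1], [0, 1], [0, 0]]
y_pred = [[1, 0], [0, 1], [0, 1], [1, 0]]
macro_f1 = f1_score_np(y_true, y_pred)
```

Here class 0 has an F1 of about 0.5 and class 1 of about 1.0, so the mean comes out close to 0.75.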
I want to use a custom quantile loss (pinball loss) as the objective in XGBRegressor.
I use this code:
def xgb_quantile_eval(preds, dmatrix, quantile=0.2):
    labels = dmatrix.get_label()
    return ('q{}_loss'.format(quantile),
            np.nanmean((preds >= labels) * (1 - quantile) * (preds - labels) +
                       (preds < labels) * quantile * (labels - preds)))

def xgb_quantile_obj(preds, dmatrix, quantile=0.2):
    try:
        assert 0 <= quantile <= 1
    except AssertionError:
        raise ValueError("Quantile value must be float between 0 and 1.")
    labels = dmatrix.get_label()
    errors = preds - labels
    left_mask = errors < 0
    right_mask = errors > 0
    grad = -quantile * left_mask + (1 - quantile) * right_mask
    hess = np.ones_like(preds)
    return grad, hess
And I build the model like this:
def XGB(q, X_train, Y_train, X_valid, Y_valid, X_test):
    # (a) Modeling
    model = XGBRegressor(objective=xgb_quantile_obj, alpha=q,
                         n_estimators=10000, bagging_fraction=0.7, learning_rate=0.027, subsample=0.7)
    model.fit(X_train, Y_train, eval_metric = xgb_quantile_eval,
              eval_set=[(X_valid, Y_valid)], early_stopping_rounds=300, verbose=500)
    # (b) Predictions
    pred = pd.Series(model.predict(X_test).round(2))
    return model, pred
But I get an error:
models_2, results_2 = XGB(0.5, X_train_1, Y_train_1, X_valid_1, Y_valid_1, X_test)
results_2
AttributeError: 'numpy.ndarray' object has no attribute 'get_label'
I am not sure whether I am doing this correctly. Please help.
Oh, I found the error: the scikit-learn wrapper passes a plain label array rather than a DMatrix, so I have to swap the order of preds and dmatrix in xgb_quantile_obj and change dmatrix to labels.
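With the scikit-learn wrapper (`XGBRegressor`), a custom objective is called with the label array first and the prediction array second, which is why `get_label` is not available on the argument. Below is a sketch of the corrected gradient function under that assumption, exercised here only with NumPy arrays. The name `quantile_obj` is hypothetical, and the evaluation metric is left out because the way it is called differs between XGBoost versions.

```python
import numpy as np


def quantile_obj(labels, preds, quantile=0.2):
    # Argument order assumed from the scikit-learn API: (y_true, y_pred).
    if not 0 <= quantile <= 1:
        raise ValueError("Quantile value must be float between 0 and 1.")
    errors = preds - labels
    left_mask = errors < 0    # under-prediction
    right_mask = errors > 0   # over-prediction
    grad = -quantile * left_mask + (1 - quantile) * right_mask
    hess = np.ones_like(preds)
    return grad, hess


labels = np.array([1.0, 2.0, 3.0])
preds = np.array([0.5, 2.0, 4.0])
grad, hess = quantile_obj(labels, preds, quantile=0.2)
```

If a quantile other than the default is needed, it could be bound beforehand, for example with `functools.partial(quantile_obj, quantile=q)`, before the callable is handed to the model.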
I am working on CapsNet, using the code from here. The experiments run on Google Colab with TensorFlow 2.4.0. I am getting the following error:
AttributeError: in user code:
/usr/local/lib/python3.6/dist-packages/tensorflow/python/keras/engine/training.py:805 train_function *
return step_function(self, iterator)
/content/drive/My Drive/Cervical GAN/Segmentation/Cheng-Lin-Li/SegCaps-master-aashish/utils/custom_losses.py:102 dice_loss *
return 1-dice_soft(y_true, y_pred, from_logits=False)
/content/drive/My Drive/Cervical GAN/Segmentation/Cheng-Lin-Li/SegCaps-master-aashish/utils/custom_losses.py:41 dice_soft *
y_pred = tf.log(y_pred / (1 - y_pred))
AttributeError: module 'tensorflow' has no attribute 'log'
The following is custom_losses.py:
'''
Capsules for Object Segmentation (SegCaps)
Original Paper: https://arxiv.org/abs/1804.04241
Code written by: Rodney LaLonde
If you use significant portions of this code or the ideas from our paper, please cite it :)
If you have any questions, please email me at lalonde@knights.ucf.edu.
This file contains the definitions of custom loss functions not present in the default Keras.
=====
This program includes all custom loss functions UNet, tiramisu, Capsule Nets (capsbasic) or SegCaps(segcapsr1 or segcapsr3).
@author: Cheng-Lin Li a.k.a. Clark
@copyright: 2018 Cheng-Lin Li@Insight AI. All rights reserved.
@license: Licensed under the Apache License v2.0. http://www.apache.org/licenses/
@contact: clark.cl.li@gmail.com
Enhancement:
1. Revise default loss_type to jaccard on dice_soft function.
2. add bce_dice_loss for future usage.
'''
import tensorflow as tf
from keras import backend as K
from keras.losses import binary_crossentropy
def dice_soft(y_true, y_pred, loss_type='jaccard', axis=[1,2,3], smooth=1e-5, from_logits=False):
    """Soft dice (Sørensen or Jaccard) coefficient for comparing the similarity
    of two batch of data, usually be used for binary image segmentation
    i.e. labels are binary. The coefficient between 0 to 1, 1 means totally match.

    Parameters
    -----------
    y_pred : tensor
        A distribution with shape: [batch_size, ....], (any dimensions).
    y_true : tensor
        A distribution with shape: [batch_size, ....], (any dimensions).
    loss_type : string
        ``jaccard`` or ``sorensen``, default is ``jaccard``.
    axis : list of integer
        All dimensions are reduced, default ``[1,2,3]``.
    smooth : float
        This small value will be added to the numerator and denominator.
        If both y_pred and y_true are empty, it makes sure dice is 1.
        If either y_pred or y_true are empty (all pixels are background), dice = ``smooth/(small_value + smooth)``,
        then if smooth is very small, dice close to 0 (even the image values lower than the threshold),
        so in this case, higher smooth can have a higher dice.

    Examples
    ---------
    >>> outputs = tl.act.pixel_wise_softmax(network.outputs)
    >>> dice_loss = 1 - tl.cost.dice_coe(outputs, y_)

    References
    -----------
    - `Wiki-Dice <https://en.wikipedia.org/wiki/Sørensen–Dice_coefficient>`_
    """
    if not from_logits:
        # transform back to logits
        _epsilon = tf.convert_to_tensor(1e-7, y_pred.dtype.base_dtype)
        y_pred = tf.clip_by_value(y_pred, _epsilon, 1 - _epsilon)
        y_pred = tf.log(y_pred / (1 - y_pred))

    inse = tf.reduce_sum(y_pred * y_true, axis=axis)
    if loss_type == 'jaccard':
        l = tf.reduce_sum(y_pred * y_pred, axis=axis)
        r = tf.reduce_sum(y_true * y_true, axis=axis)
    elif loss_type == 'sorensen':
        l = tf.reduce_sum(y_pred, axis=axis)
        r = tf.reduce_sum(y_true, axis=axis)
    else:
        raise Exception("Unknow loss_type")
    ## old axis=[0,1,2,3]
    # dice = 2 * (inse) / (l + r)
    # epsilon = 1e-5
    # dice = tf.clip_by_value(dice, 0, 1.0-epsilon) # if all empty, dice = 1
    ## new haodong
    dice = (2. * inse + smooth) / (l + r + smooth)
    ##
    dice = tf.reduce_mean(dice)
    return dice
def dice_hard(y_true, y_pred, threshold=0.5, axis=[1,2,3], smooth=1e-5):
    """Non-differentiable Sørensen–Dice coefficient for comparing the similarity
    of two batch of data, usually be used for binary image segmentation i.e. labels are binary.
    The coefficient between 0 to 1, 1 if totally match.

    Parameters
    -----------
    y_pred : tensor
        A distribution with shape: [batch_size, ....], (any dimensions).
    y_true : tensor
        A distribution with shape: [batch_size, ....], (any dimensions).
    threshold : float
        The threshold value to be true.
    axis : list of integer
        All dimensions are reduced, default ``[1,2,3]``.
    smooth : float
        This small value will be added to the numerator and denominator, see ``dice_coe``.

    References
    -----------
    - `Wiki-Dice <https://en.wikipedia.org/wiki/Sørensen–Dice_coefficient>`_
    """
    y_pred = tf.cast(y_pred > threshold, dtype=tf.float32)
    y_true = tf.cast(y_true > threshold, dtype=tf.float32)
    inse = tf.reduce_sum(tf.multiply(y_pred, y_true), axis=axis)
    l = tf.reduce_sum(y_pred, axis=axis)
    r = tf.reduce_sum(y_true, axis=axis)
    ## old axis=[0,1,2,3]
    # hard_dice = 2 * (inse) / (l + r)
    # epsilon = 1e-5
    # hard_dice = tf.clip_by_value(hard_dice, 0, 1.0-epsilon)
    ## new haodong
    hard_dice = (2. * inse + smooth) / (l + r + smooth)
    ##
    hard_dice = tf.reduce_mean(hard_dice)
    return hard_dice
def dice_loss(y_true, y_pred, from_logits=False):
    return 1-dice_soft(y_true, y_pred, from_logits=False)

def bce_dice_loss(y_true, y_pred):
    return binary_crossentropy(y_true, y_pred) + dice_loss(y_true, y_pred)
def weighted_binary_crossentropy_loss(pos_weight):
    # pos_weight: A coefficient to use on the positive examples.
    def weighted_binary_crossentropy(target, output, from_logits=False):
        """Binary crossentropy between an output tensor and a target tensor.
        # Arguments
            target: A tensor with the same shape as `output`.
            output: A tensor.
            from_logits: Whether `output` is expected to be a logits tensor.
                By default, we consider that `output`
                encodes a probability distribution.
        # Returns
            A tensor.
        """
        # Note: tf.nn.sigmoid_cross_entropy_with_logits
        # expects logits, Keras expects probabilities.
        if not from_logits:
            # transform back to logits
            _epsilon = tf.convert_to_tensor(1e-7, output.dtype.base_dtype)
            output = tf.clip_by_value(output, _epsilon, 1 - _epsilon)
            output = tf.log(output / (1 - output))
        return tf.nn.weighted_cross_entropy_with_logits(targets=target,
                                                        logits=output,
                                                        pos_weight=pos_weight)
    return weighted_binary_crossentropy
def margin_loss(margin=0.4, downweight=0.5, pos_weight=1.0):
    '''
    Args:
        margin: scalar, the margin after subtracting 0.5 from raw_logits.
        downweight: scalar, the factor for negative cost.
    '''
    def _margin_loss(labels, raw_logits):
        """Penalizes deviations from margin for each logit.
        Each wrong logit costs its distance to margin. For negative logits margin is
        0.1 and for positives it is 0.9. First subtract 0.5 from all logits. Now
        margin is 0.4 from each side.
        Args:
            labels: tensor, one hot encoding of ground truth.
            raw_logits: tensor, model predictions in range [0, 1]
        Returns:
            A tensor with cost for each data point of shape [batch_size].
        """
        logits = raw_logits - 0.5
        positive_cost = pos_weight * labels * tf.cast(tf.less(logits, margin),
                                                      tf.float32) * tf.pow(logits - margin, 2)
        negative_cost = (1 - labels) * tf.cast(
            tf.greater(logits, -margin), tf.float32) * tf.pow(logits + margin, 2)
        return 0.5 * positive_cost + downweight * 0.5 * negative_cost
    return _margin_loss
The error above occurs when using the dice loss; with the BCE loss there is no error. I have tried tf.math.log instead of tf.log, but I then get the following error:
TypeError: in user code:
/usr/local/lib/python3.6/dist-packages/tensorflow/python/keras/engine/training.py:805 train_function *
return step_function(self, iterator)
/content/drive/MyDrive/Cervical GAN/Segmentation/Cheng-Lin-Li/SegCaps-master-aashish/utils/custom_losses.py:102 dice_loss *
return 1-dice_soft(y_true, y_pred, from_logits=False)
/content/drive/MyDrive/Cervical GAN/Segmentation/Cheng-Lin-Li/SegCaps-master-aashish/utils/custom_losses.py:43 dice_soft *
inse = tf.reduce_sum(y_pred * y_true, axis=axis)
/usr/local/lib/python3.6/dist-packages/tensorflow/python/ops/math_ops.py:1180 binary_op_wrapper
raise e
/usr/local/lib/python3.6/dist-packages/tensorflow/python/ops/math_ops.py:1164 binary_op_wrapper
return func(x, y, name=name)
/usr/local/lib/python3.6/dist-packages/tensorflow/python/ops/math_ops.py:1496 _mul_dispatch
return multiply(x, y, name=name)
/usr/local/lib/python3.6/dist-packages/tensorflow/python/util/dispatch.py:201 wrapper
return target(*args, **kwargs)
/usr/local/lib/python3.6/dist-packages/tensorflow/python/ops/math_ops.py:518 multiply
return gen_math_ops.mul(x, y, name)
/usr/local/lib/python3.6/dist-packages/tensorflow/python/ops/gen_math_ops.py:6078 mul
"Mul", x=x, y=y, name=name)
/usr/local/lib/python3.6/dist-packages/tensorflow/python/framework/op_def_library.py:558 _apply_op_helper
inferred_from[input_arg.type_attr]))
TypeError: Input 'y' of 'Mul' Op has type uint8 that does not match type float32 of argument 'x'.
The error
TypeError: Input 'y' of 'Mul' Op has type uint8 that does not match type float32 of argument 'x'.
indicates that y does not match the type of x in x * y. This can be fixed by casting to tf.float32.
The problem arises in this line in dice_soft:
inse = tf.reduce_sum(y_pred * y_true, axis=axis)
So one solution is to use tf.cast to cast y_true to the same type as y_pred before the multiplication, for example y_true = tf.cast(y_true, y_pred.dtype).
I'm trying to implement in TF 2.2 the loss function from this paper (an existing version in TensorFlow 1.10.1, made by the author of the paper can be found here).
However, the theoretical details of the loss function are not relevant to my problem.
My loss function:
def z_score_based_deviation_loss(y_true, y_pred):
    confidence_margin = 5.0
    ref = K.variable(np.random.normal(loc=0., scale=1.0, size=5000), dtype='float32')
    dev = (y_pred - K.mean(ref)) / K.std(ref)
    inlier_loss = K.abs(dev)
    outlier_loss = K.abs(K.maximum(confidence_margin - dev, 0.))
    return K.mean((1 - y_true) * inlier_loss + y_true * outlier_loss)
Calling it by:
model.compile(optimizer='adam', loss=z_score_based_deviation_loss)
I got this error:
ValueError: tf.function-decorated function tried to create variables on non-first call.
I know there are other questions on this subject, but I haven't found any that are related to a custom loss function in Keras.
I can't figure out how to adapt it.
Also, from what I've read, the problem should only exist when the function is decorated with @tf.function (which is recommended anyway to speed up the computation), but I didn't add it...
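Try it this way. tf.random.normal returns a plain tensor, whereas K.variable tries to create a new variable every time the loss is traced, which is what triggers the error: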
import numpy as np
import tensorflow as tf
from tensorflow.keras import backend as K
from tensorflow.keras.layers import Input, Dense
from tensorflow.keras.models import Model

def z_score_based_deviation_loss(y_true, y_pred):
    confidence_margin = 5.0
    # a random tensor, not a variable
    ref = tf.random.normal([5000], mean=0., stddev=1.)
    dev = (y_pred - K.mean(ref)) / K.std(ref)
    inlier_loss = K.abs(dev)
    outlier_loss = K.abs(K.maximum(confidence_margin - dev, 0.))
    return K.mean((1 - y_true) * inlier_loss + y_true * outlier_loss)

# dummy data and a one-layer model to check that training runs
X = np.random.uniform(0,1, (100,10))
y = np.random.uniform(0,1, 100)
x_input = Input((10,))
intermediate = Dense(1, activation='linear', name = 'score')(x_input)
model = Model(x_input, intermediate)
model.compile('adam', z_score_based_deviation_loss)
model.fit(X, y, epochs=10)
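The arithmetic of the loss can also be checked outside the graph. This is a NumPy sketch using the same settings (5000 reference samples from a standard normal, confidence margin 5.0). The function name `deviation_loss_np`, the fixed seed and the toy scores are illustrative assumptions.

```python
import numpy as np


def deviation_loss_np(y_true, y_pred, confidence_margin=5.0, seed=0):
    # Reference scores drawn from a standard normal (seed fixed for repeatability).
    rng = np.random.default_rng(seed)
    ref = rng.normal(loc=0.0, scale=1.0, size=5000)
    # Z-score of each predicted score relative to the reference sample.
    dev = (y_pred - ref.mean()) / ref.std()
    inlier_loss = np.abs(dev)
    outlier_loss = np.abs(np.maximum(confidence_margin - dev, 0.0))
    return float(np.mean((1 - y_true) * inlier_loss + y_true * outlier_loss))


y_true = np.array([0.0, 0.0, 1.0, 1.0])       # last two samples are outliers
low_scores = np.array([0.0, 0.0, 0.0, 0.0])   # outliers not separated from inliers
high_scores = np.array([0.0, 0.0, 6.0, 6.0])  # outliers pushed past the margin
loss_low = deviation_loss_np(y_true, low_scores)
loss_high = deviation_loss_np(y_true, high_scores)
```

As expected for this kind of loss, `loss_high` is much smaller than `loss_low`, because the outlier scores deviate from the reference distribution by more than the margin.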
I have a question concerning the implementation of a correlation-based loss function for a sequence labelling task in Keras (Tensorflow backend).
Consider we have a sequence labelling problem, e.g., the input is a tensor of shape (20,100,5), the output is a tensor of shape (20,100,1).
In the documentation it is written that the loss function needs to return a "scalar for each data point". For tensors of shape (20,100,1), the default MSE loss returns a loss tensor of shape (20,100).
Now, if we use a loss function based on the correlation coefficient for each sequence, in theory, we will get only a single value for each sequence, i.e., a tensor of shape (20,).
However, using this in Keras as a loss function, fit() returns an error as a tensor of shape (20,100) is expected.
On the other hand, there is no error when I either
1. return just the mean value of the tensor (a single scalar for the whole data), or
2. repeat the tensor (using K.repeat_elements), ending up with a tensor of shape (20,100).
In both cases the framework (Tensorflow backend) does not return an error, the loss decreases over the epochs, and the performance on independent test data is good.
My questions are:
1. Which dimensionality of the targets/losses does the "fit" function usually assume in case of sequences?
2. Is the Tensorflow backend able to derive the gradients properly also with only the mean value returned?
Please find below an executable example with my implementations of correlation-based loss functions.
my_loss_1 returns only the mean value of the correlation coefficients of all (20) sequences.
my_loss_2 returns only one loss for each sequence (does not work in a real training).
my_loss_3 repeats the loss for each sample within each sequence.
Many thanks and best wishes
from keras import backend as K
from keras.losses import mean_squared_error
import numpy as np
import tensorflow as tf
def my_loss_1(seq1, seq2): # Correlation-based loss function - version 1 - return scalar
    seq1 = K.squeeze(seq1, axis=-1)
    seq2 = K.squeeze(seq2, axis=-1)
    seq1_mean = K.mean(seq1, axis=-1, keepdims=True)
    seq2_mean = K.mean(seq2, axis=-1, keepdims=True)
    nominator = K.sum((seq1-seq1_mean) * (seq2-seq2_mean), axis=-1)
    denominator = K.sqrt( K.sum(K.square(seq1-seq1_mean), axis=-1) * K.sum(K.square(seq2-seq2_mean), axis=-1) )
    corr = nominator / (denominator + K.common.epsilon())
    corr_loss = K.constant(1.) - corr
    corr_loss = K.mean(corr_loss)
    return corr_loss

def my_loss_2(seq1, seq2): # Correlation-based loss function - version 2 - return 1D array
    seq1 = K.squeeze(seq1, axis=-1)
    seq2 = K.squeeze(seq2, axis=-1)
    seq1_mean = K.mean(seq1, axis=-1, keepdims=True)
    seq2_mean = K.mean(seq2, axis=-1, keepdims=True)
    nominator = K.sum((seq1-seq1_mean) * (seq2-seq2_mean), axis=-1)
    denominator = K.sqrt( K.sum(K.square(seq1-seq1_mean), axis=-1) * K.sum(K.square(seq2-seq2_mean), axis=-1) )
    corr = nominator / (denominator + K.common.epsilon())
    corr_loss = K.constant(1.) - corr
    return corr_loss

def my_loss_3(seq1, seq2): # Correlation-based loss function - version 3 - return 2D array
    seq1 = K.squeeze(seq1, axis=-1)
    seq2 = K.squeeze(seq2, axis=-1)
    seq1_mean = K.mean(seq1, axis=-1, keepdims=True)
    seq2_mean = K.mean(seq2, axis=-1, keepdims=True)
    nominator = K.sum((seq1-seq1_mean) * (seq2-seq2_mean), axis=-1)
    denominator = K.sqrt( K.sum(K.square(seq1-seq1_mean), axis=-1) * K.sum(K.square(seq2-seq2_mean), axis=-1) )
    corr = nominator / (denominator + K.common.epsilon())
    corr_loss = K.constant(1.) - corr
    corr_loss = K.reshape(corr_loss, (-1,1))
    corr_loss = K.repeat_elements(corr_loss, K.int_shape(seq1)[1], 1) # Does not work for fit(). It seems that NO dimension may be None in order to get a value!=None from int_shape().
    return corr_loss
# Test
sess = tf.Session()
# input (20,100,1)
a1 = np.random.rand(20,100,1)
a2 = np.random.rand(20,100,1)
print('\nInput: ' + str(a1.shape))
p1 = K.placeholder(shape=a1.shape, dtype=tf.float32)
p2 = K.placeholder(shape=a1.shape, dtype=tf.float32)
loss0 = mean_squared_error(p1,p2)
print('\nMSE:') # output: (20,100)
print(sess.run(loss0, feed_dict={p1: a1, p2: a2}))
loss1 = my_loss_1(p1,p2)
print('\nCorrelation coefficient:') # output: ()
print(sess.run(loss1, feed_dict={p1: a1, p2: a2}))
loss2 = my_loss_2(p1,p2)
print('\nCorrelation coefficient:') # output: (20,)
print(sess.run(loss2, feed_dict={p1: a1, p2: a2}))
loss3 = my_loss_3(p1,p2)
print('\nCorrelation coefficient:') # output: (20,100)
print(sess.run(loss3, feed_dict={p1: a1, p2: a2}))
> Now, if we use a loss function based on the correlation coefficient
> for each sequence, in theory, we will get only a single value for each
> sequence, i.e., a tensor of shape (20,).
That's not quite true. The coefficient is something like
average((avg_label - label_value) * (average_prediction - prediction_value)) /
(std(label_value) * std(prediction_value))
Remove the overall average and you are left with the components of the correlation coefficient, one per element of the sequence, which is the right shape.
You can plug in other correlation formulas as well; just stop before computing the single value.
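To make this concrete: the per-element terms, once averaged over a sequence, reproduce the Pearson coefficient of that sequence. This is a NumPy sketch, assuming the usual Pearson normalization by the two standard deviations. The function name and the `eps` parameter are illustrative.

```python
import numpy as np


def correlation_components(seq1, seq2, eps=1e-7):
    # seq1, seq2: arrays of shape (batch, timesteps)
    d1 = seq1 - seq1.mean(axis=-1, keepdims=True)
    d2 = seq2 - seq2.mean(axis=-1, keepdims=True)
    # Product of the two standard deviations, kept as a column for broadcasting.
    denom = np.sqrt((d1 ** 2).mean(axis=-1, keepdims=True)
                    * (d2 ** 2).mean(axis=-1, keepdims=True))
    # One component per element: shape (batch, timesteps).
    return d1 * d2 / (denom + eps)


rng = np.random.default_rng(0)
a = rng.random((20, 100))
b = rng.random((20, 100))

components = correlation_components(a, b)
per_sequence = components.mean(axis=-1)  # average over each sequence
reference = np.array([np.corrcoef(x, y)[0, 1] for x, y in zip(a, b)])
```

`per_sequence` agrees with `reference` up to the small `eps` term, while `components` keeps the (batch, timesteps) shape that the loss is expected to return.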
Thanks a lot!
Well, I thought the coefficient is already the overall (averaged) metric over a sample sequence, but your solution makes sense, indeed.
Below is my running code. The summation in the denominator has also been changed to averaging; otherwise the result would get smaller the longer the sequence is, which is not wanted because the overall loss is the mean over all losses. It works well when applied to real tasks (not shown here).
The only problem I still have is that the squeezing step at the beginning of the loss function is not so nice, but I was not able to find a nicer solution.
from keras import backend as K
from keras.losses import mean_squared_error
import numpy as np
import tensorflow as tf
def my_loss(seq1, seq2): # Correlation-based loss function
    seq1 = K.squeeze(seq1, axis=-1) # To remove the last dimension
    seq2 = K.squeeze(seq2, axis=-1) # To remove the last dimension
    seq1_mean = K.mean(seq1, axis=-1, keepdims=True)
    seq2_mean = K.mean(seq2, axis=-1, keepdims=True)
    nominator = (seq1-seq1_mean) * (seq2-seq2_mean)
    denominator = K.sqrt( K.mean(K.square(seq1-seq1_mean), axis=-1, keepdims=True) * K.mean(K.square(seq2-seq2_mean), axis=-1, keepdims=True) )
    corr = nominator / (denominator + K.common.epsilon())
    corr_loss = K.constant(1.) - corr
    return corr_loss
# Test
sess = tf.Session()
# Input (20,100,1)
a1 = np.random.rand(20,100,1)
a2 = np.random.rand(20,100,1)
print('\nInput: ' + str(a1.shape))
p1 = K.placeholder(shape=a1.shape, dtype=tf.float32)
p2 = K.placeholder(shape=a1.shape, dtype=tf.float32)
loss0 = mean_squared_error(p1,p2)
print('\nMSE:') # output: (20,100)
print(sess.run(loss0, feed_dict={p1: a1, p2: a2}))
loss1 = my_loss(p1,p2)
print('\nCorrelation coefficient-based loss:') # output: (20,100)
print(sess.run(loss1, feed_dict={p1: a1, p2: a2}))
I am trying to create a custom metric for macro (unweighted) recall = (recall of class1 + recall of class2)/2. I came up with the following code, but I am not sure how to calculate the true positives of class 0.
def unweightedRecall():
    def recall(y_true, y_pred):
        # recall of class 1
        true_positives1 = K.sum(K.round(K.clip(y_pred * y_true, 0, 1)))
        possible_positives1 = K.sum(K.round(K.clip(y_true, 0, 1)))
        recall1 = true_positives1 / (possible_positives1 + K.epsilon())

        # --- get true positive of class 0 in true_positives0 here ---
        # Also, is there a cleaner way to get possible_positives0
        possible_positives0 = K.int_shape(y_true)[0] - possible_positives1
        recall0 = true_positives0 / (possible_positives0 + K.epsilon())
        return (recall0 + recall1)/2
    return recall
It seems I will have to use Keras.backend.equal(x, y), but how do I create a tensor with shape K.int_shape(y_true)[0] and all values equal to, say, x?
Edit 1
Based on Marcin's comments, I wanted to create a custom metric based on callback in keras. While browsing issues in Keras, I came across the following code for f1 metric:
class Metrics(keras.callbacks.Callback):
    def on_epoch_end(self, batch, logs={}):
        predict = np.asarray(self.model.predict(self.validation_data[0]))
        targ = self.validation_data[1]
        self.f1s=f1(targ, predict)
        return

metrics = Metrics()
model.fit(X_train, y_train, epochs=epochs, batch_size=batch_size, validation_data=[X_test,y_test],
          verbose=1, callbacks=[metrics])
But how is the callback returning the accuracy? I wanted to implement unweighted recall = (recall class1 + recall class2)/2. I can think of the following code but would appreciate any help to complete it
from sklearn.metrics import recall_score
class Metrics(keras.callbacks.Callback):
    def on_epoch_end(self, batch, logs={}):
        predict = np.asarray(self.model.predict(self.validation_data[0]))
        targ = self.validation_data[1]
        # --- what to store the result in?? ---
        self.XXXX=recall_score(targ, predict, average='macro')
        # we really dont need to return anything ??
        return

metrics = Metrics()
model.fit(X_train, y_train, epochs=epochs, batch_size=batch_size, validation_data=[X_test,y_test],
          verbose=1, callbacks=[metrics])
Edit 2: model:
def createModelHelper(numNeurons=40, optimizer='adam'):
    inputLayer = Input(shape=(data.shape[1],))
    denseLayer1 = Dense(numNeurons)(inputLayer)
    outputLayer = Dense(1, activation='sigmoid')(denseLayer1)
    model = Model(input=inputLayer, output=outputLayer)
    model.compile(loss=unweightedRecall, optimizer=optimizer)
    return model
keras version (with the mean problem).
Are your two classes actually only one dimension output (0 or 1)?
If so:
def recall(y_true, y_pred):
    # recall of class 1
    # do not use "round" here if you're going to use this as a loss function
    true_positives = K.sum(K.round(y_pred) * y_true)
    possible_positives = K.sum(y_true)
    return true_positives / (possible_positives + K.epsilon())

def unweightedRecall(y_true, y_pred):
    return (recall(y_true,y_pred) + recall(1-y_true,1-y_pred))/2.
Now, if your two classes are actually a 2-element output:
def unweightedRecall(y_true, y_pred):
    return (recall(y_true[:,0],y_pred[:,0]) + recall(y_true[:,1],y_pred[:,1]))/2.
Callback version:
For the callback, you can use a LambdaCallback, and you manually print or store the results:
stored_metrics = []

def unweightedRecall(epoch,logs):
    # A plain function has no self, so use the validation arrays
    # that are passed to fit() (X_test, y_test in the question).
    predict = model.predict(X_test)
    targ = y_test
    result = (recall(targ,predict) + recall(1-targ,1-predict))/2.
    print("recall for epoch " + str(epoch) + ": " + str(result))
    stored_metrics.append(result)

# create the callback after the function has been defined
myCallBack = LambdaCallback(on_epoch_end=unweightedRecall)
Here recall is a version of the function that uses np instead of K, with epsilon = np.finfo(float).eps or epsilon = np.finfo(np.float32).eps.
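A sketch of such a NumPy-based `recall`, together with the unweighted (macro) recall, might look as follows. It assumes binary targets in a single column and predicted probabilities that are thresholded at 0.5 by rounding. The flattening with `ravel` is an added assumption to cope with the (N, 1) shape that `model.predict` typically returns, and `unweighted_recall` is an illustrative name.

```python
import numpy as np

EPSILON = np.finfo(np.float32).eps


def recall(y_true, y_pred):
    # Flatten so that (N,) targets and (N, 1) predictions line up element-wise.
    y_true = np.asarray(y_true, dtype=float).ravel()
    y_pred = np.asarray(y_pred, dtype=float).ravel()
    true_positives = np.sum(np.round(y_pred) * y_true)
    possible_positives = np.sum(y_true)
    return true_positives / (possible_positives + EPSILON)


def unweighted_recall(y_true, y_pred):
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    # Recall of class 1, plus recall of class 0 via the flipped labels and predictions.
    return (recall(y_true, y_pred) + recall(1 - y_true, 1 - y_pred)) / 2.0


targ = np.array([1, 1, 1, 0])
predict = np.array([[0.9], [0.8], [0.2], [0.1]])
result = unweighted_recall(targ, predict)
```

In this toy case two of the three positives and the single negative are recovered, so the result is the mean of 2/3 and 1.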