Can you process a tensor in chunks in a custom Keras loss function? - tensorflow

I am trying to write a custom Keras loss function in which I process the tensors in sub-vector chunks. For example, if an output tensor represented a concatenation of quaternion coefficients (i.e. w,x,y,z,w,x,y,z...), I might wish to normalize each quaternion before calculating the mean squared error in a loss function like:
def norm_quat_mse(y_true, y_pred):
    diff = y_pred - y_true
    dist = 0
    for i in range(0,16,4):
        dist += K.sum( K.square(diff[i:i+4] / K.sqrt(K.sum(K.square(diff[i:i+4])))))
    return dist/4
While Keras accepts this function without error and uses it in training, it outputs a different loss value from the one I get when I apply it as an independent function to the output of model.predict(), so I suspect it is not working properly. None of the built-in Keras loss functions use this per-chunk processing approach. Is it possible to do this within Keras' auto-differentiation framework?

Try:
def norm_quat_mse(y_true, y_pred):
    diff = y_pred - y_true
    dist = 0
    for i in range(0,16,4):
        dist += K.sum( K.square(diff[:,i:i+4] / K.sqrt(K.sum(K.square(diff[:,i:i+4])))))
    return dist/4
Note that the shape of y_true and y_pred is (batch_size, output_size), so you need to slice along the second dimension and leave the first (batch) dimension intact.
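One thing to watch in the snippets above: dividing each chunk of diff by its own norm makes every summed term equal to 1, so the result no longer depends on the prediction. If the intent is what the question describes (normalize each quaternion, then take the mean squared error), a minimal sketch could look like the following. It assumes TensorFlow 2.x and an output of 16 values holding 4 quaternions; the eps term and the reshape to (batch, 4, 4) are choices made for this illustration, not part of the original answer.

```python
import tensorflow as tf

def norm_quat_mse(y_true, y_pred, eps=1e-8):
    # Assumption: the last axis holds 4 concatenated quaternions (w, x, y, z).
    q_true = tf.reshape(y_true, (-1, 4, 4))  # (batch, quaternion, coefficient)
    q_pred = tf.reshape(y_pred, (-1, 4, 4))
    # Scale each predicted quaternion to unit length; eps guards against a zero norm.
    q_pred = q_pred / (tf.norm(q_pred, axis=-1, keepdims=True) + eps)
    # Mean squared error over all 16 coefficients, one value per sample.
    return tf.reduce_mean(tf.square(q_pred - q_true), axis=[1, 2])
```

Under these assumptions the function can be passed directly as loss=norm_quat_mse to model.compile(). The reshape replaces the Python loop over chunks and keeps the batch dimension separate from the per-quaternion normalization.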

Related

Keras custom loss: how do I know which patterns y_pred and y_true correspond to?

In keras you can define a custom loss with arguments (y_true, y_pred).
How do I know which input patterns they correspond to?
I mean, y_true is a tensor with batchSize elements. How can I relate those elements to the original X?
I would like to know the correspondence between y_true[0] and the corresponding X[i].
So what you would like to have is a loss function like this
def custom_loss(y_true, y_pred, X):
because you need the input for your loss calculation.
That's not directly possible in Keras, as far as I know.
One possible workaround could be to have a running index:
X = ...
Y = ...
batch_size = ...
i = 0
def custom_loss(y_true, y_pred):
    global i  # without this, `i += 1` raises UnboundLocalError
    x = X[i*batch_size:(i+1)*batch_size]
    loss = ...
    i += 1
    return loss
Make sure to reset i after each epoch. You can do this in a LambdaCallback that you pass to model.fit(). Also make sure to pass shuffle=False to model.fit().
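The running index relies on the Python body of the loss being re-executed for every batch, which is not the case when Keras traces the loss into a graph once. A commonly used alternative is to append the input values the loss needs to the target array, so they arrive inside y_true. The sketch below assumes TensorFlow 2.x; N_OUT, the function name, and the way x is used (as a per-sample weight) are hypothetical and only illustrate the slicing.

```python
import tensorflow as tf

N_OUT = 2  # hypothetical number of real target columns

def loss_with_inputs(y_true_packed, y_pred):
    # The first N_OUT columns are the targets; the remaining columns are
    # the input features that were appended to the target array.
    y_true = y_true_packed[:, :N_OUT]
    x = y_true_packed[:, N_OUT:]
    # Hypothetical use of x: weight each sample's error by its mean |input|.
    weight = tf.reduce_mean(tf.abs(x), axis=-1)
    return weight * tf.reduce_mean(tf.square(y_pred - y_true), axis=-1)
```

With this layout the targets passed to model.fit() would be something like np.concatenate([Y, X], axis=1), and shuffling no longer matters because each row carries its own inputs.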

Custom loss function with gradient calculation

I am trying to create a custom loss function in Keras. I want to compute the loss based on the input and the predicted output of the neural network. I created a custom loss function which takes y_true, y_pred and t as arguments. t is the variable that I would like to use in the custom loss calculation. The loss function has two parts (please refer to the attached image).
I can create the first part of the loss function (which is the mean squared error). I would like to slice the y_pred tensor and assign it to three tensors (y1_pred, y2_pred, and y3_pred). Is there a way to do that directly in Keras, or do I have to use TensorFlow for that? How can I calculate the gradient in Keras? Do I need to create a session to compute loss2?
def customloss(y_true, y_pred, t):
    loss1 = K.mean(K.square(y_pred - y_true), axis=-1)
    loss2 = tf.gradients(y1_pred, t) - y1_pred*y3_pred
    return loss1+loss2
Thank you.
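As a rough sketch of the two pieces asked about (slicing y_pred and differentiating with respect to the input t), the following assumes TensorFlow 2.x in eager mode, where no session is needed. toy_net is a hypothetical stand-in for the real model, and the residual mirrors the loss2 expression from the question; how it should be reduced and combined with loss1 depends on the formula in the missing image.

```python
import tensorflow as tf

def toy_net(t):
    # Hypothetical stand-in for the model: maps t of shape (batch, 1) to 3 outputs.
    return tf.concat([t ** 2, t, 3.0 * t], axis=1)

def gradient_residual(net, t):
    with tf.GradientTape() as tape:
        tape.watch(t)  # t is a plain tensor, so it must be watched explicitly
        y_pred = net(t)
        # Ordinary slicing splits the output into three (batch, 1) tensors.
        y1_pred = y_pred[:, 0:1]
        y2_pred = y_pred[:, 1:2]  # unused here, shown only for completeness
        y3_pred = y_pred[:, 2:3]
    # Each sample's y1 depends only on its own t, so this yields dy1/dt per sample.
    dy1_dt = tape.gradient(y1_pred, t)
    return dy1_dt - y1_pred * y3_pred

t = tf.constant([[1.0], [2.0]])
residual = gradient_residual(toy_net, t)
```

Because a Keras loss only receives y_true and y_pred, a term like this would typically be computed where the input is available, for example in a custom training step, rather than inside the function passed to model.compile().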

InvalidArgumentError: In[0] is not a matrix. Instead it has shape []

I'm not able to train the network using keras, getting the following error, at epoch 1, first batch:
InvalidArgumentError: In[0] is not a matrix. Instead it has shape []
[[{{node training/SGD/gradients/dense_1/MatMul_grad/MatMul}}]]
I'm trying to solve a regression problem using Keras and a custom function provided by https://github.com/farrell236/DeepPose
The network is a fairly simple VGG-like CNN.
I think the problem is the loss function. In particular, I suppose that the weight initialization is the issue (take a look at the Tensorflow example: https://github.com/farrell236/DeepPose/blob/master/tensorflow/example)
That's my loss function:
def custom_loss(y_true, y_pred):
    loss = SE3GeodesicLoss(np.ones((1, 6)))
    tf.initializers.constant([loss])
    y_pred = tf.cast(y_pred, dtype=tf.float32)
    y_true = tf.cast(y_true, dtype=tf.float32)
    loss = SE3GeodesicLoss(np.ones(6))
    geodesic_loss = loss.geodesic_loss(y_pred, y_true)
    geodesic_loss = tf.cast(geodesic_loss, dtype=tf.float32)
    return geodesic_loss
What's strange is that I'm able to use this function as a metric for the training.
Further information:
What I'm trying to do is estimate the pose of an object, with images as input and the target's relative Euler angles and distance as labels (which means 6 parameters [r_x, r_y, r_z, t_x, t_y, t_z]). I'm trying to implement this loss function in order to solve the attitude estimation problem. Other losses (such as MSE and MAE) are not effective enough for attitude regression.
Do you have any suggestion?

Need custom loss function that uses if statement

I'm trying to train a DNN that outputs 3 values (x,y,z), where x and y are the coordinates of the object I'm looking for and z is the probability that the object is present.
I need a custom loss function:
If z_true<0.5 I don't care about the x and y values, so the error should be equal to (0, 0, sqr(z_true - z_pred)),
otherwise the error should be (sqr(x_true - x_pred), sqr(y_true - y_pred), sqr(z_true - z_pred)).
I'm struggling with mixing tensors and if statements together.
Maybe this example of a custom loss function will get you up and running. It shows how a loss can be selected with an if statement. Note that the if here tests a plain Python value l when the loss is built, not a tensor.
def conditional_loss_function(l):
    def loss(y_true, y_pred):
        if l == 0:
            return loss_function1(y_true, y_pred)
        else:
            return loss_function2(y_true, y_pred)
    return loss
model.compile(loss=conditional_loss_function(l), optimizer=...)
Use switch from the Keras backend: https://keras.io/backend/#switch
It is similar to tf.cond.
How to create a custom loss in Keras is described in the question "Make a custom loss function in keras".
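For the specific loss in the question, the condition depends only on z_true, so it can be expressed as a per-sample mask instead of a Python if. The sketch below assumes TensorFlow 2.x and outputs ordered as (x, y, z); the function name is hypothetical, and summing the three error components into one value per sample is an assumption, since the question only gives the error vector.

```python
import tensorflow as tf

def masked_xyz_loss(y_true, y_pred):
    sq_err = tf.square(y_true - y_pred)  # (batch, 3): errors for x, y, z
    # 1.0 where the object is present (z_true >= 0.5), else 0.0; shape (batch, 1).
    present = tf.cast(y_true[:, 2:3] >= 0.5, sq_err.dtype)
    # x and y errors count only when the object is present; z always counts.
    weights = tf.concat([present, present, tf.ones_like(present)], axis=1)
    return tf.reduce_sum(weights * sq_err, axis=-1)
```

Because the mask is built from y_true rather than from the network output, it does not interfere with the gradients. K.switch or tf.where could be used to the same effect.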

Keras - custom loss function - chamfer distance

I am attempting object segmentation using a custom loss function as defined below:
def chamfer_loss_value(y_true, y_pred):
    # flatten the batch
    y_true_f = K.batch_flatten(y_true)
    y_pred_f = K.batch_flatten(y_pred)
    # ==========
    # get chamfer distance sum
    # error here
    y_pred_mask_f = K.cast(K.greater_equal(y_pred_f,0.5), dtype='float32')
    finalChamferDistanceSum = K.sum(y_pred_mask_f * y_true_f, axis=1, keepdims=True)
    return K.mean(finalChamferDistanceSum)
def chamfer_loss(y_true, y_pred):
    return chamfer_loss_value(y_true, y_pred)
y_pred_f is the output of my U-net. y_true_f is the result of a Euclidean distance transform on the ground truth label mask x, as shown below:
distTrans = ndimage.distance_transform_edt(1 - x)
To compute the Chamfer distance, you multiply the predicted image (ideally, a mask with 1 and 0) with the ground truth distance transform, and simply sum over all pixels. To do this, I needed to get a mask y_pred_mask_f by thresholding y_pred_f, then multiply with y_true_f, and sum over all pixels.
y_pred_f provides a continuous range of values in [0,1], and I get the error "None type not supported" at the evaluation of y_pred_mask_f. I know the loss function has to be differentiable, and greater_equal and cast are not. But is there a way to circumvent this in Keras? Perhaps using some workaround in TensorFlow?
Well, this was tricky. The reason behind your error is that there is no continuous dependence between your loss and your network. In order to compute the gradients of your loss w.r.t. the network, the gradient of the indicator "output is greater than 0.5" would have to be computed, as this indicator is the only connection between your final loss value and the output y_pred of your network. This is impossible, as the indicator is piecewise constant and not continuous.
Possible solution - smooth your indicator:
def chamfer_loss_value(y_true, y_pred):
    # flatten the batch
    y_true_f = K.batch_flatten(y_true)
    y_pred_f = K.batch_flatten(y_pred)
    y_pred_mask_f = K.sigmoid(y_pred_f - 0.5)
    finalChamferDistanceSum = K.sum(y_pred_mask_f * y_true_f, axis=1, keepdims=True)
    return K.mean(finalChamferDistanceSum)
This works because sigmoid is a continuous version of a step function. If your output already comes from a sigmoid, you could simply use y_pred_f instead of y_pred_mask_f.
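To see that the smoothed indicator does pass gradients, here is a small check, assuming TensorFlow 2.x. The steepness parameter is a hypothetical addition that is not in the answer above: larger values push the sigmoid closer to a hard threshold at 0.5, at the cost of smaller gradients away from the threshold.

```python
import tensorflow as tf

def soft_threshold(y_pred, threshold=0.5, steepness=10.0):
    # Smooth stand-in for (y_pred >= threshold); steepness is a hypothetical knob.
    return tf.sigmoid(steepness * (y_pred - threshold))

y_pred = tf.constant([[0.1, 0.5, 0.9]])
with tf.GradientTape() as tape:
    tape.watch(y_pred)  # y_pred is a constant here, so watch it explicitly
    soft_mask = soft_threshold(y_pred)
    total = tf.reduce_sum(soft_mask)
# Gradient of the summed soft mask with respect to each prediction.
grad = tape.gradient(total, y_pred)
```

Here grad has a nonzero entry for every prediction, which is what lets the loss train the network; the hard cast of greater_equal provides no such gradient.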