BACKGROUND:
I want to retrieve the equivalent of len(x) and x.shape[0] for y_pred and y_true inside a custom Keras metric, using nothing but the Keras backend.
Consider a minimal Keras metric example:
from keras import backend as K
def binary_accuracy(y_true, y_pred):
    return K.mean(K.equal(y_true, K.round(y_pred)), axis=-1)
Here y_pred and y_true are tensors that represent numpy arrays of a certain shape.
QUESTION:
How to get the length of the underlying array inside the keras metric function so that the resulting code will be in the form:
def binary_accuracy(y_true, y_pred):
    # some Keras backend code
    return K.mean(K.equal(y_true, K.round(y_pred)), axis=-1)
NOTE: the code has to be Keras backend code, so that it works on any Keras backend.
I've already tried K.ndim(y_pred), which returns 2 even though the length is actually 45, and K.int_shape(y_pred), which returns None.
You need to remember that in some cases, the shape of a given symbolic tensor (e.g. y_true and y_pred in your case) cannot be determined until you feed values to specific placeholders that this tensor relies on.
Keeping that in mind, you have two options:
Use K.int_shape(x) to get a tuple of ints and Nones that represent the shape of the input tensor x. In this case, the dimensions with undetermined lengths will be None.
This is useful in cases where your non-TensorFlow (plain Python) code does not depend on undetermined dimensions. For example, you cannot branch in Python on a dimension that is still unknown while the graph is being built:
if K.shape(x)[0] == 5:
    ...
else:
    ...
Use K.shape(x) to get a symbolic tensor that represents the shape of the tensor x.
This is useful in cases where you want to use the shape of a tensor to change your TF graph, e.g.:
t = tf.ones(shape=K.shape(x)[0])
You can access the shape of the tensor through K.int_shape(x)
By taking the first value of the result, you will get the length of the underlying array: K.int_shape(x)[0]. Note that this value is None when the length of that dimension is not known in advance.
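To make the difference between the static and the dynamic shape concrete, here is a minimal sketch. It assumes TensorFlow 2.x and uses plain tf.shape and the tensor's .shape attribute in place of K.shape and K.int_shape; the function name batch_length is hypothetical and not part of any API.

```python
import numpy as np
import tensorflow as tf


# The batch axis is declared as None, so its length is unknown while tracing.
@tf.function(input_signature=[tf.TensorSpec(shape=[None, 1], dtype=tf.float32)])
def batch_length(y_pred):
    # Static shape: a plain Python value, None for an undetermined axis.
    static_batch = y_pred.shape[0]
    if static_batch is not None:
        return tf.constant(static_batch)
    # Dynamic shape: a tensor that is only evaluated once data is fed.
    return tf.shape(y_pred)[0]


n = int(batch_length(np.zeros((45, 1), dtype=np.float32)))
```

The Python-level check on the static shape runs once, when the graph is traced; the value 45 only becomes available through the dynamic shape when the array is passed in.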
Related
I am trying to write a custom loss function to use in a Tensorflow 2 model.
def loss(y_true, y_pred):
1. Convert y_true and y_pred to NumPy arrays (these are images).
2. Carry out some operations on them (such as binarization of the images, pixel-wise AND or OR operations, etc.).
3. Finally, get a single floating-point number as the loss, feed it to the model, and minimize it.
Can anyone please suggest how to do this? I have already tried many of the options given on the internet, but I am not able to convert y_true and y_pred into NumPy arrays.
I have a regular keras model called e and I would like to compare its output for both y_pred and y_true in my custom loss function.
from keras import backend as K
def custom_loss(y_true, y_pred):
    return K.mean(K.square(e.predict(y_pred)-e.predict(y_true)), axis=-1)
I am getting the error: AttributeError: 'Tensor' object has no attribute 'ndim'
This is because y_true and y_pred are both tensor objects, and keras.model.predict expects to be passed a numpy.array.
Any idea how I may succeed in using my keras.model in my custom loss function?
I am open to getting the output of a specified layer if need be or to converting my keras.model to a tf.estimator object (or anything else).
First, let's try to understand the error message you're getting:
AttributeError: 'Tensor' object has no attribute 'ndim'
Let's take a look at the Keras documentation and find the predict method of Keras model. We can see the description of the function parameters:
x: the input data, as a Numpy array.
So the model is trying to read the ndim attribute of a NumPy array, because it expects an array as input. On the other hand, a custom loss function in Keras receives tensors as inputs. So don't write Python code that needs concrete values inside it: the function is only called to construct the computational graph, and that Python code is not re-run during evaluation.
Now that we know what the error message means, how can we use a Keras model inside a custom loss function? Simple: we just need to get the evaluation graph of the model.
Update
Using the global keyword is bad coding practice. Also, as of 2020 Keras has a better functional API that makes hacks with layers unnecessary. Better to use something like this:
from keras import backend as K
def make_custom_loss(model):
    """Creates a loss function that uses `model` for evaluation
    """
    def custom_loss(y_true, y_pred):
        return K.mean(K.square(model(y_pred) - model(y_true)), axis=-1)
    return custom_loss
custom_loss = make_custom_loss(e)
Deprecated
Try something like this (only for Sequential models and very old API):
def custom_loss(y_true, y_pred):
    # Your model exists in global scope
    global e
    # Get the layers of your model
    layers = [l for l in e.layers]
    # Construct a graph to evaluate your other model on y_pred
    eval_pred = y_pred
    for i in range(len(layers)):
        eval_pred = layers[i](eval_pred)
    # Construct a graph to evaluate your other model on y_true
    eval_true = y_true
    for i in range(len(layers)):
        eval_true = layers[i](eval_true)
    # Now do what you wanted to do with outputs.
    # Note that we are not returning the values, but a tensor.
    return K.mean(K.square(eval_pred - eval_true), axis=-1)
Please note that the code above is not tested. However, the general idea will stay the same regardless of the implementation: you need to construct a graph, in which the y_true and y_pred will flow through it to the final operations.
I have a Keras classifier built using the Keras wrapper of the Scikit-Learn API. The neural network has 10 output nodes, and the training data is all represented using one-hot encoding.
According to Tensorflow documentation, the predict function outputs a shape of (n_samples,). When I fitted 514541 samples, the function returned an array with shape (514541, ), and each entry of the array ranged from 0 to 9.
Since I have ten different outputs, does the numerical value of each entry correspond exactly to the result that I encoded in my training matrix?
i.e. if index 5 of my one-hot encoding of y_train represents "orange", does a prediction value of 5 mean that the neural network predicted "orange"?
Here is a sample of my model:
model = Sequential()
model.add(Dropout(0.2, input_shape=(32,) ))
model.add(Dense(21, activation='selu'))
model.add(Dropout(0.5))
model.add(Dense(10, activation='softmax'))
There are some issues with your question.
The neural network has 10 output nodes, and the training data is all represented using one-hot encoding.
Since your network has 10 output nodes, and your labels are one-hot encoded, your model's output should also be 10-dimensional, laid out like the one-hot labels, i.e. of shape (n_samples, 10). Moreover, since you use a softmax activation for your final layer, each element of your 10-dimensional output should be in [0, 1], and is interpreted as the probability of the output belonging to the respective (one-hot encoded) class.
According to Tensorflow documentation, the predict function outputs a shape of (n_samples,).
It's puzzling why you refer to Tensorflow, while your model is clearly a Keras one; you should refer to the predict method of the Keras sequential API.
When I fitted 514541 samples, the function returned an array with shape (514541, ), and each entry of the array ranged from 0 to 9.
If something like that happens, it must be due to a later part in your code that you do not show here; in any case, the idea would be to find the argument with the highest value from each 10-dimensional network output (since they are interpreted as probabilities, it is intuitive that the element with the highest value would be the most probable). In other words, somewhere in your code there must be something like this:
pred = model.predict(x_test)
y = np.argmax(pred, axis=1) # numpy must have been imported as np
which will give an array of shape (n_samples,), with each y an integer between 0 and 9, as you report.
i.e. if index 5 of my one-hot encoding of y_train represents "orange", does a prediction value of 5 mean that the neural network predicted "orange"?
Provided that the above hold, yes.
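As a small illustration of that mapping, here is a NumPy-only sketch. The probs array stands in for the output of model.predict, and the class_names list is hypothetical; the only thing taken from the question is that index 5 means "orange".

```python
import numpy as np

# Hypothetical label list: position i is the class whose one-hot vector
# has its 1 at index i (index 5 is "orange", as in the question).
class_names = ["apple", "banana", "cherry", "date", "elderberry",
               "orange", "grape", "kiwi", "lemon", "mango"]

# Stand-in for model.predict(x_test): 3 samples, 10 probabilities each.
probs = np.full((3, 10), 0.05)
probs[0, 5] = 0.55
probs[1, 0] = 0.55
probs[2, 9] = 0.55

# Index of the most probable class for every sample, shape (n_samples,).
pred_idx = np.argmax(probs, axis=1)

# Map each integer back to the label used when building the one-hot targets.
pred_labels = [class_names[i] for i in pred_idx]
```

The mapping is only valid if the label order used here is the same one that was used to one-hot encode y_train.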
I am aware that there is a similar topic at LSTM Followed by Mean Pooling, but that is about Keras and I work in pure TensorFlow.
I have an LSTM network where the recurrence is handled by:
outputs, final_state = tf.nn.dynamic_rnn(cell,
                                         embed,
                                         sequence_length=seq_lengths,
                                         initial_state=initial_state)
where I pass the correct sequence lengths for each sample (padding by zeros). In any case, outputs contains irrelevant outputs since some samples produce longer outputs than others, based on sequence lengths.
Right now I'm extracting the last relevant output by means of the following method:
def extract_axis_1(data, ind):
    """
    Get specified elements along the first axis of tensor.
    :param data: Tensorflow tensor that will be subsetted.
    :param ind: Indices to take (one for each element along axis 0 of data).
    :return: Subsetted tensor.
    """
    batch_range = tf.range(tf.shape(data)[0])
    indices = tf.stack([batch_range, ind], axis=1)
    res = tf.reduce_mean(tf.gather_nd(data, indices), axis=0)
    return res
where I pass sequence_length - 1 as indices. In reference to the last topic, I would like to select all relevant outputs followed by average pooling, instead of just the last one.
Now, I tried passing nested lists as indices to extract_axis_1, but tf.stack does not accept this.
Any solution directions for this?
You can exploit the weights parameter of the tf.contrib.seq2seq.sequence_loss function.
From the documentation:
weights: A Tensor of shape [batch_size, sequence_length] and dtype float. weights constitutes the weighting of each prediction in the sequence. When using weights as masking, set all valid timesteps to 1 and all padded timesteps to 0, e.g. a mask returned by tf.sequence_mask.
You need to compute a binary mask that distinguishes between your valid outputs and invalid ones. Then you can just provide this mask to the weights parameter of the loss function (probably, you will want to use a loss like this one); the function will not consider the outputs with a 0 weight in the computation of the loss.
If you can't or don't need to use a sequence loss, you can do exactly the same thing manually. You compute a binary mask, multiply your outputs by this mask, and provide the result as input to your fully connected layer.
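The manual masking can be sketched in NumPy as follows, extended with the averaging step the question asks for. This is only an illustration under assumptions: masked_mean_pool is a hypothetical helper name, the arrays stand in for the outputs and seq_lengths tensors, and every sequence length is assumed to be at least 1. In TensorFlow the same mask could be built with tf.sequence_mask.

```python
import numpy as np


def masked_mean_pool(outputs, seq_lengths):
    """Average the valid timesteps of each sequence.

    outputs: float array of shape [batch, max_time, units].
    seq_lengths: int array of shape [batch], each entry >= 1.
    """
    max_time = outputs.shape[1]
    # mask[b, t] is 1.0 where t < seq_lengths[b] (valid step), else 0.0.
    mask = (np.arange(max_time)[None, :] < seq_lengths[:, None]).astype(outputs.dtype)
    # Zero out the padded steps, then sum over the time axis.
    summed = (outputs * mask[:, :, None]).sum(axis=1)
    # Divide by the true length, not by max_time.
    return summed / seq_lengths[:, None]


outputs = np.array([[[1.0, 2.0], [3.0, 4.0], [100.0, 100.0]],   # length 2
                    [[1.0, 1.0], [2.0, 2.0], [3.0, 3.0]]])      # length 3
seq_lengths = np.array([2, 3])
pooled = masked_mean_pool(outputs, seq_lengths)
```

Dividing by the sequence length rather than by the padded length is what keeps the padded timesteps from diluting the average.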
I'm doing a matrix factorization in TensorFlow, and I want to use coo_matrix from scipy.sparse because it uses less memory and makes it easy to put all my data into my matrix for training.
Is it possible to use coo_matrix to initialize a variable in tensorflow?
Or do I have to create a session and feed the data I got into tensorflow using sess.run() with feed_dict.
I hope my question and my problem are clear; otherwise, please comment and I will try to fix it.
The closest thing TensorFlow has to scipy.sparse.coo_matrix is tf.SparseTensor, which is the sparse equivalent of tf.Tensor. It will probably be easiest to feed a coo_matrix into your program.
A tf.SparseTensor is a slight generalization of COO matrices, where the tensor is represented as three dense tf.Tensor objects:
indices: An N x D matrix of tf.int64 values in which each row represents the coordinates of a non-zero value. N is the number of non-zeroes, and D is the rank of the equivalent dense tensor (2 in the case of a matrix).
values: A length-N vector of values, where element i is the value of the element whose coordinates are given on row i of indices.
dense_shape: A length-D vector of tf.int64, representing the shape of the equivalent dense tensor.
For example, you could use the following code, which uses tf.sparse_placeholder() to define a tf.SparseTensor that you can feed, and a tf.SparseTensorValue that represents the actual value being fed:
import numpy as np
import scipy.sparse
import tensorflow as tf

sparse_input = tf.sparse_placeholder(dtype=tf.float32, shape=[100, 100])
# ...
train_op = ...
coo_matrix = scipy.sparse.coo_matrix(...)
# Wrap `coo_matrix` in the `tf.SparseTensorValue` form that TensorFlow expects.
# SciPy stores the row and column coordinates as separate vectors (the `row`
# and `col` attributes), so we must stack and transpose them to make an
# indices matrix of the appropriate shape.
tf_coo_matrix = tf.SparseTensorValue(
    indices=np.array([coo_matrix.row, coo_matrix.col]).T,
    values=coo_matrix.data,
    dense_shape=coo_matrix.shape)
Once you have converted your coo_matrix to a tf.SparseTensorValue, you can feed sparse_input with the tf.SparseTensorValue directly:
sess.run(train_op, feed_dict={sparse_input: tf_coo_matrix})
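To show what the stack-and-transpose step produces, here is a NumPy-only sketch. The rows, cols and values arrays are stand-ins for the row, col and data attributes of a coo_matrix, built here from a small hypothetical dense matrix.

```python
import numpy as np

dense = np.array([[0.0, 2.0, 0.0],
                  [0.0, 0.0, 0.0],
                  [5.0, 0.0, 7.0]])

# Coordinates and values of the non-zero entries, as separate vectors
# (the same layout SciPy's COO format uses).
rows, cols = np.nonzero(dense)
values = dense[rows, cols]

# Stack and transpose: one row of `indices` per non-zero entry, shape [N, D].
indices = np.array([rows, cols]).T.astype(np.int64)
dense_shape = np.array(dense.shape, dtype=np.int64)

# Sanity check: the three pieces are enough to rebuild the dense matrix.
rebuilt = np.zeros(tuple(dense_shape))
rebuilt[indices[:, 0], indices[:, 1]] = values
```

Here N is 3 (the number of non-zero values) and D is 2 (the rank of the matrix), which matches the N x D description of indices above.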