Suppose I have multiple losses defined as
losses = ... # a tensor with shape: (10,)
Now I want to find the gradient of each loss with respect to a weight w:
for i in range(10):
    grad[i] = tf.gradients(losses[i], w)
Can I do this directly, without the for loop above?
You can use tf.map_fn to map an arbitrary function across the first dimension of a tensor. Something like this should do the trick:
def get_grads(x):
    return tf.gradients(x, w)
tf.map_fn(get_grads, losses)
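To make explicit what the per-loss loop computes: the ten gradients together are the Jacobian of losses with respect to w. The sketch below is a NumPy-only illustration, not TensorFlow code. The squared-error loss, the data x and y, and the finite-difference check are all assumptions made up for the example.

```python
import numpy as np

# Hypothetical setup: ten scalar losses that all depend on one weight w.
x = np.arange(1.0, 11.0)   # made-up inputs, shape (10,)
y = 2.0 * x                # made-up targets, shape (10,)
w = 1.5

def losses_fn(w):
    """Return a vector of ten squared-error losses, shape (10,)."""
    return np.square(w * x - y)

# Loop version: one central finite difference per loss.
eps = 1e-6
grad_loop = np.empty(10)
for i in range(10):
    grad_loop[i] = (losses_fn(w + eps)[i] - losses_fn(w - eps)[i]) / (2 * eps)

# Vectorized version: d/dw (w*x_i - y_i)^2 = 2 * (w*x_i - y_i) * x_i for every i at once.
grad_vec = 2.0 * (w * x - y) * x

print(np.allclose(grad_loop, grad_vec, rtol=1e-5))
```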
The API of sampled_softmax_loss goes like:
tf.nn.sampled_softmax_loss(
    weights,
    biases,
    labels,
    inputs,
    num_sampled,
    num_classes,
    num_true=1,
    sampled_values=None,
    ...
)
I've noticed that the sampled_values argument determines which negative samples are used, and that it is returned by a _candidate_sampler function such as tf.random.fixed_unigram_candidate_sampler.
With tf.random.fixed_unigram_candidate_sampler we can set the probability of each sample being chosen as a negative sample.
How can I deliberately assign certain samples as negative samples?
For instance, in a recommender system, I'd like to add some hard negative samples to the model. So I want the hard negative samples to be chosen for certain, not by probability as in the _candidate_sampler functions.
How can I assign certain samples as negative samples when using sampled_softmax_loss in TensorFlow?
You need to understand that the candidate sampler function is only a helper, and your question is really about how to create a negative sampler.
You don't need to create a negative sampler when you set unique. The sampler output is a tuple (sampled_candidates, true_expected_count, sampled_expected_count). Hard negatives are when you add contrasting values to emphasize the candidates. This way, you can get them from a distribution.
Random Uniform Candidates Sampler
Candidate Sampling
Sampled SoftMax
Simple example: the weights and biases vary, but the functions stay the same.
import tensorflow as tf
weights = tf.zeros([4, 1])
biases = tf.zeros([4])
labels = tf.ones([4, 1])
inputs = tf.zeros([4, 1])
num_sampled = 1
num_classes = 1
true_classes = tf.ones([4, 4], dtype=tf.int64)
num_true = 4
unique = True
range_max = 1
sampler = tf.random.uniform_candidate_sampler(
    true_classes,
    num_true,
    num_sampled,
    unique,
    range_max,
    seed=None,
    name=None
)
loss_fn = tf.nn.sampled_softmax_loss(
    weights,
    biases,
    labels,
    inputs,
    num_sampled,
    num_classes,
    num_true=1,
    sampled_values=sampler,
    remove_accidental_hits=True,
    seed=None,
    name='sampled_softmax_loss'
)
print(loss_fn)
Output (example values from three runs):
tf.Tensor([6.437752 6.437752 6.437752 6.437752], shape=(4,), dtype=float32)
tf.Tensor([6.437752 6.437752 6.437752 6.437752], shape=(4,), dtype=float32)
tf.Tensor([6.437752 6.437752 6.437752 6.437752], shape=(4,), dtype=float32)
My code is as follows:
v = tf.Variable(initial_value=v, trainable=True)
v.shape is (1, 768)
In the model:
inputs_sents = keras.Input(shape=(50,3))
inputs_events = keras.Input(shape=(50,768))
x_1 = tf.matmul(v,tf.transpose(inputs_events))
x_2 = tf.matmul(x_1,inputs_sents)
But I got an error:
ValueError: Dimensions must be equal, but are 768 and 50 for
'{{node BatchMatMulV2_3}} =
BatchMatMulV2[T=DT_FLOAT,
adj_x=false,
adj_y=false](BatchMatMulV2_3/ReadVariableOp,
Transpose_3)' with input shapes: [1,768], [768,50,?]
I think it takes the batch dimension into account? How should I deal with this?
v is a trainable vector (or a 2-D array whose first dimension is 1), and I want it to be trained during training.
PS: This is the result I got using the code provided by the first answer. I think it is incorrect because Keras already takes the first (batch) dimension into account.
Also, from the Keras documentation:
shape: A shape tuple (integers), not including the batch size. For instance, shape=(32,) indicates that the expected input will be batches of 32-dimensional vectors. Elements of this tuple can be None; 'None' elements represent dimensions where the shape is not known.
https://keras.io/api/layers/core_layers/input/
Should I rewrite my code without Keras?
The shape of a batch is denoted by None:
import numpy as np
import tensorflow as tf
from tensorflow import keras
inputs_sents = keras.Input(shape=(None,1,3))
inputs_events = keras.Input(shape=(None,1,768))
v = np.ones(shape=(1,768), dtype=np.float32)
v = tf.Variable(initial_value=v, trainable=True)
x_1 = tf.matmul(v,tf.transpose(inputs_events))
x_2 = tf.matmul(x_1,inputs_sents)
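For comparison, here is a NumPy-only shape check of the computation the question describes. It assumes the intended result is v times the transpose of each example's (50, 768) event matrix, followed by a product with that example's (50, 3) sentence matrix. The batch size of 2 and the all-ones data are made up. Only the last two axes are swapped, so the batch axis stays in front. A plain transpose reverses every axis, which is what produced the [768,50,?] shape in the error message.

```python
import numpy as np

batch = 2  # made-up batch size
v = np.ones((1, 768), dtype=np.float32)
inputs_events = np.ones((batch, 50, 768), dtype=np.float32)
inputs_sents = np.ones((batch, 50, 3), dtype=np.float32)

# Swap only the last two axes; axis 0 (the batch) is left alone.
events_t = np.transpose(inputs_events, (0, 2, 1))   # (batch, 768, 50)

# matmul broadcasts v over the batch axis: (1, 768) @ (batch, 768, 50).
x_1 = np.matmul(v, events_t)                        # (batch, 1, 50)
x_2 = np.matmul(x_1, inputs_sents)                  # (batch, 1, 3)

print(x_1.shape, x_2.shape)
```

In TensorFlow, tf.transpose takes a perm argument and tf.matmul takes transpose_b=True for the same purpose.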
Hi, TensorFlow beginner here... I'm trying to get the values of certain elements in a 2-D tensor, in my case class scores from a probability matrix.
The probability matrix has shape (1000, 81), with batch size 1000 and 81 classes. class_ids has shape (1000,) and contains the index of the highest class score for each sample. How do I get the corresponding class scores from the probability matrix using tf.gather?
class_ids = tf.cast(tf.argmax(probs, axis=1), tf.int32)
class_scores = tf.gather_nd(probs,class_ids)
class_scores should be a tensor of shape (1000,) containing the highest class_score for each sample.
Right now I'm using a workaround that looks like this:
class_score_count = []
for i in range(probs.shape[0]):
    prob = probs[i,:]
    class_score = prob[class_ids[i]]
    class_score_count.append(class_score)
class_scores = tf.stack(class_score_count, axis=0)
Thanks for the help!
You can do it with tf.gather_nd like this:
class_ids = tf.cast(tf.argmax(probs, axis=1), tf.int32)
# If shape is not dynamic you can use probs.shape[0].value instead of tf.shape(probs)[0]
row_ids = tf.range(tf.shape(probs)[0], dtype=tf.int32)
idx = tf.stack([row_ids, class_ids], axis=1)
class_scores = tf.gather_nd(probs, idx)
You could also just use tf.reduce_max. Although it computes the maximum again, it may not be much slower if your data is not too big:
class_scores = tf.reduce_max(probs, axis=1)
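The index construction can be checked on a small example. This is a NumPy sketch rather than TensorFlow code: integer-array indexing stands in for tf.gather_nd with stacked (row, column) pairs, and the 3x3 probs matrix is made up.

```python
import numpy as np

# Made-up probabilities: 3 samples, 3 classes.
probs = np.array([[0.1, 0.7, 0.2],
                  [0.5, 0.2, 0.3],
                  [0.2, 0.2, 0.6]])

class_ids = np.argmax(probs, axis=1)   # best class per row
row_ids = np.arange(probs.shape[0])    # 0, 1, 2

# Pair each row index with its class index, like tf.stack([...], axis=1).
idx = np.stack([row_ids, class_ids], axis=1)

# Pick one element per (row, class) pair.
class_scores = probs[idx[:, 0], idx[:, 1]]

print(idx.tolist())
print(class_scores.tolist())
```

Because class_ids comes from argmax, the gathered scores equal the row-wise maximum, which is why the reduce_max shortcut gives the same result.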
You need to run the class_ids tensor to get its values.
The values will be a NumPy array,
which you can access normally with a loop.
You have to do something like this:
predictions = sess.run(tf.argmax(probs, 1), feed_dict={x: X_data})
The predictions variable has all the information you need.
TensorFlow only returns the values of tensors that you run explicitly.
I think this is what the batch_dims argument for tf.gather is for.
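A sketch of what such a per-row gather does, written with NumPy's np.take_along_axis as a stand-in. This is an assumption for illustration; in TensorFlow the corresponding call would presumably be tf.gather(probs, class_ids, batch_dims=1). The small probs matrix is made up.

```python
import numpy as np

# Made-up probabilities: 2 samples, 3 classes.
probs = np.array([[0.1, 0.7, 0.2],
                  [0.5, 0.2, 0.3]])
class_ids = np.array([1, 0])

# For each row i, take probs[i, class_ids[i]].
picked = np.take_along_axis(probs, class_ids[:, None], axis=1)  # shape (2, 1)
class_scores = picked[:, 0]                                     # shape (2,)

print(class_scores.tolist())
```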
I need to minimize KL loss in tensorflow.
I tried this function tf.contrib.distributions.kl(dist_a, dist_b, allow_nan=False, name=None), but I failed.
I tried to implement it manually:
def kl_divergence(p, q):
    return p * tf.log(p / q) + (1 - p) * tf.log((1 - p) / (1 - q))
Is it correct?
What you have there is the elementwise KL divergence between Bernoulli distributions with parameters p and q. For two discrete distributions, the KL divergence should be something like:
def kl_divergence(p, q):
    return tf.reduce_sum(p * tf.log(p / q))
This assumes that p and q are both 1-D float tensors of the same shape, each with values that sum to 1.
It should also work if p and q are equally sized mini-batches of 1-D tensors that obey the above constraints, in which case the result is the sum over the whole batch.
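A small NumPy check of the summed form, with made-up distributions p and q. It also evaluates the two-term expression from the question on the first component only, to show that for a two-outcome distribution both forms give the same number. The sketch assumes all entries are strictly positive, since the log is undefined otherwise.

```python
import numpy as np

def kl_divergence(p, q):
    """Sum of p * log(p / q) over all entries (natural log)."""
    p = np.asarray(p, dtype=np.float64)
    q = np.asarray(q, dtype=np.float64)
    return np.sum(p * np.log(p / q))

def two_term_form(p, q):
    """The expression from the question, for scalar probabilities p and q."""
    return p * np.log(p / q) + (1 - p) * np.log((1 - p) / (1 - q))

# Made-up two-outcome distributions.
p = [0.5, 0.5]
q = [0.25, 0.75]

kl_pq = kl_divergence(p, q)
kl_pp = kl_divergence(p, p)              # a distribution against itself
kl_two_term = two_term_form(p[0], q[0])  # same quantity via the two-term form

print(kl_pq, kl_pp, kl_two_term)
```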
I started to play with TensorFlow two days ago and I'm wondering if there is the triplet and the contrastive losses implemented.
I've been looking at the documentation, but I haven't found any example or description about these things.
Update (2018/03/19): I wrote a blog post detailing how to implement triplet loss in TensorFlow.
You need to implement the contrastive loss or the triplet loss yourself, but once you know the pairs or triplets this is quite easy.
Contrastive Loss
Suppose you have as input the pairs of data and their label (positive or negative, i.e. same class or different class). For instance you have images as input of size 28x28x1:
left = tf.placeholder(tf.float32, [None, 28, 28, 1])
right = tf.placeholder(tf.float32, [None, 28, 28, 1])
label = tf.placeholder(tf.float32, [None])  # 0 if same, 1 if different (float, so it can multiply the distances)
margin = 0.2
left_output = model(left)  # shape [None, 128]
right_output = model(right)  # shape [None, 128]
d = tf.reduce_sum(tf.square(left_output - right_output), 1)  # squared distance, shape [None]
d_sqrt = tf.sqrt(d)
loss = label * tf.square(tf.maximum(0., margin - d_sqrt)) + (1 - label) * d
loss = 0.5 * tf.reduce_mean(loss)
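A NumPy version of the same per-pair formula can be used to sanity-check the arithmetic by hand. The embeddings and labels below are made up, and label uses the convention stated above (0 if same, 1 if different).

```python
import numpy as np

def contrastive_loss(left, right, label, margin=0.2):
    """Contrastive loss over a batch of embedding pairs.

    label is 0 for a same-class pair and 1 for a different-class pair.
    """
    d = np.sum(np.square(left - right), axis=1)  # squared distance per pair
    d_sqrt = np.sqrt(d)                          # distance per pair
    per_pair = (label * np.square(np.maximum(0.0, margin - d_sqrt))
                + (1.0 - label) * d)
    return 0.5 * np.mean(per_pair)

# Made-up 2-D embeddings for two pairs.
left = np.array([[0.0, 0.0], [0.0, 0.0]])
right = np.array([[0.3, 0.4], [0.06, 0.08]])
label = np.array([0.0, 1.0])  # first pair: same class, second pair: different

loss = contrastive_loss(left, right, label)
print(loss)
```

The same-class pair contributes its squared distance (0.25), and the different-class pair sits at distance 0.1, inside the margin, so it contributes (0.2 - 0.1)^2 = 0.01.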
Triplet Loss
Same as with contrastive loss, but with triplets (anchor, positive, negative). You don't need labels here.
anchor_output = ... # shape [None, 128]
positive_output = ... # shape [None, 128]
negative_output = ... # shape [None, 128]
d_pos = tf.reduce_sum(tf.square(anchor_output - positive_output), 1)
d_neg = tf.reduce_sum(tf.square(anchor_output - negative_output), 1)
loss = tf.maximum(0., margin + d_pos - d_neg)
loss = tf.reduce_mean(loss)
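The same kind of hand check works for the triplet formula. The NumPy sketch below uses made-up embeddings: the first triplet already satisfies the margin and contributes 0, while the second violates it.

```python
import numpy as np

def triplet_loss(anchor, positive, negative, margin=0.2):
    """Mean hinge on (margin + d_pos - d_neg), using squared distances."""
    d_pos = np.sum(np.square(anchor - positive), axis=1)
    d_neg = np.sum(np.square(anchor - negative), axis=1)
    return np.mean(np.maximum(0.0, margin + d_pos - d_neg))

# Made-up 2-D embeddings for two triplets.
anchor = np.array([[0.0, 0.0], [0.0, 0.0]])
positive = np.array([[0.1, 0.0], [0.3, 0.0]])
negative = np.array([[1.0, 0.0], [0.5, 0.0]])

loss = triplet_loss(anchor, positive, negative)
print(loss)
```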
The real difficulty when implementing triplet loss or contrastive loss in TensorFlow is how to sample the triplets or pairs. I will focus on generating triplets because it is harder than generating pairs.
The easiest way is to generate them outside of the TensorFlow graph, i.e. in Python, and feed them to the network through the placeholders. Basically, you select images three at a time, with the first two from the same class and the third from another class. You then run a forward pass on these triplets and compute the triplet loss.
The issue here is that generating triplets is complicated. We want them to be valid triplets, i.e. triplets with a positive loss (otherwise the loss is 0 and the network doesn't learn).
To know whether a triplet is good or not, you need to compute its loss, which already requires one forward pass through the network...
Clearly, implementing triplet loss in TensorFlow is hard. There are ways to make it more efficient than sampling in Python, but explaining them would require a whole blog post!
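The selection step described above (two examples from one class, a third from another) could look roughly like this in plain Python. The function name sample_triplets, its arguments, and the toy labels are hypothetical. The sketch does not check that a triplet has positive loss, since that needs the network's embeddings.

```python
import random

def sample_triplets(labels, num_triplets, seed=0):
    """Return (anchor, positive, negative) index triplets drawn from labels.

    Assumes at least two classes, and at least one class with two examples.
    """
    rng = random.Random(seed)

    # Group example indices by class label.
    by_class = {}
    for idx, lab in enumerate(labels):
        by_class.setdefault(lab, []).append(idx)

    # Only classes with two or more examples can supply anchor and positive.
    eligible = [c for c, idxs in by_class.items() if len(idxs) >= 2]

    triplets = []
    for _ in range(num_triplets):
        pos_class = rng.choice(eligible)
        anchor, positive = rng.sample(by_class[pos_class], 2)
        neg_class = rng.choice([c for c in by_class if c != pos_class])
        negative = rng.choice(by_class[neg_class])
        triplets.append((anchor, positive, negative))
    return triplets

# Toy labels for five examples.
labels = [0, 0, 1, 1, 2]
triplets = sample_triplets(labels, num_triplets=5)
print(triplets)
```

The returned indices would then be used to pick the images fed to the anchor, positive and negative inputs.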
Triplet loss with semihard negative mining is now implemented in tf.contrib, as follows:
triplet_semihard_loss(
labels,
embeddings,
margin=1.0
)
where:
Args:
    labels: 1-D tf.int32 Tensor with shape [batch_size] of multiclass
        integer labels.
    embeddings: 2-D float Tensor of embedding vectors. Embeddings should
        be l2 normalized.
    margin: Float, margin term in the loss definition.
Returns:
    triplet_loss: tf.float32 scalar.
For further information, check the link below:
https://www.tensorflow.org/versions/master/api_docs/python/tf/contrib/losses/metric_learning/triplet_semihard_loss
Tiago, I don't think you are using the same formula Olivier gave.
Here is the right code (not sure it will work though, just fixing the formula) :
def compute_euclidean_distance(x, y):
    """
    Computes the squared euclidean distance between two tensorflow variables
    """
    d = tf.reduce_sum(tf.square(tf.subtract(x, y)), 1)
    return d
def compute_contrastive_loss(left_feature, right_feature, label, margin):
    """
    Compute the contrastive loss as in
    L = 0.5 * (1-Y) * D^2 + 0.5 * Y * {max(0, margin - D)}^2
    **Parameters**
    left_feature: First element of the pair
    right_feature: Second element of the pair
    label: Label of the pair (0 or 1)
    margin: Contrastive margin
    **Returns**
    Return the loss operation
    """
    label = tf.to_float(label)
    one = tf.constant(1.0)
    d = compute_euclidean_distance(left_feature, right_feature)  # D^2
    d_sqrt = tf.sqrt(d)  # D
    first_part = tf.multiply(one - label, d)  # (1-Y) * D^2
    max_part = tf.square(tf.maximum(margin - d_sqrt, 0.0))
    second_part = tf.multiply(label, max_part)  # Y * max(margin - D, 0)^2
    loss = 0.5 * tf.reduce_mean(first_part + second_part)
    return loss