It seems that there is no simple way to assign a value to the diagonal of a Tensor. Ideally I am looking for a command like numpy.fill_diagonal.
Currently I accomplish this by doing:
tf.matrix_set_diag(
    matrix,
    tf.zeros_like(matrix.shape[0:-1]),
    name=None
)
Is there a better way?
I think your answer should be:
tf.matrix_set_diag(matrix, tf.zeros(matrix.shape[0:-1]), name=None)
This should be updated to tf.linalg.set_diag, which replaces tf.matrix_set_diag in newer TensorFlow releases.
This is a very simple question. I'm learning TensorFlow and converting my NumPy code to TensorFlow.
I have a word embedding matrix U of shape [embedding_size, vocab_size], so each column is the embedding vector of one word.
I converted U into TF like below:
U = tf.Variable(tf.truncated_normal([embedding_size, vocab_size], -0.1, 0.1))
So far, so good.
Now I need to look up each word's embedding for training. I assume it would be
tf.nn.embedding_lookup(U, word_index)
My question arises because each embedding is a column vector, so in NumPy I look it up as U[:,x[t]].
How does TF figure out it needs to return the row OR column by word_index?
What's the default? Row or column?
If it's a row vector, then do I need to transpose my embedding matrix?
https://www.tensorflow.org/api_docs/python/tf/nn/embedding_lookup
doesn't mention this. If anyone could point me to the right resource, I'd appreciate it.
If params is a single tensor, the tf.nn.embedding_lookup(params, ids) operation treats ids as the indices of rows in params. If params is a list of tensors or a partitioned variable, then ids still correspond to rows in those tensors, but the partition_strategy (either "div" or "mod") determines how the ids map to a particular row.
As Aaron suggests, it will probably be easiest to define your embedding U as having shape [vocab_size, embedding_size], so that you can use tf.nn.embedding_lookup() and related functions.
Alternatively, you can use the axis argument to tf.gather() to select columns from U:
embedding = tf.gather(U, word_index, axis=1)
U should be vocab_size x embedding_size, the transpose of what you have now.
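To make the row convention concrete, here is a minimal NumPy sketch. NumPy stands in for TensorFlow here, and the sizes and word ids are made up for illustration. Row lookup on a [vocab_size, embedding_size] matrix returns the same vectors as column selection on its transpose, just laid out the other way round:

```python
import numpy as np

embedding_size, vocab_size = 3, 5            # illustrative sizes, not from the question
rng = np.random.default_rng(0)

# One word per column, as in the question.
U_cols = rng.uniform(-0.1, 0.1, size=(embedding_size, vocab_size))
# One word per row, the layout that row-based lookup expects.
U_rows = U_cols.T

word_index = np.array([4, 0, 2, 2])          # hypothetical word ids

# Row lookup: one embedding per row, shape (4, embedding_size).
by_row = U_rows[word_index]
# Column selection: one embedding per column, shape (embedding_size, 4).
by_col = U_cols[:, word_index]
```

Here by_row plays the role of a row-based lookup and by_col the role of gathering along axis 1; each is the transpose of the other.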
tf.argmax returns the top 1 of a tensor. I did some research and did not find a good way (other than scan) to get the top 5 of a tensor. Please let me know if you have a better approach. Thanks!
You can use tf.nn.top_k, as described in the documentation.
tf.nn.top_k(input, k=5, sorted=True, name=None)
For your case, k=5, as shown above.
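tf.nn.top_k returns both the k largest values and their indices along the last dimension. As a rough illustration of that behaviour, here is a NumPy sketch; the helper name top_k and the sample scores are made up for this example and are not part of TensorFlow:

```python
import numpy as np

def top_k(values, k):
    """Return (largest k values, their indices) along the last axis, largest first."""
    values = np.asarray(values)
    # Sort descending by negating; keep the first k positions.
    idx = np.argsort(-values, axis=-1, kind="stable")[..., :k]
    return np.take_along_axis(values, idx, axis=-1), idx

scores = np.array([0.1, 0.7, 0.05, 0.9, 0.3, 0.6, 0.2])  # hypothetical scores
top_values, top_indices = top_k(scores, k=5)
```

With k=1 the returned index matches what argmax would give.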
I want to get the gradient of a layer with respect to a parameter matrix for each example. Normally, I would need a Jacobian, but following this idea, I decided to use map_fn so I could feed forward data in a batch rather than one by one. This gives me a problem I do not understand, unfortunately. With the code
get_grads = tf.map_fn(lambda x: tf.gradients(x, W['1'])[0], softmax_probs)
sess.run(get_grads, feed_dict={x: images[0:100]})
I get this error
InvalidArgumentError: TensorArray map_21/TensorArray_36#map_21/while/gradients: Could not write to TensorArray index 0 because it has already been read.
W['1'] is a variable in the graph. Ideas?
It seems like your issue may be connected with the bug
https://github.com/tensorflow/tensorflow/issues/7643
One commenter posts a possible fix at the end. You could try that out.
Alternatively, if what you want is the Jacobian, then you can check out this solution:
https://github.com/tensorflow/tensorflow/issues/675#issuecomment-362853672
although it appears that it will not work when nested.
I don't think this will work because x in this case is a loop variable which TensorFlow does not know how to connect to softmax_probs.
I am trying to solve the following linear system using optimize.root
AX = b
With the following code.
A = [[0,1,0],[2,1,0],[1,4,1]]
def foo(X):
    b = np.matrix([2,1,1])
    out = np.dot(A,X) - b
    return out.tolist()
sol = scipy.optimize.root(foo,[0,0,0])
I know that I can simply use numpy.linalg.solve to do this easily, but I am actually trying to solve a nonlinear system that is in matrix form (see my other question), so I need to find a way to make this method work. As a first step, I am trying to solve the problem in this simple case. But I get the error
TypeError: fsolve: there is a mismatch between the input and output shape of the 'func' argument 'foo'.Shape should be (3,) but it is (1, 3).
From what I have read in other similar Stack Overflow questions, this happens because the output of the foo function is not compatible with the shape of the initial guess [0,0,0].
Surely there is a way to solve this equation using scipy.optimize.root. Can anyone please help?
(I'm assuming the capital B in your .dot is a typo for A.)
Try using np.array for b. np.matrix creates a "row vector", i.e. shape (1, 3) whereas your initial guess has shape (3,).
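A minimal sketch of that fix, assuming the same A and b as in the question: with b as a 1-D array, the residual has shape (3,) and matches the initial guess, so no .tolist() conversion is needed.

```python
import numpy as np
import scipy.optimize

A = np.array([[0, 1, 0], [2, 1, 0], [1, 4, 1]], dtype=float)
b = np.array([2, 1, 1], dtype=float)   # 1-D array of shape (3,), not np.matrix

def foo(X):
    # The residual A @ X - b has shape (3,), the same as the initial guess.
    return A @ X - b

sol = scipy.optimize.root(foo, [0, 0, 0])
# sol.x holds the solution; sol.success reports whether the solver converged.
```

For this linear case the result can be checked against numpy.linalg.solve(A, b); the same function shape carries over to a nonlinear residual.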
Given...
a matrix A of shape [m, n]
a tensor I of shape [m]
I want to get a list J of elements from A where
J[i] = A[i, I[i]].
That is, I holds the index of the element to select from each row in A.
Context: I already have the argmax(A, 1) and now I also want the max.
I know that I can just use reduce_max.
And after trying around for a bit I also came up with this:
J = tf.gather_nd(A,
tf.transpose(tf.pack([tf.to_int64(tf.range(A.get_shape()[0])), I])))
Where the to_int64 is needed because range only produces int32 and argmax only produces int64.
Neither of the two strikes me as particularly elegant.
One has runtime overhead (probably about a factor of n) and the other has an unknown amount of cognitive overhead. Am I missing something here?
The gather() function provides a way to do it:
r = tf.random.uniform([4,5],0, 9, dtype=tf.int32)
i = tf.random.uniform([4], 0, 4, dtype=tf.int32)
tf.gather(r, i, axis=1, batch_dims=1)
This is a rather late answer, but could doing
mask = tf.one_hot(I, depth=n, dtype=tf.bool, on_value=True, off_value=False)
elements = tf.boolean_mask(A, mask)
accomplish what you're looking for?
edit: I should point out that this is NOT a good idea if A is already a very large tensor, as this ends up making a dense matrix.
The link provided by @yaroslav-bulatov mentions this solution:
def get_elements(data, indices):
    # Convert (row i, column indices[i]) into positions in the flattened tensor.
    flat_indices = tf.range(0, tf.shape(indices)[0])*data.shape[1] + indices
    return tf.gather(tf.reshape(data, [-1]), flat_indices)
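The flat-index idea is easy to check on a small example. The following NumPy sketch (NumPy rather than TensorFlow, with a made-up matrix) compares the flattened lookup with direct per-row indexing and with the row-wise maximum:

```python
import numpy as np

A = np.array([[1, 9, 3],
              [7, 2, 8],
              [4, 5, 6],
              [0, 3, 2]])           # hypothetical data, shape [m, n]
I = A.argmax(axis=1)               # column index to pick from each row

m, n = A.shape
# Direct form: J[i] = A[i, I[i]].
J_direct = A[np.arange(m), I]
# Flat form: element (i, I[i]) sits at position i * n + I[i] after flattening.
J_flat = A.reshape(-1)[np.arange(m) * n + I]
```

Both forms give the same result, and because I is the argmax here, that result also equals the maximum of each row.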
Your solution is not currently differentiable (because gradients for tf.gather_nd are not currently supported).
Hopefully, data[:, indices] will be introduced soon.