tf.argmax returns only the top 1 of a tensor. I did some research and did not find a good way (other than scan) to get the top 5 of a tensor. Please let me know if you have a better approach. Thanks!
You can use tf.nn.top_k as found in the documentation here.
tf.nn.top_k(input, k=5, sorted=True, name=None)
For your case, k=5, as shown above.
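For example, a minimal sketch (the input values here are made up for illustration):

import tensorflow as tf

# top_k returns both the largest values and their indices, sorted descending
scores = tf.constant([0.1, 0.9, 0.3, 0.7, 0.5, 0.8])
values, indices = tf.nn.top_k(scores, k=5, sorted=True)
# values  -> [0.9, 0.8, 0.7, 0.5, 0.3]
# indices -> [1, 5, 3, 4, 2]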
I am using PyTorch to train some neural networks. The part I am confused about is:
prediction = myNetwork(img_batch)
max_act = prediction.max(1)[0].sum()
loss = softcrossentropy_loss - alpha * max_act
In the above codes, "prediction" is the output tensor of "myNetwork".
I hope to maximize the largest output of "prediction" over a batch.
For example:
[[-1.2, 2.0, 5.0, 0.1, -1.5], [9.6, -1.1, 0.7, 4.3, 3.3]]
For the first prediction vector, the 3rd element is the largest, while for the second vector, the 1st element is the largest. I want to maximize "5.0 + 9.6", although we cannot know in advance which index holds the largest output for a new input.
In fact, my training seems to be successful, because the "max_act" part really did increase, which is the behavior I want. However, I have heard some discussion about whether the max() operation is differentiable:
Some say that, mathematically, max() is not differentiable.
Some say that max() is just an identity function that selects the largest element, and that this largest element is differentiable.
So I am confused now, and I am worried that my idea of maximizing "max_act" is wrong from the beginning.
Could someone provide some guidance on whether the max() operation is differentiable in PyTorch?
max is differentiable with respect to the values, not the indices. It is perfectly valid in your application.
From the gradient point of view, d(max_value)/d(v) is 1 if max_value==v and 0 otherwise. You can consider it as a selector.
d(max_index)/d(v) is not really meaningful as it is a discontinuous function, with only 0 and undefined as possible gradients.
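A quick way to convince yourself is a minimal sketch you can run:

import torch

# max() routes the gradient entirely to the winning element
v = torch.tensor([1.0, 3.0, 2.0], requires_grad=True)
v.max().backward()
print(v.grad)  # tensor([0., 1., 0.]) -- acts as a selector, as described above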
I am a beginner with DNNs and PyTorch.
I am dealing with a multi-class classification problem where my labels are encoded as one-hot vectors, say of dimension D.
To this end, I am using CrossEntropyLoss. However, I now want to modify or change this criterion to penalize predictions distant from the actual class: classifying 4 instead of 5 is better than classifying 2 instead of 5.
Is there a function already built into PyTorch that implements this behavior? Otherwise, how can I modify CrossEntropyLoss to achieve it?
This could help you. It is a PyTorch implementation of ordinal regression:
https://www.ethanrosenthal.com/2018/12/06/spacecutter-ordinal-regression/
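If you want something lighter than full ordinal regression, one illustrative sketch (my own assumption of a workable approach, not what the linked post implements) is to add a distance penalty on top of the standard cross entropy:

import torch
import torch.nn.functional as F

def distance_penalized_ce(logits, target, num_classes, alpha=1.0):
    # standard cross entropy on integer class targets
    ce = F.cross_entropy(logits, target)
    # expected class index under the predicted distribution
    probs = F.softmax(logits, dim=1)
    classes = torch.arange(num_classes, dtype=probs.dtype, device=probs.device)
    expected = (probs * classes).sum(dim=1)
    # penalize how far the expected class is from the true class,
    # so predicting 4 for a true 5 costs less than predicting 2
    distance = (expected - target.to(probs.dtype)).abs().mean()
    return ce + alpha * distance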
It seems that there is no simple way to assign a value to the diagonal of a Tensor. Ideally I am looking for a command like numpy.fill_diagonal.
Currently I accomplish this by doing:
tf.matrix_set_diag(
    matrix,
    tf.zeros_like(matrix.shape[0:-1]),
    name=None
)
Is there a better way?
I think your answer should be:
tf.matrix_set_diag(matrix, tf.zeros(matrix.shape[0:-1]), name=None)
Note that tf.matrix_set_diag has since been renamed to tf.linalg.set_diag in the TensorFlow API.
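For example, a minimal sketch with the renamed API, zeroing the diagonal:

import tensorflow as tf

matrix = tf.ones([3, 3])
# set_diag replaces the main diagonal with the given values
zeroed = tf.linalg.set_diag(matrix, tf.zeros(matrix.shape[0:-1]))
# zeroed has ones everywhere except a zero diagonal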
I want to get the gradient of a layer with respect to a parameter matrix for each example. Normally, I would need a Jacobian, but following this idea, I decided to use map_fn so I could feed forward data in a batch rather than one by one. This gives me a problem I do not understand, unfortunately. With the code
get_grads = tf.map_fn(lambda x: tf.gradients(x, W['1'])[0], softmax_probs)
sess.run(get_grads, feed_dict={x: images[0:100]})
I get this error
InvalidArgumentError: TensorArray map_21/TensorArray_36#map_21/while/gradients: Could not write to TensorArray index 0 because it has already been read.
W['1'] is a variable in the graph. Ideas?
It seems like your issue may be connected to this bug:
https://github.com/tensorflow/tensorflow/issues/7643
One commenter posts a possible fix at the end. You could try that out.
Alternatively, if what you want is the Jacobian, then you can check out this solution:
https://github.com/tensorflow/tensorflow/issues/675#issuecomment-362853672
although it appears that it will not work when nested.
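The idea in that solution, sketched here for a single example (the output dimension and the question's variable names are assumptions on my part), is to call tf.gradients once per output element and stack the results:

import tensorflow as tf

# tf.gradients differentiates a scalar, so build the Jacobian row by row:
# one gradient of each output component with respect to the weights.
num_classes = 10  # hypothetical output dimension of softmax_probs
jacobian = tf.stack(
    [tf.gradients(softmax_probs[0, i], W['1'])[0] for i in range(num_classes)]
)  # shape: [num_classes] + shape of W['1']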
I don't think this will work because x in this case is a loop variable which TensorFlow does not know how to connect to softmax_probs.
I am looking for examples of how to build a multivariate time-series RNN using Tensorflow. Is this possible with an LSTM cell or similar?
e.g. the data might look something like this:
Time,A,B,C,...
0,3.5,4.5,7.7,...
1,2.1,6.4,8.2,...
...
Any help much appreciated. Thanks, John
It depends on exactly what you mean, but yes, it should be possible. If you describe more specifically what your input and target data look like, somebody may be able to help. You can generally have sequential continuous or categorical input data and sequential continuous or categorical output data, or a mix of those. I would suggest you look at the tutorials and try out a few things, then ask again here.
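As a minimal sketch of the shape handling (using the Keras layers, which is my assumption of a convenient API here; sizes and names are made up), multivariate just means each timestep is a feature vector:

import numpy as np
import tensorflow as tf

num_features = 3   # columns A, B, C from the example data
timesteps = 10     # length of each input window

model = tf.keras.Sequential([
    tf.keras.layers.LSTM(32, input_shape=(timesteps, num_features)),
    tf.keras.layers.Dense(1),  # e.g. predict the next value of one series
])
model.compile(optimizer="adam", loss="mse")

# dummy data shaped [batch, time, features]
x = np.random.rand(64, timesteps, num_features).astype("float32")
y = np.random.rand(64, 1).astype("float32")
model.fit(x, y, epochs=1, verbose=0)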
Thanks. I have figured it out now. I misunderstood the docs 'inputs: A length T list of inputs, each a vector with shape [batch_size].'
The following link was useful:
https://m.reddit.com/r/MachineLearning/comments/3sok8k/tensorflow_basic_rnn_example_with_variable_length/