Pytorch setting elements to zero with "tensor index" - indexing

I've used PyTorch for a few months. I recently wanted to create a custom pooling layer similar to the "Max-Pooling Dropout" layer, and I think PyTorch provides the tools needed to build it. Here is my approach:
use MaxPool2d with indices returned
set tensor[indices] to zero
I want it to behave like torch.take (without flattening) if possible.
Here is how to get the "index tensor" (I think that is what it is called; correct me if I am wrong):
import torch
import torch.nn as nn
input1 = torch.randn(1, 1, 6, 6)
m = nn.MaxPool2d(2, 2, return_indices=True)
val, indx = m(input1)
indx is the "index tensor", which can be used directly:
torch.take(input1, indx)
No flattening is needed and no dimension argument has to be set. I think this makes sense, since indx is generated from input1.
Question: how do I set the values of input1 pointed to by indx to 0, in the "torch.take" style? I saw some answers like Indexing a multi-dimensional tensor with a tensor in PyTorch, but I doubt FB would return an "index tensor" that cannot be applied directly. (Maybe I am wrong.)
Is there something like
torch.set_value(input1, indx, 0) ?
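One possible way to do this, offered as a sketch and not as an official recipe: the indices returned by MaxPool2d are flat positions within each (H, W) plane, so the spatial dimensions can be flattened and zeros scattered along that axis. The helper name zero_at_pool_indices is made up for this illustration, and the batch and channel sizes are arbitrary.

```python
import torch
import torch.nn as nn


def zero_at_pool_indices(x, indices):
    """Return a copy of x with the elements picked by MaxPool2d set to 0."""
    n, c = x.shape[:2]
    # Work on a flattened copy so the input tensor is left untouched.
    flat = x.reshape(n, c, -1).clone()
    # Indices are flat positions inside each (H, W) plane, so scatter
    # zeros along the flattened spatial axis.
    flat.scatter_(2, indices.reshape(n, c, -1), 0.0)
    return flat.view_as(x)


torch.manual_seed(0)
input1 = torch.randn(2, 3, 6, 6)
pool = nn.MaxPool2d(2, 2, return_indices=True)
val, indx = pool(input1)
zeroed = zero_at_pool_indices(input1, indx)
```

Unlike torch.take, which treats the input as one flat tensor and therefore only lines up with these indices when batch and channel are both 1, this version also handles larger batches. If modifying input1 in place is acceptable, the clone can be dropped.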

Related

Tensorflow: iterating over a Tensor for embedding lookup?

Suppose I have a matrix of N users, and each user is associated with a vector of words (translated to integers). So for example for N = 2 I'd have:
user 0 corresponds to words ['20','56']
user 1 corresponds to words ['58','10','105']
So I have a list
user_words = [['20','56'],['58','10','105']]
Suppose further I created a 100-column embedding matrix (word_emb) for these words. I'd like to look up the (mean) embeddings of each of the user vectors and create a new Tensor, whose shape I would expect to be [2,100]. I tried doing this:
word_vec = []
for word_sequence_i in tf.map_fn(lambda x: x, user_words):
    all_word_vecs = tf.nn.embedding_lookup(word_emb, word_sequence_i)
    word_vec.append( tf.reduce_mean(all_word_vecs, 1))
But this gives me an error:
TypeError: `Tensor` objects are not iterable when eager execution is not enabled. To iterate over this tensor use `tf.map_fn`.
I thought I already was using tf.map_fn above! So what is Tensorflow complaining about? Is there even a way to do what I am trying to do?
Thanks so much!
tf.map_fn returns a Tensor object itself, which is a symbolic reference to a value that will be computed at Session.run() time. You can see this with type(tf.map_fn(lambda x: x, user_words)). So, it's the iteration implied in for word_sequence_i in tf.map_fn(...) that is generating the error.
Perhaps what you're looking for is something like:
all_word_vecs = tf.map_fn(lambda x: tf.nn.embedding_lookup(word_emb, x), user_words)
word_vec = tf.reduce_mean(all_word_vecs, axis=1)
On a related note, if this distinction between graph construction and execution is getting bothersome, you might want to give TensorFlow's eager execution a spin. See getting started and the programmer's guide.
Hope that helps.
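Because the users have different numbers of words, the lists cannot be stacked into one rectangular tensor as they are. A minimal sketch of one workaround, assuming TensorFlow 2.x with eager execution, integer word ids, and a made-up 200-word vocabulary: pad the lists to a common length and use a mask so the padding does not affect the mean.

```python
import tensorflow as tf

# Hypothetical setup: 200-word vocabulary, 100-dimensional embeddings.
word_emb = tf.random.normal([200, 100], seed=0)
user_words = [[20, 56], [58, 10, 105]]  # word ids as integers

# Pad the ragged lists to a rectangle and record which slots are real.
max_len = max(len(words) for words in user_words)
ids = tf.constant([w + [0] * (max_len - len(w)) for w in user_words])
mask = tf.constant(
    [[1.0] * len(w) + [0.0] * (max_len - len(w)) for w in user_words]
)

vecs = tf.nn.embedding_lookup(word_emb, ids)  # shape [2, max_len, 100]
# Zero out the padded slots, then divide by the true word count per user.
summed = tf.reduce_sum(vecs * mask[:, :, tf.newaxis], axis=1)
word_vec = summed / tf.reduce_sum(mask, axis=1, keepdims=True)  # [2, 100]
```

The padding id 0 is arbitrary here, since the mask removes its contribution before averaging.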

Changing the Contents of a Tensor in TensorFlow

Before I continue, please excuse my ignorance. I have some experience programming before this, but my previous intuition has failed me presently.
Essentially, I need to expand a 1-D vector (size M x 1) of numbers ranging from 0...K into a 2-D matrix (or Tensor, size M x K) where each row is a 1-D vector (size 1 x K) whose elements are all 0 except for a 1 at the index given by the original value.
Yes, this is a multiclass classification problem for a ML class.
I had the idea of creating a zeros matrix of the correct shape, and then assigning the index of the element I need manually to a 1, but cannot seem to change the values of the already created Variable. I get the error:
TypeError: 'Tensor' object does not support item assignment
Can anyone assist with this? If you feel as though my way of going about creating this final Tensor could use a different approach, any advice would be appreciated.
In tensorflow, the function tf.one_hot() is what you seek. One hot encoding is the term describing the operation you are looking to implement. See https://www.tensorflow.org/api_docs/python/tf/one_hot .
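A small illustration of tf.one_hot, assuming TensorFlow 2.x with eager execution; the label values are made up for the example.

```python
import tensorflow as tf

labels = tf.constant([0, 2, 1, 3])  # M = 4 class ids (hypothetical values)
num_classes = 4                     # K, must be larger than the biggest label

# Each label becomes a row of zeros with a single 1 at the label's index.
one_hot = tf.one_hot(labels, depth=num_classes)
print(one_hot.numpy())
# [[1. 0. 0. 0.]
#  [0. 0. 1. 0.]
#  [0. 1. 0. 0.]
#  [0. 0. 0. 1.]]
```

Note that if the labels really run from 0 to K inclusive, the depth would need to be K + 1 for every label to get its own column.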

How to assign values to a subset of a tensor in tensorflow?

Two parts to this question:
(1) What is the best way to update a subset of a tensor in tensorflow? I've seen several related questions:
Adjust Single Value within Tensor -- TensorFlow
and
How to update a subset of 2D tensor in Tensorflow?
and I'm aware that Variable objects can be assigned using Variable.assign() (and/or scatter_update, etc.), but it seems very strange to me that tensorflow does not have a more intuitive way to update a part of a Tensor object. I have searched through the tensorflow api docs and stackoverflow for quite some time now and can't seem to find a simpler solution than what is presented in the links above. This seems particularly odd, especially given that Theano has an equivalent version with Tensor.set_subtensor(). Am I missing something or is there no simple way to do this through the tensorflow api at this point?
(2) If there is a simpler way, is it differentiable?
Thanks!
I suppose the immutability of Tensors is required for the construction of a computation graph; you can't have a Tensor update some of its values without becoming another Tensor or there will be nothing to put in the graph before it. The same issue comes up in Autograd.
It's possible to do this (but ugly) using boolean masks (make them variables and use assign, or even define them prior in numpy). That would be differentiable, but in practice I'd avoid having to update subtensors.
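A rough sketch of the boolean-mask idea, assuming TensorFlow 2.x with eager execution and using tf.where to pick between the old and new values (the tensors here are illustrative). The result is a new tensor; a itself is never modified.

```python
import tensorflow as tf

a = tf.constant([1.0, 2.0, 3.0, 4.0, 5.0])
b = tf.constant([10.0, 20.0, 30.0, 40.0, 50.0])
mask = tf.constant([False, True, True, False, False])

with tf.GradientTape() as tape:
    tape.watch(a)
    # Take b where the mask is True and a everywhere else.
    updated = tf.where(mask, b, a)
    total = tf.reduce_sum(updated)

# Gradient flows to a only through the positions that were kept.
grad_a = tape.gradient(total, a)
print(updated.numpy())  # [ 1. 20. 30.  4.  5.]
print(grad_a.numpy())   # [1. 0. 0. 1. 1.]
```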
If you really have to (and I really hope there is a better way to do this), here is a way to do it in 1D using tf.dynamic_stitch and tf.setdiff1d:
def set_subtensor1d(a, b, slice_a, slice_b):
    # a[slice_a] = b[slice_b]
    a_range = tf.range(a.shape[0])
    _, a_from = tf.setdiff1d(a_range, a_range[slice_a])
    a_to = a_from
    b_from, b_to = tf.range(b.shape[0])[slice_b], a_range[slice_a]
    return tf.dynamic_stitch([a_to, b_to],
                             [tf.gather(a, a_from), tf.gather(b, b_from)])
For higher dimensions this could be generalised by abusing reshape (where nd_slice could be implemented like this but there is probably a better way):
def set_subtensornd(a, b, slice_tuple_a, slice_tuple_b):
    # a[*slice_tuple_a] = b[*slice_tuple_b]
    a_range = tf.range(tf.reduce_prod(tf.shape(a)))
    a_idxed = tf.reshape(a_range, tf.shape(a))
    a_dropped = tf.reshape(nd_slice(a_idxed, slice_tuple_a), [-1])
    _, a_from = tf.setdiff1d(a_range, a_dropped)
    a_to = a_from
    b_range = tf.range(tf.reduce_prod(tf.shape(b)))
    b_idxed = tf.reshape(b_range, tf.shape(b))
    b_from = tf.reshape(nd_slice(b_idxed, slice_tuple_b), [-1])
    b_to = a_dropped
    a_flat, b_flat = tf.reshape(a, [-1]), tf.reshape(b, [-1])
    stitched = tf.dynamic_stitch([a_to, b_to],
                                 [tf.gather(a_flat, a_from), tf.gather(b_flat, b_from)])
    return tf.reshape(stitched, tf.shape(a))
I have no idea how slow this will be. I'd guess quite slow. And, I haven't tested it much beyond running it on a couple of tensors.
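For comparison, TensorFlow 2.x offers tf.tensor_scatter_nd_update, which is a different technique from the dynamic_stitch approach above and may be simpler when the positions to overwrite are known as explicit indices. A minimal sketch, assuming TensorFlow 2.x with eager execution and made-up values:

```python
import tensorflow as tf

a = tf.zeros([3, 4])
indices = tf.constant([[0, 1], [2, 3]])  # (row, col) positions to overwrite
updates = tf.constant([5.0, 7.0])        # one new value per position

# Returns a new tensor; a itself stays all zeros.
result = tf.tensor_scatter_nd_update(a, indices, updates)
print(result.numpy())
# [[0. 5. 0. 0.]
#  [0. 0. 0. 0.]
#  [0. 0. 0. 7.]]
```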

TensorFlow tf.nn.rnn function ... how to use the results of your training to do a single forward-pass through the RNN

I'm having a tough time using the 'initial state' argument in the tf.nn.rnn function.
val, _ = tf.nn.rnn(cell1, newBatch, initial_state=stateP, dtype=tf.float32)
newBatch.shape => (1, 1, 11)
stateP.shape => (2, 2, 1, 11)
In general, I've gone through the training for my LSTM neural net and now I want to use the values of it. How do I do this? I know that the tf.nn.rnn() function will return state... but I don't know how to plug it in.
fyi stateP.shape => (2, 2, 1, 11) ..... maybe because I used stacked LSTM cells?
I've also tried:
val, _ = tf.nn.dynamic_rnn(stacked_lstm, newBatch, initial_state=stateP, dtype=tf.float32)
but I get the error "AttributeError: 'NoneType' object has no attribute 'op'".
I'm pretty sure that the 'NoneType' object being talked about is the stateP tuple I gave, but I'm not sure what to do here.
EDIT: I finally got this running by using:
init_state = cell.zero_state(batch_size, tf.float32)
To determine the exact shape I need to pass into the 'initial_state' argument. In my case, it was a TUPLE of 4 tensors, each with the shape of (1, 11). I made it like this:
stateP0 = tf.convert_to_tensor(stateP[0][0])
stateP1 = tf.convert_to_tensor(stateP[0][1])
stateP2 = tf.convert_to_tensor(stateP[1][0])
stateP3 = tf.convert_to_tensor(stateP[1][1])
newStateP = stateP0, stateP1, stateP2, stateP3
Alright! Now the tf.nn.dynamic_rnn() function is working, but it's giving me different results every time I run it... so what's the point of passing in the initial state? I want to use the state I found through training, and I don't want it to change. I want to actually use the results of my training!
You are probably using the deprecated (or soon to be) behavior. stateP in your case represents the concatenation of c (cell state) and h (output of lstm from the final step of unrolling). So you need to slice the state along dimension 1 to get the actual state.
Or, you can initialize your LSTM cell with state_is_tuple=True, which I would recommend, so that you could easily get the final state (if you want to tinker with it) by indexing the state stateP[0]. Or you could just pass the state tuple directly to rnn (or dynamic_rnn).
I can't say anything beyond that because you have not provided your initialization code, so I would be guessing.
You can edit your question to provide more details if you still face problems, and I will edit the answer.

sklearn: get feature names after L1-based feature selection

This question and answer demonstrate that when feature selection is performed using one of scikit-learn's dedicated feature selection routines, then the names of the selected features can be retrieved as follows:
np.asarray(vectorizer.get_feature_names())[featureSelector.get_support()]
For example, in the above code, featureSelector might be an instance of sklearn.feature_selection.SelectKBest or sklearn.feature_selection.SelectPercentile, since these classes implement the get_support method which returns a boolean mask or integer indices of the selected features.
When one performs feature selection via linear models penalized with the L1 norm, it's unclear how to accomplish this. sklearn.svm.LinearSVC has no get_support method and the documentation doesn't make clear how to retrieve the feature indices after using its transform method to eliminate features from a collection of samples. Am I missing something here?
For sparse estimators you can generally find the support by checking where the non-zero entries are in the coefficients vector (provided the coefficients vector exists, which is the case for e.g. linear models):
support = np.flatnonzero(estimator.coef_)
For your LinearSVC with l1 penalty it would accordingly be
from sklearn.svm import LinearSVC
svc = LinearSVC(C=1., penalty='l1', dual=False)
svc.fit(X, y)
selected_feature_names = np.asarray(vectorizer.get_feature_names())[np.flatnonzero(svc.coef_)]
I've been using sklearn 0.15.2, and according to the LinearSVC documentation, coef_ is an array, shape = [n_features] if n_classes == 2 else [n_classes, n_features].
So first, np.flatnonzero doesn't work for multi-class: you'll get an index out of range error. Second, it should be np.where(svc.coef_ != 0)[1] instead of np.where(svc.coef_ != 0)[0], because axis 0 indexes classes, not features. I ended up using np.asarray(vectorizer.get_feature_names())[list(set(np.where(svc.coef_ != 0)[1]))]
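A sketch of a variant that works for both the binary and the multi-class case by keeping every feature with a non-zero coefficient in at least one class. The helper name selected_feature_indices, the synthetic data from make_classification, and the feature names f0, f1, ... are all stand-ins for illustration; with a real vectorizer the names would come from it instead.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.svm import LinearSVC


def selected_feature_indices(coef):
    """Sorted indices of features with a non-zero weight in any class."""
    coef = np.atleast_2d(coef)  # handles both 1-D and 2-D coef_ layouts
    return np.flatnonzero(np.any(coef != 0, axis=0))


# Synthetic 3-class problem standing in for the vectorized text data.
X, y = make_classification(
    n_samples=200, n_features=20, n_informative=4, n_redundant=0,
    n_classes=3, n_clusters_per_class=1, random_state=0,
)
feature_names = np.array([f"f{i}" for i in range(X.shape[1])])

svc = LinearSVC(C=0.1, penalty="l1", dual=False).fit(X, y)
support = selected_feature_indices(svc.coef_)
selected_feature_names = feature_names[support]
```

Compared with list(set(np.where(...)[1])), this returns the indices already de-duplicated and in ascending order, so the selected names keep the original feature order.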