I am trying to use tf.multinomial to sample, and I want to get the associated probability values of the sampled indices. Here is my example code:
In [1]: import tensorflow as tf
In [2]: tf.enable_eager_execution()
In [3]: probs = tf.constant([[0.5, 0.2, 0.1, 0.2], [0.6, 0.1, 0.1, 0.1]], dtype=tf.float32)
In [4]: idx = tf.multinomial(probs, 1)
In [5]: idx # print the indices
Out[5]:
<tf.Tensor: id=43, shape=(2, 1), dtype=int64, numpy=
array([[3],
[2]], dtype=int64)>
In [6]: probs[tf.range(probs.get_shape()[0]), tf.squeeze(idx)]
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
<ipython-input-8-56ef51f84ca2> in <module>
----> 1 probs[tf.range(probs.get_shape()[0]), tf.squeeze(idx)]
C:\ProgramData\Anaconda3\lib\site-packages\tensorflow\python\ops\array_ops.py in _slice_helper(tensor, slice_spec, var)
616 new_axis_mask |= (1 << index)
617 else:
--> 618 _check_index(s)
619 begin.append(s)
620 end.append(s + 1)
C:\ProgramData\Anaconda3\lib\site-packages\tensorflow\python\ops\array_ops.py in _check_index(idx)
514 # TODO(slebedev): IndexError seems more appropriate here, but it
515 # will break `_slice_helper` contract.
--> 516 raise TypeError(_SLICE_TYPE_ERROR + ", got {!r}".format(idx))
517
518
TypeError: Only integers, slices (`:`), ellipsis (`...`), tf.newaxis (`None`) and scalar tf.int32/tf.int64 tensors are valid indices, got <tf.Tensor: id=7, shape=(2,), dtype=int32, numpy=array([3, 2])>
The expected result I want is [0.2, 0.1] as indicated by idx.
But in NumPy, this kind of indexing works, as answered in https://stackoverflow.com/a/23435869/5046896
How can I fix it?
You can try tf.gather_nd. For example:
>>> import tensorflow as tf
>>> tf.enable_eager_execution()
>>> probs = tf.constant([[0.5, 0.2, 0.1, 0.2], [0.6, 0.1, 0.1, 0.1]], dtype=tf.float32)
>>> idx = tf.multinomial(probs, 1)
>>> row_indices = tf.range(probs.get_shape()[0], dtype=tf.int64)
>>> full_indices = tf.stack([row_indices, tf.squeeze(idx)], axis=1)
>>> rs = tf.gather_nd(probs, full_indices)
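If you then evaluate rs, you should get back the probabilities at the sampled indices; with the idx shown in the question (3 and 2) that would be, for example:
>>> rs.numpy()
array([0.2, 0.1], dtype=float32)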
Or, you can use tf.distributions.Multinomial; the advantage is that you do not need to care about the batch_size as in the code above. It works with a varying batch_size when you set batch_size=None. Here is a simple example:
multinomial = tf.distributions.Multinomial(
    total_count=tf.constant(1, dtype=tf.float32),  # one draw for each record in the batch
    probs=probs)
sampled_actions = multinomial.sample()  # one-hot sample for each record in the batch
predicted_actions = tf.argmax(sampled_actions, axis=-1)
action_probs = sampled_actions * probs  # keep only the probability of the sampled action
action_probs = tf.reduce_sum(action_probs, axis=-1)
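In eager mode (as in the question) you can then inspect these tensors directly; the exact values depend on the sample, e.g.:
sampled_actions.numpy()    # e.g. array([[0., 0., 0., 1.], [1., 0., 0., 0.]], dtype=float32)
predicted_actions.numpy()  # e.g. array([3, 0], dtype=int64)
action_probs.numpy()       # e.g. array([0.2, 0.6], dtype=float32)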
I prefer the latter one because it is flexible and elegant.
For the below pandas DataFrame df, I want to transform the type column to OneHotEncoding, and transform the word column to its vector representation using the dictionary word2vec. Then I want to concatenate the two transformed vectors with the count column to form the final feature for classification.
>>> df
word type count
0 apple A 4
1 cat B 3
2 mountain C 1
>>> df.dtypes
word object
type category
count int64
>>> word2vec
{'apple': [0.1, -0.2, 0.3], 'cat': [0.2, 0.2, 0.3], 'mountain': [0.4, -0.2, 0.3]}
I defined a customized Transformer, and used FeatureUnion to concatenate the features.
from sklearn.base import BaseEstimator, TransformerMixin
from sklearn.pipeline import Pipeline, FeatureUnion
from sklearn.preprocessing import OneHotEncoder
class w2vTransformer(TransformerMixin):
def __init__(self,word2vec):
self.word2vec = word2vec
def fit(self,x, y=None):
return self
def wv(self, w):
return self.word2vec[w] if w in self.word2vec else [0, 0, 0]
def transform(self, X, y=None):
return df['word'].apply(self.wv)
pipeline = Pipeline([
('features', FeatureUnion(transformer_list=[
# Part 1: get integer column
('numericals', Pipeline([
('selector', TypeSelector(np.number)),
])),
# Part 2: get category column and its onehotencoding
('categoricals', Pipeline([
('selector', TypeSelector('category')),
('labeler', StringIndexer()),
('encoder', OneHotEncoder(handle_unknown='ignore')),
])),
# Part 3: transform word to its embedding
('word2vec', Pipeline([
('w2v', w2vTransformer(word2vec)),
]))
])),
])
When I run pipeline.fit_transform(df), I get the error: blocks[0,:] has incompatible row dimensions. Got blocks[0,2].shape[0] == 1, expected 3.
However, if I remove the word2vec Transformer (Part 3) from the pipeline, the pipeline (Part 1 + Part 2) works fine.
>>> pipeline_no_word2vec.fit_transform(df).todense()
matrix([[4., 1., 0., 0.],
[3., 0., 1., 0.],
[1., 0., 0., 1.]])
And if I keep only the w2v transformer in the pipeline, it also works.
>>> pipeline_only_word2vec.fit_transform(df)
array([list([0.1, -0.2, 0.3]), list([0.2, 0.2, 0.3]),
list([0.4, -0.2, 0.3])], dtype=object)
My guess is that there is something wrong in my w2vTransformer class, but I don't know how to fix it. Please help.
This error is due to the fact that the FeatureUnion expects a 2-d array from each of its parts.
Now, the first two parts of your FeatureUnion, 'numericals' and 'categoricals', are correctly sending 2-d data of shape (n_samples, n_features).
n_samples = 3 in your example data. n_features depends on the individual part (the OneHotEncoder determines it in the second part, and it is 1 in the first part).
But the third part, 'word2vec', returns a pandas.Series object, which has the 1-d shape (3,). FeatureUnion treats this as shape (1, 3) by default and hence complains that it does not match the other blocks.
So you need to correct that shape.
Even if you simply call reshape() at the end and change it to shape (3, 1), your code will still not work, because the contents of that array are lists from your word2vec dict, which are not converted correctly into a 2-d array; it would just become an array of lists.
Change the w2vTransformer to correct the error:
class w2vTransformer(TransformerMixin):
...
...
def transform(self, X, y=None):
return np.array([np.array(vv) for vv in X['word'].apply(self.wv)])
And after that the pipeline will work.
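As a quick check (a minimal sketch reusing df, word2vec and the pipeline from the question, and assuming your TypeSelector and StringIndexer behave as in the partial pipelines you showed), the fixed transformer now returns a proper 2-d array, and the full FeatureUnion concatenates to a (3, 7) matrix: 1 numeric column + 3 one-hot columns + 3 embedding dimensions:
>>> w2vTransformer(word2vec).fit(df).transform(df)
array([[ 0.1, -0.2,  0.3],
       [ 0.2,  0.2,  0.3],
       [ 0.4, -0.2,  0.3]])
>>> pipeline.fit_transform(df).shape
(3, 7)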
Look at the code:
import tensorflow as tf
import numpy as np
elems = tf.ones([1,2,3],dtype=tf.int64)
alternates = tf.map_fn(lambda x: (x, x, x), elems, dtype=(tf.int64, tf.int64, tf.int64))
with tf.Session() as sess:
print(sess.run(alternates))
The output is:
(array([[[1, 1, 1],
[1, 1, 1]]], dtype=int64), array([[[1, 1, 1],
[1, 1, 1]]], dtype=int64), array([[[1, 1, 1],
[1, 1, 1]]], dtype=int64))
I can't understand the output. Can anyone explain it?
Update
elems is a tensor, so it should be unpacked along axis 0, giving [[1,1,1],[1,1,1]]. map_fn then passes [[1,1,1],[1,1,1]] into lambda x: (x, x, x), which means x = [[1,1,1],[1,1,1]], so I think the output of map_fn should be
[[[1,1,1],[1,1,1]],
[[1,1,1],[1,1,1]],
[[1,1,1],[1,1,1]]]
That is, the output shape would be [3,2,3], or a list of tensors each of shape (2,3).
But in fact, the output is a tuple of tensors, and the shape of each tensor is [1,2,3].
Or in other words:
import tensorflow as tf
import numpy as np
elems = tf.constant([1,2,3],dtype=tf.int64)
alternates = tf.map_fn(lambda x: (x, 2*x, -x), elems, dtype=(tf.int64, tf.int64, tf.int64))
with tf.Session() as sess:
print(sess.run(alternates))
Why is the output
(array([1, 2, 3], dtype=int64),
array([2, 4, 6], dtype=int64),
array([-1, -2, -3], dtype=int64))
rather than
(array([1, 2, -1], dtype=int64),
array([2, 4, -2], dtype=int64),
array([3, 6, -3], dtype=int64))
The two questions are the same.
Update 2
import tensorflow as tf
import numpy as np
elems = [tf.constant([1,2,3],dtype=tf.int64)]
alternates = tf.map_fn(lambda x: x, elems, dtype=tf.int64)
with tf.Session() as sess:
print(sess.run(alternates))
elems is a list of tensors, so according to the API, tf.constant([1,2,3],dtype=tf.int64) should be unpacked along axis 0, and map_fn should work like [x for x in [1,2,3]], but in fact it raises an error:
ValueError: The two structures don't have the same nested structure. First structure: <dtype: 'int64'>, second structure: [<tf.Tensor 'map/while/TensorArrayReadV3:0' shape=() dtype=int64>].
What's wrong?
Update 3
import tensorflow as tf
import numpy as np
elems = (tf.constant([1,2,3],dtype=tf.int64),tf.constant([1,2,3],dtype=tf.int64))
alternates = tf.map_fn(lambda x: x, elems, dtype=(tf.int64, tf.int64))
with tf.Session() as sess:
print(sess.run(alternates))
The output is
(array([1, 2, 3], dtype=int64), array([1, 2, 3], dtype=int64))
It seems that elems isn't unpacked. Why?
import tensorflow as tf
import numpy as np
elems = (tf.constant([1,2,3],dtype=tf.int64),tf.constant([1,2,3],dtype=tf.int64))
alternates = tf.map_fn(lambda x: [x], elems, dtype=(tf.int64, tf.int64))
with tf.Session() as sess:
print(sess.run(alternates))
It raises an error:
TypeError: The two structures don't have the same sequence type. First structure has type <class 'tuple'>, while second structure has type <class 'list'>.
Can anyone tell me how tf.map_fn works?
First,
elems = tf.ones([1,2,3],dtype=tf.int64)
elems is a 3-dimensional tensor with shape 1x2x3 full of ones, that is:
[[[1, 1, 1],
[1, 1, 1]]]
Then,
alternates = tf.map_fn(lambda x: (x, x, x), elems, dtype=(tf.int64, tf.int64, tf.int64))
alternates is a tuple of three tensors with the same shape as elems, each of which is built according to the given function. Since the function simply returns a tuple repeating its input three times, that means that the three tensors will be the same as elems. If the function were lambda x: (x, 2 * x, -x) then the first output tensor would be the same as elems, the second would be the double of elems and the third one the opposite.
In all these cases it is preferable to use regular operations instead of tf.map_fn; however, there may be cases where you have a function accepting tensors with N dimensions and a tensor with N + 1 dimensions that you want it applied to.
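For instance, for the lambda x: (x, 2*x, -x) case, a rough equivalent with regular operations (a sketch in the TF 1.x style used in the question) would be:
import tensorflow as tf

elems = tf.constant([1, 2, 3], dtype=tf.int64)
# same result as tf.map_fn(lambda x: (x, 2*x, -x), elems, dtype=(tf.int64,)*3)
alternates = (elems, 2 * elems, -elems)

with tf.Session() as sess:
    print(sess.run(alternates))  # (array([1, 2, 3]), array([2, 4, 6]), array([-1, -2, -3]))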
UPDATE:
I think you are thinking of tf.map_fn "the other way around", so to speak. There is no one-to-one correspondence between the number of elements or rows in the tensor and the number of outputs of the function; in fact, you could pass a function returning a tuple with as many elements as you want.
Taking your last example:
elems = tf.constant([1,2,3],dtype=tf.int64)
alternates = tf.map_fn(lambda x: (x, 2*x, -x), elems, dtype=(tf.int64, tf.int64, tf.int64))
tf.map_fn first splits elems along the first axis, that is, into 1, 2 and 3, and applies the function to each of them, getting:
(1, 2, -1)
(2, 4, -2)
(3, 6, -3)
Note that, as I said, each of these tuples could have as many elements as you wanted. Now, the final output is produced by stacking together the results in the same position; so you get:
[1, 2, 3]
[2, 4, 6]
[-1, -2, -3]
Again, if the function produced tuples with more elements you would get more output tensors.
UPDATE 2:
About your new example:
import tensorflow as tf
import numpy as np
elems = (tf.constant([1,2,3],dtype=tf.int64),tf.constant([1,2,3],dtype=tf.int64))
alternates = tf.map_fn(lambda x: x, elems, dtype=(tf.int64, tf.int64))
with tf.Session() as sess:
print(sess.run(alternates))
The documentation says:
This method also allows multi-arity elems and output of fn. If elems is a (possibly nested) list or tuple of tensors, then each of these tensors must have a matching first (unpack) dimension. The signature of fn may match the structure of elems. That is, if elems is (t1, [t2, t3, [t4, t5]]), then an appropriate signature for fn is: fn = lambda (t1, [t2, t3, [t4, t5]]):.
Here elems is a tuple of two tensors with the same size in the first dimension, as needed. tf.map_fn takes one element of each input tensor at a time (so a tuple of two elements) and applies the given function to it, which should return the same structure that you passed in dtype (a tuple of two elements, too); if you don't give a dtype, the expected output structure is the same as the input (again, a tuple of two elements, so in your case dtype is optional). Anyway, it goes like this:
f((1, 1)) -> (1, 1)
f((2, 2)) -> (2, 2)
f((3, 3)) -> (3, 3)
These results are combined, concatenating all the corresponding elements in the structure; in this case, all the numbers in the first position produce the first output and all the numbers in the second position produce the second output. The result is, finally, the requested structure (the two-element tuple) filled with these concatenations:
([1, 2, 3], [1, 2, 3])
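As another illustration of the multi-arity case (a hedged sketch, again in TF 1.x style), a function that combines the two per-step slices makes the pairing explicit; since the output structure (a single tensor) differs from the input structure (a tuple), dtype must be given:
import tensorflow as tf

a = tf.constant([1, 2, 3], dtype=tf.int64)
b = tf.constant([10, 20, 30], dtype=tf.int64)
# fn receives the tuple (a[i], b[i]) at each step
sums = tf.map_fn(lambda pair: pair[0] + pair[1], (a, b), dtype=tf.int64)

with tf.Session() as sess:
    print(sess.run(sums))  # [11 22 33]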
Your input elems has shape (1,2,3) and looks like this:
[[[1, 1, 1],
[1, 1, 1]]]
It's not a tensor containing the values 1, 2, 3: you created it with tf.ones(), which makes a tensor filled with ones, in the shape you pass as a parameter.
Replying to the Update:
map_fn is applied to slices of elems along its first dimension.
According to tf.map_fn's documentation:
elems: A tensor or (possibly nested) sequence of tensors, each of which will be unpacked along their first dimension. The nested sequence of the resulting slices will be applied to fn.
From what I understand there, the function expects a tensor or a (possibly nested) sequence of tensors, slices it along the first dimension, and applies the function to each slice. Since the first dimension here has size 1, the lambda is called with a single slice x of shape (2,3), and the per-slice results are stacked back along that first dimension.
The function returns a tuple with three copies of its input, so after stacking you get a tuple of three copies of your (1,2,3) tensor (each of which is the array(...) in your output).
Restructuring the output line and adding indent to make it more clear, the output looks as follows:
(
array( # first copy of `x`
[
[
[1, 1, 1],
[1, 1, 1]
]
], dtype=int64
),
array( # second copy of `x`
[
[
[1, 1, 1],
[1, 1, 1]
]
], dtype=int64
),
array( # third copy of `x`
[
[
[1, 1, 1],
[1, 1, 1]
]
], dtype=int64
),
) # end of the tuple
Update 2:
My suspicion is that you ran into a bug. If you define elems as a list, you have the error, but if you define it as a tuple with elems = (tf.constant([1,2,3],dtype=tf.int64)), the code works as expected. Different handling of tuples and lists is very suspicious... which is why I believe it's a bug.
As #mrry pointed out, in my example with the tuple I missed a comma (and thus elems was the tensor itself and not a tuple containing the tensor).
I'm working on a classification problem. The labels I am trying to predict:
df3['relevance'].unique()
array([ 3. , 2.5 , 2.33, 2.67, 2. , 1. , 1.67, 1.33, 1.25,
2.75, 1.75, 1.5 , 2.25])
When I call predict using the features I've made, it works OK:
clf = RandomForestClassifier()
clf.fit(df3[features], df['relevance'])
pd.crosstab(clf.predict(df3[features]), df3['relevance'])
But when I call clf.score:
clf.score(df3[features], df3['relevance'])
I get
ValueError: continuous is not supported
Should I be classifying the relevance label I am trying to predict as another data type? Thanks for any help.
The issue you are facing likely happens because your relevance column is made up of continuous numbers.
I would suggest switching over to RandomForestRegressor() if you are trying to predict continuous numbers. Otherwise, convert your target into 1s and 0s based on some threshold value.
Simply encode labels as integers and everything will work well. Floats suggest regression.
In particular you can use LabelEncoder http://scikit-learn.org/stable/modules/generated/sklearn.preprocessing.LabelEncoder.html
>>> from sklearn.ensemble import RandomForestClassifier as RF
>>> import numpy as np
>>> X = np.array([[0], [1], [1.2]])
>>> y = [0.5, 1.2, -0.1]
>>> clf = RF()
>>> clf.fit(X, y)
RandomForestClassifier(bootstrap=True, class_weight=None, criterion='gini',
max_depth=None, max_features='auto', max_leaf_nodes=None,
min_samples_leaf=1, min_samples_split=2,
min_weight_fraction_leaf=0.0, n_estimators=10, n_jobs=1,
oob_score=False, random_state=None, verbose=0,
warm_start=False)
>>> print clf.score(X, y)
Traceback (most recent call last):
[.....]
ValueError: continuous is not supported
>>> y = [0, 1, 2]
>>> clf.fit(X, y)
RandomForestClassifier(bootstrap=True, class_weight=None, criterion='gini',
max_depth=None, max_features='auto', max_leaf_nodes=None,
min_samples_leaf=1, min_samples_split=2,
min_weight_fraction_leaf=0.0, n_estimators=10, n_jobs=1,
oob_score=False, random_state=None, verbose=0,
warm_start=False)
>>> print clf.score(X, y)
1.0
Or compute .score yourself, as it is an extremely trivial function:
print np.mean(clf.predict(X) == y)
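For your original data, the LabelEncoder route would look roughly like this (a sketch, assuming df3 and features from your question):
from sklearn.preprocessing import LabelEncoder
from sklearn.ensemble import RandomForestClassifier

le = LabelEncoder()
y = le.fit_transform(df3['relevance'])   # floats such as 2.33 become integer class ids
clf = RandomForestClassifier()
clf.fit(df3[features], y)
print clf.score(df3[features], y)        # no more "continuous is not supported"
# le.inverse_transform(clf.predict(df3[features])) maps predictions back to the original values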
Is there a numpy function to divide an array along an axis with elements from another array? For example, suppose I have an array a with shape (l,m,n) and an array b with shape (m,); I'm looking for something equivalent to:
def divide_along_axis(a,b,axis=None):
if axis is None:
return a/b
c = a.copy()
for i, x in enumerate(c.swapaxes(0,axis)):
x /= b[i]
return c
For example, this is useful when normalizing an array of vectors:
>>> a = np.random.randn(4,3)
>>> a
array([[ 1.03116167, -0.60862215, -0.29191449],
[-1.27040355, 1.9943905 , 1.13515384],
[-0.47916874, 0.05495749, -0.58450632],
[ 2.08792161, -1.35591814, -0.9900364 ]])
>>> np.apply_along_axis(np.linalg.norm,1,a)
array([ 1.23244853, 2.62299312, 0.75780647, 2.67919815])
>>> c = divide_along_axis(a,np.apply_along_axis(np.linalg.norm,1,a),0)
>>> np.apply_along_axis(np.linalg.norm,1,c)
array([ 1., 1., 1., 1.])
For the specific example you've given, dividing an (l,m,n) array by an (m,) array, you can use np.newaxis:
a = np.arange(1,61, dtype=float).reshape((3,4,5)) # Create a 3d array
a.shape # (3,4,5)
b = np.array([1.0, 2.0, 3.0, 4.0]) # Create a 1-d array
b.shape # (4,)
a / b # Gives a ValueError
a / b[:, np.newaxis] # The result you want
You can read all about the broadcasting rules here. You can also use newaxis more than once if required. (e.g. to divide a shape (3,4,5,6) array by a shape (3,5) array).
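For instance, a quick sketch of that last case (hypothetical arrays, just to show where each newaxis goes):
a = np.ones((3, 4, 5, 6))
b = np.ones((3, 5))
(a / b[:, np.newaxis, :, np.newaxis]).shape   # b broadcasts as (3, 1, 5, 1) -> result (3, 4, 5, 6)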
From my understanding of the docs, using newaxis + broadcasting also avoids any unnecessary array copying.
Indexing, newaxis etc. are described more fully here now. (The documentation has been reorganised since this answer was first posted.)
I think you can get this behavior with numpy's usual broadcasting:
In [9]: a = np.array([[1., 2.], [3., 4.]])
In [10]: a / np.sum(a, axis=0)
Out[10]:
array([[ 0.25 , 0.33333333],
[ 0.75 , 0.66666667]])
If I've interpreted correctly.
If you want the other axis you could transpose everything:
> a = np.random.randn(4,3).transpose()
> norms = np.apply_along_axis(np.linalg.norm,0,a)
> c = a / norms
> np.apply_along_axis(np.linalg.norm,0,c)
array([ 1., 1., 1., 1.])
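Equivalently, going back to the original (4,3) a from the question, you can skip the transpose and broadcast over rows directly (a sketch reusing apply_along_axis as in the question):
> norms = np.apply_along_axis(np.linalg.norm, 1, a)
> c = a / norms[:, np.newaxis]
> np.apply_along_axis(np.linalg.norm, 1, c)
array([ 1.,  1.,  1.,  1.])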