How to iterate over a 3 dimensional tensor - numpy

I have a tensor say:
y_true = np.array([[[1.], [0.], [3.]], [[5.], [0.], [0.]]])
I want to iterate over y_true, accessing all individual values. I want to do something like the following Java-style loop:
for (i = 0; i < y_true.length; i++) {
    arr2 = y_true[i];
    for (j = 0; j < arr2.length; j++) {
        print(arr2[j][0]);
    }
}

Are you looking for slicing with [:,:,0]?
>>> y_true[:,:,0]
array([[1., 0., 3.],
[5., 0., 0.]])

There are 2 cases:
If you know the rank (dimensionality) of your NumPy array: in the example, y_true has rank 3, and y_true.shape gives the exact size of each dimension. You can then write as many nested for loops as the rank of y_true and output each element separately, for example:
import numpy as np
y_true = np.array([[[1.], [0.], [3.]], [[5.], [0.], [0.]]])
dims = y_true.shape
for i in range(dims[0]):
    for j in range(dims[1]):
        for k in range(dims[2]):
            print("Element of np array with indices {} is equal to {}".format([i, j, k], y_true[i, j, k]))
If you don't know the rank of the tensor you want to print, you can write a recursive function that prints all the elements, for example:
import numpy as np
def recursively_print_elems(np_arr, idx, pos):
    # Base case: every axis has an index assigned, so print the element.
    if pos >= len(np_arr.shape):
        print("Element of np array with indices {} is equal to: {}".format(idx, np_arr[tuple(idx)]))
        return
    # Otherwise loop over the current axis and recurse into the next one.
    for i in range(np_arr.shape[pos]):
        idx[pos] = i
        recursively_print_elems(np_arr, idx, pos + 1)
def print_elems(np_arr):
    idx = [0] * len(np_arr.shape)
    recursively_print_elems(np_arr, idx, 0)
y_true = np.array([[[1.], [0.], [3.]], [[5.], [0.], [0.]]])
print_elems(y_true)
The second approach is more general: it works for a tensor of any rank.

Your array:
In [19]: y_true
Out[19]:
array([[[1.],
[0.],
[3.]],
[[5.],
[0.],
[0.]]])
In [20]: y_true.shape
Out[20]: (2, 3, 1)
With a last dimension of size 1, we can reshape it
In [21]: y_true.reshape(2,3)
Out[21]:
array([[1., 0., 3.],
[5., 0., 0.]])
Selecting index 0 on that last dimension, as in y_true[:, :, 0], does just as well.
But you can access all values in order just by raveling/flattening:
In [22]: y_true.ravel()
Out[22]: array([1., 0., 3., 5., 0., 0.])
Or get a 1-D iterator:
In [23]: yiter = y_true.flat
In [24]: yiter?
Type: flatiter
String form: <numpy.flatiter object at 0x1fdd200>
Length: 6
File: ~/.local/lib/python3.6/site-packages/numpy/__init__.py
Docstring: <no docstring>
Class docstring:
Flat iterator object to iterate over arrays.
A `flatiter` iterator is returned by ``x.flat`` for any array `x`.
It allows iterating over the array as if it were a 1-D array,
either in a for-loop or by calling its `next` method.
...
So instead of constructing an iterator for each dimension we can iterate on this flat one:
In [25]: for item in yiter:print(item)
1.0
0.0
3.0
5.0
0.0
0.0
ndenumerate uses this flat iterator, and returns both coordinates and values:
In [26]: list(np.ndenumerate(y_true))
Out[26]:
[((0, 0, 0), 1.0),
((0, 1, 0), 0.0),
((0, 2, 0), 3.0),
((1, 0, 0), 5.0),
((1, 1, 0), 0.0),
((1, 2, 0), 0.0)]
A variation on this is ndindex:
In [27]: indexs = np.ndindex(y_true.shape)
In [28]: for ijk in indexs:
    ...:     print(ijk, y_true[ijk])
    ...:
(0, 0, 0) 1.0
(0, 1, 0) 0.0
(0, 2, 0) 3.0
(1, 0, 0) 5.0
(1, 1, 0) 0.0
(1, 2, 0) 0.0
But where possible it is better to operate on the whole array, rather than iterate. Those whole-array operations do the iteration in compiled code.
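As a rough illustration of that last point, the sketch below contrasts an explicit loop over the flat iterator with the equivalent whole-array expressions. The task (summing and counting the non-zero entries) and the names loop_total, loop_count, total and nonzero_count are made up for this example and are not part of the question.

```python
import numpy as np

y_true = np.array([[[1.], [0.], [3.]], [[5.], [0.], [0.]]])

# Explicit Python-level iteration over the flat iterator.
loop_total = 0.0
loop_count = 0
for item in y_true.flat:
    if item != 0:
        loop_total += item
        loop_count += 1

# Equivalent whole-array operations; the looping happens in compiled code.
mask = y_true != 0
total = y_true[mask].sum()
nonzero_count = np.count_nonzero(y_true)

print(loop_total, total)          # both sums agree
print(loop_count, nonzero_count)  # both counts agree
```

For an array this small the difference is negligible, but the whole-array form usually scales much better as the array grows.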

Related

tf.math.bincount - use min/max weight instead of weight sum

I would like to get a max/min value in tf.math.bincount instead of the weight sum. Basically currently it works as:
values = tf.constant([1,1,2,3,2,4,4,5])
weights = tf.constant([1,5,0,1,0,5,4,5])
tf.math.bincount(values, weights=weights) #[0 6 0 1 9 5]
However, I would like to get max/min for the conflicting weights instead, e.g. for max it should return:
[0 5 0 1 5 5]
It requires some finessing, but you can accomplish this as follows:
def bincount_with_max_weight(values: tf.Tensor, weights: tf.Tensor) -> tf.Tensor:
    _range = tf.range(tf.reduce_max(values) + 1)
    return tf.map_fn(lambda x: tf.maximum(
        tf.reduce_max(tf.gather(weights, tf.where(tf.equal(values, x)))), 0), _range)
The output for the example case is:
[0 5 0 1 5 5]
Breaking it down, the first line computes the range of values in values:
_range = tf.range(tf.reduce_max(values) + 1)
and in the second line, the maximum of weight is computed per element in _range using tf.map_fn with tf.where, which retrieves indices where the clause is true, and tf.gather, which retrieves the values corresponding to supplied indices.
The tf.maximum wraps the output to handle the case where an element does not exist in values. In the example, 0 does not exist in values, so the output without tf.maximum would be INT_MIN for 0:
[-2147483648 5 0 1 5 5]
This could also be applied on the final result tensor instead of per element:
def bincount_with_max_weight(values: tf.Tensor, weights: tf.Tensor) -> tf.Tensor:
    _range = tf.range(tf.reduce_max(values) + 1)
    result = tf.map_fn(lambda x:
                       tf.reduce_max(tf.gather(weights, tf.where(tf.equal(values, x)))), _range)
    return tf.maximum(result, 0)
Note that this would not work if negative weights are utilized - in that case, tf.where can be used for comparing against the minimum integer value (tf.int32.min in the example, although this can be applied for any numeric dtype) instead of applying tf.maximum:
def bincount_with_max_weight(values: tf.Tensor, weights: tf.Tensor) -> tf.Tensor:
    _range = tf.range(tf.reduce_max(values) + 1)
    result = tf.map_fn(lambda x:
                       tf.reduce_max(tf.gather(weights, tf.where(tf.equal(values, x)))), _range)
    return tf.where(tf.equal(result, tf.int32.min), 0, result)
Update
For handling the 2D Tensor case, we can use tf.map_fn to apply the maximum weight function to each pair of values and weights in the batch:
from typing import Optional
def bincount_with_max_weight(values: tf.Tensor, weights: tf.Tensor, axis: Optional[int] = None) -> tf.Tensor:
    _range = tf.range(tf.reduce_max(values) + 1)
    def mapping_function(x: int, _values: tf.Tensor, _weights: tf.Tensor) -> tf.Tensor:
        return tf.reduce_max(tf.gather(_weights, tf.where(tf.equal(_values, x))))
    if axis == -1:
        # Batched case: apply the per-bin maximum to each (values, weights) row pair.
        result = tf.map_fn(lambda pair: tf.map_fn(lambda x: mapping_function(x, *pair), _range), (values, weights),
                           dtype=tf.int32)
    else:
        result = tf.map_fn(lambda x: mapping_function(x, values, weights), _range)
    return tf.where(tf.equal(result, tf.int32.min), 0, result)
For the 2D example provided:
values = tf.constant([[1, 1, 2, 3], [2, 1, 4, 5]])
weights = tf.constant([[1, 5, 0, 1], [0, 5, 4, 5]])
print(bincount_with_max_weight(values, weights, axis=-1))
The output is:
tf.Tensor(
[[0 5 0 1 0 0]
[0 5 0 0 4 5]], shape=(2, 6), dtype=int32)
This implementation is a generalization of the approach originally described - if axis is omitted, it will compute results for the 1D case.
For faster execution, try this:
values = tf.constant([[1,1,2,3], [2,1,4,5]])
weights = tf.constant([[1,5,0,1], [0,5,4,5]])
def find_max_bins(output , values , weights):
    np.maximum.at(output , values , weights)
    return output
#tf.function(input_signature=[tf.TensorSpec(shape=[None], dtype = tf.float32),
#                             tf.TensorSpec(shape=[None], dtype = tf.int32),
#                             tf.TensorSpec(shape=[None], dtype = tf.int32)
#                             ])
def tf_function(output , values , weights):
    print(values)
    y = tf.numpy_function(find_max_bins, [output , values , weights], tf.float32)
    return y
length = np.max(values)+1
initial_value = [0 for x in range(length)]
variable = tf.Variable(initial_value = initial_value, shape=(length) , dtype=tf.float32)
for i , (value , weight) in enumerate(zip(values , weights)):
    if(i > 0):
        output = tf.stack([output , tf_function(variable , value , weight)] , 0)
    else:
        output = tf_function(variable , value , weight)
    variable.assign_sub(initial_value)
Output:
<tf.Tensor: shape=(2, 6), dtype=float32, numpy=
array([[0., 5., 0., 1., 0., 0.],
[0., 5., 0., 0., 4., 5.]], dtype=float32)>
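To see the core idea of that answer without the TensorFlow plumbing, here is a minimal NumPy-only sketch for the 1D example from the question. np.maximum.at is the same call used above. The function name bincount_max and the sentinel handling for empty bins are assumptions made for this illustration, and the sketch assumes integer weights and non-negative integer values.

```python
import numpy as np

def bincount_max(values, weights):
    """Per-bin maximum of weights (hypothetical helper, integer dtypes assumed)."""
    values = np.asarray(values)
    weights = np.asarray(weights)
    # Start every bin at the lowest representable value so negative weights work.
    lowest = np.iinfo(weights.dtype).min
    out = np.full(values.max() + 1, lowest, dtype=weights.dtype)
    # Unbuffered in-place maximum: repeated indices are all taken into account.
    np.maximum.at(out, values, weights)
    # Bins that never received a weight still hold the sentinel; report them as 0.
    out[out == lowest] = 0
    return out

values = np.array([1, 1, 2, 3, 2, 4, 4, 5])
weights = np.array([1, 5, 0, 1, 0, 5, 4, 5])
print(bincount_max(values, weights))  # [0 5 0 1 5 5]
```

A weight that is itself equal to the sentinel would be misreported as 0, which is the same limitation as the tf.int32.min comparison shown earlier.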

tf.keras.losses.CategoricalCrossentropy gives different values than plain implementation

Does anyone know why a raw implementation of the categorical cross-entropy function gives such different results from the tf.keras API function?
import tensorflow as tf
import numpy as np
import math
tf.enable_eager_execution()
y_true = np.array([[1., 0., 0.], [0., 1., 0.], [0., 0., 1.]])
y_pred = np.array([[.9, .05, .05], [.5, .89, .6], [.05, .01, .94]])
ce = tf.keras.losses.CategoricalCrossentropy()
res = ce(y_true, y_pred).numpy()
print("use api:")
print(res)
print()
print("implementation:")
step1 = -y_true * np.log(y_pred )
step2 = np.sum(step1, axis=1)
print("step1.shape:", step1.shape)
print(step1)
print("sum step1:", np.sum(step1, ))
print("mean step1", np.mean(step1))
print()
print("step2.shape:", step2.shape)
print(step2)
print("sum step2:", np.sum(step2, ))
print("mean step2", np.mean(step2))
Above gives:
use api:
0.3239681124687195
implementation:
step1.shape: (3, 3)
[[0.10536052 0. 0. ]
[0. 0.11653382 0. ]
[0. 0. 0.0618754 ]]
sum step1: 0.2837697356318653
mean step1 0.031529970625762814
step2.shape: (3,)
[0.10536052 0.11653382 0.0618754 ]
sum step2: 0.2837697356318653
mean step2 0.09458991187728844
If now with another y_true and y_pred:
y_true = np.array([[0, 1]])
y_pred = np.array([[0.99999999999, 0.00000000001]])
It gives:
use api:
16.11809539794922
implementation:
step1.shape: (1, 2)
[[-0. 25.32843602]]
sum step1: 25.328436022934504
mean step1 12.664218011467252
step2.shape: (1,)
[25.32843602]
sum step2: 25.328436022934504
mean step2 25.328436022934504
The difference is because of these values: [.5, .89, .6], since their sum is not equal to 1. I think you made a mistake and meant this instead: [.05, .89, .06].
If you provide values whose sum is equal to 1, then both formulas give the same results:
import tensorflow as tf
import numpy as np
y_true = np.array( [[1., 0., 0.], [0., 1., 0.], [0., 0., 1.]])
y_pred = np.array([[.9, .05, .05], [.05, .89, .06], [.05, .01, .94]])
print(tf.keras.losses.categorical_crossentropy(y_true, y_pred).numpy())
print(np.sum(-y_true * np.log(y_pred), axis=1))
#output
#[0.10536052 0.11653382 0.0618754 ]
#[0.10536052 0.11653382 0.0618754 ]
However, let's explore how the loss is calculated when the y_pred tensor is not scaled (the sum of its values is not equal to 1). If you look at the source code of categorical cross entropy here, you will see that it scales y_pred so that the class probas of each sample sum to 1:
if not from_logits:
    # scale preds so that the class probas of each sample sum to 1
    output /= tf.reduce_sum(output,
                            reduction_indices=len(output.get_shape()) - 1,
                            keep_dims=True)
Since we passed a prediction whose probabilities do not sum to 1, let's see how this operation changes our tensor [.5, .89, .6]:
output = tf.constant([.5, .89, .6])
output /= tf.reduce_sum(output,
axis=len(output.get_shape()) - 1,
keepdims=True)
print(output.numpy())
# array([0.2512563 , 0.44723618, 0.30150756], dtype=float32)
So the two results should be equal if we pass the scaled y_pred (the output of the operation above) to your own categorical cross entropy implementation, and the unscaled y_pred to the TensorFlow implementation:
y_true =np.array( [[1., 0., 0.], [0., 1., 0.], [0., 0., 1.]])
#unscaled y_pred
y_pred = np.array([[.9, .05, .05], [.5, .89, .6], [.05, .01, .94]])
print(tf.keras.losses.categorical_crossentropy(y_true, y_pred).numpy())
#scaled y_pred (categorical_crossentropy scales above tensor to this internally)
y_pred = np.array([[.9, .05, .05], [0.2512563 , 0.44723618, 0.30150756], [.05, .01, .94]])
print(np.sum(-y_true * np.log(y_pred), axis=1))
Output:
[0.10536052 0.80466845 0.0618754 ]
[0.10536052 0.80466846 0.0618754 ]
Now, let's explore the results of your second example. Why does your second example show a different output?
If you check the source code again, you will see this line:
output = tf.clip_by_value(output, epsilon, 1. - epsilon)
which clips values into the range [epsilon, 1 - epsilon]. Your input [0.99999999999, 0.00000000001] will be converted to [0.9999999, 0.0000001] in this line, so it gives you a different result:
y_true = np.array([[0, 1]])
y_pred = np.array([[0.99999999999, 0.00000000001]])
print(tf.keras.losses.categorical_crossentropy(y_true, y_pred).numpy())
print(np.sum(-y_true * np.log(y_pred), axis=1))
#now let's first clip the values less than epsilon, then compare loss
epsilon=1e-7
y_pred = tf.clip_by_value(y_pred, epsilon, 1. - epsilon)
print(tf.keras.losses.categorical_crossentropy(y_true, y_pred).numpy())
print(np.sum(-y_true * np.log(y_pred), axis=1))
Output:
#results without clipping values
[16.11809565]
[25.32843602]
#results after clipping values if there is a value less than epsilon (1e-7)
[16.11809565]
[16.11809565]
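Putting the two steps together, here is a minimal NumPy sketch of the behaviour described in this answer (rescale each row to sum to 1, then clip, then take the negative log-likelihood). The function name manual_categorical_crossentropy and the default epsilon of 1e-7 are assumptions for illustration; this is not the library's actual code.

```python
import numpy as np

def manual_categorical_crossentropy(y_true, y_pred, epsilon=1e-7):
    """Hypothetical helper mimicking the scale-then-clip steps described above."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    # Step 1: rescale so the class probabilities of each sample sum to 1.
    y_pred = y_pred / y_pred.sum(axis=-1, keepdims=True)
    # Step 2: clip into [epsilon, 1 - epsilon] to avoid log(0).
    y_pred = np.clip(y_pred, epsilon, 1.0 - epsilon)
    # Step 3: per-sample cross entropy.
    return -np.sum(y_true * np.log(y_pred), axis=-1)

# First example from the question (second row does not sum to 1).
losses = manual_categorical_crossentropy(
    [[1., 0., 0.], [0., 1., 0.], [0., 0., 1.]],
    [[.9, .05, .05], [.5, .89, .6], [.05, .01, .94]])
print(losses, losses.mean())

# Second example from the question (values below epsilon get clipped).
clipped = manual_categorical_crossentropy([[0, 1]], [[0.99999999999, 0.00000000001]])
print(clipped)
```

The mean of the per-sample losses in the first example lines up with the 0.3239... value the question reported from the API, which averages over the batch.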

From softmax output to class prediction

Is there an easy way to go from a Softmax output to a class prediction?
For instance,
from this:
[0.83128697, 0.06161868, 0.10709436]
to this:
[1, 0, 0]
You can use np.argmax to retrieve the index of the maximum value:
import numpy as np
a = [0.83128697, 0.06161868, 0.10709436]
r = np.zeros(len(a)) # a.size if a is a numpy array
r[np.argmax(a)]=1
r
array([1., 0., 0.])
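If you have a whole batch of softmax outputs rather than a single vector, the same idea can be applied row by row without a loop. This is a small sketch; the second row of probs and the names probs, classes and one_hot are made up for the example.

```python
import numpy as np

# Each row is one softmax output; the second row is invented for illustration.
probs = np.array([[0.83128697, 0.06161868, 0.10709436],
                  [0.10000000, 0.70000000, 0.20000000]])

# Index of the most likely class for every row.
classes = np.argmax(probs, axis=1)

# Pick the matching rows of an identity matrix to get one-hot vectors.
one_hot = np.eye(probs.shape[1], dtype=int)[classes]
print(classes)
print(one_hot)
```

If you only need the class label, classes alone is usually enough and the one-hot step can be skipped.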

Custom word2vec Transformer on pandas dataframe and using it in FeatureUnion

For the below pandas DataFrame df, I want to transform the type column to OneHotEncoding, and transform the word column to its vector representation using the dictionary word2vec. Then I want to concatenate the two transformed vectors with the count column to form the final feature for classification.
>>> df
       word type  count
0     apple    A      4
1       cat    B      3
2  mountain    C      1
>>> df.dtypes
word       object
type     category
count       int64
>>> word2vec
{'apple': [0.1, -0.2, 0.3], 'cat': [0.2, 0.2, 0.3], 'mountain': [0.4, -0.2, 0.3]}
I defined a custom Transformer and used FeatureUnion to concatenate the features.
from sklearn.base import BaseEstimator, TransformerMixin
from sklearn.pipeline import Pipeline, FeatureUnion
from sklearn.preprocessing import OneHotEncoder
class w2vTransformer(TransformerMixin):
    def __init__(self,word2vec):
        self.word2vec = word2vec
    def fit(self,x, y=None):
        return self
    def wv(self, w):
        return self.word2vec[w] if w in self.word2vec else [0, 0, 0]
    def transform(self, X, y=None):
        return df['word'].apply(self.wv)
pipeline = Pipeline([
    ('features', FeatureUnion(transformer_list=[
        # Part 1: get integer column
        ('numericals', Pipeline([
            ('selector', TypeSelector(np.number)),
        ])),
        # Part 2: get category column and its onehotencoding
        ('categoricals', Pipeline([
            ('selector', TypeSelector('category')),
            ('labeler', StringIndexer()),
            ('encoder', OneHotEncoder(handle_unknown='ignore')),
        ])),
        # Part 3: transform word to its embedding
        ('word2vec', Pipeline([
            ('w2v', w2vTransformer(word2vec)),
        ]))
    ])),
])
When I run pipeline.fit_transform(df), I got the error: blocks[0,:] has incompatible row dimensions. Got blocks[0,2].shape[0] == 1, expected 3.
However, if I remove the word2vec Transformer (Part 3) from the pipeline, the pipeline (Part 1 + Part 2) works fine.
>>> pipeline_no_word2vec.fit_transform(df).todense()
matrix([[4., 1., 0., 0.],
[3., 0., 1., 0.],
[1., 0., 0., 1.]])
And if I keep only the w2v transformer in the pipeline, it also works.
>>> pipeline_only_word2vec.fit_transform(df)
array([list([0.1, -0.2, 0.3]), list([0.2, 0.2, 0.3]),
list([0.4, -0.2, 0.3])], dtype=object)
My guess is that there is something wrong in my w2vTransformer class but don't know how to fix it. Please help.
This error is due to the fact that the FeatureUnion expects a 2-d array from each of its parts.
Now the first two parts of your FeatureUnion, 'numericals' and 'categoricals', correctly return 2-d data of shape (n_samples, n_features).
n_samples = 3 in your example data. n_features depends on the individual part (OneHotEncoder changes it in the second part, but it is 1 in the first part).
But the third part, 'word2vec', returns a pandas.Series object, which has the 1-d shape (3,). FeatureUnion treats this as shape (1, 3) by default and hence complains that it does not match the other blocks.
So you need to correct that shape.
Now even if you simply do a reshape() at the end and change it to shape (3,1), your code will not run, because the internal contents of that array are lists from your word2vec dict, which are not transformed correctly to a 2-d array. Instead it will become an array of lists.
Change the w2vTransformer to correct the error:
class w2vTransformer(TransformerMixin):
    ...
    ...
    def transform(self, X, y=None):
        return np.array([np.array(vv) for vv in X['word'].apply(self.wv)])
And after that the pipeline will work.
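The shape issue can be reproduced with NumPy alone. The sketch below uses the word2vec dictionary and the count values from the question; the names words, counts, as_objects, embedded and combined are introduced just for this illustration. It compares an array of lists with a proper 2-d array that can be stacked next to the other features.

```python
import numpy as np

word2vec = {'apple': [0.1, -0.2, 0.3], 'cat': [0.2, 0.2, 0.3], 'mountain': [0.4, -0.2, 0.3]}
words = ['apple', 'cat', 'mountain']
counts = np.array([[4], [3], [1]])  # the count column as a (3, 1) block

# What the original transform effectively produces: a 1-d container of lists.
as_objects = np.empty(len(words), dtype=object)
for i, w in enumerate(words):
    as_objects[i] = word2vec[w]
print(as_objects.shape)  # (3,)

# What the fixed transform produces: one row per sample, one column per dimension.
embedded = np.array([word2vec.get(w, [0, 0, 0]) for w in words], dtype=float)
print(embedded.shape)  # (3, 3)

# A (3, 3) block has the same number of rows as the other blocks, so stacking works.
combined = np.hstack([counts, embedded])
print(combined.shape)  # (3, 4)
```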

numpy divide along axis

Is there a numpy function to divide an array along an axis with elements from another array? For example, suppose I have an array a with shape (l,m,n) and an array b with shape (m,); I'm looking for something equivalent to:
def divide_along_axis(a,b,axis=None):
    if axis is None:
        return a/b
    c = a.copy()
    for i, x in enumerate(c.swapaxes(0,axis)):
        x /= b[i]
    return c
For example, this is useful when normalizing an array of vectors:
>>> a = np.random.randn(4,3)
array([[ 1.03116167, -0.60862215, -0.29191449],
[-1.27040355, 1.9943905 , 1.13515384],
[-0.47916874, 0.05495749, -0.58450632],
[ 2.08792161, -1.35591814, -0.9900364 ]])
>>> np.apply_along_axis(np.linalg.norm,1,a)
array([ 1.23244853, 2.62299312, 0.75780647, 2.67919815])
>>> c = divide_along_axis(a,np.apply_along_axis(np.linalg.norm,1,a),0)
>>> np.apply_along_axis(np.linalg.norm,1,c)
array([ 1., 1., 1., 1.])
For the specific example you've given, dividing an (l,m,n) array by an (m,) array, you can use np.newaxis:
a = np.arange(1,61, dtype=float).reshape((3,4,5)) # Create a 3d array
a.shape # (3,4,5)
b = np.array([1.0, 2.0, 3.0, 4.0]) # Create a 1-d array
b.shape # (4,)
a / b # Gives a ValueError
a / b[:, np.newaxis] # The result you want
You can read all about the broadcasting rules here. You can also use newaxis more than once if required. (e.g. to divide a shape (3,4,5,6) array by a shape (3,5) array).
From my understanding of the docs, using newaxis + broadcasting also avoids any unnecessary array copying.
Indexing, newaxis, etc. are now described more fully here. (The documentation has been reorganised since this answer was first posted.)
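For an arbitrary axis, the newaxis trick can be generalised by reshaping b so that it has length 1 on every axis except the one you divide along. The following is a sketch of one possible divide_along_axis built that way, assuming b is 1-d; it is not a built-in NumPy function. The arrays a4 and b2 illustrate the (3,4,5,6) by (3,5) case mentioned above with made-up values.

```python
import numpy as np

def divide_along_axis(a, b, axis):
    """Divide a by the 1-d array b along the given axis via broadcasting."""
    shape = [1] * a.ndim
    shape[axis] = b.shape[0]
    # b now has size 1 everywhere except along `axis`, so it broadcasts against a.
    return a / b.reshape(shape)

a = np.arange(1, 61, dtype=float).reshape((3, 4, 5))
b = np.array([1.0, 2.0, 3.0, 4.0])
result = divide_along_axis(a, b, axis=1)
print(np.array_equal(result, a / b[:, np.newaxis]))

# Using newaxis more than once: divide a (3,4,5,6) array by a (3,5) array.
a4 = np.ones((3, 4, 5, 6))
b2 = np.full((3, 5), 2.0)
result4 = a4 / b2[:, np.newaxis, :, np.newaxis]
print(result4.shape)
```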
I think you can get this behavior with numpy's usual broadcasting behavior:
In [9]: a = np.array([[1., 2.], [3., 4.]])
In [10]: a / np.sum(a, axis=0)
Out[10]:
array([[ 0.25 , 0.33333333],
[ 0.75 , 0.66666667]])
If I've interpreted correctly.
If you want the other axis you could transpose everything:
> a = np.random.randn(4,3).transpose()
> norms = np.apply_along_axis(np.linalg.norm,0,a)
> c = a / norms
> np.apply_along_axis(np.linalg.norm,0,c)
array([ 1., 1., 1., 1.])
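Another way to avoid the transpose, offered here as a sketch rather than as part of the answers above, is to keep the reduced axis with keepdims=True so the norms broadcast against the original array directly. The seeded generator is used only to make the example repeatable.

```python
import numpy as np

rng = np.random.default_rng(0)
a = rng.standard_normal((4, 3))

# keepdims=True leaves the norms with shape (4, 1) instead of (4,).
norms = np.linalg.norm(a, axis=1, keepdims=True)

# (4, 3) / (4, 1) broadcasts row-wise, so every row is divided by its own norm.
c = a / norms
print(np.linalg.norm(c, axis=1))
```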