For a given NumPy array, it is easy to perform a "normal" sum along one dimension. For example:
X = np.array([[1, 0, 0], [0, 2, 2], [0, 0, 3]])
X.sum(0)
=array([1, 2, 5])
X.sum(1)
=array([1, 4, 3])
Instead, is there an "efficient" way of computing the bitwise OR along one dimension of an array similarly? Something like the following, except without requiring for-loops or nested function calls.
Example: bitwise OR along the zeroth dimension, as I am currently doing it:
=array([1, 2, 3])
What I would like:
=array([1, 2, 3])

Use the reduce method of the numpy.bitwise_or ufunc:
numpy.bitwise_or.reduce(X, axis=whichever_one_you_wanted)
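As a quick check of the ufunc reduce approach, here is a minimal sketch using the X from the question plus a second array Y (Y is my own illustrative data, not from the question) where the two axes give different results:

```python
import numpy as np

X = np.array([[1, 0, 0], [0, 2, 2], [0, 0, 3]])
Y = np.array([[1, 2], [4, 8]])  # illustrative data, not from the question

# OR the rows together (collapses axis 0)
x_or_axis0 = np.bitwise_or.reduce(X, axis=0)
# OR the columns together (collapses axis 1)
x_or_axis1 = np.bitwise_or.reduce(X, axis=1)

y_or_axis0 = np.bitwise_or.reduce(Y, axis=0)  # [1|4, 2|8]
y_or_axis1 = np.bitwise_or.reduce(Y, axis=1)  # [1|2, 4|8]

print(x_or_axis0, x_or_axis1)
print(y_or_axis0, y_or_axis1)
```

The same .reduce method exists on other binary ufuncs, such as np.bitwise_and and np.bitwise_xor.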


How to return one NumPy array per partition in Dask?

I need to compute many NumPy arrays (that can be up to 4-dimensional), one for each partition of a Dask dataframe, and then add them as arrays. However, I'm struggling to make map_partitions return an array for each partition instead of a single array for all of them.
import dask.dataframe as dd
import numpy as np, pandas as pd
df = pd.DataFrame(range(15), columns=['x'])
ddf = dd.from_pandas(df, npartitions=3)
def func(partition):
    # Here I also tried returning the array in a list and in a tuple
    return np.array([[1, 2], [3, 4]])
# Here I tried all the options available for 'meta'
results = ddf.map_partitions(func).compute()
Then results is:
array([[1, 2],
       [3, 4],
       [1, 2],
       [3, 4],
       [1, 2],
       [3, 4]])
And if, instead, I do results.sum().compute() I get 30.
What I'd like to get is:
[np.array([[1, 2],[3, 4]]), np.array([[1, 2],[3, 4]]), np.array([[1, 2],[3, 4]])]
So that if I compute the sum, I get:
array([[ 3,  6],
       [ 9, 12]])
How can you achieve this result with Dask?
I managed to make it work like this, but I don't know if this is the best way:
from dask import delayed
results = []
for partition in ddf.partitions:
    result = delayed(func)(partition)
    results.append(result)
delayed(sum)(results).compute()
The result of the computation is:
array([[ 3,  6],
       [ 9, 12]])
You are right: a Dask array is usually viewed as a single logical array that just happens to be made of pieces. Since you are not using the logical layer, you could have done your work with delayed alone. On the other hand, the end result you want really is a sum over all the data, so an appropriate reshape followed by sum(axis=) may be even simpler:
ddf.map_partitions(func).compute_chunk_sizes().reshape(-1, 2, 2).sum(axis=0).compute()
(compute_chunk_sizes is needed because, although your original pandas dataframe had a known size, Dask has not yet evaluated your function and so does not know the sizes it returns.)
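The reshape-and-sum idea can be checked with plain NumPy, without Dask. This is only a sketch: it assumes each partition returns the same 2x2 block as in the question and that the blocks arrive concatenated along axis 0, as in the output the question shows.

```python
import numpy as np

block = np.array([[1, 2], [3, 4]])

# Three per-partition results concatenated along axis 0, shape (6, 2)
stacked = np.concatenate([block, block, block], axis=0)

# Recover one 2x2 array per partition, shape (3, 2, 2)
per_partition = stacked.reshape(-1, 2, 2)

# Add the per-partition arrays elementwise
total = per_partition.sum(axis=0)
print(total)
```

This only works when every partition returns an array of the same known shape; otherwise the delayed-based approach is the safer choice.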
However, given your setup, the following also works and is closer to your original attempt; see .to_delayed():
import dask
list_of_delayed = ddf.map_partitions(func).to_delayed().tolist()
tuple_of_np_lists = dask.compute(*list_of_delayed)
(tolist forces evaluating the contained delayed objects)

What is the difference between tf.scatter_add and tf.scatter_nd when indices is a matrix?

Both tf.scatter_add and tf.scatter_nd allow indices to be a matrix. It is clear from the documentation of tf.scatter_nd that the last dimension of indices contains values that are used to index a tensor of shape shape. The other dimensions of indices define the number of elements/slices to be scattered. Suppose updates has a rank N. First k dimensions of indices (except the last dimension) should match with first k dimensions of updates. The last (N-k) dimensions of updates should match with the last (N-k) dimensions of shape.
This implies that tf.scatter_nd can be used to perform an N-dimensional scatter. However, tf.scatter_add also takes matrices as indices, but it's not clear which dimensions of indices correspond to the number of scatters to be performed, or how these dimensions align with updates. Can someone provide a clear explanation, possibly with examples?
@shaunshd, I finally understand the relationship between the three tensor arguments of tf.scatter_nd_*(), especially when indices has multiple dimensions, e.g.:
indices = tf.constant([[0, 0, 0], [1, 1, 1], [2, 2, 2], [3, 3, 3], [3, 3, 2]], dtype=tf.int32)
Don't expect tf.rank(indices) > 2; in these tests tf.rank(indices) == 2 always holds.
The following test code shows a more complex case than the examples on TensorFlow's official website:
def testScatterNDUpdate(self):
    ref = tf.Variable(np.zeros(shape=[4, 4, 4], dtype=np.float32))
    indices = tf.constant([[0, 0, 0], [1, 1, 1], [2, 2, 2], [3, 3, 3], [3, 3, 2]], dtype=tf.int32)
    updates = tf.constant([1, 2, 3, 4, 5], dtype=tf.float32)
    # shape = (4, 4, 4)
    print(tf.tensor_scatter_nd_update(ref, indices, updates))
    print(ref.scatter_nd_update(indices, updates))
    # print(updates.shape[-1] == shape[-1], updates.shape[0] <= shape[0])
    # Conditions are:
    #   updates.shape[0] == indices.shape[0]
    #   indices.shape[1] <= len(shape)
    #   tf.rank(indices) == 2
You can also understand indices through the following pseudocode:
def scatter_nd_update(ref, indices, updates):
    for i in range(tf.shape(indices)[0]):
        ref[indices[i]] = updates[i]  # indices[i] is one coordinate into ref
    return ref
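The pseudocode above can be sketched as runnable NumPy. Note this is a NumPy analogue of the semantics, not TensorFlow's implementation; the function name scatter_nd_update_np is hypothetical, and it assumes each row of indices is a full coordinate into ref, as in the test case above.

```python
import numpy as np

def scatter_nd_update_np(ref, indices, updates):
    """Hypothetical NumPy analogue: write updates[i] at coordinate indices[i]."""
    out = ref.copy()  # leave the input untouched
    for i in range(indices.shape[0]):
        # Each row of indices is one coordinate (one entry per axis of ref)
        out[tuple(indices[i])] = updates[i]
    return out

ref = np.zeros((4, 4, 4), dtype=np.float32)
indices = np.array([[0, 0, 0], [1, 1, 1], [2, 2, 2], [3, 3, 3], [3, 3, 2]])
updates = np.array([1, 2, 3, 4, 5], dtype=np.float32)

result = scatter_nd_update_np(ref, indices, updates)
print(result[3, 3])  # last two updates land in this row
```

The number of rows in indices matches the number of updates, and the row length matches the rank of ref, which mirrors the conditions listed in the test comments.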
Compared with NumPy's fancy indexing, TensorFlow's indexing features are still difficult to use and follow a different style; they are not yet as unified as NumPy's. I hope the situation improves in TF 3.x.

Most efficient way to do this slice based multiplication in Tensorflow

I'm trying to perform an operation of multiplying a slice of a 2D matrix by a constant.
For example, suppose I want to multiply everything but the first 2 columns by 2.
To perform this in numpy, one could do:
a = np.array([[0,7,4],
              [1,6,4],
              [0,2,4],
              [4,2,7]])
a[:, 2:] = 2.0*a[:, 2:]
>> a
>> array([[ 0,  7,  8],
          [ 1,  6,  8],
          [ 0,  2,  8],
          [ 4,  2, 14]])
However, at least from what I've searched, TensorFlow currently doesn't have a straightforward way to do this.
My current solution is to create a originally as two separate Tensors a1 and a2, multiply the second one by 2.0, and then concatenate them along axis=1. The operation is simple enough that this is possible. However, I have two questions:
Is that the most efficient way to do this?
Is there a better (general/efficient) way to perform this to bring the functionality closer to numpy's slicing magic (perhaps
One option is to perform entrywise multiplication, as follows:
import tensorflow as tf
a = tf.Variable(initial_value=[[0,7,4],[1,6,4],[0,2,4],[4,2,7]])
b = tf.multiply(a,[1,1,2])
This prints
array([[ 0,  7,  8],
       [ 1,  6,  8],
       [ 0,  2,  8],
       [ 4,  2, 14]])
More generally, if a has more columns, you can do something like this:
import tensorflow as tf
a = tf.Variable(initial_value=[[0,7,4],[1,6,4],[0,2,4],[4,2,7]])
b = tf.multiply(a,[1,1]+[2 for i in range(a.get_shape()[1]-2)])
Or, if your matrix has many columns, you could replace
b = tf.multiply(a,[1,1]+[2 for i in range(a.get_shape()[1]-2)])
with
import numpy as np
b = tf.multiply(a,np.concatenate((np.array([1,1]),2*np.ones(a.get_shape()[1]-2))))
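The multiplier-vector idea can be sketched in plain NumPy to see how the broadcasting works. This is an illustration of the technique only, not TensorFlow code; the names n_keep and scale are my own, introduced for the example.

```python
import numpy as np

a = np.array([[0, 7, 4], [1, 6, 4], [0, 2, 4], [4, 2, 7]])

n_keep = 2  # hypothetical name: number of leading columns left unchanged
scale = 2   # hypothetical name: factor applied to the remaining columns

# One multiplier per column: 1 for the kept columns, scale for the rest
multiplier = np.ones(a.shape[1], dtype=a.dtype)
multiplier[n_keep:] = scale

# The 1-D multiplier broadcasts across every row of a
b = a * multiplier
print(b)
```

Because the multiplier has one entry per column, the same expression works for any number of rows, which is what makes the entrywise approach convenient in a graph framework where slice assignment is unavailable.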

Vectorise numpy code on demand

Suppose I have a very basic function in Python:
def f(x, y):
    return x + y
Then I can call this with scalars, f(1, 5.4) == 6.4 or with numpy vectors of arbitrary (but the same) shape. E.g. this works:
x = np.arange(3)
y = np.array([1,4,2.3])
f(x, y)
which gives an array with entries 1, 5, 4.3.
But what if f is more complicated? For example, xx and yy are 1D numpy arrays here.
def g(x, y):
    return np.sum((xx - x)**2 + (yy - y)**2)
(I hasten to add that I'm not interested in this specific g, but in general strategies...) Then g(5, 6) works fine, but if I want to pass numpy arrays, I seem to have to write a very different function with explicit broadcasting, etc. For example:
def gg(x, y):
    xfull = np.stack([x]*len(xx),axis=-1)
    yfull = np.stack([y]*len(xx),axis=-1)
    return np.sum((xfull - xx)**2 + (yfull - yy)**2, axis=-1)
This does now work with scalars and arrays. But it seems like a mess, and is hard to read.
Is there a better way?
def g(x, y):
    return np.sum((xx - x)**2 + (yy - y)**2)
My first questions are:
Is this written with scalar x and y in mind?
What are xx and yy? You say 1d arrays. Are they the same length?
Why aren't they parameters? Because in this context they are fixed?
In words, this offsets xx and yy by constant amounts and takes the sum of their squares, returning a single value?
My next step is to explore the 'broadcasting' limits of this expression. For example, it runs for any x that can be used in xx-x. That could be a 0d array, a one-element 1d array, an array with the same shape as xx, or anything else that can 'broadcast' with xx. That's where a thorough understanding of 'broadcasting' is essential.
xx-xx[:,None], though, produces a 2d array. np.sum as written takes the sum over all values, i.e. over the flattened array. Your gg suggests you want to sum on the last axis. If so, go ahead and put that in g:
def g(x, y):
    return np.sum((xx - x)**2 + (yy - y)**2, axis=-1)
Your use of stack in gg produces:
In [101]: xx
Out[101]: array([0, 1, 2, 3, 4])
In [103]: np.stack([np.arange(3)]*len(xx), axis=-1)
array([[0, 0, 0, 0, 0],
       [1, 1, 1, 1, 1],
       [2, 2, 2, 2, 2]])
I would have written that as x[:,None]
In [104]: xx-_
array([[ 0,  1,  2,  3,  4],
       [-1,  0,  1,  2,  3],
       [-2, -1,  0,  1,  2]])
In [105]: xx-np.arange(3)[:,None]
array([[ 0,  1,  2,  3,  4],
       [-1,  0,  1,  2,  3],
       [-2, -1,  0,  1,  2]])
That does not work with scalar x; but this does
np.array or np.asarray is commonly used at the start of numpy functions to accommodate scalar or list inputs. The Ellipsis (...) is handy when dealing with a variable number of dimensions. reshape(...,-1) and [...,None] are widely used to expand or generalize dimensions.
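Putting np.asarray and [..., None] together, one possible version of g that accepts scalars and arrays of any shape might look like the sketch below. The xx and yy values are sample data I made up for the illustration, and this is one way to do it rather than the only one.

```python
import numpy as np

# Sample fixed data (assumed for illustration; the question does not give values)
xx = np.arange(5)        # [0, 1, 2, 3, 4]
yy = np.arange(5) * 2    # [0, 2, 4, 6, 8]

def g(x, y):
    # asarray accepts scalars, lists and arrays; [..., None] appends a
    # trailing axis of length 1 that broadcasts against xx and yy
    x = np.asarray(x)[..., None]
    y = np.asarray(y)[..., None]
    # Sum only over the trailing axis, which runs along xx / yy
    return np.sum((xx - x)**2 + (yy - y)**2, axis=-1)

scalar_result = g(5, 6)
array_result = g(np.array([5, 0]), np.array([6, 0]))
grid_result = g(np.zeros((2, 3)), np.zeros((2, 3)))

print(scalar_result, array_result, grid_result.shape)
```

The result has the same shape as the broadcast inputs x and y, so the scalar call returns a single value and the (2, 3) call returns a (2, 3) array.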
I've learned a lot by looking at the Python code of numpy functions. I've also learned from years of work with MATLAB to be pedantic about dimensions. Keep track of intended and actual array shapes. It helps to use test shapes that will highlight errors. Test with a (2,3) array instead of an ambiguous (3,3) one.

Slicing a tensor by an index tensor in Tensorflow

I have the following two tensors (note that both are TensorFlow tensors, which means they are still effectively symbolic at the time I construct the slicing op below, before I launch a tf.Session()):
params: has shape (64,784, 256)
indices: has shape (64, 784)
and I want to construct an op that returns the following tensor:
output: has shape (64,784) where
output[i,j] = params_tensor[i,j, indices[i,j] ]
What is the most efficient way in Tensorflow to do so?
ps: I tried with tf.gather but couldn't make use of it to perform the operation I described above.
Many thanks.
You can get exactly what you want using tf.gather_nd. The final expression is:
tf.gather_nd(params, tf.stack([tf.tile(tf.expand_dims(tf.range(tf.shape(indices)[0]), 1), [1, tf.shape(indices)[1]]), tf.transpose(tf.tile(tf.expand_dims(tf.range(tf.shape(indices)[1]), 1), [1, tf.shape(indices)[0]])), indices], 2))
This expression has the following explanation:
tf.gather_nd does what you expected and uses the indices to gather the output from the params
tf.stack combines three separate tensors, the last of which is the indices. The first two tensors specify the ordering of the first two dimensions (axis 0 and axis 1 of params/indices)
For the example provided, this ordering is simply 0, 1, 2, ..., 63 for axis 0, and 0, 1, 2, ... 783 for axis 1. These sequences are obtained with tf.range(tf.shape(indices)[0]) and tf.range(tf.shape(indices)[1]), respectively.
For the example provided, indices has shape (64, 784). The other two tensors from the last point above need to have this same shape in order to be combined with tf.stack
First, an additional dimension/axis is added to each of the two sequences using tf.expand_dims.
The use of tf.tile and tf.transpose can be shown by example: assume the first two axes of params and indices have shape (5,3). We want the first tensor to be:
[[0, 0, 0], [1, 1, 1], [2, 2, 2], [3, 3, 3], [4, 4, 4]]
We want the second tensor to be:
[[0, 1, 2], [0, 1, 2], [0, 1, 2], [0, 1, 2], [0, 1, 2]]
These two tensors almost function like specifying the coordinates in a grid for the associated indices.
The final part of tf.stack combines the three tensors on a new third axis, so that the result has the same 3 axes as params.
Keep in mind that if you have more or fewer axes than in the question, you need to modify the number of coordinate-specifying tensors in tf.stack accordingly.
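The coordinate-grid construction described above can be mirrored in NumPy to check the shapes and the result. This is a NumPy sketch of the same idea, not the TensorFlow expression itself; the small (5, 3, 4) shape and the random data are assumptions chosen to match the (5,3) example.

```python
import numpy as np

rng = np.random.default_rng(0)
params = rng.integers(0, 100, size=(5, 3, 4))   # stand-in for (64, 784, 256)
indices = rng.integers(0, 4, size=(5, 3))       # stand-in for (64, 784)

# First coordinate tensor: row number repeated across columns
rows = np.tile(np.arange(5)[:, None], (1, 3))
# Second coordinate tensor: tile the column numbers, then transpose
cols = np.tile(np.arange(3)[:, None], (1, 5)).T

# Stack on a new last axis: coords[i, j] = (i, j, indices[i, j])
coords = np.stack([rows, cols, indices], axis=2)

# Gather one element of params per (i, j) coordinate triple
output = params[coords[..., 0], coords[..., 1], coords[..., 2]]

# Cross-check against NumPy's built-in equivalent
expected = np.take_along_axis(params, indices[..., None], axis=-1)[..., 0]
print(output.shape, np.array_equal(output, expected))
```

Here rows and cols play the role of the two tiled tf.range tensors, and the final indexing step plays the role of tf.gather_nd.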
What you want is like a custom reduction function. If you are keeping something like index of maximum value at indices then I would suggest using tf.reduce_max:
max_params = tf.reduce_max(params_tensor, reduction_indices=[2])
Otherwise, here is one way to get what you want (Tensor objects are not assignable, so we create a 2D list of tensors and stack it using tf.stack):
import tensorflow as tf
import numpy as np
with tf.Graph().as_default():
    params_tensor = tf.stack(np.random.randint(1,256, [5,5,10]).astype(np.int32))
    indices = tf.stack(np.random.randint(1,10,[5,5]).astype(np.int32))
    output = [ [None for j in range(params_tensor.get_shape()[1])] for i in range(params_tensor.get_shape()[0])]
    for i in range(params_tensor.get_shape()[0]):
        for j in range(params_tensor.get_shape()[1]):
            output[i][j] = params_tensor[i,j,indices[i,j]]
    output = tf.stack(output)
    with tf.Session() as sess:
        # Evaluate all three tensors in one run call
        params_tensor,indices,output = sess.run([params_tensor,indices,output])
        print(params_tensor)
        print(indices)
        print(output)
I know I'm late, but I recently had to do something similar, and was able to do it using Ragged Tensors:
output = tf.gather(params, tf.RaggedTensor.from_tensor(indices), batch_dims=-1, axis=-1)
Hope it helps