Given...
a Matrix A of shape [m, n]
a tensor I of shape [m]
I want to get a list J of elements from A where
J[i] = A[i, I[i]].
That is, I holds the index of the element to select from each row in A.
Context: I already have the argmax(A, 1) and now I also want the max.
I know that I can just use reduce_max.
And after trying around for a bit I also came up with this:
J = tf.gather_nd(A,
tf.transpose(tf.pack([tf.to_int64(tf.range(A.get_shape()[0])), I])))
Where the to_int64 is needed because range only produces int32 and argmax only produces int64.
None of the two strike me as particularly elegant.
One has runtime overhead (probably about factor n) and the other has an unknown factor cognitive overhead. Am I missing something here?
The gather() function provides a way to do it:
r = tf.random.uniform([4,5],0, 9, dtype=tf.int32)
i = tf.random.uniform([4], 0, 4, dtype=tf.int32)
tf.gather(r, i, axis=1, batch_dims=1)
This is a rather late answer, but could doing
mask = tf.one_hot(I, depth=n, dtype=tf.bool, on_value=True, off_value=False)
elements = tf.boolean_mask(A, mask)
Accomplish what you're looking for?
edit: I should point out that this is NOT a good idea if A is already a very large tensor, as this ends up making a dense matrix.
Link provided by #yaroslav-bulatov mentiones this solution:
def get_elements(data, indices):
indeces = tf.range(0, tf.shape(indices)[0])*data.shape[1] + indices
return tf.gather(tf.reshape(data, [-1]), indeces)
Your solution is not currently differentiable (because gradients for tf.gather_nd are not currently supported).
Hopefully, data[:, indices] will be introduced soon.
Related
I have the following setup:
B = Batchsize
N = Number of Objects
T = Number of Targets
L = Length of feature embedding per target
For each object, I want to attend to a target. The model decides which target to attend to by taking the argmax of a vector attention_weights with shape=[B,N,T]:
pick = tf.math.argmax(attention_weights, axis=2)
So pick has the shape [B,N] and each entry is an index. Now I would like to use these indices to access the correct target features
target_features.set_shape(target_features, [B, D, L])
features_picked = tf.some_function(target_features, pick)
My question is, what to use for tf.some_function? Is it something related to tf.gather? I have trouble figuring out how to use it in this case.
Many thanks in advance for any help!
PS: I am using tf.version = '1.13.1'
I came up with the following solution, I will accept the answer once I confirmed it's doing what it should:
I tiled target_features so it has shape [B, N, T, L]
Then I do:
features_picked = tf.batch_gather(target_features, indices=pick)
where features_picked has shape [B, N, 1, L]
Lets say, that we do want to process images (or ndim vectors) using Keras/TensorFlow.
And we want, for fancy regularization, to shift each input by a random number of positions to the left (owerflown portions reappearing at the right side ).
How could it be viewed and solved:
1)
Is there any variation to numpy roll function for TensorFlow?
2)
x - 2D tensor
ri - random integer
concatenate(x[:,ri:],x[:,0:ri], axis=1) #executed for each single input to the layer, ri being random again and again (I can live with random only for each batch)
In TensorFlow v1.15.0 and up, you can use tf.roll which works just like numpy roll. https://github.com/tensorflow/tensorflow/pull/14953 .
To improve on the answer above you can do:
# size of x dimension
x_len = tensor.get_shape().as_list()[1]
# random roll amount
i = tf.random_uniform(shape=[1], maxval=x_len, dtype=tf.int32)
output = tf.roll(tensor, shift=i, axis=[1])
For older versions starting from v1.6.0 you will have to use tf.manip.roll :
# size of x dimension
x_len = tensor.get_shape().as_list()[1]
# random roll amount
i = tf.random_uniform(shape=[1], maxval=x_len, dtype=tf.int32)
output = tf.manip.roll(tensor, shift=i, axis=[1])
I just had to do this myself, and I don't think there is a tensorflow op to do np.roll unfortunately. Your code above looks basically correct though, except it doesn't roll by ri, rather by (x.shape[1] - ri).
Also you need to be careful in choosing your random integer that it is from range(1,x.shape[1]+1) rather than range(0,x.shape[1]), as if ri was 0, then x[:,0:ri] would be empty.
So what I would suggest would be something more like (for rolling along dimension 1):
x_len = x.get_shape().as_list()[1]
i = np.random.randint(0,x_len) # The amount you want to roll by
y = tf.concat([x[:,x_len-i:], x[:,:x_len-i]], axis=1)
EDIT: added missing colon after hannes' correct comment.
Suppose that I have tensors x[i,j,k] and y[p,q] in a graph. What is the correct way to specify the tensor z[i,j,k,p,q] = x[i,j,k]y[p,q]? This is the coordinate representation of the tensor product of x and y. I can get the job done using a combination of tf.expand_dims, tf.mult and tf.tile, but I feel like there should be a better way...
I think you can get away without the tile operation using broadcasting.
x_reshaped = tf.reshape(x, (i, j, k, 1, 1))
y_reshaped = tf.reshape(y, (1, 1, 1, p, q))
z = x_reshaped * y_reshaped
When a dimension has size 1 and does not match the size of the other tensor's dimensions it is being multiplied with, it is copied / broadcasted automatically along that dimension and the product is carried out. Tile is often unnecessary. I actually don't think I have ever even used tile in tensorflow. Here I also used reshape rather than expand_dims but the result is the same either way.
I have a 1000x784 matrix of data (10000 examples and 784 features) called X_valid and I'd like to apply the following function to each row in this matrix and get the numerical result:
def predict_prob(x_valid, cov, mean, prior):
return -0.5 * (x_valid.T.dot(np.linalg.inv(cov)).dot(x_valid) + mean.T.dot(
np.linalg.inv(cov)).dot(mean) + np.linalg.slogdet(cov)[1]) + np.log(
prior)
(x_valid is simply a row of data). I'm using numpy's vectorize to do this with the following code:
v_predict_prob = np.vectorize(predict_prob)
scores = v_predict_prob(X_valid, covariance[num], means[num], priors[num])
(covariance[num], means[num], and priors[num] are just constants.)
However, I get the following error when running this:
File "problem_5.py", line 48, in predict_prob
return -0.5 * (x_valid.T.dot(np.linalg.inv(cov)).dot(x_valid) + mean.T.dot(np.linalg.inv(cov)).dot(mean) + np.linalg.slogdet(cov)[1]) + np.log(prior)
AttributeError: 'numpy.float64' object has no attribute 'dot'
That is, it's not passing in each row of the matrix individually. Instead, it is passing in each entry of the matrix (not what I want).
How can I alter this to get the desired behavior?
vectorize is NOT a general substitute for iteration, nor does it claim to be faster. It mainly streamlines access to the numpy broadcasting functionality. In general the function that you vectorize will take scalar inputs, not rows or 1d arrays.
I don't think there is a way of configuring vectorize to pass an array to your function as opposed to an item.
You describe x_valid as 2d that you want to evaluate row by row. And the other terms as 'constants' which you select with [num]. What shape are those constants?
You function treats a lot of these terms as 2d arrays:
x_valid.T.dot(np.linalg.inv(cov)).dot(x_valid) +
mean.T.dot(np.linalg.inv(cov)).dot(mean) +
np.linalg.slogdet(cov)[1]) + np.log(prior)
x_valid.T is meaningful only if x_valid is 2d. If it is 1d, the transpose does noting.
np.linalg.inv(cov) only makes sense if cov is 2d.
mean.T.dot... assumes mean is 2d.
np.linalg.slogdet(cov)[1] assumes np.linalg.slogdet(cov) has 2 or more elements (or rows).
You need to show us that the function works with some real arrays before jumping into iteration or 'vectorize'.
I suggest just using a for loop:
def v_predict_prob(X_valid, c, m, p):
out = []
for row in X_valid:
out.append(predict_prob(row, c, m, p))
return np.array(out)
Under the hood np.vectorize is doing the same thing: http://docs.scipy.org/doc/numpy-1.10.1/reference/generated/numpy.vectorize.html
I know this question is a bit outdated, but I thought I would provide an answer for 2020.
Since the release of numpy 1.12, there is a new optional argument, "signature", which should allow 2D array functionality in most cases. Additionally, you will want to "exclude" the constants since they will not be vectorized.
All you would need to change is:
v_predict_prob = np.vectorize(predict_prob, exclude=['cov', 'mean', 'prior'], signature='(n)->()')
This signifies that the function should expect an n-dim array and output a scalar, and cov, mean, and prior will not be vectorized.
I'm making the transition from MATLAB to Numpy and feeling some growing pains.
I have a 3D array, lets say it's 3x3x3 and I want the scalar sum of each plane.
In matlab, I would use:
sum_vec = sum(3dArray,3);
TIA
wbg
EDIT: I was wrong about my matlab code. Matlab only vectorizes in one dim, so a loop wold be required. So numpy turns out to be more elegant...cool.
MATLAB
for i = 1:3
sum_vec(i) = sum(sum(3dArray(:,:,i));
end
You can do
sum_vec = np.array([plane.sum() for plane in cube])
or simply
sum_vec = cube.sum(-1).sum(-1)
where cube is your 3d array. You can specify 0 or 1 instead of -1 (or 2) depending on the orientation of the planes. The latter version is also better because it doesn't use a Python loop, which usually helps to improve performance when using numpy.
You should use the axis keyword in np.sum. Like in many other numpy functions, axis lets you perform the operation along a specific axis. For example, if you want to sum along the last dimension of the array, you would do:
import numpy as np
sum_vec = np.sum(3dArray, axis=-1)
And you'll get a resulting 2D array which corresponds to the sum along the last dimension to all the array slices 3dArray[i, k, :].
UPDATE
I didn't understand exactly what you wanted. You want to sum over two dimensions (a plane). In this case you can do two sums. For example, summing over the first two dimensions:
sum_vec = np.sum(np.sum(3dArray, axis=0), axis=0)
Instead of applying the same sum function twice, you may perform the sum on the reshaped array:
a = np.random.rand(10, 10, 10) # 3D array
b = a.view()
b.shape = (a.shape[0], -1)
c = np.sum(b, axis=1)
The above should be faster because you only sum once.
sumvec= np.sum(3DArray, axis=2)
or this works as well
sumvec=3DArray.sum(2)
Remember Python starts with 0 so axis=2 represent the 3rd dimension.
https://docs.scipy.org/doc/numpy-1.13.0/reference/generated/numpy.sum.html
If you're trying to sum over a plane (and avoid loops, which is always a good idea) you can use np.sum and pass two axes as a tuple for your argument.
For example, if you have an (nx3x3) array then using
np.sum(a, (1,2))
Will give an (nx1x1), summing over a plane, not a single axis.