How to properly select an area inside a NumPy ndarray

How do I properly select a specific area inside a NumPy ndarray? For example, in the sample code below I want to select the 3x2 submatrix at the intersections of the columns and rows of matrix M listed in a and b respectively:
import numpy as np
M = np.random.rand(4,5)
print(M)
a = [0, 2]
b = [0, 2, 3]
Selection = M[a,b]
but I am getting:
IndexError: shape mismatch: indexing arrays could not be broadcast
together with shapes (2,) (3,)
when I want from matrix M:
[[0.36899449 0.02531732 0.04966994 0.66058884 0.26193009]
[0.92893864 0.10193024 0.74850916 0.72822403 0.09112129]
[0.28863096 0.45470087 0.01032583 0.30931807 0.42765045]
[0.59819051 0.94057773 0.95352287 0.81818564 0.24220261]]
To get:
[[0.36899449 0.04966994]
[0.28863096 0.01032583]
[0.59819051 0.95352287]]
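
One possible way to get this selection, offered as a sketch rather than the only approach, is np.ix_, which turns the index lists into an open mesh so that every row index is paired with every column index. The example assumes that b holds the row indices and a the column indices, which is what the desired output implies, and it hardcodes the printed M so the result can be compared directly.

```python
import numpy as np

# The matrix printed in the question, hardcoded for reproducibility.
M = np.array([[0.36899449, 0.02531732, 0.04966994, 0.66058884, 0.26193009],
              [0.92893864, 0.10193024, 0.74850916, 0.72822403, 0.09112129],
              [0.28863096, 0.45470087, 0.01032583, 0.30931807, 0.42765045],
              [0.59819051, 0.94057773, 0.95352287, 0.81818564, 0.24220261]])
a = [0, 2]      # column indices (assumed from the desired output)
b = [0, 2, 3]   # row indices (assumed from the desired output)

# np.ix_ builds index arrays of shapes (3, 1) and (1, 2) that broadcast to (3, 2).
selection = M[np.ix_(b, a)]

# Equivalent manual form: make the row indices a column so they broadcast against a.
selection_manual = M[np.array(b)[:, np.newaxis], a]

print(selection)
```

M[a, b] fails because NumPy tries to pair the two index lists element by element, and shapes (2,) and (3,) cannot be broadcast together.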

Related

How can one utilize the indices provided by torch.topk()?

Suppose I have a pytorch tensor x of shape [N, N_g, 2]. It can be viewed as N * N_g 2d vectors. Specifically, x[i, j, :] is the 2d vector of the jth group in the ith batch.
Now I am trying to get the coordinates of the 5 longest vectors in each group. So I tried the following:
(i) First I used x_len = (x**2).sum(dim=2).sqrt() to compute their lengths, resulting in x_len.shape==[N, N_g].
(ii) Then I used tk = x_len.topk(5) to get the top 5 lengths in each group.
(iii) The desired output would be a tensor x_top5 of shape [N, 5, 2]. Naturally I thought of using tk.indices to index x so as to obtain x_top5. But I failed as it seems such indexing is not supported.
How can I do this?
A minimal example:
import torch
x = torch.randn(10,10,2) # N=10 is the batchsize, N_g=10 is the group size
x_len = (x**2).sum(dim=2).sqrt()
tk = x_len.topk(5)
x_top5 = x[tk.indices]
print(x_top5.shape)
# torch.Size([10, 5, 10, 2])
However, this gives x_top5 as a tensor of shape [10, 5, 10, 2], instead of [10, 5, 2] as desired.
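
A minimal sketch of two ways that should give the [N, 5, 2] shape, assuming PyTorch is installed and that the indices returned by topk refer to dim 1 of x (topk works on the last dim of x_len by default). The names idx, batch and x_top5_alt are introduced here for illustration only.

```python
import torch

x = torch.randn(10, 10, 2)              # N=10 batches, N_g=10 groups, 2d vectors
x_len = (x ** 2).sum(dim=2).sqrt()      # shape [N, N_g]
tk = x_len.topk(5)                      # tk.indices has shape [N, 5]

# Option 1: gather along dim 1. The index must have as many dims as x,
# so each selected index is repeated across the trailing coordinate axis.
idx = tk.indices.unsqueeze(-1).expand(-1, -1, x.size(2))   # shape [N, 5, 2]
x_top5 = x.gather(1, idx)                                   # shape [N, 5, 2]

# Option 2: pair each row of indices with its own batch index via broadcasting.
batch = torch.arange(x.size(0)).unsqueeze(1)               # shape [N, 1]
x_top5_alt = x[batch, tk.indices]                          # shape [N, 5, 2]

print(x_top5.shape)
```

x[tk.indices] produces [10, 5, 10, 2] because a single index tensor only indexes dim 0, so each entry of tk.indices pulls out a whole [10, 2] slice.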

Elementwise multiplication of NumPy arrays of different shapes

When I use numpy.multiply(a,b) to multiply numpy arrays with shapes (2, 1),(2,) I get a 2 by 2 matrix. But what I want is element-wise multiplication.
I'm not familiar with numpy's rules. Can anyone explain what's happening here?
When doing an element-wise operation between two arrays that do not have the same dimensionality, NumPy will perform broadcasting. In your case NumPy will broadcast b along the rows of a:
import numpy as np
a = np.array([[1],
              [2]])
b = np.array([3, 4])  # an ndarray, so that b[:, np.newaxis] works
print(a * b)
Gives:
[[3 4]
[6 8]]
To prevent this, you need to make a and b of the same dimensionality. You can add dimensions to an array by using np.newaxis or None in your indexing, like this:
print(a * b[:, np.newaxis])
Gives:
[[3]
[8]]
Let's say you have two arrays, a and b, with shape (2,3) and (2,) respectively:
a = np.random.randint(10, size=(2,3))
b = np.random.randint(10, size=(2,))
The two arrays, for example, contain:
a = np.array([[8, 0, 3],
              [2, 6, 7]])
b = np.array([7, 5])
Now, to compute the element-wise product a*b, you have to tell NumPy how to handle the missing axis 1 of array b. You can do so by adding None to the index:
result = a*b[:,None]
With result being:
array([[56,  0, 21],
       [10, 30, 35]])
Here are the input arrays a and b of the same shape as you mentioned:
In [136]: a
Out[136]:
array([[0],
       [1]])
In [137]: b
Out[137]: array([0, 1])
Now, when we do multiplication using either * or numpy.multiply(a, b), we get:
In [138]: a * b
Out[138]:
array([[0, 0],
       [0, 1]])
The result is a (2,2) array because numpy uses broadcasting.
#        b
# a |   0     1
# --+------------
# 0 |  0*0   0*1
# 1 |  1*0   1*1
I just explained the broadcasting rules in "broadcasting arrays in numpy".
In your case
(2,1) + (2,) => (2,1) + (1,2) => (2,2)
It has to add a dimension to the 2nd argument, and can only add it at the beginning (to avoid ambiguity).
So if you want a (2,1) result, you have to expand the 2nd argument yourself, with reshape or [:, np.newaxis].
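
The shape rule above can be checked directly. This is a small sketch using the values from the first answer; the variable names outer, elementwise and flat are chosen here for illustration.

```python
import numpy as np

a = np.array([[1], [2]])       # shape (2, 1)
b = np.array([3, 4])           # shape (2,)

# b is treated as shape (1, 2), so (2, 1) * (1, 2) broadcasts to (2, 2).
outer = a * b

# Reshaping b to (2, 1) makes the shapes equal, so the product is element-wise.
elementwise = a * b.reshape(2, 1)

# Alternatively, flatten a to get a 1-D element-wise product of shape (2,).
flat = a.ravel() * b

print(outer.shape, elementwise.shape, flat.shape)
```

Which of the two element-wise forms is preferable depends on whether the result should stay a (2, 1) column or become a plain 1-D array.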

The representation of a vector in NumPy: is the return value of np.array() a row vector?

I am new to NumPy. I guess that np.array() returns a row vector, because the dot product between two vectors is commutative. Is my guess right? Any response is appreciated.
vx = np.array([1, 2])
vw = np.array([3, 5])
np.dot(vx, vw)
np.dot(vw, vx)
The arrays are 1d ('vectors', not row/column vectors).
First paragraph from the dot documentation:
For 2-D arrays it is equivalent to matrix multiplication, and for 1-D
arrays to inner product of vectors (without complex conjugation). For
N dimensions it is a sum product over the last axis of a and
the second-to-last of b
So you are getting the inner product, which is commutative.
In [118]: vx = np.array([1, 2])
In [119]: vx.shape
Out[119]: (2,)
dot returns a scalar:
In [120]: np.dot(vx,vx)
Out[120]: 5
For a 2d 'row vector', shape matters. dot is matrix multiplication, and the last dim of the first argument has to match the 2nd-to-last dim of the second, e.g. 2 matches with 2.
In [121]: vx2 = np.array([[1,2]])
In [122]: vx2.shape
Out[122]: (1, 2)
In [123]: np.dot(vx2, vx2)
...
ValueError: shapes (1,2) and (1,2) not aligned: 2 (dim 1) != 1 (dim 0)
In [124]: np.dot(vx2, vx2.T)
Out[124]: array([[5]])
In this case the result is 2d (1,1).
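
To make the distinction concrete, here is a short sketch that contrasts the 1d case with explicit 2d row and column vectors. The names row, col, row_col and col_row are introduced only for this example.

```python
import numpy as np

vx = np.array([1, 2])          # 1-D, shape (2,)
vw = np.array([3, 5])

inner = np.dot(vx, vw)         # 1*3 + 2*5 = 13, same as np.dot(vw, vx)

row = vx[np.newaxis, :]        # explicit row vector, shape (1, 2)
col = vw[:, np.newaxis]        # explicit column vector, shape (2, 1)

row_col = np.dot(row, col)     # (1, 2) x (2, 1) -> shape (1, 1)
col_row = np.dot(col, row)     # (2, 1) x (1, 2) -> shape (2, 2), an outer product

print(inner, row_col.shape, col_row.shape)
```

With 2d operands the order matters, which is why the commutativity seen with 1d arrays says nothing about row or column orientation.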

How to map a tensor to its indices in TensorFlow

Suppose I have a 2D tensor with shape (size, size), and I want to get two new tensors containing the original tensor's row indices and column indices.
So if size is 2, I want to get
[[0, 0], [1, 1]] and [[0, 1], [0, 1]]
What's tricky is that size is another tensor whose value can only be known when running the graph in a tensorflow Session.
How can I do this in tensorflow?
Seems like you are looking for tf.meshgrid.
Here's an example:
shape = tf.shape(matrix)
R, C = tf.meshgrid(tf.range(shape[0]), tf.range(shape[1]), indexing='ij')
Here, matrix is your 2D tensor, and R and C contain your row and column indices, respectively. Note that this can be slightly simplified if your matrix is square (only one tf.range).
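
To see what the 'ij' indexing convention produces without running a TensorFlow graph, here is a NumPy analog. It assumes that tf.meshgrid with indexing='ij' follows the same convention as np.meshgrid, and it uses a fixed size of 2 to match the example in the question.

```python
import numpy as np

size = 2  # stands in for the tensor-valued size from the question

# With indexing='ij', the first output varies along rows and the second along columns.
R, C = np.meshgrid(np.arange(size), np.arange(size), indexing='ij')

print(R.tolist())  # row index of every element
print(C.tolist())  # column index of every element
```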

Slice 3D ndarray with 2D ndarray in numpy?

My apologies if this has been answered many times, but I just can't find a solution.
Assume the following code:
import numpy as np
A,_,_ = np.meshgrid(np.arange(5),np.arange(7),np.arange(10))
B = (np.random.rand(7,10)*5).astype(int)
How can I slice A using B so that B represents the indexes in the first and last dimensions of A (i.e. A[magic] = B)?
I have tried
A[:,B,:] which doesn't work due to peculiarities of advanced indexing.
A[:,B,np.arange(10)] generates 7 copies of the matrix I'm after
A[np.arange(7),B,np.arange(10)] gives the error:
ValueError: shape mismatch: objects cannot be broadcast to a single shape
Any other suggestions?
These both work:
A[0, B, 0]
A[B, B, B]
Really, only the B in axis 1 matters; the others can be any index that will broadcast to B.shape, limited by A.shape[0] (for axis 0) and A.shape[2] (for axis 2). For a ridiculous example:
A[list(range(7)) + list(range(3)), B, list(range(9,-1, -1))]
But you don't want to use : because then you'll get, as you said, 7 or 10 (or both!) "copies" of the array you want.
A, _, _ = np.meshgrid(np.arange(5),np.arange(7),np.arange(10))
B = (np.random.rand(7,10)*A.shape[1]).astype(int)
np.allclose(B, A[0, B, 0])
#True
np.allclose(B, A[B, B, B])
#True
# In Python 3, range objects must be converted to lists before concatenating
np.allclose(B, A[list(range(7)) + list(range(3)), B, list(range(9,-1, -1))])
#True
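
The attempt A[np.arange(7),B,np.arange(10)] from the question can be made to work by giving the first index a trailing axis so that all three indices broadcast to B.shape. This is a sketch under the assumption that each position (i, k) of the result should come from A[i, B[i, k], k]; the names rows, cols and picked are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)

# With the default 'xy' indexing, A has shape (7, 5, 10) and A[i, j, k] == j.
A, _, _ = np.meshgrid(np.arange(5), np.arange(7), np.arange(10))
B = rng.integers(0, A.shape[1], size=(7, 10))

rows = np.arange(7)[:, np.newaxis]   # shape (7, 1)
cols = np.arange(10)                 # shape (10,)

# (7, 1), (7, 10) and (10,) broadcast to (7, 10): picked[i, k] == A[i, B[i, k], k]
picked = A[rows, B, cols]

print(picked.shape)
```

For this particular A the result equals B regardless of which valid indices are used on axes 0 and 2, but pairing each position with its own row and column is the form that generalizes to arrays whose values depend on all three axes.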