Numpy multidimensional advanced indexing

I have an array a with shape [3,x,y,z,n] (three 4d-images). And a second array b with shape [x,y,z] which contains the indices I want to choose from the first dimension of a (so the values of b are in the range 0 to 2).
The results I want to have would be of shape [x,y,z,n]. How can I do that in numpy?

Using advanced-indexing -
a[b,np.arange(x)[:,None,None],np.arange(y)[:,None],np.arange(z)]
A shorter way to express that would be -
a[(b, *np.ogrid[:x,:y,:z])]
Using the NumPy built-in np.take_along_axis, which does the advanced indexing under the hood -
np.take_along_axis(a,b[None,...,None],axis=0)[0]
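As a quick sanity check, the three expressions above can be compared against a plain loop. This is only a sketch: the sizes and the random data are made up for illustration, and the ogrid form is written with unpacking so it does not depend on whether np.ogrid returns a list or a tuple.

```python
import numpy as np

# Hypothetical small sizes, chosen only for illustration.
x, y, z, n = 2, 3, 4, 5
rng = np.random.default_rng(0)
a = rng.random((3, x, y, z, n))          # three 4D images
b = rng.integers(0, 3, size=(x, y, z))   # which image to pick at each voxel

# 1. Explicit index arrays that broadcast to shape (x, y, z).
r1 = a[b, np.arange(x)[:, None, None], np.arange(y)[:, None], np.arange(z)]

# 2. The same index arrays built by np.ogrid.
r2 = a[(b, *np.ogrid[:x, :y, :z])]

# 3. np.take_along_axis with b expanded to a.ndim dimensions.
r3 = np.take_along_axis(a, b[None, ..., None], axis=0)[0]

# Reference result from an explicit loop.
ref = np.empty((x, y, z, n))
for i in range(x):
    for j in range(y):
        for k in range(z):
            ref[i, j, k] = a[b[i, j, k], i, j, k]
```

All three results have shape (x, y, z, n) and match the loop version.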

Related

Simple question about slicing a Numpy Tensor

I have a Numpy Tensor,
X = np.arange(64).reshape((4,4,4))
I wish to grab the 2,3,4 entries of the first dimension of this tensor, which you can do with,
Y = X[[1,2,3],:,:]
Is there a simpler way of writing this than explicitly writing out the indices [1,2,3]? I tried something like [1,:], which gave me an error.
Context: for my real application, the shape of the tensor is something like (30000,100,100). I would like to grab the entries from index 10000 up to 30000 along the first dimension of this tensor.
The simplest way in your case is to use X[1:4]. This selects the same elements as X[[1,2,3]], but notice that with X[1:4] you only need one pair of brackets because 1:4 already represents a range of values.
For an N-dimensional array in NumPy, if you specify indices for fewer than N dimensions you get all elements of the remaining dimensions. That is, for N equal to 3, X[1:4] is the same as X[1:4, :, :] or X[1:4, :]. You only need to pass : explicitly when you want to index a dimension while taking all elements of a dimension that comes before it, as in X[:, 2:4].
If you wish to select from some row to the end of the array, simply use Python slicing notation as below:
X[10000:,:,:]
This will select all rows from index 10000 to the end of the array, with all columns and depths for them.
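The slicing forms above can be checked on the small tensor from the question. A minimal sketch; the variable names are only for illustration. One difference worth knowing: a slice gives a view of the original data, while indexing with a list makes a copy.

```python
import numpy as np

X = np.arange(64).reshape((4, 4, 4))

# Slice and list index select the same elements.
same_as_list = np.array_equal(X[1:4], X[[1, 2, 3], :, :])

# Trailing colons are optional.
same_with_colons = np.array_equal(X[1:4], X[1:4, :, :])

# Open-ended slice: from index 1 to the end of the first axis.
tail = X[1:]

# Basic slicing returns a view; list (advanced) indexing returns a copy.
slice_is_view = np.shares_memory(X[1:4], X)
list_is_copy = not np.shares_memory(X[[1, 2, 3]], X)
```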

K-mean across each axis of numpy array

I have a numpy array A with shape (N,M). I am looking to run the k-means algorithm (K=2) separately on each of the M columns and get the result C, which is an array with shape (K,M).
M can be a really large number. What is the fastest and most scalable way to compute C in Python?

How to apply numpy matrix operations on first two dimensions of 3D array

I have a 3D numpy array which I am using to represent a tuple of (square) matrices, and I'd like to perform a matrix operation on each of those matrices, corresponding to the first two dimensions of the array. For instance, if my list of matrices is [A,B,C] I would like to compute [A'A,B'B,C'C] where ' denotes the conjugate transpose.
The following code kinda sorta does what I'm looking for:
foo=np.array([[[1,1],[0,1]],[[0,1],[0,0]],[[3,0],[0,-2]]])
[np.matrix(j).H*np.matrix(j) for j in foo]
But I'd like to do this using vectorized operations instead of list comprehension.
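One possible vectorized form, as a sketch rather than a definitive answer: it assumes the matrices are stacked along the first axis, as in foo above, so each matrix occupies the last two axes. The names gram and ref are made up for this example.

```python
import numpy as np

foo = np.array([[[1, 1], [0, 1]], [[0, 1], [0, 0]], [[3, 0], [0, -2]]])

# Conjugate, swap the last two axes (a per-matrix transpose),
# then let @ multiply the matrices pairwise over the leading axis.
gram = foo.conj().swapaxes(-1, -2) @ foo

# Loop version with plain arrays, for comparison.
ref = np.array([m.conj().T @ m for m in foo])
```

For the first matrix [[1,1],[0,1]] this gives [[1,1],[1,2]].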

How to retain indices of a matrix while working on one of its submatrices?

I am trying to implement an algorithm that iteratively removes some rows and columns of a matrix and continues processing the remaining submatrix. However, I would like to know the index of a value in the original matrix rather than the remaining submatrix.
For example, assume that a matrix x is built using
x = np.arange(9).reshape(3, 3)
Now, I would like to find the index of the element that is equal to 8 in the submatrix defined below:
np.where(x[1:, 1:] == 8)
By default, numpy returns (array([1]), array([1])) because it is finding the element in the sliced submatrix. What I would like to be returned instead is (array([2]), array([2])), which is the index of 8 in the original matrix.
What is an efficient solution to this problem?
P.S.
The submatrix may be built arbitrarily. For example, I may need to keep rows 0 and 1 but columns 0 and 2.
Each submatrix may be sliced in next iterations to make a smaller submatrix. I still would like to have access to the index in the original matrix. In other words, I am looking for a solution that works on submatrices of submatrices as well.
I recently learned about indexing with arrays where submatrices of a matrix can be selected using another numpy array. I think what I can do to solve the problem is to map indices of the submatrix to elements of the indexing array.
For example, in the example above, the submatrix can be defined like this:
row_idx = np.array([1, 2])
col_idx = np.array([1, 2])
np.where(x[row_idx[:, None], col_idx] == 8)
This will still return the same (array([1]), array([1])) output, but I can use these indices to look up the elements of row_idx and col_idx in order to find the corresponding indices in the original matrix, i.e. row_idx[1] and col_idx[1].
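The lookup idea described above could be sketched like this. The names keep_r, keep_c, row_idx2 and col_idx2 are made up for the example; the point is that indexing the index arrays themselves keeps them expressed in the original matrix's coordinates, so the same lookup works for a submatrix of a submatrix.

```python
import numpy as np

x = np.arange(9).reshape(3, 3)
row_idx = np.array([1, 2])   # rows of x kept in the submatrix
col_idx = np.array([1, 2])   # columns of x kept in the submatrix

sub = x[row_idx[:, None], col_idx]
r, c = np.where(sub == 8)                        # positions inside sub
orig_rows, orig_cols = row_idx[r], col_idx[c]    # positions inside x

# Next iteration: keep only row 1 and column 1 of sub.
keep_r = np.array([1])
keep_c = np.array([1])
row_idx2, col_idx2 = row_idx[keep_r], col_idx[keep_c]  # still indices into x

sub2 = x[row_idx2[:, None], col_idx2]
r2, c2 = np.where(sub2 == 8)
orig_rows2, orig_cols2 = row_idx2[r2], col_idx2[c2]
```

Here orig_rows and orig_cols are both array([2]), and the same holds after the second reduction.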

Numpy sum over planes of 3d array, return a scalar

I'm making the transition from MATLAB to Numpy and feeling some growing pains.
I have a 3D array, let's say it's 3x3x3, and I want the scalar sum of each plane.
In matlab, I would use:
sum_vec = sum(3dArray,3);
TIA
wbg
EDIT: I was wrong about my MATLAB code. MATLAB only vectorizes in one dim, so a loop would be required. So numpy turns out to be more elegant...cool.
MATLAB
for i = 1:3
sum_vec(i) = sum(sum(3dArray(:,:,i)));
end
You can do
sum_vec = np.array([plane.sum() for plane in cube])
or simply
sum_vec = cube.sum(-1).sum(-1)
where cube is your 3d array. You can specify 0 or 1 instead of -1 (or 2) depending on the orientation of the planes. The latter version is also better because it doesn't use a Python loop, which usually helps to improve performance when using numpy.
You should use the axis keyword in np.sum. As in many other numpy functions, axis lets you perform the operation along a specific axis. For example, if you want to sum along the last dimension of the array (called arr here, since 3dArray is not a valid Python name), you would do:
import numpy as np
sum_vec = np.sum(arr, axis=-1)
And you'll get a 2D array whose entry [i, k] is the sum of the slice arr[i, k, :].
UPDATE
I didn't understand exactly what you wanted. You want to sum over two dimensions (a plane). In this case you can do two sums. For example, summing over the first two dimensions:
sum_vec = np.sum(np.sum(arr, axis=0), axis=0)
Instead of applying the same sum function twice, you may perform the sum on a reshaped array. This example flattens the last two dimensions, so it sums each plane a[i, :, :]:
a = np.random.rand(10, 10, 10) # 3D array
b = a.reshape(a.shape[0], -1) # shape (10, 100); no copy for a contiguous array
c = np.sum(b, axis=1)
The above should be faster because you only sum once.
sumvec = np.sum(arr, axis=2)
or this works as well
sumvec = arr.sum(2)
Here arr is the 3D array. Remember that Python indexing starts at 0, so axis=2 represents the 3rd dimension.
https://docs.scipy.org/doc/numpy-1.13.0/reference/generated/numpy.sum.html
If you're trying to sum over a plane (and avoid loops, which is usually a good idea) you can use np.sum and pass two axes as a tuple for the axis argument.
For example, if you have an array a of shape (n,3,3), then
np.sum(a, (1,2))
gives an array of shape (n,) (or (n,1,1) if you also pass keepdims=True), where each entry is the sum over one plane rather than over a single axis.
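A small sketch of the tuple-axis form on a 3x3x3 array, with made-up data from np.arange. Which pair of axes to pass depends on how the planes are oriented: (0, 1) matches the MATLAB loop over 3dArray(:,:,i), while (1, 2) sums the planes cube[i, :, :].

```python
import numpy as np

cube = np.arange(27).reshape(3, 3, 3)

# One scalar per plane cube[i, :, :].
per_first = cube.sum(axis=(1, 2))

# One scalar per plane cube[:, :, i], as in the MATLAB loop.
per_last = cube.sum(axis=(0, 1))

# keepdims=True keeps the summed axes with length 1.
kept = cube.sum(axis=(1, 2), keepdims=True)

# The chained form from the earlier answer gives the same values.
chained = cube.sum(-1).sum(-1)
```

With this data, per_first is [36, 117, 198] and per_last is [108, 117, 126].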