Can I select arbitrary windows from the last dimension of a numpy array? - numpy

I'd like to write a numpy function that takes an MxN array A, a window length L, and an MxP array idxs of starting indices into the M rows of A, and that selects P arbitrary slices of length L from each of the M rows of A. Ideally it would work on the last dimension of A regardless of how many dimensions A has, so all dims of A and idxs match except the last one. Examples:
If A is just 1D:
A = np.array([1, 2, 3, 4, 5, 6])
window_len = 3
idxs = np.array([1, 3])
result = magical_routine(A, idxs, window_len)
Where result is a 2x3 array since I selected 2 slices of len 3:
np.array([[ 2, 3, 4],
[ 4, 5, 6]])
If A is 2D:
A = np.array([[ 1, 2, 3, 4, 5, 6],
[ 7, 8, 9,10,11,12],
[13,14,15,16,17,18]])
window_len = 3
idxs = np.array([[1, 3],
[0, 1],
[2, 2]])
result = magical_routine(A, idxs, window_len)
Where result is a 3x2x3 array since there are 3 rows of A, and I selected 2 slices of len 3 from each row:
np.array([[[ 2, 3, 4], [ 4, 5, 6]],
[[ 7, 8, 9], [ 8, 9,10]],
[[15,16,17], [15,16,17]]])
And so on.
I have discovered a number of inefficient ways to do this, along with ways that work for a specific number of dimensions of A. For 2D, the following is pretty tidy:
col_idxs = np.add.outer(idxs, np.arange(window_len))
np.take_along_axis(A[:, np.newaxis], col_idxs, axis=-1)
I can't see a nice way to generalize this for 1D and other D's though...
Is anyone aware of an efficient way that generalizes to any number of dims?

For your 1d case
In [271]: A=np.arange(1,7)
In [272]: idxs = np.array([1,3])
Using the kind of iteration that this kind of question usually gets:
In [273]: np.vstack([A[i:i+3] for i in idxs])
Out[273]:
array([[2, 3, 4],
[4, 5, 6]])
Alternatively, generate all the indices up front and index once. linspace is handy for this (though it's not the only option):
In [278]: j = np.linspace(idxs,idxs+3,3,endpoint=False)
In [279]: j
Out[279]:
array([[1., 3.],
[2., 4.],
[3., 5.]])
In [282]: A[j.T.astype(int)]
Out[282]:
array([[2, 3, 4],
[4, 5, 6]])
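The float round trip through linspace can be avoided. Here is a minimal sketch, assuming integer start indices and windows that stay inside the array: broadcasting the starts against np.arange builds the same index grid directly as integers.

```python
import numpy as np

A = np.arange(1, 7)
idxs = np.array([1, 3])
window_len = 3

# Each row holds the indices of one window: start, start+1, start+2.
j = idxs[:, None] + np.arange(window_len)   # shape (2, 3), integer dtype
result = A[j]
print(result)
```

No transpose or astype(int) is needed here because the window offsets sit on the last axis from the start.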
For the 2d case:
In [284]: B
Out[284]:
array([[ 1, 2, 3, 4, 5, 6],
[ 7, 8, 9, 10, 11, 12],
[13, 14, 15, 16, 17, 18]])
In [285]: idxs = np.array([[1, 3],
...: [0, 1],
...: [2, 2]])
In [286]: j = np.linspace(idxs,idxs+3,3,endpoint=False)
In [287]: j
Out[287]:
array([[[1., 3.],
[0., 1.],
[2., 2.]],
[[2., 4.],
[1., 2.],
[3., 3.]],
[[3., 5.],
[2., 3.],
[4., 4.]]])
With a bit of trial and error, pair up the indices to get:
In [292]: B[np.arange(3)[:,None,None],j.astype(int).transpose(1,2,0)]
Out[292]:
array([[[ 2, 3, 4],
[ 4, 5, 6]],
[[ 7, 8, 9],
[ 8, 9, 10]],
[[15, 16, 17],
[15, 16, 17]]])
Or iterate as in the first case, but with an extra layer:
In [294]: np.array([[B[j,i:i+3] for i in idxs[j]] for j in range(3)])
Out[294]:
array([[[ 2, 3, 4],
[ 4, 5, 6]],
[[ 7, 8, 9],
[ 8, 9, 10]],
[[15, 16, 17],
[15, 16, 17]]])
With sliding windows:
In [295]: aa = np.lib.stride_tricks.sliding_window_view(A,3)
In [296]: aa.shape
Out[296]: (4, 3)
In [297]: aa
Out[297]:
array([[1, 2, 3],
[2, 3, 4],
[3, 4, 5],
[4, 5, 6]])
In [298]: aa[[1,3]]
Out[298]:
array([[2, 3, 4],
[4, 5, 6]])
and
In [300]: bb = np.lib.stride_tricks.sliding_window_view(B,(1,3))
In [301]: bb.shape
Out[301]: (3, 4, 1, 3)
In [302]: bb[np.arange(3)[:,None],idxs,0,:]
Out[302]:
array([[[ 2, 3, 4],
[ 4, 5, 6]],
[[ 7, 8, 9],
[ 8, 9, 10]],
[[15, 16, 17],
[15, 16, 17]]])
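The sliding-window idea might also extend to any number of dimensions. The following is only a sketch, assuming NumPy 1.20 or later (where sliding_window_view is available) and integer idxs whose leading dims match A; the name windows_via_view is made up for illustration.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def windows_via_view(A, idxs, window_len):
    # All windows along the last axis: shape (..., N - window_len + 1, window_len).
    all_windows = sliding_window_view(A, window_len, axis=-1)
    # Pick the requested starts along the second-to-last axis; the trailing
    # length-1 axis of the index broadcasts across the window dimension.
    return np.take_along_axis(all_windows, idxs[..., None], axis=-2)

B = np.arange(1, 19).reshape(3, 6)
idxs = np.array([[1, 3],
                 [0, 1],
                 [2, 2]])
print(windows_via_view(B, idxs, 3))
```

The view itself allocates no new data; only the final take_along_axis produces a new array.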

I got it! I was almost there:
def magical_routine(A, idxs, window_len=2000):
    col_idxs = np.add.outer(idxs, np.arange(window_len))
    return np.take_along_axis(A[..., np.newaxis, :], col_idxs, axis=-1)
I just needed to always add the new axis to A's second to last dim, and then leave remaining axes alone.
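As a sanity check, here is a small sketch that runs the same logic on a 1D and a 3D input and compares the 3D result against plain slicing. The name select_windows and the test arrays are made up for illustration.

```python
import numpy as np

def select_windows(A, idxs, window_len):
    # Hypothetical name; same logic as magical_routine above.
    col_idxs = np.add.outer(idxs, np.arange(window_len))
    return np.take_along_axis(A[..., np.newaxis, :], col_idxs, axis=-1)

A1 = np.arange(1, 7)
r1 = select_windows(A1, np.array([1, 3]), 3)          # shape (2, 3)

A3 = np.arange(2 * 3 * 6).reshape(2, 3, 6)
idxs3 = np.array([[[0, 3], [1, 2], [2, 2]],
                  [[3, 0], [1, 1], [0, 2]]])
r3 = select_windows(A3, idxs3, 3)                      # shape (2, 3, 2, 3)

# Compare against explicit slicing for every leading index.
for lead in np.ndindex(*idxs3.shape):
    start = idxs3[lead]
    assert np.array_equal(r3[lead], A3[lead[:-1]][start:start + 3])
```

The output shape is always idxs.shape + (window_len,), which is what the question asked for.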

Related

Transposing a Numpy Array on a slice

I have a 2d array and I am trying to create a 3d array in which each row is a repeated element, in this case 9 times, of the original array. I think this involves transposing on some kind of np slice... my numpy skills are a bit rough.
Here is an example:
input:
an_array = np.array([1,2,3,4,5,6])
a = an_array.reshape(3,2)
a
array([[1, 2],
[3, 4],
[5, 6]])
My desired output is as follows:
array([[[1, 1, 1, 1, 1, 1, 1, 1, 1],
[2, 2, 2, 2, 2, 2, 2, 2, 2]],
[[3, 3, 3, 3, 3, 3, 3, 3, 3],
[4, 4, 4, 4, 4, 4, 4, 4, 4]],
[[5, 5, 5, 5, 5, 5, 5, 5, 5],
[6, 6, 6, 6, 6, 6, 6, 6, 6]]])
This was my idea, but it does not quite give the desired output. The rows are in the wrong order, and the shape is (2,3,9) instead of (3,2,9). That is an easy issue to resolve, but I was wondering if there might be a quick way to do this directly?
new = np.transpose([a[:]]*9)
new
array([[[1, 1, 1, 1, 1, 1, 1, 1, 1],
[3, 3, 3, 3, 3, 3, 3, 3, 3],
[5, 5, 5, 5, 5, 5, 5, 5, 5]],
[[2, 2, 2, 2, 2, 2, 2, 2, 2],
[4, 4, 4, 4, 4, 4, 4, 4, 4],
[6, 6, 6, 6, 6, 6, 6, 6, 6]]])
In [102]: arr = np.arange(1,7).reshape(3,2)
In [103]: arr
Out[103]:
array([[1, 2],
[3, 4],
[5, 6]])
You didn't show the middle step, but I assume it's:
In [104]: arr.repeat(4)
Out[104]:
array([1, 1, 1, 1, 2, 2, 2, 2, 3, 3, 3, 3, 4, 4, 4, 4, 5, 5, 5, 5, 6, 6,
6, 6])
In [105]: arr.repeat(4).reshape(3,2,4)
Out[105]:
array([[[1, 1, 1, 1],
[2, 2, 2, 2]],
[[3, 3, 3, 3],
[4, 4, 4, 4]],
[[5, 5, 5, 5],
[6, 6, 6, 6]]])
transpose of that is easy, and relatively cheap (in numpy)
In [106]: arr.repeat(4).reshape(3,2,4).transpose(1,0,2)
Out[106]:
array([[[1, 1, 1, 1],
[3, 3, 3, 3],
[5, 5, 5, 5]],
[[2, 2, 2, 2],
[4, 4, 4, 4],
[6, 6, 6, 6]]])
repeat lets you specify the axis. You want to expand on a new last axis, but you also want a (2,3) instead of (3,2), so start with transpose
In [4]: arr.T[:,:,None]
Out[4]:
array([[[1],
[3],
[5]],
[[2],
[4],
[6]]])
In [5]: arr.T[:,:,None].repeat(4,axis=2)
Out[5]:
array([[[1, 1, 1, 1],
[3, 3, 3, 3],
[5, 5, 5, 5]],
[[2, 2, 2, 2],
[4, 4, 4, 4],
[6, 6, 6, 6]]])
or for the (3,2,n) directly:
In [9]: arr[:,:,None].repeat(4,axis=2)
Out[9]:
array([[[1, 1, 1, 1],
[2, 2, 2, 2]],
[[3, 3, 3, 3],
[4, 4, 4, 4]],
[[5, 5, 5, 5],
[6, 6, 6, 6]]])
All these manipulations are relatively cheap so they can be mixed and matched as needed.
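The "new last axis plus repeat" step can be wrapped so it does not depend on the number of dimensions. This is only a sketch; repeat_last is a hypothetical helper name, not something from numpy.

```python
import numpy as np

def repeat_last(arr, n):
    # Hypothetical helper: append a new trailing axis and repeat along it.
    return np.repeat(arr[..., np.newaxis], n, axis=-1)

arr = np.arange(1, 7).reshape(3, 2)
out = repeat_last(arr, 9)          # shape (3, 2, 9)
out_t = repeat_last(arr.T, 9)      # shape (2, 3, 9)
print(out.shape, out_t.shape)
```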
Reshape and use broadcasting:
a_out = a.reshape(3,-1,1) * np.ones(9)
Out[40]:
array([[[1., 1., 1., 1., 1., 1., 1., 1., 1.],
[2., 2., 2., 2., 2., 2., 2., 2., 2.]],
[[3., 3., 3., 3., 3., 3., 3., 3., 3.],
[4., 4., 4., 4., 4., 4., 4., 4., 4.]],
[[5., 5., 5., 5., 5., 5., 5., 5., 5.],
[6., 6., 6., 6., 6., 6., 6., 6., 6.]]])
Or
a_out = np.broadcast_to(np.reshape(a, (3,-1,1)), (3,2,9))
Out[43]:
array([[[1, 1, 1, 1, 1, 1, 1, 1, 1],
[2, 2, 2, 2, 2, 2, 2, 2, 2]],
[[3, 3, 3, 3, 3, 3, 3, 3, 3],
[4, 4, 4, 4, 4, 4, 4, 4, 4]],
[[5, 5, 5, 5, 5, 5, 5, 5, 5],
[6, 6, 6, 6, 6, 6, 6, 6, 6]]])
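One difference between the two variants is worth spelling out. Multiplying by np.ones(9) allocates a new float array (note the 1., 2., ... in Out[40]), while np.broadcast_to returns a read-only view that keeps the integer dtype. A small sketch of what that means in practice:

```python
import numpy as np

a = np.arange(1, 7).reshape(3, 2)

view = np.broadcast_to(a[:, :, None], (3, 2, 9))
print(view.flags.writeable)   # False: broadcast_to returns a read-only view
print(view.strides[-1])       # 0: the repeated axis takes no extra memory

writable = view.copy()        # materialize before modifying in place
writable[0, 0, 0] = 99
```

If the result only needs to be read, the view is the cheaper choice; if it will be modified, copy it first.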
Try transposing the vstack and then reshape
a = np.transpose([np.vstack(an_array)]*9)
a.reshape(3,2,9)

NumPy: how to filter out the first axes of multidimensional array according to some condition on the elements

Consider the following ndarray lm -
In [135]: lm
Out[135]:
array([[[15, 7],
[ 2, 3],
[ 0, 4]],
[[ 8, 12],
[ 6, 5],
[17, 10]],
[[16, 13],
[30, 1],
[14, 9]]])
In [136]: lm.shape
Out[136]: (3, 3, 2)
I want to filter out members of the first axis (lm[0], lm[1], ...) where at least one of the elements is greater than 20. Since lm[2, 1, 0] is the only element that fulfills this condition, I would expect the following result -
array([[[15, 7],
[ 2, 3],
[ 0, 4]],
[[ 8, 12],
[ 6, 5],
[17, 10]]])
i.e., lm[2] has at least one element > 20, so it is filtered out of the result set. How can I achieve this?
Two ways to do so with np.all and np.any with axis arg -
In [14]: lm[(lm<=20).all(axis=(1,2))]
Out[14]:
array([[[15, 7],
[ 2, 3],
[ 0, 4]],
[[ 8, 12],
[ 6, 5],
[17, 10]]])
In [15]: lm[~(lm>20).any(axis=(1,2))]
Out[15]:
array([[[15, 7],
[ 2, 3],
[ 0, 4]],
[[ 8, 12],
[ 6, 5],
[17, 10]]])
To make it generic for ndarrays to work along the last two axes, use axis=(-2,-1) instead.
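A different generalization is to reduce over every axis except the first, whatever the number of dimensions. The sketch below assumes that is the behaviour wanted; keep_blocks_at_most is a hypothetical helper name.

```python
import numpy as np

def keep_blocks_at_most(arr, limit):
    # Hypothetical helper: reduce over every axis except the first.
    trailing_axes = tuple(range(1, arr.ndim))
    mask = (arr <= limit).all(axis=trailing_axes)
    return arr[mask]

lm = np.array([[[15, 7], [2, 3], [0, 4]],
               [[8, 12], [6, 5], [17, 10]],
               [[16, 13], [30, 1], [14, 9]]])
print(keep_blocks_at_most(lm, 20).shape)
```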

How to iterate through slices at the last dimension

For example, you have array
a = np.array([[[ 0, 1, 2],
[ 3, 4, 5]],
[[ 6, 7, 8],
[ 9, 10, 11]]])
We want to iterate through slices at the last dimension, i.e. [0,1,2], [3,4,5], [6,7,8], [9,10,11]. Any way to achieve this without the for loop? Thanks!
Tried this but it does not work, because numpy does not interpret the tuple in the way we wanted - a[(0, 0),:] is not the same as a[0, 0, :]
[a[i,:] for i in zip(*product(*(range(ii) for ii in a.shape[:-1])))]
More generally, any way for the last k dimensions? Something equivalent to looping through a[i,j,k, ...].
In [26]: a = np.array([[[ 0, 1, 2],
...: [ 3, 4, 5]],
...:
...: [[ 6, 7, 8],
...: [ 9, 10, 11]]])
In [27]: [a[i,j,:] for i in range(2) for j in range(2)]
Out[27]: [array([0, 1, 2]), array([3, 4, 5]), array([6, 7, 8]), array([ 9, 10, 11])]
or
In [31]: list(np.ndindex(2,2))
Out[31]: [(0, 0), (0, 1), (1, 0), (1, 1)]
In [32]: [a[i,j] for i,j in np.ndindex(2,2)]
another
list(a.reshape(-1,3))
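For the "last k dimensions" part of the question, both ideas above seem to carry over. A sketch, assuming 1 <= k < a.ndim; the example array and k are made up:

```python
import numpy as np

a = np.arange(24).reshape(2, 3, 4)
k = 2  # number of trailing dimensions to keep in each slice

# Option 1: collapse the leading axes into one and iterate over it.
blocks_reshape = list(a.reshape(-1, *a.shape[-k:]))

# Option 2: enumerate the leading indices; a tuple index a[idx] is a[i, j, ...].
blocks_ndindex = [a[idx] for idx in np.ndindex(*a.shape[:-k])]

print(len(blocks_reshape), blocks_reshape[0].shape)
```

Both still loop in Python at the outermost level; the reshape version just avoids building the index tuples.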

how to rearrange elements in a tensor, like in MATLAB?

For example, I got a tensor [30,6,6,3]: 30 is the batch_size, 6X6 is height x width, 3 is channels).
How could I rearrange its elements from every 3X3 to 1X9, like pixels in MATLAB? As the picture described:
tf.reshape() seems unworkable.
You can do these kinds of transformations by using a combination of transpose and reshape. Numpy and TensorFlow logic is the same, so here's a simpler example using numpy. Suppose you have a 4x4 array and want to split it into 4 sub-arrays by skipping rows/columns like in your example.
IE, starting with
a=array([[ 0, 1, 2, 3],
[ 4, 5, 6, 7],
[ 8, 9, 10, 11],
[12, 13, 14, 15]])
You want to obtain 4 sub-images like
[0, 2]
[8, 10]
and
[1, 3]
[9, 11]
etc
First you can generate subarrays by stepping over columns
b = a.reshape((4,2,2)).transpose([2,0,1])
This generates the following array
array([[[ 0, 2],
[ 4, 6],
[ 8, 10],
[12, 14]],
[[ 1, 3],
[ 5, 7],
[ 9, 11],
[13, 15]]])
Now you skip the rows
c = b.reshape([2,2,2,2]).transpose(2,0,1,3)
This generates following array
array([[[[ 0, 2],
[ 8, 10]],
[[ 1, 3],
[ 9, 11]]],
[[[ 4, 6],
[12, 14]],
[[ 5, 7],
[13, 15]]]])
Notice that you now have the desired subarrays, but the two leftmost dimensions are 2x2 and you want a single dimension of 4, so you reshape
c.reshape([4,2,2])
which gives you
array([[[ 0, 2],
[ 8, 10]],
[[ 1, 3],
[ 9, 11]],
[[ 4, 6],
[12, 14]],
[[ 5, 7],
[13, 15]]])
Note that the general technique for combining an n,m array into a single dimension of size n*m is to do reshape(n*m, ...). Because of row-major order, the dimensions to flatten must be on the left for reshape to work as a flattening operation. So if in your example the channels are the last dimension, you will need to transpose them to the left, flatten (using reshape), and then transpose them back.
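The two reshape/transpose passes above can apparently be collapsed into one. The sketch below is for the 4x4 numpy example only (not the TensorFlow tensor from the question) and checks the result against plain strided slicing.

```python
import numpy as np

a = np.arange(16).reshape(4, 4)

# Split rows into (p, q) with row = 2*p + q and columns into (j, k) with
# col = 2*j + k, then move the two "offset" axes (q, k) to the front.
sub = a.reshape(2, 2, 2, 2).transpose(1, 3, 0, 2).reshape(4, 2, 2)

# Reference: strided slicing, one sub-image per (row offset, column offset).
expected = np.stack([a[q::2, k::2] for q in range(2) for k in range(2)])
assert np.array_equal(sub, expected)
```

The same axis bookkeeping should carry over to tf.reshape and tf.transpose, with the batch and channel axes left in place.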

Matrix multiplication of two vectors

I'm trying to do a matrix multiplication of two vectors in numpy which would result in an array.
Example
In [108]: b = array([[1],[2],[3],[4]])
In [109]: a =array([1,2,3])
In [111]: b.shape
Out[111]: (4, 1)
In [112]: a.shape
Out[112]: (3,)
In [113]: b.dot(a)
ValueError: objects are not aligned
As can be seen from the shapes, the array a isn't actually a matrix. The catch is to define a like this.
In [114]: a =array([[1,2,3]])
In [115]: a.shape
Out[115]: (1, 3)
In [116]: b.dot(a)
Out[116]:
array([[ 1, 2, 3],
[ 2, 4, 6],
[ 3, 6, 9],
[ 4, 8, 12]])
How to achieve the same result when acquiring the vectors as fields or columns of a matrix?
In [137]: mat = array([[ 1, 2, 3],
[ 2, 4, 6],
[ 3, 6, 9],
[ 4, 8, 12]])
In [138]: x = mat[:,0] #[1,2,3,4]
In [139]: y = mat[0,:] #[1,2,3]
In [140]: x.dot(y)
ValueError: objects are not aligned
You are computing the outer product of two vectors. You can use the function numpy.outer for this:
In [18]: a
Out[18]: array([1, 2, 3])
In [19]: b
Out[19]: array([10, 20, 30, 40])
In [20]: numpy.outer(b, a)
Out[20]:
array([[ 10, 20, 30],
[ 20, 40, 60],
[ 30, 60, 90],
[ 40, 80, 120]])
Use 2d arrays instead of 1d vectors, and broadcast with the * operator:
In [8]: #your code from above
In [9]: y = mat[0:1,:]
In [10]: y
Out[10]: array([[1, 2, 3]])
In [11]: x = mat[:,0:1]
In [12]: x
Out[12]:
array([[1],
[2],
[3],
[4]])
In [13]: x*y
Out[13]:
array([[ 1, 2, 3],
[ 2, 4, 6],
[ 3, 6, 9],
[ 4, 8, 12]])
It's a similar catch to the one in the basic example.
Both x and y aren't perceived as matrices but as single dimensional arrays.
In [143]: x.shape
Out[143]: (4,)
In [144]: y.shape
Out[144]: (3,)
We have to add the second dimension to them, which will be 1.
In [171]: x = array([x]).transpose()
In [172]: x.shape
Out[172]: (4, 1)
In [173]: y = array([y])
In [174]: y.shape
Out[174]: (1, 3)
In [175]: x.dot(y)
Out[175]:
array([[ 1, 2, 3],
[ 2, 4, 6],
[ 3, 6, 9],
[ 4, 8, 12]])
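To tie the answers together, here is a short sketch showing that the approaches give the same result when starting from 1-D slices of mat. The einsum line is an extra variant not mentioned above.

```python
import numpy as np

mat = np.array([[1, 2, 3],
                [2, 4, 6],
                [3, 6, 9],
                [4, 8, 12]])
x = mat[:, 0]   # shape (4,)
y = mat[0, :]   # shape (3,)

via_outer = np.outer(x, y)
via_broadcast = x[:, None] * y[None, :]      # (4, 1) * (1, 3)
via_matmul = x[:, None] @ y[None, :]         # (4, 1) @ (1, 3)
via_einsum = np.einsum("i,j->ij", x, y)

print(via_outer)
```

Indexing with None (np.newaxis) adds the missing length-1 dimension without copying, so there is no need to rebuild the arrays with array([x]).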