Difference between : and , in numpy - numpy

Some resources have mentioned that in numpy's array slicing, array[2,:,1] results in the same as array[2][:][1] , but I do not get the same ones in this case:
array3d = np.array([[[1, 2], [3, 4]],[[5, 6], [7, 8]], [[9, 10], [11, 12]]])
array3d[2,:,1]
out: array([10, 12])
and:
array3d[2][:][1]
out: array([11, 12])
What is the difference?

Those resources are wrong!
In [1]: array3d = np.array([[[1, 2], [3, 4]],[[5, 6], [7, 8]], [[9, 10], [11, 12]]])
In [2]: array3d
Out[2]:
array([[[ 1, 2],
[ 3, 4]],
[[ 5, 6],
[ 7, 8]],
[[ 9, 10],
[11, 12]]])
When the indices are all scalar this kind of decomposition works:
In [3]: array3d[2,0,1]
Out[3]: 10
In [4]: array3d[2][0][1]
Out[4]: 10
One index reduces the dimension, picking one 'plane':
In [5]: array3d[2]
Out[5]:
array([[ 9, 10],
[11, 12]])
[:] on that does nothing - it is not a placeholder by itself. Within a multidimensional index it is a slice - the whole thing in that dimension. We see the same behavior with lists: alist[2] returns an element, alist[:] returns a copy of the whole list (for an array, [:] returns a view of the whole array).
In [6]: array3d[2][:]
Out[6]:
array([[ 9, 10],
[11, 12]])
Remember, numpy is a Python package. Python syntax still applies at all levels. x[a][b][c] does 3 indexing operations in sequence, 'chaining' them. x[a,b,c] is one indexing operation, passing a tuple to x. It's numpy code that interprets that tuple.
We have to use a multidimensional index on the remaining dimensions:
In [7]: array3d[2][:,1]
Out[7]: array([10, 12])
In [8]: array3d[2,:,1]
Out[8]: array([10, 12])
The interpreter actually does:
In [9]: array3d.__getitem__((2,slice(None),1))
Out[9]: array([10, 12])
In [11]: array3d.__getitem__(2).__getitem__((slice(None),1))
Out[11]: array([10, 12])
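To watch the two spellings reach __getitem__, here is a minimal sketch. KeyLogger is a hypothetical helper written for this illustration (it is not part of numpy); it records every key Python hands to it and then delegates to the wrapped array.

```python
import numpy as np

class KeyLogger:
    """Hypothetical helper (not part of numpy): records each key passed to __getitem__."""

    def __init__(self, data, log):
        self.data = data
        self.log = log

    def __getitem__(self, key):
        self.log.append(key)
        result = self.data[key]
        # Wrap array results so that chained indexing is logged as well.
        if isinstance(result, np.ndarray):
            return KeyLogger(result, self.log)
        return result

array3d = np.array([[[1, 2], [3, 4]], [[5, 6], [7, 8]], [[9, 10], [11, 12]]])

# One indexing operation: a single call that receives a tuple.
tuple_log = []
tuple_result = KeyLogger(array3d, tuple_log)[2, :, 1]

# Three chained operations: three calls, each receiving one key.
chain_log = []
chain_result = KeyLogger(array3d, chain_log)[2][:][1]

print(tuple_log, tuple_result.data)
print(chain_log, chain_result.data)
```

The first log holds one tuple key, the second holds three separate keys, which is why the [:] in the chained form has no dimension left to act as a placeholder for.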

Related

Numpy slicing multiple dimensions by lists

Given a matrix, say
>>> a=np.arange(25).reshape(5,-1)
Can I achieve
>>> idx=[1,3]
>>> a[idx][:,idx]
array([[ 6, 8],
[16, 18]])
without having to slice a twice?
Broadcast idx against itself:
In [118]: a=np.arange(25).reshape(5,-1)
In [119]: i,j=np.ix_([1,3],[1,3])
In [120]: i,j
Out[120]:
(array([[1],
[3]]),
array([[1, 3]]))
In [121]: a[i,j]
Out[121]:
array([[ 6, 8],
[16, 18]])
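As a rough sketch of what the broadcasting amounts to, the same index arrays can be built by hand by adding an axis to the row indices. The variable names manual, via_ix and paired are only for this illustration.

```python
import numpy as np

a = np.arange(25).reshape(5, -1)
idx = np.array([1, 3])

# Row indices as a (2, 1) column, column indices as a (2,) row.
# The two broadcast against each other to a (2, 2) grid of (row, column) pairs.
manual = a[idx[:, None], idx]

# np.ix_ constructs an equivalent pair of index arrays.
via_ix = a[np.ix_(idx, idx)]

# Without the extra axis the indices are paired element-wise instead,
# which selects only a[1, 1] and a[3, 3].
paired = a[idx, idx]

print(manual)
print(paired)
```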

Numpy: How to select row entries in a 2d array by column vector

How can I retrieve a column vector from a 2d array given an indicator column vector?
Suppose I have
X = np.array([[1, 4, 6],
[8, 2, 9],
[0, 3, 7],
[6, 5, 1]])
and
S = np.array([0, 2, 1, 2])
Is there an elegant way to get from X and S the result array([1, 9, 3, 1]), which is equivalent to
np.array([x[s] for x, s in zip(X, S)])
You can achieve this using np.take_along_axis:
>>> np.take_along_axis(X, S[..., None], axis=1)
array([[1],
[9],
[3],
[1]])
You need to make sure both array arguments are of the same shape (or broadcasting can be applied), hence the S[..., None] broadcasting.
Of course you can reduce the returned value to 1d with a [:, 0] slice.
>>> np.take_along_axis(X, S[..., None], axis=1)[:, 0]
array([1, 9, 3, 1])
Alternatively you can just index with an arange of row numbers:
>>> X[np.arange(len(S)), S]
array([1, 9, 3, 1])
I believe this is also equivalent to np.diag(X[:, S]) but with unnecessary copying...
For 2d arrays
# Mention row numbers as one list and S which is column number as other
X[[0, 1, 2, 3], S]
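# more general: build the row numbers from the shape of S
# (np.indices(S.shape) alone has shape (1, 4) and would give a (1, 4) result)
X[np.indices(S.shape)[0], S]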
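The approaches from these answers can be checked against the list comprehension from the question. This is only a small comparison sketch on the question's data; the by_* names are made up for the illustration.

```python
import numpy as np

X = np.array([[1, 4, 6],
              [8, 2, 9],
              [0, 3, 7],
              [6, 5, 1]])
S = np.array([0, 2, 1, 2])

rows = np.arange(X.shape[0])  # one row number per entry of S

by_loop = np.array([x[s] for x, s in zip(X, S)])           # reference from the question
by_index = X[rows, S]                                      # paired integer-array indexing
by_take = np.take_along_axis(X, S[:, None], axis=1)[:, 0]  # (4, 1) result, then first column
by_diag = np.diag(X[:, S])                                 # builds a (4, 4) intermediate first

for result in (by_index, by_take, by_diag):
    assert np.array_equal(result, by_loop)
print(by_loop)
```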

Algorithms of Joining arrays in numpy

I'm new to numpy. I understand the "Joining arrays" methods for low-dimensional shapes such as (n1, n2) because we can visualize them, like a matrix.
But I don't understand the logic in higher dimensions (n0, ..., n_{d-1}), which of course I can't visualize. To visualize I usually imagine a multidimensional array as a tree, so (n0, ..., n_{d-1}) means that at level (axis) i of the tree every node has n_{i} children. So at level 0 (the root) we have n0 children, and so on.
In substance, what is the formal, exact definition of the "Joining arrays" algorithms?
https://numpy.org/doc/stable/reference/routines.array-manipulation.html
Let's see if I can illustrate some basic array operations.
First make a 2d array. Start with a 1d, [0,1,...5], and reshape it to (2,3):
In [1]: x = np.arange(6).reshape(2,3)
In [2]: x
Out[2]:
array([[0, 1, 2],
[3, 4, 5]])
I can join 2 copies of x along the 1st dimension (vstack, v for vertical also does this):
In [3]: np.concatenate([x,x], axis=0)
Out[3]:
array([[0, 1, 2],
[3, 4, 5],
[0, 1, 2],
[3, 4, 5]])
Note that the result is (4,3); no new dimension.
Or join them 'horizontally':
In [4]: np.concatenate([x,x], axis=1)
Out[4]:
array([[0, 1, 2, 0, 1, 2], # (2,6) shape
[3, 4, 5, 3, 4, 5]])
But if I supply them to np.array I make a 3d array (2,2,3) shape:
In [5]: np.array([x,x])
Out[5]:
array([[[0, 1, 2],
[3, 4, 5]],
[[0, 1, 2],
[3, 4, 5]]])
This action of np.array is really no different from making a 2d array from nested lists, np.array([[1,2],[3,4]]). We could just add a layer of nesting, just like Out[5] without the line breaks. I tend to think of this 3d array as having 2 blocks, each with 2 rows and 3 columns. But the names are just a convenience.
stack acts like np.array, making a 3d array. It actually changes the input arrays to (1,2,3) shape, and concatenates on the first axis.
In [6]: np.stack([x,x])
Out[6]:
array([[[0, 1, 2],
[3, 4, 5]],
[[0, 1, 2],
[3, 4, 5]]])
stack lets us join the arrays in other ways:
In [7]: np.stack([x,x], axis=1) # expand to (2,1,3) and concatenate
Out[7]:
array([[[0, 1, 2],
[0, 1, 2]],
[[3, 4, 5],
[3, 4, 5]]])
In [8]: np.stack([x,x], axis=2) # expand to (2,3,1) and concatenate
Out[8]:
array([[[0, 0],
[1, 1],
[2, 2]],
[[3, 3],
[4, 4],
[5, 5]]])
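The "expand, then concatenate" description of stack can be written out as a short sketch. stack_sketch is a hypothetical helper for illustration only; it is not numpy's actual source, it just follows the description above.

```python
import numpy as np

def stack_sketch(arrays, axis=0):
    """Illustrative stand-in for np.stack (hypothetical helper, not numpy's code)."""
    # Give every input a new length-1 axis at position `axis` ...
    expanded = [np.expand_dims(arr, axis) for arr in arrays]
    # ... and join the inputs along that new axis.
    return np.concatenate(expanded, axis=axis)

x = np.arange(6).reshape(2, 3)
for axis in (0, 1, 2):
    result = stack_sketch([x, x], axis)
    assert np.array_equal(result, np.stack([x, x], axis=axis))
    print(axis, result.shape)
```

For this x the three results have shapes (2, 2, 3), (2, 2, 3) and (2, 3, 2), matching the outputs shown above.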
concatenate and the other stack functions don't add anything new to basic numpy arrays. They just provide ways of making a new array from existing ones. There aren't any special algorithms.
If it helps you could think of these join functions as creating a new "blank" array, and filling it with copies of the source arrays. For example that last stack can be done with:
In [9]: res = np.zeros((2,3,2), int)
In [10]: res
Out[10]:
array([[[0, 0],
[0, 0],
[0, 0]],
[[0, 0],
[0, 0],
[0, 0]]])
In [11]: res[:,:,0] = x
In [12]: res[:,:,1] = x
In [13]: res
Out[13]:
array([[[0, 0],
[1, 1],
[2, 2]],
[[3, 3],
[4, 4],
[5, 5]]])
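The same "blank array, then fill" picture carries over to any number of dimensions. Below is a minimal sketch of concatenate written that way; concatenate_sketch is a hypothetical helper, and it assumes the inputs already agree on every axis except the one being joined (no error checking is done).

```python
import numpy as np

def concatenate_sketch(arrays, axis=0):
    """Hypothetical helper: join arrays by filling a blank result array."""
    arrays = [np.asarray(arr) for arr in arrays]
    # The result has the shape of the inputs, except along `axis`,
    # where the input lengths add up.
    shape = list(arrays[0].shape)
    shape[axis] = sum(arr.shape[axis] for arr in arrays)
    res = np.zeros(shape, dtype=np.result_type(*arrays))

    start = 0
    for arr in arrays:
        stop = start + arr.shape[axis]
        # Build an index such as (:, :, start:stop) for the chosen axis.
        index = [slice(None)] * res.ndim
        index[axis] = slice(start, stop)
        res[tuple(index)] = arr
        start = stop
    return res

x = np.arange(24).reshape(2, 3, 4)
y = np.arange(12).reshape(2, 3, 2)
joined = concatenate_sketch([x, y], axis=2)
print(joined.shape)  # (2, 3, 6)
```

In the tree picture from the question, joining on axis i leaves the levels above i unchanged and gives every node at level i the children of both inputs, one set after the other.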

Delete specified column index from every row of 2d numpy array

I have a numpy array A as follows:
array([[1, 2, 3],
[4, 5, 6],
[7, 8, 9]])
and another numpy array column_indices_to_be_deleted as follows:
array([1, 0, 2])
I want to delete the element from every row of A specified by the column indices in column_indices_to_be_deleted. So, column index 1 from row 0, column index 0 from row 1 and column index 2 from row 2 in this case, to get a new array that looks like this:
array([[1, 3],
[5, 6],
[7, 8]])
What would be the simplest way of doing that?
One way with masking created with a broadcasted comparison -
In [43]: a # input array
Out[43]:
array([[1, 2, 3],
[4, 5, 6],
[7, 8, 9]])
In [44]: remove_idx # indices to be removed from each row
Out[44]: array([1, 0, 2])
In [45]: n = a.shape[1]
In [46]: a[remove_idx[:,None]!=np.arange(n)].reshape(-1,n-1)
Out[46]:
array([[1, 3],
[5, 6],
[7, 8]])
Another mask based approach with the mask created with array-assignment -
In [47]: mask = np.ones(a.shape,dtype=bool)
In [48]: mask[np.arange(len(remove_idx)), remove_idx] = 0
In [49]: a[mask].reshape(-1,a.shape[1]-1)
Out[49]:
array([[1, 3],
[5, 6],
[7, 8]])
Another with np.delete -
In [64]: m,n = a.shape
In [66]: np.delete(a.flat,remove_idx+n*np.arange(m)).reshape(m,-1)
Out[66]:
array([[1, 3],
[5, 6],
[7, 8]])
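For reuse, the first mask approach can be wrapped in a small function. delete_per_row is a hypothetical name chosen for this sketch, and it assumes every index is a valid non-negative column number; an out-of-range or negative index would leave the mask with the wrong number of True values and the reshape would fail.

```python
import numpy as np

def delete_per_row(a, remove_idx):
    """Hypothetical helper: drop column remove_idx[i] from row i of a 2d array."""
    a = np.asarray(a)
    remove_idx = np.asarray(remove_idx)
    m, n = a.shape
    if remove_idx.shape != (m,):
        raise ValueError("need exactly one column index per row")
    # keep[i, j] is False only where j == remove_idx[i].
    keep = remove_idx[:, None] != np.arange(n)
    # Boolean indexing flattens in row order, so each row contributes n - 1 values.
    return a[keep].reshape(m, n - 1)

A = np.array([[1, 2, 3],
              [4, 5, 6],
              [7, 8, 9]])
print(delete_per_row(A, [1, 0, 2]))
```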

Why aren't some dimensions shown in the output even when according to the indexing they should be?

b = np.array([[[0, 2, 3], [10, 12, 13]], [[20, 22, 23], [110, 112, 113]]])
print(b[..., -1])
>>>[[3, 13], [23, 113]]
Why does this output show the first axis but not the second axis (to show the second axis, it would have to show each number in its own list)? Is Numpy trying to minimize unnecessary display of dimensions when there is only one number per each second dimension list being shown? Why doesn’t numpy replicate the dimensions of the original array exactly?
Why does this output show the first axis but not the second axis (to show the second axis, it would have to show each number in its own list)?
It does show the first and the second axis. Note that you have a 2d array here, and the first and second axis are retained. Only the third axis has "collapsed".
Your indexing is, for a 3d array, equivalent to:
b[:, :, -1]
It thus means that you create a 2d array c where c[i, j] = b[i, j, -1]. -1 means the last element, so c[i, j] = b[i, j, 2].
b has as values:
>>> b
array([[[ 0, 2, 3],
[ 10, 12, 13]],
[[ 20, 22, 23],
[110, 112, 113]]])
So that means that our result c has c[0, 0] = b[0, 0, 2], which is 3; c[0, 1] = b[0, 1, 2], which is 13; c[1, 0] = b[1, 0, 2], which is 23; and c[1, 1] = b[1, 1, 2], which is 113.
So the end product is:
>>> b[:,:,-1]
array([[ 3, 13],
[ 23, 113]])
>>> b[...,-1]
array([[ 3, 13],
[ 23, 113]])
By specifying a single value for a given dimension, that dimension "collapses". Another sensible alternative would have been to keep a dimension of size 1, but frequently such subscripting is done to retrieve arrays with a lower number of dimensions.
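The element-by-element relation between c and b can be checked with a short loop. This is just a verification sketch on the array from the question.

```python
import numpy as np

b = np.array([[[0, 2, 3], [10, 12, 13]], [[20, 22, 23], [110, 112, 113]]])
c = b[..., -1]

# Axes 0 and 1 of b survive as axes 0 and 1 of c; only the last axis is gone.
for i in range(b.shape[0]):
    for j in range(b.shape[1]):
        assert c[i, j] == b[i, j, -1] == b[i, j, 2]

print(b.shape, c.shape)  # (2, 2, 3) (2, 2)
```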
In [7]: b = np.array([[[0, 2, 3], [10, 12, 13]], [[20, 22, 23], [110, 112, 113]]])
In [8]: b # (2,2,3) shape array
Out[8]:
array([[[ 0, 2, 3],
[ 10, 12, 13]],
[[ 20, 22, 23],
[110, 112, 113]]])
In [9]: b[..., -1]
Out[9]:
array([[ 3, 13],
[ 23, 113]])
This slice of b is a (2,2) array. It's not just a matter of display. Axes 0 and 1 are present; it's axis 2 that's been dropped.
Indexing with a list, or a slice:
In [10]: b[..., [-1]] # (2,2,1)
Out[10]:
array([[[ 3],
[ 13]],
[[ 23],
[113]]])
In [11]: b[..., -1:]
Out[11]:
array([[[ 3],
[ 13]],
[[ 23],
[113]]])
https://docs.scipy.org/doc/numpy/reference/arrays.indexing.html
This indexing page is long, but it covers these cases (and more).
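As a compact summary of the cases above, the sketch below compares result shapes for the different kinds of index on the last axis. The array here is a stand-in with the same (2, 2, 3) shape as b, and the last entry (adding an axis back with None) is an extra variation not shown in the answers.

```python
import numpy as np

b = np.arange(12).reshape(2, 2, 3)  # stand-in with the same shape as b above

shapes = {
    "b[..., -1]": b[..., -1].shape,              # integer: the axis is dropped
    "b[..., [-1]]": b[..., [-1]].shape,          # list: the axis is kept, length 1
    "b[..., -1:]": b[..., -1:].shape,            # slice: the axis is kept, length 1
    "b[..., -1, None]": b[..., -1, None].shape,  # integer, then a new length-1 axis
}
for expr, shape in shapes.items():
    print(expr, shape)
```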