Numpy Advanced Indexing confusion - numpy

If a is numpy array of shape (5,3), b is of shape (2,2) and c is of shape (2,2), what is the shape of a[b,c]?
Can anyone explain this to me with an example. I've read the docs but still I am not able to understand how it works.

Just for the purpose of expounding the concept of advanced indexing, here is a contrived example:
# input arrays
In [22]: a
Out[22]:
array([[ 0, 1, 2],
[ 3, 4, 5],
[ 6, 7, 8],
[ 9, 10, 11],
[12, 13, 14]])
In [23]: b
Out[23]:
array([[0, 1],
[2, 3]])
In [24]: c
Out[24]:
array([[0, 1],
[2, 2]])
# advanced indexing
In [25]: a[b, c]
Out[25]:
array([[ 0, 4],
[ 8, 11]])
By the expression a[b, c], we are using the arrays b and c to selectively pull out elements from the array a.
To interpret the output of a[b, c]:
#   b         c             2D indices
[[0, 1],  [[0, 1],  --->  (0,0) (1,1)
 [2, 3]]   [2, 2]]  --->  (2,2) (3,2)
Each 2D index is simply applied to the array a, and the corresponding elements are returned as an array in the result of a[b, c]:
a[(0,0)] --> 0
a[(1,1)] --> 4
a[(2,2)] --> 8
a[(3,2)] --> 11
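The above elements are returned as a 2D array of shape (2, 2) because the index arrays b and c both have shape (2, 2). In general, the result takes the broadcast shape of the index arrays, not the shape of a.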
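To make the shape rule concrete, here is a minimal sketch that rebuilds the same selection with an explicit loop. The names result, expected_shape and manual are only illustrative, and np.broadcast_shapes assumes NumPy 1.20 or newer.

```python
import numpy as np

a = np.arange(15).reshape(5, 3)
b = np.array([[0, 1], [2, 3]])   # row indices
c = np.array([[0, 1], [2, 2]])   # column indices

result = a[b, c]

# The result takes the broadcast shape of the index arrays.
expected_shape = np.broadcast_shapes(b.shape, c.shape)

# The same selection written as an explicit loop over index positions.
manual = np.empty(expected_shape, dtype=a.dtype)
for pos in np.ndindex(*expected_shape):
    manual[pos] = a[b[pos], c[pos]]

print(result.shape)                    # (2, 2)
print(np.array_equal(result, manual))  # True
```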
Also, please note that advanced indexing always returns a copy.
In [27]: (a[b, c]).flags.owndata
Out[27]: True
However, an assignment through advanced indexing, such as a[b, c] = value, alters the original array in place. Whether an assignment reaches the original array depends on two factors:
whether the indexing operation is pure (only advanced indexing) or mixed (a combination of advanced and simple indexing);
in the case of mixed indexing, the order in which the two kinds of indexing are applied.
See: Views and copies confusion with NumPy arrays when combining index operations
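As a rough illustration of the difference, the sketch below contrasts a single advanced-indexing assignment with a chained one. The second array d is a hypothetical copy of the same data, introduced only so the two cases do not interfere.

```python
import numpy as np

a = np.arange(15).reshape(5, 3)
b = np.array([[0, 1], [2, 3]])
c = np.array([[0, 1], [2, 2]])

# A single advanced-indexing expression on the left-hand side
# writes into `a` itself.
a[b, c] = -1

# Chained indexing: d[[0, 1]] first produces a copy, so the
# assignment modifies that temporary copy and `d` is unchanged.
d = np.arange(15).reshape(5, 3)
d[[0, 1]][:, 0] = -1

print(a[0, 0], a[1, 1], a[2, 2], a[3, 2])  # -1 -1 -1 -1
print(d[0, 0], d[1, 0])                    # 0 3
```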


What is the difference between np.array([val1, val2]) and np.array([[val1, val2]])?

What is the difference between np.array([1, 2]) and np.array([[1, 2]])?
Which one of them is a matrix?
I also do not understand the output for shape of the above tensors. The former returns (2,) and the latter returns (1,2).
np.array([1, 2]) builds an array starting from a list, thus giving you a 1D array with the shape (2, ) since it only contains a single list of two elements.
With the double brackets [[ you are passing a list of lists, which gives you a two-dimensional array (a matrix) with the shape (1, 2): one row and two columns.
With the latter you are able to build more complex matrices like:
np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
rendering a 3x3 matrix:
array([[1, 2, 3],
[4, 5, 6],
[7, 8, 9]])
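A short check of the two cases, as a sketch (the variable names v and m are only illustrative):

```python
import numpy as np

v = np.array([1, 2])      # a single list -> 1-D array
m = np.array([[1, 2]])    # a list of lists -> 2-D array with one row

print(v.ndim, v.shape)    # 1 (2,)
print(m.ndim, m.shape)    # 2 (1, 2)

# Indexing differs accordingly.
first_of_v = v[0]         # a scalar
first_row_of_m = m[0]     # a 1-D array holding the whole row
```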

special vectorial product numpy

Let's say I have a vector s and I want to produce the matrix m (see image) using only numpy functions. How could I do that? I thought of transposing s and taking some special product between s and s^t, but I couldn't find the right one. Do you have any idea?
This looks like an outer product:
s = np.arange(1, 4) # array([1, 2, 3])
np.multiply.outer(s,s)
output:
array([[1, 2, 3],
[2, 4, 6],
[3, 6, 9]])
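For comparison, here is a sketch of three equivalent ways to build the same matrix, assuming s holds the values 1, 2, 3 as in the output above. The names m1, m2 and m3 are only illustrative.

```python
import numpy as np

s = np.arange(1, 4)            # array([1, 2, 3])

m1 = np.multiply.outer(s, s)   # ufunc outer method
m2 = np.outer(s, s)            # dedicated outer-product function
m3 = s[:, None] * s            # broadcasting a (3, 1) column against (3,)

print(m1)
# [[1 2 3]
#  [2 4 6]
#  [3 6 9]]
```

The broadcasting form is the "s times s transposed" idea from the question: s[:, None] plays the role of the column vector.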

Why is the output like this? I do not understand how the indexing is working

How is it indexing it? Why is the output [1,4,5]?
I am following the tutorial on http://cs231n.github.io/python-numpy-tutorial/#numpy
a = np.array([[1,2], [3, 4], [5, 6]])
# An example of integer array indexing.
# The returned array will have shape (3,) and
print(a[[0, 1, 2], [0, 1, 0]]) # Prints "[1 4 5]"
It's called fancy indexing in numpy.
You can think of the first list as row indices and the second list as column indices, paired up element by element. So a[[0,1,2],[0,1,0]] picks the three elements of a whose coordinates are (0,0), (1,1) and (2,0):
a[0,0] # 1
a[1,1] # 4
a[2,0] # 5
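The pairing can be spelled out with zip, as in this small sketch (rows, cols, picked and looped are illustrative names):

```python
import numpy as np

a = np.array([[1, 2], [3, 4], [5, 6]])
rows = [0, 1, 2]
cols = [0, 1, 0]

# Integer array indexing pairs rows[i] with cols[i].
picked = a[rows, cols]

# The same selection as an explicit loop over the pairs.
looped = np.array([a[r, c] for r, c in zip(rows, cols)])

print(picked)   # [1 4 5]
```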

Indexing a sub-array by lists [duplicate]

I have some array A and 2 lists of indices ind1 and ind2, one for each axis. Now this gives me a slice of the array, to which I need to assign some new values. Problem is, my approach for this does not work.
Let me demonstrate with an example. First I create an array, and try to access some slice:
>>> A=numpy.arange(9).reshape(3,3)
>>> ind1, ind2 = [0,1], [1,2]
>>> A
array([[0, 1, 2],
[3, 4, 5],
[6, 7, 8]])
>>> A[ind1,ind2]
array([1, 5])
Now this just gives me 2 values, not the 2-by-2 matrix I was going for. So I tried this:
>>> A[ind1,:][:,ind2]
array([[1, 2],
[4, 5]])
Okay, better. Now let's say these values should be set to 0:
>>> A[ind1,:][:,ind2]=0
>>> A
array([[0, 1, 2],
[3, 4, 5],
[6, 7, 8]])
If I try to assign like this, the array A does not get updated, because of the double indexing (I am only assigning to some copy of A, which gets discarded). Is there some way to index the sub array by just indexing once?
Note: Indexing by selecting some appropriate range like A[:2,1:3] would work for this example, but I need something that works with any arbitrary list of indices.
What about using meshgrid to create your 2D indices? As follows:
>>> import numpy as np
>>> A = np.arange(9).reshape(3,3)
>>> ind1, ind2 = [0,1],[1,2]
>>> ind12 = tuple(np.meshgrid(ind1,ind2, indexing='ij'))  # tuple: one index array per axis
>>> # np.ix_(ind1,ind2) selects the same elements, as pointed out by @Divakar
>>> print(A[ind12])
[[1 2]
 [4 5]]
And finally
>>> A[ind12] = 0
>>> print(A)
[[0 0 0]
 [3 0 0]
 [6 7 8]]
This works with any arbitrary list of indices:
>>> A = np.arange(9).reshape(3,3)  # start again from the original array
>>> ind1, ind2 = [0,2],[0,2]
>>> ind12 = tuple(np.meshgrid(ind1,ind2, indexing='ij'))
>>> A[ind12] = 100
>>> print(A)
[[100   1 100]
 [  3   4   5]
 [100   7 100]]
As pointed out by @hpaulj in the comments, np.ix_(ind1,ind2) is actually equivalent to the following use of np.meshgrid:
>>> np.meshgrid(ind1,ind2, indexing='ij', sparse=True)
The sparse form does not build the full index grids and relies on broadcasting instead, so it should be even more efficient. That is a point in favor of np.ix_ whenever you would otherwise always set indexing='ij' and sparse=True.
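A minimal sketch of the np.ix_ variant, using the same A, ind1 and ind2 as in the question (rows, cols and block are illustrative names):

```python
import numpy as np

A = np.arange(9).reshape(3, 3)
ind1, ind2 = [0, 1], [1, 2]

# np.ix_ returns one open-mesh index array per axis:
# rows has shape (2, 1) and cols has shape (1, 2).
rows, cols = np.ix_(ind1, ind2)

block = A[rows, cols].copy()   # the 2-by-2 sub-array, before assignment

# A single indexing operation, so the assignment reaches A itself.
A[np.ix_(ind1, ind2)] = 0

print(block)
# [[1 2]
#  [4 5]]
print(A)
# [[0 0 0]
#  [3 0 0]
#  [6 7 8]]
```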

Numpy Indexing Behavior

I am having a lot of trouble understanding numpy indexing for multidimensional arrays. In the example I am working with, let's say that I have a 2D array, A, which is 100x10. Then I have another array, B, which is a 1D array of 100 values between 0 and 9 (column indices for A). In MATLAB, I would use A(sub2ind(size(A), (1:size(A,1))', B)) to return, for each row of A, the value at the index stored in the corresponding row of B.
So, as a test case, let's say I have this:
A = np.random.rand(100,10)
B = np.int32(np.floor(np.random.rand(100)*10))
If I print their shapes, I get:
print A.shape returns (100L, 10L)
print B.shape returns (100L,)
When I try to index into A using B naively (incorrectly)
Test1 = A[:,B]
print Test1.shape returns (100L, 100L)
but if I do
Test2 = A[range(A.shape[0]),B]
print Test2.shape returns (100L,)
which is what I want. I'm having trouble understanding the distinction being made here. In my mind, A[:,5] and A[range(A.shape[0]),5] should return the same thing, and they do, but the two forms behave differently once an array such as B takes the place of the 5. How is : different from range(sizeArray), which just creates the indices 0 through sizeArray-1, when used as an index?
Let's look at a simple array:
In [654]: X=np.arange(12).reshape(3,4)
In [655]: X
Out[655]:
array([[ 0, 1, 2, 3],
[ 4, 5, 6, 7],
[ 8, 9, 10, 11]])
With the slice we can pick 3 columns of X, in any order (and even repeated). In other words, take all the rows, but selected columns.
In [656]: X[:,[3,2,1]]
Out[656]:
array([[ 3, 2, 1],
[ 7, 6, 5],
[11, 10, 9]])
If instead I index the rows with a list (or array) of 3 values, numpy pairs them up with the column values, effectively picking 3 individual values, X[0,3], X[1,2], X[2,1]:
In [657]: X[[0,1,2],[3,2,1]]
Out[657]: array([3, 6, 9])
If instead I give it a column vector to index the rows, I get the same thing as with the slice:
In [659]: X[[[0],[1],[2]],[3,2,1]]
Out[659]:
array([[ 3, 2, 1],
[ 7, 6, 5],
[11, 10, 9]])
This amounts to picking 9 individual values, as generated by broadcasting:
In [663]: np.broadcast_arrays(np.arange(3)[:,None],np.array([3,2,1]))
Out[663]:
[array([[0, 0, 0],
[1, 1, 1],
[2, 2, 2]]),
array([[3, 2, 1],
[3, 2, 1],
[3, 2, 1]])]
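Applied to the 100x10 case from the question, the two forms can be compared as in the sketch below. The random generator, its seed and the result names are assumptions made for illustration; np.take_along_axis is shown as one alternative spelling and assumes NumPy 1.15 or newer.

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.random((100, 10))
B = rng.integers(0, 10, size=100)   # one column index per row

# Slice plus index array: every row, at each of the 100 columns in B.
all_rows_cols = A[:, B]                        # shape (100, 100)

# Two index arrays of equal length are paired up element by element.
per_row = A[np.arange(A.shape[0]), B]          # shape (100,)

# Alternative spelling of the per-row pick.
alt = np.take_along_axis(A, B[:, None], axis=1)[:, 0]

print(all_rows_cols.shape, per_row.shape)      # (100, 100) (100,)
```

The per-row result is the diagonal of the (100, 100) array, since all_rows_cols[i, j] is A[i, B[j]].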
numpy indexing can be confusing. But a good starting point is this page: http://docs.scipy.org/doc/numpy/reference/arrays.indexing.html