2 different ways to index 3D array in Numpy? - numpy

I have a 3-D array with shape (14,3,5), whose axes correspond to (event, color, taste).
I want to select, for every event, the second color option and the third taste option.
Could someone tell me which is the correct format?
[:,2,3] vs [:,2][:,3]
Are they the same, or different?
If they are different, how are they different?

Do a test:
In [256]: arr = np.arange(2*3*5).reshape(2,3,5)
In [257]: arr
Out[257]:
array([[[ 0, 1, 2, 3, 4],
[ 5, 6, 7, 8, 9],
[10, 11, 12, 13, 14]],
[[15, 16, 17, 18, 19],
[20, 21, 22, 23, 24],
[25, 26, 27, 28, 29]]])
One way:
In [258]: arr[:,2,3]
Out[258]: array([13, 28])
The other is evaluated in 2 steps:
In [259]: arr[:,2]
Out[259]:
array([[10, 11, 12, 13, 14],
[25, 26, 27, 28, 29]])
In [260]: arr[:,2][:,3]
Out[260]: array([13, 28])
The [:,3] is applied to the result of the [:,2]. Each [] is translated by the interpreter into a __getitem__() call (or a __setitem__() call if followed by a =). [:,2,3] is just one call, __getitem__((slice(None),2,3)).
With scalar indices like this, they are the same.
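As a minimal sketch of that translation, the calls can be written out by hand. The variable names one_call and two_calls are only illustrative:

```python
import numpy as np

arr = np.arange(2 * 3 * 5).reshape(2, 3, 5)

# arr[:, 2, 3] is a single __getitem__ call with a 3-element tuple.
one_call = arr.__getitem__((slice(None), 2, 3))

# arr[:, 2][:, 3] is two calls; the second acts on the 2-D result of the first.
two_calls = arr.__getitem__((slice(None), 2)).__getitem__((slice(None), 3))

print(one_call, two_calls)   # [13 28] [13 28]
```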
But what if one (or both) index is a list or array?
In [261]: arr[:,[1,2],3]
Out[261]:
array([[ 8, 13],
[23, 28]])
In [262]: arr[:,[1,2]]
Out[262]:
array([[[ 5, 6, 7, 8, 9],
[10, 11, 12, 13, 14]],
[[20, 21, 22, 23, 24],
[25, 26, 27, 28, 29]]])
In [263]: arr[:,[1,2]][:,3]
Traceback (most recent call last):
Input In [263] in <cell line: 1>
arr[:,[1,2]][:,3]
IndexError: index 3 is out of bounds for axis 1 with size 2
In [264]: arr[:,[1,2]][:,:,3]
Out[264]:
array([[ 8, 13],
[23, 28]])
At least you aren't making the common novice mistake of attempting:
In [265]: arr[:][2][3]
Traceback (most recent call last):
Input In [265] in <cell line: 1>
arr[:][2][3]
IndexError: index 2 is out of bounds for axis 0 with size 2
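A short sketch of why that chain fails: arr[:] is just a view of the whole array, so every [] that follows indexes the first remaining axis. The name whole is only for illustration:

```python
import numpy as np

arr = np.arange(2 * 3 * 5).reshape(2, 3, 5)

whole = arr[:]          # a view with the same shape as arr
print(whole.shape)      # (2, 3, 5)

# So arr[:][2] is arr[2], which indexes axis 0 (size 2) and fails.
# A chain of scalar indices only works when each one fits the leading axis:
print(arr[:][1][2][3] == arr[1, 2, 3])   # True
```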
In the long run you need to read and understand (most of)
https://numpy.org/doc/stable/user/basics.indexing.html

Related

np.array for variable matrix

import numpy as np
data = np.array([[10, 20, 30, 40, 50, 60, 70, 80, 90],
[2, 7, 8, 9, 10, 11],
[3, 12, 13, 14, 15, 16],
[4, 3, 4, 5, 6, 7, 10, 12]],dtype=object)
target = data[:,0]
It has this error.
IndexError Traceback (most recent call last)
Input In [82], in <cell line: 9>()
data = np.array([[10, 20, 30, 40, 50, 60, 70, 80, 90],
[2, 7, 8, 9, 10, 11],
[3, 12, 13, 14, 15, 16],
[4, 3, 4, 5, 6, 7, 10,12]],dtype=object)
# Define the target data
----> 9 target = data[:,0]
IndexError: too many indices for array: array is 1-dimensional, but 2 were indexed
How can I fix this without changing the elements in the data? When I made every row the same size the error message went away, but my real data has rows of variable size. Many thanks.
You have an array of objects, so you can't index on axis=1 because there is none (data.shape -> (4,)).
Use a list comprehension:
out = np.array([a[0] for a in data])
Output: array([10, 2, 3, 4])

Error trying to solve a matrix using numpy

I'm trying to solve for x1, x2, x3 and x4 for this matrix but I keep getting errors.
Matrix A contains all the coefficients for x1 x2 x3 x4 respectively and Matrix B contains what it is equal to.
I wrote the following code which in theory should work but it keeps saying I provided 5 arguments or something like that
import numpy as np
a = np.matrix([2, 5, 6, 4], [5, 10, 9, 5], [7, 17.5, 21, 14], [0, 0, 2, 5])
b = np.matrix([23.5, 34, 82.25, -13])
x = np.linalg.solve(a,b)
print(x)
I shouldn't have to do this, since you should show the full traceback with the error:
In [396]: a = np.matrix([2, 5, 6, 4], [5, 10, 9, 5], [7, 17.5, 21, 14], [0, 0,
...: 2, 5])
...: b = np.matrix([23.5, 34, 82.25, -13])
...:
...: x = np.linalg.solve(a,b)
Traceback (most recent call last):
File "<ipython-input-396-710e1fc00100>", line 1, in <module>
a = np.matrix([2, 5, 6, 4], [5, 10, 9, 5], [7, 17.5, 21, 14], [0, 0, 2, 5])
TypeError: __new__() takes from 2 to 4 positional arguments but 5 were given
Look at that error message! See the np.matrix? Now go to the np.matrix docs, and you'll see that you need to provide ONE list of lists. Any extra lists are interpreted as additional arguments.
Thus you should use the following (note the added [], they are important):
In [397]: a = np.matrix([[2, 5, 6, 4], [5, 10, 9, 5], [7, 17.5, 21, 14], [0, 0,
...: 2, 5]])
...: b = np.matrix([23.5, 34, 82.25, -13])
...:
...: x = np.linalg.solve(a,b)
Traceback (most recent call last):
File "<ipython-input-397-b90e1785a311>", line 4, in <module>
x = np.linalg.solve(a,b)
File "<__array_function__ internals>", line 180, in solve
File "/usr/local/lib/python3.8/dist-packages/numpy/linalg/linalg.py", line 393, in solve
r = gufunc(a, b, signature=signature, extobj=extobj)
ValueError: solve: Input operand 1 has a mismatch in its core dimension 0, with gufunc signature (m,m),(m,n)->(m,n) (size 1 is different from 4)
In [398]: a.shape
Out[398]: (4, 4)
In [399]: b.shape
Out[399]: (1, 4)
Note the shape of b. solve doesn't like that mix of shapes. A (4,1) would probably work. But since we looked at the np.matrix docs, let's follow their recommendation and switch to np.array:
In [400]: a = np.array([[2, 5, 6, 4], [5, 10, 9, 5], [7, 17.5, 21, 14], [0, 0,
...: 2, 5]])
...: b = np.array([23.5, 34, 82.25, -13])
...:
...: x = np.linalg.solve(a,b)
Traceback (most recent call last):
File "<ipython-input-400-b1b6c06db25c>", line 4, in <module>
x = np.linalg.solve(a,b)
File "<__array_function__ internals>", line 180, in solve
File "/usr/local/lib/python3.8/dist-packages/numpy/linalg/linalg.py", line 393, in solve
r = gufunc(a, b, signature=signature, extobj=extobj)
File "/usr/local/lib/python3.8/dist-packages/numpy/linalg/linalg.py", line 88, in _raise_linalgerror_singular
raise LinAlgError("Singular matrix")
LinAlgError: Singular matrix
In [401]: a
Out[401]:
array([[ 2. , 5. , 6. , 4. ],
[ 5. , 10. , 9. , 5. ],
[ 7. , 17.5, 21. , 14. ],
[ 0. , 0. , 2. , 5. ]])
In [402]: np.linalg.det(a)
Out[402]: 0.0
I assume you know enough linear algebra to understand that problem, and undertake your own fix.
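For readers who want to see where the singularity comes from, here is a hedged sketch. It assumes that a minimum-norm least-squares solution is acceptable, which the question does not say. The third row of a is 3.5 times the first, and 82.25 is 3.5 times 23.5, so the system is consistent but underdetermined:

```python
import numpy as np

a = np.array([[2, 5, 6, 4], [5, 10, 9, 5], [7, 17.5, 21, 14], [0, 0, 2, 5]])
b = np.array([23.5, 34, 82.25, -13])

# A rank below 4 confirms that the rows are linearly dependent.
print(np.linalg.matrix_rank(a))
print(np.allclose(a[2], 3.5 * a[0]))   # row 2 is a multiple of row 0

# lstsq does not require an invertible matrix; it returns the
# minimum-norm solution among those that minimize ||a @ x - b||.
x, residuals, rank, sv = np.linalg.lstsq(a, b, rcond=None)
print(x)
print(np.allclose(a @ x, b))
```

This picks one of infinitely many solutions. Whether that is the right fix depends on the problem the equations came from.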

Moving from nested loops to NumPy iterating

I have a system of equations that I am trying to simulate and using very basic looping structures seems to rapidly slow down my computing speed. I have a mock example below to illustrate how I am running the simulation now:
import numpy as np
Imax, Jmax, Tmax = 4, 4, 3
Iset, Jset, Tset = range(0,Imax), range(0,Jmax), range(0,Tmax)
X = np.arange(0,48).reshape(3,4,4)
X[1], X[2] = 4, 2
Y = 2*X
for t in Tset:
if t == 2:
break
else:
for i in Iset:
for j in Jset:
Y[t+1,i,j] = Y[t,i,j] + X[t,i,j]
X[t+1,i,j] = X[t,i,j] + 1
# Output for Y...
array([[[ 0, 2, 4, 6],
[ 8, 10, 12, 14],
[16, 18, 20, 22],
[24, 26, 28, 30]],
[[ 0, 3, 6, 9],
[12, 15, 18, 21],
[24, 27, 30, 33],
[36, 39, 42, 45]],
[[ 1, 5, 9, 13],
[17, 21, 25, 29],
[33, 37, 41, 45],
[49, 53, 57, 61]]])
Intuitively this structure makes sense to me because I am accessing the individual elements of the Y array and updating it, but because I have this looping over very large values and have more going on in the loop, I am experiencing a drastic reduction in computational speed.
I came across nditer and I am hoping that I can use this in place of the multiple nested loops that I have so that I can still get the same result, but faster. How can I go about converting this nested for-loop style into a more efficient iteration scheme?
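One possible direction, offered as a sketch rather than a definitive answer: the update at step t+1 depends only on step t, so the i and j loops can be replaced by whole-slice operations while the time loop stays. This assumes the real model has the same structure as the mock example. As far as I know, nditer driven from Python code is unlikely to be much faster than the plain loops.

```python
import numpy as np

Imax, Jmax, Tmax = 4, 4, 3
X = np.arange(0, 48).reshape(Tmax, Imax, Jmax)
X[1], X[2] = 4, 2
Y = 2 * X

# Only the time loop remains; each statement updates a full (Imax, Jmax) slice.
for t in range(Tmax - 1):
    Y[t + 1] = Y[t] + X[t]
    X[t + 1] = X[t] + 1

print(Y[2])
```

For this mock example the result matches the Y printed in the question.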

Numpy array changes shape when accessing with indices

I have a small matrix A with dimensions MxNxO
I have a large matrix B with dimensions KxMxNxP, with P>O
I have a vector ind of indices of dimension Ox1
I want to do:
B[1,:,:,ind] = A
But the left-hand side of my equation
B[1,:,:,ind].shape
is of dimension Ox1xMxN and therefore I can not broadcast A (MxNxO) into it.
Why does accessing B in this way change the dimensions of the left side?
How can I easily achieve my goal?
Thanks
There's a feature, if not a bug, that when slices are mixed in the middle of advanced indexing, the sliced dimensions are put at the end.
Thus for example:
In [204]: B = np.zeros((2,3,4,5),int)
In [205]: ind=[0,1,2,3,4]
In [206]: B[1,:,:,ind].shape
Out[206]: (5, 3, 4)
The 3,4 dimensions have been placed after the ind, 5.
We can get around that by indexing first with 1, and then the rest:
In [207]: B[1][:,:,ind].shape
Out[207]: (3, 4, 5)
In [208]: B[1][:,:,ind] = np.arange(3*4*5).reshape(3,4,5)
In [209]: B[1]
Out[209]:
array([[[ 0, 1, 2, 3, 4],
[ 5, 6, 7, 8, 9],
[10, 11, 12, 13, 14],
[15, 16, 17, 18, 19]],
[[20, 21, 22, 23, 24],
[25, 26, 27, 28, 29],
[30, 31, 32, 33, 34],
[35, 36, 37, 38, 39]],
[[40, 41, 42, 43, 44],
[45, 46, 47, 48, 49],
[50, 51, 52, 53, 54],
[55, 56, 57, 58, 59]]])
This only works when that first index is a scalar. If it too were a list (or array), we'd get an intermediate copy, and couldn't set the value like this.
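A sketch of both points, using the same shapes as above. The array C and the value 7 are only for illustration. The first part shows an alternative that keeps the single indexing step and reorders the value instead; the second shows the copy caveat:

```python
import numpy as np

B = np.zeros((2, 3, 4, 5), dtype=int)
A = np.arange(3 * 4 * 5).reshape(3, 4, 5)   # shape (M, N, O)
ind = [0, 1, 2, 3, 4]

# B[1, :, :, ind] has shape (O, M, N), so move A's last axis to the front.
B[1, :, :, ind] = A.transpose(2, 0, 1)
print(np.array_equal(B[1], A))   # True

# Caveat: with a list as the first index, C[[1]] is a copy,
# so the chained assignment never reaches C.
C = np.zeros((2, 3, 4, 5), dtype=int)
C[[1]][:, :, :, ind] = 7
print(C.any())                   # False
```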
https://docs.scipy.org/doc/numpy-1.15.0/reference/arrays.indexing.html#combining-advanced-and-basic-indexing
It's come up in other SO questions, though not recently.
weird result when using both slice indexing and boolean indexing on a 3d array

Split last dimension of arrays in lower dimensional arrays

Assume we have an array with NxMxD shape. I want to get a list with D NxM arrays.
The correct way of doing it would be:
np.dsplit(myarray, D)
However, this returns D NxMx1 arrays.
I can achieve the desired result by doing something like:
[myarray[..., i] for i in range(D)]
Or:
[np.squeeze(subarray) for subarray in np.dsplit(myarray, D)]
However, I feel like it is a bit redundant to need to perform an additional operation. Am I missing any numpy function that returns the desired result?
Try myarray.swapaxes(1,2).swapaxes(1,0)
>>> import numpy as np
>>> a = np.arange(24).reshape(2,3,4)
>>> a
array([[[ 0, 1, 2, 3],
[ 4, 5, 6, 7],
[ 8, 9, 10, 11]],
[[12, 13, 14, 15],
[16, 17, 18, 19],
[20, 21, 22, 23]]])
>>> [a[:,:,i] for i in range(4)]
[array([[ 0, 4, 8],
[12, 16, 20]]),
array([[ 1, 5, 9],
[13, 17, 21]]),
array([[ 2, 6, 10],
[14, 18, 22]]),
array([[ 3, 7, 11],
[15, 19, 23]])]
>>> a.swapaxes(1,2).swapaxes(1,0)
array([[[ 0, 4, 8],
[12, 16, 20]],
[[ 1, 5, 9],
[13, 17, 21]],
[[ 2, 6, 10],
[14, 18, 22]],
[[ 3, 7, 11],
[15, 19, 23]]])
Edit: As pointed out by ajcr (thanks again), the transpose command is more convenient since the two swaps can be done in one step by using
myarray.transpose(2,0,1)
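As a small sketch of the same idea, np.moveaxis expresses "bring the last axis to the front" without spelling out the full permutation, and list() turns the result into the list of NxM arrays asked for. The names stacked and as_list are only illustrative:

```python
import numpy as np

a = np.arange(24).reshape(2, 3, 4)   # N=2, M=3, D=4

stacked = np.moveaxis(a, -1, 0)      # shape (D, N, M), a view of a
as_list = list(stacked)              # D arrays, each of shape (N, M)

print(stacked.shape)                                   # (4, 2, 3)
print(np.array_equal(stacked, a.transpose(2, 0, 1)))   # True
print(as_list[1])
```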
np.dsplit uses np.array_split, the core of which is:
sub_arys = []
sary = _nx.swapaxes(ary, axis, 0)
for i in range(Nsections):
st = div_points[i]; end = div_points[i+1]
sub_arys.append(_nx.swapaxes(sary[st:end], axis, 0))
with axis=-1, this is equivalent to:
[x[...,i:(i+1)] for i in np.arange(x.shape[-1])] # or
[x[...,[i]] for i in np.arange(x.shape[-1])]
which accounts for the singleton dimension.
So there's nothing wrong or inefficient about your
[x[...,i] for i in np.arange(x.shape[-1])]
Actually, in quick time tests, any use of dsplit is slow. Its generality costs. So adding squeeze is relatively cheap.
But by accepting the other answer, it looks like you are really looking for an array of the correct shape, rather than a list of arrays. For many operations that makes sense. split is more useful when the subarrays have more than one 'row' along the split axis, or even an uneven number of 'rows'.