I have a function init_tensor() which broadcasts a 2D matrix of shape (N, N) into a 3D block of shape (M, N, N), so that there are M matrices of size N x N:
def init_tensor(input_state, sample_size):
    return np.broadcast_to(input_state, (sample_size,)+input_state.shape)
So for example if I want to create 3 (4x4) matrices then I could do:
init_tensor(np.eye(4, dtype=complex), 3)
Out[462]:
array([[[1.+0.j, 0.+0.j, 0.+0.j, 0.+0.j],
[0.+0.j, 1.+0.j, 0.+0.j, 0.+0.j],
[0.+0.j, 0.+0.j, 1.+0.j, 0.+0.j],
[0.+0.j, 0.+0.j, 0.+0.j, 1.+0.j]],
[[1.+0.j, 0.+0.j, 0.+0.j, 0.+0.j],
[0.+0.j, 1.+0.j, 0.+0.j, 0.+0.j],
[0.+0.j, 0.+0.j, 1.+0.j, 0.+0.j],
[0.+0.j, 0.+0.j, 0.+0.j, 1.+0.j]],
[[1.+0.j, 0.+0.j, 0.+0.j, 0.+0.j],
[0.+0.j, 1.+0.j, 0.+0.j, 0.+0.j],
[0.+0.j, 0.+0.j, 1.+0.j, 0.+0.j],
[0.+0.j, 0.+0.j, 0.+0.j, 1.+0.j]]])
The problem I have is that I have some arrays with dim (1,M) which I'd like to fill into the 3D array as its elements. For a simple case if M was 3 and I have:
lambda1 = [l11,l12,l13]
lambda2 = [l21,l22,l23]
lambda3 = [l31,l32,l33]
tau1 = [t11,t12,t13]
tau2 = [t21,t22,t23]
tau3 = [t31,t32,t33]
I'd like a vectorised way where I can fill them into the tensor such that it becomes:
array([[[1.+0.j, 0.+0.j, 0.+0.j, 0.+0.j],
[ t11, l11, 0.+0.j, 0.+0.j],
[ t21, 0.+0.j, l21, 0.+0.j],
[ t31, 0.+0.j, 0.+0.j, l31]],
[[1.+0.j, 0.+0.j, 0.+0.j, 0.+0.j],
[ t12, l12, 0.+0.j, 0.+0.j],
[ t22, 0.+0.j, l22, 0.+0.j],
[ t32, 0.+0.j, 0.+0.j, l32]],
[[1.+0.j, 0.+0.j, 0.+0.j, 0.+0.j],
[ t13, l13, 0.+0.j, 0.+0.j],
[ t23, 0.+0.j, l23, 0.+0.j],
[ t33, 0.+0.j, 0.+0.j, l33]]])
The depth of the tensor will always be the same as the length of the 1D arrays, and the value of M can vary from 1 to 100.
You can use NumPy's advanced indexing to get this result:
import numpy as np
m = 3
n = 4
arr = np.ones((m, n, n), dtype=int)
lmbda = 10 * np.arange(1, (n-1)**2+1).reshape(n-1, n-1)
# lmbda = [[l11,l12,l13]
# [l21,l22,l23]
# [l31,l32,l33]]
# array([[10, 20, 30],
# [40, 50, 60],
# [70, 80, 90]])
tau = -10 * np.arange(1, (n-1)**2+1).reshape(n-1, n-1)
# array([[-10, -20, -30],
# [-40, -50, -60],
# [-70, -80, -90]])
arr[:, range(1, n), range(1, n)] = lmbda.T
arr[:, range(1, n), 0] = tau.T # arr[:, 1:, 0] = tau.T # alternative
# array([[[ 1, 1, 1, 1],
# [-10, 10, 1, 1],
# [-40, 1, 40, 1],
# [-70, 1, 1, 70]],
# [[ 1, 1, 1, 1],
# [-20, 20, 1, 1],
# [-50, 1, 50, 1],
# [-80, 1, 1, 80]],
# [[ 1, 1, 1, 1],
# [-30, 30, 1, 1],
# [-60, 1, 60, 1],
# [-90, 1, 1, 90]]])
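The same indexing can be applied to the identity tensor from the question. The following is only a sketch under a few assumptions: np.broadcast_to returns a read-only view, so a .copy() is added to init_tensor to make the result writable, and the lmbda and tau values are the placeholder numbers from the example above rather than real data.

```python
import numpy as np

def init_tensor(input_state, sample_size):
    # broadcast_to gives a read-only view; copy() makes it writable
    # (the copy is an addition to the function in the question)
    return np.broadcast_to(input_state, (sample_size,) + input_state.shape).copy()

m, n = 3, 4
# Placeholder values: row i holds lambda_(i+1), column k is the k-th sample
lmbda = (10 * np.arange(1, (n - 1) ** 2 + 1).reshape(n - 1, n - 1)).astype(complex)
tau = -lmbda

arr = init_tensor(np.eye(n, dtype=complex), m)
idx = np.arange(1, n)
arr[:, idx, idx] = lmbda.T   # diagonal entries (1,1), (2,2), (3,3) of every matrix
arr[:, idx, 0] = tau.T       # first column, rows 1..n-1, of every matrix

print(arr[1])
```

Off-diagonal zeros and the leading 1 come from the identity matrix and are left untouched, because only the indexed positions are assigned.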
Consider the following:
A = np.zeros((100,100)) # TODO: populate A
filt = median_filter(A, size=5) # doesn't impact A.shape
view = filt[30:40, 30:40]
subview = view[0:5, 0:5]
Is it possible to extract from subview the corresponding rectangle within A?
I'd like to do something like:
coords = get_rect(subview)
rect_A = A[coords]
But if I'm constantly having to pass bounding-rects thru the system the code uglifies fast.
numpy must store this information internally, but is it possible to access it?
PS I'm not doing anything fancy like view = A[::2]
PPS From reviewing the excellent answer, it looks like it should be possible to subclass numpy.ndarray, adding a .parent property and a .get_global_rect() method. But it looks like a HARD task.
In [40]: x = np.arange(24).reshape(4,6)
__array_interface__ is a way of viewing everything about a numpy array.
In [41]: x.__array_interface__
Out[41]:
{'data': (43385712, False),
'strides': None,
'descr': [('', '<i8')],
'typestr': '<i8',
'shape': (4, 6),
'version': 3}
In [42]: x.strides
Out[42]: (48, 8)
For a view:
In [43]: y = x[:3,1:4]
In [44]: y
Out[44]:
array([[ 1, 2, 3],
[ 7, 8, 9],
[13, 14, 15]])
In [45]: y.__array_interface__
Out[45]:
{'data': (43385720, False),
'strides': (48, 8),
'descr': [('', '<i8')],
'typestr': '<i8',
'shape': (3, 3),
'version': 3}
In [46]: y.base
Out[46]:
array([ 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16,
17, 18, 19, 20, 21, 22, 23])
x.base is the same, the original np.arange(24).
The key differences in y are the shape and the data pointer, which "points" 8 bytes (one element) further along.
So while one could, in theory, deduce the indexing used to create y, numpy does not have a function or method to do that for us. Keeping track of your own "coordinates" is the best option.
Another way to put it, y is a numpy.ndarray, just like x. It does not carry any extra information about how it was created. The same applies to z, a view of y.
As for the 1d base
In [48]: x.base.strides
Out[48]: (8,)
In [49]: x.base.shape
Out[49]: (24,)
In [50]: x.base.__array_interface__
Out[50]:
{'data': (43385712, False),
'strides': None,
'descr': [('', '<i8')],
'typestr': '<i8',
'shape': (24,),
'version': 3}
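For the simple case in the question, the "in theory" deduction can be sketched from the data pointers and strides. This is an illustration only, not a NumPy feature: get_rect here is a hypothetical helper, and it assumes the parent is C-contiguous, the view was made with plain step-1 slices, and the view has the same number of dimensions as the parent. The multiplication by 2 is just a stand-in for median_filter, so that the sketch does not depend on scipy.

```python
import numpy as np

def get_rect(view, parent):
    """Hypothetical helper: slices such that parent[rect] covers the same elements as view."""
    # Byte distance between the first element of the view and of the parent
    offset = view.__array_interface__['data'][0] - parent.__array_interface__['data'][0]
    starts = []
    for stride in parent.strides:
        # Peel off one index per dimension, largest stride first
        start, offset = divmod(offset, stride)
        starts.append(start)
    return tuple(slice(s, s + n) for s, n in zip(starts, view.shape))

A = np.arange(10000.0).reshape(100, 100)
filt = A * 2                      # stand-in for median_filter(A, size=5); same shape as A
view = filt[30:40, 30:40]
subview = view[0:5, 0:5]

rect = get_rect(subview, filt)
rect_A = A[rect]                  # same rectangle, taken from A
print(rect)
```

Anything beyond this case (steps, negative strides, dropped or added dimensions) breaks the assumptions, which is why keeping track of the coordinates yourself remains the more robust option.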
Question
Why are the numpy tuple indexing behaviors inconsistent? Please explain the rationale or design decision behind these behaviors. In my understanding, Z[(0,2)] and Z[(0, 2), (0)] are both tuple indexing, so I expected consistent copy/view behavior. If this is incorrect, please explain.
import numpy as np
Z = np.arange(36).reshape(3, 3, 4)
print("Z is \n{}\n".format(Z))
b = Z[
(0,2) # Select Z[0][2]
]
print("Tuple indexing Z[(0,2)] is \n{}\nIs view? {}\n".format(
b,
b.base is not None
))
c = Z[ # Select Z[0][0][1] & Z[0][2][1]
(0,2),
(0)
]
print("Tuple indexing Z[(0, 2), (0)] is \n{}\nIs view? {}\n".format(
c,
c.base is not None
))
Z is
[[[ 0 1 2 3]
[ 4 5 6 7]
[ 8 9 10 11]]
[[12 13 14 15]
[16 17 18 19]
[20 21 22 23]]
[[24 25 26 27]
[28 29 30 31]
[32 33 34 35]]]
Tuple indexing Z[(0,2)] is
[ 8 9 10 11]
Is view? True
Tuple indexing Z[(0, 2), (0)] is
[[ 0 1 2 3]
[24 25 26 27]]
Is view? False
Numpy indexing is confusing, and I wonder how people build their understanding of it. If there is a good way to understand it, or a cheat sheet, please advise.
It's the comma that creates a tuple. The () just set boundaries where needed.
Thus
Z[(0,2)]
Z[0,2]
are the same: both select on the first 2 dimensions. Whether that returns an element or an array depends on how many dimensions Z has.
The same interpretation applies to the other case.
Z[(0, 2), (0)]
Z[( np.array([0,2]), 0)]
Z[ np.array([0,2]), 0]
are the same: the first dimension is indexed with a list/array, so this is advanced indexing. It's a copy.
[ 8 9 10 11]
is a row of the 3d array; it's a contiguous block of Z.
[[ 0 1 2 3]
[24 25 26 27]]
is 2 rows from Z. They aren't contiguous, so there's no way of identifying them with just shape and strides (and offset in the databuffer).
details
__array_interface__ gives details about the underlying data of an array
In [146]: Z = np.arange(36).reshape(3,3,4)
In [147]: Z.__array_interface__
Out[147]:
{'data': (38255712, False),
'strides': None,
'descr': [('', '<i8')],
'typestr': '<i8',
'shape': (3, 3, 4),
'version': 3}
In [148]: Z.strides
Out[148]: (96, 32, 8)
For the view:
In [149]: Z1 = Z[0,2]
In [150]: Z1
Out[150]: array([ 8, 9, 10, 11])
In [151]: Z1.__array_interface__
Out[151]:
{'data': (38255776, False), # 38255712+8*8
'strides': None,
'descr': [('', '<i8')],
'typestr': '<i8',
'shape': (4,),
'version': 3}
The data buffer pointer is 8 elements (64 bytes) further along in Z's buffer. The shape is much reduced.
In [152]: Z2 = Z[[0,2],0]
In [153]: Z2
Out[153]:
array([[ 0, 1, 2, 3],
[24, 25, 26, 27]])
In [154]: Z2.__array_interface__
Out[154]:
{'data': (31443104, False), # an entirely different location
'strides': None,
'descr': [('', '<i8')],
'typestr': '<i8',
'shape': (2, 4),
'version': 3}
Z2 is the same as two selections:
In [158]: Z[0,0]
Out[158]: array([0, 1, 2, 3])
In [159]: Z[2,0]
Out[159]: array([24, 25, 26, 27])
It is not
Z[0][0][1] & Z[0][2][1]
Z[0,0,1] & Z[0,2,1]
Compare that with a 2 row slice:
In [156]: Z3 = Z[0:2,0]
In [157]: Z3.__array_interface__
Out[157]:
{'data': (38255712, False), # same as Z's
'strides': (96, 8),
'descr': [('', '<i8')],
'typestr': '<i8',
'shape': (2, 4),
'version': 3}
A view is returned if the new array can be described with shape, strides and all or part of the original data buffer.
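A quick way to check these cases without reading pointers by hand is np.shares_memory. The sketch below reuses the Z from above; the variable names are only for illustration.

```python
import numpy as np

Z = np.arange(36).reshape(3, 3, 4)

basic = Z[0, 2]          # same as Z[(0, 2)]: basic indexing
advanced = Z[[0, 2], 0]  # same as Z[(0, 2), (0)]: advanced indexing
sliced = Z[0:2, 0]       # slice plus integer: basic indexing

print(np.shares_memory(Z, basic))     # True, a view
print(np.shares_memory(Z, advanced))  # False, a copy
print(np.shares_memory(Z, sliced))    # True, a view
```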
I have a matrix that looks like that:
>> X
>>
[[5.1 1.4]
[4.9 1.4]
[4.7 1.3]
[4.6 1.5]
[5. 1.4]]
I want to get its first column as an array of [5.1, 4.9, 4.7, 4.6, 5.]
However, when I try to get it with X[:,0] I get
>> [[5.1]
[4.9]
[4.7]
[4.6]
[5. ]]
which is something different. How do I get it as an array?
You can use a list comprehension for this kind of thing:
import numpy as np
X = np.array([[5.1, 1.4], [4.9, 1.4], [4.7, 1.3], [4.6, 1.5], [5.0, 1.4]])
X_0 = [i for i in X[:,0]]
print(X_0)
Output:
[5.1, 4.9, 4.7, 4.6, 5.0]
Almost there! Just reshape your result:
X[:,0].reshape(1,-1)
Outputs:
[[5.1 4.9 4.7 4.6 5. ]]
Full code:
import numpy as np
X=np.array([[5.1 ,1.4],[4.9 ,1.4], [4.7 ,1.3], [4.6 ,1.5], [5. , 1.4]])
print(X)
print(X[:,0].reshape(1,-1))
With regular numpy array:
In [3]: x = np.arange(15).reshape(5,3)
In [4]: x
Out[4]:
array([[ 0, 1, 2],
[ 3, 4, 5],
[ 6, 7, 8],
[ 9, 10, 11],
[12, 13, 14]])
In [5]: x[:,0]
Out[5]: array([ 0, 3, 6, 9, 12])
With np.matrix (its use is discouraged, if not actually deprecated):
In [6]: X = np.matrix(x)
In [7]: X
Out[7]:
matrix([[ 0, 1, 2],
[ 3, 4, 5],
[ 6, 7, 8],
[ 9, 10, 11],
[12, 13, 14]])
In [8]: print(X)
[[ 0 1 2]
[ 3 4 5]
[ 6 7 8]
[ 9 10 11]
[12 13 14]]
In [9]: X[:,0]
Out[9]:
matrix([[ 0],
[ 3],
[ 6],
[ 9],
[12]])
In [10]: X[:,0].T
Out[10]: matrix([[ 0, 3, 6, 9, 12]])
To get a 1d array, convert to an array and ravel, or do it in one step:
In [11]: X[:,0].A1
Out[11]: array([ 0, 3, 6, 9, 12])
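Applied to the data from the question, and assuming X there really is an np.matrix (which the 2d output of X[:,0] suggests), the two routes look roughly like this:

```python
import numpy as np

# Assumption: the question's X is an np.matrix, not a plain ndarray
X = np.matrix([[5.1, 1.4], [4.9, 1.4], [4.7, 1.3], [4.6, 1.5], [5.0, 1.4]])

col_a = np.asarray(X)[:, 0]           # convert to ndarray first, then index
col_b = np.asarray(X[:, 0]).ravel()   # index first, then convert and flatten
col_c = X[:, 0].A1                    # the one-step matrix attribute

print(col_a)
```

All three give a plain 1d array of shape (5,).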
If I create a numpy array, and another to serve as a selective index into it:
>>> x
array([[ 2, 3, 4],
[ 5, 6, 7],
[ 6, 7, 8],
[11, 12, 13]])
>>> nz
array([ True, True, False, True], dtype=bool)
then direct use of nz returns a view of the original array:
>>> x[nz,:]
array([[ 2, 3, 4],
[ 5, 6, 7],
[11, 12, 13]])
>>> x[nz,:] += 2
>>> x
array([[ 4, 5, 6],
[ 7, 8, 9],
[ 6, 7, 8],
[13, 14, 15]])
however, naturally, an assignment makes a copy:
>>> v = x[nz,:]
Any operation on v is on the copy, and has no effect on the original array.
Is there any way to create a named view, from x[nz,:], simply to abbreviate code, or which I can pass around, so operations on the named view will affect only the selected elements of x?
Numpy has masked_array, which might be what you are looking for:
import numpy as np
x = np.asarray([[ 2, 3, 4],[ 5, 6, 7],[ 6, 7, 8],[11, 12, 13]])
nz = np.asarray([ True, True, False, True], dtype=bool)
mx = np.ma.masked_array(x, ~nz.repeat(3)) # True means masked, so "~" is needed
mx += 2
# x changed as well because it is the base of mx
print(x)
print(x is mx.base)
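A lighter alternative, sketched here as an assumption about what would fit the use case, is to pass the array and the mask around together and index at the point of assignment. Note that x[nz,:] += 2 modifies x because it ends in an indexed assignment back into x, while x[nz,:] on its own returns a copy. The helper add_to_rows is hypothetical.

```python
import numpy as np

x = np.array([[2, 3, 4], [5, 6, 7], [6, 7, 8], [11, 12, 13]])
nz = np.array([True, True, False, True])

v = x[nz, :]   # boolean indexing on its own returns a copy
v += 2         # changes v only; x is untouched
print(np.shares_memory(x, v))   # False

def add_to_rows(arr, mask, value):
    # Hypothetical helper: the indexed assignment writes into arr itself
    arr[mask, :] += value

add_to_rows(x, nz, 2)
print(x)
```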
Code:
import numpy as np
shape = np.array([4, 4]) # 4x4 grid of window corners
grid = np.array([x.ravel() for x in np.meshgrid(*[np.arange(x) for i, x in enumerate(shape)], indexing='ij')]).T
slices = [tuple(slice(box[i], box[i] + 3) for i in range(len(box))) for box in grid] # 3x3 windows
score = np.zeros((6,6,3))
column = np.arange(432).reshape(1, 432) #just for example
column
>> array([[ 0, 1, 2, 3, ... 425, 426, 427, 428, 429, 430, 431]])
column = column.reshape((16, 3, 3, 3))
for i, window in enumerate(slices):
    score[window] += column[i]
score
>> array([[[0.000e+00, 1.000e+00, 2.000e+00],
[3.000e+01, 3.200e+01, 3.400e+01],
[9.000e+01, 9.300e+01, 9.600e+01], ...
[8.280e+02, 8.300e+02, 8.320e+02],
[4.290e+02, 4.300e+02, 4.310e+02]]])
It works, but the last 2 lines take a lot of time because they run in a loop. The problem is that the 'grid' variable contains an array of windows, and I don't know how to speed up the process.
Let's simplify the problem a bit: reduce the dimensions, and drop the final size 3 dimension:
In [265]: shape = np.array([4,4])
In [266]: grid = np.array([x.ravel() for x in np.meshgrid(*[np.arange(x) for i, x in enumerate(shape)], indexing='ij')]).T
...: grid = [tuple(slice(box[i], box[i] + 3) for i in range(len(box))) for box in grid]
In [267]: len(grid)
Out[267]: 16
In [268]: score = np.arange(36).reshape(6,6)
In [269]: X = np.array([score[x] for x in grid]).reshape(4,4,3,3)
In [270]: X
Out[270]:
array([[[[ 0, 1, 2],
[ 6, 7, 8],
[12, 13, 14]],
[[ 1, 2, 3],
[ 7, 8, 9],
[13, 14, 15]],
[[ 2, 3, 4],
[ 8, 9, 10],
[14, 15, 16]],
....
[[21, 22, 23],
[27, 28, 29],
[33, 34, 35]]]])
This is a moving window: one (3,3) array, shifted over by 1, ..., shifted down by 1, etc.
With as_strided it is possible to construct a view of the array that consists of all these windows, without actually copying values. Having worked with as_strided before, I was able to construct the equivalent strides as:
In [271]: score.shape
Out[271]: (6, 6)
In [272]: score.strides
Out[272]: (48, 8)
In [273]: ast = np.lib.stride_tricks.as_strided
In [274]: x=ast(score, shape=(4,4,3,3), strides=(48,8,48,8))
In [275]: np.allclose(X,x)
Out[275]: True
This could be extended to your (28,28,3) dimensions, and turned into the summation.
Generating such moving windows has been covered in previous SO questions. And it's also implemented in one of the image processing packages.
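NumPy 1.20 and later also provide np.lib.stride_tricks.sliding_window_view, which builds the same kind of windowed view without hand-written strides. A minimal sketch for the 2d case above, assuming that NumPy version is available:

```python
import numpy as np
from numpy.lib.stride_tricks import as_strided, sliding_window_view

score = np.arange(36).reshape(6, 6)

# Hand-built view, with strides taken from the array rather than hard-coded
s0, s1 = score.strides
x = as_strided(score, shape=(4, 4, 3, 3), strides=(s0, s1, s0, s1))

# Same windows from the library helper (read-only by default)
w = sliding_window_view(score, (3, 3))

print(w.shape)               # (4, 4, 3, 3)
print(np.array_equal(x, w))  # True
```

Taking the strides from score.strides keeps the as_strided call correct when the integer size differs between platforms.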
Adaptation for a 3 channel image,
In [45]: arr.shape
Out[45]: (6, 6, 3)
In [46]: arr.strides
Out[46]: (144, 24, 8)
In [47]: arr[:3,:3,0]
Out[47]:
array([[ 0., 1., 2.],
[ 6., 7., 8.],
[12., 13., 14.]])
In [48]: x = ast(arr, shape=(4,4,3,3,3), strides=(144,24,144,24,8))
In [49]: x[0,0,:,:,0]
Out[49]:
array([[ 0., 1., 2.],
[ 6., 7., 8.],
[12., 13., 14.]])
Since we are moving the window by one element at a time, the strides for x are easily derived from the source strides.
For 4x4 windows, just change the shape
x = ast(arr, shape=(3,3,4,4,3), strides=(144,24,144,24,8))
In "Efficiently Using Multiple Numpy Slices for Random Image Cropping", @Divikar suggests using skimage.
With the default step=1, the result is compatible:
In [55]: from skimage.util.shape import view_as_windows
In [63]: y = view_as_windows(arr,(4,4,3))
In [64]: y.shape
Out[64]: (3, 3, 1, 4, 4, 3)
In [69]: np.allclose(x,y[:,:,0])
Out[69]: True
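For the summation in the question, the windowed views above are not directly usable as a target, because += through overlapping windows does not accumulate reliably. One possible workaround, sketched here with the small sizes from the question (a 4x4 grid of 3x3 windows, 3 channels), is to loop over the 9 offsets inside a window instead of over all windows. The names n_win, k, ch and blocks are illustrative only.

```python
import numpy as np

n_win, k, ch = 4, 3, 3   # windows per axis, window size, channels
column = np.arange(n_win * n_win * k * k * ch, dtype=float)
column = column.reshape(n_win * n_win, k, k, ch)

# Reference result: the original loop over all windows
expected = np.zeros((n_win + k - 1, n_win + k - 1, ch))
for i in range(n_win * n_win):
    a, b = divmod(i, n_win)          # window i starts at row a, column b
    expected[a:a + k, b:b + k] += column[i]

# Loop over offsets inside the window; each step adds a whole (4, 4, 3) block
blocks = column.reshape(n_win, n_win, k, k, ch)
score = np.zeros_like(expected)
for di in range(k):
    for dj in range(k):
        score[di:di + n_win, dj:dj + n_win] += blocks[:, :, di, dj]

print(np.allclose(score, expected))  # True
```

The number of Python-level iterations then depends on the window size (k*k) rather than on the number of windows, which is usually the larger of the two for image-sized inputs.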