I'd like to write a numpy function that takes an MxN array A, a window length L, and an MxP array idxs of starting indices into the M rows of A, and that selects P arbitrary slices of length L from each of the M rows of A. Ideally it would work on the last dimension of A regardless of how many dimensions A has, so that all dims of A and idxs match except the last one. Examples:
If A is just 1D:
A = np.array([1, 2, 3, 4, 5, 6])
window_len = 3
idxs = np.array([1, 3])
result = magical_routine(A, idxs, window_len)
Where result is a 2x3 array since I selected 2 slices of len 3:
np.array([[ 2, 3, 4],
[ 4, 5, 6]])
If A is 2D:
A = np.array([[ 1, 2, 3, 4, 5, 6],
[ 7, 8, 9,10,11,12],
[13,14,15,16,17,18]])
window_len = 3
idxs = np.array([[1, 3],
[0, 1],
[2, 2]])
result = magical_routine(A, idxs, window_len)
Where result is a 3x2x3 array since there are 3 rows of A, and I selected 2 slices of len 3 from each row:
np.array([[[ 2, 3, 4], [ 4, 5, 6]],
[[ 7, 8, 9], [ 8, 9,10]],
[[15,16,17], [15,16,17]]])
And so on.
I have discovered a number of inefficient ways to do this, along with ways that work for a specific number of dimensions of A. For 2D, the following is pretty tidy:
col_idxs = np.add.outer(idxs, np.arange(window_len))
np.take_along_axis(A[:, np.newaxis], col_idxs, axis=-1)
I can't see a nice way to generalize this for 1D and other D's though...
Is anyone aware of an efficient way that generalizes to any number of dims?
For your 1d case
In [271]: A=np.arange(1,7)
In [272]: idxs = np.array([1,3])
Using the kind of iteration that these questions usually get:
In [273]: np.vstack([A[i:i+3] for i in idxs])
Out[273]:
array([[2, 3, 4],
[4, 5, 6]])
Alternatively, generate all the indices up front and index once. linspace is handy for this (though it's not the only option):
In [278]: j = np.linspace(idxs,idxs+3,3,endpoint=False)
In [279]: j
Out[279]:
array([[1., 3.],
[2., 4.],
[3., 5.]])
In [282]: A[j.T.astype(int)]
Out[282]:
array([[2, 3, 4],
[4, 5, 6]])
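The same index grid can also be built with plain integer arithmetic, which avoids the float round trip that linspace introduces. A minimal sketch, reusing A and idxs from above; window_len is just a name chosen here for the 3 used in the transcript:

```python
import numpy as np

A = np.arange(1, 7)
idxs = np.array([1, 3])
window_len = 3  # same window length as in the linspace example

# A column of start positions broadcast against a row of offsets gives a
# (len(idxs), window_len) integer index array, so no astype(int) is needed.
j = idxs[:, None] + np.arange(window_len)
result = A[j]
```

Here j already has one row per window, so no transpose is needed before indexing.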
for the 2d
In [284]: B
Out[284]:
array([[ 1, 2, 3, 4, 5, 6],
[ 7, 8, 9, 10, 11, 12],
[13, 14, 15, 16, 17, 18]])
In [285]: idxs = np.array([[1, 3],
...: [0, 1],
...: [2, 2]])
In [286]: j = np.linspace(idxs,idxs+3,3,endpoint=False)
In [287]: j
Out[287]:
array([[[1., 3.],
[0., 1.],
[2., 2.]],
[[2., 4.],
[1., 2.],
[3., 3.]],
[[3., 5.],
[2., 3.],
[4., 4.]]])
With a bit of trial and error, pair up the indices to get:
In [292]: B[np.arange(3)[:,None,None],j.astype(int).transpose(1,2,0)]
Out[292]:
array([[[ 2, 3, 4],
[ 4, 5, 6]],
[[ 7, 8, 9],
[ 8, 9, 10]],
[[15, 16, 17],
[15, 16, 17]]])
Or iterate as in the first case, but with an extra layer:
In [294]: np.array([[B[j,i:i+3] for i in idxs[j]] for j in range(3)])
Out[294]:
array([[[ 2, 3, 4],
[ 4, 5, 6]],
[[ 7, 8, 9],
[ 8, 9, 10]],
[[15, 16, 17],
[15, 16, 17]]])
With sliding windows:
In [295]: aa = np.lib.stride_tricks.sliding_window_view(A,3)
In [296]: aa.shape
Out[296]: (4, 3)
In [297]: aa
Out[297]:
array([[1, 2, 3],
[2, 3, 4],
[3, 4, 5],
[4, 5, 6]])
In [298]: aa[[1,3]]
Out[298]:
array([[2, 3, 4],
[4, 5, 6]])
and
In [300]: bb = np.lib.stride_tricks.sliding_window_view(B,(1,3))
In [301]: bb.shape
Out[301]: (3, 4, 1, 3)
In [302]: bb[np.arange(3)[:,None],idxs,0,:]
Out[302]:
array([[[ 2, 3, 4],
[ 4, 5, 6]],
[[ 7, 8, 9],
[ 8, 9, 10]],
[[15, 16, 17],
[15, 16, 17]]])
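The sliding-window idea can probably be made independent of the number of dimensions by windowing only the last axis and then picking whole windows with take_along_axis. This is a sketch, assuming NumPy 1.20 or later (where sliding_window_view is available); the name windows_at is made up for illustration:

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def windows_at(A, idxs, window_len):
    # windows has shape A.shape[:-1] + (N - window_len + 1, window_len)
    windows = sliding_window_view(A, window_len, axis=-1)
    # Pick whole windows along the second-to-last axis; the trailing
    # length-1 axis of the index broadcasts across the window contents.
    return np.take_along_axis(windows, idxs[..., np.newaxis], axis=-2)

A = np.arange(1, 7)
B = np.arange(1, 19).reshape(3, 6)
r1 = windows_at(A, np.array([1, 3]), 3)
r2 = windows_at(B, np.array([[1, 3], [0, 1], [2, 2]]), 3)
```

r1 and r2 should match Out[298] and Out[302] above. The window view itself copies nothing; only the final selection allocates a new array.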
I got it! I was almost there:
def magical_routine(A, idxs, window_len=2000):
    col_idxs = np.add.outer(idxs, np.arange(window_len))
    return np.take_along_axis(A[..., np.newaxis, :], col_idxs, axis=-1)
I just needed to always add the new axis to A's second to last dim, and then leave remaining axes alone.
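As a sanity check, the function can be compared with a slow but obvious loop for 1D, 2D and 3D inputs. This is only a sketch: the helper name reference and the random test shapes are made up here and are not part of the original post.

```python
import numpy as np

def magical_routine(A, idxs, window_len=2000):
    col_idxs = np.add.outer(idxs, np.arange(window_len))
    return np.take_along_axis(A[..., np.newaxis, :], col_idxs, axis=-1)

def reference(A, idxs, window_len):
    # Hypothetical checker: loop over every leading index and every start.
    out = np.empty(idxs.shape + (window_len,), dtype=A.dtype)
    for lead in np.ndindex(*idxs.shape[:-1]):
        for p, start in enumerate(idxs[lead]):
            out[lead + (p,)] = A[lead][start:start + window_len]
    return out

rng = np.random.default_rng(0)
checked = []
for shape in [(6,), (3, 6), (2, 3, 6)]:
    A = rng.integers(0, 100, size=shape)
    # Start positions are chosen so that every window stays inside the row.
    idxs = rng.integers(0, shape[-1] - 3 + 1, size=shape[:-1] + (2,))
    checked.append(np.array_equal(magical_routine(A, idxs, 3),
                                  reference(A, idxs, 3)))
```

Note that out-of-range starts are not handled by either version: negative starts silently wrap around in take_along_axis, and starts that are too large raise an IndexError there.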
I wonder if this is possible. I have two 2D arrays:
X[7][9] = 10
Y[7][9] = 5
From the above, I want to create the following two 2D arrays:
X'[5][10] = 9
Y'[5][10] = 7
Is it possible to accomplish this? The values of X and Y are bounded and won't exceed the shape of X and Y. Also, X and Y have the same shape.
thanks in advance.
You should be able to use np.nditer to keep track of the multi-index and the corresponding values of the arrays.
import numpy as np

rng = np.random.RandomState(0)
X = rng.randint(low=0, high=10, size=(10, 10))
Y = rng.randint(low=0, high=10, size=(10, 10))
X_prime = X.copy()
Y_prime = Y.copy()
it = np.nditer([X, Y], flags=['multi_index'])
for x, y in it:
    # (i, j) is the position currently being visited in X and Y
    i, j = it.multi_index
    X_prime[y, x] = j
    Y_prime[y, x] = i
I believe this is the result you were expecting:
>>> X[7, 9], Y[7, 9]
(3, 9)
>>> X_prime[9, 3], Y_prime[9, 3]
(9, 7)
>>> X[1, 2], Y[1, 2]
(8, 2)
>>> X_prime[2, 8], Y_prime[2, 8]
(2, 1)
In [147]: X = np.random.randint(0,5,(5,5))
In [148]: Y = np.random.randint(0,5,(5,5))
Similar to Matt's answer, but using ndindex to generate the indices. There are various ways of generating all such values. Internally I believe ndindex uses nditer:
In [149]: X_,Y_ = np.zeros_like(X)-1,np.zeros_like(Y)-1
In [150]: for i,j in np.ndindex(*X.shape):
...:     k,l = X[i,j], Y[i,j]
...:     X_[k,l] = i
...:     Y_[k,l] = j
...:
In [151]: X
Out[151]:
array([[2, 4, 3, 4, 2],
[0, 3, 0, 2, 3],
[1, 1, 4, 4, 4],
[2, 1, 2, 2, 0],
[0, 1, 0, 1, 4]])
In [152]: Y
Out[152]:
array([[1, 2, 1, 3, 0],
[4, 2, 4, 0, 4],
[4, 3, 3, 2, 1],
[0, 3, 0, 2, 2],
[1, 4, 2, 0, 0]])
In [153]: X_
Out[153]:
array([[-1, 4, 4, -1, 1],
[ 4, -1, -1, 3, 4],
[ 3, 0, 3, -1, -1],
[-1, 0, 1, -1, 1],
[ 4, 2, 2, 2, -1]])
In [154]: Y_
Out[154]:
array([[-1, 0, 2, -1, 2],
[ 3, -1, -1, 1, 1],
[ 2, 0, 3, -1, -1],
[-1, 2, 1, -1, 4],
[ 4, 4, 3, 2, -1]])
Notice that with randomly generated arrays, the mapping is not full (the -1 values). And if there are duplicates, the last replaces previous values.
Handling duplicates - note the change in X_:
In [156]: for i,j in np.ndindex(*X.shape):
...:     k,l = X[i,j], Y[i,j]
...:     if X_[k,l]==-1:
...:         X_[k,l] = i
...:         Y_[k,l] = j
...:     else:
...:         X_[k,l] += i
...:         Y_[k,l] += j
...:
...:
In [157]: X_
Out[157]:
array([[-1, 4, 7, -1, 2],
[ 4, -1, -1, 5, 6],
[ 7, 0, 3, -1, -1],
[-1, 0, 1, -1, 1],
[ 4, 2, 2, 2, -1]])
If the mapping is complete and one to one, it might be possible to do this mapping in a whole-array non-iterative fashion, which would be faster than this.
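For the complete, one-to-one case, a whole-array version might look like the following. It is a sketch under that assumption only: the test arrays are built from a random permutation (a construction made up here) so that every target cell is hit exactly once, and it follows the X_[X, Y] = i convention of the loop above.

```python
import numpy as np

rng = np.random.default_rng(0)
shape = (4, 5)

# Build X, Y so that the pairs (X[i,j], Y[i,j]) visit every cell exactly once.
perm = rng.permutation(shape[0] * shape[1]).reshape(shape)
X, Y = np.divmod(perm, shape[1])

# Row index and column index of every cell, as arrays of the same shape.
i, j = np.indices(shape)

X_ = np.empty_like(X)
Y_ = np.empty_like(Y)
X_[X, Y] = i   # same effect as X_[X[i,j], Y[i,j]] = i for every (i, j)
Y_[X, Y] = j
```

With duplicates, one of the competing values ends up in the cell and the rest are discarded, so the summing variant shown above would need something like np.add.at instead of plain assignment.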
trainX.size == 43120000
trainX = trainX.reshape([-1, 28, 28, 1])
(1) Does reshape accept a list as an argument instead of a tuple?
(2) Are the following two statements equivalent?
trainX = trainX.reshape([-1, 28, 28, 1])
trainX = trainX.reshape((55000, 28, 28, 1))
Try the variations:
In [1]: np.arange(12).reshape(3,4)
Out[1]:
array([[ 0, 1, 2, 3],
[ 4, 5, 6, 7],
[ 8, 9, 10, 11]])
In [2]: np.arange(12).reshape([3,4])
Out[2]:
array([[ 0, 1, 2, 3],
[ 4, 5, 6, 7],
[ 8, 9, 10, 11]])
In [3]: np.arange(12).reshape((3,4))
Out[3]:
array([[ 0, 1, 2, 3],
[ 4, 5, 6, 7],
[ 8, 9, 10, 11]])
With the reshape method, the shape can be given as separate arguments, a tuple, or a list. With the reshape function it has to be a list or tuple, to separate it from the first (array) argument:
In [4]: np.reshape(np.arange(12), (3,4))
Out[4]:
array([[ 0, 1, 2, 3],
[ 4, 5, 6, 7],
[ 8, 9, 10, 11]])
and yes, one -1 can be used. The total size of the reshape is fixed, so one value can be deduced from the others.
In [5]: np.arange(12).reshape(-1,4)
Out[5]:
array([[ 0, 1, 2, 3],
[ 4, 5, 6, 7],
[ 8, 9, 10, 11]])
The method documentation has this note:
Unlike the free function numpy.reshape, this method on ndarray allows
the elements of the shape parameter to be passed in as separate arguments.
For example, a.reshape(10, 11) is equivalent to
a.reshape((10, 11)).
It's a builtin function, but the signature looks like x.reshape(*shape), and it tries to be flexible as long as the values make sense.
From the numpy documentation:
newshape : int or tuple of ints
The new shape should be compatible with the original shape. If an
integer, then the result will be a 1-D array of that length. One shape
dimension can be -1. In this case, the value is inferred from the
length of the array and remaining dimensions.
So yes, -1 for one dimension is fine and your two statements are equivalent. About the tuple requirement,
>>> import numpy as np
>>> a = np.arange(9)
>>> a
array([0, 1, 2, 3, 4, 5, 6, 7, 8])
>>> a.reshape([3,3])
array([[0, 1, 2],
[3, 4, 5],
[6, 7, 8]])
>>>
So apparently a list is good as well.
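To tie this back to the numbers in the question: the -1 is resolved by dividing the total size by the product of the other dimensions. The sketch below checks that arithmetic and then repeats the reshape on a small stand-in array (5 "images" instead of 55000, an assumption made only to keep the example light):

```python
import numpy as np

total = 43120000
# The -1 entry is inferred as total / (28 * 28 * 1).
inferred = total // (28 * 28 * 1)

# Same idea on a small stand-in array.
small = np.arange(5 * 28 * 28)
a = small.reshape([-1, 28, 28, 1])      # list, one dimension inferred
b = small.reshape((5, 28, 28, 1))       # explicit tuple
c = small.reshape(5, 28, 28, 1)         # separate arguments (method only)
d = np.reshape(small, (-1, 28, 28, 1))  # free function takes a sequence
```

All four results have the same shape and contents, and inferred comes out as 55000, which is why the two statements in the question agree.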
Consider a 3D tensor T (w x h x d).
The goal is to create a tensor R (w x h x K), where K = d x k, by tiling along the 3rd dimension in a particular way.
The tensor should repeat each slice along the 3rd dimension k times, meaning:
T[:,:,0]=R[:,:,0:k] and T[:,:,1]=R[:,:,k:2*k]
There's a subtle difference from standard tiling, which copies the whole block k times, so that the copies of T[:,:,0] land at R[:,:,::d], i.e. at every d-th position along the 3rd dimension.
Use np.repeat along that axis -
np.repeat(T,k,axis=2)
Sample run -
In [688]: # Setup
...: w,h,d = 2,3,4
...: k = 2
...: T = np.random.randint(0,9,(w,h,d))
...:
...: # Original approach
...: R = np.zeros((w,h,d*k),dtype=T.dtype)
...: for i in range(d):
...:     R[:,:,i*k:(i+1)*k] = T[:,:,i][...,None]
...:
In [692]: T
Out[692]:
array([[[4, 5, 6, 4],
[5, 4, 4, 3],
[8, 0, 0, 8]],
[[7, 3, 8, 0],
[8, 7, 0, 8],
[3, 6, 8, 5]]])
In [690]: R
Out[690]:
array([[[4, 4, 5, 5, 6, 6, 4, 4],
[5, 5, 4, 4, 4, 4, 3, 3],
[8, 8, 0, 0, 0, 0, 8, 8]],
[[7, 7, 3, 3, 8, 8, 0, 0],
[8, 8, 7, 7, 0, 0, 8, 8],
[3, 3, 6, 6, 8, 8, 5, 5]]])
In [691]: np.allclose(R, np.repeat(T,k,axis=2))
Out[691]: True
Alternatively with np.tile and reshape -
np.tile(T[...,None],k).reshape(w,h,-1)
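The difference between the two layouts is easiest to see on a tiny array. A small sketch with made-up sizes (w=1, h=2, d=3, k=2):

```python
import numpy as np

T = np.arange(6).reshape(1, 2, 3)   # w=1, h=2, d=3
k = 2

repeated = np.repeat(T, k, axis=2)  # each slice k times in a row
tiled = np.tile(T, (1, 1, k))       # the whole block copied k times
via_tile = np.tile(T[..., None], k).reshape(1, 2, -1)  # the alternative above

# repeated[0, 0] is [0, 0, 1, 1, 2, 2]
# tiled[0, 0]    is [0, 1, 2, 0, 1, 2]
```

via_tile matches repeated, not tiled, because the copies are made along a new trailing axis before the reshape.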
I'm trying to do a matrix multiplication of two vectors in numpy which would result in an array.
Example
In [108]: b = array([[1],[2],[3],[4]])
In [109]: a =array([1,2,3])
In [111]: b.shape
Out[111]: (4, 1)
In [112]: a.shape
Out[112]: (3,)
In [113]: b.dot(a)
ValueError: objects are not aligned
As can be seen from the shapes, the array a isn't actually a matrix. The catch is to define a like this.
In [114]: a =array([[1,2,3]])
In [115]: a.shape
Out[115]: (1, 3)
In [116]: b.dot(a)
Out[116]:
array([[ 1, 2, 3],
[ 2, 4, 6],
[ 3, 6, 9],
[ 4, 8, 12]])
How to achieve the same result when acquiring the vectors as fields or columns of a matrix?
In [137]: mat = array([[ 1, 2, 3],
[ 2, 4, 6],
[ 3, 6, 9],
[ 4, 8, 12]])
In [138]: x = mat[:,0] #[1,2,3,4]
In [139]: y = mat[0,:] #[1,2,3]
In [140]: x.dot(y)
ValueError: objects are not aligned
You are computing the outer product of two vectors. You can use the function numpy.outer for this:
In [18]: a
Out[18]: array([1, 2, 3])
In [19]: b
Out[19]: array([10, 20, 30, 40])
In [20]: numpy.outer(b, a)
Out[20]:
array([[ 10, 20, 30],
[ 20, 40, 60],
[ 30, 60, 90],
[ 40, 80, 120]])
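Applied to the slices from the question, this works without any reshaping, since np.outer accepts 1-D inputs directly. A short sketch using the mat array from the question:

```python
import numpy as np

mat = np.array([[1, 2, 3],
                [2, 4, 6],
                [3, 6, 9],
                [4, 8, 12]])
x = mat[:, 0]   # shape (4,), values [1, 2, 3, 4]
y = mat[0, :]   # shape (3,), values [1, 2, 3]

# Every element of x is multiplied by every element of y.
result = np.outer(x, y)   # shape (4, 3)
```

Because mat is itself the outer product of its first column and first row, result reproduces mat here.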
Use 2D arrays instead of 1D vectors, and broadcast with the * operator:
In [8]: #your code from above
In [9]: y = mat[0:1,:]
In [10]: y
Out[10]: array([[1, 2, 3]])
In [11]: x = mat[:,0:1]
In [12]: x
Out[12]:
array([[1],
[2],
[3],
[4]])
In [13]: x*y
Out[13]:
array([[ 1, 2, 3],
[ 2, 4, 6],
[ 3, 6, 9],
[ 4, 8, 12]])
It's the same catch as in the basic example.
Both x and y are treated not as matrices but as one-dimensional arrays.
In [143]: x.shape
Out[143]: (4,)
In [144]: y.shape
Out[144]: (3,)
We have to add a second dimension of length 1 to each of them.
In [171]: x = array([x]).transpose()
In [172]: x.shape
Out[172]: (4, 1)
In [173]: y = array([y])
In [174]: y.shape
Out[174]: (1, 3)
In [175]: x.dot(y)
Out[175]:
array([[ 1, 2, 3],
[ 2, 4, 6],
[ 3, 6, 9],
[ 4, 8, 12]])
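The extra dimension can also be added by indexing with np.newaxis, which avoids wrapping the arrays in a new list. A minimal sketch; the names col and row are chosen here for illustration:

```python
import numpy as np

x = np.array([1, 2, 3, 4])
y = np.array([1, 2, 3])

col = x[:, np.newaxis]   # shape (4, 1), a view of x
row = y[np.newaxis, :]   # shape (1, 3), a view of y

via_dot = col.dot(row)       # matrix product of (4, 1) and (1, 3)
via_broadcast = col * row    # elementwise product with broadcasting
```

Both results have shape (4, 3) and agree with np.outer(x, y).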