Problem
This is related to a question I asked here, about reversing a section of each row of a 2D NumPy array, with random start and end indices per row. For example:
import numpy as np
np.random.seed(0)
arr = np.repeat(np.arange(6)[np.newaxis], 100000, axis=0)
m, n = arr.shape
# two distinct column indices per row, sorted so that i < j
indices = np.sort(np.random.rand(m, n).argsort(1)[:,:2], axis=1).astype("int32")
# reverse
for idx, (i, j) in enumerate(indices):
    arr[idx, i:j+1] = arr[idx, i:j+1][::-1]
---------------------------------------------------------------------------------------------------------
# `arr` # `indices` # `output`
array([[0, 1, 2, 3, 4, 5], array([[3, 4], array([[0, 1, 2, 4, 3, 5],
[0, 1, 2, 3, 4, 5], [0, 3], [3, 2, 1, 0, 4, 5],
[0, 1, 2, 3, 4, 5], [2, 4], [0, 1, 4, 3, 2, 5],
..., ..., --> ...,
[0, 1, 2, 3, 4, 5], [3, 5], [0, 1, 2, 5, 4, 3],
[0, 1, 2, 3, 4, 5], [1, 4], [0, 4, 3, 2, 1, 5],
[0, 1, 2, 3, 4, 5]]) [0, 2]]) [2, 1, 0, 3, 4, 5]])
I experimented with a Cython version of this function, given below.
cpdef int[:,:] reverse_indices(int[:,:] arr, int[:,:] indices):
    cdef:
        Py_ssize_t idx, N = arr.shape[0]
        int i, j
    for idx in range(N):
        i, j = indices[idx, 0], indices[idx, 1]
        arr[idx, i:j + 1] = arr[idx, i:j + 1][::-1]
    return arr
But I found that this was "only" ~8 times quicker than a pure Python version (47ms vs. 375ms). The line arr[idx, i:j + 1] = arr[idx, i:j + 1][::-1] turned out to be especially slow. I removed it and replaced it with the following element-by-element swap to perform the reverse operation.
    for idx in range(N):
        i, j = indices[idx, 0], indices[idx, 1]
        while i < j + 1:
            arr[idx,i], arr[idx,j] = arr[idx,j], arr[idx,i]
            i += 1
            j -= 1
    return arr
This solution was ~157 times quicker than the python version (2.4ms vs. 375ms).
Question
I found that even without any reversing ([::-1]), a plain slice assignment, i.e. arr[idx, i:j + 1] = arr[idx, i:j + 1], has a similar effect on performance. Why does slicing a section of an array have such a significant effect, compared with indexing a single element?
Is there a better way to reverse a section of an array/list in Cython?
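On the second point, one NumPy-only alternative (not from the original post, and not benchmarked here) is to drop the per-row loop and gather each row through a mirrored column index with np.take_along_axis. This is a minimal sketch; the helper name reverse_segments is hypothetical, and it returns a new array rather than modifying arr in place.

```python
import numpy as np

def reverse_segments(arr, indices):
    """Return a copy of arr with arr[r, i:j+1] reversed, where (i, j) = indices[r]."""
    cols = np.arange(arr.shape[1])        # column numbers, shape (n,)
    i = indices[:, :1].astype(np.intp)    # segment starts, shape (m, 1)
    j = indices[:, 1:2].astype(np.intp)   # segment ends (inclusive), shape (m, 1)
    inside = (cols >= i) & (cols <= j)    # True inside each row's segment
    # Inside the segment, column c reads from its mirror image i + j - c.
    src = np.where(inside, i + j - cols, cols)
    return np.take_along_axis(arr, src, axis=1)

arr = np.repeat(np.arange(6)[np.newaxis], 4, axis=0)
indices = np.array([[3, 4], [0, 3], [2, 4], [1, 4]], dtype="int32")
out = reverse_segments(arr, indices)
print(out)
```

Whether this beats the typed-memoryview swap loop depends on the array shape, so it is worth timing both on real data.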
Related
I wonder if this is possible, so I have two 2D arrays:
X[7][9] = 10
Y[7][9] = 5
From the above information I want to create the following two 2D arrays:
X'[5][10] = 9
Y'[5][10] = 7
Is it possible to accomplish this? The values of X and Y are bounded and won't exceed the shape of X and Y. Also, X and Y have the same shape.
Thanks in advance.
You should be able to use np.nditer to keep track of the multi-index and the corresponding values of the arrays.
import numpy as np
rng = np.random.RandomState(0)
X = rng.randint(low=0, high=10, size=(10, 10))
Y = rng.randint(low=0, high=10, size=(10, 10))
X_prime = X.copy()
Y_prime = Y.copy()
it = np.nditer([X, Y], flags=['multi_index'])
for x, y in it:
    # (i, j) is the position in X and Y of the current pair of values
    i, j = it.multi_index
    X_prime[y, x] = j
    Y_prime[y, x] = i
I believe this is the result you were expecting:
>>> X[7, 9], Y[7, 9]
(3, 9)
>>> X_prime[9, 3], Y_prime[9, 3]
(9, 7)
>>> X[1, 2], Y[1, 2]
(8, 2)
>>> X_prime[2, 8], Y_prime[2, 8]
(2, 1)
In [147]: X = np.random.randint(0,5,(5,5))
In [148]: Y = np.random.randint(0,5,(5,5))
Similar to Matt's answer, but using ndindex to generate the indices. There are various ways of generating all such values. Internally I believe ndindex uses nditer:
In [149]: X_,Y_ = np.zeros_like(X)-1,np.zeros_like(Y)-1
In [150]: for i,j in np.ndindex(*X.shape):
     ...:     k,l = X[i,j], Y[i,j]
     ...:     X_[k,l] = i
     ...:     Y_[k,l] = j
     ...:
In [151]: X
Out[151]:
array([[2, 4, 3, 4, 2],
[0, 3, 0, 2, 3],
[1, 1, 4, 4, 4],
[2, 1, 2, 2, 0],
[0, 1, 0, 1, 4]])
In [152]: Y
Out[152]:
array([[1, 2, 1, 3, 0],
[4, 2, 4, 0, 4],
[4, 3, 3, 2, 1],
[0, 3, 0, 2, 2],
[1, 4, 2, 0, 0]])
In [153]: X_
Out[153]:
array([[-1, 4, 4, -1, 1],
[ 4, -1, -1, 3, 4],
[ 3, 0, 3, -1, -1],
[-1, 0, 1, -1, 1],
[ 4, 2, 2, 2, -1]])
In [154]: Y_
Out[154]:
array([[-1, 0, 2, -1, 2],
[ 3, -1, -1, 1, 1],
[ 2, 0, 3, -1, -1],
[-1, 2, 1, -1, 4],
[ 4, 4, 3, 2, -1]])
Notice that with randomly generated arrays, the mapping is not full (the -1 values). And if there are duplicates, the last replaces previous values.
Handling duplicates - note the change in X_:
In [156]: for i,j in np.ndindex(*X.shape):
     ...:     k,l = X[i,j], Y[i,j]
     ...:     if X_[k,l]==-1:
     ...:         X_[k,l] = i
     ...:         Y_[k,l] = j
     ...:     else:
     ...:         X_[k,l] += i
     ...:         Y_[k,l] += j
     ...:
In [157]: X_
Out[157]:
array([[-1, 4, 7, -1, 2],
[ 4, -1, -1, 5, 6],
[ 7, 0, 3, -1, -1],
[-1, 0, 1, -1, 1],
[ 4, 2, 2, 2, -1]])
If the mapping is complete and one to one, it might be possible to do this mapping in a whole-array non-iterative fashion, which would be faster than this.
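A minimal sketch of that whole-array version, assuming the mapping really is complete and one-to-one. To guarantee that, the example builds X and Y from a permutation instead of independent random draws; the variable perm is introduced here only for illustration. With repeated (k, l) pairs the result of an indexed assignment is not something to rely on, so this sketch is not a substitute for the duplicate-handling loop.

```python
import numpy as np

rng = np.random.RandomState(0)
shape = (4, 5)

# Build a one-to-one mapping: every (k, l) pair occurs exactly once in (X, Y).
perm = rng.permutation(shape[0] * shape[1]).reshape(shape)
X, Y = np.unravel_index(perm, shape)

# i[r, c] == r and j[r, c] == c for every position (r, c).
i, j = np.indices(shape)

X_ = np.empty_like(X)
Y_ = np.empty_like(Y)
X_[X, Y] = i   # same effect as X_[k, l] = i inside the ndindex loop
Y_[X, Y] = j   # same effect as Y_[k, l] = j inside the ndindex loop
```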
I'm trying to concatenate 2 arrays element wise. I have the concatenation working to produce the correct shape but it has not been applied element wise.
So I have this array:
[0, 1]
[2, 3]
[4, 5]
I want to append each row of the array to every row of the array (all pairwise combinations). The target result would be:
[0, 1, 0, 1]
[0, 1, 2, 3]
[0, 1, 4, 5]
[2, 3, 0, 1]
[2, 3, 2, 3]
[2, 3, 4, 5]
[4, 5, 0, 1]
[4, 5, 2, 3]
[4, 5, 4, 5]
I think I may need to change an axis, but then I can't get the broadcasting to work.
Any help would be greatly appreciated. Lots to learn in NumPy!
a = np.arange(6).reshape(3, 2)
b = np.concatenate((a, a), axis=1)
One way would be stacking replicated versions created with np.repeat and np.tile -
In [52]: n = len(a)
In [53]: np.hstack((np.repeat(a,n,axis=0),np.tile(a,(n,1))))
Out[53]:
array([[0, 1, 0, 1],
[0, 1, 2, 3],
[0, 1, 4, 5],
[2, 3, 0, 1],
[2, 3, 2, 3],
[2, 3, 4, 5],
[4, 5, 0, 1],
[4, 5, 2, 3],
[4, 5, 4, 5]])
Another would be with broadcasted-assignment, since you mentioned broadcasting -
def create_mesh(a):
    m,n = a.shape
    out = np.empty((m,m,2*n),dtype=a.dtype)
    out[...,:n] = a[:,None]
    out[...,n:] = a
    return out.reshape(-1,2*n)
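As a quick check, the sketch below runs create_mesh on the 3x2 sample array and compares it with the np.repeat / np.tile result. In the broadcasted assignment, a[:, None] has shape (m, 1, n), so the left half of out[i, j] receives a[i], while a with shape (m, n) broadcasts so that the right half of out[i, j] receives a[j]. The names n_rows, stacked and mesh are only for this illustration.

```python
import numpy as np

def create_mesh(a):
    m, n = a.shape
    out = np.empty((m, m, 2 * n), dtype=a.dtype)
    out[..., :n] = a[:, None]   # left half of out[i, j] is a[i]
    out[..., n:] = a            # right half of out[i, j] is a[j]
    return out.reshape(-1, 2 * n)

a = np.arange(6).reshape(3, 2)
n_rows = len(a)
stacked = np.hstack((np.repeat(a, n_rows, axis=0), np.tile(a, (n_rows, 1))))
mesh = create_mesh(a)
print(np.array_equal(mesh, stacked))  # True
```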
One solution is to build on senderle's cartesian_product to extend this to 2D arrays. Here's how I usually do this:
# Your input array.
arr
# array([[0, 1],
# [2, 3],
# [4, 5]])
idxs = cartesian_product(*[np.arange(len(arr))] * 2)
arr[idxs].reshape(idxs.shape[0], -1)
# array([[0, 1, 0, 1],
# [0, 1, 2, 3],
# [0, 1, 4, 5],
# [2, 3, 0, 1],
# [2, 3, 2, 3],
# [2, 3, 4, 5],
# [4, 5, 0, 1],
# [4, 5, 2, 3],
# [4, 5, 4, 5]])
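The cartesian_product helper is not defined in this thread. The sketch below uses a simple stand-in built on np.meshgrid so that the snippet runs on its own; it is an assumption for illustration and not senderle's actual implementation, which may differ in details and speed.

```python
import numpy as np

def cartesian_product(*arrays):
    # Stand-in helper: one row per combination, first array varying slowest.
    grids = np.meshgrid(*arrays, indexing="ij")
    return np.stack(grids, axis=-1).reshape(-1, len(arrays))

arr = np.arange(6).reshape(3, 2)

# All ordered pairs of row indices: (0, 0), (0, 1), ..., (2, 2).
idxs = cartesian_product(*[np.arange(len(arr))] * 2)

# arr[idxs] has shape (9, 2, 2); flatten each pair of rows into one row.
result = arr[idxs].reshape(idxs.shape[0], -1)
print(result)
```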
Suppose I have a matrix A with some arbitrary values:
array([[ 2, 4, 5, 3],
[ 1, 6, 8, 9],
[ 8, 7, 0, 2]])
And a matrix B which contains indices of elements in A:
array([[0, 0, 1, 2],
[0, 3, 2, 1],
[3, 2, 1, 0]])
How do I select values from A pointed by B, i.e.:
A[B] = [[2, 2, 4, 5],
[1, 9, 8, 6],
[2, 0, 7, 8]]
EDIT: np.take_along_axis is a built-in function for this use case, available since NumPy 1.15. See @hpaulj's answer below for how to use it.
You can use NumPy's advanced indexing -
A[np.arange(A.shape[0])[:,None],B]
One can also use linear indexing -
m,n = A.shape
out = np.take(A,B + n*np.arange(m)[:,None])
Sample run -
In [40]: A
Out[40]:
array([[2, 4, 5, 3],
[1, 6, 8, 9],
[8, 7, 0, 2]])
In [41]: B
Out[41]:
array([[0, 0, 1, 2],
[0, 3, 2, 1],
[3, 2, 1, 0]])
In [42]: A[np.arange(A.shape[0])[:,None],B]
Out[42]:
array([[2, 2, 4, 5],
[1, 9, 8, 6],
[2, 0, 7, 8]])
In [43]: m,n = A.shape
In [44]: np.take(A,B + n*np.arange(m)[:,None])
Out[44]:
array([[2, 2, 4, 5],
[1, 9, 8, 6],
[2, 0, 7, 8]])
More recent NumPy versions (1.15 and later) have added a take_along_axis function that does the job:
A = np.array([[ 2, 4, 5, 3],
[ 1, 6, 8, 9],
[ 8, 7, 0, 2]])
B = np.array([[0, 0, 1, 2],
[0, 3, 2, 1],
[3, 2, 1, 0]])
np.take_along_axis(A, B, 1)
Out[]:
array([[2, 2, 4, 5],
[1, 9, 8, 6],
[2, 0, 7, 8]])
There's also a put_along_axis.
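A small sketch of how the two functions pair up, using the A and B from the question. The argsort round trip is only an illustration chosen here: take_along_axis gathers values by index, and put_along_axis scatters them back to the same positions.

```python
import numpy as np

A = np.array([[2, 4, 5, 3],
              [1, 6, 8, 9],
              [8, 7, 0, 2]])
B = np.array([[0, 0, 1, 2],
              [0, 3, 2, 1],
              [3, 2, 1, 0]])

# Gather: picked[r, c] == A[r, B[r, c]]
picked = np.take_along_axis(A, B, axis=1)

# Round trip: sort each row by gathering, then scatter back to undo the sort.
order = np.argsort(A, axis=1)
sorted_A = np.take_along_axis(A, order, axis=1)
restored = np.empty_like(A)
np.put_along_axis(restored, order, sorted_A, axis=1)  # modifies restored in place
```

Note that put_along_axis works in place and returns None, unlike take_along_axis.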
I know this is an old question, but another way of doing it using indices is:
A[np.indices(B.shape)[0], B]
output:
[[2 2 4 5]
[1 9 8 6]
[2 0 7 8]]
The following is a solution using for loops:
outlist = []
for i in range(len(B)):
    lst = []
    for j in range(len(B[i])):
        # pick the element of row i of A at the column index stored in B[i][j]
        lst.append(A[i][B[i][j]])
    outlist.append(lst)
outarray = np.asarray(outlist)
print(outarray)
The above can also be written more succinctly as a list comprehension:
outlist = [ [A[i][B[i][j]] for j in range(len(B[i]))]
            for i in range(len(B)) ]
outarray = np.asarray(outlist)
print(outarray)
Output:
[[2 2 4 5]
[1 9 8 6]
[2 0 7 8]]
In NumPy, suppose I have a matrix X:
X = array([[3, 1, 4, 5], [5, 1, 2, 1], [4, 4, 0, 1], [0, 3, 0, 3], [1, 2, 3, 4]])
How can I construct a new matrix using the first row (row 0) and the last two rows (rows 3 and 4) of X?
The resulting matrix is:
Y = array([[3, 1, 4, 5], [0, 3, 0, 3], [1, 2, 3, 4]])
I cannot list by hand all the rows I want to include in the new matrix, because for the data I have it will be like choosing the row ranges (20, 60) and (90, 120) of the original matrix.
Use np.r_ to get those concatenated row indices and simply index into the rows of the input array, like so -
X[np.r_[0, 3:5]] # for sample case
X[np.r_[20:60, 90:120]] # for actual case
Sample run -
In [146]: X
Out[146]:
array([[3, 1, 4, 5],
[5, 1, 2, 1],
[4, 4, 0, 1],
[0, 3, 0, 3],
[1, 2, 3, 4]])
In [147]: X[np.r_[0, 3:5]]
Out[147]:
array([[3, 1, 4, 5],
[0, 3, 0, 3],
[1, 2, 3, 4]])
Sample run for shape test on a bigger random array -
In [150]: X = np.random.rand(200,10)
In [151]: X[np.r_[20:60, 90:120]].shape
Out[151]: (70, 10) # 70 rows selected
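To see what np.r_ is doing here, the sketch below prints the index array it builds and compares it with an explicit concatenation. This is only an illustration; the names idx and same are not from the answer. Slice endpoints are exclusive, as with ordinary Python slices.

```python
import numpy as np

X = np.array([[3, 1, 4, 5],
              [5, 1, 2, 1],
              [4, 4, 0, 1],
              [0, 3, 0, 3],
              [1, 2, 3, 4]])

idx = np.r_[0, 3:5]                                # scalars and slices become one index array
same = np.concatenate(([0], np.arange(3, 5)))      # the long-hand equivalent
print(idx)                                         # [0 3 4]
Y = X[idx]                                         # integer-array indexing selects those rows
```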
I have a 2-d numpy array as follows:
a = np.array([[1,5,9,13],
              [2,6,10,14],
              [3,7,11,15],
              [4,8,12,16]])
I want to extract it into patches of size 2 by 2 without repeating any elements.
The result should match the following exactly. It can be a 3-D array or a list, with the elements in the same order as below:
[[[1,5],
[2,6]],
[[3,7],
[4,8]],
[[9,13],
[10,14]],
[[11,15],
[12,16]]]
How can I do this easily?
In my real problem the size of a is (36, 72), so I cannot do it one patch at a time. I want a programmatic way of doing it.
Using scikit-image:
import numpy as np
from skimage.util import view_as_blocks
a = np.array([[1,5,9,13],
[2,6,10,14],
[3,7,11,15],
[4,8,12,16]])
print(view_as_blocks(a, (2, 2)))
You can achieve it with a combination of np.reshape and np.swapaxes like so -
def extract_blocks(a, blocksize, keep_as_view=False):
    M,N = a.shape
    b0, b1 = blocksize
    if not keep_as_view:
        return a.reshape(M//b0,b0,N//b1,b1).swapaxes(1,2).reshape(-1,b0,b1)
    else:
        return a.reshape(M//b0,b0,N//b1,b1).swapaxes(1,2)
There are two ways to use it: with the keep_as_view flag turned off (the default) or on. With keep_as_view=False, the swapped-axes array is reshaped to a final 3D output, which generally requires a copy. With keep_as_view=True, the output stays 4D and is a view into the input array, and hence virtually free at runtime. We will verify this with a sample run later on.
Sample cases
Let's use a sample input array, like so -
In [94]: a
Out[94]:
array([[2, 2, 6, 1, 3, 6],
[1, 0, 1, 0, 0, 3],
[4, 0, 0, 4, 1, 7],
[3, 2, 4, 7, 2, 4],
[8, 0, 7, 3, 4, 6],
[1, 5, 6, 2, 1, 8]])
Now, let's use some block-sizes for testing. Let's use a blocksize of (2,3) with the view-flag turned off and on -
In [95]: extract_blocks(a, (2,3)) # Blocksize : (2,3)
Out[95]:
array([[[2, 2, 6],
[1, 0, 1]],
[[1, 3, 6],
[0, 0, 3]],
[[4, 0, 0],
[3, 2, 4]],
[[4, 1, 7],
[7, 2, 4]],
[[8, 0, 7],
[1, 5, 6]],
[[3, 4, 6],
[2, 1, 8]]])
In [48]: extract_blocks(a, (2,3), keep_as_view=True)
Out[48]:
array([[[[2, 2, 6],
[1, 0, 1]],
[[1, 3, 6],
[0, 0, 3]]],
[[[4, 0, 0],
[3, 2, 4]],
[[4, 1, 7],
[7, 2, 4]]],
[[[8, 0, 7],
[1, 5, 6]],
[[3, 4, 6],
[2, 1, 8]]]])
Verify view with keep_as_view=True
In [20]: np.shares_memory(a, extract_blocks(a, (2,3), keep_as_view=True))
Out[20]: True
Let's check out performance on a large array and verify the virtually free runtime claim as discussed earlier -
In [42]: a = np.random.rand(2000,3000)
In [43]: %timeit extract_blocks(a, (2,3), keep_as_view=True)
1000000 loops, best of 3: 801 ns per loop
In [44]: %timeit extract_blocks(a, (2,3), keep_as_view=False)
10 loops, best of 3: 29.1 ms per loop
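One detail worth checking against the question: reshape followed by swapaxes(1,2) lists the blocks row by row, whereas the expected output in the question walks down each column of blocks first. The sketch below, which assumes that column-first order is what is wanted, gets it by changing the axis permutation. The names blocks4d, row_major and col_major are introduced here for illustration.

```python
import numpy as np

a = np.array([[1, 5, 9, 13],
              [2, 6, 10, 14],
              [3, 7, 11, 15],
              [4, 8, 12, 16]])
b0, b1 = 2, 2
M, N = a.shape

# Axes: (block row, row within block, block column, column within block)
blocks4d = a.reshape(M // b0, b0, N // b1, b1)

# Blocks listed left to right, then top to bottom (the reshape + swapaxes order).
row_major = blocks4d.swapaxes(1, 2).reshape(-1, b0, b1)

# Blocks listed top to bottom, then left to right (the order shown in the question).
col_major = blocks4d.transpose(2, 0, 1, 3).reshape(-1, b0, b1)
print(col_major)
```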
Here's a rather cryptic numpy one-liner to generate your 3-d array, called result1 here:
In [60]: x
Out[60]:
array([[2, 1, 2, 2, 0, 2, 2, 1, 3, 2],
[3, 1, 2, 1, 0, 1, 2, 3, 1, 0],
[2, 0, 3, 1, 3, 2, 1, 0, 0, 0],
[0, 1, 3, 3, 2, 0, 3, 2, 0, 3],
[0, 1, 0, 3, 1, 3, 0, 0, 0, 2],
[1, 1, 2, 2, 3, 2, 1, 0, 0, 3],
[2, 1, 0, 3, 2, 2, 2, 2, 1, 2],
[0, 3, 3, 3, 1, 0, 2, 0, 2, 1]])
In [61]: result1 = x.reshape(x.shape[0]//2, 2, x.shape[1]//2, 2).swapaxes(1, 2).reshape(-1, 2, 2)
result1 is like a 1-d array of 2-d arrays:
In [68]: result1.shape
Out[68]: (20, 2, 2)
In [69]: result1[0]
Out[69]:
array([[2, 1],
[3, 1]])
In [70]: result1[1]
Out[70]:
array([[2, 2],
[2, 1]])
In [71]: result1[5]
Out[71]:
array([[2, 0],
[0, 1]])
In [72]: result1[-1]
Out[72]:
array([[1, 2],
[2, 1]])
(Sorry, I don't have time at the moment to give a detailed breakdown of how it works. Maybe later...)
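A rough breakdown of the one-liner, written as a sketch on a smaller 4x6 array so the shapes are easy to follow. The intermediate names step1 and step2 are only for illustration.

```python
import numpy as np

x = np.arange(24).reshape(4, 6)

# Split each axis into (block index, offset within block):
# step1[I, r, J, c] == x[2*I + r, 2*J + c]
step1 = x.reshape(x.shape[0] // 2, 2, x.shape[1] // 2, 2)   # shape (2, 2, 3, 2)

# Bring the two block indices together: step2[I, J, r, c] == x[2*I + r, 2*J + c]
step2 = step1.swapaxes(1, 2)                                # shape (2, 3, 2, 2)

# Merge the block indices (I, J) into one leading axis of length 2 * 3.
result1 = step2.reshape(-1, 2, 2)                           # shape (6, 2, 2)

print(result1[0])   # the top-left 2x2 block, x[0:2, 0:2]
```

The first reshape is free, swapaxes only reorders strides, and the final reshape is the step that generally has to copy data.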
Here's a less cryptic version that uses a nested list comprehension. In this case, result2 is a python list of 2-d numpy arrays:
In [73]: result2 = [x[2*j:2*j+2, 2*k:2*k+2] for j in range(x.shape[0]//2) for k in range(x.shape[1]//2)]
In [74]: result2[5]
Out[74]:
array([[2, 0],
[0, 1]])
In [75]: result2[-1]
Out[75]:
array([[1, 2],
[2, 1]])