I have a NumPy array with N*M random elements, and I also have two numbers A and B.
Now I want to visit every element of this N*M array and change it: if the element > 0, replace it with A (i.e., element <- A), and if the element < 0, replace it with B (i.e., element <- B).
I know the naive way to do this is to access every single element with a for loop, but that is very slow.
Is there a more elegant way to implement this?
Boolean masked assignment will change values in place:
In [493]: arr = np.random.randint(-10,10,(5,7))
In [494]: arr
Out[494]:
array([[ -5, -6, -7, -1, -8, -8, -10],
[ -9, 1, -3, -9, 3, 8, -1],
[ 6, -7, 4, 0, -4, 4, -2],
[ -3, -10, -2, 7, -4, 2, 2],
[ -5, 5, -1, -7, 7, 5, -7]])
In [495]: arr[arr>0] = 100
In [496]: arr[arr<0] = -50
In [497]: arr
Out[497]:
array([[-50, -50, -50, -50, -50, -50, -50],
[-50, 100, -50, -50, 100, 100, -50],
[100, -50, 100, 0, -50, 100, -50],
[-50, -50, -50, 100, -50, 100, 100],
[-50, 100, -50, -50, 100, 100, -50]])
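One caveat with two successive masked assignments: the second mask is evaluated after the first assignment, so if A happens to be negative (or B positive), freshly written values can be overwritten. A minimal sketch that avoids this by computing both masks up front; the helper name replace_by_sign is made up for illustration:

```python
import numpy as np

def replace_by_sign(arr, a, b):
    """Set positive entries to a and negative entries to b, in place."""
    # Both masks are taken from the original values, before any assignment.
    pos = arr > 0
    neg = arr < 0
    arr[pos] = a
    arr[neg] = b
    return arr

arr = np.array([[-5, 1, 0], [6, -7, 4]])
replace_by_sign(arr, -1, 7)   # a is negative here on purpose
print(arr)
```

Zeros are left untouched, as in the session above.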
I just gave a similar answer in
python numpy: iterate for different conditions without using a loop
IIUC:
narr = np.random.randint(-100,100,(10,5))
array([[ 70, -20, 96, 73, -94],
[ 42, 35, -55, 56, 54],
[ 97, -16, 24, 32, 78],
[ 49, 49, -11, -82, 82],
[-10, 59, -42, -68, -70],
[ 95, 23, 22, 58, -38],
[ -2, -64, 27, -33, -95],
[ 98, 42, 8, -83, 85],
[ 23, 51, -99, -82, -7],
[-28, -11, -44, 95, 93]])
A = 1000
B = -999
Use np.where:
np.where(narr > 0, A, np.where(narr < 0, B , narr))
Output:
array([[1000, -999, 1000, 1000, -999],
[1000, 1000, -999, 1000, 1000],
[1000, -999, 1000, 1000, 1000],
[1000, 1000, -999, -999, 1000],
[-999, 1000, -999, -999, -999],
[1000, 1000, 1000, 1000, -999],
[-999, -999, 1000, -999, -999],
[1000, 1000, 1000, -999, 1000],
[1000, 1000, -999, -999, -999],
[-999, -999, -999, 1000, 1000]])
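If nesting np.where gets hard to read, np.select expresses the same rule as a list of conditions and a list of replacements. This is only an alternative sketch, not part of the answer above; the small narr here is a made-up example that includes a zero:

```python
import numpy as np

A = 1000
B = -999
narr = np.array([[70, -20, 0], [-94, 42, 35]])

# Conditions are checked in order; elements matching none keep their value.
result = np.select([narr > 0, narr < 0], [A, B], default=narr)
print(result)
```

Like np.where, this returns a new array rather than modifying narr.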
Because you mentioned that you're interested in the speed of the computation, I made a speed comparison of several different approaches to your problem.
test.py:
import numpy as np

A = 100
B = 50

def createArray():
    array = np.random.randint(-100,100,(500,500))
    return array

def replace(x):
    return A if x > 0 else B

def replace_ForLoop():
    """Simple for-loop."""
    array = createArray()
    for i in range(array.shape[0]):
        for j in range(array.shape[1]):
            array[i][j] = replace(array[i][j])

def replace_nditer():
    """Use numpy.nditer to iterate over values."""
    array = createArray()
    for elem in np.nditer(array, op_flags=['readwrite']):
        elem[...] = replace(elem)

def replace_masks():
    """Use boolean masks."""
    array = createArray()
    array[array>0] = A
    array[array<0] = B

def replace_vectorize():
    """Use numpy.vectorize"""
    array = createArray()
    vectorfunc = np.vectorize(replace)
    array = vectorfunc(array)

def replace_where():
    """Use numpy.where"""
    array = createArray()
    array = np.where(array > 0, A, np.where(array < 0, B , array))
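Note: The variants using nested for-loops, np.nditer and boolean masks work in place; the last two do not. Also, replace() maps 0 to B, whereas the mask and np.where variants leave zeros unchanged, so the variants are not exactly equivalent.
Timing comparison: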
> python -mtimeit -s'import test' 'test.replace_ForLoop()'
10 loops, best of 3: 185 msec per loop
> python -mtimeit -s'import test' 'test.replace_nditer()'
10 loops, best of 3: 294 msec per loop
> python -mtimeit -s'import test' 'test.replace_masks()'
100 loops, best of 3: 5.8 msec per loop
> python -mtimeit -s'import test' 'test.replace_vectorize()'
10 loops, best of 3: 55.3 msec per loop
> python -mtimeit -s'import test' 'test.replace_where()'
100 loops, best of 3: 5.42 msec per loop
Using loops is indeed quite slow. numpy.nditer is even slower, which comes as a surprise to me, because the docs call it an "efficient multi-dimensional iterator object to iterate over arrays". numpy.vectorize is essentially a for-loop, but still manages to be about three times as fast as the naive implementation.
The np.where variant proposed by Scott Boston is slightly faster than using boolean masks as per hpaulj's answer. However, it needs more memory because it does not modify the array in place.
Basically what the title entails.
The two matrices are mostly zeros. The first is 1 x 9999999999999 and the second is 9999999999999 x 1.
When I try to do a dot product I get this:
Unable to allocate 72.8 TiB for an array with shape (10000000000000,) and data type int64
Full traceback:
MemoryError: Unable to allocate 72.8 TiB for an array with shape (10000000000000,) and data type int64
In [31]: imputed.dot(s)
---------------------------------------------------------------------------
MemoryError Traceback (most recent call last)
<ipython-input-31-670cfc69d4cf> in <module>
----> 1 imputed.dot(s)
~/.local/lib/python3.8/site-packages/scipy/sparse/base.py in dot(self, other)
357
358 """
--> 359 return self * other
360
361 def power(self, n, dtype=None):
~/.local/lib/python3.8/site-packages/scipy/sparse/base.py in __mul__(self, other)
478 if self.shape[1] != other.shape[0]:
479 raise ValueError('dimension mismatch')
--> 480 return self._mul_sparse_matrix(other)
481
482 # If it's a list or whatever, treat it like a matrix
~/.local/lib/python3.8/site-packages/scipy/sparse/compressed.py in _mul_sparse_matrix(self, other)
499
500 major_axis = self._swap((M, N))[0]
--> 501 other = self.__class__(other) # convert to this format
502
503 idx_dtype = get_index_dtype((self.indptr, self.indices,
~/.local/lib/python3.8/site-packages/scipy/sparse/compressed.py in __init__(self, arg1, shape, dtype, copy)
32 arg1 = arg1.copy()
33 else:
---> 34 arg1 = arg1.asformat(self.format)
35 self._set_self(arg1)
36
~/.local/lib/python3.8/site-packages/scipy/sparse/base.py in asformat(self, format, copy)
320 # Forward the copy kwarg, if it's accepted.
321 try:
--> 322 return convert_method(copy=copy)
323 except TypeError:
324 return convert_method()
~/.local/lib/python3.8/site-packages/scipy/sparse/csc.py in tocsr(self, copy)
135 idx_dtype = get_index_dtype((self.indptr, self.indices),
136 maxval=max(self.nnz, N))
--> 137 indptr = np.empty(M + 1, dtype=idx_dtype)
138 indices = np.empty(self.nnz, dtype=idx_dtype)
139 data = np.empty(self.nnz, dtype=upcast(self.dtype))
MemoryError: Unable to allocate 72.8 TiB for an array with shape (10000000000000,) and data type int64
It seems that scipy is trying to create a temporary array.
I am using the .dot method that scipy provides.
I am also open to non-scipy solutions.
Thanks!
In [105]: from scipy import sparse
If I make a (100,1) csr matrix:
In [106]: A = sparse.random(100,1,format='csr')
In [107]: A
Out[107]:
<100x1 sparse matrix of type '<class 'numpy.float64'>'
with 1 stored elements in Compressed Sparse Row format>
The data and indices are:
In [109]: A.data
Out[109]: array([0.19060481])
In [110]: A.indices
Out[110]: array([0], dtype=int32)
In [112]: A.indptr
Out[112]:
array([0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1], dtype=int32)
So even with only 1 nonzero term, one array (indptr) is large: 101 elements.
On the other hand, the csc format for the same matrix needs much less storage. But a csc matrix with (1,100) shape will look like this csr one.
In [113]: Ac = A.tocsc()
In [114]: Ac.indptr
Out[114]: array([0, 1], dtype=int32)
In [115]: Ac.indices
Out[115]: array([88], dtype=int32)
Math, especially matrix products, is done with the csr/csc formats. So it may be hard to avoid this 80 TB memory use.
Looking at the traceback I see that it's trying to convert other to the format that matches self.
So with A.dot(B), A is (1,N) csr, the small shape, and B is (N,1) csc, also the small shape. But B.tocsr() requires the large (N+1,)-shaped indptr.
Let's try an alternative to dot
First 2 matrices:
In [122]: A = sparse.random(1,100, .2,format='csr')
In [123]: B = sparse.random(100,1, .2,format='csc')
In [124]: A
Out[124]:
<1x100 sparse matrix of type '<class 'numpy.float64'>'
with 20 stored elements in Compressed Sparse Row format>
In [125]: B
Out[125]:
<100x1 sparse matrix of type '<class 'numpy.float64'>'
with 20 stored elements in Compressed Sparse Column format>
In [126]: A@B
Out[126]:
<1x1 sparse matrix of type '<class 'numpy.float64'>'
with 1 stored elements in Compressed Sparse Row format>
In [127]: _.A
Out[127]: array([[1.33661021]])
Their nonzero element indices. Only the ones that match matter.
In [128]: A.indices, B.indices
Out[128]:
(array([16, 20, 23, 28, 30, 37, 39, 40, 43, 49, 54, 59, 61, 63, 67, 70, 74,
91, 94, 99], dtype=int32),
array([ 5, 8, 15, 25, 34, 35, 40, 46, 47, 51, 53, 60, 68, 70, 75, 81, 87,
90, 91, 94], dtype=int32))
equality matrix:
In [129]: mask = A.indices[:,None]==B.indices
In [132]: np.nonzero(mask.any(axis=0))
Out[132]: (array([ 6, 13, 18, 19]),)
In [133]: np.nonzero(mask.any(axis=1))
Out[133]: (array([ 7, 15, 17, 18]),)
The matching indices:
In [139]: A.indices[Out[133]]
Out[139]: array([40, 70, 91, 94], dtype=int32)
In [140]: B.indices[Out[132]]
Out[140]: array([40, 70, 91, 94], dtype=int32)
The sum of the products of the corresponding data values matches Out[127]:
In [141]: (A.data[Out[133]]*B.data[Out[132]]).sum()
Out[141]: 1.3366102138511582
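The index-matching steps above can be wrapped into a small function. This is a minimal sketch, not a scipy API: the name sparse_row_col_dot is made up, it uses np.intersect1d in place of the broadcasted equality matrix (which avoids building a len(A.indices) x len(B.indices) boolean array), and it assumes neither matrix stores duplicate indices (call sum_duplicates() first if unsure).

```python
import numpy as np
from scipy import sparse

def sparse_row_col_dot(row, col):
    """Dot product of a (1, N) csr matrix with an (N, 1) csc matrix."""
    # For these two shapes, .indices holds the positions of the stored
    # entries, so no (N+1,)-sized indptr is ever created.
    _, i_row, i_col = np.intersect1d(row.indices, col.indices,
                                     return_indices=True)
    # Only positions stored in both operands contribute to the product.
    return (row.data[i_row] * col.data[i_col]).sum()

row = sparse.csr_matrix(
    (np.array([1.0, 2.0, 3.0]), (np.array([0, 0, 0]), np.array([2, 5, 9]))),
    shape=(1, 12))
col = sparse.csc_matrix(
    (np.array([4.0, 5.0, 6.0]), (np.array([5, 7, 9]), np.array([0, 0, 0]))),
    shape=(12, 1))

result = sparse_row_col_dot(row, col)   # positions 5 and 9 match
print(result)
```

The work and memory scale with the number of stored elements, not with N.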
I am a newbie in numpy. I have an array A of size [x,] of values and an array B of size [y,] (y > x). As a result I want an array C of size [x,] where each entry is the index in B of the corresponding value of A.
Here is an example of inputs and outputs:
>>> A = [10, 20, 30, 10, 40, 50, 10, 50, 20]
>>> B = [10, 20, 30, 40, 50]
>>> C = #Some operations
>>> C
[0, 1, 2, 0, 3, 4, 0, 4, 1]
I couldn't find a way to do this. Please advise. Thank you.
I think you are looking for searchsorted, assuming that B is sorted in increasing order and that every value of A occurs in B:
C = np.searchsorted(B,A)
Output:
array([0, 1, 2, 0, 3, 4, 0, 4, 1])
Update for general situation where B is not sorted. We can do an argsort:
# let's swap 40 and 50 in B
# expect the output to have 3 and 4 swapped
B = [10, 20, 30, 50, 40]
BB = np.sort(B)
C = np.argsort(B)[np.searchsorted(BB,A)]
Output:
array([0, 1, 2, 0, 4, 3, 0, 3, 1], dtype=int64)
You can double check:
(np.array(B)[C] == A).all()
# True
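Note that searchsorted returns an insertion position even for values of A that do not occur in B. A minimal sketch of a lookup that marks such values with -1; the helper name index_in and the -1 convention are assumptions for illustration:

```python
import numpy as np

def index_in(B, A, missing=-1):
    """For each value in A, return its index in B, or `missing` if absent."""
    A = np.asarray(A)
    B = np.asarray(B)
    order = np.argsort(B)
    # Positions in the sorted view of B; B itself does not need to be sorted.
    pos = np.searchsorted(B, A, sorter=order)
    # Values larger than everything in B get position len(B); clip them.
    pos = np.clip(pos, 0, len(B) - 1)
    idx = order[pos]
    # Keep only the positions where the value really matches.
    return np.where(B[idx] == A, idx, missing)

A = [10, 20, 30, 10, 40, 50, 10, 50, 20]
B = [10, 20, 30, 50, 40]
C = index_in(B, A)
D = index_in(B, [5, 40, 99])
print(C, D)
```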
For general python lists
A = [10, 20, 30, 10, 40, 50, 10, 50, 20]
B = [10, 20, 30, 40, 50]
C = [B.index(e) for e in A if e in B]
print(C)
You can try this code
A = np.array([10, 20, 30, 10, 40, 50, 10, 50, 20])
B = np.array([10, 20, 30, 40, 50])
np.argmax(B==A[:,None],axis=1)
I am trying to simulate a system of equations, and using very basic looping structures quickly slows down the computation. I have a mock example below to illustrate how I am running the simulation now:
import numpy as np
Imax, Jmax, Tmax = 4, 4, 3
Iset, Jset, Tset = range(0,Imax), range(0,Jmax), range(0,Tmax)
X = np.arange(0,48).reshape(3,4,4)
X[1], X[2] = 4, 2
Y = 2*X
for t in Tset:
    if t == 2:
        break
    else:
        for i in Iset:
            for j in Jset:
                Y[t+1,i,j] = Y[t,i,j] + X[t,i,j]
                X[t+1,i,j] = X[t,i,j] + 1
# Output for Y...
array([[[ 0, 2, 4, 6],
[ 8, 10, 12, 14],
[16, 18, 20, 22],
[24, 26, 28, 30]],
[[ 0, 3, 6, 9],
[12, 15, 18, 21],
[24, 27, 30, 33],
[36, 39, 42, 45]],
[[ 1, 5, 9, 13],
[17, 21, 25, 29],
[33, 37, 41, 45],
[49, 53, 57, 61]]])
Intuitively this structure makes sense to me because I am accessing the individual elements of the Y array and updating them, but because my real loops run over very large ranges and do more work per iteration, I am experiencing a drastic reduction in computational speed.
I came across nditer and I am hoping that I can use this in place of the multiple nested loops that I have so that I can still get the same result, but faster. How can I go about converting this nested for-loop style into a more efficient iteration scheme?
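One possible direction, sketched under the assumption that each update at step t depends only on values from step t at the same (i, j) position, as in the mock example: keep the loop over t, which is inherently sequential, and replace the two inner loops with whole-slice operations.

```python
import numpy as np

Tmax = 3
X = np.arange(0, 48).reshape(3, 4, 4)
X[1], X[2] = 4, 2
Y = 2 * X

# Only the time loop remains; each statement updates a full 4x4 slice.
for t in range(Tmax - 1):
    Y[t + 1] = Y[t] + X[t]
    X[t + 1] = X[t] + 1

print(Y)
```

This reproduces the output for Y shown above. Whether the same rewrite applies to the real simulation depends on how the additional work inside the loop couples different (i, j) elements.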
I need to multiply an array (NIR) with a scalar (f) but leaving some values that meet a certain condition intact.
I tried the following:
NIR_f = np.multiply(NIR,f,where=NIR!=-28672.0)
To check, I ran:
i,j=1119,753
NIR[i][j],NIR_f[i][j]
and I got this:
(-28672.0, 10058.0)
I expected both values to be the same! In that position the condition is not met, therefore the value should remain intact.
Am I using the "where" option wrongly?
Without your array, or a smaller substitute, I can't exactly replicate your problem. But there are potentially 2 issues:
Float equality testing is not exact, so it might match one -28672.0 and not another.
The "remain intact" assumption is tricky. where leaves the value in the output alone, but what was that value originally: 0s, or the NIR values?
Using an integer array to avoid the float issue:
In [20]: arr = np.arange(12).reshape(3,4)
In [21]: arr
Out[21]:
array([[ 0, 1, 2, 3],
[ 4, 5, 6, 7],
[ 8, 9, 10, 11]])
In [22]: np.multiply(arr, 10, where=arr!=10)
Out[22]:
array([[ 0, 10, 20, 30],
[ 40, 50, 60, 70],
[ 80, 90, 481036337249, 110]])
In [24]: np.multiply(arr, 10, where=arr!=10)
Out[24]:
array([[ 0, 10, 20, 30],
[ 40, 50, 60, 70],
[ 80, 90, 0, 110]])
The value at [2,2] in the result is arbitrary. In effect, multiply starts with an np.empty array of the right shape and dtype and fills every value except that one with the product. To use where correctly we need to specify an out parameter as well.
In [25]: out = np.full(arr.shape,-1)
In [26]: out
Out[26]:
array([[-1, -1, -1, -1],
[-1, -1, -1, -1],
[-1, -1, -1, -1]])
In [27]: np.multiply(arr, 10, where=arr!=10, out=out)
Out[27]:
array([[ 0, 10, 20, 30],
[ 40, 50, 60, 70],
[ 80, 90, -1, 110]])
The issue of inexact floats comes up often enough that I won't try to illustrate that.
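Applied to the question, one way to make the skipped positions keep their original values is to pass a copy of the input as out. A minimal sketch on a made-up 2x2 NIR array (the values and f are assumptions; only the -28672.0 sentinel comes from the question), with np.where shown as an equivalent alternative:

```python
import numpy as np

NIR = np.array([[1.0, -28672.0], [3.0, 4.0]])
f = 2.0
nodata = -28672.0   # sentinel value from the question

# Positions where the condition is False keep whatever `out` already holds,
# so seeding `out` with a copy of NIR leaves them intact.
NIR_f = np.multiply(NIR, f, where=NIR != nodata, out=NIR.copy())

# Same result: compute the product everywhere, then pick per element.
NIR_alt = np.where(NIR != nodata, NIR * f, NIR)
print(NIR_f)
```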
I have a 3D numpy array, for example, like this:
array([[[ 0, 1, 2, 3],
[ 4, 5, 6, 7],
[ 8, 9, 10, 11],
[12, 13, 14, 15]],
[[16, 17, 18, 19],
[20, 21, 22, 23],
[24, 25, 26, 27],
[28, 29, 30, 31]]])
Is there a way to index it in such a way that I select, for example, top right corner of 2x2 elements in the first plane, and a center 2x2 elements subarray from the second plane? So that I could then zero out the elements 2,3,6,7,21,22,25,26:
array([[[ 0, 1, 0, 0],
[ 4, 5, 0, 0],
[ 8, 9, 10, 11],
[12, 13, 14, 15]],
[[16, 17, 18, 19],
[20, 0, 0, 23],
[24, 0, 0, 27],
[28, 29, 30, 31]]])
I have a batch of images, and I need to zero out a small window of fixed size, but at different (random) locations for each image in the batch. The first dimension is number of images.
Something like this:
a[:, x: x+2, y: y+2] = 0
where x and y are vectors which have different values for each first dimension of a.
Approach #1 : Here's one approach that's mostly based on linear-indexing -
def random_block_fill_lidx(a, N, fillval=0):
    # a is input array
    # N is blocksize

    # Store shape info
    m,n,r = a.shape

    # Get all possible starting linear indices for each 2D slice
    possible_start_lidx = (np.arange(n-N+1)[:,None]*r + range(r-N+1)).ravel()

    # Get random start indices from all possible ones for all 2D slices
    start_lidx = np.random.choice(possible_start_lidx, m)

    # Get linear indices for the block of (N,N)
    offset_arr = (a.shape[-1]*np.arange(N)[:,None] + range(N)).ravel()

    # Add in those random start indices with the offset array
    idx = start_lidx[:,None] + offset_arr

    # On a 2D view of the input array, use advanced-indexing to set fillval.
    a.reshape(m,-1)[np.arange(m)[:,None], idx] = fillval
    return a
Approach #2 : Here's another and possibly more efficient one (for large 2D slices) using advanced-indexing -
def random_block_fill_adv(a, N, fillval=0):
    # a is input array
    # N is blocksize

    # Store shape info
    m,n,r = a.shape

    # Generate random start indices for second and third axes keeping proper
    # distance from the boundaries for the block to be accommodated within.
    idx0 = np.random.randint(0,n-N+1,m)
    idx1 = np.random.randint(0,r-N+1,m)

    # Setup indices for advanced-indexing.

    # First axis indices would be simply the range array to select one per elem.
    # We need to extend this to 3D so that the latter dim indices could be aligned.
    dim0 = np.arange(m)[:,None,None]

    # Second axis indices would be idx0 with broadcasted addition of blocksized
    # range array to cover all block indices along this axis. Repeat for third.
    dim1 = idx0[:,None,None] + np.arange(N)[:,None]
    dim2 = idx1[:,None,None] + range(N)
    a[dim0, dim1, dim2] = fillval
    return a
Approach #3 : With the old-trusty loop -
def random_block_fill_loopy(a, N, fillval=0):
    # a is input array
    # N is blocksize

    # Store shape info
    m,n,r = a.shape

    # Generate random start indices for second and third axes keeping proper
    # distance from the boundaries for the block to be accommodated within.
    idx0 = np.random.randint(0,n-N+1,m)
    idx1 = np.random.randint(0,r-N+1,m)

    # Iterate through first and use slicing to assign fillval.
    for i in range(m):
        a[i, idx0[i]:idx0[i]+N, idx1[i]:idx1[i]+N] = fillval
    return a
Sample run -
In [357]: a = np.arange(2*4*7).reshape(2,4,7)
In [358]: a
Out[358]:
array([[[ 0, 1, 2, 3, 4, 5, 6],
[ 7, 8, 9, 10, 11, 12, 13],
[14, 15, 16, 17, 18, 19, 20],
[21, 22, 23, 24, 25, 26, 27]],
[[28, 29, 30, 31, 32, 33, 34],
[35, 36, 37, 38, 39, 40, 41],
[42, 43, 44, 45, 46, 47, 48],
[49, 50, 51, 52, 53, 54, 55]]])
In [359]: random_block_fill_adv(a, N=3, fillval=0)
Out[359]:
array([[[ 0, 0, 0, 0, 4, 5, 6],
[ 7, 0, 0, 0, 11, 12, 13],
[14, 0, 0, 0, 18, 19, 20],
[21, 22, 23, 24, 25, 26, 27]],
[[28, 29, 30, 31, 32, 33, 34],
[35, 36, 37, 38, 0, 0, 0],
[42, 43, 44, 45, 0, 0, 0],
[49, 50, 51, 52, 0, 0, 0]]])
Fun stuff : Since the fill is done in place, if we keep running random_block_fill_adv(a, N=3, fillval=0), we will eventually end up with a being all zeros, which also serves as a sanity check of the code.
Runtime test
In [579]: a = np.random.randint(0,9,(10000,4,4))
In [580]: %timeit random_block_fill_lidx(a, N=2, fillval=0)
...: %timeit random_block_fill_adv(a, N=2, fillval=0)
...: %timeit random_block_fill_loopy(a, N=2, fillval=0)
...:
1000 loops, best of 3: 545 µs per loop
1000 loops, best of 3: 891 µs per loop
100 loops, best of 3: 10.6 ms per loop
In [581]: a = np.random.randint(0,9,(1000,40,40))
In [582]: %timeit random_block_fill_lidx(a, N=10, fillval=0)
...: %timeit random_block_fill_adv(a, N=10, fillval=0)
...: %timeit random_block_fill_loopy(a, N=10, fillval=0)
...:
1000 loops, best of 3: 739 µs per loop
1000 loops, best of 3: 671 µs per loop
1000 loops, best of 3: 1.27 ms per loop
So, which one to choose depends on the first axis length and blocksize.
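The advanced-indexing idea also works when the window positions are given rather than drawn at random, which is closer to the a[:, x: x+2, y: y+2] = 0 pseudo-code in the question. A minimal sketch; the function name block_fill_at and its signature are made up for illustration, and the caller is assumed to pass start positions that keep each block inside the array:

```python
import numpy as np

def block_fill_at(a, x, y, N, fillval=0):
    """Fill an (N, N) block in each 2D slice of a, starting at (x[i], y[i])."""
    x = np.asarray(x)
    y = np.asarray(y)
    m = a.shape[0]
    # Index shapes (m,1,1), (m,N,1) and (m,1,N) broadcast to (m,N,N).
    dim0 = np.arange(m)[:, None, None]
    dim1 = x[:, None, None] + np.arange(N)[:, None]
    dim2 = y[:, None, None] + np.arange(N)
    a[dim0, dim1, dim2] = fillval
    return a

a = np.arange(32).reshape(2, 4, 4)
block_fill_at(a, x=[0, 1], y=[2, 1], N=2)
print(a)
```

With these start positions the zeroed elements are 2, 3, 6, 7, 21, 22, 25 and 26, matching the expected result in the question.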