Can I create a view from a boolean selection of a numpy array? - numpy

If I create a numpy array, and another to serve as a selective index into it:
>>> x
array([[ 2, 3, 4],
[ 5, 6, 7],
[ 6, 7, 8],
[11, 12, 13]])
>>> nz
array([ True, True, False, True], dtype=bool)
then using nz directly in an index expression appears to act on the original array:
>>> x[nz,:]
array([[ 2, 3, 4],
[ 5, 6, 7],
[11, 12, 13]])
>>> x[nz,:] += 2
>>> x
array([[ 4, 5, 6],
[ 7, 8, 9],
[ 6, 7, 8],
[13, 14, 15]])
however, naturally, an assignment makes a copy:
>>> v = x[nz,:]
Any operation on v is on the copy, and has no effect on the original array.
Is there any way to create a named view, from x[nz,:], simply to abbreviate code, or which I can pass around, so operations on the named view will affect only the selected elements of x?

Numpy has masked_array, which might be what you are looking for:
import numpy as np
x = np.asarray([[ 2, 3, 4],[ 5, 6, 7],[ 6, 7, 8],[11, 12, 13]])
nz = np.asarray([ True, True, False, True], dtype=bool)
mx = np.ma.masked_array(x, ~nz.repeat(3)) # True means masked, so "~" is needed
mx += 2
# x changed as well because it is the base of mx
print(x)
print(x is mx.base)
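For context, here is a minimal sketch (reusing the x and nz from the question) of why the plain assignment behaves differently from x[nz,:] += 2. Boolean indexing on the right-hand side produces a copy, while the in-place form goes through indexed assignment. The explicit write-back at the end is only an illustration of one workaround, not part of the answer above.

```python
import numpy as np

x = np.array([[2, 3, 4], [5, 6, 7], [6, 7, 8], [11, 12, 13]])
nz = np.array([True, True, False, True])

v = x[nz, :]                     # boolean (advanced) indexing returns a copy
shares = np.shares_memory(v, x)  # v owns its own buffer

v += 2                           # modifies only the copy
unchanged = x.copy()             # snapshot: x is still the original data here

x[nz, :] = v                     # explicit write-back through indexed assignment
print(shares)
print(x)
```

If the goal is only to abbreviate code, keeping nz itself as the "handle" and writing x[nz, :] at each use site may be the simplest option.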

Related

numpy find unique rows (only appeared once)

for example I got many sub-arrays by splitting one array A based on list B:
A = np.array([[1,1,1],
[2,2,2],
[2,3,4],
[5,8,10],
[5,9,9],
[7,9,6],
[1,1,1],
[2,2,2],
[9,2,4],
[9,3,6],
[10,3,3],
[11,2,2]])
B = np.array([5,7])
C = np.split(A,B.cumsum()[:-1])
>>> print(C)
[array([[1,1,1],
       [2,2,2],
       [2,3,4],
       [5,8,10],
       [5,9,9]]),
 array([[7,9,6],
       [1,1,1],
       [2,2,2],
       [9,2,4],
       [9,3,6],
       [10,3,3],
       [11,2,2]])]
How can I keep only the rows that appear exactly once across all the sub-arrays (that is, delete every row that appears more than once)? Because [1,1,1] and [2,2,2] each appear twice in C, the result should look like:
[array([[2,3,4],
       [5,8,10],
       [5,9,9]]),
 array([[7,9,6],
       [9,2,4],
       [9,3,6],
       [10,3,3],
       [11,2,2]])]
You can use np.unique to identify the duplicates:
_, i, c = np.unique(A, axis=0, return_index=True, return_counts=True)
idx = np.isin(np.arange(len(A)), i[c==1])
out = [a[i] for a,i in zip(np.split(A, B.cumsum()[:-1]),
np.split(idx, B.cumsum()[:-1]))]
output:
[array([[ 2, 3, 4],
[ 5, 8, 10],
[ 5, 9, 9]]),
array([[ 7, 9, 6],
[ 9, 2, 4],
[ 9, 3, 6],
[10, 3, 3],
[11, 2, 2]])]
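The same idea can be spelled out step by step. This is a sketch using the A and B from the question; the names first_idx, counts, keep and split_points are mine, and the boolean mask is built by direct assignment instead of np.isin.

```python
import numpy as np

A = np.array([[1, 1, 1], [2, 2, 2], [2, 3, 4], [5, 8, 10], [5, 9, 9],
              [7, 9, 6], [1, 1, 1], [2, 2, 2], [9, 2, 4], [9, 3, 6],
              [10, 3, 3], [11, 2, 2]])
B = np.array([5, 7])

# For each distinct row: index of its first occurrence and how often it occurs
_, first_idx, counts = np.unique(A, axis=0, return_index=True, return_counts=True)

# Mark the rows of A whose content occurs exactly once
keep = np.zeros(len(A), dtype=bool)
keep[first_idx[counts == 1]] = True

# Split the data and the mask at the same positions, then filter each part
split_points = B.cumsum()[:-1]
out = [part[mask] for part, mask in zip(np.split(A, split_points),
                                        np.split(keep, split_points))]
print(out)
```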

Numpy get column of two dimensional matrix as array

I have a matrix that looks like that:
>> X
>>
[[5.1 1.4]
[4.9 1.4]
[4.7 1.3]
[4.6 1.5]
[5. 1.4]]
I want to get its first column as a 1D array, [5.1, 4.9, 4.7, 4.6, 5.]
However, when I try to get it with X[:,0] I get
>> [[5.1]
[4.9]
[4.7]
[4.6]
[5. ]]
which is something different. How can I get it as a 1D array?
You can use a list comprehension for this, which gives a plain Python list rather than an array:
import numpy as np
X = np.array([[5.1, 1.4], [4.9, 1.4], [4.7, 1.3], [4.6, 1.5], [5.0, 1.4]])
X_0 = [i for i in X[:,0]]
print(X_0)
Output (on NumPy 2.x each element prints as np.float64(...) instead):
[5.1, 4.9, 4.7, 4.6, 5.0]
Almost there! Just reshape your result:
X[:,0].reshape(1,-1)
Outputs:
[[5.1 4.9 4.7 4.6 5. ]]
Full code:
import numpy as np
X=np.array([[5.1 ,1.4],[4.9 ,1.4], [4.7 ,1.3], [4.6 ,1.5], [5. , 1.4]])
print(X)
print(X[:,0].reshape(1,-1))
With regular numpy array:
In [3]: x = np.arange(15).reshape(5,3)
In [4]: x
Out[4]:
array([[ 0, 1, 2],
[ 3, 4, 5],
[ 6, 7, 8],
[ 9, 10, 11],
[12, 13, 14]])
In [5]: x[:,0]
Out[5]: array([ 0, 3, 6, 9, 12])
With np.matrix (use discouraged if not actually deprecated)
In [6]: X = np.matrix(x)
In [7]: X
Out[7]:
matrix([[ 0, 1, 2],
[ 3, 4, 5],
[ 6, 7, 8],
[ 9, 10, 11],
[12, 13, 14]])
In [8]: print(X)
[[ 0 1 2]
[ 3 4 5]
[ 6 7 8]
[ 9 10 11]
[12 13 14]]
In [9]: X[:,0]
Out[9]:
matrix([[ 0],
[ 3],
[ 6],
[ 9],
[12]])
In [10]: X[:,0].T
Out[10]: matrix([[ 0, 3, 6, 9, 12]])
To get 1d array, convert to array and ravel, or in one step:
In [11]: X[:,0].A1
Out[11]: array([ 0, 3, 6, 9, 12])
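To summarize the options above in one place, here is a short sketch that starts from an np.matrix (assuming that is what X is in the question, since a regular array would already give a 1D result) and extracts the first column as a 1D ndarray in two ways. The variable names are illustrative.

```python
import numpy as np

X = np.matrix([[5.1, 1.4], [4.9, 1.4], [4.7, 1.3], [4.6, 1.5], [5.0, 1.4]])

col_matrix = X[:, 0]              # still a 2D matrix of shape (5, 1)
col_a1 = X[:, 0].A1               # flattened 1D ndarray
col_arr = np.asarray(X)[:, 0]     # convert to ndarray first, then slice

print(col_matrix.shape, col_a1.shape, col_arr.shape)
```

Converting once with np.asarray and working with plain arrays afterwards avoids the always-2D behaviour of np.matrix entirely.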

Efficiently construct numpy matrix from offset ranges of 1D array [duplicate]

Let's say I have a Python NumPy array a.
a = numpy.array([1,2,3,4,5,6,7,8,9,10,11])
I want to create a matrix of sub sequences from this array of length 5 with stride 3. The results matrix hence will look as follows:
numpy.array([[1,2,3,4,5],[4,5,6,7,8],[7,8,9,10,11]])
One possible way of implementing this would be using a for-loop.
result_matrix = np.zeros((3, 5))
# enumerate gives the output row; start is the offset into a
for row, start in enumerate(range(0, len(a) - 5 + 1, 3)):
    result_matrix[row] = a[start:start+5]
Is there a cleaner way to implement this in Numpy?
Approach #1 : Using broadcasting -
def broadcasting_app(a, L, S ): # Window len = L, Stride len/stepsize = S
    nrows = ((a.size-L)//S)+1
    return a[S*np.arange(nrows)[:,None] + np.arange(L)]
Approach #2 : Using more efficient NumPy strides -
def strided_app(a, L, S ): # Window len = L, Stride len/stepsize = S
    nrows = ((a.size-L)//S)+1
    n = a.strides[0]
    return np.lib.stride_tricks.as_strided(a, shape=(nrows,L), strides=(S*n,n))
Sample run -
In [143]: a
Out[143]: array([ 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11])
In [144]: broadcasting_app(a, L = 5, S = 3)
Out[144]:
array([[ 1, 2, 3, 4, 5],
[ 4, 5, 6, 7, 8],
[ 7, 8, 9, 10, 11]])
In [145]: strided_app(a, L = 5, S = 3)
Out[145]:
array([[ 1, 2, 3, 4, 5],
[ 4, 5, 6, 7, 8],
[ 7, 8, 9, 10, 11]])
Starting in Numpy 1.20, we can make use of the new sliding_window_view to slide/roll over windows of elements.
And coupled with a stepping [::3], it simply becomes:
from numpy.lib.stride_tricks import sliding_window_view
# values = np.array([1,2,3,4,5,6,7,8,9,10,11])
sliding_window_view(values, window_shape = 5)[::3]
# array([[ 1, 2, 3, 4, 5],
# [ 4, 5, 6, 7, 8],
# [ 7, 8, 9, 10, 11]])
where the intermediate result of the sliding is:
sliding_window_view(values, window_shape = 5)
# array([[ 1, 2, 3, 4, 5],
# [ 2, 3, 4, 5, 6],
# [ 3, 4, 5, 6, 7],
# [ 4, 5, 6, 7, 8],
# [ 5, 6, 7, 8, 9],
# [ 6, 7, 8, 9, 10],
# [ 7, 8, 9, 10, 11]])
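As a quick cross-check, the sketch below (which needs NumPy 1.20 or later) compares the sliding_window_view result with the index-based construction from Approach #1 and inspects whether the result is a view. The names windows and via_index are mine.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

a = np.arange(1, 12)          # 1 .. 11
L, S = 5, 3                   # window length and step between windows

# All length-L windows, then every S-th one
windows = sliding_window_view(a, window_shape=L)[::S]

# Same rows built with broadcasting-based integer indexing (makes a copy)
nrows = (a.size - L) // S + 1
via_index = a[S * np.arange(nrows)[:, None] + np.arange(L)]

print(np.array_equal(windows, via_index))
print(np.shares_memory(windows, a))   # the windowed result is a view of a
print(windows.flags.writeable)        # read-only by default
```

Because the windowed result is a read-only view, it costs no extra memory, but it has to be copied before being modified.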
Modified version of @Divakar's code, with a check that the memory is contiguous and with the returned array made read-only. (Variable names changed for my DSP application.)
def frame(a, framelen, frameadv):
    """frame - Frame a 1D array
    a - 1D array
    framelen - Samples per frame
    frameadv - Samples between starts of consecutive frames
        Set to framelen for non-overlapping consecutive frames
    Modified from Divakar's 10/17/16 11:20 solution:
    https://stackoverflow.com/questions/40084931/taking-subarrays-from-numpy-array-with-given-stride-stepsize
    CAVEATS:
    Assumes array is contiguous
    Output is not writable as there are multiple views on the same memory
    """
    if not isinstance(a, np.ndarray) or \
            not (a.flags['C_CONTIGUOUS'] or a.flags['F_CONTIGUOUS']):
        raise ValueError("Input array a must be a contiguous numpy array")
    # Output
    nrows = ((a.size-framelen)//frameadv)+1
    oshape = (nrows, framelen)
    # Size of each element in a
    n = a.strides[0]
    # Indexing in the new object will advance by frameadv * element size
    ostrides = (frameadv*n, n)
    return np.lib.stride_tricks.as_strided(a, shape=oshape,
                                           strides=ostrides, writeable=False)

Use a ufunc analogous to numpy.where

For example, if I want to add conditionally, I can use:
y = numpy.where(condition, a+b, b)
Is there a way to directly combine a ufunc and where? Something like:
y = numpy.add.where(condition, a, b)
Something along that line is add.at.
In [21]: b = np.arange(10)
In [22]: cond = b%3==0
Your where:
In [24]: np.where(cond, 10+b, b)
Out[24]: array([10, 1, 2, 13, 4, 5, 16, 7, 8, 19])
Use the other where (or np.nonzero) to turn the boolean mask into an index tuple:
In [25]: cond
Out[25]: array([ True, False, False, True, False, False, True, False, False, True], dtype=bool)
In [26]: idx = np.where(cond)
In [27]: idx
Out[27]: (array([0, 3, 6, 9], dtype=int32),)
add.at does inplace, unbuffered addition:
In [28]: np.add.at(b,idx[0],10)
In [29]: b
Out[29]: array([10, 1, 2, 13, 4, 5, 16, 7, 8, 19])
add.at is intended as a way of getting around buffering problems with the more direct index +=:
In [30]: b = np.arange(10)
In [31]: b[idx[0]] += 10
In [32]: b
Out[32]: array([10, 1, 2, 13, 4, 5, 16, 7, 8, 19])
Here the result is the same (add.at is slower). But if there are duplicates in idx, the results will differ.
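A small sketch of the duplicate-index case, using made-up data (the index array [0, 0, 3] and the names buffered and unbuffered are only for illustration):

```python
import numpy as np

idx = np.array([0, 0, 3])        # index 0 appears twice

buffered = np.zeros(5, dtype=int)
buffered[idx] += 10              # the repeated index is only applied once

unbuffered = np.zeros(5, dtype=int)
np.add.at(unbuffered, idx, 10)   # every occurrence of an index is accumulated

print(buffered)    # [10  0  0 10  0]
print(unbuffered)  # [20  0  0 10  0]
```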
+= also works with the boolean mask:
In [33]: b[cond] -= 10
In [34]: b
Out[34]: array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
There's got to be a ufunc equivalent to the += operator, but I don't use ufunc enough to know off hand.
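One candidate worth noting: ufuncs such as np.add accept where= and out= keyword arguments, which comes close to the numpy.add.where the question asks for. The sketch below reuses b and cond from this answer. Passing out= matters, because positions where the condition is False are left as whatever out already contains.

```python
import numpy as np

b = np.arange(10)
cond = b % 3 == 0

# New array: start from a copy of b, add 10 only where cond is True
y = np.add(b, 10, out=b.copy(), where=cond)

# In place: the ufunc counterpart of b[cond] += 10
np.add(b, 10, out=b, where=cond)

print(y)
print(b)
```

Without out=, the unselected positions of the result are uninitialized, so the where= form is best used together with an explicit output array.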

numpy custom array element retrieval

I have a question regarding how to extract certain values from a 2D numpy array
Foo =
array([[ 1, 2, 3],
[ 4, 5, 6],
[ 7, 8, 9],
[10, 11, 12]])
Bar =
array([[0, 0, 1],
[1, 2, 3]])
I want to extract elements from Foo using the values of Bar as row indices, such that I end up with a 2D array Baz of the same shape as Bar. The ith column of Baz is Foo[Bar[:, i], i], i.e. each value j in Bar[:, i] selects Foo[j, i]:
Baz =
array([[ 1, 2, 6],
[ 4, 8, 12]])
I could do a couple nested for-loops but I was wondering if there is a more elegant, numpy-ish way to do this.
Sorry if this is a bit convoluted. Let me know if I need to explain further.
Thanks!
You can use Bar as the row index and an array [0, 1, 2] as the column index:
# for easy copy-pasting
import numpy as np
Foo = np.array([[ 1, 2, 3], [ 4, 5, 6], [ 7, 8, 9], [10, 11, 12]])
Bar = np.array([[0, 0, 1], [1, 2, 3]])
# now use Bar as the `i` coordinate and 0, 1, 2 as the `j` coordinate:
Foo[Bar, [0, 1, 2]]
# array([[ 1, 2, 6],
# [ 4, 8, 12]])
# OR, to automatically generate the [0, 1, 2]
Foo[Bar, np.arange(Bar.shape[1])]
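An alternative that may read more directly is np.take_along_axis (available since NumPy 1.15), which picks Foo[Bar[i, j], j] for every position when axis=0. This is a sketch next to the fancy-indexing form above, not a replacement for it; the names by_index and by_take are mine.

```python
import numpy as np

Foo = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9], [10, 11, 12]])
Bar = np.array([[0, 0, 1], [1, 2, 3]])

# Fancy indexing: the column index array broadcasts against Bar
by_index = Foo[Bar, np.arange(Bar.shape[1])]

# Same selection, expressed as "take along the row axis"
by_take = np.take_along_axis(Foo, Bar, axis=0)

print(by_index)
print(np.array_equal(by_index, by_take))
```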