I have an ndarray and I want to filter out a particular row of it. My array is:
arr = np.array([
[1., 6., 1.],
[1., 7., 0.],
[1., 8., 0.],
[3., 5., 1.],
[5., 1., 1.],
[5., 2., 2.],
[6., 1., 1.],
[6., 2., 2.],
[6., 7., 3.],
[6., 8., 0.]
])
I want to filter out [6., 1., 1.]. So I have tried:
arr[arr != [6., 1., 1.]]
and I got:
array([1., 6., 1., 7., 0., 1., 8., 0., 3., 5., 5., 5., 2., 2., 2., 2., 7.,
3., 8., 0.])
which is not what I want (and also destroyed the previous structure of the array). I have also tried:
arr[arr[:] != [6., 1., 1.]]
but I got the same output as before.
P.S.: I know I can delete an element by its index, but I don't want to do that. I want to check for the particular element.
P.P.S.: For 1-d arrays my method works.
You're very close. The boolean array you get marks, element by element, where arr differs from the target, and indexing with that 2-D mask flattens the result. Reduce the mask to one value per row instead: delete a row only if all of its elements match, or equivalently keep it if any of its elements don't match:
arr[(arr != [6, 1, 1]).any(axis=1)]
You can also write it as
arr[~(arr == [6, 1, 1]).all(axis=1)]
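To see why the row-wise reduction matters, here is a minimal sketch on a smaller array (the names target, differs, keep and filtered are only for illustration). It shows the intermediate mask and the final result:

```python
import numpy as np

arr = np.array([
    [1., 6., 1.],
    [6., 1., 1.],
    [6., 2., 2.],
])
target = np.array([6., 1., 1.])

# The comparison broadcasts target across every row, giving a (3, 3) mask.
differs = arr != target

# Collapse the mask to one boolean per row: keep a row if any entry differs.
keep = differs.any(axis=1)

# Indexing with a 1-D row mask preserves the 2-D structure.
filtered = arr[keep]
print(filtered)
```

If the values come out of floating-point arithmetic rather than being typed in exactly, comparing with np.isclose instead of == is probably the safer choice.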
I have a numpy array:
a = array([[0., 1., 2.],
[3., 4., 5.],
[6., 7., 8.]])
which I replicate with np.repeat like this:
np.repeat(a, 3, axis=0)
with the result:
array([[0., 1., 2.],
[0., 1., 2.],
[0., 1., 2.],
[3., 4., 5.],
[3., 4., 5.],
[3., 4., 5.],
[6., 7., 8.],
[6., 7., 8.],
[6., 7., 8.]])
Can I achieve the same with np.lib.stride_tricks.as_strided to avoid copying data? I need something like that for multidimensional arrays as well, but I always repeat along the 0-th axis.
I don't think this is possible. You can get close:
n=3
out = np.lib.stride_tricks.as_strided(a,
shape = (n,) + a.shape,
strides = (0,) + a.strides
)
np.shares_memory(a, out)
Out[]: True
out
Out[]:
array([[[0., 1., 2.],
[3., 4., 5.],
[6., 7., 8.]],
[[0., 1., 2.],
[3., 4., 5.],
[6., 7., 8.]],
[[0., 1., 2.],
[3., 4., 5.],
[6., 7., 8.]]])
But that's not repeating in dimension 0, it's repeating everything in a new dimension 0. And reshaping creates a copy:
out.reshape(-1, 3)
Out[]:
array([[0., 1., 2.],
[3., 4., 5.],
[6., 7., 8.],
[0., 1., 2.],
[3., 4., 5.],
[6., 7., 8.],
[0., 1., 2.],
[3., 4., 5.],
[6., 7., 8.]])
np.shares_memory(a, out.reshape(-1, 3))
Out[]: False
You'll generally be better off using broadcasting instead, going from something like:
op(a_repeated, b)
to:
op(a[:, None, ...], b.reshape((a.shape[0], -1) + b.shape[1:]))
But that depends a lot on what op is (and whether it is vectorized and/or vectorizable).
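As a rough sketch of the broadcasting idea, assume op is plain addition and b is a hypothetical second operand with one row per repeated row. For the row order that np.repeat produces, the new length-1 axis goes right after axis 0 of a:

```python
import numpy as np

a = np.arange(9.0).reshape(3, 3)
n = 3
# Hypothetical second operand: one row for every row of the repeated array.
b = np.arange(27.0).reshape(9, 3)

# Reference result: materialise the repeated array (this copies a).
expected = np.repeat(a, n, axis=0) + b

# Broadcasting: group b's rows as (rows of a, n, ...) and let a broadcast over n.
b_grouped = b.reshape((a.shape[0], n) + b.shape[1:])
result = (a[:, None, ...] + b_grouped).reshape(b.shape)

# A read-only, zero-copy view equivalent to the stride-0 as_strided call.
stacked = np.broadcast_to(a, (n,) + a.shape)
print(np.array_equal(result, expected), np.shares_memory(a, stacked))
```

np.broadcast_to gives the same stacked view as the as_strided call without hand-written strides, but it cannot produce the interleaved row order of np.repeat either.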
I need to create a (w,N)-matrix that looks like this:
w//2............N-1,N-1
. \ N-1
. \ N-1
. \ N-1
1...............N-1,N-1
0...................N-1
00..................N-2
. \ N-3
. \ .
. \ .
000000..............N-w//2
This is a (w, N) matrix with an odd w. The middle row is the range from 0 to N-1. Each row above the middle row is shifted to the left, as with scipy.ndimage.shift(mode='nearest'), and each row below the middle row is shifted to the right with the same method.
N is usually around 10^4 and w is usually between 10 and 10^2.
I've come up with 2 ways to do this:
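import numpy as np
from scipy.ndimage import shift
middle = np.arange(0, N)
# Note: -w//2 floors to -(w//2)-1 for odd w, so the bounds are written explicitly.
final = np.vstack(
[shift(middle, i, mode='nearest') for i in range(-(w//2), 0)] +
[middle] +
[shift(middle, i, mode='nearest') for i in range(1, w//2 + 1)] )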
Which takes 0.035 seconds to run.
np.vstack([
np.maximum(
0,
np.minimum(
N-1,
np.arange(-step, N-step)
)
)
for step in range(-(w//2), w//2 + 1)
])
Which takes 0.021 seconds to run.
These numbers were with N=10^3 and w=21.
I'd really like to get these numbers down as low as possible, ideally down to around 1ms.
I tried multiprocessing, but that doesn't really help: the overhead is too big to gain anything from the concurrency. I also know I could store this result somewhere, but that would require a significant change by the caller of this function, so that will be done later.
Is there any mathematical relation that can represent a tilt/shift operation like this? I couldn't think of one, but if there is, numpy can probably take advantage of that to beat my results.
So yeah, any ideas to make my code faster?
Initialise an array of the appropriate shape in which every row runs from 0 to N-1:
w, N = 11, 10
arr = np.empty(shape= [w, N], dtype= int)
arr[:] = np.arange(N)
arr
>>> [[0., 1., 2., 3., 4., 5., 6., 7., 8., 9.],
[0., 1., 2., 3., 4., 5., 6., 7., 8., 9.],
[0., 1., 2., 3., 4., 5., 6., 7., 8., 9.],
[0., 1., 2., 3., 4., 5., 6., 7., 8., 9.],
[0., 1., 2., 3., 4., 5., 6., 7., 8., 9.],
[0., 1., 2., 3., 4., 5., 6., 7., 8., 9.],
[0., 1., 2., 3., 4., 5., 6., 7., 8., 9.],
[0., 1., 2., 3., 4., 5., 6., 7., 8., 9.],
[0., 1., 2., 3., 4., 5., 6., 7., 8., 9.],
[0., 1., 2., 3., 4., 5., 6., 7., 8., 9.],
[0., 1., 2., 3., 4., 5., 6., 7., 8., 9.]]
Add a per-row offset, running from w//2 in the top row down to -(w//2) in the bottom row:
arr += np.arange(w).reshape([-1, 1])[::-1] - w//2
arr
>>> [[ 5., 6., 7., 8., 9., 10., 11., 12., 13., 14.],
[ 4., 5., 6., 7., 8., 9., 10., 11., 12., 13.],
[ 3., 4., 5., 6., 7., 8., 9., 10., 11., 12.],
[ 2., 3., 4., 5., 6., 7., 8., 9., 10., 11.],
[ 1., 2., 3., 4., 5., 6., 7., 8., 9., 10.],
[ 0., 1., 2., 3., 4., 5., 6., 7., 8., 9.],
[-1., 0., 1., 2., 3., 4., 5., 6., 7., 8.],
[-2., -1., 0., 1., 2., 3., 4., 5., 6., 7.],
[-3., -2., -1., 0., 1., 2., 3., 4., 5., 6.],
[-4., -3., -2., -1., 0., 1., 2., 3., 4., 5.],
[-5., -4., -3., -2., -1., 0., 1., 2., 3., 4.]]
Where values cross the limits, reassign them to the limit values:
arr[arr<0] = 0
arr[arr>N-1] = N-1
arr
>>> [[5., 6., 7., 8., 9., 9., 9., 9., 9., 9.],
[4., 5., 6., 7., 8., 9., 9., 9., 9., 9.],
[3., 4., 5., 6., 7., 8., 9., 9., 9., 9.],
[2., 3., 4., 5., 6., 7., 8., 9., 9., 9.],
[1., 2., 3., 4., 5., 6., 7., 8., 9., 9.],
[0., 1., 2., 3., 4., 5., 6., 7., 8., 9.],
[0., 0., 1., 2., 3., 4., 5., 6., 7., 8.],
[0., 0., 0., 1., 2., 3., 4., 5., 6., 7.],
[0., 0., 0., 0., 1., 2., 3., 4., 5., 6.],
[0., 0., 0., 0., 0., 1., 2., 3., 4., 5.],
[0., 0., 0., 0., 0., 0., 1., 2., 3., 4.]]
Edit: I tried timing the script:
import timeit
script = '''
w, N = 21, 10**3
arr = np.empty(shape= [w, N], dtype= int)
arr[:] = np.arange(N)
arr += np.arange(w).reshape([-1, 1])[::-1] - w//2
arr[arr<0] = 0
arr[arr>N-1] = N-1
'''
time = timeit.timeit(script, number= 100000, setup= 'import numpy as np') / 100000
time
>>> 0.00019059010320999733 # 0.19 ms
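The same three steps (fill, offset, clamp) can probably be condensed with broadcasting and np.clip. This is only a sketch, with shifted_ranges as a made-up helper name and w assumed to be odd:

```python
import numpy as np

def shifted_ranges(w, N):
    """Return a (w, N) array of shifted, clamped ranges (w assumed odd)."""
    half = w // 2
    # Column of per-row offsets: half at the top, 0 in the middle, -half at the bottom.
    offsets = np.arange(half, -half - 1, -1)[:, None]
    # Broadcasting the (w, 1) column against the (N,) row gives the (w, N) grid;
    # np.clip then clamps everything into the valid range [0, N-1].
    return np.clip(np.arange(N) + offsets, 0, N - 1)

m = shifted_ranges(11, 10)
print(m)
```

Whether this is faster than the in-place version depends on the sizes involved, so it is worth timing both with timeit for the N and w actually used.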
Consider the following ndarray a -
In [117]: a
Out[117]:
array([[[nan, nan],
[nan, nan],
[nan, nan]],
[[ 3., 11.],
[ 7., 13.],
[12., 16.]],
[[ 0., 4.],
[ 6., 1.],
[ 5., 8.]],
[[17., 10.],
[15., 9.],
[ 2., 14.]]])
The minimum computed on the first axis is -
In [118]: np.nanmin(a, 0)
Out[118]:
array([[0., 4.],
[6., 1.],
[2., 8.]])
which is a[2] from visual inspection. What is the most efficient way to calculate this index, 2?
As suggested by @Divakar, you can use np.nanargmin. Applied to the per-slice sums, it returns the index of the slice with the smallest total:
import numpy as np
a = np.array([[[np.nan, np.nan],
[np.nan, np.nan],
[np.nan, np.nan]],
[[ 3., 11.],
[ 7., 13.],
[12., 16.]],
[[ 0., 4.],
[ 6., 1.],
[ 5., 8.]],
[[17., 10.],
[15., 9.],
[ 2., 14.]]])
minIdx = np.nanargmin(np.sum(a,(1,2)))
minIdx
2
a[minIdx]
array([[0., 4.],
[6., 1.],
[5., 8.]])
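Note that the sum-based index picks the slice with the smallest total, which is not necessarily the slice that supplies every element of np.nanmin(a, 0). A small sketch using the same array shows both views; the names totals and per_element are only illustrative:

```python
import numpy as np

a = np.array([[[np.nan, np.nan], [np.nan, np.nan], [np.nan, np.nan]],
              [[3., 11.], [7., 13.], [12., 16.]],
              [[0., 4.], [6., 1.], [5., 8.]],
              [[17., 10.], [15., 9.], [2., 14.]]])

# One total per slice along axis 0; the all-NaN slice sums to NaN
# and is then ignored by nanargmin.
totals = a.sum(axis=(1, 2))
best_slice = int(np.nanargmin(totals))

# For each (row, col) position, the index along axis 0 holding the minimum.
per_element = np.nanargmin(a, axis=0)
print(best_slice)
print(per_element)
```

Here per_element is 2 everywhere except one position, where the minimum comes from a[3]; that is why np.nanmin(a, 0) differs from a[2] in its last row.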
Say I have a shape (3, 5, 3) tensor like so:
x = [[[ 4., 6., 6.],
[ 0., 0., 3.],
[ 6., 6., 5.],
[ 4., 1., 8.],
[ 3., 6., 7.]],
[[ 4., 0., 5.],
[ 4., 7., 2.],
[ 4., 5., 3.],
[ 4., 2., 1.],
[ 3., 4., 4.]],
[[ 0., 3., 4.],
[ 6., 7., 5.],
[ 1., 2., 2.],
[ 3., 8., 3.],
[ 8., 5., 7.]]]
And a shape (3, 3, 4) tensor like so:
y = [[[ 3., 2., 5., 4.],
[ 8., 7., 1., 8.],
[ 4., 0., 5., 3.]],
[[ 8., 7., 7., 3.],
[ 5., 4., 0., 1.],
[ 6., 5., 4., 4.]],
[[ 7., 0., 1., 2.],
[ 7., 5., 0., 6.],
[ 7., 5., 4., 1.]]]
How would I do a matrix multiplication so that the resulting tensor has shape (3, 5, 4),
where the first element of the result is given by the matrix multiplication of
[[ 4., 6., 6.],
[ 0., 0., 3.],
[ 6., 6., 5.],
[ 4., 1., 8.],
[ 3., 6., 7.]]
and
[[ 3., 2., 5., 4.],
[ 8., 7., 1., 8.],
[ 4., 0., 5., 3.]]
I've tried using tf.tensordot like:
z = tf.tensordot(x, y, axes = [[2],[1]])
which I believe multiplies the 3rd axis of x with the 2nd axis of y, but it gives me a tensor of shape (3, 5, 3, 4). Any ideas?
Silly me. After reading the tf.matmul docs, it seems that since the inner dimensions match I can just do tf.matmul(x, y), and it gives me the answer.
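The difference between the two calls can be illustrated with NumPy, whose np.matmul and np.tensordot follow the same shape rules here. This is an analogy on random stand-in data, not the TensorFlow code itself:

```python
import numpy as np

rng = np.random.default_rng(0)
# Stand-in data with the same shapes as x and y in the question.
x = rng.integers(0, 9, size=(3, 5, 3)).astype(float)
y = rng.integers(0, 9, size=(3, 3, 4)).astype(float)

# matmul treats the leading axis as a batch: z[i] = x[i] @ y[i].
z = np.matmul(x, y)

# tensordot contracts the chosen axes for every pair of batches:
# t[i, :, j, :] = x[i] @ y[j], hence the extra axis.
t = np.tensordot(x, y, axes=[[2], [1]])

print(z.shape, t.shape)
```

The batched result is the "diagonal" of the tensordot output, t[i, :, i, :], so tensordot computes every cross-batch product and most of that work is discarded if only matching batches are needed.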