Convert numpy array with many dimensions into 2D array with nested numpy arrays - numpy

I would like to convert an array with many dimensions (more than 2) into a 2D array where other dimensions would be converted to nested stand-alone arrays.
So if I have an array like numpy.arange(3 * 4 * 5 * 5 * 5).reshape((3, 4, 5, 5, 5)), I would like to convert it to an array of shape (3, 4), where each element would be an array of shape (5, 5, 5). The dtype of the outer array would be object.
For example, for np.arange(8).reshape((1, 1, 2, 2, 2)), the output would be equivalent to:
a = np.ndarray(shape=(1,1), dtype=object)
a[0, 0] = np.arange(8).reshape((1, 1, 2, 2, 2))[0, 0, :, :, :]
How can I do this efficiently?

We can reshape and assign elements from the regular array into the output object-dtype array in a single loop, which seems to be a tad faster than using two loops, like so -
def reshape_approach(a):
    m, n = a.shape[:2]
    a.shape = (m*n,) + a.shape[2:]
    out = np.empty(m*n, dtype=object)
    for i in range(m*n):
        out[i] = a[i]
    out.shape = (m, n)
    a.shape = (m, n) + a.shape[1:]
    return out
Runtime test
Other approach(es) -
# @Scotty1-'s soln
def simply_assign(a):
    m, n = a.shape[:2]
    out = np.empty((m, n), dtype=object)
    for i in range(m):
        for j in range(n):
            out[i, j] = a[i, j]
    return out
Timings -
In [154]: m,n = 300,400
...: a = np.arange(m * n * 5 * 5 * 5).reshape((m,n, 5, 5, 5))
In [155]: %timeit simply_assign(a)
10 loops, best of 3: 39.4 ms per loop
In [156]: %timeit reshape_approach(a)
10 loops, best of 3: 32.9 ms per loop
With 7D data -
In [160]: m,n,p,q = 30,40,30,40
...: a = np.arange(m * n *p * q * 5 * 5 * 5).reshape((m,n,p,q, 5, 5, 5))
In [161]: %timeit simply_assign(a)
1000 loops, best of 3: 421 µs per loop
In [162]: %timeit reshape_approach(a)
1000 loops, best of 3: 316 µs per loop

Thanks for your hint, Mitar. This is how it should look using dtype=object arrays:
outer_array = np.empty((x.shape[0], x.shape[1]), dtype=object)
for i in range(x.shape[0]):
    for j in range(x.shape[1]):
        outer_array[i, j] = x[i, j]
Looping may not be the most efficient way to do it, but there is afaik no vectorized operation for this task.
(I thought that, with some more reshaping, this would be even faster than Divakar's solution, but no, Divakar's is faster. Nice solution, Divakar!)
def advanced_reshape_solution(x):
    m, n = x.shape[:2]
    sub_arr_size = np.prod(x.shape[2:])
    out_array = np.empty(m * n, dtype=object)
    x_flat_view = x.reshape(-1)
    for i in range(m * n):
        out_array[i] = x_flat_view[i * sub_arr_size:(i + 1) * sub_arr_size].reshape(x.shape[2:])
    return out_array.reshape((m, n))
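For reference, a quick usage sketch of the conversion (using the reshape_approach function from above; the shapes are just the example from the question):
import numpy as np

a = np.arange(3 * 4 * 5 * 5 * 5).reshape((3, 4, 5, 5, 5))
out = reshape_approach(a)

print(out.shape)        # (3, 4)
print(out.dtype)        # object
print(out[0, 0].shape)  # (5, 5, 5) -- each element is a regular ndarray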

Related

Python 3 vectorizing nested for loop where inner loop depends on parameter

In the geosciences, while porting code from Fortran to Python, I see variations of these nested for loops (sometimes double nested and sometimes triple nested) that I would like to vectorize (shown here as a minimal reproducible example):
import numpy as np
import sys
import math

def main():
    t = np.arange(0, 300)
    n1 = 7
    tc = test(n1, t)

def test(n1, t):
    n2 = int(2*t.size/(n1+1))
    print(n2)
    tChunked = np.zeros(shape=(n1, n2))
    for i in range(0, n1):
        istart = int(i*n2/2)
        for j in range(0, n2):
            tChunked[i, j] = t[istart+j]
    return tChunked

main()
What have I tried?
I have gotten as far as eliminating istart, getting j, and using outer addition to get istart+j. But how do I use the index k to get a 2D tChunked array in a single line? That is where I am stuck.
istart = np.linspace(0,math.ceil(n1*n2/2),num=n1,endpoint=False,dtype=np.int32)
jstart = np.linspace(0,n2,num=n2,endpoint=False,dtype=np.int32)
k = jstart[:,np.newaxis]+istart
numpy will output a 2D array if the index is 2D. So you simply do this.
def test2(n1, t):
    n2 = int(2 * t.size / (n1 + 1))
    istart = np.linspace(0, math.ceil(n1 * n2 / 2), num=n1, endpoint=False, dtype=np.int32)
    jstart = np.linspace(0, n2, num=n2, endpoint=False, dtype=np.int32)
    k = istart[:, np.newaxis] + jstart  # Note: I switched i and j.
    tChunked = t[k]  # This creates an array of the same shape as k.
    return tChunked
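A quick sanity check with the sizes from the question (assuming the imports from the question are in place):
t = np.arange(0, 300)
out = test2(7, t)
print(out.shape)   # (7, 75), same shape as the original test()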
If you have to deal with a lot of nested loops, maybe the solution is to use numba, as it can result in better performance than native numpy, especially for loop-heavy functions like the one you showed.
As easy as:
from numba import njit

@njit
def test(n1, t):
    n2 = int(2*t.size/(n1+1))
    print(n2)
    tChunked = np.zeros(shape=(n1, n2))
    for i in range(0, n1):
        istart = int(i*n2/2)
        for j in range(0, n2):
            tChunked[i, j] = t[istart+j]
    return tChunked
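A minimal call sketch (assuming numpy is imported as np; note that the first call includes numba's JIT compilation time, so time a later call to see the steady-state speed):
t = np.arange(0, 300)
tc = test(7, t)    # first call triggers compilation
tc = test(7, t)    # subsequent calls run the compiled version
print(tc.shape)    # (7, 75)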
What made me fall head-over-heels in love with Python was actually NumPy and specifically its amazing indexing and indexing routines!
In test_extra_crispy() we can use zip() to get our ducks (initial conditions) in a row, then indexing using offsets to do the "transplanting" of blocks of values:
i_values = np.arange(7)
istarts = (i_values * n2 / 2).astype(int)
for i, istart in zip(i_values, istarts):
    tChunked[i, :n2] = t[istart:istart+n2]
See also
How to np.roll() faster? (and all the linked questions on that page)
Best way to insert values of 3D array inside of another larger array (for higher dimensions)
We can see that for
t = np.arange(10000000)
n1 = 7
"extra crispy" is a lot faster than the original (91 vs 4246 ms), but only a little faster than test2() from ken's answer which is not significant considering that it does more careful checking than my brute force treatment.
However "extra crispy" avoids adding an additional axis and expanding the memory usage. For modest arrays this doesn't matter at all. But if you have a huge array then disk space and access time becomes a real problem.
"Just use Jit"?
Zaero Divide's answer points out that @jit goes a long way towards converting impractical python nested loops to fast compiled do loops.
But always check out what methods are already available in NumPy which are already running efficiently with compiled code. One can easily get a gigaflop on a laptop for problems amenable to NumPy's many handy methods.
In my case test_jit() ran four times slower than "extra crispy" (350 vs 91 ms)!
You shouldn't write a script that depends on @jit to run fast if you don't need to; not everyone wants to "just use @jit".
Strangely shaped indexing
If you need to address a more random-shaped volume within an array, you can use indexing like this:
array = np.array([[0, 0, 1, 0, 0], [0, 1, 0, 1, 0], [1, 0, 0, 0, 1], [0, 1, 0, 1, 0], [0, 0, 1, 0, 0]])
print(array)
gives
[[0 0 1 0 0]
[0 1 0 1 0]
[1 0 0 0 1]
[0 1 0 1 0]
[0 0 1 0 0]]
and we can get indices for the 1's like this:
i, j = np.where(array == 1)
print(i)
print(j)
which gives
[0 1 1 2 2 3 3 4]
[2 1 3 0 4 1 3 2]
If we want to start with a zeroed array and insert those 1's via numpy indexing, just do this
array = np.zeros((5, 5), dtype=int)
array[i, j] = 1
which recreates the original array.
import numpy as np
import matplotlib.pyplot as plt
import time
def test_original(n1, t):
    n2 = int(2*t.size / (n1 + 1))
    tChunked = np.zeros(shape=(n1, n2))
    for i in range(n1):
        istart = int(i * n2 / 2)
        for j in range(0, n2):
            tChunked[i, j] = t[istart + j]
    return tChunked
t = np.arange(10000000)
n1 = 7
t_start = time.process_time()
tc_original = test_original(n1, t)
print('original process time (ms)', round(1000*(time.process_time() - t_start), 3))
# print('tc_original.shape: ', tc_original.shape)
fig, ax = plt.subplots(1, 1)
for thing in tc_original:
    ax.plot(thing)
plt.show()
def test_extra_crispy(n1, t):
    n2 = int(2*t.size / (n1 + 1))
    tChunked = np.zeros(shape=(n1, n2))
    i_values = np.arange(7)
    istarts = (i_values * n2 / 2).astype(int)
    for i, istart in zip(i_values, istarts):
        tChunked[i, :n2] = t[istart:istart+n2]
    return tChunked
t_start = time.process_time()
tc_extra_crispy = test_extra_crispy(n1, t)
print('extra crispy process time (ms)', round(1000*(time.process_time() - t_start), 3))
# print('tc_extra_crispy.shape: ', tc_extra_crispy.shape)
print('np.all(tc_extra_crispy == tc_original): ', np.all(tc_extra_crispy == tc_original))
import math
def test2(n1, t):  # https://stackoverflow.com/a/72492815/3904031
    n2 = int(2 * t.size / (n1 + 1))
    istart = np.linspace(0, math.ceil(n1 * n2 / 2), num=n1, endpoint=False, dtype=np.int32)
    jstart = np.linspace(0, n2, num=n2, endpoint=False, dtype=np.int32)
    k = istart[:, np.newaxis] + jstart  # Note: I switched i and j.
    tChunked = t[k]  # This creates an array of the same shape as k.
    return tChunked
t_start = time.process_time()
tc_test2 = test2(n1, t)
print('test2 process time (ms)', round(1000*(time.process_time() - t_start), 3))
# print('tc_test2.shape: ', tc_test2.shape)
print('np.all(tc_test2 == tc_original): ', np.all(tc_test2 == tc_original))
# from https://stackoverflow.com/a/72494777/3904031
from numba import njit
@njit
def test_jit(n1, t):
    n2 = int(2*t.size/(n1+1))
    print(n2)
    tChunked = np.zeros(shape=(n1, n2))
    for i in range(0, n1):
        istart = int(i*n2/2)
        for j in range(0, n2):
            tChunked[i, j] = t[istart+j]
    return tChunked
t_start = time.process_time()
tc_jit = test_jit(n1, t)
print('jit process time (ms)', round(1000*(time.process_time() - t_start), 3))
# print('tc_jit.shape: ', tc_jit.shape)
print('np.all(tc_jit == tc_original): ', np.all(tc_jit == tc_original))

Vectorized running bin index calculation with Tensorflow or numpy

I have an integer array like this:
in=[1, 2, 6, 1, 3, 2, 1]
I would like to calculate a running index for the equal values in the array. For the array above, the output would be:
out=[0, 0, 0, 1, 0, 1, 2]
So the naive implementation would be to have a counter for all the values. I would like to have a vectorized solution to run it with tensorflow, perhaps with numpy.
I already thought of creating a 2D tensor of shape=(in.shape[0], tf.max(in), ) and writing 1 to the tensor[i, in[i]] cell, and then call a cumsum column-wise, then writing back row-wise. But my input array is quite big (with several 100k entries) with the maximum value of ~500k, thus this sparse matrix wouldn't even fit into the memory.
Do you have better suggestions? Thank you!
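For concreteness, a minimal sketch of the dense one-hot-plus-cumsum idea described above (it produces the right answer, but allocates an n x (max+1) matrix, which is exactly the memory problem noted in the question):
import numpy as np

a = np.array([1, 2, 6, 1, 3, 2, 1])
onehot = np.zeros((a.size, a.max() + 1), dtype=np.int64)
onehot[np.arange(a.size), a] = 1           # mark each value's column
running = onehot.cumsum(axis=0)            # column-wise running counts
out = running[np.arange(a.size), a] - 1    # read back row-wise, 0-based
print(out)                                 # [0 0 0 1 0 1 2]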
Here's a pandas solution:
s = pd.Series([1, 2, 6, 1, 3, 2, 1])
s.groupby(s).cumcount().values
Output:
array([0, 0, 0, 1, 0, 1, 2], dtype=int64)
Test on similar sized data:
s = pd.Series(np.random.randint(0,500000, 100000))
%timeit -n 100 s.groupby(s).cumcount().values
# 23.9 ms ± 562 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
You can use an actual sparse matrix, i.e. use sparse storage. With that an input like a = np.random.randint(0,5*10**5,10**6) is no problem:
import numpy as np
from scipy import sparse

def running(a):
    n, m = a.size, a.max()+1
    aux = sparse.csr_matrix((np.ones_like(a), a, np.arange(n+1)), (n, m)).tocsc()
    msk = aux.indptr[1:] != aux.indptr[:-1]
    indptr = aux.indptr[:-1][msk]
    aux.data[0] = 0
    aux.data[indptr[1:]] -= np.diff(indptr)
    out = np.empty_like(a)
    out[aux.indices] = aux.data.cumsum()
    return out
# alternative method for validation
def use_argsort(a):
    indices = a.argsort(kind="stable")
    ao = a[indices]
    indptr = np.concatenate([[0], (ao[1:] != ao[:-1]).nonzero()[0]+1])
    data = np.ones_like(a)
    data[0] = 0
    data[indptr[1:]] -= np.diff(indptr)
    out = np.empty_like(a)
    out[indices] = data.cumsum()
    return out
in_ = np.array([1, 2, 6, 1, 3, 2, 1])
print("OP example",in_,"->",running(in_))
print("second opinion","->",use_argsort(in_))
from timeit import timeit
A = np.random.randint(0,500_000,1_000_000)
print("large example (500k labels, 1M entries) takes",
timeit(lambda:running(A),number=10)*100,"ms")
print("using other method takes",
timeit(lambda:use_argsort(A),number=10)*100,"ms")
print("same result:",(use_argsort(A) == running(A)).all())
Sample run:
OP example [1 2 6 1 3 2 1] -> [0 0 0 1 0 1 2]
second opinion -> [0 0 0 1 0 1 2]
large example (500k labels, 1M entries) takes 84.1427305014804 ms
using other method takes 262.38483290653676 ms
same result: True

Finding those elements in an array which are "close"

I have an 1 dimensional sorted array and would like to find all pairs of elements whose difference is no larger than 5.
A naive approach would be to make N^2 comparisons, doing something like
diffs = np.tile(x, (x.size,1) ) - x[:, np.newaxis]
D = np.logical_and(diffs>0, diffs<5)
indices = np.argwhere(D)
Note here that the output of my example are indices of x. If I wanted the values of x which satisfy the criteria, I could do x[indices].
This works for smaller arrays, but not arrays of the size with which I work.
An idea I had was to find where there are gaps larger than 5 between consecutive elements. I would split the array into two pieces, and compare all the elements in each piece.
Is this a more efficient way of finding elements which satisfy my criteria? How could I go about writing this?
Here is a small example:
x = np.array([ 9, 12,
21,
36, 39, 44, 46, 47,
58,
64, 65,])
the result should look like
array([[ 0, 1],
[ 3, 4],
[ 5, 6],
[ 5, 7],
[ 6, 7],
[ 9, 10]], dtype=int64)
Here is a solution that iterates over offsets while shrinking the set of candidates until there are none left:
import numpy as np

def f_pp(A, maxgap):
    d0 = np.diff(A)
    d = d0.copy()
    IDX = []
    k = 1
    idx, = np.where(d <= maxgap)
    vidx = idx[d[idx] > 0]
    while vidx.size:
        IDX.append(vidx[:, None] + (0, k))
        if idx[-1] + k + 1 == A.size:
            idx = idx[:-1]
        d[idx] = d[idx] + d0[idx+k]
        k += 1
        idx = idx[d[idx] <= maxgap]
        vidx = idx[d[idx] > 0]
    return np.concatenate(IDX, axis=0)
data = np.cumsum(np.random.exponential(size=10000)).repeat(np.random.randint(1, 20, (10000,)))
pairs = f_pp(data, 1)
#pairs = set(map(tuple, pairs))
from timeit import timeit
kwds = dict(globals=globals(), number=100)
print(data.size, 'points', pairs.shape[0], 'close pairs')
print('pp', timeit("f_pp(data, 1)", **kwds)*10, 'ms')
Sample run:
99963 points 1020651 close pairs
pp 43.00256529124454 ms
Your idea of slicing the array is a very efficient approach. Since your data are sorted you can just calculate the difference and split it:
d = np.diff(x)
ind = np.where(d > 5)[0] + 1
pieces = np.split(x, ind)
Here pieces is a list, which you can then use in a loop, applying your own code to every element, as in the sketch below.
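For example, a minimal sketch that combines this splitting with the brute-force comparison from the question (the criterion mirrors the question's diffs > 0 and diffs < 5; local indices are shifted back to positions in the full array):
import numpy as np

def pairs_by_splitting(x, maxdiff=5):
    # A qualifying pair can never straddle a gap larger than maxdiff,
    # so each piece can be processed independently.
    ind = np.where(np.diff(x) > maxdiff)[0] + 1
    pairs, offset = [], 0
    for piece in np.split(x, ind):
        diffs = piece[None, :] - piece[:, None]           # diffs[i, j] = piece[j] - piece[i]
        local = np.argwhere((diffs > 0) & (diffs < maxdiff))
        pairs.append(local + offset)                      # shift back to global indices
        offset += piece.size
    return np.concatenate(pairs)

x = np.array([9, 12, 21, 36, 39, 44, 46, 47, 58, 64, 65])
print(pairs_by_splitting(x))   # matches the expected output in the question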
The best algorithm is highly dependent on the nature of your data, which I'm unaware of. For example, another possibility is to write a nested loop:
pairs = []
for i in range(x.size):
    j = i + 1
    while j < x.size and x[j] - x[i] <= 5:
        pairs.append([i, j])
        j += 1
If you want it to be more clever, you can edit the outer loop in a way to jump when j hits a gap.

Conditional nd argmin: How can I find the coordinates of the min of a subset of a multidimensional array?

I know I can use argmin and unravel_index to find the index of the smallest value in an ndarray, but what if I want to find the smallest nonzero element, or the smallest element which is not NaN?
Here's an approach using flattened indices -
def flatnonzero_based(a, condition):  # condition = (a != 0) or ~np.isnan(a)
    idx = np.flatnonzero(condition)
    return np.unravel_index(idx[np.take(a, idx).argmin()], a.shape)
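A quick toy check (the array here is just an illustration, assuming numpy is imported as np):
a = np.array([[3., 0., 2.],
              [7., 1., 0.]])
print(flatnonzero_based(a, a != 0))         # (1, 1) -> smallest nonzero value, 1.0
print(flatnonzero_based(a, ~np.isnan(a)))   # (0, 1) -> smallest non-NaN value, 0.0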
Benchmarking
Approaches -
def flatnonzero_based(a, condition):  # Proposed soln
    idx = np.flatnonzero(condition)
    return np.unravel_index(idx[np.take(a, idx).argmin()], a.shape)

def where_based(a, condition):  # @Paul Panzer's soln
    nz = np.where(condition)
    return np.array(nz)[:, np.argmin(a[nz])]
Timings and verification -
In [233]: a = np.random.rand(40,50,30)
In [234]: nan_idx = np.random.choice(range(a.size), size = a.size//100, replace=0)
In [235]: a.ravel()[nan_idx] = np.nan
In [236]: condition = ~np.isnan(a)
In [237]: where_based(a, condition)
Out[237]: array([16, 10, 8])
In [238]: flatnonzero_based(a, condition)
Out[238]: (16, 10, 8)
In [239]: %timeit where_based(a, condition)
1000 loops, best of 3: 877 µs per loop
In [240]: %timeit flatnonzero_based(a, condition)
10000 loops, best of 3: 143 µs per loop
With 4D data -
In [255]: a = np.random.rand(40,50,30,30)
In [256]: nan_idx = np.random.choice(range(a.size), size = a.size//100, replace=0)
In [257]: a.ravel()[nan_idx] = np.nan
In [258]: condition = ~np.isnan(a)
In [259]: where_based(a, condition)
Out[259]: array([34, 14, 5, 10])
In [260]: flatnonzero_based(a, condition)
Out[260]: (34, 14, 5, 10)
In [261]: %timeit where_based(a, condition)
10 loops, best of 3: 64.9 ms per loop
In [262]: %timeit flatnonzero_based(a, condition)
100 loops, best of 3: 5.32 ms per loop
Incorporating @user7138814's suggestion -
In [267]: np.unravel_index(np.nanargmin(a), a.shape)
Out[267]: (34, 14, 5, 10)
In [268]: %timeit np.unravel_index(np.nanargmin(a), a.shape)
100 loops, best of 3: 4.54 ms per loop
This should work (condition is data != 0 or ~np.isnan(data))
nz = np.where(condition)
cond_arg_min = np.array(nz)[:, np.argmin(data[nz])]
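A short usage sketch (the toy array is just an illustration): nz is a tuple of index arrays, so stacking it and selecting the argmin column gives the coordinates.
import numpy as np

data = np.array([[3., 0., 2.],
                 [7., 1., 0.]])
condition = data != 0
nz = np.where(condition)
cond_arg_min = np.array(nz)[:, np.argmin(data[nz])]
print(cond_arg_min)   # [1 1] -> position of the smallest nonzero value, 1.0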

Creating image from point list with Numpy, how to speed up?

I have the following code, which seems to be a performance bottleneck:
for x, y, intensity in myarr:
    target_map[x, y] = target_map[x, y] + intensity
There are multiple entries for the same coordinate, each with a different intensity.
Datatypes:
> print myarr.shape, myarr.dtype
(219929, 3) uint32
> print target_map.shape, target_map.dtype
(150, 200) uint32
Is there any way to optimize this loop, other than writing it in C?
This seems to be a related question, however I couldn't get the accepted answer working for me: How to convert python list of points to numpy image array?
I get following error message:
Traceback (most recent call last):
File "<pyshell#38>", line 1, in <module>
image[coordinates] = 1
IndexError: too many indices for array
If you convert your 2D coordinates into target_map into flat indices using np.ravel_multi_index, you can use np.unique and np.bincount to speed things up quite a bit:
def vec_intensity(my_arr, target_map):
    flat_coords = np.ravel_multi_index((my_arr[:, 0], my_arr[:, 1]),
                                       dims=target_map.shape)
    unique_, idx = np.unique(flat_coords, return_inverse=True)
    sum_ = np.bincount(idx, weights=my_arr[:, 2])
    target_map.ravel()[unique_] += sum_
    return target_map

def intensity(my_arr, target_map):
    for x, y, intensity in my_arr:
        target_map[x, y] += intensity
    return target_map

# sample data set
rows, cols = 150, 200
items = 219929
myarr = np.empty((items, 3), dtype=np.uint32)
myarr[:, 0] = np.random.randint(rows, size=(items,))
myarr[:, 1] = np.random.randint(cols, size=(items,))
myarr[:, 2] = np.random.randint(100, size=(items,))
And now:
In [6]: %timeit target_map_1 = np.zeros((rows, cols), dtype=np.uint32); target_map_1 = vec_intensity(myarr, target_map_1)
10 loops, best of 3: 53.1 ms per loop
In [7]: %timeit target_map_2 = np.zeros((rows, cols), dtype=np.uint32); target_map_2 = intensity(myarr, target_map_2)
1 loops, best of 3: 934 ms per loop
In [8]: np.all(target_map_1 == target_map_2)
Out[8]: True
That's almost a 20x speed increase.
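Another option worth knowing about is np.add.at, which performs unbuffered in-place addition, so repeated coordinates all accumulate correctly. It may not be as fast as the bincount approach above, but it is a concise way to express the same accumulation; a minimal sketch:
def add_at_intensity(my_arr, target_map):
    # Unbuffered equivalent of target_map[x, y] += intensity for every row,
    # so duplicate (x, y) coordinates are all applied.
    np.add.at(target_map, (my_arr[:, 0], my_arr[:, 1]), my_arr[:, 2])
    return target_map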