Access multiple dimensions of an array ("mixed indexing") in a single call? - numpy

Given the following array with shape (samples, rows, columns):
arr_3d = np.array([
    [
        [ 1,  2,  3],
        [ 4,  5,  6],
        [ 7,  8,  9]
    ],
    [
        [10, 11, 12],
        [13, 14, 15],
        [16, 17, 18]
    ],
    [
        [19, 20, 21],
        [22, 23, 24],
        [25, 26, 27]
    ]
])
This works, but can I select samples and columns in the same [_,_,_] call?
>>> arr_3d[[1,2],:,:][:,:,[0,1]]
array([
[
[10, 11],
[13, 14],
[16, 17]
],
[
[19, 20],
[22, 23],
[25, 26]
]
])
This does not behave as expected:
>>> arr_3d[[1,2],:,[0,1]]
array([
[10, 13, 16],
[20, 23, 26]
])
UPDATE: It looks like this is a known challenge of mixed indexing:
https://numpy.org/neps/nep-0021-advanced-indexing.html#mixed-indexing

You can use np.ix_, which builds an open mesh so that each index list applies to its own axis:
arr_3d[np.ix_([1,2], np.arange(arr_3d.shape[1]), [0,1])]
Output:
array([[[10, 11],
[13, 14],
[16, 17]],
[[19, 20],
[22, 23],
[25, 26]]])
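To see why the single-call form returns a (2, 3) array, and what np.ix_ does differently, here is a minimal sketch. It rebuilds arr_3d with np.arange for brevity (an assumption; the values match the array in the question), and the names paired, chained, and meshed are illustrative only.

```python
import numpy as np

# Same values as arr_3d in the question: 3 samples x 3 rows x 3 columns.
arr_3d = np.arange(1, 28).reshape(3, 3, 3)

# Index lists on axes 0 and 2 are broadcast against each other, so they are
# consumed as pairs: (sample 1, column 0) and (sample 2, column 1).
paired = arr_3d[[1, 2], :, [0, 1]]
print(paired.shape)   # (2, 3)

# Two-step version from the question: pick samples first, then columns.
chained = arr_3d[[1, 2], :, :][:, :, [0, 1]]

# np.ix_ reshapes the index lists to (2,1,1), (1,3,1) and (1,1,2), so
# broadcasting yields every combination instead of pairs.
meshed = arr_3d[np.ix_([1, 2], np.arange(arr_3d.shape[1]), [0, 1])]
print(meshed.shape)   # (2, 3, 2)
print(np.array_equal(meshed, chained))   # True
```

The pairing behaviour is what makes the plain [[1,2],:,[0,1]] form drop one dimension: two samples are matched with two columns one-to-one rather than combined.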

Related

NetworkX: Layout to evenly space connected nodes in an image without large empty spaces

I generate graphs from a large set of JSON files, and I have no a priori information about node positions in the graph image. As a result, when I draw these graphs, the nodes and edges are unevenly arranged in the image, with lots of unused empty space.
The following is an example of a program that generates a connected graph of 38 nodes.
With the default NetworkX image size, connected nodes overlap each other, and with an increased image size, large empty spaces appear.
How can I create a layout that arranges nodes and edges evenly, taking the image size into account, without large empty spaces?
import networkx as nx
import matplotlib.pyplot as plt
import random
import string
def generate_label(i):
    label = str(i)+':'+random.choice(['q','a'])+':' \
        +''.join(random.sample(string.ascii_letters, 3))
    return label
edges = [[0, 16], [1, 13], [2, 20], [17, 2], [3, 28], [17, 3], [4, 27],
[17, 4], [7, 26], [17, 7], [21, 9], [29, 10], [31, 11], [32, 12],
[1, 13], [0, 16], [17, 18], [17, 2], [17, 21], [17, 22], [17, 3],
[17, 4], [17, 29], [17, 24], [17, 7], [18, 19], [17, 18], [18, 19],
[2, 20], [21, 9], [17, 21], [22, 23], [17, 22], [22, 23], [24, 25],
[17, 24], [24, 25], [7, 26], [4, 27], [3, 28], [29, 10], [17, 29],
[30, 31], [30, 32], [30, 33], [30, 31], [31, 11], [30, 32], [32, 12],
[30, 33], [34, 35], [34, 35]]
G = nx.Graph()
for i in range(38):
    G.add_node(i, label = generate_label(i))
for e in edges:
    G.add_edge(e[0], e[1])
labels = nx.get_node_attributes(G, 'label')
plt.figure(figsize=(14,20))
nx.draw_networkx(nx.relabel_nodes(G, labels), with_labels=True,
                 node_color = 'orange', node_size=200, font_size=12)
plt.show()

Combination of slicing and array index in numpy

Having looked at the answers to the question "How to understand numpy's combined slicing and indexing example",
I'm still unable to understand the result of indexing with a combination of a slice and two 1-D arrays, like this:
>>> m = np.arange(36).reshape(3,3,4)
>>> m
array([[[ 0, 1, 2, 3],
[ 4, 5, 6, 7],
[ 8, 9, 10, 11]],
[[12, 13, 14, 15],
[16, 17, 18, 19],
[20, 21, 22, 23]],
[[24, 25, 26, 27],
[28, 29, 30, 31],
[32, 33, 34, 35]]])
>>> m[1:3, [2,1],[2,1]]
array([[22, 17],
[34, 29]])
Why is the result equivalent to this?
np.array([
[m[1,2,2],m[1,1,1]],
[m[2,2,2],m[2,1,1]]
])
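One way to check the equivalence is to spell out how NumPy pairs the two index lists. The sketch below is an illustration rather than a full account of the indexing rules: the lists [2,1] and [2,1] are broadcast together and read as the coordinate pairs (2,2) and (1,1) on the last two axes, while the slice 1:3 keeps its own axis. The names rows, cols, and by_hand are hypothetical.

```python
import numpy as np

m = np.arange(36).reshape(3, 3, 4)

rows = [2, 1]   # indices along axis 1
cols = [2, 1]   # indices along axis 2

result = m[1:3, rows, cols]

# The slice selects blocks 1 and 2; within each block the index lists are
# zipped, giving the elements at (2, 2) and (1, 1).
by_hand = np.array([[m[i, r, c] for r, c in zip(rows, cols)]
                    for i in range(1, 3)])

print(result)
print(np.array_equal(result, by_hand))   # True
```

Because the two lists are zipped rather than combined, the result has one column per pair, not a 2 x 2 grid of row and column choices per block.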

NumPy: how to filter out the first axes of multidimensional array according to some condition on the elements

Consider the following ndarray lm:
In [135]: lm
Out[135]:
array([[[15, 7],
[ 2, 3],
[ 0, 4]],
[[ 8, 12],
[ 6, 5],
[17, 10]],
[[16, 13],
[30, 1],
[14, 9]]])
In [136]: lm.shape
Out[136]: (3, 3, 2)
I want to filter out members of the first axis (lm[0], lm[1], ...) where at least one of the elements is greater than 20. Since lm[2, 1, 0] is the only element that fulfills this condition, I would expect the following result:
array([[[15, 7],
[ 2, 3],
[ 0, 4]],
[[ 8, 12],
[ 6, 5],
[17, 10]]])
i.e. lm[2] has at least one element > 20, so it is filtered out of the result set. How can I achieve this?
There are two ways to do this, using np.all or np.any with a tuple-valued axis argument:
In [14]: lm[(lm<=20).all(axis=(1,2))]
Out[14]:
array([[[15, 7],
[ 2, 3],
[ 0, 4]],
[[ 8, 12],
[ 6, 5],
[17, 10]]])
In [15]: lm[~(lm>20).any(axis=(1,2))]
Out[15]:
array([[[15, 7],
[ 2, 3],
[ 0, 4]],
[[ 8, 12],
[ 6, 5],
[17, 10]]])
To make this generic, so that it reduces over the last two axes of an ndarray with any number of dimensions, use axis=(-2,-1) instead.
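As a quick check of the two masks and the axis=(-2,-1) variant, here is a small sketch using the lm array from the question. The variable names keep_all, keep_any, and keep_generic are illustrative.

```python
import numpy as np

lm = np.array([[[15, 7], [2, 3], [0, 4]],
               [[8, 12], [6, 5], [17, 10]],
               [[16, 13], [30, 1], [14, 9]]])

# One boolean per entry along the first axis: True if every element is <= 20.
keep_all = (lm <= 20).all(axis=(1, 2))
# Same mask, phrased as "no element is > 20".
keep_any = ~(lm > 20).any(axis=(1, 2))
# Negative axes refer to the last two axes regardless of lm.ndim.
keep_generic = (lm <= 20).all(axis=(-2, -1))

print(keep_all)             # [ True  True False]
print(lm[keep_all].shape)   # (2, 3, 2)
```

For this 3-D array all three masks are identical; the negative-axis form only differs once the array has more than three dimensions.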

Most efficient way to reshape tensor into sequences

I am working with audio in TensorFlow and would like to obtain a series of sequences by sliding a window over my data. Examples to illustrate my situation:
Current Data Format:
Shape = [batch_size, num_features]
example = [
[1, 2, 3],
[4, 5, 6],
[7, 8, 9],
[10, 11, 12],
[13, 14, 15]
]
What I want:
Shape = [batch_size - window_length + 1, window_length, num_features]
example = [
[
[1, 2, 3],
[4, 5, 6],
[7, 8, 9]
],
[
[4, 5, 6],
[7, 8, 9],
[10, 11, 12]
],
[
[7, 8, 9],
[10, 11, 12],
[13, 14, 15]
],
]
My current solution is to do something like this:
list_of_windows_of_data = []
for x in range(batch_size - window_length + 1):
    list_of_windows_of_data.append(tf.slice(data, [x, 0], [window_length,
                                                            num_features]))
windowed_data = tf.squeeze(tf.stack(list_of_windows_of_data, axis=0))
This does the transform. However, it also creates 20,000 operations, which slows TensorFlow down a lot when building the graph. If anyone has a fun and more efficient way to do this, please do share.
You can do that using tf.map_fn as follows:
example = tf.constant([
[1, 2, 3],
[4, 5, 6],
[7, 8, 9],
[10, 11, 12],
[13, 14, 15]
]
)
res = tf.map_fn(lambda i: example[i:i+3], tf.range(example.shape[0]-2), dtype=tf.int32)
sess=tf.InteractiveSession()
res.eval()
This prints
array([[[ 1, 2, 3],
[ 4, 5, 6],
[ 7, 8, 9]],
[[ 4, 5, 6],
[ 7, 8, 9],
[10, 11, 12]],
[[ 7, 8, 9],
[10, 11, 12],
[13, 14, 15]]])
Alternatively, you could use the built-in tf.extract_image_patches (exposed as tf.image.extract_patches in TensorFlow 2):
example = tf.constant([
[1, 2, 3],
[4, 5, 6],
[7, 8, 9],
[10, 11, 12],
[13, 14, 15]
]
)
res = tf.reshape(tf.extract_image_patches(example[None,...,None],
[1,3,3,1], [1,1,1,1], [1,1,1,1], 'VALID'), [-1,3,3])
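For comparison, the windowing can also be expressed as a single gather with a 2-D index matrix. The sketch below uses NumPy to show the index arithmetic; it is a swapped-in illustration rather than either answer's method, and window_length, num_windows, and idx are names chosen here. In TensorFlow, tf.gather(example, idx) should give the same result, though that is not tested here.

```python
import numpy as np

example = np.array([[1, 2, 3],
                    [4, 5, 6],
                    [7, 8, 9],
                    [10, 11, 12],
                    [13, 14, 15]])
window_length = 3
num_windows = example.shape[0] - window_length + 1

# Row i of idx holds the row numbers of window i: [i, i+1, ..., i+window_length-1].
idx = np.arange(num_windows)[:, None] + np.arange(window_length)[None, :]

# Indexing with a 2-D integer array gathers whole rows, giving
# shape (num_windows, window_length, num_features).
windows = example[idx]
print(windows.shape)   # (3, 3, 3)
```

The index matrix depends only on the shapes, so it is built once and the data is touched by a single indexing operation instead of one slice per window.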

numpy: Efficient way to use a 1D array as an index into 2D array

X.shape == (10,4)
y.shape == (10,)
I'd like to produce M, where each entry in M is defined as M[r,c] == X[r, y[r]]; that is, use y to index into the appropriate column of X.
How can I do this efficiently (without loops)?
M could have a single column, though eventually I need to broadcast it so that it has the same shape as X. c starts from the first column of X (0) and goes to the last (3).
Just do:
X=np.arange(40).reshape(10,4)
Y=np.random.randint(0,4,10)
M=X[range(10),Y]
For example:
In [8]: X
Out[8]:
array([[ 0, 1, 2, 3],
[ 4, 5, 6, 7],
[ 8, 9, 10, 11],
[12, 13, 14, 15],
[16, 17, 18, 19],
[20, 21, 22, 23],
[24, 25, 26, 27],
[28, 29, 30, 31],
[32, 33, 34, 35],
[36, 37, 38, 39]])
In [9]: Y
Out[9]: array([1, 1, 3, 3, 1, 2, 2, 3, 2, 1])
In [10]: M
Out[10]: array([ 1, 5, 11, 15, 17, 22, 26, 31, 34, 37])
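The question also asks for M to end up with the same shape as X. Here is a minimal sketch of one way to do that, using the Y values shown above; np.take_along_axis and np.broadcast_to are swapped in here and are not part of the original answer, and M_col and M_full are hypothetical names.

```python
import numpy as np

X = np.arange(40).reshape(10, 4)
Y = np.array([1, 1, 3, 3, 1, 2, 2, 3, 2, 1])

# Pairs row r with column Y[r]; np.arange(len(Y)) avoids hard-coding 10.
M = X[np.arange(len(Y)), Y]

# Same selection kept as a single column of shape (10, 1).
M_col = np.take_along_axis(X, Y[:, None], axis=1)

# Read-only view repeated across columns, so M_full[r, c] == X[r, Y[r]].
M_full = np.broadcast_to(M_col, X.shape)
print(M_full.shape)   # (10, 4)
```

np.broadcast_to returns a read-only view, so call .copy() on it if the result needs to be modified.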