Efficiently compute product of all other elements in NumPy

Let A be a 2D matrix. How can I compute a matrix B, such that each element of B is the product of all other entries in the same row of A?
Example:
A = np.array([[5, 0, 6],   # the input
              [3, 1, 9],
              [2, 0, 0]])
B = np.array([[0, 30, 0],  # the result
              [9, 27, 3],
              [0, 0, 0]])
The naïve strategy (B = np.prod(A, axis=-1, keepdims=True) / A) runs into division-by-zero errors, and unfortunately these zeros are important elsewhere in the program and cannot trivially be replaced with tiny epsilons.
I've tried using np.where to address the three cases (rows without zeros, rows with one zero, rows with multiple zeros), but although that prevents NaNs in the output, it still requires computing everything up front before letting np.where pick and choose element-wise, which seems like a lot of code and unnecessary computational effort (and still produces div-by-zero warnings in the process).
What is the smartest, fastest way of solving this problem?

I found this answer and, inspired by it, came up with the following efficient-ish solution:
import numpy as np
def products_of_others(a, axes=None):
    if axes is None:
        axes = tuple(range(a.ndim))
    if isinstance(axes, int):
        axes = (axes,)
    # normalize negative axes so that e.g. axes=-1 works
    axes = tuple(ax % a.ndim for ax in axes)
    # flatten the desired axes into one last dimension
    original_shape = a.shape
    other_axes = tuple(ax for ax in range(a.ndim) if ax not in axes)
    new_ax_order = other_axes + axes
    old_ax_order = np.argsort(new_ax_order)
    a = np.transpose(a, new_ax_order)
    a = np.reshape(a, [original_shape[ax] for ax in other_axes] + [np.prod([original_shape[ax] for ax in axes])])
    # shift by one position and pad with ones, so each entry excludes itself
    after = np.concatenate([a[..., 1:], np.ones_like(a[..., 0:1])], axis=-1)
    before = np.concatenate([np.ones_like(a[..., 0:1]), a[..., :-1]], axis=-1)
    # cumulative products from the right and from the left (no division needed)
    after_prod = np.cumprod(after[..., ::-1], axis=-1)[..., ::-1]
    before_prod = np.cumprod(before, axis=-1)
    # undo the flattening
    out = np.reshape(after_prod * before_prod, [original_shape[ax] for ax in other_axes] + [original_shape[ax] for ax in axes])
    out = np.transpose(out, old_ax_order)
    return out
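For the 2-D, row-wise case asked about in the question, the same prefix/suffix idea can be written more compactly. The following is only a minimal sketch: the name row_products_of_others is made up for this illustration and is not a NumPy function.

```python
import numpy as np

def row_products_of_others(a):
    """For each entry, multiply all other entries in the same row (no division)."""
    a = np.asarray(a)
    ones = np.ones_like(a[:, :1])
    # before[i, j] = product of a[i, :j]  (the empty product is 1)
    before = np.cumprod(np.concatenate([ones, a[:, :-1]], axis=1), axis=1)
    # after[i, j] = product of a[i, j+1:], via a cumulative product from the right
    shifted = np.concatenate([a[:, 1:], ones], axis=1)
    after = np.cumprod(shifted[:, ::-1], axis=1)[:, ::-1]
    return before * after

A = np.array([[5, 0, 6],
              [3, 1, 9],
              [2, 0, 0]])
B = row_products_of_others(A)
print(B)
```

Because no division is involved, rows with one or several zeros need no special handling and no warnings are raised.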

Related

Python: create (sparse) stacked diagonal block matrix

I need to create a matrix with the form
M=[
[a1, 0, 0],
[0, b1, 0],
[0, 0, c1],
[a2, 0, 0],
[0, b2, 0],
[0, 0, c2],
[a3, 0, 0],
[0, b3, 0],
[0, 0, c3],
...]
where a(i), b(i) and c(i) are [1xp] blocks. The resulting matrix M has the form [3m x 3p]. I am given the input data in the form of 3 matrices [m x p]:
A = [[a1.T, a2.T, a3.T, ...]].T
B = [[b1.T, b2.T, b3.T, ...]].T
C = [[c1.T, c2.T, c3.T, ...]].T
How can I create the matrix M? Ideally it would be sparse using the scipy.sparse library but I am even struggling creating it as a dense matrix using numpy. Is there no way around a loop or at least list comprehension in this case?
No need to make it complicated. For your scale, the following executes in less than a second.
import numpy as np
import scipy.sparse
from numpy.random import default_rng
rand = default_rng(seed=0)
m = 70_000
p = 20
abc = rand.random((3, m, p))
M_dense = np.zeros((m, 3, 3*p))
for i in range(3):
    M_dense[:, i, i*p:(i+1)*p] = abc[i, ...]
M_sparse = scipy.sparse.csr_matrix(M_dense.reshape((-1, 3*p)))
print(M_sparse.shape)
(210000, 60)
Far better, though, is to construct the sparse matrix directly. Note the permuted shape of abc.
abc = rand.random((m, 3, p))
data = abc.ravel()
indices = np.tile(np.arange(3*p), m)
indptr = np.arange(0, data.size+1, p)
M_sparse = scipy.sparse.csr_matrix((data, indices, indptr))
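As a sanity check, the direct CSR construction can be compared against the dense loop on a small input. This is a sketch under assumptions: the sizes m = 4 and p = 2 are chosen only for illustration, and the shape is passed explicitly rather than inferred.

```python
import numpy as np
import scipy.sparse

m, p = 4, 2  # small sizes, chosen only for illustration
rng = np.random.default_rng(0)
abc = rng.random((m, 3, p))  # abc[k, i] is the i-th block of the k-th group

# Dense reference, built with a loop over the three block columns
dense = np.zeros((m, 3, 3 * p))
for i in range(3):
    dense[:, i, i * p:(i + 1) * p] = abc[:, i, :]
dense = dense.reshape(-1, 3 * p)

# Direct CSR construction: every row stores exactly p values
data = abc.ravel()
indices = np.tile(np.arange(3 * p), m)    # column index of each stored value
indptr = np.arange(0, data.size + 1, p)   # row k starts at offset k * p
sparse = scipy.sparse.csr_matrix((data, indices, indptr), shape=(3 * m, 3 * p))

same = np.array_equal(sparse.toarray(), dense)
print(sparse.shape, same)
```

The tiled column indices work because, within each group of three rows, the stored values walk through columns 0 to 3p-1 exactly once.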

How to gather one element per row

Say I have the following tensor:
t = tf.convert_to_tensor([
[1,2,3,4],
[5,6,7,8]
])
and I have another index tensor:
i = tf.convert_to_tensor([[0],[2]])
How can I gather those elements so that [0] refers to the first row and [2] to the second one, giving [[1],[7]] as the result?
I was thinking of concatenating the indices with an incrementing row index, to get [[0,0],[1,2]], like this:
i = tf.concat((tf.range(i.shape[0])[...,None] , i), axis=-1)
tf.gather_nd(t, i)
but I feel there is a better solution
You can use TensorFlow variant of NumPy's take_along_axis,
tf.experimental.numpy.take_along_axis(t, i, axis=1)
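To show what take_along_axis computes, here is the same lookup in plain NumPy, whose np.take_along_axis the TensorFlow variant is modeled on. This is an illustrative sketch using NumPy arrays rather than tensors.

```python
import numpy as np

t = np.array([[1, 2, 3, 4],
              [5, 6, 7, 8]])
i = np.array([[0], [2]])  # one column index per row

# For each row r, pick t[r, i[r, 0]]
picked = np.take_along_axis(t, i, axis=1)
print(picked)
```

The index array must have the same number of dimensions as t, which is why i keeps its trailing axis of length 1.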
You can simply stack i with tf.range(...) as follows:
import tensorflow as tf
t = tf.convert_to_tensor([
[1,2,3,4],
[5,6,7,8]
])
i = tf.convert_to_tensor([0, 2])
length = tf.shape(i)[0]
indices = tf.stack([tf.range(length), i], axis=1)
# [[0, 0], [1, 2]]
tf.gather_nd(t, indices)
# [1, 7]
I'm not sure there is an essentially better solution.

How to transpose each element of a Numpy Matrix [duplicate]

I'm starting off with a numpy array of an image.
In[1]:img = cv2.imread('test.jpg')
The shape is what you might expect for a 640x480 RGB image.
In[2]:img.shape
Out[2]: (480, 640, 3)
However, this image that I have is a frame of a video, which is 100 frames long. Ideally, I would like to have a single array that contains all the data from this video such that img.shape returns (480, 640, 3, 100).
What is the best way to add the next frame -- that is, the next set of image data, another 480 x 640 x 3 array -- to my initial array?
A dimension can be added to a numpy array as follows:
image = image[..., np.newaxis]
Alternatively to
image = image[..., np.newaxis]
in @dbliss's answer, you can also use numpy.expand_dims like
image = np.expand_dims(image, <your desired dimension>)
For example (taken from the link above):
x = np.array([1, 2])
print(x.shape) # prints (2,)
Then
y = np.expand_dims(x, axis=0)
yields
array([[1, 2]])
and
y.shape
gives
(1, 2)
You could just create an array of the correct size up-front and fill it:
nframes = 100
frames = np.empty((480, 640, 3, nframes))
for k in range(nframes):
    frames[:,:,:,k] = cv2.imread('frame_{}.jpg'.format(k))
if the frames were individual jpg file that were named in some particular way (in the example, frame_0.jpg, frame_1.jpg, etc).
Just a note, you might consider using a (nframes, 480,640,3) shaped array, instead.
Pythonic
X = X[:, :, None]
which is equivalent to
X = X[:, :, numpy.newaxis] and
X = numpy.expand_dims(X, axis=-1)
But as you are explicitly asking about stacking images,
I would recommend going for stacking the list of images np.stack([X1, X2, X3]) that you may have collected in a loop.
If you do not like the order of the dimensions you can rearrange with np.transpose()
You can use np.concatenate() and its axis parameter to specify the dimension along which to concatenate. If the arrays being concatenated do not have this dimension yet, you can use np.newaxis to indicate where the new dimension should be added:
import numpy as np
# img[..., np.newaxis] turns (480, 640, 3) into (480, 640, 3, 1)
movie = np.concatenate((img1[..., np.newaxis], img2[..., np.newaxis]), axis=3)
If you are reading from many files:
import glob
import cv2
movie = np.concatenate([cv2.imread(p)[..., np.newaxis] for p in glob.glob('*.jpg')], axis=3)
Consider Approach 1 with reshape method and Approach 2 with np.newaxis method that produce the same outcome:
#Lets suppose, we have:
x = [1,2,3,4,5,6,7,8,9]
print('I. x',x)
xNpArr = np.array(x)
print('II. xNpArr',xNpArr)
print('III. xNpArr', xNpArr.shape)
xNpArr_3x3 = xNpArr.reshape((3,3))
print('IV. xNpArr_3x3.shape', xNpArr_3x3.shape)
print('V. xNpArr_3x3', xNpArr_3x3)
#Approach 1 with reshape method
xNpArrRs_1x3x3x1 = xNpArr_3x3.reshape((1,3,3,1))
print('VI. xNpArrRs_1x3x3x1.shape', xNpArrRs_1x3x3x1.shape)
print('VII. xNpArrRs_1x3x3x1', xNpArrRs_1x3x3x1)
#Approach 2 with np.newaxis method
xNpArrNa_1x3x3x1 = xNpArr_3x3[np.newaxis, ..., np.newaxis]
print('VIII. xNpArrNa_1x3x3x1.shape', xNpArrNa_1x3x3x1.shape)
print('IX. xNpArrNa_1x3x3x1', xNpArrNa_1x3x3x1)
We have as outcome:
I. x [1, 2, 3, 4, 5, 6, 7, 8, 9]
II. xNpArr [1 2 3 4 5 6 7 8 9]
III. xNpArr (9,)
IV. xNpArr_3x3.shape (3, 3)
V. xNpArr_3x3 [[1 2 3]
[4 5 6]
[7 8 9]]
VI. xNpArrRs_1x3x3x1.shape (1, 3, 3, 1)
VII. xNpArrRs_1x3x3x1 [[[[1]
[2]
[3]]
[[4]
[5]
[6]]
[[7]
[8]
[9]]]]
VIII. xNpArrNa_1x3x3x1.shape (1, 3, 3, 1)
IX. xNpArrNa_1x3x3x1 [[[[1]
[2]
[3]]
[[4]
[5]
[6]]
[[7]
[8]
[9]]]]
a = np.expand_dims(a, axis=-1)
or
a = a[..., np.newaxis]
or
a = a.reshape(a.shape + (1,))
There is no structure in numpy that allows you to append more data later.
Instead, NumPy puts all of your data into one contiguous chunk of numbers (basically a C array), and any resize requires allocating a new chunk of memory to hold it. NumPy's speed comes from being able to keep all the data of an array in the same chunk of memory; for example, mathematical operations can be parallelized for speed and you get fewer cache misses.
So you will have two kinds of solutions:
Pre-allocate the memory for the numpy array and fill in the values, like in JoshAdel's answer, or
Keep your data in a normal Python list until you actually need to put it all together (see below)
images = []
for i in range(100):
    new_image = load_image(i)  # hypothetical helper: pull image from somewhere
    images.append(new_image)
images = np.stack(images, axis=3)
Note that there is no need to expand the dimensions of the individual image arrays first, nor do you need to know how many images you expect ahead of time.
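A runnable version of the list-then-stack pattern might look like the sketch below. It uses small synthetic frames instead of real images; make_frame is a hypothetical stand-in for whatever actually reads a frame.

```python
import numpy as np

def make_frame(k, height=4, width=6):
    """Hypothetical stand-in for reading frame k from disk."""
    return np.full((height, width, 3), k, dtype=np.uint8)

frames = [make_frame(k) for k in range(5)]   # ordinary Python list
video = np.stack(frames, axis=-1)            # frames become the last axis
video_first = np.stack(frames, axis=0)       # alternative: frames as the first axis

print(video.shape)
print(video_first.shape)
```

Only one large allocation happens, at the np.stack call, no matter how many frames were appended to the list beforehand.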
You can use stack with the axis parameter:
img.shape # h,w,3
imgs = np.stack([img1,img2,img3,img4], axis=-1) # -1 = new axis is last
imgs.shape # h,w,3,nimages
For example, to convert grayscale to color:
>>> d = np.zeros((5,4), dtype=int) # 5x4
>>> d[2,3] = 1
>>> d3 = np.stack([d,d,d], axis=-1) # 5x4x3, -1 = new axis is last
>>> d3.shape
Out[30]: (5, 4, 3)
>>> d3[2,3]
Out[32]: array([1, 1, 1])
I followed this approach:
import numpy as np
import cv2
ls = []
for image in image_paths:  # image_paths: list of frame file names
    ls.append(cv2.imread(image))
img_np = np.array(ls) # shape (100, 480, 640, 3)
img_np = np.rollaxis(img_np, 0, 4) # shape (480, 640, 3, 100).
This worked for me:
image = image[..., None]
This will help you add axis anywhere you want
import numpy as np
signal = np.array([[0.3394572666491664, 0.3089068053925853, 0.3516359279582483], [0.33932706934615525, 0.3094755563319447, 0.3511973743219001], [0.3394407172182317, 0.30889042266755573, 0.35166886011421256], [0.3394407172182317, 0.30889042266755573, 0.35166886011421256]])
print(signal.shape)
#(4,3)
print(signal[..., np.newaxis].shape) # or signal[..., None]
#(4, 3, 1)
print(signal[:, np.newaxis, :].shape) # or signal[:, None, :]
#(4, 1, 3)
There are three ways to add a new dimension to an ndarray.
first: using "np.newaxis" (something like @dbliss's answer)
np.newaxis is just an alias for None that makes the code easier to
understand. If you replace np.newaxis with None, it works the same
way, but it's better to use np.newaxis to be more explicit.
import numpy as np
my_arr = np.array([2, 3])
new_arr = my_arr[..., np.newaxis]
print("old shape", my_arr.shape)
print("new shape", new_arr.shape)
>>> old shape (2,)
>>> new shape (2, 1)
second: using "np.expand_dims()"
Specify the original ndarray in the first argument and the position
to add the dimension in the second argument axis.
my_arr = np.array([2, 3])
new_arr = np.expand_dims(my_arr, -1)
print("old shape", my_arr.shape)
print("new shape", new_arr.shape)
>>> old shape (2,)
>>> new shape (2, 1)
third: using "reshape()"
my_arr = np.array([2, 3])
new_arr = my_arr.reshape(*my_arr.shape, 1)
print("old shape", my_arr.shape)
print("new shape", new_arr.shape)
>>> old shape (2,)
>>> new shape (2, 1)

how to use tf.scatter_nd "without" accumulation?

Let's say I have a (None, 2)-shaped tensor indices and a (None,)-shaped tensor values. The actual number of rows and the values are determined at runtime.
I would like to build a 4x5 tensor t in which each position listed in indices holds the corresponding entry of values. I found that I can use tf.scatter_nd like this:
t = tf.scatter_nd(indices, values, [4, 5])
# E.g., indices = [[1,2],[2,3]], values = [100, 200]
# t[1,2] <-- 100; t[2,3] <-- 200
My problem is that: when indices has duplicates, the values will be accumulated.
# E.g., indices = [[1,2],[1,2]], values = [100, 200]
# t[1,2] <-- 300
I would like to assign only one value, i.e., either ignore later duplicates (keeping the first value) or overwrite (keeping the last value).
I feel like I need to check duplicates in indices, or I need to use tensorflow loop. Could anyone please advise? (hopefully a minimal example code?)
You can use tf.unique: the only issue is that this op requires a 1D tensor.
Thus, to overcome this I decided to use the Cantor pairing function.
In short, there exists a bijective function that maps a tuple (in this case a pair of values, but it works for any N-dimensional tuple) to a single value.
Once the coordinates have been reduced to a 1-D tensor of scalar, then tf.unique can be used to find the indices of the unique numbers.
The Cantor pairing function is invertible, thus now we know not only the indices of the non-repeated values within the 1-D tensor, but we can also go back to the 2-D space of the coordinates and use scatter_nd to perform the update without the problem of the accumulator.
TL;DR:
import tensorflow as tf
import numpy as np
# Dummy values
indices = np.array([[1, 2], [2, 3]])
values = np.array([100, 200])
# Placeholders
indices_ = tf.placeholder(tf.int32, shape=(2, 2))
values_ = tf.placeholder(tf.float32, shape=(2))
# Use the Cantor tuple to create a one-to-one correspondence between the coordinates
# and a single value
x = tf.cast(indices_[:, 0], tf.float32)
y = tf.cast(indices_[:, 1], tf.float32)
z = (x + y) * (x + y + 1) / 2 + y # shape = (2)
# Collect the unique indices, treated as single values.
# The positions returned by tf.unique are not needed here.
unique_cantor, _ = tf.unique(z)
# Go back from Cantor numbers to pairs of values
w = tf.floor((tf.sqrt(8 * unique_cantor + 1) - 1) / 2)
t = (tf.pow(w, 2) + w) / 2
y = unique_cantor - t
x = w - y
# Recreate a batch of unique coordinates
unique_indices = tf.cast(tf.stack([x, y], axis=1), tf.int32)
# Update without accumulation. Note: scatter_nd needs one value per unique index,
# so with duplicates values_ must be reduced to match unique_indices.
go = tf.scatter_nd(unique_indices, values_, [4, 5])
with tf.Session() as sess:
    print(sess.run(go, feed_dict={indices_: indices, values_: values}))
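The Cantor pairing function and its inverse described above can be checked in isolation with plain Python integers. This sketch is independent of TensorFlow, and the helper names cantor_pair and cantor_unpair are made up for the illustration.

```python
from math import isqrt

def cantor_pair(x, y):
    """Map a pair of non-negative integers to a single non-negative integer."""
    return (x + y) * (x + y + 1) // 2 + y

def cantor_unpair(z):
    """Invert cantor_pair using exact integer arithmetic."""
    w = (isqrt(8 * z + 1) - 1) // 2
    t = (w * w + w) // 2
    y = z - t
    x = w - y
    return x, y

# Every coordinate of a 4x5 grid gets a distinct code and maps back to itself
pairs = [(x, y) for x in range(4) for y in range(5)]
codes = [cantor_pair(x, y) for x, y in pairs]
round_trip = [cantor_unpair(z) for z in codes]
print(cantor_pair(1, 2), cantor_unpair(8))
```

Integer arithmetic sidesteps the rounding that a float32 square root could introduce for large coordinates.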
Apply tf.scatter_nd also to a matrix of ones. That will give you the number of elements that are accumulated and you can just divide your result by that to get an average. (But watch out for zeros, for those you should divide by one).
counter = tf.ones(tf.shape(values))
t = tf.scatter_nd(indices,values,shape)
t_counter = tf.scatter_nd(indices,counter,shape)
Then divide t by t_counter (but only where t_counter is not zero).
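The counting idea can be tried out in plain NumPy, where np.add.at accumulates over repeated indices much as tf.scatter_nd does. This is a NumPy analogue for illustration, not TensorFlow code, and the example indices and values are made up.

```python
import numpy as np

shape = (4, 5)
indices = np.array([[1, 2], [1, 2], [2, 3]])   # [1, 2] appears twice
values = np.array([100.0, 200.0, 50.0])

rows, cols = indices[:, 0], indices[:, 1]
t = np.zeros(shape)
t_counter = np.zeros(shape)
# np.add.at accumulates when the same position is indexed more than once
np.add.at(t, (rows, cols), values)
np.add.at(t_counter, (rows, cols), 1.0)

# Divide by the count where something was written, and by one elsewhere
averaged = t / np.where(t_counter > 0, t_counter, 1.0)
print(averaged[1, 2], averaged[2, 3])
```

Note that this yields the mean of the duplicated values rather than the first or the last one.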
This may not be the best solution, but I used tf.unsorted_segment_max to avoid accumulation:
with tf.Session() as sess:
    # #########
    # Examples:
    # ##########
    width, height, depth = [3, 3, 2]
    indices = tf.cast([[0, 1, 0], [0, 1, 0], [1, 1, 1]], tf.int32)
    values = tf.cast([1, 2, 3], tf.int32)
    # ########################
    # Filter duplicated indices
    # #########################
    flatten = tf.matmul(indices, [[height * depth], [depth], [1]])
    filtered, idx = tf.unique(tf.squeeze(flatten))
    # #####################
    # Obtain updated result
    # #####################
    def reverse(index):
        """Map from 1-D to 3-D """
        # integer division, so the coordinates stay integers
        x = index // (height * depth)
        y = (index - x * height * depth) // depth
        z = index - x * height * depth - y * depth
        return tf.stack([x, y, z], -1)
    # This will pick the maximum value instead of accumulating the result
    updated_values = tf.unsorted_segment_max(values, idx, tf.shape(filtered)[0])
    updated_indices = tf.map_fn(fn=lambda i: reverse(i), elems=filtered)
    # Now you can scatter_nd without accumulation
    result = tf.scatter_nd(updated_indices,
                           updated_values,
                           tf.TensorShape([3, 3, 2]))

TensorFlow: numpy.repeat() alternative

I want to compare the predicted values yp from my neural network in a pairwise fashion, and so I was using (back in my old numpy implementation):
idx = np.repeat(np.arange(len(yp)), len(yp))
jdx = np.tile(np.arange(len(yp)), len(yp))
s = yp[[idx]] - yp[[jdx]]
This basically creates an indexing mesh which I then use: idx=[0,0,0,1,1,1,...] while jdx=[0,1,2,0,1,2,...]. I do not know if there is a simpler way of doing it...
Anyhow, TensorFlow has a tf.tile(), but it seems to be lacking a tf.repeat().
idx = np.repeat(np.arange(n), n)
v2 = v[idx]
And I get the error:
TypeError: Bad slice index [ 0 0 0 ..., 215 215 215] of type <type 'numpy.ndarray'>
It also does not work to use a TensorFlow constant for the indexing:
idx = tf.constant(np.repeat(np.arange(n), n))
v2 = v[idx]
TypeError: Bad slice index Tensor("Const:0", shape=TensorShape([Dimension(46656)]), dtype=int64) of type <class 'tensorflow.python.framework.ops.Tensor'>
The idea is to convert my RankNet implementation to TensorFlow.
You can achieve the effect of np.repeat() using a combination of tf.tile() and tf.reshape():
idx = tf.range(len(yp))
idx = tf.reshape(idx, [-1, 1]) # Convert to a len(yp) x 1 matrix.
idx = tf.tile(idx, [1, len(yp)]) # Create multiple columns.
idx = tf.reshape(idx, [-1]) # Convert back to a vector.
You can simply compute jdx using tf.tile():
jdx = tf.range(len(yp))
jdx = tf.tile(jdx, [len(yp)])
For the indexing, you could try using tf.gather() to extract non-contiguous slices from the yp tensor:
s = tf.gather(yp, idx) - tf.gather(yp, jdx)
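If only the pairwise differences are needed, the index vectors can be avoided altogether with broadcasting. The sketch below shows the equivalence in NumPy with made-up values for yp; since subtraction broadcasts in TensorFlow as well, the same pattern should carry over with tf.expand_dims and tf.reshape.

```python
import numpy as np

yp = np.array([0.5, 2.0, 3.5])
n = len(yp)

# Index-based version from the question
idx = np.repeat(np.arange(n), n)   # [0 0 0 1 1 1 2 2 2]
jdx = np.tile(np.arange(n), n)     # [0 1 2 0 1 2 0 1 2]
s_indexed = yp[idx] - yp[jdx]

# Broadcasting version: entry (i, j) of the matrix is yp[i] - yp[j]
s_broadcast = (yp[:, None] - yp[None, :]).reshape(-1)
print(s_broadcast)
```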
According to the TensorFlow API documentation, tf.keras.backend.repeat_elements() does the same job as np.repeat(). For example,
x = tf.constant([1, 3, 3, 1], dtype=tf.float32)
rep_x = tf.keras.backend.repeat_elements(x, 5, axis=0)
# result: [1. 1. 1. 1. 1. 3. 3. 3. 3. 3. 3. 3. 3. 3. 3. 1. 1. 1. 1. 1.]
Just for 1-d tensors, I've made this function
def tf_repeat(y, repeat_num):
    return tf.reshape(tf.tile(tf.expand_dims(y, axis=-1), [1, repeat_num]), [-1])
It looks like your question is so popular that people refer to it on the TF tracker. Sadly, the same function is still not implemented in TF.
You can implement it by combining tf.tile, tf.reshape, tf.squeeze. Here is a way to convert examples from np.repeat:
import numpy as np
import tensorflow as tf
x = [[1,2],[3,4]]
print(np.repeat(3, 4))
print(np.repeat(x, 2))
print(np.repeat(x, 3, axis=1))
x = tf.constant([[1,2],[3,4]])
with tf.Session() as sess:
    print(sess.run(tf.tile([3], [4])))
    print(sess.run(tf.squeeze(tf.reshape(tf.tile(tf.reshape(x, (-1, 1)), (1, 2)), (1, -1)))))
    print(sess.run(tf.reshape(tf.tile(tf.reshape(x, (-1, 1)), (1, 3)), (2, -1))))
In the last case where repeats are different for each element you most probably will need loops.
Just in case anybody is interested for a 2D method to copy the matrices. I think this could work:
TF_obj = tf.zeros([128, 128])
tf.tile(tf.expand_dims(TF_obj, 2), [1, 1, 2])
import numpy as np
import tensorflow as tf
import itertools
x = np.arange(6).reshape(3,2)
x = tf.convert_to_tensor(x)
N = 3 # number of repetition
K = x.shape[0] # for here 3
order = list(range(0, N*K, K))
order = [[x+i for x in order] for i in range(K)]
order = list(itertools.chain.from_iterable(order))
x_rep = tf.gather(tf.tile(x, [N, 1]), order)
Results from:
[[0, 1],
[2, 3],
[4, 5]]
To:
[[0, 1],
[0, 1],
[0, 1],
[2, 3],
[2, 3],
[2, 3],
[4, 5],
[4, 5],
[4, 5]]
If you want:
[[0, 1],
[2, 3],
[4, 5],
[0, 1],
[2, 3],
[4, 5],
[0, 1],
[2, 3],
[4, 5]]
Simply use tf.tile(x, [N, 1])
So I have found that tensorflow has one such method to repeat the elements of an array. The method tf.keras.backend.repeat_elements is what you are looking for. Anyone who comes at a later point of time can save lot of their efforts. This link offers an explanation to the method and specifically says
Repeats the elements of a tensor along an axis, like np.repeat
I have included a very short example which proves that the elements are copied in the exact way as np.repeat would do.
import numpy as np
import tensorflow as tf
x = np.random.rand(2,2)
# print(x) # uncomment this line to see the array's elements
y = tf.convert_to_tensor(x)
y = tf.keras.backend.repeat_elements(y, rep=3, axis=0)
# print(y) # uncomment this line to see the results
You can simulate the missing tf.repeat by stacking the value with itself using tf.stack:
value = np.arange(len(yp)) # what to repeat
repeat_count = len(yp) # how many times
repeated = tf.stack([value for i in range(repeat_count)], axis=1)
I advise using this only for small repeat counts.
Though many clean and working solutions have been given, they seem to all be based on producing the set of indices from scratch each iteration.
While the cost of producing these nodes isn't typically significant during training, it may be significant when using your model for inference.
Repeating tf.range (like your example) has come up a few times so I built the following function creator. Given the maximum number of times something will be repeated and the maximum number of things that will need repeating, it returns a function which produces the same values as np.repeat(np.arange(len(multiples)), multiples).
import tensorflow as tf
import numpy as np
def numpy_style_repeat_1d_creator(max_multiple=100, max_to_repeat=10000):
    board_num_lookup_ary = np.repeat(
        np.arange(max_to_repeat),
        np.full([max_to_repeat], max_multiple))
    board_num_lookup_ary = board_num_lookup_ary.reshape(max_to_repeat, max_multiple)
    def fn_to_return(multiples):
        board_num_lookup_tensor = tf.constant(board_num_lookup_ary, dtype=tf.int32)
        casted_multiples = tf.cast(multiples, dtype=tf.int32)
        padded_multiples = tf.pad(
            casted_multiples,
            [[0, max_to_repeat - tf.shape(multiples)[0]]])
        return tf.boolean_mask(
            board_num_lookup_tensor,
            tf.sequence_mask(padded_multiples, maxlen=max_multiple))
    return fn_to_return
# Here's an example of how it can be used
with tf.Session() as sess:
    repeater = numpy_style_repeat_1d_creator(5, 4)
    multiples = tf.constant([4, 1, 3])
    repeated_values = repeater(multiples)
    print(sess.run(repeated_values))
The general idea is to store a repeated tensor and then mask it, but it may help to see it visually (this is for the example given above):
In the example above the following Tensor is produced:
[[0,0,0,0,0],
[1,1,1,1,1],
[2,2,2,2,2],
[3,3,3,3,3]]
For multiples [4,1,3] it will collect the non-X values:
[[0,0,0,0,X],
[1,X,X,X,X],
[2,2,2,X,X],
[X,X,X,X,X]]
resulting in:
[0,0,0,0,1,2,2,2]
tl;dr: To avoid producing the indices each time (can be costly), pre-repeat everything and then mask that tensor each time
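The pre-repeat-and-mask idea can also be followed step by step in plain NumPy. The sketch below mirrors the example above (max_multiple=5, max_to_repeat=4); masked_repeat is a made-up name, and the comparison mask stands in for tf.sequence_mask.

```python
import numpy as np

max_multiple, max_to_repeat = 5, 4

# Built once: row r holds the value r repeated max_multiple times
lookup = np.repeat(np.arange(max_to_repeat), max_multiple).reshape(max_to_repeat, max_multiple)

def masked_repeat(multiples):
    """Same result as np.repeat(np.arange(len(multiples)), multiples)."""
    multiples = np.asarray(multiples)
    # Pad with zeros so there is one repeat count per lookup row
    padded = np.pad(multiples, (0, max_to_repeat - len(multiples)))
    # mask[r, c] is True for the first padded[r] columns of row r
    mask = np.arange(max_multiple)[None, :] < padded[:, None]
    return lookup[mask]

result = masked_repeat([4, 1, 3])
print(result)
```

Boolean indexing reads the kept entries in row-major order, which is what produces the np.repeat ordering.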
A relatively fast implementation was recently added with RaggedTensor utilities from 1.13, but it's not a part of the officially exported API. You can still use it, but there's a chance it might disappear.
from tensorflow.python.ops.ragged.ragged_util import repeat
From the source code:
# This op is intended to exactly match the semantics of numpy.repeat, with
# one exception: numpy.repeat has special (and somewhat non-intuitive) behavior
# when axis is not specified. Rather than implement that special behavior, we
# simply make `axis` be a required argument.
Tensorflow 2.10 has implemented np.repeat feature.
tf.repeat([1, 2, 3], repeats=[3, 1, 2], axis=0)
<tf.Tensor: shape=(6,), dtype=int32, numpy=array([1, 1, 1, 2, 3, 3], dtype=int32)>