How to gather one element per row - tensorflow

Say I have the following tensor:
t = tf.convert_to_tensor([
[1,2,3,4],
[5,6,7,8]
])
and I have another index tensor:
i = tf.convert_to_tensor([[0],[2]])
How can I gather those elements, given that the [0] refers to the first row and the [2] to the second one, thus getting [[1],[7]] as the result?
I was thinking of concatenating the indices with an incremental value to get [[0,0],[1,2]], like this:
i = tf.concat((tf.range(i.shape[0])[...,None] , i), axis=-1)
tf.gather_nd(t, i)
but I feel there is a better solution

You can use the TensorFlow variant of NumPy's take_along_axis:
tf.experimental.numpy.take_along_axis(t, i, axis=1)
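For reference, the NumPy function that this TensorFlow variant is modeled on behaves as follows. This is a minimal sketch using the values from the question, with NumPy arrays standing in for the TensorFlow tensors:

```python
import numpy as np

t = np.array([[1, 2, 3, 4],
              [5, 6, 7, 8]])
i = np.array([[0], [2]])

# For each row r, pick t[r, i[r, 0]]. The index array must have the same
# number of dimensions as t, which is why i has shape (2, 1).
picked = np.take_along_axis(t, i, axis=1)
print(picked.tolist())  # [[1], [7]]
```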

You can simply stack i with tf.range(...) as follows:
import tensorflow as tf
t = tf.convert_to_tensor([
[1,2,3,4],
[5,6,7,8]
])
i = tf.convert_to_tensor([0, 2])
length = tf.shape(i)[0]
indices = tf.stack([tf.range(length), i], axis=1)
# [[0, 0], [1, 2]]
tf.gather_nd(t, indices)
# [1, 7]
I'm not sure there is an essentially better solution.

Related

Speed up applying a transformation to each index value of a given array

I need to apply a function to the result of a transformation of all index values of a given numpy array. The following code does this:
import numpy as np
from matplotlib.transforms import IdentityTransform
# some 2D array
a = np.empty((2,3))
# some affine transformation, identity is just an example here
trans = IdentityTransform()
# some function taking a 2D index and returning some value depending
# on that index, again just an example
def f(idx):
    return (idx[0]+idx[1])/2
# apply f to the result of transforming each index of a
b=np.empty_like(a)
for idx in np.ndindex(a.shape):
    b[idx] = f(trans.transform(idx))
print(b)
This prints the following correct result:
[[0. 0.5 1. ]
[0.5 1. 1.5]]
The problem now is, the code is too slow when the shape of a gets larger, say 2000x3000. Is there a way to speed this up?
My idea is to create an array of indices of a idx = [[0,0], [0,1], ..., [1,2]], then transform this array in one go using something like tmp = trans.transform(idx), and lastly apply f to every element with np.vectorize(f)(tmp).
Is this a reasonable approach? If yes, what would this actually look like? If no, are there any alternatives?
Edit: I managed to get at tmp via the following code:
tmp=trans.transform(np.asarray([idx for idx in np.ndindex(a.shape)]))
So now I have an array containing the results of the affine transformation for every index value of a. But this seems to use an awful lot of memory.
I'll post an answer myself with what I figured out now. Maybe it is of use for someone.
To answer the first part of my question, I found a fast and efficient way to create the result of transforming the index values, using the result of np.indices() and then massaging the result of that until it fits to what t.transform() expects.
Given some array a = np.empty((2,3)), the indices of that array can be obtained via np.indices(a.shape). This returns two 2D arrays (one for each dimension of a, actually). What I failed to understand was how to turn these results into something transform() understands.
The key here is to apply np.ravel() to each of the arrays that np.indices() returns:
>>> a=np.empty((2,3))
>>> list(map(np.ravel, np.indices(a.shape)))
[array([0, 0, 0, 1, 1, 1]), array([0, 1, 2, 0, 1, 2])]
Now I have a list of arrays containing all the x and y indices, which just needs to be put together with np.vstack() and then transposed to get an array of all (x, y) indices, and this is the form transform() will accept.
>>> l=list(map(np.ravel, np.indices(a.shape)))
>>> np.vstack(l).transpose()
array([[0, 0],
[0, 1],
[0, 2],
[1, 0],
[1, 1],
[1, 2]])
And finally, for some arbitrary affine transformation:
>>> from matplotlib.transforms import Affine2D
>>> t = Affine2D().translate(10, 20).scale(0.5)
>>> t.transform(np.vstack(l).transpose())
array([[ 5. , 10. ],
[ 5. , 10.5],
[ 5. , 11. ],
[ 5.5, 10. ],
[ 5.5, 10.5],
[ 5.5, 11. ]])
This is quite fast, even for larger array sizes. If the shape gets big enough (something like 20000x30000), I run out of memory, but for shapes 10000x10000 it still is amazingly fast.
>>> timeit.timeit("t.transform(np.vstack(list(map(np.ravel, np.indices(a.shape, dtype=np.uint16)))).transpose())",
... "import numpy as np ; from matplotlib.transforms import Affine2D ; a = np.empty((20, 10)) ; t = Affine2D().translate(10, 20).scale(0.5)", number=10)
0.0003051299718208611
>>> timeit.timeit("t.transform(np.vstack(list(map(np.ravel, np.indices(a.shape, dtype=np.uint16)))).transpose())",
... "import numpy as np ; from matplotlib.transforms import Affine2D ; a = np.empty((200, 100)) ; t = Affine2D().translate(10, 20).scale(0.5)", number=10)
0.0026413939776830375
>>> timeit.timeit("t.transform(np.vstack(list(map(np.ravel, np.indices(a.shape, dtype=np.uint16)))).transpose())",
... "import numpy as np ; from matplotlib.transforms import Affine2D ; a = np.empty((2000, 1000)) ; t = Affine2D().translate(10, 20).scale(0.5)", number=10)
0.35055489401565865
>>> timeit.timeit("t.transform(np.vstack(list(map(np.ravel, np.indices(a.shape, dtype=np.uint16)))).transpose())",
... "import numpy as np ; from matplotlib.transforms import Affine2D ; a = np.empty((20000, 10000)) ; t = Affine2D().translate(10, 20).scale(0.5)", number=10)
43.62860555597581
Now for the second part, for applying the function to each of the transformed index values I use the following code for now, which is fast enough in my case.
xxyy = t.transform(np.vstack(...).transpose())
np.fromiter((f(*xy) for xy in xxyy), dtype=np.short, count=len(xxyy))
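If f can be written in terms of whole arrays, the per-element generator above can be avoided altogether. The following is a minimal sketch, not the code from this answer: it assumes the example f from the question, (x + y) / 2, and uses an identity stand-in for the transformation so that it runs without matplotlib.

```python
import numpy as np

a = np.empty((2, 3))

# All index pairs of a as an (N, 2) array, in the same order as np.ndindex.
idx = np.indices(a.shape).reshape(a.ndim, -1).T

# Assumption: identity stand-in for trans.transform(idx); a real transform
# would also return an (N, 2) array of transformed coordinates.
xxyy = idx.astype(float)

# Vectorized version of the example f: operate on whole columns at once.
b = ((xxyy[:, 0] + xxyy[:, 1]) / 2).reshape(a.shape)
print(b)
```

Whether this applies depends on f: it only helps when f can be expressed with array operations rather than per-element Python logic.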

Efficiently compute product of all other elements in Numpy

Let A be a 2D matrix. How can I compute a matrix B, such that each element of B is the product of all other entries in the same row of A?
Example:
A = np.array([[5, 0, 6], # the input
[3, 1, 9],
[2, 0, 0]])
B = np.array([[0, 30, 0], # the result
[9, 27, 3],
[0, 0, 0]])
The naïve strategy (B = np.prod(A, axis=-1, keepdims=True) / A) runs into division-by-zero errors, and unfortunately these zeros are important elsewhere in the program and cannot trivially be replaced with tiny epsilons.
I've tried using np.where to address the three cases (rows without zeros, rows with one zero, rows with multiple zeros), but although that prevents NaNs in the output, it still requires computing everything up front before letting np.where pick and choose element-wise, which seems like a lot of code and unnecessary computational effort (and still produces div-by-zero warnings in the process).
What is the smartest, fastest way of solving this problem?
I found this answer and, inspired by it, came up with the following efficient-ish solution:
def products_of_others(a, axes=None):
    if axes is None:
        axes = tuple(range(a.ndim))
    if isinstance(axes, int):
        axes = (axes,)
    # flatten the desired axes into one last dimension
    original_shape = a.shape
    other_axes = tuple([ax for ax in range(a.ndim) if ax not in axes])
    new_ax_order = other_axes + axes
    old_ax_order = np.argsort(new_ax_order)
    a = np.transpose(a, new_ax_order)
    a = np.reshape(a, [original_shape[ax] for ax in other_axes] + [np.prod([original_shape[ax] for ax in axes])])
    after = np.concatenate([a[..., 1:], np.ones_like(a[..., 0:1])], axis=-1)
    before = np.concatenate([np.ones_like(a[..., 0:1]), a[..., :-1]], axis=-1)
    after_prod = np.cumprod(after[..., ::-1], axis=-1)[..., ::-1]
    before_prod = np.cumprod(before, axis=-1)
    # undo the flattening
    out = np.reshape(after_prod * before_prod, [original_shape[ax] for ax in other_axes] + [original_shape[ax] for ax in axes])
    out = np.transpose(out, old_ax_order)
    return out
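To see the idea behind the function above in isolation, here is a reduced sketch for the 2D, row-wise case from the question. The name row_products_of_others is made up for this illustration. It multiplies a running product of everything to the left of each element by a running product of everything to its right, so no division is needed.

```python
import numpy as np

def row_products_of_others(a):
    """Product of all other entries in the same row, without dividing."""
    ones = np.ones_like(a[:, :1])
    # before[:, j] = product of a[:, :j]
    before = np.cumprod(np.concatenate([ones, a[:, :-1]], axis=1), axis=1)
    # after[:, j] = product of a[:, j+1:], via a cumulative product from the right
    shifted = np.concatenate([a[:, 1:], ones], axis=1)
    after = np.cumprod(shifted[:, ::-1], axis=1)[:, ::-1]
    return before * after

A = np.array([[5, 0, 6],
              [3, 1, 9],
              [2, 0, 0]])
B = row_products_of_others(A)
print(B.tolist())  # [[0, 30, 0], [9, 27, 3], [0, 0, 0]]
```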

Eigenvector normalization in numpy

I'm using the linalg in numpy to compute eigenvalues and eigenvectors of matrices of signed reals.
I've read this previous question but still don't grasp the normalization of eigenvectors.
Here is an example straight off Wikipedia:
import numpy as np
from numpy import linalg as la
a = np.matrix([[2, 1], [1, 2]], dtype=float)
eigh_vals, eigh_vects = np.linalg.eig(a)
print('eigen_values=')
print(eigh_vals)
print('eigen_vectors=')
print(eigh_vects)
The eigenvalues are 1 and 3.
For eigenvectors we expect scalar multiples of [1, -1] and [1, 1], which I get:
eigen_values=
[ 3.  1.]
eigen_vectors=
[[ 0.70710678 -0.70710678]
 [ 0.70710678  0.70710678]]
I understand the 1/sqrt(2) factor is to have the norm=1 but why?
Can normalization be 'switched off'?
Thanks!
The key message for the first eigenvector in the Wikipedia article is
Any non-zero vector with v1 = −v2 solves this equation.
So the actual solution is V1 = [x, -x]. Picking the vector V1 = [1, -1] may be pleasing to the human eye, but it is just as arbitrary as picking V1 = [104051, -104051] or any other nonzero real value.
Actually, picking V1 = [1, -1] / sqrt(2) is the least arbitrary. Of all the possible vectors for V1, it is the only one (up to sign) that is of unit length.
However, if instead of unit length you prefer the first component of each eigenvector to be 1, you can divide each column by its first entry (the eigenvectors are the columns of the returned matrix):
eigh_vects /= eigh_vects[0, :]
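A short check, assuming a plain ndarray is used instead of np.matrix, that rescaling the columns still leaves valid eigenvectors:

```python
import numpy as np

a = np.array([[2.0, 1.0], [1.0, 2.0]])
vals, vects = np.linalg.eig(a)  # eigenvectors are the columns of vects

# NumPy returns the eigenvectors scaled to unit length.
norms = np.linalg.norm(vects, axis=0)

# Rescale every column so that its first component is 1.
# This only works when no eigenvector has a zero first component.
rescaled = vects / vects[0, :]

# Each rescaled column v still satisfies a @ v == lambda * v.
still_eigenvectors = np.allclose(a @ rescaled, rescaled * vals)
print(rescaled)
print(still_eigenvectors)
```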
import numpy as np
import sympy as sp
v = sp.Matrix([[2, 1], [1, 2]])
v_vec = v.eigenvects()
v_vec is a list containing 2 tuples:
[(1, 1, [Matrix([
[-1],
[ 1]])]), (3, 1, [Matrix([
[1],
[1]])])]
1 and 3 are the two eigenvalues. The second element of each tuple (the '1' after each eigenvalue) is the multiplicity of that eigenvalue. The third element of each tuple is a list with the eigenvector(s) for that eigenvalue, each a sympy Matrix object, which you can convert to a NumPy array:
v_vec1 = np.array(v_vec[0][2], dtype=float)
v_vec2 = np.array(v_vec[1][2], dtype=float)
print('v_vec1 =', v_vec1)
print('v_vec2 =', v_vec2)
Here are the eigenvectors you would get; note that they are not scaled to unit length:
v_vec1 = [[-1. 1.]]
v_vec2 = [[1. 1.]]
If sympy is an option for you, it appears to normalize less aggressively:
import sympy
a = sympy.Matrix([[2, 1], [1, 2]])
a.eigenvects()
# [(1, 1, [Matrix([
# [-1],
# [ 1]])]), (3, 1, [Matrix([
# [1],
# [1]])])]

How to transpose each element of a Numpy Matrix [duplicate]

I'm starting off with a numpy array of an image.
In[1]:img = cv2.imread('test.jpg')
The shape is what you might expect for a 640x480 RGB image.
In[2]:img.shape
Out[2]: (480, 640, 3)
However, this image that I have is a frame of a video, which is 100 frames long. Ideally, I would like to have a single array that contains all the data from this video such that img.shape returns (480, 640, 3, 100).
What is the best way to add the next frame -- that is, the next set of image data, another 480 x 640 x 3 array -- to my initial array?
A dimension can be added to a numpy array as follows:
image = image[..., np.newaxis]
As an alternative to
image = image[..., np.newaxis]
in @dbliss's answer, you can also use numpy.expand_dims:
image = np.expand_dims(image, <your desired dimension>)
For example (taken from the link above):
x = np.array([1, 2])
print(x.shape) # prints (2,)
Then
y = np.expand_dims(x, axis=0)
yields
array([[1, 2]])
and
y.shape
gives
(1, 2)
You could just create an array of the correct size up-front and fill it:
nframes = 100
frames = np.empty((480, 640, 3, nframes))
for k in range(nframes):
    frames[:,:,:,k] = cv2.imread('frame_{}.jpg'.format(k))
if the frames were individual jpg files named in some particular way (in the example, frame_0.jpg, frame_1.jpg, etc.).
Just a note: you might consider using a (nframes, 480, 640, 3) shaped array instead.
Pythonic
X = X[:, :, None]
which is equivalent to
X = X[:, :, numpy.newaxis] and
X = numpy.expand_dims(X, axis=-1)
But as you are explicitly asking about stacking images,
I would recommend stacking the list of images, which you may have collected in a loop, with np.stack([X1, X2, X3]).
If you do not like the order of the dimensions, you can rearrange them with np.transpose().
You can use np.concatenate(), with the axis parameter specifying the dimension along which to concatenate. If the arrays being concatenated do not have this dimension, you can use np.newaxis to indicate where the new dimension should be added:
import numpy as np
movie = np.concatenate((img1[..., np.newaxis], img2[..., np.newaxis]), axis=3)
If you are reading from many files:
import glob
import cv2
movie = np.concatenate([cv2.imread(p)[..., np.newaxis] for p in glob.glob('*.jpg')], axis=3)
Consider Approach 1 with reshape method and Approach 2 with np.newaxis method that produce the same outcome:
#Lets suppose, we have:
x = [1,2,3,4,5,6,7,8,9]
print('I. x',x)
xNpArr = np.array(x)
print('II. xNpArr',xNpArr)
print('III. xNpArr', xNpArr.shape)
xNpArr_3x3 = xNpArr.reshape((3,3))
print('IV. xNpArr_3x3.shape', xNpArr_3x3.shape)
print('V. xNpArr_3x3', xNpArr_3x3)
#Approach 1 with reshape method
xNpArrRs_1x3x3x1 = xNpArr_3x3.reshape((1,3,3,1))
print('VI. xNpArrRs_1x3x3x1.shape', xNpArrRs_1x3x3x1.shape)
print('VII. xNpArrRs_1x3x3x1', xNpArrRs_1x3x3x1)
#Approach 2 with np.newaxis method
xNpArrNa_1x3x3x1 = xNpArr_3x3[np.newaxis, ..., np.newaxis]
print('VIII. xNpArrNa_1x3x3x1.shape', xNpArrNa_1x3x3x1.shape)
print('IX. xNpArrNa_1x3x3x1', xNpArrNa_1x3x3x1)
We have as outcome:
I. x [1, 2, 3, 4, 5, 6, 7, 8, 9]
II. xNpArr [1 2 3 4 5 6 7 8 9]
III. xNpArr (9,)
IV. xNpArr_3x3.shape (3, 3)
V. xNpArr_3x3 [[1 2 3]
[4 5 6]
[7 8 9]]
VI. xNpArrRs_1x3x3x1.shape (1, 3, 3, 1)
VII. xNpArrRs_1x3x3x1 [[[[1]
[2]
[3]]
[[4]
[5]
[6]]
[[7]
[8]
[9]]]]
VIII. xNpArrNa_1x3x3x1.shape (1, 3, 3, 1)
IX. xNpArrNa_1x3x3x1 [[[[1]
[2]
[3]]
[[4]
[5]
[6]]
[[7]
[8]
[9]]]]
a = np.expand_dims(a, axis=-1)
or
a = a[..., np.newaxis]
or
a = a.reshape(a.shape + (1,))
There is no structure in numpy that allows you to append more data later.
Instead, numpy puts all of your data into a contiguous chunk of numbers (basically a C array), and any resize requires allocating a new chunk of memory to hold it. Numpy's speed comes from being able to keep all the data in a numpy array in the same chunk of memory; e.g. mathematical operations can be parallelized for speed and you get fewer cache misses.
So you will have two kinds of solutions:
Pre-allocate the memory for the numpy array and fill in the values, like in JoshAdel's answer, or
Keep your data in a normal python list until it's actually needed to put them all together (see below)
images = []
for i in range(100):
    new_image = ...  # placeholder: pull image from somewhere
    images.append(new_image)
images = np.stack(images, axis=3)
Note that there is no need to expand the dimensions of the individual image arrays first, nor do you need to know how many images you expect ahead of time.
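A minimal sketch of the list-then-stack approach, using small random arrays in place of real frames (the frame size and count here are made up so that the example runs quickly without any image files):

```python
import numpy as np

rng = np.random.default_rng(0)
height, width, n_frames = 4, 6, 5  # stand-ins for 480, 640, 100

# Collect the frames in an ordinary Python list first.
images = []
for _ in range(n_frames):
    frame = rng.integers(0, 256, size=(height, width, 3), dtype=np.uint8)
    images.append(frame)

# Stack once at the end; axis=3 puts the frame index last.
video = np.stack(images, axis=3)
print(video.shape)  # (4, 6, 3, 5)

# Frame-first layout, if that ordering is more convenient.
video_frames_first = np.moveaxis(video, -1, 0)
print(video_frames_first.shape)  # (5, 4, 6, 3)
```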
You can use stack with the axis parameter:
img.shape # h,w,3
imgs = np.stack([img1,img2,img3,img4], axis=-1) # -1 = new axis is last
imgs.shape # h,w,3,nimages
For example: to convert grayscale to color:
>>> d = np.zeros((5,4), dtype=int) # 5x4
>>> d[2,3] = 1
>>> d3 = np.stack([d,d,d], axis=-1) # 5x4x3, -1 = as last axis
>>> d3.shape
Out[30]: (5, 4, 3)
>>> d3[2,3]
Out[32]: array([1, 1, 1])
I followed this approach:
import numpy as np
import cv2
ls = []
for image in image_paths:
    ls.append(cv2.imread(image))
img_np = np.array(ls) # shape (100, 480, 640, 3)
img_np = np.rollaxis(img_np, 0, 4) # shape (480, 640, 3, 100)
This worked for me:
image = image[..., None]
This will help you add an axis anywhere you want:
import numpy as np
signal = np.array([[0.3394572666491664, 0.3089068053925853, 0.3516359279582483], [0.33932706934615525, 0.3094755563319447, 0.3511973743219001], [0.3394407172182317, 0.30889042266755573, 0.35166886011421256], [0.3394407172182317, 0.30889042266755573, 0.35166886011421256]])
print(signal.shape)
#(4,3)
print(signal[..., np.newaxis].shape)  # or signal[..., None]
#(4, 3, 1)
print(signal[:, np.newaxis, :].shape)  # or signal[:, None, :]
#(4, 1, 3)
There are three ways to add a new dimension to an ndarray.
first: using "np.newaxis" (something like @dbliss's answer)
np.newaxis is just an alias for None, provided to make the code easier to
understand. If you replace np.newaxis with None, it works the same
way, but it is better to use np.newaxis to be more explicit.
import numpy as np
my_arr = np.array([2, 3])
new_arr = my_arr[..., np.newaxis]
print("old shape", my_arr.shape)
print("new shape", new_arr.shape)
>>> old shape (2,)
>>> new shape (2, 1)
second: using "np.expand_dims()"
Specify the original ndarray in the first argument and the position
to add the dimension in the second argument axis.
my_arr = np.array([2, 3])
new_arr = np.expand_dims(my_arr, -1)
print("old shape", my_arr.shape)
print("new shape", new_arr.shape)
>>> old shape (2,)
>>> new shape (2, 1)
third: using "reshape()"
my_arr = np.array([2, 3])
new_arr = my_arr.reshape(*my_arr.shape, 1)
print("old shape", my_arr.shape)
print("new shape", new_arr.shape)
>>> old shape (2,)
>>> new shape (2, 1)
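The three approaches can be compared directly on an image-shaped array. This sketch uses a zero-filled array as a stand-in for a real image:

```python
import numpy as np

img = np.zeros((480, 640, 3))

via_newaxis = img[..., np.newaxis]
via_expand_dims = np.expand_dims(img, axis=-1)
via_reshape = img.reshape(*img.shape, 1)

# All three give shape (480, 640, 3, 1).
print(via_newaxis.shape, via_expand_dims.shape, via_reshape.shape)

# For a contiguous array like this one, none of the three copies the data;
# each result shares memory with img.
print(np.shares_memory(img, via_newaxis))
```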

TensorFlow: numpy.repeat() alternative

I want to compare the predicted values yp from my neural network in a pairwise fashion, and so I was using (back in my old numpy implementation):
idx = np.repeat(np.arange(len(yp)), len(yp))
jdx = np.tile(np.arange(len(yp)), len(yp))
s = yp[[idx]] - yp[[jdx]]
This basically creates an indexing mesh which I then use: idx=[0,0,0,1,1,1,...] while jdx=[0,1,2,0,1,2,...]. I do not know if there is a simpler way of doing it...
Anyhow, TensorFlow has a tf.tile(), but it seems to be lacking a tf.repeat().
idx = np.repeat(np.arange(n), n)
v2 = v[idx]
And I get the error:
TypeError: Bad slice index [ 0 0 0 ..., 215 215 215] of type <type 'numpy.ndarray'>
It also does not work to use a TensorFlow constant for the indexing:
idx = tf.constant(np.repeat(np.arange(n), n))
v2 = v[idx]
TypeError: Bad slice index Tensor("Const:0", shape=TensorShape([Dimension(46656)]), dtype=int64) of type <class 'tensorflow.python.framework.ops.Tensor'>
The idea is to convert my RankNet implementation to TensorFlow.
You can achieve the effect of np.repeat() using a combination of tf.tile() and tf.reshape():
idx = tf.range(len(yp))
idx = tf.reshape(idx, [-1, 1]) # Convert to a len(yp) x 1 matrix.
idx = tf.tile(idx, [1, len(yp)]) # Create multiple columns.
idx = tf.reshape(idx, [-1]) # Convert back to a vector.
You can simply compute jdx using tf.tile():
jdx = tf.range(len(yp))
jdx = tf.tile(jdx, [len(yp)])
For the indexing, you could try using tf.gather() to extract non-contiguous slices from the yp tensor:
s = tf.gather(yp, idx) - tf.gather(yp, jdx)
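The reshape/tile construction can be sanity-checked in NumPy, whose tile and reshape behave the same way here. The sketch also shows broadcasting as an alternative way to form all pairwise differences without index vectors; that is a different technique from the one in this answer, and the sample values of yp are made up:

```python
import numpy as np

yp = np.array([0.2, 0.5, 0.9])  # made-up predictions
n = len(yp)

# Emulate np.repeat(np.arange(n), n) with reshape + tile + reshape.
idx = np.tile(np.arange(n).reshape(-1, 1), (1, n)).reshape(-1)
jdx = np.tile(np.arange(n), n)

s = yp[idx] - yp[jdx]

# Alternative: broadcasting a column against a row gives the same values.
s_broadcast = (yp[:, None] - yp[None, :]).reshape(-1)
print(idx.tolist())  # [0, 0, 0, 1, 1, 1, 2, 2, 2]
print(jdx.tolist())  # [0, 1, 2, 0, 1, 2, 0, 1, 2]
```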
According to the TF API documentation, tf.keras.backend.repeat_elements() does the same job as np.repeat(). For example,
x = tf.constant([1, 3, 3, 1], dtype=tf.float32)
rep_x = tf.keras.backend.repeat_elements(x, 5, axis=0)
# result: [1. 1. 1. 1. 1. 3. 3. 3. 3. 3. 3. 3. 3. 3. 3. 1. 1. 1. 1. 1.]
Just for 1-d tensors, I've made this function
def tf_repeat(y,repeat_num):
    return tf.reshape(tf.tile(tf.expand_dims(y,axis=-1),[1,repeat_num]),[-1])
It looks like your question is so popular that people refer to it on the TF tracker. Sadly, an equivalent function is still not implemented in TF.
You can implement it by combining tf.tile, tf.reshape and tf.squeeze. Here is a way to convert examples from np.repeat:
import numpy as np
import tensorflow as tf
x = [[1,2],[3,4]]
print(np.repeat(3, 4))
print(np.repeat(x, 2))
print(np.repeat(x, 3, axis=1))
x = tf.constant([[1,2],[3,4]])
with tf.Session() as sess:
    print(sess.run(tf.tile([3], [4])))
    print(sess.run(tf.squeeze(tf.reshape(tf.tile(tf.reshape(x, (-1, 1)), (1, 2)), (1, -1)))))
    print(sess.run(tf.reshape(tf.tile(tf.reshape(x, (-1, 1)), (1, 3)), (2, -1))))
In the last case where repeats are different for each element you most probably will need loops.
Just in case anybody is interested in a 2D method to copy the matrices, I think this could work:
TF_obj = tf.zeros([128, 128])
tf.tile(tf.expand_dims(TF_obj, 2), [1, 1, 2])
import numpy as np
import tensorflow as tf
import itertools
x = np.arange(6).reshape(3,2)
x = tf.convert_to_tensor(x)
N = 3 # number of repetition
K = x.shape[0] # for here 3
order = list(range(0, N*K, K))
order = [[x+i for x in order] for i in range(K)]
order = list(itertools.chain.from_iterable(order))
x_rep = tf.gather(tf.tile(x, [N, 1]), order)
Results from:
[[0, 1],
[2, 3],
[4, 5]]
To:
[[0, 1],
[0, 1],
[0, 1],
[2, 3],
[2, 3],
[2, 3],
[4, 5],
[4, 5],
[4, 5]]
If you want:
[[0, 1],
[2, 3],
[4, 5],
[0, 1],
[2, 3],
[4, 5],
[0, 1],
[2, 3],
[4, 5]]
Simply use tf.tile(x, [N, 1])
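The difference between the two orderings is the same as the difference between np.repeat and np.tile in NumPy, which this sketch uses to illustrate the gather-after-tile trick from the answer above (np.take stands in for tf.gather):

```python
import numpy as np

x = np.arange(6).reshape(3, 2)
N = 3           # number of repetitions
K = x.shape[0]  # number of rows

tiled = np.tile(x, (N, 1))          # rows: 0,1,2,0,1,2,0,1,2
repeated = np.repeat(x, N, axis=0)  # rows: 0,0,0,1,1,1,2,2,2

# Reordering the tiled rows reproduces the repeated layout.
order = [k + K * r for k in range(K) for r in range(N)]
reordered = np.take(tiled, order, axis=0)
print(order)  # [0, 3, 6, 1, 4, 7, 2, 5, 8]
```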
So I have found that TensorFlow has a method to repeat the elements of an array: tf.keras.backend.repeat_elements is what you are looking for. Anyone who comes here later can save a lot of effort. Its documentation explains the method and specifically says
Repeats the elements of a tensor along an axis, like np.repeat
I have included a very short example which proves that the elements are copied in the exact way as np.repeat would do.
import numpy as np
import tensorflow as tf
x = np.random.rand(2,2)
# print(x) # uncomment this line to see the array's elements
y = tf.convert_to_tensor(x)
y = tf.keras.backend.repeat_elements(y, rep=3, axis=0)
# print(y) # uncomment this line to see the results
You can simulate the missing tf.repeat by stacking the value with itself using tf.stack:
value = np.arange(len(yp)) # what to repeat
repeat_count = len(yp) # how many times
repeated = tf.stack([value for i in range(repeat_count)], axis=1)
I advise using this only for small repeat counts.
Though many clean and working solutions have been given, they seem to all be based on producing the set of indices from scratch each iteration.
While the cost of producing these nodes isn't typically significant during training, it may be significant when using your model for inference.
Repeating tf.range (like your example) has come up a few times so I built the following function creator. Given the maximum number of times something will be repeated and the maximum number of things that will need repeating, it returns a function which produces the same values as np.repeat(np.arange(len(multiples)), multiples).
import tensorflow as tf
import numpy as np
def numpy_style_repeat_1d_creator(max_multiple=100, max_to_repeat=10000):
    board_num_lookup_ary = np.repeat(
        np.arange(max_to_repeat),
        np.full([max_to_repeat], max_multiple))
    board_num_lookup_ary = board_num_lookup_ary.reshape(max_to_repeat, max_multiple)
    def fn_to_return(multiples):
        board_num_lookup_tensor = tf.constant(board_num_lookup_ary, dtype=tf.int32)
        casted_multiples = tf.cast(multiples, dtype=tf.int32)
        padded_multiples = tf.pad(
            casted_multiples,
            [[0, max_to_repeat - tf.shape(multiples)[0]]])
        return tf.boolean_mask(
            board_num_lookup_tensor,
            tf.sequence_mask(padded_multiples, maxlen=max_multiple))
    return fn_to_return
#Here's an example of how it can be used
with tf.Session() as sess:
    repeater = numpy_style_repeat_1d_creator(5,4)
    multiples = tf.constant([4,1,3])
    repeated_values = repeater(multiples)
    print(sess.run(repeated_values))
The general idea is to store a repeated tensor and then mask it, but it may help to see it visually (this is for the example given above):
In the example above the following Tensor is produced:
[[0,0,0,0,0],
[1,1,1,1,1],
[2,2,2,2,2],
[3,3,3,3,3]]
For multiples [4,1,3] it will collect the non-X values:
[[0,0,0,0,X],
[1,X,X,X,X],
[2,2,2,X,X],
[X,X,X,X,X]]
resulting in:
[0,0,0,0,1,2,2,2]
tl;dr: To avoid producing the indices each time (can be costly), pre-repeat everything and then mask that tensor each time
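The pre-repeat-and-mask idea can be written out in NumPy to make the masking step concrete. This is a sketch of the idea only, with a NumPy comparison standing in for tf.sequence_mask and boolean indexing standing in for tf.boolean_mask:

```python
import numpy as np

max_multiple, max_to_repeat = 5, 4

# Built once: row r holds max_multiple copies of r.
lookup = np.repeat(np.arange(max_to_repeat), max_multiple)
lookup = lookup.reshape(max_to_repeat, max_multiple)

multiples = np.array([4, 1, 3])
# Pad with zeros so there is one count per lookup row.
padded = np.pad(multiples, (0, max_to_repeat - len(multiples)))

# mask[r, c] is True for the first padded[r] entries of row r.
mask = np.arange(max_multiple) < padded[:, None]
result = lookup[mask]
print(result.tolist())  # [0, 0, 0, 0, 1, 2, 2, 2]
```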
A relatively fast implementation was recently added with RaggedTensor utilities from 1.13, but it's not a part of the officially exported API. You can still use it, but there's a chance it might disappear.
from tensorflow.python.ops.ragged.ragged_util import repeat
From the source code:
# This op is intended to exactly match the semantics of numpy.repeat, with
# one exception: numpy.repeat has special (and somewhat non-intuitive) behavior
# when axis is not specified. Rather than implement that special behavior, we
# simply make `axis` be a required argument.
TensorFlow 2.10 has implemented the np.repeat feature as tf.repeat:
tf.repeat([1, 2, 3], repeats=[3, 1, 2], axis=0)
<tf.Tensor: shape=(6,), dtype=int32, numpy=array([1, 1, 1, 2, 3, 3], dtype=int32)>