Imagine you have an n-dimensional tensor where one of those dimensions corresponds to time.
What I'd like to do is this: given some integer window_size, replace the time dimension with two new dimensions, [..., n_groups, window_size], where n_groups covers all possible contiguous groupings of size window_size across the time dimension. So if we started with a time dimension of size n_periods, then n_groups should end up being n_periods - window_size + 1.
All of this is very easy to accomplish using traditional "pythonic" looping and slicing, such as:
stacked = tf.stack([inputs[i:i+window_size] for i in range(len(inputs) - window_size + 1)], axis=0)
However, if the time dimension is very long, this produces a staggering number of graph operations. I am wondering if there isn't a built-in TensorFlow function that might help me accomplish this relatively simple task more efficiently...
So common is the idea of "rolling-window grouping" that the Pandas project has a very sophisticated and sizeable API to handle this particular case. I would have thought that TensorFlow would also include such a utility.
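For reference, the target layout can be pinned down with a small NumPy sketch. It assumes NumPy 1.20 or later, where numpy.lib.stride_tricks.sliding_window_view is available; the array and the sizes are illustrative and not taken from the question.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

n_periods, window_size = 10, 4  # illustrative sizes
inputs = np.arange(2 * n_periods).reshape(2, n_periods)  # time is the last axis

# The time axis is replaced by [n_groups, window_size]
windows = sliding_window_view(inputs, window_size, axis=-1)
n_groups = n_periods - window_size + 1

print(windows.shape)   # (2, 7, 4)
print(windows[0, 0])   # [0 1 2 3]
print(windows[0, -1])  # [6 7 8 9]
```

The result is a read-only view on the original data, so no copies are made until the windows are modified or passed to something that copies.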
Considering the tf documentation about map_fn:
"map_fn will apply the operations used by fn to each element of elems, resulting in O(elems.shape[0]) total operations. This is somewhat mitigated by the fact that map_fn can process elements in parallel. However, a transform expressed using map_fn is still typically less efficient than an equivalent transform expressed using vectorized operations."
You can try the following approach, given an input tensor:
input_tensor = tf.range(10)
# <tf.Tensor: shape=(10,), dtype=int32, numpy=array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9], dtype=int32)>
convert into a square matrix:
res = tf.repeat(tf.expand_dims(input_tensor, 0), input_tensor.shape[0], axis = 0)
# array([[0, 1, 2, 3, 4, 5, 6, 7, 8, 9],
# [0, 1, 2, 3, 4, 5, 6, 7, 8, 9],
# [0, 1, 2, 3, 4, 5, 6, 7, 8, 9],
# [0, 1, 2, 3, 4, 5, 6, 7, 8, 9],
# [0, 1, 2, 3, 4, 5, 6, 7, 8, 9],
# [0, 1, 2, 3, 4, 5, 6, 7, 8, 9],
# [0, 1, 2, 3, 4, 5, 6, 7, 8, 9],
# [0, 1, 2, 3, 4, 5, 6, 7, 8, 9],
# [0, 1, 2, 3, 4, 5, 6, 7, 8, 9],
# [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]], dtype=int32)>
Then apply map_fn over this tensor, passing along a range vector of negative shifts so that row i is rolled left by i positions:
elements = tf.range(10, dtype=tf.int32) * -1
w,_ = tf.map_fn(lambda x: (tf.roll(x[0], x[1], axis=0), x[1]), (res, elements), dtype=(tf.int32, tf.int32))
This will roll the elements of each row to the left:
#<tf.Tensor: shape=(10, 10), dtype=int32, numpy=
#array([[0, 1, 2, 3, 4, 5, 6, 7, 8, 9],
# [1, 2, 3, 4, 5, 6, 7, 8, 9, 0],
# [2, 3, 4, 5, 6, 7, 8, 9, 0, 1],
# [3, 4, 5, 6, 7, 8, 9, 0, 1, 2],
# [4, 5, 6, 7, 8, 9, 0, 1, 2, 3],
# [5, 6, 7, 8, 9, 0, 1, 2, 3, 4],
# [6, 7, 8, 9, 0, 1, 2, 3, 4, 5],
# [7, 8, 9, 0, 1, 2, 3, 4, 5, 6],
# [8, 9, 0, 1, 2, 3, 4, 5, 6, 7],
# [9, 0, 1, 2, 3, 4, 5, 6, 7, 8]], dtype=int32)>
Finally, take as many elements as you need using tensor slicing:
window = 8
tf.slice(w, [0, 0], [(w.shape[0] - window) + 1, window])
gives:
#<tf.Tensor: shape=(3, 8), dtype=int32, numpy=
#array([[0, 1, 2, 3, 4, 5, 6, 7],
# [1, 2, 3, 4, 5, 6, 7, 8],
# [2, 3, 4, 5, 6, 7, 8, 9]], dtype=int32)>
For window = 4:
window = 4
tf.slice(w, [0, 0], [(w.shape[0] - window) + 1, window])
gives:
#array([[0, 1, 2, 3],
# [1, 2, 3, 4],
# [2, 3, 4, 5],
# [3, 4, 5, 6],
# [4, 5, 6, 7],
# [5, 6, 7, 8],
# [6, 7, 8, 9]], dtype=int32)>
Try converting this into a TF graph (for example with tf.function) to see whether it performs better than the plain Python loop.
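A fully vectorized alternative, in the spirit of the map_fn documentation quoted above, is to build a matrix of window indices by broadcasting and pass it to tf.gather. This is a minimal sketch assuming TF 2.x eager execution; rolling_windows and its axis parameter are hypothetical names introduced here for illustration.

```python
import tensorflow as tf

def rolling_windows(inputs, window_size, axis=0):
    """Replace `axis` with two axes [n_groups, window_size] of contiguous windows."""
    n_periods = tf.shape(inputs)[axis]
    n_groups = n_periods - window_size + 1
    # idx[g, w] = g + w, built by broadcasting a column against a row
    idx = tf.range(n_groups)[:, None] + tf.range(window_size)[None, :]
    # Gathering with a 2-D index tensor expands `axis` into the two index axes
    return tf.gather(inputs, idx, axis=axis)

windows = rolling_windows(tf.range(10), 4)
print(windows.shape)  # (7, 4)

batched = rolling_windows(tf.reshape(tf.range(20), [2, 10]), 8, axis=1)
print(batched.shape)  # (2, 3, 8)
```

This adds a fixed handful of ops to the graph regardless of the length of the time dimension, at the cost of materializing every window in memory.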
I have a batch of data with shape [?, dim],
x=[[ 0, 1, 2, 3, 4],
[ 5, 6, 7, 8, 9],
[10, 11, 12, 13, 14],
[15, 16, 17, 18, 19],
[20, 21, 22, 23, 24]]
and a tensor of shape [?, 1] that gives the number of repetitions for each row, say:
rep_nums=[[1],[2],[1],[3],[1]]
and I expect the result to be:
[[ 0, 1, 2, 3, 4],
[ 5, 6, 7, 8, 9],
[ 5, 6, 7, 8, 9],
[10, 11, 12, 13, 14],
[15, 16, 17, 18, 19],
[15, 16, 17, 18, 19],
[15, 16, 17, 18, 19],
[20, 21, 22, 23, 24]]
I tried dynamic_partition as mentioned here, but it only works in TF 2.x, which is not compatible with my existing project.
I think tf.repeat will help.
import tensorflow as tf
c1 = tf.constant([[ 0, 1, 2, 3, 4],
[ 5, 6, 7, 8, 9],
[10, 11, 12, 13, 14],
[15, 16, 17, 18, 19],
[20, 21, 22, 23, 24]])
times = tf.constant([1, 2, 1, 3, 1])
res = tf.repeat(c1, times, axis=0)
with tf.Session() as sess:
    print(sess.run(res))
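If the installed version does not provide tf.repeat, the same result can be sketched with older building blocks: tf.sequence_mask marks how many copies each row needs, tf.where turns the mask into row indices, and tf.gather picks the rows. The snippet below is written for eager execution and the variable names are illustrative; under graph mode, result would be evaluated in a session as in the code above.

```python
import tensorflow as tf

x = tf.reshape(tf.range(25), [5, 5])
rep_nums = tf.constant([[1], [2], [1], [3], [1]])

reps = tf.reshape(rep_nums, [-1])   # [?, 1] -> [?]
mask = tf.sequence_mask(reps)       # row i has reps[i] leading True values
row_idx = tf.where(mask)[:, 0]      # row index of every True, in row-major order
result = tf.gather(x, row_idx, axis=0)

print(row_idx.numpy())  # [0 1 1 2 3 3 3 4]
print(result.shape)     # (8, 5)
```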
I have a multi-dimensional array with shape (2, 3, 4, 5).
Here is what it looks like:
rand_5 =
array([[[[0, 2, 8, 9, 6],
[4, 9, 7, 3, 3],
[8, 3, 0, 1, 0],
[0, 6, 7, 7, 9]],
[[3, 0, 7, 7, 7],
[0, 5, 4, 3, 1],
[3, 1, 3, 4, 3],
[1, 9, 5, 9, 1]],
[[2, 3, 2, 2, 5],
[7, 3, 0, 9, 9],
[3, 4, 5, 3, 0],
[4, 8, 6, 7, 2]]],
[[[7, 3, 8, 6, 6],
[5, 6, 5, 7, 1],
[5, 4, 4, 9, 9],
[0, 6, 2, 6, 8]],
[[2, 4, 1, 6, 1],
[5, 1, 6, 9, 8],
[6, 5, 9, 7, 5],
[4, 9, 6, 8, 1]],
[[5, 5, 8, 3, 7],
[7, 9, 4, 7, 5],
[9, 6, 2, 0, 5],
[3, 0, 5, 7, 1]]]])
The third matrix within the second block (index 1) is shown below:
rand_5[1,2] =
array([[5, 5, 8, 3, 7],
[7, 9, 4, 7, 5],
[9, 6, 2, 0, 5],
[3, 0, 5, 7, 1]])
Question:
How can I select the rows at index 2 and 3 and the columns at index 0 and 1 from the matrix above, so that I get the result shown below?
[9,6]
[3,0]
With array slicing:
rand_5[1, 2, 2:4, 0:2]
outputs:
array([[9, 6],
[3, 0]])
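To make the indexing explicit, here is a small sketch on a stand-in array of the same shape (np.arange is used instead of the random values so the output is easy to check). The first two indices pick one 4x5 matrix, and the two slices are half-open, so 2:4 means rows 2 and 3. The np.ix_ variant is an optional extra for rows or columns that are not adjacent.

```python
import numpy as np

arr = np.arange(120).reshape(2, 3, 4, 5)  # stand-in for rand_5, same shape

# Basic slicing: the stop index is exclusive, so 2:4 picks rows 2 and 3
block = arr[1, 2, 2:4, 0:2]
print(block)
# [[110 111]
#  [115 116]]

# Same selection with explicit index lists
same = arr[1, 2][np.ix_([2, 3], [0, 1])]
```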
I have a square matrix and would like to break it into some smaller matrices. For example, assume we have a matrix with shape [4,4] and would like to convert it into 4 smaller matrices of size [2,2].
input:
[9, 9, 9, 9,
8, 8, 8, 8,
7, 7, 7, 7,
6, 6, 6, 6]
output:
[[9, 9 | [9, 9,
8, 8] | 8, 8],
---------------
[7, 7 | [7, 7,
6, 6] | 6, 6]]
You can use repeated calls to torch.split for this.
>>> x
tensor([[ 1, 2, 3, 4],
[ 5, 6, 7, 8],
[ 9, 10, 11, 12],
[13, 14, 15, 16]])
>>> [z for y in x.split(2) for z in y.split(2, dim=1)]
[tensor([[1, 2],
[5, 6]]), tensor([[3, 4],
[7, 8]]), tensor([[ 9, 10],
[13, 14]]), tensor([[11, 12],
[15, 16]])]
Given a tensor with shape 4*4 or 1*16, the easiest way to do this is with the view function or reshape:
a = torch.tensor([9, 9, 9, 9, 8, 8, 8, 8, 7, 7, 7, 7, 6, 6, 6, 6])
# a = a.view(4,4)
a = a.view(2, 2, 2, 2)
# output:
tensor([[[[9, 9],
[9, 9]],
[[8, 8],
[8, 8]]],
[[[7, 7],
[7, 7]],
[[6, 6],
[6, 6]]]])
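One caveat: view(2, 2, 2, 2) on its own only regroups elements in row-major order, so each 2x2 slice in the output above holds the two halves of a single row rather than a spatial block. A minimal sketch, assuming the goal is the four spatial 2x2 blocks from the question, adds a permute that swaps the two middle axes:

```python
import torch

x = torch.arange(1, 17).view(4, 4)

# view    -> [block_row, row_in_block, block_col, col_in_block]
# permute -> [block_row, block_col, row_in_block, col_in_block]
blocks = x.view(2, 2, 2, 2).permute(0, 2, 1, 3)

# reshape (not view) because the permuted tensor is no longer contiguous
flat_blocks = blocks.reshape(-1, 2, 2)

print(blocks[0, 1])
# tensor([[3, 4],
#         [7, 8]])
```

Here blocks[i, j] is the block at block-row i and block-column j, and flat_blocks lists the same four blocks in the order produced by the torch.split answer.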
I am working with audio in TensorFlow, and would like to obtain a series of sequences which could be obtained from sliding a window over my data, so to speak. Examples to illustrate my situation:
Current Data Format:
Shape = [batch_size, num_features]
example = [
[1, 2, 3],
[4, 5, 6],
[7, 8, 9],
[10, 11, 12],
[13, 14, 15]
]
What I want:
Shape = [batch_size - window_length + 1, window_length, num_features]
example = [
[
[1, 2, 3],
[4, 5, 6],
[7, 8, 9]
],
[
[4, 5, 6],
[7, 8, 9],
[10, 11, 12]
],
[
[7, 8, 9],
[10, 11, 12],
[13, 14, 15]
],
]
My current solution is to do something like this:
list_of_windows_of_data = []
for x in range(batch_size - window_length + 1):
    list_of_windows_of_data.append(tf.slice(data, [x, 0], [window_length, num_features]))
windowed_data = tf.squeeze(tf.stack(list_of_windows_of_data, axis=0))
And this does the transform. However, it also creates 20,000 operations which slows TensorFlow down a lot when creating a graph. If anyone else has a fun and more efficient way to do this, please do share.
You can do that using tf.map_fn as follows:
example = tf.constant([
[1, 2, 3],
[4, 5, 6],
[7, 8, 9],
[10, 11, 12],
[13, 14, 15]
]
)
res = tf.map_fn(lambda i: example[i:i+3], tf.range(example.shape[0]-2), dtype=tf.int32)
sess=tf.InteractiveSession()
res.eval()
This prints
array([[[ 1, 2, 3],
[ 4, 5, 6],
[ 7, 8, 9]],
[[ 4, 5, 6],
[ 7, 8, 9],
[10, 11, 12]],
[[ 7, 8, 9],
[10, 11, 12],
[13, 14, 15]]])
You could use the built-in tf.extract_image_patches (exposed as tf.image.extract_patches in TF 2.x):
example = tf.constant([
[1, 2, 3],
[4, 5, 6],
[7, 8, 9],
[10, 11, 12],
[13, 14, 15]
]
)
res = tf.reshape(tf.extract_image_patches(example[None,...,None],
[1,3,3,1], [1,1,1,1], [1,1,1,1], 'VALID'), [-1,3,3])
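Since the data here is audio, another option worth trying is tf.signal.frame, which slides a window of frame_length along a chosen axis with step frame_step. This is a sketch assuming a TensorFlow version that ships the tf.signal module and eager execution; under graph mode, windowed would be evaluated in a session instead of printed directly.

```python
import tensorflow as tf

example = tf.constant([
    [1, 2, 3],
    [4, 5, 6],
    [7, 8, 9],
    [10, 11, 12],
    [13, 14, 15],
])
window_length = 3

# Frame along axis 0: [batch_size, num_features]
#   -> [batch_size - window_length + 1, window_length, num_features]
windowed = tf.signal.frame(example, frame_length=window_length, frame_step=1, axis=0)
print(windowed.shape)  # (3, 3, 3)
```

With frame_step=1 every overlapping window is produced; a larger step gives strided windows without changing the rest of the code.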