Select indices in TensorFlow that fulfil a certain condition - tensorflow

I wish to select elements of a matrix where the coordinates of the elements in the matrix fulfil a certain condition. For example, a condition could be : (y_coordinate-x_coordinate) == -4
So, those elements whose coordinates fulfil this condition will be selected. How can I do this efficiently without looping through every element?

Perhaps you need tf.gather_nd:
import tensorflow as tf
iterSession = tf.InteractiveSession()
vals = tf.constant([[1,2,3], [4,5,6], [7,8,9]])
arr = tf.constant([[x, y] for x in range(3) for y in range(3) if -1 <= x - y <= 1])
arr.eval()
# >> array([[0, 0],
# >> [0, 1],
# >> [1, 0],
# >> [1, 1],
# >> [1, 2],
# >> [2, 1],
# >> [2, 2]], dtype=int32)
tf.gather_nd(vals, arr).eval()
# >> array([1, 2, 4, 5, 6, 8, 9], dtype=int32)
Or tf.boolean_mask:
iterSession = tf.InteractiveSession()
vals = tf.constant([[1,2,3], [4,5,6], [7,8,9]])
arr = tf.constant([[-1 <= x - y <= 1 for x in range(3)] for y in range(3)])
arr.eval()
# array([[ True, True, False],
# [ True, True, True],
# [False, True, True]], dtype=bool)
tf.boolean_mask(vals, arr).eval()
# array([1, 2, 4, 5, 6, 8, 9], dtype=int32)
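Both snippets above still build the index list or the mask with Python list comprehensions. Here is a minimal sketch of building the mask from broadcast coordinate grids instead. It is written in NumPy as a stand-in, and the names rows, cols and mask are only illustrative. I would expect the same idea to carry over to TensorFlow with tf.range and tf.boolean_mask, but I have not run that variant.

```python
import numpy as np

vals = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])

# Coordinate grids: rows has shape (3, 1) and cols has shape (1, 3),
# so rows - cols broadcasts to the full (3, 3) matrix of differences.
rows = np.arange(vals.shape[0])[:, None]
cols = np.arange(vals.shape[1])[None, :]

# Any condition on the coordinates becomes a boolean mask, with no Python loop.
mask = np.abs(rows - cols) <= 1

selected = vals[mask]        # elements whose coordinates satisfy the condition
coords = np.argwhere(mask)   # the matching (row, col) pairs
print(selected)              # [1 2 4 5 6 8 9]
```

A condition like the one in the question would be written as (rows - cols) == -4, assuming y is taken to be the row index.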

Related

NumPy: How to calculate piecewise linear interpolant on multiple axes

Given the following ndarray t -
In [26]: t.shape
Out[26]: (3, 3, 2)
In [27]: t
Out[27]:
array([[[ 0, 1],
[ 2, 3],
[ 4, 5]],
[[ 6, 7],
[ 8, 9],
[10, 11]],
[[12, 13],
[14, 15],
[16, 17]]])
the piecewise linear interpolant for the points t[:, 0, 0] can be evaluated at [0, 0.66666667, 1.33333333, 2.] as follows using numpy.interp -
In [38]: x = np.linspace(0, t.shape[0]-1, 4)
In [39]: x
Out[39]: array([0. , 0.66666667, 1.33333333, 2. ])
In [30]: xp = np.arange(t.shape[0])
In [31]: xp
Out[31]: array([0, 1, 2])
In [32]: fp = t[:,0,0]
In [33]: fp
Out[33]: array([ 0, 6, 12])
In [40]: np.interp(x, xp, fp)
Out[40]: array([ 0., 4., 8., 12.])
How can all the interpolants be efficiently calculated and returned together for all values of fp -
array([[[ 0, 1],
[ 2, 3],
[ 4, 5]],
[[ 4, 5],
[ 6, 7],
[ 8, 9]],
[[ 8, 9],
[10, 11],
[12, 13]],
[[12, 13],
[14, 15],
[16, 17]]])
As the interpolation is 1d with changing y values it must be run for each 1d slice of t. It's probably faster to loop explicitly but neater to loop using np.apply_along_axis
import numpy as np
t = np.arange( 18 ).reshape(3,3,2)
x = np.linspace( 0, t.shape[0]-1, 4)
xp = np.arange(t.shape[0])
def interfunc( arr ):
    """ Function interpolates a 1d array. """
    return np.interp( x, xp, arr )
np.apply_along_axis( interfunc, 0, t ) # apply function along axis 0
""" Result
array([[[ 0., 1.],
[ 2., 3.],
[ 4., 5.]],
[[ 4., 5.],
[ 6., 7.],
[ 8., 9.]],
[[ 8., 9.],
[10., 11.],
[12., 13.]],
[[12., 13.],
[14., 15.],
[16., 17.]]]) """
With explicit loops
result = np.zeros((4,3,2))
for c in range(t.shape[1]):
    for p in range(t.shape[2]):
        result[:,c,p] = np.interp( x, xp, t[:,c,p])
On my machine the second option runs in half the time.
Edit to use np.nditer
As the result and the parameter have different shapes I seem to have to create two np.nditer objects one for the parameter and one for the result. This is my first attempt to use nditer for anything so it could be over complicated.
def test( t ):
    ts = t.shape
    result = np.zeros((ts[0]+1,ts[1],ts[2]))
    param = np.nditer( [t], ['external_loop'], ['readonly'], order = 'F')
    with np.nditer( [result], ['external_loop'], ['writeonly'], order = 'F') as res:
        for p, r in zip( param, res ):
            r[:] = interfunc(p)
    return result
It's slightly slower than the explicit loops and less easy to follow than either of the other solutions.
As requested by @Tis Chris, here is a solution using np.nditer with the multi_index flag, but I prefer the explicit nested for loops above because they are 10% faster:
In [29]: t = np.arange( 18 ).reshape(3,3,2)
In [30]: ax0old = np.arange(t.shape[0])
In [31]: ax0new = np.linspace(0, t.shape[0]-1, 4)
In [32]: tnew = np.zeros((len(ax0new), t.shape[1], t.shape[2]))
In [33]: it = np.nditer(t[0], flags=['multi_index'])
In [34]: for _ in it:
    ...:     tnew[:, it.multi_index[0], it.multi_index[1]] = np.interp(ax0new, ax0old, t[:, it.multi_index[0], it.multi_index[1]])
    ...:
In [35]: tnew
Out[35]:
array([[[ 0., 1.],
[ 2., 3.],
[ 4., 5.]],
[[ 4., 5.],
[ 6., 7.],
[ 8., 9.]],
[[ 8., 9.],
[10., 11.],
[12., 13.]],
[[12., 13.],
[14., 15.],
[16., 17.]]])
You could try scipy.interpolate.interp1d:
from scipy.interpolate import interp1d
import numpy as np
t = np.array([[[ 0, 1],
[ 2, 3],
[ 4, 5]],
[[ 6, 7],
[ 8, 9],
[10, 11]],
[[12, 13],
[14, 15],
[16, 17]]])
# for the first slice
f = interp1d(np.arange(t.shape[0]), t[..., 0], axis=0)
# returns a function which you call with values within range np.arange(t.shape[0])
# data used for interpolation
t[..., 0]
>>> array([[ 0, 2, 4],
[ 6, 8, 10],
[12, 14, 16]])
f(1)
>>> array([ 6., 8., 10.])
f(1.5)
>>> array([ 9., 11., 13.])
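For the specific case in the question, where xp is np.arange(t.shape[0]) (unit spacing), the interpolation can also be written as a weighted sum of neighbouring slices, with no loop over the other axes. This is only a sketch under that assumption, not a general replacement for np.interp. The names lo and w are mine, and it needs at least two samples along axis 0.

```python
import numpy as np

t = np.arange(18).reshape(3, 3, 2)
x = np.linspace(0, t.shape[0] - 1, 4)

# Index of the sample to the left of each x, clipped so that lo + 1 stays in range.
lo = np.clip(np.floor(x).astype(int), 0, t.shape[0] - 2)
# Fractional distance from the left sample, shaped to broadcast over axes 1 and 2.
w = (x - lo)[:, None, None]

# Linear blend of the two neighbouring slices for every x at once.
result = (1 - w) * t[lo] + w * t[lo + 1]
print(result.shape)  # (4, 3, 2)
```

If xp is not evenly spaced with step 1, the weights would have to be computed from xp itself, for example with np.searchsorted, so treat this as a special case.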

Sum rows of a 2D array with a specific stepsize - NumPy

This is a quick one. I am wondering if there is a better way to express the following lines (besides using a short loop):
energy = np.zeros((4, signal.shape[1]))
energy[0::4, 0:] = np.sum(signal[0::4, :], axis=0)
energy[1::4, 0:] = np.sum(signal[1::4, :], axis=0)
energy[2::4, 0:] = np.sum(signal[2::4, :], axis=0)
energy[3::4, 0:] = np.sum(signal[3::4, :], axis=0)
Reshape to split the first axis into two and then sum along the first of those two, like so -
energy = signal.reshape(-1,4,signal.shape[1]).sum(0)
Sample run -
In [327]: np.random.seed(0)
In [328]: signal = np.random.randint(0,9,(8,5))
In [329]: energy = np.zeros((4, signal.shape[1]))
...: energy[0::4, 0:] = np.sum(signal[0::4, :], axis=0)
...: energy[1::4, 0:] = np.sum(signal[1::4, :], axis=0)
...: energy[2::4, 0:] = np.sum(signal[2::4, :], axis=0)
...: energy[3::4, 0:] = np.sum(signal[3::4, :], axis=0)
In [330]: energy
Out[330]:
array([[ 13., 4., 6., 3., 10.],
[ 8., 5., 4., 7., 15.],
[ 7., 11., 11., 4., 13.],
[ 7., 8., 8., 5., 12.]])
In [331]: signal.reshape(-1,4,signal.shape[1]).sum(0)
Out[331]:
array([[13, 4, 6, 3, 10],
[ 8, 5, 4, 7, 15],
[ 7, 11, 11, 4, 13],
[ 7, 8, 8, 5, 12]])
For arrays with number of rows not necessarily a multiple of 4, here's the generic version -
m = signal.shape[0]
n = m//4
energy = signal[:n*4].reshape(n,4,-1).sum(0)
energy[:m%4] += signal[n*4:]
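Another way to handle a row count that is not a multiple of the step is to pad with zero rows before reshaping, which avoids the separate fix-up line. A minimal sketch: strided_row_sum is a hypothetical helper name, and it assumes signal is 2D with at least one row.

```python
import numpy as np

def strided_row_sum(signal, step=4):
    """Sum rows i, i + step, i + 2*step, ... for each i in range(step)."""
    m, ncols = signal.shape
    pad = (-m) % step  # number of zero rows needed to reach a multiple of step
    padded = np.vstack([signal, np.zeros((pad, ncols), dtype=signal.dtype)])
    # Split axis 0 into (groups, step) and sum over the groups.
    return padded.reshape(-1, step, ncols).sum(axis=0)

signal = np.arange(50).reshape(10, 5)  # 10 rows, not a multiple of 4
energy = strided_row_sum(signal)
print(energy.shape)  # (4, 5)
```

The padding costs one copy of the array, so for large inputs the slice-and-add version above may still be preferable.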

How to enlarge a tensor(duplicate value) in tensorflow?

I am new to TensorFlow. I am trying to implement the global context extraction from this paper, https://arxiv.org/abs/1506.04579, which is an average pooling over the whole feature map followed by duplicating the 1x1 feature map back to the original size.
Specifically, the expected operation is as follows.
input: [N, 1, 1, C] tensor, where N is the batch size and C is the number of channels
output: [N, H, W, C] tensor, where H and W are the height and width of the original feature map, and all H * W values of the output are the same as the 1x1 input.
For example,
      [[1, 1, 1],
1 ->   [1, 1, 1],
       [1, 1, 1]]
I have no idea how to do this using TensorFlow. tf.image.resize_images requires 3 channels, and tf.pad cannot pad with a constant value other than zero.
tf.tile may help you
x = tf.constant([[1, 2, 3]]) # shape (1, 3)
y = tf.tile(x, [3, 1]) # shape (3, 3)
y_ = tf.tile(x, [3, 2]) # shape (3, 6)
with tf.Session() as sess:
    a, b, c = sess.run([x, y, y_])
>>>a
array([[1, 2, 3]], dtype=int32)
>>>b
array([[1, 2, 3],
[1, 2, 3],
[1, 2, 3]], dtype=int32)
>>>c
array([[1, 2, 3, 1, 2, 3],
[1, 2, 3, 1, 2, 3],
[1, 2, 3, 1, 2, 3]], dtype=int32)
tf.tile(input, multiples, name=None)
multiples gives the number of times to repeat the input along each axis:
in y, axis 0 is repeated 3 times
in y_, axis 0 is repeated 3 times and axis 1 is repeated 2 times
You may need to use tf.expand_dims first.
Yes, it accepts dynamic shapes:
x = tf.placeholder(dtype=tf.float32, shape=[None, 4])
x_shape = tf.shape(x)
y = tf.tile(x, [3 * x_shape[0], 1])
with tf.Session() as sess:
    x_ = np.array([[1, 2, 3, 4]])
    a = sess.run(y, feed_dict={x:x_})
>>>a
array([[ 1., 2., 3., 4.],
[ 1., 2., 3., 4.],
[ 1., 2., 3., 4.]], dtype=float32)
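For the [N, 1, 1, C] input from the question, the repetition counts would be 1 for the batch and channel axes and H and W for the two spatial axes. Below is a sketch in NumPy, whose np.tile takes per-axis repetition counts in the same way. The sizes N, H, W, C are made up for illustration, and the tf.tile call in the comment is what I would expect the TensorFlow equivalent to be, not something I have run here.

```python
import numpy as np

N, H, W, C = 2, 3, 4, 5  # illustrative sizes only

# Stand-in for the globally pooled feature map of shape [N, 1, 1, C].
pooled = np.arange(N * C, dtype=np.float32).reshape(N, 1, 1, C)

# Repeat H times along axis 1 and W times along axis 2; keep axes 0 and 3 as they are.
# Assumed TensorFlow counterpart: tf.tile(pooled, [1, H, W, 1])
tiled = np.tile(pooled, (1, H, W, 1))

print(tiled.shape)  # (2, 3, 4, 5)
```

Every spatial position of tiled then holds the same C-vector as the 1x1 input for that batch element.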

Set k-largest elements of a tensor to zero in TensorFlow

I want to find the k largest elements of each row of h and set those elements to zero.
I can select the indices of the largest value of each row with the top_k function, like this:
top_k = tf.nn.top_k(h, 1)
But I could not use the indices returned by top_k to update the tensor.
How can I do that? Thanks in advance...
This is a bit tricky, and maybe there is a better solution. tf.scatter_update() doesn't work here because it can only modify slices of a tensor along the first dimension (not, for instance, a single element in the first row and second column).
You have to get the values and indices from tf.nn.top_k(), use them to create a sparse tensor, and subtract it from the initial tensor x:
x = tf.constant([[6., 2., 0.], [0., 4., 5.]]) # of type tf.float32
k = 2
values, indices = tf.nn.top_k(x, k, sorted=False) # indices will be [[0, 1], [1, 2]], values will be [[6., 2.], [4., 5.]]
# We need to create full indices like [[0, 0], [0, 1], [1, 2], [1, 1]]
my_range = tf.expand_dims(tf.range(0, indices.get_shape()[0]), 1) # will be [[0], [1]]
my_range_repeated = tf.tile(my_range, [1, k]) # will be [[0, 0], [1, 1]]
# change shapes to [N, k, 1] and [N, k, 1], to concatenate into [N, k, 2]
full_indices = tf.concat([tf.expand_dims(my_range_repeated, 2), tf.expand_dims(indices, 2)], axis=2)
full_indices = tf.reshape(full_indices, [-1, 2])
to_subtract = tf.sparse_to_dense(full_indices, x.get_shape(), tf.reshape(values, [-1]), default_value=0.)
res = x - to_subtract # res should be all 0.
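To make the index bookkeeping above easier to follow, here is the same construction in NumPy, purely as an illustration (the answer itself stays in TensorFlow). np.argsort stands in for tf.nn.top_k here, so the order of the indices within a row is an assumption and may differ from what top_k returns with sorted=False.

```python
import numpy as np

x = np.array([[6., 2., 0.], [0., 4., 5.]])
k = 2

# Column indices of the k largest values in each row (largest first).
indices = np.argsort(-x, axis=1)[:, :k]

# Row index repeated k times per row: [0, 0, 1, 1].
rows = np.repeat(np.arange(x.shape[0]), k)

# Pair every row index with its column index to get full (row, col) coordinates.
full_indices = np.stack([rows, indices.ravel()], axis=1)

# Zero out exactly those coordinates.
res = x.copy()
res[full_indices[:, 0], full_indices[:, 1]] = 0.0
print(full_indices.tolist())  # [[0, 0], [0, 1], [1, 2], [1, 1]]
```

The pairs in full_indices play the same role as the reshaped [-1, 2] tensor in the TensorFlow code.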
I was facing the opposite problem and wanted an operation which supported gradients. top_k does not support gradient propagation, and hence a good way will be to implement the function in C++.
The top_k C++ code is found here.
Your operation's kernel will look like this:
template <typename T>
class MakeSparseOp : public OpKernel {
public:
explicit MakeSparseOp(OpKernelConstruction *context) : OpKernel(context) {}
void Compute(OpKernelContext *context) override {
// Grab the input tensors
const auto &k_in = context->input(1);
OP_REQUIRES(context, TensorShapeUtils::IsScalar(k_in.shape()),
errors::InvalidArgument("k must be scalar, got shape ",
k_in.shape().DebugString()));
int k = k_in.scalar<int32>()();
OP_REQUIRES(context, k >= 0,
errors::InvalidArgument("Need k >= 0, got ", k));
const Tensor &x_in = context->input(0);
OP_REQUIRES(context, x_in.dims() >= 1,
errors::InvalidArgument("input must be >= 1-D, got shape ",
x_in.shape().DebugString()));
OP_REQUIRES(
context, x_in.dim_size(x_in.dims() - 1) >= k,
errors::InvalidArgument("input must have at least k columns"));
// Flattening the input tensor
const auto &x = x_in.flat_inner_dims<T>();
const auto num_rows = x.dimension(0);
const auto num_cols = x.dimension(1);
TensorShape output_shape = x_in.shape();
// Create an output tensor
Tensor *x_out = nullptr;
OP_REQUIRES_OK(context,
context->allocate_output(0, output_shape, &x_out));
/*
* Get the top k values along the first dimension for input
*/
auto x_sparse = x_out->flat_inner_dims<T>();
x_sparse = x; // Copy all elements first so the output is defined even when k == 0
if (k == 0) return; // Nothing to zero out
// Using TopN to get the k max elements
gtl::TopN<std::pair<T, int32>> filter(k);
for (int r = 0; r < num_rows; r++) {
// Processing a row at a time
for (int32 c = 0; c < num_cols; c++) {
// The second element is the negated index, so that lower-index
// elements
// are considered larger than higher-index elements in case of
// ties.
filter.push(std::make_pair(x(r, c), -c));
}
for (auto top_k_it = filter.unsorted_begin();
top_k_it != filter.unsorted_end(); ++top_k_it) {
x_sparse(r, -top_k_it->second) = 0; // Set max k to zero
}
filter.Reset();
}
}
};
My implementation for a related problem is here.
With the recent availability of the scatter_nd_update function in TensorFlow, here is a modified version of the answer from Olivier.
k = 2
val_to_replace_with = -333
x = tf.Variable([[6., 2., 0.], [0., 4., 5.]]) # of type tf.float32
values, indices = tf.nn.top_k(x, k, sorted=False) # indices will be [[0, 1], [1, 2]], values will be [[6., 2.], [4., 5.]]
# We need to create full indices like [[0, 0], [0, 1], [1, 2], [1, 1]]
my_range = tf.expand_dims(tf.range(0, tf.shape(indices)[0]), 1) # will be [[0], [1]]
my_range_repeated = tf.tile(my_range, [1, k]) # will be [[0, 0], [1, 1]]
# change shapes to [N, k, 1] and [N, k, 1], to concatenate into [N, k, 2]
full_indices = tf.concat([tf.expand_dims(my_range_repeated, -1), tf.expand_dims(indices, -1)], axis=2)
full_indices = tf.reshape(full_indices, [-1, 2])
# only significant modification -----------------------------------------------------------------
updates = val_to_replace_with + tf.zeros([tf.size(indices)], dtype=tf.float32)
c = tf.scatter_nd_update(x, full_indices, updates)
# only significant modification -----------------------------------------------------------------
with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    print(sess.run(c))
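For checking results like this on small inputs outside the graph, the whole operation is short in NumPy. This is a sketch, not part of the TensorFlow answers: zero_top_k is a hypothetical helper name, and it assumes a 2D array with 0 <= k <= number of columns.

```python
import numpy as np

def zero_top_k(h, k):
    """Return a copy of h with the k largest entries of each row set to zero."""
    out = h.copy()
    if k == 0:
        return out  # nothing to zero out
    # argpartition places the k largest entries of each row in the last k slots
    # (in no particular order), which is all that is needed here.
    top_idx = np.argpartition(h, -k, axis=1)[:, -k:]
    np.put_along_axis(out, top_idx, 0, axis=1)
    return out

h = np.array([[6., 2., 0., 1.], [0., 4., 5., 3.]])
print(zero_top_k(h, 2).tolist())  # [[0.0, 0.0, 0.0, 1.0], [0.0, 0.0, 0.0, 3.0]]
```

With ties among the k-th largest values, which of the tied entries gets zeroed is not defined by this sketch.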
This follows Olivier Moindrot's idea, but is implemented with tf.meshgrid and tf.tensor_scatter_nd_sub:
x = tf.constant([[6., 2., 0.], [0., 4., 5.]]) # of type tf.float32
k = 2
values, indices = tf.nn.top_k(x, k, sorted=False) # indices will be [[0, 1], [1, 2]], values will be [[6., 2.], [4., 5.]]
# We need to create full indices like [[0, 0], [0, 1], [1, 2], [1, 1]]
ii, _ = tf.meshgrid(tf.range(x.shape[0]), tf.range(k), indexing='ij')  # row index for every top-k entry
full_indices = tf.reshape(tf.stack([ii, indices], axis=-1), [-1, len(x.shape)])
tf.tensor_scatter_nd_sub(x, full_indices, tf.reshape(values, -1))
"""
In [249]: tf.tensor_scatter_nd_sub(x, full_indices, tf.reshape(values, -1))
Out[249]:
<tf.Tensor: shape=(2, 3), dtype=float32, numpy=
array([[0., 0., 0.],
[0., 0., 0.]], dtype=float32)>
"""

Fill a tensor with a value that is not a scalar

I am trying to move a list of points to the origin using TensorFlow. Mathematically, the best way to do it is to find the centroid of the list of points and then subtract that centroid from every point.
The problem: the number of rows in the point list is unknown until runtime.
Code so far:
import tensorflow as tf
example_point_list = tf.constant([[3., 3.], [.2, .2], [.1, .1]])  # but with any number of points
centroid = tf.reduce_mean(example_point_list, 0)
# subtract???
origin_point_list = tf.subtract(example_point_list, centroid)  # named tf.sub before TensorFlow 1.0
The problem is that subtract works on an element by element basis so I have to create a centroid tensor with the same number of rows as the point list but there are no methods that do that.
(to put it in math terms)
A = [[1, 1],
[2, 2],
[3, 3]]
B = avg(A) // [2, 2]
// step I need to do but do not know how to do it
B -> B1 // [[2, 2], [2, 2], [2, 2]]
Result = A - B1
Any help is appreciated!
Because of broadcasting, you don't need to tile the rows. In fact, it's more efficient to not tile them and subtract vector from matrix directly. In your case it would look like this
tf.reset_default_graph()
example_points = np.array([[1, 1], [2, 2], [3, 3]], dtype=np.float32)
example_point_list = tf.placeholder(tf.float32)
centroid = tf.reduce_mean(example_point_list, 0)
result = example_point_list - centroid
sess = tf.InteractiveSession()
sess.run(result, feed_dict={example_point_list: example_points})
result
array([[-1., -1.],
[ 0., 0.],
[ 1., 1.]], dtype=float32)
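The same broadcasting rule can be tried out in NumPy without a session. This is a minimal sketch with a hypothetical helper name, center_points, and it works for any number of rows because the centroid of shape (D,) is broadcast against the (N, D) point list.

```python
import numpy as np

def center_points(points):
    """Shift an (N, D) array of points so that their centroid is at the origin."""
    centroid = points.mean(axis=0)  # shape (D,)
    return points - centroid        # (N, D) - (D,) broadcasts over the N rows

example_points = np.array([[1, 1], [2, 2], [3, 3]], dtype=np.float32)
print(center_points(example_points).tolist())
# [[-1.0, -1.0], [0.0, 0.0], [1.0, 1.0]]
```

No tiled copy of the centroid is ever materialised, which is why the broadcast form is the cheaper one.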
If you really want to tile the centroid vector explicitly, you can do it with the tf.shape operator, which returns the shape at runtime:
tf.reset_default_graph()
example_point_list0 = np.array([[1, 1], [2, 2], [3, 3]], dtype=np.float32)
example_point_list = tf.placeholder(tf.float32)
# get number of examples from the array: [3]
num_examples = tf.slice(tf.shape(example_point_list), [0], [1])
# reshape [3] into 3
num_examples_flat = tf.reshape(num_examples, ())
centroid = tf.reduce_mean(example_point_list, 0)
# reshape centroid vector [2, 2] into matrix [[2, 2]]
centroid_matrix = tf.reshape(centroid, [1, -1])
# assemble 3 into vector of dimensions to tile: [3, 1]
tile_shape = tf.stack([num_examples_flat, 1])  # named tf.pack before TensorFlow 1.0
# tile [[2, 2]] into [[2, 2], [2, 2], [2, 2]]
centroid_tiled = tf.tile(centroid_matrix, tile_shape)
sess = tf.InteractiveSession()
sess.run(centroid_tiled, feed_dict={example_point_list: example_point_list0})
result
array([[ 2., 2.],
[ 2., 2.],
[ 2., 2.]], dtype=float32)