Custom median pooling in tensorflow - tensorflow

I am trying to implement a median pooling layer in tensorflow.
However there is neither tf.nn.median_pool and neither tf.reduce_median.
Is there a way to implement such pooling layer with the python api ?

You could use something like:
patches = tf.extract_image_patches(tensor, [1, k, k, 1], ...)
m_idx = int(k*k/2+1)
top = tf.top_k(patches, m_idx, sorted=True)
median = tf.slice(top, [0, 0, 0, m_idx-1], [-1, -1, -1, 1])
To accommodate even sized median kernels and multiple channels, you will need to extend this, but this should get you most of the way.

As of March 2017, an easier answer (that under the hood works similarly to how Alex suggested) is to do this:
patches = tf.extract_image_patches(x, [1, k, k, 1], [1, k, k, 1], 4*[1], 'VALID')
medians = tf.contrib.distributions.percentile(patches, 50, axis=3)

For me, Alex's answer is not working for tf 1.4.1.
tf.top_k should be tf.nn.top_k
and should get values of tf.nn.top_k
Also, if the input is [1, H, W, C], either answer could not only work on height and width and neglect the channel.

Channel-wise median-pooling can be done by some addition reshapes on top of other answers:
# assuming NHWC layout
strides = rates = [1, 1, 1, 1]
patches = tf.extract_image_patches(x, [1, k, k, 1], strides, rates, 'VALID')
batch_size = tf.shape(x)[0]
n_channels = tf.shape(x)[-1]
n_patches_h = (tf.shape(x)[1] - k) // strides[1] + 1
n_patches_w = (tf.shape(x)[2] - k) // strides[2] + 1
n_patches = tf.shape(patches)[-1] // n_channels
patches = tf.reshape(patches, [batch_size, k, k, n_patches_h * n_patches_w, n_channels])
medians = tf.contrib.distributions.percentile(patches, 50, axis=[1,2])
medians = tf.reshape(medians, (batch_size, n_patches_h, n_patches_w, n_channels))
Not very efficient though.

I was looking for a median filter for tensorflowjs but can't seem to find one. tfa has a median filter now I think but for tf.js you can use this. Not sure if it would work on nodegpu.
function medianFilter(x, filter, strides, pad) {
//make Kernal
//todo allow for filter as array or number
let filterSize = filter ** 2;
let locs = tf.range(0, filterSize, filterSize );
//makes a bunc of arrays each one reprensentin one of the valuesin the median window ie 2x2 filter i in chanle and 4 out chanles
let f = tf.oneHot(tf.range(0,filterSize,1, 'int32'), filterSize).reshape([filter, filter, 1, filterSize]);
let y = tf.conv2d(x,f,strides,pad);
let m_idx = Math.floor(filterSize/2)+1;
let top = tf.topk(y, m_idx, true);
//note that thse are 3d tensors and if you use 4d ones add a 0 and -1 infron like in above ansowers
let median = tf.slice(top.values, [0,0,m_idx-1], [-1,-1,1] );
return median;
}

Related

Tensorflow multiply 3D batch tensor with a 2D weight

I've got two tensors with the shape shown below,
batch.shape = [?, 5, 4]
weight.shape = [3, 5]
by multiplying the weight with every element in the batch, I want to get
result.shape = [?, 3, 4]
what is the most efficient way to achieve this?
Try this:
newbatch = tf.transpose(batch,[1,0,2])
newbatch = tf.reshape(newbatch,[5,-1])
result = tf.matmul(weight,newbatch)
result = tf.reshape(result,[3,-1,4])
result = tf.transpose(result, [1,0,2])
Or more compactly:
newbatch = tf.reshape(tf.transpose(batch,[1,0,2]),[5,-1])
result = tf.transpose(tf.reshape(tf.matmul(weight,newbatch),[3,-1,4]), [1,0,2])
Try this:
tf.einsum("ijk,aj-> iak",batch,weight)
A generalized contraction between tensors of arbitrary dimension Refer this for more information

Pairwise distance between a set of Matrices in Keras/Tensorflow

I want to calculate pairwise distance between a set of Tensor (e.g 4 Tensor). Each matrix is 2D Tensor. I don't know how to do this in vectorize format. I wrote following sudo-code to determine what I need:
E.shape => [4,30,30]
sum = 0
for i in range(4):
for j in range(4):
res = calculate_distance(E[i],E[j]) # E[i] is one the 30*30 Tensor
sum = sum + reduce_sum(res)
Here is my last try:
x_ = tf.expand_dims(E, 0)
y_ = tf.expand_dims(E, 1)
s = x_ - y_
P = tf.reduce_sum(tf.norm(s, axis=[-2, -1]))
This code works But I don't know how do this in a Batch. For instance when E.shape is [BATCH_SIZE * 4 * 30 * 30] my code doesn't work and Out Of Memory will happen. How can I do this efficiently?
Edit: After a day, I find a solution. it's not perfect but works:
res = tf.map_fn(lambda x: tf.map_fn(lambda y: tf.map_fn(lambda z: tf.norm(z - x), x), x), E)
res = tf.reduce_mean(tf.square(res))
Your solution with expand_dims should be okay if your batch size is not too large. However, given that your original pseudo code loops over range(4), you should probably expand axes 1 and 2, instead of 0 and 1.
You can check the shape of the tensors to ensure that you're specifying the correct axes. For example,
batch_size = 8
E_np = np.random.rand(batch_size, 4, 30, 30)
E = K.variable(E_np) # shape=(8, 4, 30, 30)
x_ = K.expand_dims(E, 1)
y_ = K.expand_dims(E, 2)
s = x_ - y_ # shape=(8, 4, 4, 30, 30)
distances = tf.norm(s, axis=[-2, -1]) # shape=(8, 4, 4)
P = K.sum(distances, axis=[-2, -1]) # shape=(8,)
Now P will be the sum of pairwise distances between the 4 matrices for each of the 8 samples.
You can also verify that the values in P is the same as what would be computed in your pseudo code:
answer = []
for batch_idx in range(batch_size):
s = 0
for i in range(4):
for j in range(4):
a = E_np[batch_idx, i]
b = E_np[batch_idx, j]
s += np.sqrt(np.trace(np.dot(a - b, (a - b).T)))
answer.append(s)
print(answer)
[149.45960605637578, 147.2815068236368, 144.97487402393705, 146.04866735065312, 144.25537059201062, 148.9300986019226, 146.61229889228133, 149.34259789169045]
print(K.eval(P).tolist())
[149.4595947265625, 147.281494140625, 144.97488403320312, 146.04867553710938, 144.25537109375, 148.9300994873047, 146.6123046875, 149.34259033203125]
Tensorflow allows to compute the Frobenius norm via tf.norm function. In case of 2D matrices, it's equivalent to 1-norm.
The following solution isn't vectorized and assumes that the first dimension in E is known statically:
E = tf.random_normal(shape=[5, 3, 3], dtype=tf.float32)
F = tf.split(E, E.shape[0])
total = tf.reduce_sum([tf.norm(tensor=(lhs-rhs), ord=1, axis=(-2, -1)) for lhs in F for rhs in F])
Update:
An optimized vectorized version of the same code:
E = tf.random_normal(shape=[1024, 4, 30, 30], dtype=tf.float32)
lhs = tf.expand_dims(E, axis=1)
rhs = tf.expand_dims(E, axis=2)
total = tf.reduce_sum(tf.norm(tensor=(lhs - rhs), ord=1, axis=(-2, -1)))
Memory concerns: upon evaluating this code,
tf.contrib.memory_stats.MaxBytesInUse() reports that the peak memory consumption is 73729792 = 74Mb, which indicates relatively moderate overhead (the raw lhs-rhs tensor is 59Mb). Your OOM is most likely caused by the duplication of BATCH_SIZE dimension when you compute s = x_ - y_, because your batch size is much larger than the number of matrices (1024 vs 4).

Tensorflow symmetric matrix

I want to create a symmetric matrix of n*n and train this matrix in TensorFlow. Effectively I should only train (n+1)*n/2 parameters. How should I do this?
I saw some previous threads which suggest do the following:
X = tf.Variable(tf.random_uniform([d,d], minval=-.1, maxval=.1, dtype=tf.float64))
X_symm = 0.5 * (X + tf.transpose(X))
However, this means I have to train n*n variables, not n*(n+1)/2 variables.
Even there is no function to achieve this, a patch of self-written code would help!
Thanks!
You can use tf.matrix_band_part(input, 0, -1) to create an upper triangular matrix from a square one, so this code would allow you to train on n(n+1)/2 variables although it has you create n*n:
X = tf.Variable(tf.random_uniform([d,d], minval=-.1, maxval=.1, dtype=tf.float64))
X_upper = tf.matrix_band_part(X, 0, -1)
X_symm = 0.5 * (X_upper + tf.transpose(X_upper))
Referring to answer of gdelab: in Tensorflow 2.x, you have to use following code.
X_upper = tf.linalg.band_part(X, 0, -1)
gdelab's answer is correct and will work, since a neural network can adjust the 0.5 factor by itself. I aimed for a solution, where the neural network actually only has (n+1)*n/2 output neurons. The following function transforms these into a symmetric matrix:
def create_symmetric_matrix(x,n):
x_rev = tf.reverse(x[:, n:], [1])
xc = tf.concat([x, x_rev], axis=1)
x_res = tf.reshape(xc, [-1, n, n])
x_upper_triangular = tf.linalg.band_part(x_res, 0, -1)
x_lower_triangular = tf.linalg.set_diag( tf.transpose(x_upper_triangular, perm=[0, 2, 1]), tf.zeros([tf.shape(x)[0], n], dtype=tf.float32))
return x_upper_triangular + x_lower_triangular
with x as a vector of rank [batch,n*(n+1)/2] and n as the rank of the output matrix.
The code is inspired by tfp.math.fill_triangular.

Row-wise Histogram

Given a 2-dimensional tensor t, what's the fastest way to compute a tensor h where
h[i, :] = tf.histogram_fixed_width(t[i, :], vals, nbins)
I.e. where tf.histogram_fixed_width is called per row of the input tensor t?
It seems that tf.histogram_fixed_width is missing an axis parameter that works like, e.g., tf.reduce_sum's axis parameter.
tf.histogram_fixed_width works on the entire tensor indeed. You have to loop through the rows explicitly to compute the per-row histograms. Here is a complete working example using TensorFlow's tf.while_loop construct :
import tensorflow as tf
t = tf.random_uniform([2, 2])
i = 0
hist = tf.constant(0, shape=[0, 5], dtype=tf.int32)
def loop_body(i, hist):
h = tf.histogram_fixed_width(t[i, :], [0.0, 1.0], nbins=5)
return i+1, tf.concat_v2([hist, tf.expand_dims(h, 0)], axis=0)
i, hist = tf.while_loop(
lambda i, _: i < 2, loop_body, [i, hist],
shape_invariants=[tf.TensorShape([]), tf.TensorShape([None, 5])])
sess = tf.InteractiveSession()
print(hist.eval())
Inspired by keveman's answer and because the number of rows of t is fixed and rather small, I chose to use a combination of tf.gather to split rows and tf.pack to join rows. It looks simple and works, will see if it is efficient...
t_histo_rows = [
tf.histogram_fixed_width(
tf.gather(t, [row]),
vals, nbins)
for row in range(t_num_rows)]
t_histo = tf.pack(t_histo_rows, axis=0)
I would like to propose another implementation.
This implementation can also handle multi axes and unknown dimensions (batching).
def histogram(tensor, nbins=10, axis=None):
value_range = [tf.reduce_min(tensor), tf.reduce_max(tensor)]
if axis is None:
return tf.histogram_fixed_width(tensor, value_range, nbins=nbins)
else:
if not hasattr(axis, "__len__"):
axis = [axis]
other_axis = [x for x in range(0, len(tensor.shape)) if x not in axis]
swap = tf.transpose(tensor, [*other_axis, *axis])
flat = tf.reshape(swap, [-1, *np.take(tensor.shape.as_list(), axis)])
count = tf.map_fn(lambda x: tf.histogram_fixed_width(x, value_range, nbins=nbins), flat, dtype=(tf.int32))
return tf.reshape(count, [*np.take([-1 if a is None else a for a in tensor.shape.as_list()], other_axis), nbins])
The only slow part here is tf.map_fn but it is still faster than the other solutions mentioned.
If someone knows a even faster implementation please comment since this operation is still very expensive.
answers above is still slow running in GPU. Here i give an another option, which is faster(at least in my running envirment), but it is limited to 0~1 (you can normalize the value first). the train_equal_mask_nbin can be defined once in advance
def histogram_v3_nomask(tensor, nbins, row_num, col_num):
#init mask
equal_mask_list = []
for i in range(nbins):
equal_mask_list.append(tf.ones([row_num, col_num], dtype=tf.int32) * i)
#[nbins, row, col]
#[0, row, col] is tensor of shape [row, col] with all value 0
#[1, row, col] is tensor of shape [row, col] with all value 1
#....
train_equal_mask_nbin = tf.stack(equal_mask_list, axis=0)
#[inst, doc_len] float to int(equaly seg float in bins)
int_input = tf.cast(tensor * (nbins), dtype=tf.int32)
#input [row,col] -> copy N times, [nbins, row_num, col_num]
int_input_nbin_copy = tf.reshape(tf.tile(int_input, [nbins, 1]), [nbins, row_num, col_num])
#calculate histogram
histogram = tf.transpose(tf.count_nonzero(tf.equal(train_equal_mask_nbin, int_input_nbin_copy), axis=2))
return histogram
With the advent of tf.math.bincount, I believe the problem has become much simpler.
Something like this should work:
def hist_fixed_width(x,st,en,nbins):
x=(x-st)/(en-st)
x=tf.cast(x*nbins,dtype=tf.int32)
x=tf.clip_by_value(x,0,nbins-1)
return tf.math.bincount(x,minlength=nbins,axis=-1)

How to find an index of the first matching element in TensorFlow

I am looking for a TensorFlow way of implementing something similar to Python's list.index() function.
Given a matrix and a value to find, I want to know the first occurrence of the value in each row of the matrix.
For example,
m is a <batch_size, 100> matrix of integers
val = 23
result = [0] * batch_size
for i, row_elems in enumerate(m):
result[i] = row_elems.index(val)
I cannot assume that 'val' appears only once in each row, otherwise I would have implemented it using tf.argmax(m == val). In my case, it is important to get the index of the first occurrence of 'val' and not any.
It seems that tf.argmax works like np.argmax (according to the test), which will return the first index when there are multiple occurrences of the max value.
You can use tf.argmax(tf.cast(tf.equal(m, val), tf.int32), axis=1) to get what you want. However, currently the behavior of tf.argmax is undefined in case of multiple occurrences of the max value.
If you are worried about undefined behavior, you can apply tf.argmin on the return value of tf.where as #Igor Tsvetkov suggested.
For example,
# test with tensorflow r1.0
import tensorflow as tf
val = 3
m = tf.placeholder(tf.int32)
m_feed = [[0 , 0, val, 0, val],
[val, 0, val, val, 0],
[0 , val, 0, 0, 0]]
tmp_indices = tf.where(tf.equal(m, val))
result = tf.segment_min(tmp_indices[:, 1], tmp_indices[:, 0])
with tf.Session() as sess:
print(sess.run(result, feed_dict={m: m_feed})) # [2, 0, 1]
Note that tf.segment_min will raise InvalidArgumentError when there is some row containing no val. In your code row_elems.index(val) will raise exception too when row_elems don't contain val.
Looks a little ugly but works (assuming m and val are both tensors):
idx = list()
for t in tf.unpack(m, axis=0):
idx.append(tf.reduce_min(tf.where(tf.equal(t, val))))
idx = tf.pack(idx, axis=0)
EDIT:
As Yaroslav Bulatov mentioned, you could achieve the same result with tf.map_fn:
def index1d(t):
return tf.reduce_min(tf.where(tf.equal(t, val)))
idx = tf.map_fn(index1d, m, dtype=tf.int64)
Here is another solution to the problem, assuming there is a hit on every row.
import tensorflow as tf
val = 3
m = tf.constant([
[0 , 0, val, 0, val],
[val, 0, val, val, 0],
[0 , val, 0, 0, 0]])
# replace all entries in the matrix either with its column index, or out-of-index-number
match_indices = tf.where( # [[5, 5, 2, 5, 4],
tf.equal(val, m), # [0, 5, 2, 3, 5],
x=tf.range(tf.shape(m)[1]) * tf.ones_like(m), # [5, 1, 5, 5, 5]]
y=(tf.shape(m)[1])*tf.ones_like(m))
result = tf.reduce_min(match_indices, axis=1)
with tf.Session() as sess:
print(sess.run(result)) # [2, 0, 1]
Here is a solution which also considers the case the element is not included by the matrix (solution from github repository of DeepMind)
def get_first_occurrence_indices(sequence, eos_idx):
'''
args:
sequence: [batch, length]
eos_idx: scalar
'''
batch_size, maxlen = sequence.get_shape().as_list()
eos_idx = tf.convert_to_tensor(eos_idx)
tensor = tf.concat(
[sequence, tf.tile(eos_idx[None, None], [batch_size, 1])], axis = -1)
index_all_occurrences = tf.where(tf.equal(tensor, eos_idx))
index_all_occurrences = tf.cast(index_all_occurrences, tf.int32)
index_first_occurrences = tf.segment_min(index_all_occurrences[:, 1],
index_all_occurrences[:, 0])
index_first_occurrences.set_shape([batch_size])
index_first_occurrences = tf.minimum(index_first_occurrences + 1, maxlen)
return index_first_occurrences
And:
import tensorflow as tf
mat = tf.Variable([[1,2,3,4,5], [2,3,4,5,6], [3,4,5,6,7], [0,0,0,0,0]], dtype = tf.int32)
idx = 3
first_occurrences = get_first_occurrence_indices(mat, idx)
sess = tf.InteractiveSession()
sess.run(tf.global_variables_initializer())
sess.run(first_occurrence) # [3, 2, 1, 5]