How do I get a tensor representing the "on" positions in the original tensor? - tensorflow

I have a TensorFlow label that may look like any of the following: [1, 2], [3], []. The first has two classes, the second has one class, and the third has none. I'd like to turn each of these into another tensor that looks like the following:
[1, 2] --> [0, 1, 1, 0].
[3] --> [0, 0, 0, 1].
[] --> [0].
The number of classes is defined beforehand (here it's 3). In some sense, this is the inverse of this question - Tensorflow Extract Indices Not Equal to Zero.

The following works:
sparse_categories = tf.convert_to_tensor([[1 if k == i else 0 for k in range(num_categories+1)] for i in range(num_categories+1)])
values = tf.cond(tf.size(values) > 0, lambda: values, lambda: [0])
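# Sum over the gathered rows only (axis=0) so the result stays a vector of length num_categories+1.
values = tf.reduce_sum(tf.gather(sparse_categories, values), axis=0)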
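The same idea (pick one one-hot row per label, then add the rows up) can be sketched in NumPy. This is an illustration rather than the asker's TensorFlow code: the helper name `multi_hot` is hypothetical, and it assumes an empty label list should map to an all-zero vector of full length, which differs from the `[0]` shown in the question.

```python
import numpy as np

def multi_hot(labels, num_categories):
    """Return a 0/1 vector of length num_categories + 1 with ones at `labels`."""
    # Row i of the identity matrix is the one-hot vector for class i.
    one_hot_rows = np.eye(num_categories + 1, dtype=np.int64)
    labels = np.asarray(labels, dtype=np.int64)
    # Select one row per label and add them up; an empty selection sums to zeros.
    return one_hot_rows[labels].sum(axis=0)

print(multi_hot([1, 2], 3).tolist())  # [0, 1, 1, 0]
print(multi_hot([3], 3).tolist())     # [0, 0, 0, 1]
print(multi_hot([], 3).tolist())      # [0, 0, 0, 0]
```

Note that a label repeated in the input would produce a 2 at that position, so this sketch assumes each class appears at most once.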

Related

How to use tf.gather with index vector that may contain out-of-range indices?

I have an index vector that may contain negative entries. How can I use this in tf.gather? My approach
params = tf.constant(range(5))
idx = tf.constant([-1, 1, 2])
tf.where(
    condition = idx >= 0,
    x = tf.gather(params, idx),
    y = -1
)
throws
InvalidArgumentError: indices[0] = -1 is not in [0, 5) [Op:GatherV2]
because the x branch is evaluated for all elements. I do not want to remove the invalid indices because I need to retain the positional information, i.e. the desired output is [-1, 1, 2] (rather than [1, 2], which I would get by discarding the invalid indices).
You can replace the invalid indices with a valid placeholder (here 0) before gathering, and then mask the gathered values:
tf.where(idx >= 0, tf.gather(params, tf.where(idx >= 0, idx, 0)), -1)
Output
<tf.Tensor: shape=(3,), dtype=int32, numpy=array([-1, 1, 2])>
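The clamp-then-mask pattern can be sketched in NumPy as well. The helper name `safe_gather` and the extra upper-bound check are assumptions added for illustration. Unlike `tf.gather`, NumPy would silently wrap a negative index around instead of raising, so the placeholder is there to keep the logic identical rather than to avoid an error.

```python
import numpy as np

def safe_gather(params, idx, fill=-1):
    """Gather params[idx], writing `fill` wherever idx is out of range."""
    params = np.asarray(params)
    idx = np.asarray(idx)
    # Positions whose index can be looked up safely.
    valid = (idx >= 0) & (idx < params.shape[0])
    # Swap every invalid index for a placeholder that is safe to look up (0 here).
    safe_idx = np.where(valid, idx, 0)
    # Gather with the safe indices, then overwrite the placeholder positions.
    return np.where(valid, params[safe_idx], fill)

result = safe_gather(np.arange(5), [-1, 1, 2, 7])
print(result.tolist())  # [-1, 1, 2, -1]
```

The output keeps one entry per input index, so the positional information is preserved.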

numpy: Cleanly retrieve coordinates (indices) for highest k values - along a specific axis - in ndarray

I would like to be able to:
select k highest values along (or across?) the first dimension
find indices for those k values
assign those values to a new ndarray of equal shape at their respective positions.
I'm wondering if there is a quicker way to achieve the result exemplified below. In particular, I would like to avoid making the batch indices "manually".
Here's my solution:
import numpy as np
# Create an unordered array (only needed to set up the example)
arr = np.arange(24).reshape(2, 3, 4)
arr_1 = arr[0,::2].copy()
arr_2 = arr[1,1::].copy()
arr[0,::2] = arr_2[:,::-1]
arr[1,1:] = arr_1[:,::-1]
# reshape array to: (batch_size, H*W)
arr_batched = arr.reshape(arr.shape[0], -1)
# find indices of the k greatest values along all but the 1st dimension.
k = 2  # the outputs printed in this example correspond to k = 2
gr_ind = np.argpartition(arr_batched, -k)[:, -k:]
# flatten and unravel indices.
maxk_ind_flat = gr_ind.flatten()
maxk_ind_shape = np.unravel_index(maxk_ind_flat, arr.shape)
# maxk_ind_shape prints: (array([0, 0, 0, 0]), array([2, 2, 0, 0]), array([1, 0, 2, 3]))
# note: the indices come from partitioning an array of shape (2, n), so unraveling them does not take the first (batch) dimension into account (here it is always [0,0,0,0])
# Craft batch indices...
batch_indices = np.repeat(np.arange(arr.shape[0]), k)
# ...and join
maxk_indices = tuple([batch_indices]+[ind for ind in maxk_ind_shape[1:]])
# The result is used to re-assign k-highest values for each batch element to a destination matrix:
arr2 = np.zeros_like(arr)
arr2[maxk_indices] = arr[maxk_indices]
# arr2 prints:
# array([[[ 0, 0, 0, 0],
# [ 0, 0, 0, 0],
# [23,22, 0, 0]],
#
# [[ 0, 0, 14, 15],
# [ 0, 0, 0, 0],
# [ 0, 0, 0, 0]]])
Any help would be appreciated.
One way would be to use np.put_along_axis and np.take_along_axis:
gr_ind = np.argpartition(arr_batched,-k,axis=-1)[:,-k:]
arr_2 = np.zeros_like(arr)
np.put_along_axis(arr_2.reshape(arr_batched.shape),gr_ind,np.take_along_axis(arr_batched,gr_ind,-1),-1)
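A self-contained sketch of this approach, with a check that each batch element keeps exactly its k largest values. The random test array and the names `flat`, `top_idx`, `top_val`, and `out` are chosen here for illustration. It relies on `out` being a freshly allocated contiguous array, so that `reshape` returns a view and the in-place write reaches `out`.

```python
import numpy as np

k = 2
rng = np.random.default_rng(0)
# Distinct values, so the k largest entries per batch element are unambiguous.
arr = rng.permutation(24).reshape(2, 3, 4)

flat = arr.reshape(arr.shape[0], -1)                   # (batch, H*W)
top_idx = np.argpartition(flat, -k, axis=-1)[:, -k:]   # k largest per row, in no particular order
top_val = np.take_along_axis(flat, top_idx, axis=-1)

out = np.zeros_like(arr)
# Writing through the reshaped view fills `out` in place.
np.put_along_axis(out.reshape(flat.shape), top_idx, top_val, axis=-1)

# Compare against a plain sort: the k largest values per row must match.
expected = np.sort(flat, axis=-1)[:, -k:]
kept = np.sort(out.reshape(flat.shape), axis=-1)[:, -k:]
print(np.array_equal(kept, expected))  # True
```

No batch indices are built by hand: the row position of each entry in `top_idx` already identifies the batch element.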

how to implement the variable array with one and zero in tensorflow

I'm totally new to TensorFlow, and I just want to implement a kind of selection function using matrix multiplication.
Example below:
# input:
I = [[9.6, 4.1, 3.2]]
# selection: (a single "1" value, and the others are "0"s)
s = tf.transpose(tf.Variable([[a, b, c]]))
# e.g. [[a, b, c]] could be [[0, 1, 0]] or [[0, 0, 1]] or [[1, 0, 0]]
# result: (multiplication)
o = tf.matmul(I, s)
Sorry for the unclear wording.
I want to find the "solution" among distribution functions with different means and sigmas (values range from 0 to 1).
So now I have three variables: i, j, and index.
value1 = np.exp(-((index - m1[i]) ** 2.) / s1[i]** 2.)
value2 = np.exp(-((index - m2[j]) ** 2.) / s2[j]** 2.)
m1 = [1, 3, 5]; s1 = [0.2, 0.4, 0.5]  # first graph
m2 = [3, 5, 7]; s2 = [0.5, 0.5, 1.0]  # second graph
I want to find the maximum of the total value,
e.g. value1 + value2 = 1 + 1 = 2, where one of the solutions is i = 2, j = 1, index = 5.
Or could I do this with another module?
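For a problem this small, one option is an exhaustive search with NumPy broadcasting instead of a learned one-hot selection. The sketch below reads the two sigma lists as `s1` and `s2` and assumes `index` ranges over the integers 0 to 8; that grid is a hypothetical choice, since the question does not state the range.

```python
import numpy as np

m1, s1 = np.array([1.0, 3.0, 5.0]), np.array([0.2, 0.4, 0.5])
m2, s2 = np.array([3.0, 5.0, 7.0]), np.array([0.5, 0.5, 1.0])
index = np.arange(0, 9, dtype=float)   # hypothetical candidate grid: 0, 1, ..., 8

# value1[i, x] and value2[j, x] follow the formulas in the question.
value1 = np.exp(-((index[None, :] - m1[:, None]) ** 2) / s1[:, None] ** 2)
value2 = np.exp(-((index[None, :] - m2[:, None]) ** 2) / s2[:, None] ** 2)

# total[i, j, x] = value1[i, x] + value2[j, x] for every combination.
total = value1[:, None, :] + value2[None, :, :]
best = total.max()
# Each row is (i, j, position in `index`) of a maximiser.
solutions = np.argwhere(np.isclose(total, best))

print(best)                # 2.0
print(solutions.tolist())  # [[1, 0, 3], [2, 1, 5]]
```

The second row corresponds to the solution named in the question (i = 2, j = 1, index = 5); the first is the other point where both means coincide with the index.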

Tensorflow: When using slim.dataset.Dataset, is there a way to map label ID values to other values?

dataset = slim.dataset.Dataset(...)
provider = slim.dataset_data_provider.DatasetDataProvider(dataset, ...)
image, labels = provider.get(['image', 'label'])
Let's say, for an example in a dataset A, labels could be [1, 2, 1, 3]. However, for some reason (e.g, due to dataset B), I would like to map the label IDs to other values. The mapping could be like below.
# {old_label: target_label}
mapping = {0: 0, 1: 2, 2: 2, 3: 2, 4: 2, 5: 3, 6: 1}
So far I can think of two approaches:
-- tf.data.Dataset seems to have a map(map_func) function that every example passes through, which could be the solution. However, I am more familiar with slim.dataset.Dataset. Is there a similar trick for slim.dataset.Dataset?
-- I was wondering whether I can simply apply a mapping function to the label tensor, such as:
new_labels = tf.map_fn(lambda x: x+1, labels, dtype=tf.int32)
# labels = [1 2 1 3] --> new_labels = [2 3 2 4]. This works.
new_labels = tf.map_fn(lambda x: mapping[x], labels, dtype=tf.int32)
# This is what I want, but it does not work!
The first call works, but the second one, which is what I actually need, does not. Could anyone please advise?
I think you can try tf.contrib.lookup:
keys = list(mapping.keys())
values = [mapping[k] for k in keys]
table = tf.contrib.lookup.HashTable(
    tf.contrib.lookup.KeyValueTensorInitializer(keys, values, key_dtype=tf.int64, value_dtype=tf.int64), -1
)
new_labels = table.lookup(labels)
sess=tf.Session()
sess.run(table.init)
print(sess.run(new_labels))
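When the label IDs are small non-negative integers, as in the question's mapping, a dense lookup array is a possible alternative to a hash table. The NumPy sketch below illustrates the idea; the name `lut` and the use of -1 for unmapped IDs are choices made here, mirroring the default value in the answer above. In TensorFlow the same array could presumably be indexed with tf.gather, though that is not tested here.

```python
import numpy as np

# {old_label: target_label}, copied from the question.
mapping = {0: 0, 1: 2, 2: 2, 3: 2, 4: 2, 5: 3, 6: 1}

# Dense lookup table: lut[old_label] == target_label, and -1 marks unmapped IDs.
lut = np.full(max(mapping) + 1, -1, dtype=np.int64)
lut[list(mapping.keys())] = list(mapping.values())

labels = np.array([1, 2, 1, 3])
new_labels = lut[labels]
print(new_labels.tolist())  # [2, 2, 2, 2]
```

Labels larger than the largest key would be out of range for `lut`, so this sketch assumes every label appears in the mapping's key range.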

How to find an index of the first matching element in TensorFlow

I am looking for a TensorFlow way of implementing something similar to Python's list.index() function.
Given a matrix and a value to find, I want to know the first occurrence of the value in each row of the matrix.
For example,
# m is a <batch_size, 100> matrix of integers
val = 23
result = [0] * batch_size
for i, row_elems in enumerate(m):
    result[i] = row_elems.index(val)
I cannot assume that 'val' appears only once in each row, otherwise I would have implemented it using tf.argmax(m == val). In my case, it is important to get the index of the first occurrence of 'val' and not any.
It seems that tf.argmax works like np.argmax (according to the test), which will return the first index when there are multiple occurrences of the max value.
You can use tf.argmax(tf.cast(tf.equal(m, val), tf.int32), axis=1) to get what you want. However, currently the behavior of tf.argmax is undefined in case of multiple occurrences of the max value.
If you are worried about undefined behavior, you can apply tf.argmin to the return value of tf.where, as @Igor Tsvetkov suggested.
For example,
# test with tensorflow r1.0
import tensorflow as tf
val = 3
m = tf.placeholder(tf.int32)
m_feed = [[0, 0, val, 0, val],
          [val, 0, val, val, 0],
          [0, val, 0, 0, 0]]
# tf.where returns the (row, column) coordinates of every match, ordered by row.
tmp_indices = tf.where(tf.equal(m, val))
# Take the smallest column index within each row.
result = tf.segment_min(tmp_indices[:, 1], tmp_indices[:, 0])
with tf.Session() as sess:
    print(sess.run(result, feed_dict={m: m_feed}))  # [2, 0, 1]
Note that tf.segment_min will raise InvalidArgumentError when some row contains no val. In your code, row_elems.index(val) will also raise an exception when row_elems does not contain val.
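For comparison, here is a NumPy sketch of the argmax-on-a-boolean-mask idea that also covers rows without a match. np.argmax is documented to return the first occurrence of the maximum, so the tie-breaking concern does not apply to NumPy. The helper name `first_index_per_row` and the `missing=-1` convention are assumptions made for this illustration.

```python
import numpy as np

def first_index_per_row(m, val, missing=-1):
    """Column of the first occurrence of `val` in each row, or `missing` if absent."""
    hits = np.asarray(m) == val
    # argmax over a boolean row returns the position of the first True.
    first = hits.argmax(axis=1)
    # Rows with no True at all would otherwise report 0, so mask them explicitly.
    return np.where(hits.any(axis=1), first, missing)

m = [[0, 0, 3, 0, 3],
     [3, 0, 3, 3, 0],
     [0, 3, 0, 0, 0],
     [0, 0, 0, 0, 0]]
print(first_index_per_row(m, 3).tolist())  # [2, 0, 1, -1]
```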
Looks a little ugly but works (assuming m and val are both tensors):
# Note: tf.unpack and tf.pack were renamed to tf.unstack and tf.stack in TensorFlow 1.0.
idx = list()
for t in tf.unpack(m, axis=0):
    idx.append(tf.reduce_min(tf.where(tf.equal(t, val))))
idx = tf.pack(idx, axis=0)
EDIT:
As Yaroslav Bulatov mentioned, you could achieve the same result with tf.map_fn:
def index1d(t):
    return tf.reduce_min(tf.where(tf.equal(t, val)))
idx = tf.map_fn(index1d, m, dtype=tf.int64)
Here is another solution to the problem, assuming there is a hit on every row.
import tensorflow as tf
val = 3
m = tf.constant([
    [0, 0, val, 0, val],
    [val, 0, val, val, 0],
    [0, val, 0, 0, 0]])
# Replace every entry with its column index where it matches, or with an out-of-range sentinel otherwise.
match_indices = tf.where(                             # [[5, 5, 2, 5, 4],
    tf.equal(val, m),                                 #  [0, 5, 2, 3, 5],
    x=tf.range(tf.shape(m)[1]) * tf.ones_like(m),     #  [5, 1, 5, 5, 5]]
    y=(tf.shape(m)[1]) * tf.ones_like(m))
result = tf.reduce_min(match_indices, axis=1)
with tf.Session() as sess:
    print(sess.run(result))  # [2, 0, 1]
Here is a solution that also handles the case where the element does not appear in a row (taken from a DeepMind GitHub repository):
def get_first_occurrence_indices(sequence, eos_idx):
    '''
    args:
        sequence: [batch, length]
        eos_idx: scalar
    '''
    batch_size, maxlen = sequence.get_shape().as_list()
    eos_idx = tf.convert_to_tensor(eos_idx)
    # Append eos_idx to every row so that each row has at least one match.
    tensor = tf.concat(
        [sequence, tf.tile(eos_idx[None, None], [batch_size, 1])], axis=-1)
    index_all_occurrences = tf.where(tf.equal(tensor, eos_idx))
    index_all_occurrences = tf.cast(index_all_occurrences, tf.int32)
    index_first_occurrences = tf.segment_min(index_all_occurrences[:, 1],
                                             index_all_occurrences[:, 0])
    index_first_occurrences.set_shape([batch_size])
    # Convert the index to a length (index + 1), capped at maxlen.
    index_first_occurrences = tf.minimum(index_first_occurrences + 1, maxlen)
    return index_first_occurrences
Usage:
import tensorflow as tf
mat = tf.Variable([[1,2,3,4,5], [2,3,4,5,6], [3,4,5,6,7], [0,0,0,0,0]], dtype=tf.int32)
idx = 3
first_occurrences = get_first_occurrence_indices(mat, idx)
sess = tf.InteractiveSession()
sess.run(tf.global_variables_initializer())
sess.run(first_occurrences)  # [3, 2, 1, 5]
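The returned values are lengths up to and including the first match (index + 1), capped at the row length when there is no match. A NumPy sketch of the same semantics, using the column-index-or-sentinel trick from the earlier answer, might look as follows; the function name `length_through_first` is hypothetical and this is not the DeepMind implementation.

```python
import numpy as np

def length_through_first(sequence, eos_idx):
    """Per-row length up to and including the first eos_idx, capped at the row length."""
    sequence = np.asarray(sequence)
    maxlen = sequence.shape[1]
    cols = np.arange(maxlen)
    # Matching positions keep their column index; all others get the sentinel maxlen.
    first = np.where(sequence == eos_idx, cols, maxlen).min(axis=1)
    # Turn the index into a length and cap it for rows without any match.
    return np.minimum(first + 1, maxlen)

mat = [[1, 2, 3, 4, 5], [2, 3, 4, 5, 6], [3, 4, 5, 6, 7], [0, 0, 0, 0, 0]]
print(length_through_first(mat, 3).tolist())  # [3, 2, 1, 5]
```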