Indexing per row in TensorFlow - tensorflow

I have a matrix:
Params =
[[ 0 1 2 3]
[ 4 5 6 7]
[ 8 9 10 11]]
For each row I want to select some elements using column indices:
col_indices =
[[0 1]
[1 2]
[2 3]]
In Numpy, I can create row indices:
row_indices =
[[0 0]
[1 1]
[2 2]]
and do params[row_indices, col_indices]
In TensorFlow, I did this:
tf_params = tf.constant(params)
tf_col_indices = tf.constant(col_indices, dtype=tf.int32)
tf_row_indices = tf.constant(row_indices, dtype=tf.int32)
tf_params[row_indices, col_indices]
But this raised an error:
ValueError: Shape must be rank 1 but is rank 3
What does it mean? How should I do this kind of indexing properly?
Thanks!

Tensor rank (sometimes referred to as order or degree or n-dimension) is the number of dimensions of the tensor. For example, the following tensor (defined as a Python list) has a rank of 2:
t = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
A rank two tensor is what we typically think of as a matrix, a rank one tensor is a vector. For a rank two tensor you can access any element with the syntax t[i, j]. For a rank three tensor you would need to address an element with t[i, j, k]. See this for more details.
ValueError: Shape must be rank 1 but is rank 3 means that an operation expected a rank-1 tensor (a vector) but was given a rank-3 tensor (a cube of numbers).
To see how you can declare tensor constants of different shape, you can see this.
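For the per-row selection the question asks about, one possible approach is to pair every column index with its row index. The sketch below shows this in NumPy, since that is where the question starts; in TensorFlow, tf.gather_nd takes index pairs of the same form (stacked with tf.stack, for example), but that mapping is an assumption here and not something the answer above states. The variable names are illustrative.

```python
import numpy as np

params = np.arange(12).reshape(3, 4)
col_indices = np.array([[0, 1], [1, 2], [2, 3]])

# Build row indices with the same shape as col_indices: row i repeated per column.
row_indices = np.broadcast_to(
    np.arange(params.shape[0])[:, None], col_indices.shape
)

# NumPy fancy indexing pairs the two index arrays element by element.
selected = params[row_indices, col_indices]

# The same selection expressed as explicit (row, col) pairs, shape (3, 2, 2).
pairs = np.stack([row_indices, col_indices], axis=-1)
selected_from_pairs = params[pairs[..., 0], pairs[..., 1]]

print(selected)
# [[ 0  1]
#  [ 5  6]
#  [10 11]]
```

The last axis of pairs holds one (row, col) coordinate per output element, which is the layout an N-dimensional gather expects.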

Related

Averaging until an index value that corresponds with the value of another array in Numpy

I have one array in which the values should be averaged until the day that is given as a value in another array. The first array has 365 days as the first axis, and the second array corresponds to specific julian dates, ranging from 0 to 365, from which the value from the first array should be averaged.
array1.shape = (365, 375, 700)
array2.shape = (375, 700)
The resultant array naturally will have the same shape as the second array that is used for averaging the first array. Is there an easy way to do this? Maybe with some for loops or with vectorization/broadcasting?
Thanks in advance!
You can use numpy.cumsum to calculate the cumulative sum along axis=0. Taking the cumulative sum at index k and dividing it by k + 1 then gives the average up to and including that index.
import numpy as np
def averages(a, b):
    return a.cumsum(axis=0)[
        b.ravel(),
        np.repeat(np.arange(b.shape[0]), b.shape[1]),
        np.tile(np.arange(b.shape[1]), b.shape[0]),
    ].reshape(b.shape) / (b + 1)
a = np.arange(12).reshape(3, 2, 2)
b = np.array([[0, 1], [1, 2]])
print(a)
# [[[ 0 1]
# [ 2 3]]
# [[ 4 5]
# [ 6 7]]
# [[ 8 9]
# [10 11]]]
print(b)
# [[0 1]
# [1 2]]
print(averages(a, b))
# [[0. 3.]
# [4. 7.]]
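The same cumulative-sum idea can also be written with np.take_along_axis, which avoids building the repeat/tile index arrays by hand. This is only an alternative sketch; the function name averages_take is made up for illustration.

```python
import numpy as np

def averages_take(a, b):
    # Running sum along the day axis (axis 0).
    running = a.cumsum(axis=0)
    # b[None] has shape (1, rows, cols); this picks running[b[i, j], i, j] per cell.
    picked = np.take_along_axis(running, b[None], axis=0)[0]
    # b + 1 is the number of days included, because index 0 is the first day.
    return picked / (b + 1)

a = np.arange(12).reshape(3, 2, 2)
b = np.array([[0, 1], [1, 2]])
result = averages_take(a, b)
print(result)
# [[0. 3.]
#  [4. 7.]]
```

Both versions assume every value in the second array is a valid index along the first axis (0 to 364 for 365 days); a value of 365 would have to be clipped or handled separately.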

How to get those rows having the equal value and their subscript if there is a [10,1] tensor?

I am new to TensorFlow. Given a [10,1] tensor, I want to find all rows that share the same value, together with their indices (subscripts).
For example, there is a tensor like [[1],[2],[3],[4],[5],[1],[2],[3],[4],[6]].
By comparing each element in the matrix, it is easy to build a dictionary like
{'1': [0, 5], '2': [1, 6], '3': [2, 7], '4': [3, 8], '5': [4], '6': [9]} in Python, which records where (and therefore how many times) each element occurs in the matrix.
I expect to achieve this result in TensorFlow. Could someone please give me a hand? Thanks a lot.
This is a fairly long method, and the elements and indices still are not associated in a single data structure.
There are probably shorter ways.
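import tensorflow as tf
t = tf.constant([[1],[2],[3],[4],[5],[1],[2],[3],[4],[6]])
sess = tf.InteractiveSession()
sess.run(tf.global_variables_initializer())
# Unique values, each element's index into them, and per-value counts
y, idx, count = tf.unique_with_counts(tf.squeeze(t))
y1, idx1, count1 = sess.run([y, idx, count])
for i in range(len(y1)):
    # tf.where returns [row, col] pairs; column 0 holds the row indices
    print(sess.run(tf.where(tf.equal(t, y1[i]))[:, 0]))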
Output is
[0 5]
[1 6]
[2 7]
[3 8]
[4]
[9]
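If the values can be pulled out of the graph as a NumPy array (for example, the result of a sess.run call), the dictionary from the question can be assembled in plain Python. This is a sketch under that assumption and not a pure-TensorFlow solution; the name positions is illustrative.

```python
import numpy as np

# Stand-in for the evaluated [10, 1] tensor.
t = np.array([[1], [2], [3], [4], [5], [1], [2], [3], [4], [6]])
flat = t.ravel()

# Map each distinct value to the list of row indices where it occurs.
positions = {
    int(v): np.flatnonzero(flat == v).tolist()
    for v in np.unique(flat)
}

print(positions)
# {1: [0, 5], 2: [1, 6], 3: [2, 7], 4: [3, 8], 5: [4], 6: [9]}
```

The length of each list is the number of occurrences of that value.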

Intersection between two tensors of different lengths

I have a tensorflow situation. I want to find the intersection of two 2-D tensors which have different shapes.
Example:
object_ids_ [[0 0]
[0 1]
[1 1]]
object_ids_more_07_ [[0 0]
[0 1]
[0 2]
[1 0]
[1 2]]
The output I am looking for is:
[[0,0],
[0,1]]
I came across "tf.sets.set_intersection", tensorflow page: https://www.tensorflow.org/api_docs/python/tf/sets/set_intersection
But couldn't perform it for tensors with different shapes. Another implementation I found is at:
Find the intersection of two tensors. Return the sorted, unique values that are in both of the input tensors
but had a hard time replicating it for 2D tensors.
Any help would be appreciated, thanks.
One way is to compute the sum of absolute differences between every pair of rows (one row from each tensor) and then take the indices where that sum is zero. Broadcasting makes this straightforward.
a = tf.constant([[0,0],[0,1],[1,1]])
b = tf.constant([[0, 0],[0, 1],[0,2],[1, 0],[1, 2]])
find_match = tf.reduce_sum(tf.abs(tf.expand_dims(b,0) - tf.expand_dims(a,1)),2)
indices = tf.transpose(tf.where(tf.equal(find_match, tf.zeros_like(find_match))))[0]
out = tf.gather(a, indices)
with tf.Session() as sess:
    print(sess.run(out))
#Output
#[[0 0]
#[0 1]]
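The broadcasting step is easier to inspect outside a session. The sketch below reproduces the same idea in NumPy as an illustration of the technique, not a replacement for the TensorFlow code; the intermediate names are illustrative.

```python
import numpy as np

a = np.array([[0, 0], [0, 1], [1, 1]])
b = np.array([[0, 0], [0, 1], [0, 2], [1, 0], [1, 2]])

# Shape (3, 1, 2) minus shape (1, 5, 2) broadcasts to (3, 5, 2):
# one difference vector per (row of a, row of b) pair.
diff = np.abs(a[:, None, :] - b[None, :, :])

# Sum over the last axis; a zero means the two rows are identical.
dist = diff.sum(axis=2)

# Keep the rows of a that match at least one row of b.
match = (dist == 0).any(axis=1)
out = a[match]

print(out)
# [[0 0]
#  [0 1]]
```

The intermediate array has one entry per pair of rows, so memory grows with the product of the two row counts.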

How to get a dense representation of one-hot vectors

Suppose a Tensor containing :
[[0 0 1]
[0 1 0]
[1 0 0]]
How to get the dense representation in a native way (without using numpy or iterations) ?
[2,1,0]
There is tf.one_hot() to do the inverse, there is also tf.sparse_to_dense() that seems to do it but I was not able to figure out how to use it.
tf.argmax(x, axis=1) should do the job.
vec = tf.constant([[0, 0, 1], [0, 1, 0], [1, 0, 0]])
locations = tf.where(tf.equal(vec, 1))
# This gives array of locations of "1" indices below
# => [[0, 2], [1, 1], [2, 0]]
# strip first column
indices = locations[:,1]
sess = tf.Session()
print(sess.run(indices))
# => [2 1 0]
TensorFlow does not have a native dense to sparse conversion function/helper. Given that the input array is a dense tensor, such as the one you provided, you can define a function to convert a dense tensor to a sparse tensor.
def dense_to_sparse(dense_tensor):
    # [row, col] coordinates of every non-zero entry
    where_dense_non_zero = tf.where(tf.not_equal(dense_tensor, 0))
    indices = where_dense_non_zero
    values = tf.gather_nd(dense_tensor, where_dense_non_zero)
    shape = dense_tensor.get_shape()
    return tf.SparseTensor(
        indices=indices,
        values=values,
        dense_shape=shape  # this argument was called `shape` before TensorFlow 1.0
    )
This helper function finds the indices and values where the Tensor is non-zero and outputs a Sparse tensor with those indices and values. Additionally, the shape is effectively copied over.
You do not want to use tf.sparse_to_dense as that gives you the opposite representation. If you want your output to be [2, 1, 0] instead, you'll need to index the indices. First, you'll need the indices where the array isn't 0:
indices = tf.where(tf.not_equal(dense_tensor, 0))
Then, you'll need to access the tensor using slicing/indexing:
output = indices[:, 1]
You might notice that 1 in the slice above is equivalent to the dimension of the tensor - 1. Therefore, to make these value generic, you could do something like:
output = indices[:, len(dense_tensor.get_shape()) - 1]
Although I'm not exactly sure what you'd do with these values (the value of the column where the value is). Hope this helped!
EDIT: Yaroslav's answer is better if you're looking for the indices/locations where the input tensor is 1; it won't be extensible to tensors with values other than 0 and 1, if that is required.
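The difference between the argmax approach and the where-based approach shows up when a row is not strictly one-hot. The NumPy sketch below mirrors both on small arrays; it assumes that np.argmax and np.nonzero behave like their TensorFlow counterparts here, and the extra array with_empty_row is made up for illustration.

```python
import numpy as np

one_hot = np.array([[0, 0, 1], [0, 1, 0], [1, 0, 0]])

# argmax returns exactly one column index per row.
via_argmax = np.argmax(one_hot, axis=1)

# nonzero returns (row_indices, col_indices) for every non-zero entry.
rows, via_nonzero = np.nonzero(one_hot)

print(via_argmax)   # [2 1 0]
print(via_nonzero)  # [2 1 0]

# A row with no 1 at all: argmax still reports column 0,
# while nonzero simply has no entry for that row.
with_empty_row = np.array([[0, 0, 1], [0, 0, 0]])
print(np.argmax(with_empty_row, axis=1))  # [2 0]
print(np.nonzero(with_empty_row)[1])      # [2]
```

For strictly one-hot input the two agree; for anything else, the output lengths can differ.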

tensorflow transform a (structured) dense matrix to sparse, when number of rows unknown

My task is to transform a specially formed dense matrix tensor into a sparse one. For example, take the input matrix M as follows (in each row, a dense sequence of positive integers followed by 0 as padding):
[[3 5 7 0]
[2 2 0 0]
[1 3 9 0]]
Additionally, given the non-padding length for each row, e.g. given by tensor L =
[3, 2, 3].
The desired output would be sparse tensor S.
SparseTensorValue(indices=array([[0, 0],[0, 1],[0, 2],[1, 0],[1, 1],[2, 0],[2, 1], [2, 2]]), values=array([3, 5, 7, 2, 2, 1, 3, 9], dtype=int32), shape=array([3, 4]))
This is useful in models where objects are described by variable-sized descriptors (S are then used in embedding_lookup_sparse to connect embeddings of descriptors.)
I am able to do it when the number of rows in M is known (with a Python loop and ops like slice and concat). However, the number of rows here is determined by the mini-batch size and could change (say, in the testing phase). Is there a good way to implement that? I have tried some control_flow_ops but haven't succeeded.
Thanks!!
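One way to avoid depending on the row count is to build a boolean mask from the lengths and read the indices and values off that mask. The NumPy sketch below illustrates the idea only; it assumes that a TensorFlow version could be assembled from tf.sequence_mask, tf.where and tf.gather_nd, which has not been verified against this question's setup.

```python
import numpy as np

M = np.array([[3, 5, 7, 0],
              [2, 2, 0, 0],
              [1, 3, 9, 0]])
L = np.array([3, 2, 3])

# mask[i, j] is True when column j lies inside the non-padding part of row i.
mask = np.arange(M.shape[1])[None, :] < L[:, None]

# Coordinates and values of the kept entries, in row-major order.
indices = np.argwhere(mask)
values = M[mask]
dense_shape = M.shape

print(indices.tolist())
# [[0, 0], [0, 1], [0, 2], [1, 0], [1, 1], [2, 0], [2, 1], [2, 2]]
print(values.tolist())
# [3, 5, 7, 2, 2, 1, 3, 9]
```

Nothing in the mask construction refers to a fixed number of rows, so the same code handles any batch size.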