What does tf.gather_nd intuitively do? - tensorflow

Can you intuitively explain or give more examples about tf.gather_nd for indexing and slicing into high-dimensional tensors in Tensorflow?
I read the API, but it is kept quite concise that I find myself hard to follow the function's concept.

Ok, so think about it like this:
You are providing a list of index values to index the provided tensor to get those slices. The first dimension of the indices you provide is for each index you will perform. Let's pretend that tensor is just a list of lists.
[[0]] means you want to get one specific slice(list) at index 0 in the provided tensor. Just like this:
[tensor[0]]
[[0], [1]] means you want get two specific slices at indices 0 and 1 like this:
[tensor[0], tensor[1]]
Now what if tensor is more than one dimensions? We do the same thing:
[[0, 0]] means you want to get one slice at index [0,0] of the 0-th list. Like this:
[tensor[0][0]]
[[0, 1], [2, 3]] means you want return two slices at the indices and dimensions provided. Like this:
[tensor[0][1], tensor[2][3]]
I hope that makes sense. I tried using Python indexing to help explain how it would look in Python to do this to a list of lists.

You provide a tensor and indices representing locations in that tensor. It returns the elements of the tensor corresponding to the indices you provide.
EDIT: An example
import tensorflow as tf
sess = tf.Session()
x = [[1,2,3],[4,5,6]]
y = tf.gather_nd(x, [[1,1],[1,2]])
print(sess.run(y))
[5, 6]

Related

using gather on argmax is different than taking max

I'm trying to learn to train a double-DQN algorithm on tensorflow and it doesn't work. to make sure everything is fine I wanted to test something. I wanted to make sure that using tf.gather on the argmax is exactly the same as taking the max: let's say I have a network called target_network:
first let's take the max:
next_qvalues_target1 = target_network.get_symbolic_qvalues(next_obs_ph) #returns tensor of qvalues
next_state_values_target1 = tf.reduce_max(next_qvalues_target1, axis=1)
let's try it in a different way- using argmax and gather:
next_qvalues_target2 = target_network.get_symbolic_qvalues(next_obs_ph) #returns same tensor of qvalues
chosen_action = tf.argmax(next_qvalues_target2, axis=1)
next_state_values_target2 = tf.gather(next_qvalues_target2, chosen_action)
diff = tf.reduce_sum(next_state_values_target1) - tf.reduce_sum(next_state_values_target2)
next_state_values_target2 and next_state_values_target1 are supposed to be completely identical. so running the session should output diff = . but it does not.
What am I missing?
Thanks.
Found out what went wrong. chosen action is of shape (n, 1) so I thought that using gather on a variable that's (n, 4) I'll get a result of shape (n, 1). turns out this isn't true. I needed to turn chosen_action to be a variable of shape (n, 2)- instead of [action1, action2, action3...] I needed it to be [[1, action1], [2, action2], [3, action3]....] and use gather_nd to be able to take specific elements from next_qvalues_target2 and not gather, because gather takes complete rows.

Logical AND/OR in Keras Backend

Tensorflow has tf.logical_and() and tf.logical_or() for comparison of two boolean tensors, i.e. tf.logical_and(x,y)==TRUE if x==TRUE and y==TRUE (doc). I can't find anything like this in the Keras backend though. They have keras.backend.any() and .all(), but this is for aggregation within a tensor, not between. I've been having to use workarounds with nested K.switch() functions, but it is painfully inelegant.
Let x and y be boolean keras tensors of the same shape.
To take elementwise or, do the following:
keras.backend.any(keras.backend.stack([x, y], axis=0), axis=0)
To take elementwise and, do the following:
keras.backend.all(keras.backend.stack([x, y], axis=0), axis=0)
Here keras.backend.stack([x, y], axis=0) stacks x and y into a new tensor with an additional dimension at number 0. After that keras.backend.any takes a logical or along the new dimension, and keras.backend.any takes the logical and.
My solution (perhaps not the best, because I haven't found others either), is:
A = K.cast(someBooleanTensor, K.floatx())
B = K.cast(anotherBooleanTensor, K.floatx())
A_and_B = A * B #this is also something I use a lot for gathering elements
A_or_B = 1 -((1-A)*(1-B))
But thinking about it now... I never tested python operators... perhaps they work?

Efficient axis-wise cartesian product of multiple 2D matrices with Numpy or TensorFlow

So first off, I think what I'm trying to achieve is some sort of Cartesian product but elementwise, across the columns only.
What I'm trying to do is, if you have multiple 2D arrays of size [ (N,D1), (N,D2), (N,D3)...(N,Dn) ]
The result is thus to be a combinatorial product across axis=1 such that the final result will then be of shape (N, D) where D=D1*D2*D3*...Dn
e.g.
A = np.array([[1,2],
[3,4]])
B = np.array([[10,20,30],
[5,6,7]])
cartesian_product( [A,B], axis=1 )
>> np.array([[ 1*10, 1*20, 1*30, 2*10, 2*20, 2*30 ]
[ 3*5, 3*6, 3*7, 4*5, 4*6, 4*7 ]])
and extendable to cartesian_product([A,B,C,D...], axis=1)
e.g.
A = np.array([[1,2],
[3,4]])
B = np.array([[10,20],
[5,6]])
C = np.array([[50, 0],
[60, 8]])
cartesian_product( [A,B,C], axis=1 )
>> np.array([[ 1*10*50, 1*10*0, 1*20*50, 1*20*0, 2*10*50, 2*10*0, 2*20*50, 2*20*0]
[ 3*5*60, 3*5*8, 3*6*60, 3*6*8, 4*5*60, 4*5*8, 4*6*60, 4*6*8]])
I have a working solution that essentially creates an empty (N,D) matrix and then broadcasting a vector columnwise product for each column within nested for loops for each matrix in the provided list. Clearly is horrible once the arrays get larger!
Is there an existing solution within numpy or tensorflow for this? Potentially one that is efficiently paralleizable (A tensorflow solution would be wonderful but a numpy is ok and as long as the vector logic is clear then it shouldn't be hard to make a tf equivalent)
I'm not sure if I need to use einsum, tensordot, meshgrid or some combination thereof to achieve this. I have a solution but only for single-dimension vectors from https://stackoverflow.com/a/11146645/2123721 even though that solution says to work for arbitrary dimensions array (which appears to mean vectors). With that one i can do a .prod(axis=1), but again this is only valid for vectors.
thanks!
Here's one approach to do this iteratively in an accumulating manner making use of broadcasting after extending dimensions for each pair from the list of arrays for elmentwise multiplications -
L = [A,B,C] # list of arrays
n = L[0].shape[0]
out = (L[1][:,None]*L[0][:,:,None]).reshape(n,-1)
for i in L[2:]:
out = (i[:,None]*out[:,:,None]).reshape(n,-1)

repmat with interlace or Kronecker product in Tensorflow

Suppose I have a tensor:
A=[[1,2,3],[4,5,6]]
Which is a matrix with 2 rows and 3 columns.
I would like to replicate it, suppose twice, to get the following tensor:
A2 = [[1,2,3],
[1,2,3],
[4,5,6],
[4,5,6]]
Using tf.repmat will clearly replicate it differently, so I tried the following code (which works):
A_tiled = tf.reshape(tf.tile(A, [1, 2]), [4, 3])
Unfortunately, it seems to be working very slow when the number of columns become large. Executing it in Matlab using Kronecker product with a vector of ones (Matlab's "kron") seems to be much faster.
Can anyone help?

How to expand a Tensorflow Variable

Is there any way to make a Tensorflow Variable larger? Like, let's say I wanted to add a neuron to a layer of a neural network in the middle of training. How would I go about doing that? An answer in This question told me how to change the shape of the variable, to expand it to fit another row of weights, but I don't know how to initialize those new weights.
I figure another way of going about this might involve combining variables, as in initializing the weights first in a second variable and then adding that in as a new row or column of the first variable, but I can't find anything that lets me do that either.
There are various ways you could accomplish this.
1) The second answer in that post (https://stackoverflow.com/a/33662680/5548115) explains how you can change the shape of a variable by calling 'assign' with validate_shape=False. For example, you could do something like
# Assume var is [m, n]
# Add the new 'data' of shape [1, n] with new values
new_neuron = tf.constant(...)
# If concatenating to add a row, concat on the first dimension.
# If new_neuron was [m, 1], you would concat on the second dimension.
new_variable_data = tf.concat(0, [var, new_neuron]) # [m+1, n]
resize_var = tf.assign(var, new_variable_data, validate_shape=False)
Then when you run resize_var, the data pointed to by 'var' will now have the updated data.
2) You could also create a large initial variable, and call tf.slice on different regions of the variable as training progresses, since you can dynamically change the 'begin' and 'size' attributes of slice.
Simply using tf.concat for expand a Tensorflow Variable,you can see the api_docs
for detail.
v1 = tf.Variable(tf.zeros([5,3]),dtype=tf.float32)
v2 = tf.Variable(tf.zeros([1,3]),dtype=tf.float32)
v3 = tf.concat(0,[v1, v2])
Figured it out. It's kind of a roundabout process, but it's the only one I can tell that actually functions. You need to first unpack the variables, then append the new variable to the end, then pack them back together.
If you're expanding along the first dimension, it's rather short: only 7 lines of actual code.
#the first variable is 5x3
v1 = tf.Variable(tf.zeros([5, 3], dtype=tf.float32), "1")
#the second variable is 1x3
v2 = tf.Variable(tf.zeros([1, 3], dtype=tf.float32), "2")
#unpack the first variable into a list of size 3 tensors
#there should be 5 tensors in the list
change_shape = tf.unpack(v1)
#unpack the second variable into a list of size 3 tensors
#there should be 1 tensor in this list
change_shape_2 = tf.unpack(v2)
#for each tensor in the second list, append it to the first list
for i in range(len(change_shape_2)):
change_shape.append(change_shape_2[i])
#repack the list of tensors into a single tensor
#the shape of this resultant tensor should be [6, 3]
final = tf.pack(change_shape)
If you want to expand along the second dimension, it gets somewhat longer.
#First variable, 5x3
v3 = tf.Variable(tf.zeros([5, 3], dtype=tf.float32))
#second variable, 5x1
v4 = tf.Variable(tf.zeros([5, 1], dtype=tf.float32))
#unpack tensors into lists of size 3 tensors and size 1 tensors, respectively
#both lists will hold 5 tensors
change = tf.unpack(v3)
change2 = tf.unpack(v4)
#for each tensor in the first list, unpack it into its own list
#this should make a 2d array of size 1 tensors, array will be 5x3
changestep2 = []
for i in range(len(change)):
changestep2.append(tf.unpack(change[i]))
#do the same thing for the second tensor
#2d array of size 1 tensors, array will be 5x1
change2step2 = []
for i in range(len(change2)):
change2step2.append(tf.unpack(change2[i]))
#for each tensor in the array, append it onto the corresponding array in the first list
for j in range(len(change2step2[i])):
changestep2[i].append(change2step2[i][j])
#pack the lists in the array back into tensors
changestep2[i] = tf.pack(changestep2[i])
#pack the list of tensors into a single tensor
#the shape of this resultant tensor should be [5, 4]
final2 = tf.pack(changestep2)
I don't know if there's a more efficient way of doing this, but this works, as far as it goes. Changing further dimensions would require more layers of lists, as necessary.