How to perform tf.nn.softmax over two selected dimensions in TensorFlow?

I want to apply tf.nn.softmax() over two selected dimensions of a tensor with shape (batch_size=?, height, width, channel).
But it seems tf.nn.softmax() cannot accept two axes at the same time: calling tf.nn.softmax(tensor, axis=[1, 2]) raises an axis error in TensorFlow.
How can I implement this elegantly and in a vectorized way? thx :D

Instead of passing two dimensions at a time, I would first reshape the input accordingly, e.g.:
array = tf.constant([[1., 2.], [3., 4.]])
tf.nn.softmax(array, axis=0) # Calculate for axis 0
tf.nn.softmax(array, axis=1) # Calculate for axis 1
tf.nn.softmax(tf.reshape(array, [-1])) # Calculate for both axes
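For the 4-D case from the question, a minimal sketch (assuming the softmax should be joint over height and width but separate per batch element and channel; the sizes below are only examples) could look like:

import tensorflow as tf

# hypothetical logits with shape [batch_size, height, width, channels]; sizes are examples
H, W, C = 8, 8, 3
logits = tf.placeholder(tf.float32, [None, H, W, C])

# merge the two spatial axes, apply softmax over the merged axis, then restore the shape
flat = tf.reshape(logits, [-1, H * W, C])
prob = tf.nn.softmax(flat, axis=1)      # joint softmax over height and width
prob = tf.reshape(prob, [-1, H, W, C])
# sanity check: tf.reduce_sum(prob, axis=[1, 2]) should be all ones, with shape [batch_size, channels]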

You can do
import numpy as np
import tensorflow as tf

array = np.random.rand(1, 2, 2, 1)
s1 = tf.nn.softmax(array, axis=1)   # softmax along the height axis
s2 = tf.nn.softmax(array, axis=2)   # softmax along the width axis
rs = tf.reduce_sum([s1, s2], 0)
This returns a tensor with the same shape as the initial array.

It can be done with the Keras activation function:
# logits has shape [BS, H, W, CH]
prob = tf.keras.activations.softmax(logits, axis=[1, 2])
# prob has shape [BS, H, W, CH]
check = tf.reduce_sum(prob, axis=[1, 2])
# check is a tensor of ones with shape [BS, CH]
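If your installed version does not accept a list of axes there, an equivalent formulation with raw ops is possible; the sketch below subtracts the per-map maximum only for numerical stability (shapes are example assumptions):

import tensorflow as tf

# logits has shape [BS, H, W, CH]; the sizes here are example assumptions
logits = tf.placeholder(tf.float32, [None, 8, 8, 3])

# softmax jointly over the two spatial axes, per batch element and channel
z = logits - tf.reduce_max(logits, axis=[1, 2], keepdims=True)
prob = tf.exp(z) / tf.reduce_sum(tf.exp(z), axis=[1, 2], keepdims=True)

check = tf.reduce_sum(prob, axis=[1, 2])   # tensor of ones with shape [BS, CH]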

Related

Backpropagating gradients through nested tf.map_fn

I would like to map a TensorFlow function on each vector corresponding to the depth channel of every pixel in a matrix with dimension [batch_size, H, W, n_channels].
In other words, for every image of size H x W that I have in the batch:
I extract some feature maps F_k (n_channels of them) of the same size H x W (hence, all the feature maps together form a tensor of shape [H, W, n_channels]);
then, I wish to apply a custom function to the vector v_ij associated with the i-th row and j-th column of each feature map F_k, but spanning the depth channel in its entirety (e.g. v has dimension [1 x 1 x n_channels]). Ideally, all of this would happen in parallel.
A picture explaining the process was attached to the original post. The only difference with the picture is that both input and output "receptive fields" have size 1x1 (the function is applied to each pixel independently).
This would be similar to applying a 1x1 convolution to the matrix; however, I need to apply a more general function over the depth channel, rather than a simple sum operation.
I think tf.map_fn() could be an option, and I tried the following solution, where I recursively use tf.map_fn() to access the features associated with each pixel. However, this seems sub-optimal, and most importantly it raises an error when trying to backpropagate the gradients.
Do you have any idea of the reason why this happens and how I should structure my code to avoid the error?
This is my current implementation of the function:
import tensorflow as tf
from tensorflow import layers

def apply_function_on_pixel_features(incoming):
    # at first the input is [None, W, H, n_channels]
    if len(incoming.get_shape()) > 1:
        return tf.map_fn(lambda x: apply_function_on_pixel_features(x), incoming)
    else:
        # here the input is [n_channels]
        # apply some function that applies a transformation and returns a vector of the same size
        output = my_custom_fun(incoming)  # my_custom_fun() doesn't change the shape
        return output
and the body of my code:
H = 128
W = 132
n_channels = 8
x1 = tf.placeholder(tf.float32, [None, H, W, 1])
x2 = layers.conv2d(x1, filters=n_channels, kernel_size=3, padding='same')
# now apply a function to the features vector associated to each pixel
x3 = apply_function_on_pixel_features(x2)
x4 = tf.nn.softmax(x3)
loss = cross_entropy(x4, labels)
optimizer = tf.train.AdamOptimizer(lr)
train_op = optimizer.minimize(loss) # <--- ERROR HERE!
In particular, the error is the following:
File "/home/venvs/tensorflowGPU/lib/python3.6/site-packages/tensorflow/python/ops/control_flow_ops.py", line 2481, in AddOp
self._AddOpInternal(op)
File "/home/venvs/tensorflowGPU/lib/python3.6/site-packages/tensorflow/python/ops/control_flow_ops.py", line 2509, in _AddOpInternal
self._MaybeAddControlDependency(op)
File "/home/venvs/tensorflowGPU/lib/python3.6/site-packages/tensorflow/python/ops/control_flow_ops.py", line 2547, in _MaybeAddControlDependency
op._add_control_input(self.GetControlPivot().op)
AttributeError: 'NoneType' object has no attribute 'op'
The whole error stack and the code can be found here.
Thanks for the help,
G.
Update:
Following @thushv89's suggestion, I added a possible solution to the problem below. I still don't know why my previous code didn't work; any insight on this would still be very appreciated.
@gabriele, regarding having to depend on batch_size, have you tried doing it the following way? This function does not depend on batch_size, and you can replace the map_fn with anything you like.
def apply_function_on_pixel_features(incoming):
    # get input shape:
    _, W, H, C = incoming.get_shape().as_list()
    incoming_flat = tf.reshape(incoming, shape=[-1, C])
    # apply function on every vector of shape [1, C]
    out_matrix = tf.map_fn(lambda x: x + 1, incoming_flat)  # dimension remains unchanged
    # go back to the input shape [None, W, H, C]
    out_matrix = tf.reshape(out_matrix, shape=[-1, W, H, C])
    return out_matrix
The full code I tested is below.
import numpy as np
import tensorflow as tf
from tensorflow.keras.losses import categorical_crossentropy
def apply_function_on_pixel_features(incoming):
    # get input shape:
    _, W, H, C = incoming.get_shape().as_list()
    incoming_flat = tf.reshape(incoming, shape=[-1, C])
    # apply function on every vector of shape [1, C]
    out_matrix = tf.map_fn(lambda x: x + 1, incoming_flat)  # dimension remains unchanged
    # go back to the input shape [None, W, H, C]
    out_matrix = tf.reshape(out_matrix, shape=[-1, W, H, C])
    return out_matrix

H = 32
W = 32
x1 = tf.placeholder(tf.float32, [None, H, W, 1])
labels = tf.placeholder(tf.float32, [None, 10])
x2 = tf.layers.conv2d(x1, filters=1, kernel_size=3, padding='same')
# now apply a function to the features vector associated to each pixel
x3 = apply_function_on_pixel_features(x2)
x4 = tf.layers.flatten(x3)
x4 = tf.layers.dense(x4, units=10, activation='softmax')
loss = categorical_crossentropy(labels, x4)
optimizer = tf.train.AdamOptimizer(0.001)
train_op = optimizer.minimize(loss)

x = np.zeros(shape=(10, H, W, 1))
y = np.random.choice([0, 1], size=(10, 10))

with tf.Session() as sess:
    tf.global_variables_initializer().run()
    sess.run(train_op, feed_dict={x1: x, labels: y})
Following @thushv89's suggestion, I reshaped the array, applied the function and then reshaped it back (to avoid the tf.map_fn recursion). I still don't know exactly why the previous code didn't work, but the current implementation allows the gradients to propagate back to the previous layers. I'll leave it below for whoever might be interested:
def apply_function_on_pixel_features(incoming, batch_size):
    # get input shape:
    _, W, H, C = incoming.get_shape().as_list()
    incoming_flat = tf.reshape(incoming, shape=[batch_size * W * H, C])
    # apply function on every vector of shape [1, C]
    out_matrix = my_custom_fun(incoming_flat)  # dimension remains unchanged
    # go back to the input shape [None, W, H, C]
    out_shape = tf.convert_to_tensor([batch_size, W, H, C])
    out_matrix = tf.reshape(out_matrix, shape=out_shape)
    return out_matrix
Notice that now I needed to give the batch size to correctly reshape the tensor because TensorFlow would complain if I gave None or -1 as a dimension.
Any comments and insight on the above code would still be very appreciated.
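As a possible variant (just a sketch, with my_custom_fun still standing in for the real transformation), the dynamic batch size can also be read with tf.shape inside the graph instead of being passed in explicitly:

def apply_function_on_pixel_features_dynamic(incoming):
    # static spatial/channel sizes, dynamic batch size
    _, W, H, C = incoming.get_shape().as_list()
    batch_size = tf.shape(incoming)[0]            # works with a None batch dimension
    incoming_flat = tf.reshape(incoming, shape=[-1, C])
    out_matrix = my_custom_fun(incoming_flat)     # assumed to keep the shape unchanged
    return tf.reshape(out_matrix, shape=[batch_size, W, H, C])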

How do I store an intermediate convolutional layer's result in tensorflow for later processing?

The image below describes the output before the application of a max-pooling layer of a single intermediate filter layer of a CNN.
I want to store the coordinates of the pixel with intensity 4 (in the bottom right of the matrix on the LHS of the arrow) as it is in that matrix. That is, the pixel at coordinate (4, 4) (1-based indexing) in the input matrix is the one that gets stored in the bottom-right cell of the matrix on the RHS of the arrow, right? Now what I want to do is store this coordinate value (4, 4), along with the coordinates of the other pixels {(2, 2) for the pixel with intensity 6, (2, 4) for the pixel with intensity 8 and (3, 1) for the pixel with intensity 3}, as a list for later processing. How do I do this in TensorFlow?
Max pooling done with a filter of size 2 x 2 and stride of 2
You can use tf.nn.max_pool_with_argmax (link).
Note:
The indices in argmax are flattened, so that a maximum value at position [b, y, x, c] becomes flattened index ((b * height + y) * width + x) * channels + c.
We need to do some processing to make it fit your coordinates.
An example:
import tensorflow as tf
import numpy as np

def max_pool_with_argmax(net, filter_h, filter_w, stride):
    output, mask = tf.nn.max_pool_with_argmax(net, ksize=[1, filter_h, filter_w, 1],
                                              strides=[1, stride, stride, 1], padding='SAME')
    # If your ksize looks like [1, stride, stride, 1]
    loc_x = mask // net.shape[2]
    loc_y = mask % net.shape[2]
    loc = tf.concat([loc_x + 1, loc_y + 1], axis=-1)  # count from 0, so add 1
    # If your ksize is all changing, use the following
    # c = tf.mod(mask, net.shape[3])
    # remain = tf.cast(tf.divide(tf.subtract(mask, c), net.shape[3]), tf.int64)
    # x = tf.mod(remain, net.shape[2])
    # remain = tf.cast(tf.divide(tf.subtract(remain, x), net.shape[2]), tf.int64)
    # y = tf.mod(remain, net.shape[1])
    # remain = tf.cast(tf.divide(tf.subtract(remain, y), net.shape[1]), tf.int64)
    # b = tf.mod(remain, net.shape[0])
    # loc = tf.concat([y + 1, x + 1], axis=-1)
    return output, loc

input = tf.Variable(np.random.rand(1, 6, 4, 1), dtype=np.float32)
output, mask = max_pool_with_argmax(input, 2, 2, 2)

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    input_value, output_value, mask_value = sess.run([input, output, mask])
    print(input_value[0, :, :, 0])
    print(output_value[0, :, :, 0])
    print(mask_value[0, :, :, :])
#print
[[0.20101677 0.09207255 0.32177696 0.34424785]
[0.4116488 0.5965447 0.20575707 0.63288754]
[0.3145412 0.16090539 0.59698933 0.709239 ]
[0.00252096 0.18027237 0.11163216 0.40613824]
[0.4027637 0.1995668 0.7462126 0.68812144]
[0.8993007 0.55828506 0.5263306 0.09376772]]
[[0.5965447 0.63288754]
[0.3145412 0.709239 ]
[0.8993007 0.7462126 ]]
[[[2 2]
[2 4]]
[[3 1]
[3 4]]
[[6 1]
[5 3]]]
You can see (2, 2) for the pixel with intensity 0.5965447, (2, 4) for the pixel with intensity 0.63288754, and so on.
Let's say you have the following max-pooling layer:
pool_layer = tf.nn.max_pool(conv_output,
                            ksize=[1, 2, 2, 1],
                            strides=[1, 2, 2, 1],
                            padding='VALID')
you can use:
max_pos = tf.gradients([pool_layer], [conv_output])[0]
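max_pos has the same shape as conv_output and is 1 at the positions selected by the pooling (ties can mark more than one entry per window). If explicit coordinates are needed, a small follow-up sketch could be:

# max_pos has the same shape as conv_output: non-zero where a value was picked by the pooling
coords = tf.where(tf.greater(max_pos, 0))   # [N, 4] int64 indices (batch, row, col, channel), 0-based
coords_1based = coords + 1                  # shift to 1-based indexing if preferred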

How to vectorize the following python code

I'm trying to obtain a matrix, where each element is calculated as follows:
import torch

batch_size, dim = 8, 5   # example sizes
X = torch.ones(batch_size, dim)
X_ = torch.ones(batch_size, dim)
Y = torch.ones(batch_size, dim)
M = torch.zeros(batch_size, batch_size)
for i in range(batch_size):
    for j in range(batch_size):
        M[i, j] = ((X[i] - X_[i] * Y[j])**2).sum()
Calculating M element-wise like this is very slow; is there any suggestion on how to use matrix multiplication to replace the for loops?
Thanks.
If you want to sum() over dim, you can "lift" your 2D problem to 3D and sum there:
M = ((X[:, None, :] - X_[:, None, :] * Y[None, ...])**2).sum(dim=2)
How it works:
X[:, None, :] and X_[:, None, :] are 3D of size (batch_size, 1, dim), and Y[None, ...] is of size (1, batch_size, dim).
When multiplying X_[:, None, :] * Y[None, ...], PyTorch broadcasts the dimensions of size 1 to the appropriate sizes to get a result of size (batch_size, batch_size, dim).
Finally, you sum() only over the last dimension (dim=2) to get an output M of size (batch_size, batch_size).
The trick here is done by taking advantage of broadcasting.
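As a quick sanity check (a small self-contained sketch with random example inputs), the broadcasted expression can be compared against the original loop:

import torch

batch_size, dim = 4, 3                  # example sizes
X = torch.rand(batch_size, dim)
X_ = torch.rand(batch_size, dim)
Y = torch.rand(batch_size, dim)

# original double loop
M_loop = torch.zeros(batch_size, batch_size)
for i in range(batch_size):
    for j in range(batch_size):
        M_loop[i, j] = ((X[i] - X_[i] * Y[j]) ** 2).sum()

# broadcasted version
M_vec = ((X[:, None, :] - X_[:, None, :] * Y[None, ...]) ** 2).sum(dim=2)

print(torch.allclose(M_loop, M_vec))    # expected: True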

Select weight of action from a tensorflow model

I have a small model used in a reinforcement learning context.
I can input a 2d tensor of states, and I get a 2d tensor of action weights.
Let's say I input two states and I get the following action weights out:
[[0.1, 0.2],
[0.3, 0.4]]
Now I have another 2d tensor which holds the action number for which I want to get the weight:
[[1],
[0]]
How can I use this tensor to get the weights of those actions?
In this example I'd like to get:
[[0.2],
[0.3]]
Similar to Tensorflow tf.gather with axis parameter, the indices are handled a little differently here:
a = tf.constant([[0.1, 0.2], [0.3, 0.4]])
indices = tf.constant([[1], [0]])
# convert to full indices
full_indices = tf.stack([tf.range(indices.shape[0])[..., tf.newaxis], indices], axis=2)
# gather
result = tf.gather_nd(a, full_indices)

with tf.Session() as sess:
    print(sess.run(result))
    # [[0.2]
    #  [0.3]]
A simple way to do this is to squeeze the dimensions of indices, multiply element-wise with the corresponding one-hot vector, and then expand the dimensions back afterwards.
import tensorflow as tf
weights = tf.constant([[0.1, 0.2], [0.3, 0.4]])
indices = tf.constant([[1], [0]])
# Reduce from 2d (2, 1) to 1d (2,)
indices1d = tf.squeeze(indices)
# One-hot vector corresponding to the indices. shape (2, 2)
action_one_hot = tf.one_hot(indices=indices1d, depth=weights.shape[1])
# Element-wise multiplication and sum across axis 1 to pick the weight. Shape (2,)
action_taken_weight = tf.reduce_sum(action_one_hot * weights, axis=1)
# Expand the dimension back to have a 2d. Shape (2, 1)
action_taken_weight2d = tf.expand_dims(action_taken_weight, axis=1)
sess = tf.InteractiveSession()
print("weights\n", sess.run(weights))
print("indices\n", sess.run(indices))
print("indices1d\n", sess.run(indices1d))
print("action_one_hot\n", sess.run(action_one_hot))
print("action_taken_weight\n", sess.run(action_taken_weight))
print("action_taken_weight2d\n", sess.run(action_taken_weight2d))
Should give you the following output:
weights
[[0.1 0.2]
[0.3 0.4]]
indices
[[1]
[0]]
indices1d
[1 0]
action_one_hot
[[0. 1.]
[1. 0.]]
action_taken_weight
[0.2 0.3]
action_taken_weight2d
[[0.2]
[0.3]]
Note: You can also do action_taken_weight = tf.reshape(action_taken_weight, tf.shape(indices)) instead of expand_dims.

I want to print all variables and placeholders in Tensorflow

I am running the sample code recurrent_network.py.
I wish to print all of x (it's a placeholder) inside the function RNN(x, weights, biases).
What can I do?
Key point:
x = tf.transpose(x, [1, 0, 2])
# Reshaping to (n_steps*batch_size, n_input)
x = tf.reshape(x, [-1, n_input])
# Split to get a list of 'n_steps' tensors of shape (batch_size, n_input)
x = tf.split(0, n_steps, x)
Please see this post for details. I usually use tf.Print() myself.
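For reference, here is a minimal sketch of how tf.Print can be wired into a graph (example placeholder and shapes; tf.Print is the TF 1.x op, replaced by tf.print in TF 2.x):

import tensorflow as tf

x = tf.placeholder(tf.float32, [None, 3])           # example placeholder
# tf.Print is an identity op: it returns x unchanged but prints the listed tensors
# to stderr whenever this node is evaluated
x_printed = tf.Print(x, [tf.shape(x), x], message="value of x: ", summarize=12)
y = x_printed * 2.0                                  # downstream ops must consume the printing node

with tf.Session() as sess:
    sess.run(y, feed_dict={x: [[1.0, 2.0, 3.0]]})    # the print fires during this run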