Tensors with same name in the python code in tensorflow - variables

In the code at https://github.com/tensorflow/models/blob/master/tutorials/image/cifar10/cifar10.py, the same Python variable name is used for different tensors, for example:
conv = tf.nn.conv2d(images, kernel, [1, 1, 1, 1], padding='SAME') # Under conv1, line: 208
and,
conv = tf.nn.conv2d(norm1, kernel, [1, 1, 1, 1], padding='SAME') # Under conv2, line 227
Why is this allowed in TensorFlow? Suppose I run:
sess.run([conv], feed_dict={x: some_data})
Which conv tensor will be evaluated?
Second, if the conv tensor under the conv1 layer refers to the first tf.nn.conv2d operation, how can another conv tensor under conv2 refer to the second tf.nn.conv2d operation? In other words, how are they treated separately?
Any help is much appreciated!!

To answer your question: the latest conv is the one that is evaluated.
For example:
import tensorflow as tf
a = tf.constant(5)
b = tf.constant(6)
c = tf.multiply(a, b)
print(c)  # c refers to the first multiply node
c = tf.multiply(c, b)
print(c)  # c now refers to the second multiply node
sess = tf.Session()
c_val = sess.run(c)
print(c_val)
Output :
Tensor("Mul:0", shape=(), dtype=int32)
Tensor("Mul_1:0", shape=(), dtype=int32)
180
You can see that TF names them differently. Whenever you call a TF operator, it creates a new node in the graph whose name is independent of the Python variable name. The Python variable simply refers to the latest tensor assigned to it.
I hope this helps.
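To make the distinction between Python names and graph node names concrete, here is a toy model in plain Python. It is only a sketch: ToyGraph and ToyTensor are hypothetical classes written for this illustration and are not part of TensorFlow, and the naming scheme (Mul, Mul_1, ...) simply mimics the output shown above.

```python
class ToyTensor:
    """Hypothetical stand-in for a graph node; not a TensorFlow class."""

    def __init__(self, name, value):
        self.name = name
        self.value = value


class ToyGraph:
    """Hypothetical registry that gives every new node a unique name."""

    def __init__(self):
        self.nodes = {}    # node name -> ToyTensor
        self._counts = {}  # op type -> number of nodes of that type so far

    def multiply(self, x, y):
        n = self._counts.get("Mul", 0)
        self._counts["Mul"] = n + 1
        # The first node is "Mul", later ones are "Mul_1", "Mul_2", ...
        name = "Mul" if n == 0 else "Mul_%d" % n
        node = ToyTensor(name, x.value * y.value)
        self.nodes[name] = node
        return node


graph = ToyGraph()
a = ToyTensor("a", 5)
b = ToyTensor("b", 6)

c = graph.multiply(a, b)  # the Python name c points at node "Mul"
first_name = c.name
c = graph.multiply(c, b)  # c is rebound and now points at node "Mul_1"

print(first_name, c.name, c.value)
print(sorted(graph.nodes))  # both nodes are still registered in the graph
```

Rebinding c does not remove the first node; it only means the Python name no longer reaches it. If both results are needed later, binding them to different Python names before reuse keeps both reachable.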

Related

Average pooling tensorflow layer with differently shaped input tensors

I have extracted the embeddings for a particular entity X from every sentence in my dataset. Where X is mentioned more than once within the same sentence, this yields an embedding for each mention: I'd like to put these through an average pooling layer to arrive at a single embedding for X in each sentence.
Simplified working example:
import tensorflow as tf
embeddings = tf.constant([[1, 1, 1],
[2, 2, 2],
[4, 4, 4],
[5, 5, 5]])
# Let's imagine rows [1, 1, 1] & [4, 4, 4]
# correspond to embeddings for X from the same sentence
# We can indicate sentence membership through a sent_idxs variable:
sent_idxs = tf.constant([0, 1, 0, 2])
With the help of related stackoverflow questions (Torch - How to calculate average of tensors with the same indexes, Summing over specific indices PyTorch (similar to scatter_add)), I could average embeddings corresponding to the same sentence like this:
unique_idxs, _, counts = tf.unique_with_counts(sent_idxs) # counts = ([2, 1, 1])
result_holder = tf.zeros([unique_idxs.shape[0], embeddings.shape[1]], dtype= embeddings.dtype)
embeddings = tf.tensor_scatter_nd_add(result_holder, tf.expand_dims(sent_idxs, axis=1), embeddings)
embeddings /= counts[:, None]
However, I would prefer to re-shape my original embeddings to instead perform the averaging with AveragePooling2D or AveragePooling1D, and I'm really struggling with imagining the appropriate shape to enable this.

How to decode the output of seq2seq?

The code here of the Tensorflow translate.py example confused me. The copied code is:
# This is a greedy decoder - outputs are just argmaxes of output_logits.
outputs = [int(np.argmax(logit, axis=1)) for logit in output_logits]
Why does the argmax work?
The shape of output_logits is [bucket_length, batch_size, embedding_size].
For each logit (that is, the activations for each word), they take the index at which the activation is highest.
For the argmax: take a look at the numpy examples on this page: https://docs.scipy.org/doc/numpy/reference/generated/numpy.argmax.html
>>> a = np.array([[0, 1, 2],
...               [3, 4, 5]])
>>> np.argmax(a)
5
>>> np.argmax(a, axis=0)
array([1, 1, 1])
>>> np.argmax(a, axis=1)
array([2, 2])
So what the outputs line does is:
For each word (there are bucket_length of them),
get the index of the maximum activation along the embedding_size axis.
You should look at the shape of the resulting outputs array. np.argmax(logit, axis=1) returns one index per batch row, so the int(...) conversion only works because batch_size is 1.
Let me know if this helps you!
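The greedy decoding step described above can be sketched with NumPy alone. The logits below are made-up numbers, and the sizes (3 decoding steps, a batch of 1, 4 possible output symbols on the axis the question calls embedding_size) are assumptions chosen only to keep the example small.

```python
import numpy as np

# Assumed toy data: one [batch_size, num_symbols] array per decoding step,
# with batch_size == 1 and num_symbols == 4.
output_logits = [
    np.array([[0.1, 2.0, 0.3, 0.0]]),
    np.array([[1.5, 0.2, 0.1, 0.9]]),
    np.array([[0.0, 0.1, 0.2, 3.0]]),
]

# argmax over axis=1 returns one index per batch row, i.e. shape (batch_size,).
per_step = [np.argmax(logit, axis=1) for logit in output_logits]

# With batch_size == 1 each result holds exactly one element, so it can be
# converted to a plain Python int.
outputs = [int(idx[0]) for idx in per_step]

print([idx.shape for idx in per_step])
print(outputs)
```

With a larger batch, each argmax result would hold several indices and could no longer be collapsed into a single int per step.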

Input dimension reshape in TensorFlow convolutional network

The expert MNIST tutorial on the TensorFlow website has something like this:
x_image = tf.reshape(x, [-1,28,28,1])
I know that the reshape is like
tf.reshape(input,[batch_size,width,height,channel])
Q1: Why is the batch_size equal to -1? What does the -1 mean?
Further down in the code there is one more thing I cannot understand:
W_fc1 = weight_variable([7 * 7 * 64, 1024])
Q2: What does the image_size * 64 mean?
Q1: Why is the batch_size equal to -1? What does the -1 mean?
-1 means "figure this part out for me". For example, if I run:
np.reshape([1, 2, 3, 4, 5, 6, 7, 8], [-1, 2])
It creates two columns, and whatever number of rows it needs to get everything to fit:
array([[1, 2],
[3, 4],
[5, 6],
[7, 8]])
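The same behaviour can be checked by inspecting shapes. This is a minimal sketch using NumPy, whose reshape follows the same -1 convention; the batch of 5 flattened 28x28 images is a made-up stand-in for the tutorial's input x.

```python
import numpy as np

values = np.arange(1, 9)           # [1, 2, ..., 8], 8 elements in total

two_cols = values.reshape(-1, 2)   # rows are inferred: 8 / 2 = 4
flat = two_cols.reshape(-1)        # a lone -1 flattens to 1-D

# Same idea as tf.reshape(x, [-1, 28, 28, 1]): the batch size is inferred
# from the total number of elements (assumed here: 5 images of 784 pixels).
batch = np.zeros((5, 784)).reshape(-1, 28, 28, 1)

print(two_cols.shape, flat.shape, batch.shape)
```

Only one dimension can be left as -1, because the remaining sizes must determine it uniquely.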
Q2: What does the image_size * 64 mean?
The 64 is the number of filters in that convolutional layer, that is, the number of channels in its output. Shapes of filters in conv layers follow the format [height, width, # of input channels (number of filters in the previous layer), # of filters].
When you pass -1 as a dimension in tf.reshape, the size of that dimension is inferred so that the total number of elements stays the same. From the docs:
If one component of shape is the special value -1, the size of that
dimension is computed so that the total size remains constant. In
particular, a shape of [-1] flattens into 1-D. At most one component
of shape can be -1.
The reference to 7 x 7 x 64 arises because the layers applied before this point have reduced each image to a shape of [7, 7, 64]. The input to the following fully connected layer needs to be a single dimension, so in the next line of the example the tensor is reshaped from [7, 7, 64] to [7*7*64] so that it can connect to the FC layer.
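The arithmetic behind 7 * 7 * 64 can be checked in a few lines of plain Python. This sketch assumes the usual layout of that tutorial, which is not shown in the excerpt above: convolutions that keep the 28 x 28 spatial size, two 2x2 max-pooling steps with stride 2 and 'SAME' padding, and 64 filters in the last conv layer. The helper same_pool_output is a hypothetical name used only here.

```python
import math


def same_pool_output(size, stride):
    """Spatial size after a 'SAME'-padded pooling step: ceil(size / stride)."""
    return math.ceil(size / stride)


size = 28                 # MNIST images are 28 x 28
for _ in range(2):        # assumed: two 2x2 max-pool layers with stride 2
    size = same_pool_output(size, 2)

num_filters = 64          # assumed: number of filters in the last conv layer
flat_features = size * size * num_filters

print(size, flat_features)
```

Under these assumptions the spatial size goes 28, 14, 7, and the flattened feature count equals the first dimension of W_fc1.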
For more information on how convolutions and max pooling work, the Wikipedia page has some helpful graphics, e.g. of the network architecture and of pooling.

tf.rank function in Tensorflow

I am trying to understand the tf.rank function in TensorFlow. From the documentation here, I understood that rank should return the number of distinct elements in the tensor.
Here x and weights are two distinct 2x2 tensors with 4 distinct elements in each of them. However, the outputs of the rank() function are:
Tensor("Rank:0", shape=(), dtype=int32) Tensor("Rank_1:0", shape=(), dtype=int32)
Also, for the tensor x, I used tf.constant() with dtype = float to convert ndarray into float32 tensor but the rank() still outputs as int32.
import numpy as np
import tensorflow as tf
g = tf.Graph()
with g.as_default():
    weights = tf.Variable(tf.truncated_normal([2, 2]))
    x = np.asarray([[1, 2], [3, 4]])
    x = tf.constant(x, dtype=tf.float32)
    y = tf.matmul(weights, x)
    # Prints the graph nodes themselves, not their values
    print(tf.rank(x), tf.rank(weights))
with tf.Session(graph=g) as s:
    tf.initialize_all_variables().run()
    print(s.run(weights), s.run(x))
    print(s.run(y))
How should I interpret the output?
Firstly, tf.rank returns the number of dimensions of a tensor, not the number of elements. For instance, tf.rank called on a 2x2 matrix gives 2.
To print the rank of a tensor, create an appropriate node, e.g. rank = tf.rank(x), and then evaluate this node using Session.run(), as you've done for weights and x. As expected, print (tf.rank(x), tf.rank(weights)) prints descriptions of tensors, because tf.rank(x) and tf.rank(weights) are nodes of the graph, not variables with defined values. The dtype=int32 in that output is the type of the rank value itself, not the type of x.
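The difference between rank and element count can be seen without a session. The sketch below uses NumPy as an analogy rather than TensorFlow itself: ndim plays the role of the value that tf.rank would evaluate to, and size is the element count the question expected.

```python
import numpy as np

x = np.asarray([[1, 2], [3, 4]], dtype=np.float32)

rank = x.ndim          # number of dimensions (analogous to tf.rank)
num_elements = x.size  # number of elements, which is what the question expected

# The rank is an integer even though the data itself is float32.
print(rank, num_elements, x.dtype)
```

This also mirrors why the rank tensor is reported with an integer dtype regardless of the dtype of x.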

Tensorflow: access shape of placeholder after NN layer in code

So, here is what I want to do:
Right now, I have padding = 'SAME' for all of my neural net layers. I would like to make my code more generic, so I can build my nets with arbitrary paddings, and I don't want to have to calculate how big the output tensors of the layers of my net are. I would like to just access the dimension at initialization/run time, the way the tf.nn functions apparently do internally, so I can initialize my weight and bias tensors in the correct dimension...
So,
How do I access the "shape" function/object of the output placeholder of a convolution?
There are two kinds of shapes: tensor.get_shape(), which gives the static shape computed by the Python wrappers during graph construction (whenever possible), and tf.shape(tensor), which is an op that can be executed at runtime to get the shape of the tensor (always possible). Both of these work for convolutions.
import tensorflow as tf
a = tf.Variable(tf.ones((1, 3, 3, 1)))  # input
b = tf.Variable(tf.ones((3, 3, 1, 1)))  # filter
c = tf.nn.conv2d(a, b, [1, 1, 1, 1], padding="VALID")
sess = tf.Session()
sess.run(tf.initialize_all_variables())
print(c.get_shape())          # static shape, known at graph construction
print(sess.run(tf.shape(c)))  # dynamic shape, evaluated at runtime
This gives
(1, 1, 1, 1)
[1 1 1 1]
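For reference, the shape printed above can also be reproduced by hand, which is a handy sanity check when changing the padding. This is a plain-Python sketch of the usual 'VALID' and 'SAME' output-size rules; conv_output_size is a hypothetical helper, not a TensorFlow function.

```python
import math


def conv_output_size(in_size, filter_size, stride, padding):
    """Output length along one spatial axis (hypothetical helper)."""
    if padding == "VALID":
        return math.ceil((in_size - filter_size + 1) / stride)
    if padding == "SAME":
        return math.ceil(in_size / stride)
    raise ValueError("padding must be 'VALID' or 'SAME'")


# Same sizes as the example above: 3x3 input, 3x3 filter, stride 1,
# batch size 1 and a single output channel.
valid_side = conv_output_size(3, 3, 1, "VALID")
same_side = conv_output_size(3, 3, 1, "SAME")

valid_shape = (1, valid_side, valid_side, 1)
same_shape = (1, same_side, same_side, 1)

print(valid_shape)
print(same_shape)
```

In practice, reading c.get_shape() after building the layer avoids having to maintain such a formula.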