I tested the following script:
import tensorflow as tf
a, b, c = 2, 3, 4
x = tf.Variable(tf.random_normal([a, b, c], mean=0.0, stddev=1.0, dtype=tf.float32))
s = tf.shape(x)
print(s)
init = tf.initialize_all_variables()
sess = tf.Session()
sess.run(init)
print(sess.run(s))
Running the code gives the following result:
Tensor("Shape:0", shape=(3,), dtype=int32)
[2 3 4]
It looks like only the second print gives a readable result. What does the first print actually do, and how should I interpret its output?
The call to s = tf.shape(x) defines a symbolic (but very simple) TensorFlow computation that only executes when you call sess.run(s).
When you execute print(s), Python will print everything that TensorFlow knows about the tensor s without actually evaluating it. Since it is the output of a tf.shape() op, TensorFlow knows that it has type tf.int32, and it can also infer that it is a vector of length 3 (because x is statically known to be a 3-D tensor from the variable definition).
Note that in many cases you can obtain more shape information without evaluating anything, by printing the static shape of a particular tensor using the Variable.get_shape() method (similar to its cousin, Tensor.get_shape()):
# Print the statically known shape of `x`.
print(x.get_shape())
# ==> "(2, 3, 4)"
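The difference between the two matters most when a dimension is not statically known. Here is a minimal sketch in the same TF 1.x style (the tf.placeholder below is illustrative and not part of the question's code):
import tensorflow as tf

p = tf.placeholder(tf.float32, shape=[None, 4])
print(p.get_shape())   # (?, 4) -- static shape, first dimension unknown
dyn = tf.shape(p)      # symbolic op; yields the run-time shape when evaluated

with tf.Session() as sess:
    print(sess.run(dyn, feed_dict={p: [[1., 2., 3., 4.]]}))   # [1 4]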
Related
I have a simple linear regression question. My code is as follows:
import tensorflow as tf
import numpy as np
batch_xs=np.array([[0,0,1],[1,1,1],[1,0,1],[0,1,1]])
batch_ys=np.array([[0],[1],[1],[0]])
x = tf.placeholder(tf.float32, [None, 3])
W = tf.Variable(tf.zeros([3, 1]))
b = tf.Variable(tf.zeros([1]))
y = tf.nn.sigmoid(tf.matmul(x, W) + b)
y_ = tf.placeholder(tf.float32, [None, 1])
cross_entropy = tf.reduce_mean(-tf.reduce_sum(y_ * tf.log(y), reduction_indices=[1]))
learning_rate = 0.05
train_step = tf.train.GradientDescentOptimizer(learning_rate).minimize(cross_entropy)
sess = tf.Session()
tf.global_variables_initializer().run(session=sess)
sess.run(train_step, feed_dict={x: batch_xs, y_: batch_ys})
Prediction:
x0=np.array([[1.,0.,0.]])
x0=np.float32(x0)
y0=tf.nn.softmax(tf.matmul(x0,W) + b)
print(y0)
However, print(y0) shows Tensor("Softmax_2:0", shape=(1, 1), dtype=float32) instead of a number. I expected y0 to be around 0.99.
I tried y0.eval(), but I got ValueError: Cannot evaluate tensor using 'eval()': No default session is registered..
How can I make a change to obtain the result? Thanks!
There are a couple of ways to get things to print out while writing TensorFlow code. Of course, there’s the classic Python built-in, print (or the function print(), if we’re being Python 3 about it). And then there’s TensorFlow’s print function, tf.Print (notice the capital P).
When working with TensorFlow, it’s important to remember that everything is ultimately a graph computation. This means that if you print a TensorFlow operation using Python’s print, it will simply show a description of what that operation is, since no values have been passed through it yet. It will also often show the dimensions that are expected to be in that node, if they’re known.
If you want to print the values that are ‘flowing’ through a particular part of the graph as it’s being executed, you need to turn to tf.Print.
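Applied to the code in the question (a sketch, assuming the same sess, W, b, and y0 are still in scope), the two approaches look roughly like this:
print(sess.run(y0))      # actually evaluates the graph and prints the numeric value

y0_logged = tf.Print(y0, [y0], message="y0 = ")   # identity op with a print side effect
_ = sess.run(y0_logged)  # the value of y0 is written to standard error when this runs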
I am learning TensorFlow using the TensorFlow Machine Learning Cookbook (https://github.com/nfmcclure/tensorflow_cookbook). I am currently on the NLP chapter (07). I am very confused about how one decides on the dimensions of the TensorFlow variables. For example, in the bag-of-words example they use:
# Create variables for logistic regression
A = tf.Variable(tf.random_normal(shape=[embedding_size,1]))
b = tf.Variable(tf.random_normal(shape=[1,1]))
# Initialize placeholders
x_data = tf.placeholder(shape=[sentence_size], dtype=tf.int32)
y_target = tf.placeholder(shape=[1, 1], dtype=tf.float32)
and in the tf-idf example they use:
# Create variables for logistic regression
A = tf.Variable(tf.random_normal(shape=[max_features,1]))
b = tf.Variable(tf.random_normal(shape=[1,1]))
x_data = tf.placeholder(shape=[None, max_features], dtype=tf.float32)
y_target = tf.placeholder(shape=[None, 1], dtype=tf.float32)
How does one decide on when to use None vs. 1 in the placeholder shapes? Thank you!
Using None as part of a shape means that the dimension will be determined when you run the session.
This is useful for what is called batch training, where each iteration of the training process is fed a fixed-size subset of the data.
So if you keep it at None you can switch between batch sizes without a problem (you typically won't change the batch size within a single training run, but each run can use a different one).
When you state a specific shape, that is the only shape that can be fed to the placeholder during the session (via the feed_dict parameter).
In your specific example, the shape of y_target in the first snippet will always be [1, 1], whereas in the second snippet y_target could be [10, 1], [200, 1], or [<whatever>, 1].
None should be used when the number of elements in the placeholder is unknown in advance. If, for example, the number of rows fed to x_data is known in advance to be 1, then you can replace None with 1.
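As a small, self-contained sketch of the difference (the doubling op is just an illustration, not from the cookbook):
import numpy as np
import tensorflow as tf

y_target = tf.placeholder(shape=[None, 1], dtype=tf.float32)
doubled = 2.0 * y_target   # any op will do; we only care about the fed shapes

with tf.Session() as sess:
    print(sess.run(doubled, feed_dict={y_target: np.ones((10, 1))}).shape)    # (10, 1)
    print(sess.run(doubled, feed_dict={y_target: np.ones((200, 1))}).shape)   # (200, 1)
With shape=[1, 1] instead of [None, 1], only a (1, 1) array could be fed.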
I am trying to create a simple neural net in TensorFlow. The only tricky part is I have a custom operation that I have implemented with py_func. When I pass the output from py_func to a Dense layer, TensorFlow complains that the rank should be known. The specific error is:
ValueError: Inputs to `Dense` should have known rank.
I don't know how to preserve the shape of my data when I pass it through py_func. My question is how do I get the correct shape? I have a simple example below to illustrate the problem.
import numpy as np
import tensorflow as tf

def my_func(x):
    return np.sinh(x).astype('float32')

inp = tf.convert_to_tensor(np.arange(5))
y = tf.py_func(my_func, [inp], tf.float32, False)
with tf.Session() as sess:
    with sess.as_default():
        print(inp.shape)
        print(inp.eval())
        print(y.shape)
        print(y.eval())
The output from this snippet is:
(5,)
[0 1 2 3 4]
<unknown>
[ 0.          1.17520118  3.62686038 10.01787472 27.28991699]
Why is y.shape <unknown>? I want the shape to be (5,), the same as inp. Thanks!
Since py_func can execute arbitrary Python code and output anything, TensorFlow can't figure out the shape (that would require analyzing the Python code of the function body). You can instead set the shape manually:
y.set_shape(inp.get_shape())
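Applied to the snippet from the question, that looks like this (a sketch; the rest of the code is unchanged):
import numpy as np
import tensorflow as tf

def my_func(x):
    return np.sinh(x).astype('float32')

inp = tf.convert_to_tensor(np.arange(5))
y = tf.py_func(my_func, [inp], tf.float32, False)
y.set_shape(inp.get_shape())   # copy the known static shape (5,) onto y

print(y.get_shape())           # (5,), so layers like Dense now see a known rank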
I have used the model described here on the 0.6.0 branch. The code can be found here. I have made some minor changes to the linked code.
In my code I create two models, one for training and one for validation, very similar to how it is done in the TensorFlow tutorial.
with tf.variable_scope("model", reuse=None, initializer=initializer):
m = PTBModel_User(is_training=True, config=config, name='Training model')
with tf.variable_scope("model", reuse=True, initializer=initializer):
mtest = PTBModel_User(is_training=False, config=config_valid, name='Validation model')
The first model, the one for training, seems to be created just fine, but the second, used for validation, does not. The output gets a None dimension! The row I'm referring to is row 134 in the linked code:
output = tf.reshape(tf.concat(1, outputs), [-1, size])
I've added these lines right after the reshape of the output:
output_shape = output.get_shape()
print("Model num_steps:", num_steps)
print("Model batch_size:", batch_size)
print("Output dims", output_shape[0], output_shape[1])
and that gives me this:
Model num_steps: 400
Model batch_size: 1
Output dims Dimension(None) Dimension(650)
This problem only happens with the 'validation model', not with the 'training model'. For the 'training model' I get expected output:
Model num_steps: 400
Model batch_size: 2
Output dims Dimension(800) Dimension(650)
(Note that with the 'validation model' I use batch_size=1 instead of the batch_size=2 that I use for the training model.)
From what I understand, using -1 as input to the reshape function will figure the output shape out automagically! But then why do I get None? Nothing in the config fed to the model has a None value.
Thank you for all the help and tips!
TL;DR: A dimension being None simply means that shape inference could not determine an exact shape for the output tensor, at graph-building time. When you run the graph, the tensor will have the appropriate run-time shape.
If you're not interested in how shape inference works, you can stop reading now.
Shape inference applies local rules, based on a "shape function" that takes the shapes of an operation's inputs and computes (possibly incomplete) shapes for its outputs. To figure out why tf.reshape() gives an incomplete shape, we have to look at its inputs and work backwards:
The shape argument to tf.reshape() includes a [-1], which means "figure the output shape automagically" based on the shape of the tensor input.
The tensor input is the output of tf.concat() on the same line.
The inputs to tf.concat() are computed by a tf.mul() in BasicLSTMCell.__call__(). The tf.mul() op multiplies the result of a tf.tanh() and a tf.sigmoid() op.
The tf.tanh() op produces an output of size [?, hidden_size], and the tf.sigmoid() op produces an output of size [batch_size, hidden_size].
The tf.mul() op performs NumPy-style broadcasting. A dimension will only be broadcast if it has size 1. Consider three cases where we compute tf.mul(x, y):
If x has shape [1, 10], and y has shape [5, 10], then broadcasting will happen, and the output shape will be [5, 10].
If x has shape [1, 10], and y has shape [1, 10], then there will be no broadcasting, and the output shape will be [1, 10].
However, if x has shape [1, 10], and y has shape [?, 10], there is insufficient static information to tell whether broadcasting will happen (even though we happen to know that case 2 applies at runtime).
Therefore, when batch_size is 1, the tf.mul() op produces an output with the shape [?, hidden_size]; but when batch_size is greater than 1, the output shape is [batch_size, hidden_size].
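You can see the same behaviour directly from the static shapes (an illustrative sketch; the placeholders stand in for the LSTM intermediates):
import tensorflow as tf

x = tf.placeholder(tf.float32, [1, 10])
y = tf.placeholder(tf.float32, [5, 10])
z = tf.placeholder(tf.float32, [None, 10])

print((x * y).get_shape())   # (5, 10) -- broadcasting is statically resolvable
print((x * z).get_shape())   # (?, 10) -- unknown whether broadcasting will happen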
Where shape inference breaks down, it can be appropriate to use the Tensor.set_shape() method to add information. This would potentially be useful in the BasicLSTMCell implementation, where we know more than it is possible to infer about the shapes of the outputs.
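In your case, batch_size, num_steps, and size are ordinary Python ints at graph-construction time, so (as a sketch, keeping the old tf.concat argument order used in the question) you could write:
output = tf.reshape(tf.concat(1, outputs), [-1, size])
output.set_shape([batch_size * num_steps, size])   # supply the dimension that inference lost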
How can I choose to execute a portion of the graph based on a condition?
I have a part of my network which is to be executed only if a placeholder value is provided in feed_dict. An alternate path is taken if the value is not provided. How do I go about implementing this using TensorFlow?
Here are the relevant portions of my code:
sess.run(accuracy, feed_dict={inputs: mnist.test.images, outputs: mnist.test.labels})
N = tf.shape(outputs)
cost = 0
if N > 0:
    y_N = tf.slice(h_c, [0, 0], N)
    cross_entropy = tf.nn.softmax_cross_entropy_with_logits(y_N, outputs, name='xentropy')
    cost = tf.reduce_mean(cross_entropy, name='xentropy_mean')
In the above code, I'm looking for something to use in the place of if N > 0:
Hrm. It's possible that what you want is tf.control_flow_ops.cond()
https://github.com/tensorflow/tensorflow/blob/master/tensorflow/python/ops/control_flow_ops.py#L597
But that's not exported into the tf namespace, and I'm answering without having checked how stable this interface is guaranteed to be; still, it's used in released models, so go for it. :)
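In later TensorFlow releases this is exported as tf.cond. A minimal sketch of the kind of conditional it provides (the placeholder and branches below are illustrative, not taken from your code):
import tensorflow as tf

n = tf.placeholder(tf.int32, shape=[])
result = tf.cond(n > 0,
                 lambda: tf.constant(1.0),    # subgraph used when n > 0
                 lambda: tf.constant(-1.0))   # subgraph used otherwise

with tf.Session() as sess:
    print(sess.run(result, feed_dict={n: 5}))   # 1.0
    print(sess.run(result, feed_dict={n: 0}))   # -1.0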
However: Because you actually know in advance what path you want when you construct the feed_dict, you could also take a different approach of invoking a separate path through your model. The standard way to do this is to, e.g., set up code like:
def model(input, n_greater_than):
    ... cleverness ...
    if n_greater_than:
        ... other cleverness...
    return tf.reduce_mean(input)

out1 = model(input, True)
out2 = model(input, False)
And then pull the out1 or out2 nodes depending upon what you know when you're about to run your computation and set the feed_dict. Remember that if the model references the same variables (create them outside the model() function), then you'll basically have two separate paths through the graph that share those variables.
You can see an example of this in the convolutional mnist example: https://github.com/tensorflow/tensorflow/blob/master/tensorflow/models/image/mnist/convolutional.py#L165
I'm a fan of doing it this way without introducing control flow dependencies if you can.
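Here is a concrete (hypothetical) sketch of that pattern, with the variable created outside model() so both paths share it; the relu branch is just a stand-in for the "other cleverness":
import tensorflow as tf

x = tf.placeholder(tf.float32, [None, 3])
w = tf.Variable(tf.ones([3, 1]))           # shared by both paths

def model(inp, extra_path):
    h = tf.matmul(inp, w)
    if extra_path:
        h = tf.nn.relu(h)                  # only built into the "extra path" graph
    return tf.reduce_mean(h)

out1 = model(x, True)
out2 = model(x, False)

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    fetch = out1                           # pick out1 or out2 based on what you know at feed time
    print(sess.run(fetch, feed_dict={x: [[1., -2., 3.]]}))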
Here is a simple example that can get you started. It executes different parts of the graph based on the shape of a tensor:
import tensorflow as tf
a = tf.Variable([[3.0, 3.0], [3.0, 3.0]])
b = tf.Variable([[1.0, 1.0], [2.0, 2.0]])
l = tf.shape(a)
add_op, sub_op = tf.add(a, b), tf.sub(a, b)
sess = tf.Session()
init = tf.initialize_all_variables()
sess.run(init)
t = sess.run(l)
print(sess.run(sub_op if t[0] == 3 else add_op))
sess.close()
Change the 3 to a 2 to see the tensors being subtracted instead. As you can see, I create the add, sub, and shape nodes up front; then, in Python, I check the shape and run the appropriate part of the graph.