I am learning the TensorFlow, building a multilayer_perceptron model. I am looking into some examples like the one at: https://github.com/aymericdamien/TensorFlow-Examples/blob/master/notebooks/3_NeuralNetworks/multilayer_perceptron.ipynb
I then have some questions in the code below:
def multilayer_perceptron(x, weights, biases):
:
:
pred = multilayer_perceptron(x, weights, biases)
:
:
with tf.Session() as sess:
sess.run(init)
:
correct_prediction = tf.equal(tf.argmax(pred, 1), tf.argmax(y, 1))
accuracy = tf.reduce_mean(tf.cast(correct_prediction, "float"))
print ("Accuracy:", accuracy.eval({x: X_test, y: y_test_onehot}))
I am wondering what do tf.argmax(prod,1) and tf.argmax(y,1) mean and return (type and value) exactly? And is correct_prediction a variable instead of real values?
Finally, how do we get the y_test_prediction array (the prediction result when the input data is X_test) from the tf session? Thanks a lot!
tf.argmax(input, axis=None, name=None, dimension=None)
Returns the index with the largest value across axis of a tensor.
input is a Tensor and axis describes which axis of the input Tensor to reduce across. For vectors, use axis = 0.
For your specific case let's use two arrays and demonstrate this
pred = np.array([[31, 23, 4, 24, 27, 34],
[18, 3, 25, 0, 6, 35],
[28, 14, 33, 22, 20, 8],
[13, 30, 21, 19, 7, 9],
[16, 1, 26, 32, 2, 29],
[17, 12, 5, 11, 10, 15]])
y = np.array([[31, 23, 4, 24, 27, 34],
[18, 3, 25, 0, 6, 35],
[28, 14, 33, 22, 20, 8],
[13, 30, 21, 19, 7, 9],
[16, 1, 26, 32, 2, 29],
[17, 12, 5, 11, 10, 15]])
Evaluating tf.argmax(pred, 1) gives a tensor whose evaluation will give array([5, 5, 2, 1, 3, 0])
Evaluating tf.argmax(y, 1) gives a tensor whose evaluation will give array([5, 5, 2, 1, 3, 0])
tf.equal(x, y, name=None) takes two tensors(x and y) as inputs and returns the truth value of (x == y) element-wise.
Following our example, tf.equal(tf.argmax(pred, 1),tf.argmax(y, 1)) returns a tensor whose evaluation will givearray(1,1,1,1,1,1).
correct_prediction is a tensor whose evaluation will give a 1-D array of 0's and 1's
y_test_prediction can be obtained by executing pred = tf.argmax(logits, 1)
The documentation for tf.argmax and tf.equal can be accessed by following the links below.
tf.argmax() https://www.tensorflow.org/api_docs/python/math_ops/sequence_comparison_and_indexing#argmax
tf.equal() https://www.tensorflow.org/versions/master/api_docs/python/control_flow_ops/comparison_operators#equal
Reading the documentation:
tf.argmax
Returns the index with the largest value across axes of a tensor.
tf.equal
Returns the truth value of (x == y) element-wise.
tf.cast
Casts a tensor to a new type.
tf.reduce_mean
Computes the mean of elements across dimensions of a tensor.
Now you can easily explain what it does. Your y is one-hot encoded, so it has one 1 and all other are zero. Your pred represents probabilities of classes. So argmax finds the positions of best prediction and correct value. After that you check whether they are the same.
So now your correct_prediction is a vector of True/False values with the size equal to the number of instances you want to predict. You convert it to floats and take the average.
Actually this part is nicely explained in TF tutorial in the Evaluate the Model part
tf.argmax(input, axis=None, name=None, dimension=None)
Returns the index with the largest value across axis of a tensor.
For the case in specific, it receives pred as argument for it's input and 1 as axis. The axis describes which axis of the input Tensor to reduce across. For vectors, use axis = 0.
Example: Given the list [2.11,1.0021,3.99,4.32] argmax will return 3 which is the index of the highest value.
correct_prediction is a tensor that will be evaluated later. It is not a regular python variable. It contains the necessary information to compute the value later.
For this specific case, it will be part of another tensor accuracy = tf.reduce_mean(tf.cast(correct_prediction, "float")) and will be evaluated by eval on accuracy.eval({x: X_test, y: y_test_onehot}).
y_test_prediction should be your correct_prediction tensor.
For those who do not have much time to understand tf.argmax:
x = np.array([[1, 9, 3],[4, 5, 6]])
tf.argmax(x, axis = 0)
output:
[array([1, 0, 1], dtype=int64)]
tf.argmax(x, axis = 1)
Output:
[array([1, 2], dtype=int64)]
source
Related
I have tensor (None, 196) and after reshaping, it becomes (None, 14, 14).
And now, I want to copy it to channel axis, so that the shape should be (None, 14, 14, 512). Lastly, I want to copy to timestep axis, so it becomes (None, 10, 14, 14, 512). I accomplish those steps using this snippet code:
def replicate(tensor, input_target):
batch_size = K.shape(tensor)[0]
nf, h, w, c = input_target
x = K.reshape(tensor, [batch_size, 1, h, w, 1])
# Replicate to channel dimension
x = K.tile(x, [batch_size, 1, 1, 1, c])
# Replicate to timesteps dimension
x = K.tile(x, [batch_size, nf, 1, 1, 1])
return x
x = ...
x = Lambda(replicate, arguments={'input_target':input_shape})(x)
another_x = Input(shape=input_shape) # shape (10, 14, 14, 512)
x = layers.multiply([x, another_x])
x = ...
I plot the model and the output shape is just like I want it to be. But, the problem arises in model training. I set the batch size to 2. This the the error message:
tensorflow.python.framework.errors_impl.InvalidArgumentError: Incompatible shapes: [8,10,14,14,512] vs. [2,10,14,14,512]
[[{{node multiply_1/mul}} = Mul[T=DT_FLOAT, _class=["loc:#training/Adam/gradients/multiply_1/mul_grad/Sum"], _device="/job:localhost/replica:0/task:0/device:GPU:0"](Lambda_2/Tile_1, _arg_another_x_0_0/_189)]]
[[{{node metrics/top_k_categorical_accuracy/Mean_1/_265}} = _Recv[client_terminated=false, recv_device="/job:localhost/replica:0/task:0/device:CPU:0", send_device="/job:localhost/replica:0/task:0/device:GPU:0", send_device_incarnation=1, tensor_name="edge_6346_metrics/top_k_categorical_accuracy/Mean_1", tensor_type=DT_FLOAT, _device="/job:localhost/replica:0/task:0/device:CPU:0"]()]]
Looks like, K.tile() increases the batch size from 2 to 8. When I set the batch size to 10, it becomes 1000.
So, my question is how to achieve the result as I want? Is it good way to use tile()? Or, should I use repeat_elements()? Thanks!
I am using Tensorflow 1.12.0 and Keras 2.2.4.
As a rule of thumb, try to avoid bringing batch size to the transformations happening in the Lambda layer.
When you use tile operation, you only set only the dimension that needs to change (for example you had batch_size value in your tile operation which is wrong). Also I am using tf.tile instead of K.tile (TF 1.12 doesn't have tile in the Keras backend it seems).
def replicate(tensor, input_target):
_, nf, h, w, c = input_target
x = K.reshape(tensor, [-1, 1, h, w, 1])
# Replicate to channel dimension
# You can combine below lines to tf.tile(x, [1, nf, 1, 1, c]) as well
x = tf.tile(x, [1, 1, 1, 1, c])
# Replicate to timesteps dimension
x = tf.tile(x, [1, nf, 1, 1, 1])
return x
Simple example
input_shape= [None, 10, 14, 14, 512]
x = Input(shape=(196,))
x = Lambda(replicate, arguments={'input_target':input_shape})(x)
print(x.shape)
Which gives
>>> (?, 10, 14, 14, 512)
I'm trying to build a neural network where the labels and the number of labels change on input. For example, I could have a final layer of 10 units that represent the logit of their class, but sometimes I will only need units [1,3,4] to calculate cross entropy, some of the units [3,4,5,7] etc.
I tried using different combinations of map_fn, gather, py_fn and while_loop but no one seems to be in my case. Another way might be to list all types of label combinations (I call them network heads) and find some conditional constructs that allow me to choose one based on the value of a placeholder. But I'm not sure how to implement it.
For example:
x = tf.placeholder(dtype=tf.float32, shape=[None,3])
y = tf.placeholder(dtype=tf.int32, shape=[None, 3])
... to_do ...
with tf.Session() as sess:
sess.run(to_do, feed_dict={x: [[1, 3, 4], [3, 7, 8]], y: [[1, 0, 0], [0, 1, 1]]})
Here I need something that return [[1],[7,8]].
Oh no matter. There was a very easy way to get the probabilites I needed for cross-entropy.
x = tf.placeholder(dtype=tf.float32, shape=[None,3])
y = tf.placeholder(dtype=tf.int32, shape=[None, 3])
probabilities = tf.where(tf.equal(y,1), tf.exp(x), tf.zeros_like(x))
normalizing_sum = tf.reduce_sum(probabilities, 1, keep_dims=True)
probabilities/=normalizing_sum
with tf.Session() as sess:
res = sess.run(probabilities, feed_dict={x: [[1, 3, 4], [3, 7, 8]], y: [[1, 0, 0], [0, 1, 1]]})
I want to take use of tensorflow to implement fully convolutional network. There is a function
tf.nn.conv2d_transpose(value, filter, output_shape, strides, padding, name),
which could be used to take bilinear upsampling. However, I am confused as to how to use it? The input is an image with a single channel, and the output is also an image with a single channel, whose size is two times the one of the input.
I tried to use the function as follows, but got an IndexError: list index out of range:
with tf.name_scope('deconv') as scope:
deconv = tf.nn.conv2d_transpose(conv6, [3, 3, 1, 1],
[1, 26, 20, 1], 2, padding='SAME', name=None)
Got it! (assuming input_size = [1, 13, 10,1])
with tf.name_scope('deconv') as scope:
deconv = tf.nn.conv2d_transpose(input_layer, [3, 3, 1, 1],
[1, 26, 20, 1], [1, 2, 2, 1], padding='SAME', name=None)
I am trying to use the tf.nn.deconv2d() op on a variable-sized batch of data. However, it appears that I need to set the output_shape argument as follows:
tf.nn.deconv2d(x, filter, output_shape=[12, 24, 24, 5], strides=[1, 2, 2, 1],
padding="SAME")
Why does tf.nn.deconv2d() take a fixed output_shape? Is there any way to specify a variable batch dimension? What happens if the input batch size varies?
N.B. tf.nn.deconv2d() will be called tf.nn.conv2d_transpose() in the next release of TensorFlow (0.7.0).
The output_shape argument to tf.nn.deconv2d() accepts a computed Tensor as its value, which enables you specify a dynamic shape. For example, let's say your input is defined as follows:
# N.B. Other dimensions are chosen arbitrarily.
input = tf.placeholder(tf.float32, [None, 24, 24, 5])
...then the batch size for a particular step can be computed at runtime:
batch_size = tf.shape(input)[0]
With this value, you can then build the output_shape argument to tf.nn.deconv2d() using tf.pack():
output_shape = tf.pack([batch_size, 24, 24, 5])
result = tf.nn.deconv2d(..., filter, output_shape=output_shape,
strides=[1, 2, 2, 1], padding='SAME')
I'm getting this error message when using conv2d_transpose:
W tensorflow/core/common_runtime/executor.cc:1102] 0x7fc81f0d6250 Compute status: Invalid argument: Conv2DBackpropInput: Number of rows of out_backprop doesn't match computed: actual = 32, computed = 4
[[Node: generator/g_h1/conv2d_transpose = Conv2DBackpropInput[T=DT_FLOAT, padding="SAME", strides=[1, 2, 2, 1], use_cudnn_on_gpu=true, _device="/job:localhost/replica:0/task:0/cpu:0"](generator/g_h1/conv2d_transpose/output_shape, generator/g_h1/w/read, _recv_l_0)]]
However, it occurs after the graph is built while compiling the loss function (Adam). Any ideas on what would cause this? I suspect it's related to the input dimensions but I'm not sure exactly why.
Full error: https://gist.github.com/jimfleming/75d88e888044615dd6e3
Relevant code:
# l shape: [batch_size, 32, 32, 4]
output_shape = [self.batch_size, 8, 8, 128]
filter_shape = [7, 7, 128, l.get_shape()[-1]]
strides = [1, 2, 2, 1]
with tf.variable_scope("g_h1"):
w = tf.get_variable('w', filter_shape, initializer=tf.random_normal_initializer(stddev=0.02))
h1 = tf.nn.conv2d_transpose(l, w, output_shape=output_shape, strides=strides, padding='SAME')
h1 = tf.nn.relu(h1)
output_shape = [self.batch_size, 16, 16, 128]
filter_shape = [7, 7, 128, h1.get_shape()[-1]]
strides = [1, 2, 2, 1]
with tf.variable_scope("g_h2"):
w = tf.get_variable('w', filter_shape, initializer=tf.random_normal_initializer(stddev=0.02))
h2 = tf.nn.conv2d_transpose(h1, w,output_shape=output_shape, strides=strides, padding='SAME')
h2 = tf.nn.relu(h2)
output_shape = [self.batch_size, 32, 32, 3]
filter_shape = [5, 5, 3, h2.get_shape()[-1]]
strides = [1, 2, 2, 1]
with tf.variable_scope("g_h3"):
w = tf.get_variable('w', filter_shape, initializer=tf.random_normal_initializer(stddev=0.02))
h3 = tf.nn.conv2d_transpose(h2, w,output_shape=output_shape, strides=strides, padding='SAME')
h3 = tf.nn.tanh(h3)
Thanks for the question! You're exactly right---the problem is that the input and output dimensions being passed to tf.nn.conv2d_transpose don't agree. (The error may be detected when computing gradients, but the gradient computation isn't the problem.)
Let's look at just the first part of your code, and simplify it a little bit:
sess = tf.Session()
batch_size = 3
output_shape = [batch_size, 8, 8, 128]
strides = [1, 2, 2, 1]
l = tf.constant(0.1, shape=[batch_size, 32, 32, 4])
w = tf.constant(0.1, shape=[7, 7, 128, 4])
h1 = tf.nn.conv2d_transpose(l, w, output_shape=output_shape, strides=strides, padding='SAME')
print sess.run(h1)
I replaced the variables with constants --- it's easier to see what's going on.
If you try to run this code, you get a similar error:
InvalidArgumentError: Conv2DCustomBackpropInput: Size of out_backprop doesn't match computed: actual = 32, computed = 4
[[Node: conv2d_transpose_6 = Conv2DBackpropInput[T=DT_FLOAT, data_format="NHWC", padding="SAME", strides=[1, 2, 2, 1], use_cudnn_on_gpu=true, _device="/job:localhost/replica:0/task:0/cpu:0"](conv2d_transpose_6/output_shape, Const_25, Const_24)]]
Now, the error is a little misleading --- it talks about the 'out_backprop' argument to 'Conv2DCustomBackpropInput'. The key is that tf.nn.conv2d_transpose is actually just the gradient of tf.nn.conv2d, so Tensorflow uses the same code internally (Conv2DCustomBackpropInput) to compute the gradient of tf.nn.conv2d and to compute tf.nn.conv2d_transpose.
The error means that the 'output_shape' you requested is not possible, given the shapes of 'l' and 'w'.
Since tf.nn.conv2d_transpose is the backward (gradient) counterpart of tf.nn.conv2d, one way to see what the correct shapes should be is to use the corresponding forward operation:
output = tf.constant(0.1, shape=output_shape)
expected_l = tf.nn.conv2d(output, w, strides=strides, padding='SAME')
print expected_l.get_shape()
# Prints (3, 4, 4, 4)
That is, in the forward direction, if you provided a tensor of shape 'output_shape', you would get out a tensor of shape (3, 4, 4, 4).
So one way to fix the problem is to change the shape of 'l' to (3, 4, 4, 4); if you change the code above to:
l = tf.constant(0.1, shape=[batch_size, 4, 4, 4])
everything works fine.
In general, try using tf.nn.conv2d to get a feel for what the relationship between the tensor shapes is. Since tf.nn.conv2d_transpose is its backward counterpart, it has the same relationship between input, output and filter shapes (but with the roles of the input and output reversed.)
Hope that helps!
Using padding='SAME' in tf.nn.conv2d_transpose() function may works too