I want to initialize the Weights variable by including the BatchSize dimension, which will be different between the Training and Prediction stages. Tried using the placeholder for that, but doesn't seem to work:
batchsize = tf.placeholder(tf.int32, name='batchsize', shape=[])
...
output, state = tf.nn.dynamic_rnn(multicell, X, dtype=tf.float32, initial_state=inState)
weights = tf.Variable(tf.truncated_normal([batchsize, CELL_SIZE, 1], 0.0, 1.0), name='weights')
bias = tf.Variable(tf.zeros(1), name='bias')
preds = tf.add(tf.matmul(output, weights), bias, name='preds')
loss = tf.reduce_mean(tf.squared_difference(preds, Y_))
train_step = tf.train.AdamOptimizer(LR).minimize(loss)
I can get it to work by specifying batchsize as a constant for the weights variable dimension, as opposed to a placeholder, but this way I get an error when I try to recover the session for the Prediction stage, because there the batchsize is 1. If I specify the placeholder, I get the error:
ValueError: initial_value must have a shape specified: Tensor("truncated_normal:0", shape=(?, 32, 1), dtype=float32)
Even though I do pass the value for the batchsize placeholder into the feed_dict when running this part of the graph.
If I specify the option validate_shape=False while creating the weights variable, that stage of the graph works, but later I get this error in AdamOptimizer:
ValueError: as_list() is not defined on an unknown TensorShape.
How can I get this to work? My ultimate goal is to reduce the Cell-Size dimension of the dynamic_rnn output down to 1 to predict the output at each time-step of the RNN.
Make the whole size of variable
get the specific shape of variable corresponding to the batch size (using tf.gather)
self.model_X = tf.placeholder(dtype=tf.float32, shape=[None, 100], name='X')
real_batch_size = tf.cast(tf.shape(self.model_X)[0],tf.int32)
self.y_dk = tf.get_variable(name="y_dk",initializer=tf.truncated_normal(shape=[self.num_doc, self.num_topic], mean=0, stddev=tf.truediv(1.0,self.lambda_y)), dtype=tf.float32)
batch_y_dk = tf.reshape(tf.gather(self.y_dk, self.model_batch_data_idx), [real_batch_size, self.num_topic])
Related
Is it possible to reassign weights with different values than the initialized and still train it successfully?
For ex:
Weights= tf.Variable(shape, zeros(), name="weights")
update_weights = weights + steps * bytes
Weights = Weights.assign(update_weights)
But I get the following error when I train it using AdamOptimizer:
Trying to optimize unsupported type <tf.Tensor 'conv_1/Assign:0' shape=(5, 5, 1, 30) dtype=float32_ref>
To convert tensor to variable suitable for minimize() of Adam optimizer: used this:
q_weights = tf.Variable(q_weights.assign(weights))
But got the following error!
tensorflow.python.framework.errors_impl.InvalidArgumentError: Input 0 of node conv_2/Variable/conv_2/Assign_conv_2/Variable_0 was passed float from conv_2/Variable/cond/Merge:0 incompatible with expected float_ref.
Full flow of code attached:
Weights= tf.Variable(shape, zeros(), name="weights")
update_weights = weights + steps * bytes
Weights = Weights.assign(update_weights)
conv = tf.nn.conv2d(input, Weights, ...)
act = tf.nn.relu(...)
tf.add_to_collection('train_params, Weights)
do dot ....
tf.add_to_collection('train_params, Weights)
do all remaining layers.....
tf.add_to_collection('train_params, Weights)
do logits
tf.add_to_collection('train_params, Weights)
loss = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(labels, logits)
back_prop = tf.train.AdamOptimizer(LR).minimize(loss, var_list = tf.get_collection('train_params')
repeat for every iteration
Thanks for the help.
Just put some dependency on your Graph, since you want to tell "Before starting your computation update the weights like this" which gets translated into:
Weights= tf.Variable(shape, zeros(), name="weights")
update_weights = tf.assign(Weights, Weights + steps * bytes)
with tf.control_dependencies([update_weights]):
conv = tf.nn.conv2d(input, Weights, ...)
etc. etc.
I am trying to create a variable and then trying to assign it with the value of my convolution layer.
However it is refusing because it is saying shapes are not equal even though I have passed validate_shape=False while creating the variable.
The convolution shape is [32,20,20,3]. How do I pass this into the variable?
the bottom code:
conv = tf.layers.conv2d_transpose(conv, filters=3, kernel_size=3, strides=(2,2), padding='same',activation=tf.nn.relu) # TO ASSIGN LATER
g=tf.Variable(([32,20,20]),dtype=tf.float32,validate_shape=False)#THE VARIABLE
loss = tf.reduce_mean(tf.square(conv))
opt = tf.train.AdamOptimizer().minimize(loss)
saver = tf.train.Saver()
with tf.Session() as sess:
sess.run(tf.global_variables_initializer())
for i in range(1000):
_, xx,inp,output,target = sess.run([opt, loss,x,conv,y])#
print(xx)
print("subtraction result:",output[0]-target[0])
g=g.assign(conv)
print(g.eval())
I am getting this error:
InvalidArgumentError: Assign requires shapes of both tensors to match. lhs shape= [3] rhs shape= [32,20,20,3]
[[Node: Assign_7 = Assign[T=DT_FLOAT, use_locking=false, validate_shape=true, _device="/job:localhost/replica:0/task:0/device:CPU:0"](Variable_9, conv2d_transpose_98/Relu)]]
Can someone please help fix this?
I think you want:
import numpy as np
import tensorflow as tf
g = tf.Variable(initial_value=np.zeros((32,20,20,3)), expected_shape=(32,20,20,3), dtype=tf.float32)
If you print g you get the correct shape now:
<tf.Variable 'Variable_3:0' shape=(32, 20, 20, 3) dtype=float32_ref>
What you did was this:
g = tf.Variable(initial_value=(32,20,20), dtype=tf.float32, valid_shape=False)
By not stating expected_shape you defaulted to positional arguments, the first argument of tf.Variable is initial_value as per the documentation below:
__init__(
initial_value=None,
trainable=True,
collections=None,
validate_shape=True,
caching_device=None,
name=None,
variable_def=None,
dtype=None,
expected_shape=None,
import_scope=None,
constraint=None
)
That shape of the initial_value you declared would have been a vector of length [3] which is exactly the shape that the assign operation is complaining about.
Moral of the story: it's generally less buggy to declare arguments by name if you can. :)
Currently I have the following code:
init_state = tf.Variable(tf.zeros([batch_partition_length, state_size])) # -> [16, 1024].
final_state = tf.Variable(tf.zeros([batch_partition_length, state_size]))
And inside my inference method that is responsible producing the output, I have the following:
def inference(frames):
# Note that I write the final_state as a global valriable to avoid the shadowing issue, since it is referenced at the dynamic_rnn line.
global final_state
# .... Here we have some conv layers and so on...
# Now the RNN cell
with tf.variable_scope('local1') as scope:
# Move everything into depth so we can perform a single matrix multiply.
shape_d = pool3.get_shape()
shape = shape_d[1] * shape_d[2] * shape_d[3]
# tf_shape = tf.stack(shape)
tf_shape = 1024
print("shape:", shape, shape_d[1], shape_d[2], shape_d[3])
# So note that tf_shape = 1024, this means that we have 1024 features are fed into the network. And
# the batch size = 1024. Therefore, the aim is to divide the batch_size into num_steps so that
reshape = tf.reshape(pool3, [-1, tf_shape])
# Now we need to reshape/divide the batch_size into num_steps so that we would be feeding a sequence
rnn_inputs = tf.reshape(reshape, [batch_partition_length, step_size, tf_shape])
print('RNN inputs shape: ', rnn_inputs.get_shape()) # -> (16, 64, 1024).
cell = tf.contrib.rnn.BasicRNNCell(state_size)
# note that rnn_outputs are the outputs but not multiplied by W.
rnn_outputs, final_state = tf.nn.dynamic_rnn(cell, rnn_inputs, initial_state=init_state)
# linear Wx + b
with tf.variable_scope('softmax_linear') as scope:
weight_softmax = \
tf.Variable(
tf.truncated_normal([state_size, n_classes], stddev=1 / state_size, dtype=tf.float32, name='weight_softmax'))
bias_softmax = tf.constant(0.0, tf.float32, [n_classes], name='bias_softmax')
softmax_linear = tf.reshape(
tf.matmul(tf.reshape(rnn_outputs, [-1, state_size]), weight_softmax) + bias_softmax,
[batch_size, n_classes])
print('Output shape:', softmax_linear.get_shape())
return softmax_linear
# Here we define the loss, accuracy and the optimzer.
# now run the graph:
with tf.Session() as sess:
_, accuracy_train, loss_train, summary = \
sess.run([optimizer, accuracy, cost_scalar, merged], feed_dict={x: image_batch,
y_valence: valences,
confidence_holder: confidences})
....
Problem: How I would be able to assign initial_state the value stored in final_state? That is, how to more update a Variable value given the other?
I have used the following:
tf.assign(init_state, final_state.eval())
under session after running the sess.run command. But, this is throwing an error:
You must feed a value for placeholder tensor 'inputs' with dtype float
Where tf.Variable: "input" is declared as follows:
x = tf.placeholder(tf.float32, [None, 112, 112, 3], name='inputs')
And the feeding is done after reading the images from the tfRecords through the following command:
example = tf.train.Example()
example.ParseFromString(string_record)
height = int(example.features.feature['height']
.int64_list
.value[0])
width = int(example.features.feature['width']
.int64_list
.value[0])
img_string = (example.features.feature['image_raw']
.bytes_list
.value[0])
img_1d = np.fromstring(img_string, dtype=np.uint8)
reconstructed_img = img_1d.reshape((height, width, -1)) # Where this is added to the image_batch list, which is fed into the placeholder.
And if tried the following:
img_1d = np.fromstring(img_string, dtype=np.float32)
This will produce the following error:
ValueError: cannot reshape array of size 9408 into shape (112,112,newaxis)
Any help is much appreciated!!
So here are the mistakes that I have done so far. After doing some revision I figured out the following:
I shouldn't create the final_state as a tf.Variable. Since tf.nn.dynamic_rnn return tensors as ndarray, then, I should not instantiate the final_state int the beginning. And I should not use the global final_state under the function definition.
In order to assign the initial state the final_state, I used:
tf.assign(intial_state, final_state)
And things work out.
Note: in tensorflow, an operation returns the data as numpy array in python and as tensorflow::Tensor in C and C++.
Have a look at https://www.tensorflow.org/versions/r0.10/get_started/basic_usage for more informaiton.
assuming the input to the network is a placeholder with variable batch size, i.e.:
x = tf.placeholder(..., shape=[None, ...])
is it possible to get the shape of x after it has been fed? tf.shape(x)[0] still returns None.
If x has a variable batch size, the only way to get the actual shape is to use the tf.shape() operator. This operator returns a symbolic value in a tf.Tensor, so it can be used as the input to other TensorFlow operations, but to get a concrete Python value for the shape, you need to pass it to Session.run().
x = tf.placeholder(..., shape=[None, ...])
batch_size = tf.shape(x)[0] # Returns a scalar `tf.Tensor`
print x.get_shape()[0] # ==> "?"
# You can use `batch_size` as an argument to other operators.
some_other_tensor = ...
some_other_tensor_reshaped = tf.reshape(some_other_tensor, [batch_size, 32, 32])
# To get the value, however, you need to call `Session.run()`.
sess = tf.Session()
x_val = np.random.rand(37, 100, 100)
batch_size_val = sess.run(batch_size, {x: x_val})
print x_val # ==> "37"
You can get the shape of the tensor x using x.get_shape().as_list(). For getting the first dimension (batch size) you can use x.get_shape().as_list()[0].
I have a variable batch size, so all of my inputs are of the form
tf.placeholder(tf.float32, shape=(None, ...)
to accept the variable batch sizes. However, how might you create a constant value with variable batch size? The issue is with this line:
log_probs = tf.constant(0.0, dtype=tf.float32, shape=[None, 1])
It is giving me an error:
TypeError: unsupported operand type(s) for *: 'NoneType' and 'int'
I'm sure it is possible to initialize a constant tensor with variable batch size, how might I do so?
I've also tried the following:
tf.constant(0.0, dtype=tf.float32, shape=[-1, 1])
I get this error:
ValueError: Too many elements provided. Needed at most -1, but received 1
A tf.constant() has fixed size and value at graph construction time, so it probably isn't the right op for your application.
If you are trying to create a tensor with a dynamic size and the same (constant) value for every element, you can use tf.fill() and tf.shape() to create an appropriately-shaped tensor. For example, to create a tensor t that has the same shape as input and the value 0.5 everywhere:
input = tf.placeholder(tf.float32, shape=(None, ...))
# `tf.shape(input)` takes the dynamic shape of `input`.
t = tf.fill(tf.shape(input), 0.5)
As Yaroslav mentions in his comment, you may also be able to use (NumPy-style) broadcasting to avoid materializing a tensor with dynamic shape. For example, if input has shape (None, 32) and t has shape (1, 32) then computing tf.mul(input, t) will broadcast t on the first dimension to match the shape of input.
Suppose you want to do something using log_probs. For example, you want to do power operation on a tensor v and a constant log_probs. And you want the shape of log_probs to vary with the shape of v.
v = tf.placeholder(tf.float32, shape=(None, 1)
log_probs = tf.constant(0.0, dtype=tf.float32, shape=[None, 1])
result = tf.pow(v, log_probs)
However, you cannot construct the constant log_probs. While, firstly, you can construct tf.constant just with shape =[1] log_prob = tf.constant(0.0, dtype=tf.float32, shape=[None, 1]). Then use tf.map_fn() to do pow operation for each element of v.
v = tf.placeholder(tf.float32, shape=(None, 1)
log_prob = tf.constant(0.0, dtype=tf.float32, shape=[1])
result = tf.map_fn(lambda ele : tf.pow(ele, log_prob), v)