Why can't tensorflow determine the shape of this expression?

I have the following expression which is giving me problems. I have defined batch_size as batch_size = tf.shape(input_tensor)[0], which dynamically determines the batch size from the input tensor to the model. I have used it elsewhere in the code without issue. What confuses me is that when I run the following line of code it says the shape is (?, ?). I would expect it to be (?, 128), because the second dimension is known.
print(tf.zeros((batch_size, 128)).get_shape())
I want to know the shape because I am trying to do the following and I am getting an error:
rnn_input = tf.reduce_sum(w * decoder_input, 1)
last_out = decoder_outputs[t - 1] if t else tf.zeros((batch_size, 128))
rnn_input = tf.concat(1, (rnn_input, last_out))
This code needs to set last_out to zero on the first time step.
Here is the error: ValueError: Linear expects shape[1] of arguments: [[None, None], [None, 1024]]
I am doing something similar when I determine my initial state vector for the RNNs.
state = tf.zeros((batch_size, decoder_multi_rnn.state_size), tf.float32)
I also get (?, ?) when I try to print the shape of state, but it does not throw any exceptions when I use it.

You are mixing static shapes and dynamic shapes. The static shape is what you get from tensor.get_shape(), which is a best-effort attempt to infer the shape at graph-construction time, while the dynamic shape comes from sess.run(tf.shape(tensor)) and is always fully defined.
To be more precise, tf.shape(tensor) creates an op in the graph that produces the shape tensor on a run call. If you do aop = tf.shape(tensor)[0], some magic through _SliceHelper adds extra ops that extract the first element of that shape tensor on a run call.
This means that myval = tf.zeros((aop, 128)) has to run aop to obtain its dimensions, so the first dimension of myval is undefined until you issue the run call. For instance, your run call could look like sess.run(myval, feed_dict={aop: 2}), where the feed_dict overrides aop with 2. Hence static shape inference reports ? for that dimension.
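A minimal sketch of that distinction, assuming TF 1.x graph mode (the names below are illustrative, not taken from your code):
import tensorflow as tf

input_tensor = tf.placeholder(tf.float32, [None, 32])
batch_size = tf.shape(input_tensor)[0]     # an op: its value exists only at run time
zeros = tf.zeros((batch_size, 128))

print(zeros.get_shape())                   # static shape: (?, ?)
with tf.Session() as sess:
    dyn_shape = sess.run(tf.shape(zeros),
                         feed_dict={input_tensor: [[0.0] * 32] * 4})
    print(dyn_shape)                       # dynamic shape: [  4 128]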

(EDIT: I rewrote the answer, as what I wrote before was off the mark.)
The quick fix to your issue is to use set_shape() to update the static (inferred) shape of the Tensor:
input_tensor = tf.placeholder(tf.float32, [None, 32])
batch_size = tf.shape(input_tensor)[0]
res = tf.zeros((batch_size, 128))
print(res.get_shape())  # prints (?, ?) WHEREAS one could expect (?, 128)
res.set_shape([None, 128])
print(res.get_shape())  # prints (?, 128)
As for why TensorFlow loses the information about the second dimension being 128, I don't really know.
Maybe @Yaroslav will be able to answer.
EDIT:
The incorrect behavior was corrected following this issue.

Related

How to use a metric with three inputs (GAP metric) in Keras while training?

This is the GAP metric code from Kaggle:
import numpy as np
import pandas as pd

def GAP(pred, conf, true):
    x = pd.DataFrame({'pred': pred, 'conf': conf, 'true': true})
    x.sort_values('conf', ascending=False, inplace=True, na_position='last')
    x['correct'] = (x.true == x.pred).astype(int)
    x['prec_k'] = x.correct.cumsum() / (np.arange(len(x)) + 1)
    x['term'] = x.prec_k * x.correct
    gap = x.term.sum() / x.true.count()
    return gap
I want to use it while training, but it takes a conf argument, a vector of probability or confidence scores for the predictions. Keras metrics, however, must take only two arguments. Is there any way to use it like this:
model.compile(loss='my_loss',metrics=[GAP])
Yes, there is a way to do this with a small tweak. Note that frameworks like Keras support loss functions and metrics of the form fun(true, pred); the function definition must have exactly this form.
Also, the second limitation is that the shapes of true and pred must be the same.
Tweaking the first limitation: concatenate the two output tensors into one. Suppose you have x output classes; then conf and pred each have shape (None, x), and you can concatenate them into a single final_output of shape (None, 2, x).
Doing this is only the first step. It won't work unless we also tweak the second limitation.
Now let us tweak the second limitation. It can be relaxed to: "the number of dimensions of both tensors must be the same." Note that I am trying to reduce the requirement from exact shapes to dimensions. This can be done with dynamic shapes; for example, shape(true) = (None, 1, x) and shape(pred) = (None, None, x) will not throw errors, as None can take any value at runtime. In short, add a layer at the end of the model that combines the outputs, and give that layer a dynamic output shape.
But in your case, true will also have shape (None, x). You can expand the dimensions of this tensor at axis=1 to get (None, 1, x) and then feed the newly generated true to the model.
Note that, as you are combining two tensors, final_output will always have shape (None, 2, x), which is not equal to (None, 1, x). But because we have configured the last layer to return a dynamic shape, i.e. (None, None, x), this is not a problem at compile time, and Keras never checks for a shape mismatch at runtime unless an operation on the tensor raises that error.
Now that final_output has a shape compatible with true, you just need to slice final_output to get back the original two tensors, pred and conf, inside your custom loss function and metrics.
The above was purely logical. To see an example implementation, check out the layers and loss function here.
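For illustration only, here is a minimal sketch of the slicing idea. The names, the number of classes x, and the combined (None, 2, x) output layout are assumptions; the GAP statistic itself (which needs sorting by confidence and a cumulative precision) is not implemented, and a plain accuracy stands in so the sketch stays runnable:
from tensorflow.keras import backend as K

def gap_style_metric(y_true, y_pred):
    # y_true was expanded to shape (None, 1, x) before being fed to the model;
    # y_pred has shape (None, 2, x): predictions stacked with confidences
    true = y_true[:, 0, :]   # original targets, shape (None, x)
    pred = y_pred[:, 0, :]   # first slice: class predictions
    conf = y_pred[:, 1, :]   # second slice: confidence scores (used by the real GAP)
    # placeholder statistic: accuracy of the predicted class
    return K.mean(K.cast(K.equal(K.argmax(true, axis=-1),
                                 K.argmax(pred, axis=-1)),
                         K.floatx()))
It would then be passed as metrics=[gap_style_metric] in model.compile().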

Keras image captioning model not compiling because of concatenate layer when mask_zero=True in a previous layer

I am new to Keras and I am trying to implement a model for an image captioning project.
I am trying to reproduce the model from the image-captioning pre-inject architecture (the picture is taken from this paper: Where to put the image in an image captioning generator), but with a minor difference: generating a word at each time step instead of only a single word at the end. In this architecture the inputs to the LSTM at the first time step are the embedded CNN features. The LSTM should support variable input lengths, and to achieve this I padded all sequences with zeros so that all of them have maxlen time steps.
The code for the model I have right now is the following:
# imports added for completeness (assuming the standalone Keras API)
from keras.layers import (Input, BatchNormalization, Dense, RepeatVector,
                          Embedding, LSTM, TimeDistributed, concatenate)
from keras.models import Model
from keras.optimizers import Adam

def get_model(model_name, batch_size, maxlen, voc_size, embed_size,
              cnn_feats_size, dropout_rate):
    # create input layer for the cnn features
    cnn_feats_input = Input(shape=(cnn_feats_size,))
    # normalize CNN features
    normalized_cnn_feats = BatchNormalization(axis=-1)(cnn_feats_input)
    # embed CNN features to have the same dimension as the word embeddings
    embedded_cnn_feats = Dense(embed_size)(normalized_cnn_feats)
    # add a time dimension so that this layer's output shape is (None, 1, embed_size)
    final_cnn_feats = RepeatVector(1)(embedded_cnn_feats)
    # create input layer for the captions (each caption has at most maxlen words)
    caption_input = Input(shape=(maxlen,))
    # embed the captions
    embedded_caption = Embedding(input_dim=voc_size,
                                 output_dim=embed_size,
                                 input_length=maxlen)(caption_input)
    # concatenate CNN features and the captions.
    # Output shape should be (None, maxlen + 1, embed_size)
    img_caption_concat = concatenate([final_cnn_feats, embedded_caption], axis=1)
    # now feed the concatenation into an LSTM layer (many-to-many)
    lstm_layer = LSTM(units=embed_size,
                      input_shape=(maxlen + 1, embed_size),  # one additional time step for the image features
                      return_sequences=True,
                      dropout=dropout_rate)(img_caption_concat)
    # create a fully connected layer to make the predictions
    pred_layer = TimeDistributed(Dense(units=voc_size))(lstm_layer)
    # build the model with CNN features and captions as inputs and
    # predictions as output
    model = Model(inputs=[cnn_feats_input, caption_input],
                  outputs=pred_layer)
    optimizer = Adam(lr=0.0001,
                     beta_1=0.9,
                     beta_2=0.999,
                     epsilon=1e-8)
    model.compile(loss='categorical_crossentropy', optimizer=optimizer)
    model.summary()
    return model
The model (as it is above) compiles without any errors (see: model summary) and I managed to train it on my data. However, it doesn't take into account the fact that my sequences are zero-padded, and the results won't be accurate because of this. When I try to change the Embedding layer to support masking (also making sure that I use voc_size + 1 instead of voc_size, as mentioned in the documentation), like this:
embedded_caption = Embedding(input_dim=voc_size + 1,
                             output_dim=embed_size,
                             input_length=maxlen,
                             mask_zero=True)(caption_input)
I get the following error:
Traceback (most recent call last):
File "/export/home/.../py3_env/lib/python3.5/site-packages/tensorflow/python/framework/ops.py", line 1567, in _create_c_op
c_op = c_api.TF_FinishOperation(op_desc)
tensorflow.python.framework.errors_impl.InvalidArgumentError: Dimension 0 in both shapes must be equal, but are 200 and 1. Shapes are [200] and [1]. for 'concatenate_1/concat_1' (op: 'ConcatV2') with input shapes: [?,1,200], [?,25,1], [] and with computed input tensors: input[2] = <1>
I don't know why it says the shape of the second array is [?, 25, 1]; I am printing its shape right before the concatenation and it is [?, 25, 200] (as it should be).
I don't understand why there would be an issue with a model that compiles and works fine without that parameter, but I assume I am missing something.
I have also been thinking about using a Masking layer instead of mask_zero=True, but it would have to come before the Embedding layer, and the documentation says that the Embedding layer should be the first layer in a model (after the input).
Is there anything I could change to fix this, or is there a workaround?
The non-equal shape error refers to the mask rather than to the tensors/inputs. Since concatenate supports masking, it needs to handle mask propagation. Your final_cnn_feats doesn't have a mask (None), while your embedded_caption has a mask of shape (?, 25). You can find this out by doing:
print(embedded_caption._keras_history[0].compute_mask(caption_input))
Since final_cnn_feats has no mask, concatenate will give it an all non-zero mask for proper mask propagation. While this is correct, that mask has the same shape as final_cnn_feats, namely (?, 1, 200) rather than (?, 1), i.e. it masks each individual feature at each time step rather than each time step as a whole. This is where the non-equal shape error comes from ((?, 1, 200) vs (?, 25)).
To fix it, you need to give final_cnn_feats a correct, matching mask. I'm not familiar with your project, but one option is to apply a Masking layer to final_cnn_feats, since it is designed to mask time steps:
final_cnn_feats = Masking()(RepeatVector(1)(embedded_cnn_feats))
This is correct only as long as not all 200 features in final_cnn_feats are zero, i.e. there is always at least one non-zero value in final_cnn_feats. Under that condition, the Masking layer produces a (?, 1) mask and will not mask out the single time step in final_cnn_feats.
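As a rough check, using the same internal _keras_history trick as above (an assumption about your setup, not something taken from your project), you can print the mask that the Masking layer would propagate:
masked_cnn_feats = Masking()(final_cnn_feats)
print(masked_cnn_feats._keras_history[0].compute_mask(final_cnn_feats))
# expected: a boolean tensor of shape (?, 1), which concatenate can merge with the (?, 25) caption mask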

Feeding a tensorflow placeholder from an array

I'm trying to train CartPole-v0 using Q-learning. When trying to update the replay buffer with experience, I am getting the following error:
ValueError: Cannot feed value of shape (128,) for Tensor 'Placeholder_1:0', which has shape '(?, 2)'
The related code snippet is:
def update_replay_buffer(replay_buffer, state, action, reward, next_state, done, action_dim):
    # append to buffer
    experience = (state, action, reward, next_state, done)
    replay_buffer.append(experience)
    # Ensure replay_buffer doesn't grow larger than REPLAY_SIZE
    if len(replay_buffer) > REPLAY_SIZE:
        replay_buffer.pop(0)
    return None
The placeholder to be fed is
action_in = tf.placeholder("float", [None, action_dim])
Can someone clarify how action_dim should be used to resolve this error?
Let's start with action_in:
action_in = tf.placeholder("float", [None, action_dim])
This means that action_in can only have a shape of the form (None, action_dim), nothing else. And from the error:
ValueError: Cannot feed value of shape (128,) for Tensor 'Placeholder_1:0', which has shape '(?, 2)'
From the error it seems your action_dim is 2, and it's easy to see that you're feeding an object of shape (128,) where a tensor of shape (?, 2), i.e. (None, 2), is expected.
So you need to check your feed_dict; that's where things are going wrong. The value you put in feed_dict for action_in must match the placeholder's dimensions.
Can someone clarify how action_dim should be used to resolve this error?
Judging from the value of action_dim, your environment's action has two components, but you are providing only one component per sample; this is inferred from the (128,) in your error. You need to fix that, for example by one-hot encoding the actions before feeding them, as in the sketch below. Hope this helps.
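A minimal sketch of that fix (the variable names and the small sample batch are illustrative, not taken from your code):
import numpy as np

action_dim = 2                      # CartPole-v0 has two discrete actions
sampled_actions = [0, 1, 1, 0]      # stand-in for actions drawn from the replay buffer

def one_hot_actions(actions, n):
    # convert integer actions, shape (batch,), to one-hot vectors, shape (batch, n)
    one_hot = np.zeros((len(actions), n), dtype=np.float32)
    one_hot[np.arange(len(actions)), actions] = 1.0
    return one_hot

action_values = one_hot_actions(sampled_actions, action_dim)   # shape (4, 2)
# feed_dict = {action_in: action_values}   # now matches the (?, 2) placeholder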

Tensorflow cannot initialize tf.Variable for dynamic batch size

I tried creating a tf.Variable with a dynamic shape. The following outlines the problem.
Doing this works.
init_bias = tf.random_uniform(shape=[self.config.hidden_layer_size, tf.shape(self.question_inputs)[0]])
However, when I try to do this:
init_bias = tf.Variable(init_bias)
It throws the error: ValueError: initial_value must have a shape specified: Tensor("random_uniform:0", shape=(?, ?), dtype=float32)
Just some context (question_inputs is a placeholder with a dynamic batch dimension):
self.question_inputs = tf.placeholder(tf.int32, shape=[None, self.config.qmax])
It seems like putting a dynamic value into tf.random_uniform gives shape=(?, ?), which then causes an error with tf.Variable.
Thanks and appreciate any help!
This should work:
init_bias = tf.Variable(init_bias,validate_shape=False)
If validate_shape is False, tensorflow allows the variable to be initialized with a value of unknown shape.
However, what you're doing seems a little strange to me. In TensorFlow, Variables are generally used to store the weights of a neural net, whose shape remains fixed regardless of the batch size. A variable batch size is handled by passing a variable-length tensor into the graph (and multiplying/adding it with a fixed-shape bias Variable), as in the sketch below.
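A minimal sketch of that more conventional pattern, assuming TF 1.x and illustrative sizes (hidden_layer_size = 64, qmax = 30):
import tensorflow as tf

hidden_layer_size = 64                                        # assumed value
question_inputs = tf.placeholder(tf.int32, shape=[None, 30])  # qmax assumed to be 30
# the bias Variable has a fixed shape, independent of the batch size
bias = tf.Variable(tf.zeros([hidden_layer_size]))
# per-example activations with a dynamic batch dimension
activations = tf.random_uniform(
    shape=[tf.shape(question_inputs)[0], hidden_layer_size])
# broadcasting adds the fixed-shape bias to every example in the batch
output = activations + bias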

tensorflow shape of a tiled tensor

I have a variable a of dimension (1, 5) which I want to 'tile' as many times as the size of my mini-batch. For example, if the mini-batch size is 32 then I want to construct a tensor c of dimension (32, 5) where each row has values the same as the original (1, 5) variable a.
But I only know the mini-batch size at run time: it's the size of dimension 0 of a placeholder b: tf.shape(b)[0]
Here's my code to construct c:
a = tf.Variable(np.random.uniform(size=(1,5)))
b = tf.placeholder(shape=[None, 12], dtype=tf.float32)
batch_size = tf.shape(b)[0]
c = tf.tile(a, tf.pack([batch_size, 1]))
This runs fine. However, c.get_shape() returns (?, ?). I don't understand why this doesn't return (?, 5) instead.
This causes an issue later in my code when I construct a matrix variable W whose number of columns is c.get_shape()[1], which I expect to be 5 rather than ?.
Any help would be appreciated. Thanks.
[EDIT: This was fixed in a commit to TensorFlow on August 10, 2016.]
This is a known limitation of TensorFlow's shape inference: when the multiples argument to tf.tile() is a computed value (such as the result of tf.pack() here), and its value is not trivially computable at graph construction time (in this case, because it depends on a tf.placeholder(), which has no value until it is fed), the current shape inference will throw its hands up and declare that the shape is unknown (but with the same rank as the input, a).
The current workaround is to use Tensor.set_shape(), which allows you as the programmer to provide additional shape information when you know more than the shape inference does. For example, you could do:
a = tf.Variable(np.random.uniform(size=(1, 5)))
b = tf.placeholder(shape=[None, 12], dtype=tf.float32)
batch_size = tf.shape(b)[0]
c = tf.tile(a, tf.pack([batch_size, 1]))
c.set_shape([None, a.get_shape()[1]]) # or `c.set_shape([None, 5])`
However, we recently added some features that make it possible to propagate partially computed values that may be used as shapes, and this can be adapted to aid the shape function for tf.tile(). I have created a GitHub issue to track this, and I have a fix being tested right now.
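As a hedged follow-up sketch for the later step mentioned in the question (the row count of W, here 7, and the variable names are arbitrary assumptions):
import numpy as np
import tensorflow as tf

a = tf.Variable(np.random.uniform(size=(1, 5)))
b = tf.placeholder(shape=[None, 12], dtype=tf.float32)
batch_size = tf.shape(b)[0]
c = tf.tile(a, tf.pack([batch_size, 1]))
c.set_shape([None, a.get_shape()[1]])

num_cols = c.get_shape()[1].value                         # 5, now known statically
W = tf.Variable(np.random.uniform(size=(7, num_cols)))    # W has num_cols columns, as in the question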