Shape change error in ctc_batch_cost function with TensorFlow 2.7.0 - tensorflow

I have some code that builds a CTC layer; it works in TensorFlow 2.6.1 but no longer works in 2.7.0. The code causing the problem is:
class CTCLayer(layers.Layer):
    def __init__(self, name=None):
        super().__init__(name=name)
        self.loss_fn = keras.backend.ctc_batch_cost

    def call(self, labels, label_length, predictions):  # input_length,
        batch_len = tf.cast(tf.shape(labels)[0], dtype="int64")
        input_length = tf.cast(tf.shape(predictions)[1], dtype="int64")
        label_length = tf.cast(label_length, dtype="int64")  # tf.cast(tf.shape(labels)[1], dtype="int64")
        input_length = input_length * tf.ones(shape=(batch_len, 1), dtype="int64")
        label_length = label_length * tf.ones(shape=(batch_len, 1), dtype="int64")
        loss = self.loss_fn(y_true=labels, y_pred=predictions, input_length=input_length, label_length=label_length)  # , logits_time_major=False)
        self.add_loss(loss)
        return predictions
It crashes when ctc_batch_cost is called during model building, with the following error:
ValueError: Exception encountered when calling layer "CTC_LOSS" (type CTCLayer).
Traceback:
File "<ipython-input-10-0b2cf7d5ab7d>", line 16, in call *
loss = self.loss_fn(y_true=labels, y_pred=predictions, input_length=input_length, label_length=label_length)#, logits_time_major=False)
File "/usr/local/lib/python3.7/dist-packages/keras/backend.py", line 6388, in ctc_batch_cost
ctc_label_dense_to_sparse(y_true, label_length), tf.int32)
File "/usr/local/lib/python3.7/dist-packages/keras/backend.py", line 6340, in ctc_label_dense_to_sparse
range_less_than, label_lengths, initializer=init, parallel_iterations=1)
ValueError: Input tensor `CTC_LOSS/Cast_5:0` enters the loop with shape (1, 1), but has shape (1, None) after one iteration. To allow the shape to vary across iterations, use the `shape_invariants` argument of tf.while_loop to specify a less-specific shape.
Call arguments received:
• labels=tf.Tensor(shape=(None, 1), dtype=int32)
• label_length=tf.Tensor(shape=(None, 1), dtype=int32)
• predictions=tf.Tensor(shape=(None, 509, 30), dtype=float32)
I suspect the problem is easy to fix and has something to do with the fact that TensorFlow no longer performs upranking as described in the 2.7.0 release notes:
The methods Model.fit(), Model.predict(), and Model.evaluate() will no longer uprank input data of shape (batch_size,) to become (batch_size, 1). This enables Model subclasses to process scalar data in their train_step()/test_step()/predict_step() methods.
Note that this change may break certain subclassed models. You can revert back to the previous behavior by adding upranking yourself in the train_step()/test_step()/predict_step() methods, e.g. if x.shape.rank == 1: x = tf.expand_dims(x, axis=-1). Functional models as well as Sequential models built with an explicit input shape are not affected.
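If I'm reading that suggestion correctly, re-adding the upranking myself inside the layer's call() would look roughly like the sketch below (assuming the usual tensorflow/keras imports). I'm not sure this is the right place to apply it, and I haven't confirmed it resolves the while_loop error:
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers

class CTCLayer(layers.Layer):
    def __init__(self, name=None):
        super().__init__(name=name)
        self.loss_fn = keras.backend.ctc_batch_cost

    def call(self, labels, label_length, predictions):
        # Sketch only: re-apply the (batch,) -> (batch, 1) upranking that 2.7.0
        # removed, before building the length tensors. Unverified whether this
        # actually addresses the while_loop shape error above.
        if labels.shape.rank == 1:
            labels = tf.expand_dims(labels, axis=-1)
        if label_length.shape.rank == 1:
            label_length = tf.expand_dims(label_length, axis=-1)
        batch_len = tf.cast(tf.shape(labels)[0], dtype="int64")
        input_length = tf.cast(tf.shape(predictions)[1], dtype="int64")
        label_length = tf.cast(label_length, dtype="int64")
        input_length = input_length * tf.ones(shape=(batch_len, 1), dtype="int64")
        label_length = label_length * tf.ones(shape=(batch_len, 1), dtype="int64")
        loss = self.loss_fn(y_true=labels, y_pred=predictions,
                            input_length=input_length, label_length=label_length)
        self.add_loss(loss)
        return predictions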
Any idea will be appreciated. Thanks!

Related

TensorFlow with custom gym environment: Layer "dense_6" expects 1 input(s), but it received 2 input tensors

I am trying to use TF to solve a custom gym environment, all within Google Colab.
The main script is the TF "DQN Tutorial" available here.
In place of env_name = "CartPole-v0" I am using env_name = "gym_examples/GridWorld-v0", where gym_examples/GridWorld-v0 is the sample custom environment described in the gym documentation here. (That example uses gym v0.25.0 but TF requires gym <= v0.23.0, so I also had to tweak the rendering code a bit to make it work in v0.23.0.)
The environment loads fine via env = suite_gym.load(env_name), and subsequent code cells run fine as well, until the following two cells:
fc_layer_params = (100, 50)
action_tensor_spec = tensor_spec.from_spec(env.action_spec())
num_actions = action_tensor_spec.maximum - action_tensor_spec.minimum + 1

# Define a helper function to create Dense layers configured with the right
# activation and kernel initializer.
def dense_layer(num_units):
    return tf.keras.layers.Dense(
        num_units,
        activation=tf.keras.activations.relu,
        kernel_initializer=tf.keras.initializers.VarianceScaling(
            scale=2.0, mode='fan_in', distribution='truncated_normal'))

# QNetwork consists of a sequence of Dense layers followed by a dense layer
# with `num_actions` units to generate one q_value per available action as
# its output.
dense_layers = [dense_layer(num_units) for num_units in fc_layer_params]
q_values_layer = tf.keras.layers.Dense(
    num_actions,
    activation=None,
    kernel_initializer=tf.keras.initializers.RandomUniform(
        minval=-0.03, maxval=0.03),
    bias_initializer=tf.keras.initializers.Constant(-0.2))
q_net = sequential.Sequential(dense_layers + [q_values_layer])

optimizer = tf.keras.optimizers.Adam(learning_rate=learning_rate)

train_step_counter = tf.Variable(0)

agent = dqn_agent.DqnAgent(
    train_env.time_step_spec(),
    train_env.action_spec(),
    q_network=q_net,
    optimizer=optimizer,
    td_errors_loss_fn=common.element_wise_squared_loss,
    train_step_counter=train_step_counter)

agent.initialize()
After that cell, I get an error:
ValueError: Exception encountered when calling layer "sequential_2" (type Sequential).
Layer "dense_6" expects 1 input(s), but it received 2 input tensors. Inputs received: [<tf.Tensor: shape=(1, 2), dtype=int64, numpy=array([[2, 2]])>, <tf.Tensor: shape=(1, 2), dtype=int64, numpy=array([[3, 2]])>]
Call arguments received by layer "sequential_2" (type Sequential):
• inputs={'agent': 'tf.Tensor(shape=(1, 2), dtype=int64)', 'target': 'tf.Tensor(shape=(1, 2), dtype=int64)'}
• network_state=()
• kwargs={'step_type': 'tf.Tensor(shape=(1,), dtype=int32)', 'training': 'None'}
In call to configurable 'DqnAgent' (<class 'tf_agents.agents.dqn.dqn_agent.DqnAgent'>)
I'm too much of a TF novice to understand what's going on here. I suspect it's because the action space changed from 2 actions (in CartPole) to 4 (in the custom GridWorld environment), but beyond that I cannot figure it out.
This can be solved by using an Embedding layer as your first layer. In this example (Embedding(16, 4)), 16 is the grid size (4x4), and 4 is the output dimension.
dense_layers = [dense_layer(num_units) for num_units in fc_layer_params]
For example, replacing the above line with the code below will eliminate the error.
dense_layers = [
    # First layer
    tf.keras.layers.Embedding(16, 4),
    # Other layers
    tf.keras.layers.Dense(100, activation=tf.keras.activations.relu)
]
Source and further explanation:
https://martin-ueding.de/posts/reinforcement-learning-with-frozen-lake/

Keras (TensorFlow) LSTM error in Spyder and Jupyter

When I use Google Colab, there is no error in the code, but when I use Spyder or Jupyter, the error occurs.
Model_10 = Sequential()
Model_10.add(LSTM(128, batch_input_shape = (1,10,5), stateful = True))
Model_10.add(Dense(5, activation = 'linear'))
Model_10.compile(loss = 'mse', optimizer = 'rmsprop')
Model_10.fit(x_train, y_train, epochs=1, batch_size=1, verbose=2, shuffle=False, callbacks=[history])
x_train_data.shape = (260,10,5)
y_train_data.shape = (260,1,5)
I'm using Python 3.7 and TensorFlow 2.0.
I don't know why the error occurs in Anaconda only.
The error:
ValueError: A target array with shape (260, 1, 5) was passed for an output of shape (1, 5) while using as loss mean_squared_error. This loss expects targets to have the same shape as the output.
You should reshape your labels/targets:
y_train_data = y_train_data.reshape((260,5))
Since you're using batch_input_shape in the input layer with a batch size of 1, the model takes one example per step, so the labels it sees will have shape (1, 5) anyway.
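Put together, a minimal sketch of the fix (with dummy arrays standing in for the real data, which is an assumption on my part) would be:
import numpy as np
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import LSTM, Dense

# Dummy data with the shapes from the question; replace with the real arrays.
x_train_data = np.random.rand(260, 10, 5)
y_train_data = np.random.rand(260, 1, 5)

# Drop the middle axis so the targets match the model's per-step output of (1, 5).
y_train_data = y_train_data.reshape((260, 5))

Model_10 = Sequential()
Model_10.add(LSTM(128, batch_input_shape=(1, 10, 5), stateful=True))
Model_10.add(Dense(5, activation='linear'))
Model_10.compile(loss='mse', optimizer='rmsprop')
# callbacks omitted for the sketch
Model_10.fit(x_train_data, y_train_data, epochs=1, batch_size=1, verbose=2, shuffle=False)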

Why is the batch size None in the method call of a Keras layer?

I am implementing a custom layer in Keras. If I print the shape of the input passed to the call method, I get None as the first element. Why is that? Shouldn't the first element be the batch size?
def call(self, x):
    print(x.shape)  # (None, ...)
When I call model.fit, I am passing the batch size
batch_size = 50
model.fit(x_train, y_train, ..., batch_size=batch_size)
So, when is the method call actually called? And what is the recommended way of getting the batch size in the method call?
None means it is a dynamic shape. It can take any value depending on the batch size you choose.
When you define a model, it is by default defined to support any batch size. This is what the None means. In TensorFlow 1.x the input to your model is an instance of tf.placeholder().
If you don't use keras.InputLayer() with a specified batch size, you get a first dimension of None by default:
import tensorflow as tf
model = tf.keras.models.Sequential()
model.add(tf.keras.layers.Dense(units=2, input_shape=(2, )))
print(model.inputs[0].get_shape().as_list()) # [None, 2]
print(model.inputs[0].op.type == 'Placeholder') # True
When you do use keras.InputLayer() with a specified batch size, you can define the input placeholder with a fixed batch size:
import tensorflow as tf
model = tf.keras.models.Sequential()
model.add(tf.keras.layers.InputLayer((2,), batch_size=50))
model.add(tf.keras.layers.Dense(units=2, input_shape=(2, )))
print(model.inputs[0].get_shape().as_list()) # [50, 2]
print(model.inputs[0].op.type == 'Placeholder') # True
By the time you pass a batch size to the model.fit() method, these input placeholders have already been defined and you cannot modify their shape. The batch size given to model.fit() is used only to split the data you provided into batches.
If you define your input layer with batch size 2 and then pass a different batch size to the model.fit() method, you will get a ValueError:
import tensorflow as tf
import numpy as np

model = tf.keras.models.Sequential()
model.add(tf.keras.layers.InputLayer((2,), batch_size=2))  # <-- batch_size==2
model.add(tf.keras.layers.Dense(units=2, input_shape=(2, )))
model.compile(optimizer=tf.keras.optimizers.Adam(),
              loss='categorical_crossentropy')

x_train = np.random.normal(size=(10, 2))
y_train = np.array([[0, 1] for _ in range(10)])

model.fit(x_train, y_train, batch_size=3)  # <-- batch_size==3
This will raise:
ValueError: The batch_size argument value 3 is incompatible with the specified batch size of your Input Layer: 2
I faced the same issue and found that using tf.shape(your_variable) instead of your_variable.shape solved the problem, since tf.shape(your_variable) is evaluated dynamically later, when fit is called.
Reference: https://github.com/tensorflow/tensorflow/issues/36991#issuecomment-590448880
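For illustration, the difference looks like this (the layer and model below are my own minimal example, not from the original question):
import numpy as np
import tensorflow as tf

class BatchAwareLayer(tf.keras.layers.Layer):
    def call(self, x):
        print("static shape while tracing:", x.shape)  # (None, 2) -- batch dim unknown
        dynamic_batch = tf.shape(x)[0]                 # a tensor; resolves to 50 per batch
        tf.print("dynamic batch size at run time:", dynamic_batch)
        return x

model = tf.keras.models.Sequential()
model.add(tf.keras.layers.InputLayer((2,)))
model.add(BatchAwareLayer())
model.add(tf.keras.layers.Dense(1))
model.compile(optimizer='adam', loss='mse')

x_train = np.random.rand(100, 2)
y_train = np.random.rand(100, 1)
model.fit(x_train, y_train, batch_size=50, epochs=1)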
To get the values as plain integers, consider the model's output_shape attribute.
Minimal working example:
import tensorflow as tf
model = tf.keras.models.Sequential()
model.add(tf.keras.layers.InputLayer((2,), batch_size=50))
model.add(tf.keras.layers.Dense(units=2, input_shape=(2, )))
print(model.output_shape)
print(type(model.output_shape[0]), type(model.output_shape[1]))
Output:
(50, 2)
<class 'int'> <class 'int'>

Tensorflow compute_output_shape() Not Working For Custom Layer

I have created a custom layer (called GraphGather) in Keras, yet the output tensor prints as:
Tensor("graph_gather/Tanh:0", shape=(?, ?), dtype=float32)
For some reason the shape is being returned as (?,?), which is causing the next dense layer to raise the following error:
ValueError: The last dimension of the inputs to Dense should be defined. Found None.
The GraphGather layer code is as follows:
class GraphGather(tf.keras.layers.Layer):
    def __init__(self, batch_size, num_mols_in_batch, activation_fn=None, **kwargs):
        self.batch_size = batch_size
        self.num_mols_in_batch = num_mols_in_batch
        self.activation_fn = activation_fn
        super(GraphGather, self).__init__(**kwargs)

    def build(self, input_shape):
        super(GraphGather, self).build(input_shape)

    def call(self, x, **kwargs):
        # some operations (most of def call omitted)
        out_tensor = result_of_operations()  # this line is pseudo code
        if self.activation_fn is not None:
            out_tensor = self.activation_fn(out_tensor)
        return out_tensor

    def compute_output_shape(self, input_shape):
        return (self.num_mols_in_batch, 2 * input_shape[0][-1])
I have also tried hardcoding compute_output_shape to be:
def compute_output_shape(self, input_shape):
    return (64, 150)
Yet the output tensor when printed is still
Tensor("graph_gather/Tanh:0", shape=(?, ?), dtype=float32)
which causes the ValueError written above.
System information:
Have written custom code.
OS Platform and Distribution: Linux Ubuntu 16.04
TensorFlow version: 1.5.0
Python version: 3.5.5
I had the same problem. My workaround was to add the following lines to the call method:
input_shape = tf.shape(x)
and then:
return tf.reshape(out_tensor, self.compute_output_shape(input_shape))
I haven't run into any problems with it yet.
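Put together with the call method from the question, the workaround would look roughly like this (a sketch; result_of_operations() is still the question's placeholder):
def call(self, x, **kwargs):
    # capture the dynamic input shape at run time
    input_shape = tf.shape(x)
    # ... the layer's actual operations (omitted in the question) ...
    out_tensor = result_of_operations()  # pseudo code, as in the question
    if self.activation_fn is not None:
        out_tensor = self.activation_fn(out_tensor)
    # reshape so the output carries the shape reported by compute_output_shape
    return tf.reshape(out_tensor, self.compute_output_shape(input_shape))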
If Johnny's answer doesn't work, another way to get around this is to follow the advice in https://github.com/tensorflow/tensorflow/issues/38296#issuecomment-623698709, which is to call the set_shape method on the output of your layer.
E.g.
l = GraphGather(...)
y = l(x)
y.set_shape(l.compute_output_shape(x.shape))
This only works if you are using the functional API.

Exported Tensorflow Model not Preserving Placeholder Shape

I am using exporter from tensorflow.contrib.session_bundle to save out my model:
x = tf.placeholder(tf.float32, (None,) + (100, 200) + (1,))
....
saver = tf_saver.Saver(sharded=True)
model_exporter = exporter.Exporter(saver)
model_exporter.init(
    sess.graph.as_graph_def(),
    named_graph_signatures={
        'inputs': exporter.generic_signature({'images': x}),
        'outputs': exporter.generic_signature({'classes': y})})
and then I load the model back in (session_bundle from tensorflow.contrib.session_bundle):
sess, meta_graph_def = session_bundle.load_session_bundle_from_path(input)
However when I inspect the Placeholder tensor corresponding to the input x, I see no shape information:
> sess.graph.get_tensor_by_name(input_name)
<tf.Tensor 'Placeholder:0' shape=<unknown> dtype=float32>
Is this by design or is there some bug causing the shape to be lost?
Here is an answer from a colleague:
"The exporter.generic_signature call (when building the named_graph_signatures) populates the map of the generic_signature as defined here: https://github.com/tensorflow/tensorflow/blob/master/tensorflow/contrib/session_bundle/manifest.proto#L69
The value in the map is a TensorBinding, which by itself is simply the tensor-name. See https://github.com/tensorflow/tensorflow/blob/master/tensorflow/contrib/session_bundle/manifest.proto#L20
So it is expected that the shape will not be retained and the name should sufficiently identify the tensor."
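If the static shape is needed after loading, one option (my own suggestion, not part of the colleague's answer) is to re-apply the known shape to the restored placeholder:
# Sketch: the caller re-applies the original placeholder shape after loading the bundle.
# Assumes the exporting code's shape (None, 100, 200, 1) is known here.
x_restored = sess.graph.get_tensor_by_name(input_name)
x_restored.set_shape([None, 100, 200, 1])
print(x_restored)  # <tf.Tensor 'Placeholder:0' shape=(?, 100, 200, 1) dtype=float32>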