I am new to TensorFlow and ran into an issue when calling model.fit. This is the model summary:
Model: "sequential"
_________________________________________________________________
Layer (type)                 Output Shape              Param #
=================================================================
embedding (Embedding)        (None, 50, 5)             30
lstm (LSTM)                  (None, 64)                17920
dense (Dense)                (None, 6)                 390
=================================================================
Total params: 18,340
Trainable params: 18,340
Non-trainable params: 0
_________________________________________________________________
When execution reached model.fit, the following error appeared. The goal of this training is to analyse the text of git log messages and categorise each one as a new feature, a fix, and so on.
Traceback (most recent call last):
File "test.py", line 138, in <module>
model_training()
File "test.py", line 136, in model_training
model.fit( training_padding_sentence, training_general_label, epochs = 20)
File "/usr/local/lib/python3.8/dist-packages/keras/utils/traceback_utils.py", line 67, in error_handler
raise e.with_traceback(filtered_tb) from None
File "/usr/local/lib/python3.8/dist-packages/tensorflow/python/eager/execute.py", line 54, in quick_execute
tensors = pywrap_tfe.TFE_Py_Execute(ctx._handle, device_name, op_name,
tensorflow.python.framework.errors_impl.InvalidArgumentError: Graph execution error:
Detected at node 'sequential/embedding/embedding_lookup' defined at (most recent call last):
File "test.py", line 138, in <module>
model_training()
File "test.py", line 136, in model_training
model.fit( training_padding_sentence, training_general_label, epochs = 20)
File "/usr/local/lib/python3.8/dist-packages/keras/utils/traceback_utils.py", line 64, in error_handler
return fn(*args, **kwargs)
File "/usr/local/lib/python3.8/dist-packages/keras/engine/training.py", line 1384, in fit
tmp_logs = self.train_function(iterator)
File "/usr/local/lib/python3.8/dist-packages/keras/engine/training.py", line 1021, in train_function
return step_function(self, iterator)
File "/usr/local/lib/python3.8/dist-packages/keras/engine/training.py", line 1010, in step_function
outputs = model.distribute_strategy.run(run_step, args=(data,))
File "/usr/local/lib/python3.8/dist-packages/keras/engine/training.py", line 1000, in run_step
outputs = model.train_step(data)
File "/usr/local/lib/python3.8/dist-packages/keras/engine/training.py", line 859, in train_step
y_pred = self(x, training=True)
File "/usr/local/lib/python3.8/dist-packages/keras/utils/traceback_utils.py", line 64, in error_handler
return fn(*args, **kwargs)
File "/usr/local/lib/python3.8/dist-packages/keras/engine/base_layer.py", line 1096, in __call__
outputs = call_fn(inputs, *args, **kwargs)
File "/usr/local/lib/python3.8/dist-packages/keras/utils/traceback_utils.py", line 92, in error_handler
return fn(*args, **kwargs)
File "/usr/local/lib/python3.8/dist-packages/keras/engine/sequential.py", line 374, in call
return super(Sequential, self).call(inputs, training=training, mask=mask)
File "/usr/local/lib/python3.8/dist-packages/keras/engine/functional.py", line 451, in call
return self._run_internal_graph(
File "/usr/local/lib/python3.8/dist-packages/keras/engine/functional.py", line 589, in _run_internal_graph
outputs = node.layer(*args, **kwargs)
File "/usr/local/lib/python3.8/dist-packages/keras/utils/traceback_utils.py", line 64, in error_handler
return fn(*args, **kwargs)
File "/usr/local/lib/python3.8/dist-packages/keras/engine/base_layer.py", line 1096, in __call__
outputs = call_fn(inputs, *args, **kwargs)
File "/usr/local/lib/python3.8/dist-packages/keras/utils/traceback_utils.py", line 92, in error_handler
return fn(*args, **kwargs)
File "/usr/local/lib/python3.8/dist-packages/keras/layers/embeddings.py", line 197, in call
out = tf.nn.embedding_lookup(self.embeddings, inputs)
Node: 'sequential/embedding/embedding_lookup'
indices[0,0] = 15 is not in [0, 6)
[[{{node sequential/embedding/embedding_lookup}}]] [Op:__inference_train_function_3232]
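For what it's worth, the last lines of the traceback point at the Embedding layer: the summary shows 30 parameters with an output dimension of 5, so the layer was built with input_dim=6, yet the data contains token index 15. A minimal sketch of how one might check this before calling model.fit, in plain Python; the helper name required_input_dim and the sample data are hypothetical and not taken from the original code:

```python
def required_input_dim(padded_sequences):
    """Smallest Embedding input_dim that covers every token index in the data."""
    largest_index = max(max(seq) for seq in padded_sequences)
    # Indices start at 0, so the embedding table needs largest_index + 1 rows.
    return largest_index + 1


# Hypothetical padded sentences; index 0 is assumed to be the padding value.
sample_sentences = [
    [15, 3, 7, 0, 0],
    [2, 9, 0, 0, 0],
]

embedding_input_dim = 6  # 30 params / 5 output dims, as in the summary above
needed = required_input_dim(sample_sentences)

if needed > embedding_input_dim:
    print(f"Embedding input_dim is {embedding_input_dim}, "
          f"but the data needs at least {needed}")
```

If a check like this fails on the real data, the Embedding layer's input_dim would usually need to be at least the tokenizer's vocabulary size plus one. The value 6 happens to match the number of output classes, which may be where it came from, although the model-building code is not shown.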
I am trying to use Keras Tuner to tune an LSTM neural network that detects whether an article is fake news, using a Kaggle dataset.
However, I keep getting this error: RuntimeError: Too many failed attempts to build model
I have also tried RandomSearch instead of BayesianOptimization, but I still get the same error.
This is the code:
'''
from tensorflow import keras
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import AlphaDropout, Dense, Embedding, LSTM
from kerastuner.tuners import BayesianOptimization

# sent_length (the padded sequence length) is assumed to be defined elsewhere.

def build_model(hp):
    voc_size = 5000
    embedding_vector_features = 40
    model = Sequential([
        Embedding(
            voc_size,
            embedding_vector_features,
            input_length=sent_length
        ),
        AlphaDropout(
            rate=hp.Choice(
                'dropout_1_rate',
                values=[0.3, 0.5],
                default=0.3
            )
        ),
        LSTM(
            units=hp.Int(
                'LSTM_1_units',
                min_value=100,
                max_value=300,
                step=32,
                default=128
            ),
            activation=hp.Choice(
                'LSTM_1_activation',
                values=['relu', 'selu']
            ),
            kernel_initializer='lecun_normal'
        ),
        AlphaDropout(
            rate=hp.Choice(
                'dropout_2_rate',
                values=[0.3, 0.5],
                default=0.3
            )
        ),
        LSTM(
            units=hp.Int(
                'LSTM_2_units',
                min_value=100,
                max_value=300,
                step=32,
                default=128
            ),
            activation=hp.Choice(
                'LSTM_2_activation',
                values=['relu', 'selu']
            ),
            kernel_initializer='lecun_normal'
        ),
        AlphaDropout(
            rate=hp.Choice(
                'dropout_3_rate',
                values=[0.3, 0.5],
                default=0.3
            )
        ),
        Dense(
            units=1,
            activation='sigmoid',
            kernel_initializer='lecun_normal'
        )
    ])
    model.compile(
        optimizer=keras.optimizers.Nadam(
            hp.Choice(
                'learning_rate',
                values=[1e-2, 1e-3]
            )
        ),
        loss='binary_crossentropy',
        metrics=['accuracy']
    )
    return model

tuner_search = BayesianOptimization(
    build_model,
    objective='val_accuracy',
    max_trials=3,
    seed=42,
    directory='output',
    project_name='Fake News Classifier'
)
'''
When I try to run this code I get the following error:
WARNING:tensorflow:Layer lstm will not use cuDNN kernel since it doesn't meet the cuDNN kernel criteria. It will use generic GPU kernel as fallback when running on GPU
WARNING:tensorflow:Layer lstm_1 will not use cuDNN kernel since it doesn't meet the cuDNN kernel criteria. It will use generic GPU kernel as fallback when running on GPU
Invalid model 0/5
WARNING:tensorflow:Layer lstm will not use cuDNN kernel since it doesn't meet the cuDNN kernel criteria. It will use generic GPU kernel as fallback when running on GPU
WARNING:tensorflow:Layer lstm_1 will not use cuDNN kernel since it doesn't meet the cuDNN kernel criteria. It will use generic GPU kernel as fallback when running on GPU
Traceback (most recent call last):
File "/usr/local/lib/python3.7/dist-packages/kerastuner/engine/hypermodel.py", line 104, in build
model = self.hypermodel.build(hp)
File "<ipython-input-18-fe84fe0afbca>", line 62, in build_model
kernel_initializer='lecun_normal'
File "/usr/local/lib/python3.7/dist-packages/tensorflow/python/training/tracking/base.py", line 517, in _method_wrapper
result = method(self, *args, **kwargs)
File "/usr/local/lib/python3.7/dist-packages/tensorflow/python/keras/engine/sequential.py", line 144, in __init__
self.add(layer)
File "/usr/local/lib/python3.7/dist-packages/tensorflow/python/training/tracking/base.py", line 517, in _method_wrapper
result = method(self, *args, **kwargs)
File "/usr/local/lib/python3.7/dist-packages/tensorflow/python/keras/engine/sequential.py", line 223, in add
output_tensor = layer(self.outputs[0])
File "/usr/local/lib/python3.7/dist-packages/tensorflow/python/keras/layers/recurrent.py", line 660, in __call__
return super(RNN, self).__call__(inputs, **kwargs)
File "/usr/local/lib/python3.7/dist-packages/tensorflow/python/keras/engine/base_layer.py", line 952, in __call__
input_list)
File "/usr/local/lib/python3.7/dist-packages/tensorflow/python/keras/engine/base_layer.py", line 1091, in _functional_construction_call
inputs, input_masks, args, kwargs)
File "/usr/local/lib/python3.7/dist-packages/tensorflow/python/keras/engine/base_layer.py", line 822, in _keras_tensor_symbolic_call
return self._infer_output_signature(inputs, args, kwargs, input_masks)
File "/usr/local/lib/python3.7/dist-packages/tensorflow/python/keras/engine/base_layer.py", line 862, in _infer_output_signature
self._maybe_build(inputs)
File "/usr/local/lib/python3.7/dist-packages/tensorflow/python/keras/engine/base_layer.py", line 2685, in _maybe_build
self.input_spec, inputs, self.name)
File "/usr/local/lib/python3.7/dist-packages/tensorflow/python/keras/engine/input_spec.py", line 223, in assert_input_compatibility
str(tuple(shape)))
ValueError: Input 0 of layer lstm_1 is incompatible with the layer: expected ndim=3, found ndim=2. Full shape received: (None, 128)
Invalid model 1/5
Invalid model 2/5
Invalid model 3/5
Invalid model 4/5
Invalid model 5/5
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
/usr/local/lib/python3.7/dist-packages/kerastuner/engine/hypermodel.py in build(self, hp)
103 with maybe_distribute(self.distribution_strategy):
--> 104 model = self.hypermodel.build(hp)
105 except:
19 frames
ValueError: Input 0 of layer lstm_1 is incompatible with the layer: expected ndim=3, found ndim=2. Full shape received: (None, 128)
During handling of the above exception, another exception occurred:
RuntimeError Traceback (most recent call last)
/usr/local/lib/python3.7/dist-packages/kerastuner/engine/hypermodel.py in build(self, hp)
111 if i == self._max_fail_streak:
112 raise RuntimeError(
--> 113 'Too many failed attempts to build model.')
114 continue
115
RuntimeError: Too many failed attempts to build model.
How can I solve the issue?
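The RuntimeError only reports that the tuner gave up after repeated failures. The actual error is ValueError: Input 0 of layer lstm_1 is incompatible with the layer: expected ndim=3, found ndim=2. Full shape received: (None, 128)
An LSTM layer expects its input to be a 3D tensor with shape [batch, timesteps, feature]. In your model the first LSTM returns only its last output, which is 2D with shape (None, 128), so the second LSTM (lstm_1) cannot be built on top of it unless the first one is created with return_sequences=True.
I could reproduce your issue with the following snippet: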
import tensorflow as tf
inputs = tf.random.normal([32, 8])
lstm = tf.keras.layers.LSTM(4)
output = lstm(inputs)
print(output.shape)
Output
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
<ipython-input-2-160c5e8d5d9a> in <module>()
2 inputs = tf.random.normal([32, 8])
3 lstm = tf.keras.layers.LSTM(4)
----> 4 output = lstm(inputs)
5 print(output.shape)
2 frames
/usr/local/lib/python3.7/dist-packages/tensorflow/python/keras/engine/input_spec.py in assert_input_compatibility(input_spec, inputs, layer_name)
217 'expected ndim=' + str(spec.ndim) + ', found ndim=' +
218 str(ndim) + '. Full shape received: ' +
--> 219 str(tuple(shape)))
220 if spec.max_ndim is not None:
221 ndim = x.shape.rank
ValueError: Input 0 of layer lstm_1 is incompatible with the layer: expected ndim=3, found ndim=2. Full shape received: (32, 8)
Working code snippet
import tensorflow as tf
inputs = tf.random.normal([32, 10, 8])
lstm = tf.keras.layers.LSTM(4)
output = lstm(inputs)
print(output.shape)
Output
(32, 4)
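To connect this back to the stacked model in the question, here is a plain-Python sketch of the shape rule a Keras LSTM follows. It is not TensorFlow code: lstm_output_shape is a hypothetical helper written only for illustration, and the sequence length of 20 is an assumed stand-in for sent_length.

```python
def lstm_output_shape(input_shape, units, return_sequences=False):
    """Mimic the output-shape rule of a Keras LSTM layer."""
    if len(input_shape) != 3:
        raise ValueError(
            f"expected ndim=3, found ndim={len(input_shape)}. "
            f"Full shape received: {input_shape}"
        )
    batch, timesteps, _features = input_shape
    if return_sequences:
        # One output vector per timestep: the result is still 3D.
        return (batch, timesteps, units)
    # Only the last output is returned: the timesteps axis is dropped.
    return (batch, units)


embedded = (None, 20, 40)  # (batch, sent_length, embedding_vector_features)

# Default return_sequences=False: the first LSTM produces a 2D tensor.
first = lstm_output_shape(embedded, units=128)
print(first)  # (None, 128)
try:
    lstm_output_shape(first, units=128)
except ValueError as err:
    print("second LSTM fails:", err)

# With return_sequences=True on the first LSTM, stacking works.
first_seq = lstm_output_shape(embedded, units=128, return_sequences=True)
second = lstm_output_shape(first_seq, units=128)
print(first_seq, second)  # (None, 20, 128) (None, 128)
```

In build_model this would correspond to passing return_sequences=True to the first LSTM(...) call and leaving the second one unchanged, so that the final Dense layer still receives one vector per article.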
I followed the Fine-tuning BERT tutorial to build a model with my own dataset (it is fairly large, more than 20 GB), then re-encoded my data and loaded it from TFRecord files.
The training_dataset I created has the same signature as the one in the tutorial:
training_dataset.element_spec
({'input_word_ids': TensorSpec(shape=(32, 1024), dtype=tf.int32, name=None),
'input_mask': TensorSpec(shape=(32, 1024), dtype=tf.int32, name=None),
'input_type_ids': TensorSpec(shape=(32, 1024), dtype=tf.int32, name=None)},
TensorSpec(shape=(32,), dtype=tf.int32, name=None))
where batch_size is 32 and max_seq_length is 1024.
As the tutorial states,
The resulting tf.data.Datasets return (features, labels) pairs, as expected by keras.Model.fit
It seems that everything works as expected (although the tutorial does not show how to use training_dataset). However, the following code
bert_classifier.fit(
x = training_dataset,
validation_data=test_dataset, # has the same signature just as training_dataset
batch_size=32,
epochs=epochs,
verbose=1,
)
encounters an error that seems weird to me:
Traceback (most recent call last):
File "/usr/lib/python3.7/runpy.py", line 193, in _run_module_as_main
"__main__", mod_spec)
File "/usr/lib/python3.7/runpy.py", line 85, in _run_code
exec(code, run_globals)
File "/home/captain/project/dataload/train.py", line 81, in <module>
verbose=1,
File "/home/captain/.local/lib/python3.7/site-packages/tensorflow/python/keras/engine/training.py", line 1100, in fit
tmp_logs = self.train_function(iterator)
File "/home/captain/.local/lib/python3.7/site-packages/tensorflow/python/eager/def_function.py", line 828, in __call__
result = self._call(*args, **kwds)
File "/home/captain/.local/lib/python3.7/site-packages/tensorflow/python/eager/def_function.py", line 871, in _call
self._initialize(args, kwds, add_initializers_to=initializers)
File "/home/captain/.local/lib/python3.7/site-packages/tensorflow/python/eager/def_function.py", line 726, in _initialize
*args, **kwds))
File "/home/captain/.local/lib/python3.7/site-packages/tensorflow/python/eager/function.py", line 2969, in _get_concrete_function_internal_garbage_collected
graph_function, _ = self._maybe_define_function(args, kwargs)
File "/home/captain/.local/lib/python3.7/site-packages/tensorflow/python/eager/function.py", line 3361, in _maybe_define_function
graph_function = self._create_graph_function(args, kwargs)
File "/home/captain/.local/lib/python3.7/site-packages/tensorflow/python/eager/function.py", line 3206, in _create_graph_function
capture_by_value=self._capture_by_value),
File "/home/captain/.local/lib/python3.7/site-packages/tensorflow/python/framework/func_graph.py", line 990, in func_graph_from_py_func
func_outputs = python_func(*func_args, **func_kwargs)
File "/home/captain/.local/lib/python3.7/site-packages/tensorflow/python/eager/def_function.py", line 634, in wrapped_fn
out = weak_wrapped_fn().__wrapped__(*args, **kwds)
File "/home/captain/.local/lib/python3.7/site-packages/tensorflow/python/framework/func_graph.py", line 977, in wrapper
raise e.ag_error_metadata.to_exception(e)
ValueError: in user code:
/home/captain/.local/lib/python3.7/site-packages/tensorflow/python/keras/engine/training.py:805 train_function *
return step_function(self, iterator)
/home/captain/.local/lib/python3.7/site-packages/official/nlp/keras_nlp/layers/position_embedding.py:88 call *
return tf.broadcast_to(position_embeddings, input_shape)
/home/captain/.local/lib/python3.7/site-packages/tensorflow/python/ops/gen_array_ops.py:845 broadcast_to **
"BroadcastTo", input=input, shape=shape, name=name)
/home/captain/.local/lib/python3.7/site-packages/tensorflow/python/framework/op_def_library.py:750 _apply_op_helper
attrs=attr_protos, op_def=op_def)
/home/captain/.local/lib/python3.7/site-packages/tensorflow/python/framework/func_graph.py:592 _create_op_internal
compute_device)
/home/captain/.local/lib/python3.7/site-packages/tensorflow/python/framework/ops.py:3536 _create_op_internal
op_def=op_def)
/home/captain/.local/lib/python3.7/site-packages/tensorflow/python/framework/ops.py:2016 __init__
control_input_ops, op_def)
/home/captain/.local/lib/python3.7/site-packages/tensorflow/python/framework/ops.py:1856 _create_c_op
raise ValueError(str(e))
ValueError: Dimensions must be equal, but are 512 and 1024 for '{{node bert_classifier/bert_encoder_1/position_embedding/BroadcastTo}} =
BroadcastTo[T=DT_FLOAT, Tidx=DT_INT32](bert_classifier/bert_encoder_1/position_embedding/strided_slice_1, bert_classifier/bert_encoder_1/position_embedding/Shape)'
with input shapes: [512,768], [3] and with input tensors computed as partial shapes: input[1] = [32,1024,768].
I don't see where 512 comes from; I didn't use 512 anywhere in my code. What is wrong with my code, and how can I fix it?
The tutorial creates bert_classifier from a bert_config that is loaded from bert_config.json:
bert_classifier, bert_encoder = bert.bert_models.classifier_model(bert_config, num_labels=2)
bert_config.json
{
'attention_probs_dropout_prob': 0.1,
'hidden_act': 'gelu',
'hidden_dropout_prob': 0.1,
'hidden_size': 768,
'initializer_range': 0.02,
'intermediate_size': 3072,
'max_position_embeddings': 512,
'num_attention_heads': 12,
'num_hidden_layers': 12,
'type_vocab_size': 2,
'vocab_size': 30522
}
According to this config, hidden_size is 768 and max_position_embeddings is 512, so the sequence length of the inputs you feed to the BERT model must not exceed 512. The position embedding table has shape [512, 768] and cannot be broadcast to your [32, 1024, 768] inputs, which is why you get the shape mismatch.
Therefore, to make it work, change the sequence length from 1024 to 512 everywhere you create the input tensors.
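As a rough illustration of that fix, the sketch below truncates or pads token-id lists so that no sequence exceeds max_position_embeddings. It is plain Python with hypothetical names (fit_to_max_positions, pad_id) and toy data; the real pipeline in the question reads TFRecord files, so the same limit would have to be applied wherever those records are encoded.

```python
# Only the two config values relevant here, copied from bert_config.json above.
bert_config = {"hidden_size": 768, "max_position_embeddings": 512}


def fit_to_max_positions(token_ids, max_positions, pad_id=0):
    """Truncate or right-pad one sequence to exactly max_positions ids."""
    clipped = token_ids[:max_positions]
    return clipped + [pad_id] * (max_positions - len(clipped))


max_seq_length = 1024  # value used in the question
limit = bert_config["max_position_embeddings"]
if max_seq_length > limit:
    # Position embeddings exist only for positions 0 .. limit - 1.
    max_seq_length = limit

long_example = list(range(1, 1025))  # 1024 toy token ids
short_example = [101, 2023, 102]     # 3 toy token ids

print(len(fit_to_max_positions(long_example, max_seq_length)))   # 512
print(len(fit_to_max_positions(short_example, max_seq_length)))  # 512
```

Plain truncation discards everything after position 512, so for long documents it may be worth splitting the text into several chunks instead; which option is better depends on the data.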