I fine-tuned a BERT model from TensorFlow Hub to build a simple sentiment analyzer. The model trains and runs fine. To export it, I simply used:
tf.saved_model.save(model, export_dir='models')
This works fine until I reboot.
After a reboot, the model no longer loads. I've tried the Keras loader as well as TensorFlow Serving, and both fail with the same error message:
Not found: /tmp/tfhub_modules/09bd4e665682e6f03bc72fbcff7a68bf879910e/assets/vocab.txt; No such file or directory
The model is trying to load assets from the tfhub modules cache, which is wiped by reboot. I know I could persist the cache, but I don't want to do that because I want to be able to generate models and then copy them over to a separate application without worrying about the cache.
The crux of it is that I don't think it's necessary to look in the cache for the assets at all. The model was saved with an assets folder wherein vocab.txt was generated, so in order to find the assets it just needs to look in its own assets folder (I think). However, it doesn't seem to be doing that.
Is there any way to change this behaviour?
Added the code for building and exporting the model (it's not a clever model, just prototyping my workflow):
import tensorflow as tf
import tensorflow_hub as hub
import tensorflow_text  # registers the ops used by the preprocessing model
# map_model_to_preprocess / map_name_to_handle: dicts of TF Hub handles, defined elsewhere
bert_model_name = "bert_en_uncased_L-12_H-768_A-12"
EPOCHS = 1  # Initial
def build_bert_model(bert_model_name):
    input_layer = tf.keras.layers.Input(shape=(), dtype=tf.string, name="inputs")
    preprocessing_layer = hub.KerasLayer(
        map_model_to_preprocess[bert_model_name], name="preprocessing"
    )
    encoder_inputs = preprocessing_layer(input_layer)
    bert_model = hub.KerasLayer(
        map_name_to_handle[bert_model_name], name="BERT_encoder"
    )
    outputs = bert_model(encoder_inputs)
    net = outputs["pooled_output"]
    net = tf.keras.layers.Dropout(0.1)(net)
    net = tf.keras.layers.Dense(1, activation=None, name="classifier")(net)
    return tf.keras.Model(input_layer, net)
def main():
    train_ds, val_ds = load_sentiment140(batch_size=BATCH_SIZE, epochs=EPOCHS)
    steps_per_epoch = tf.data.experimental.cardinality(train_ds).numpy()
    init_lr = 3e-5
    optimizer = tf.keras.optimizers.Adam(learning_rate=init_lr)
    model = build_bert_model(bert_model_name)
    model.compile(optimizer=optimizer, loss='mse', metrics=['mse'])
    model.fit(train_ds, validation_data=val_ds, steps_per_epoch=steps_per_epoch)
    tf.saved_model.save(model, export_dir='models')

This problem comes from a TensorFlow bug triggered by versions /1 and /2 of https://tfhub.dev/tensorflow/bert_en_uncased_preprocess. The updated models tensorflow/bert_*_preprocess/3 (released last Friday) avoid this bug. Please update to the newest version.
The Classify Text with BERT tutorial has been updated accordingly.
Thanks for bringing this up!
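As a small illustration of the suggested update, the sketch below rewrites a preprocessing handle so that it points at version 3. The helper name `pin_handle_version` is hypothetical (it is not part of tensorflow_hub), and the snippet only manipulates the handle string; it does not download anything.

```python
import re


def pin_handle_version(handle: str, version: int) -> str:
    """Replace a trailing /<number> on a TF Hub handle with /<version>."""
    base = re.sub(r"/\d+$", "", handle.rstrip("/"))
    return f"{base}/{version}"


old_handle = "https://tfhub.dev/tensorflow/bert_en_uncased_preprocess/2"
new_handle = pin_handle_version(old_handle, 3)
print(new_handle)

# The model-building code would then use the pinned handle, e.g.
# preprocessing_layer = hub.KerasLayer(new_handle, name="preprocessing")
```

After switching the handle, the model has to be rebuilt and exported again; an export made with the older preprocessing versions still points at the cache.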


How to manually load a pretrained model if I can't download it using TensorFlow

I am trying to download the VGG19 model via TensorFlow
base_model = VGG19(input_shape=[256, 256, 3],
                   include_top=False,
                   weights='imagenet')
However, the download always stalls before it finishes. I've tried other models too, such as InceptionV3, and the same thing happens.
Fortunately, the output shows the link from which the weights can be downloaded manually:
Downloading data from https://storage.googleapis.com/tensorflow/keras-applications/vgg19/vgg19_weights_tf_dim_ordering_tf_kernels_notop.h5
19546112/80134624 [======>.......................] - ETA: 11s
After downloading the model from the given link I try to import the model using
base_model = load_model('vgg19_weights_tf_dim_ordering_tf_kernels_notop.h5')
but I get this error
ValueError: No model found in config file.
How do I load in the downloaded .h5 model manually?
You're using load_model on a weights file instead of a full model. You need to define the model first, then load the weights into it.
weights = "path/to/weights"
model = VGG19(input_shape=[256, 256, 3], include_top=False, weights=None)  # the defined model
model.load_weights(weights)  # the weights
I ran into the same problem while following the TensorFlow tutorial:
Transfer learning and fine-tuning: Create the base model from the pre-trained convnets
# Create the base model from the pre-trained model MobileNet V2
IMG_SIZE = (160, 160)
IMG_SHAPE = IMG_SIZE + (3,)
base_model = tf.keras.applications.MobileNetV2(input_shape=IMG_SHAPE, include_top=False, weights=None)
# load model weights manually
weights = 'mobilenet_v2_weights_tf_dim_ordering_tf_kernels_1.0_160_no_top.h5'
base_model.load_weights(weights)
I downloaded the .h5 file and loaded it manually, and it works.
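The pattern in both answers (define the architecture with weights=None, then call load_weights) can be tried offline with the sketch below. It is only an illustration: instead of the real downloaded file it first saves randomly initialised weights to a temporary file, and the file name `stand_in.weights.h5` and the helper `build_base_model` are made up for the example.

```python
import os
import tempfile

import numpy as np
import tensorflow as tf

IMG_SHAPE = (160, 160, 3)


def build_base_model():
    # weights=None builds the architecture without triggering a download.
    return tf.keras.applications.MobileNetV2(
        input_shape=IMG_SHAPE, include_top=False, weights=None)


# Stand-in for the manually downloaded file: save randomly initialised
# weights so that the example runs without network access.
weights_path = os.path.join(tempfile.mkdtemp(), "stand_in.weights.h5")
source_model = build_base_model()
source_model.save_weights(weights_path)

# The actual pattern: define the model first, then load the weights into it.
base_model = build_base_model()
base_model.load_weights(weights_path)

# Check that every weight array was restored from the file.
weights_match = all(
    np.array_equal(a, b)
    for a, b in zip(source_model.get_weights(), base_model.get_weights())
)
print(weights_match)
```

With the real file, weights_path would simply be the path of the downloaded .h5 file. The Keras application constructors also document that the weights argument may be a path to a weights file, which is a shorter way to get the same result.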

Extremely slow when saving model on Colab TPU

Saving a model is extremely slow for me in the Colab TPU environment.
I first ran into this when using a checkpoint callback, which caused training to hang at the end of the first epoch.
I then removed the callback and saved the model with model.save_weights() instead, but nothing changed. Using the Colab terminal, I found that the file grows by only about 100k in 5 minutes.
I am using TensorFlow 2.3.
My code of model fitting is here:
with tpu_strategy.scope():  # creating the model in the TPUStrategy scope means we will train the model on the TPU
    model = create_model()
checkpoint = keras.callbacks.ModelCheckpoint('baseline_{epoch:03d}.h5',
                                             save_weights_only=True, save_freq="epoch")
hist = model.fit(get_train_ds().repeat(),
                 steps_per_epoch=100,
                 epochs=5,
                 verbose=1,
                 callbacks=[checkpoint])
model.save_weights("epoch-test.h5", overwrite=True)
I found that the issue occurred because I had explicitly switched to graph mode by writing
from tensorflow.python.framework.ops import disable_eager_execution
disable_eager_execution()
with tpu_strategy.scope():
    model = create_model()
Although I still don't understand the cause, removing disable_eager_execution() solved the issue.
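A quick way to confirm which mode a notebook is in before building the model is to query the execution mode. This is only a sanity check, not an explanation of the slowdown.

```python
import tensorflow as tf

# TensorFlow 2.x runs eagerly by default; this is False only if
# disable_eager_execution() was called earlier in the same process.
eager_enabled = tf.executing_eagerly()
print("Eager execution enabled:", eager_enabled)
```

If this prints False in a fresh TF 2.x runtime, some earlier cell has switched the process to graph mode.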

Issues with saving and loading a TensorFlow model that uses a Hugging Face transformer model as its first layer

Hi, I am having serious problems saving and loading a TensorFlow model that combines Hugging Face transformers with some custom layers to do classification. I am using the latest Hugging Face transformers release with TensorFlow Keras. The idea is to extract features using DistilBERT and then run the features through a CNN to do classification and extraction. Everything works as far as getting the correct classifications.
The problem is saving the model once it is trained and then loading it again.
I am using tf.keras with TensorFlow 2.2.
The following code designs the model, trains it, evaluates it, and then saves and loads it:
bert_config = DistilBertConfig(dropout=0.2, attention_dropout=0.2, output_hidden_states=False)
bert_config.output_hidden_states = False
transformer_model = TFDistilBertModel.from_pretrained(DISTIL_BERT, config=bert_config)
input_ids_in = tf.keras.layers.Input(shape=(BERT_LENGTH,), name='input_token', dtype='int32')
input_masks_in = tf.keras.layers.Input(shape=(BERT_LENGTH,), name='masked_token', dtype='int32')
embedding_layer = transformer_model(input_ids_in, attention_mask=input_masks_in)[0]
x = tf.keras.layers.Bidirectional(
    tf.keras.layers.LSTM(50, return_sequences=True, dropout=0.1,
                         recurrent_dropout=0, recurrent_activation="sigmoid",
                         unroll=False, use_bias=True, activation="tanh"))(embedding_layer)
x = tf.keras.layers.GlobalMaxPool1D()(x)
outputs = []
# lots of code here to define the dense layers to generate the outputs
# .....
# .....
model = Model(inputs=[input_ids_in, input_masks_in], outputs=outputs)
for model_layer in model.layers[:3]:
    logger.info(f"Setting layer {model_layer.name} to not trainable")
    model_layer.trainable = False
rms_optimizer = RMSprop(learning_rate=0.001)
model.compile(loss=SigmoidFocalCrossEntropy(), optimizer=rms_optimizer)
# the code to fit the model (which works)
# then code to evaluate the model (which also works)
# finally saving the model. This too works.
tf.keras.models.save_model(model, save_url, overwrite=True, include_optimizer=True, save_format="tf")
However, when I try to load the saved model using the following
tf.keras.models.load_model(
    path, custom_objects={"Addons>SigmoidFocalCrossEntropy": SigmoidFocalCrossEntropy})
I get the following load error:
ValueError: The two structures don't have the same nested structure.
First structure: type=TensorSpec str=TensorSpec(shape=(None, 128), dtype=tf.int32, name='inputs')
Second structure: type=dict str={'input_ids': TensorSpec(shape=(None, 5), dtype=tf.int32, name='inputs/input_ids')}
More specifically: Substructure "type=dict str={'input_ids': TensorSpec(shape=(None, 5), dtype=tf.int32, name='inputs/input_ids')}" is a sequence, while substructure "type=TensorSpec str=TensorSpec(shape=(None, 128), dtype=tf.int32, name='inputs')" is not
Entire first structure:
.
Entire second structure:
{'input_ids': .}
I believe the issue is because TFDistilBertModel layer can be called using a dictionary input from DistilBertTokenizer.encode() and that happens to be the first layer. So the model compiler on load expects that to be the input signature to the call model. However, the inputs defined to the model are two tensors of shape (None, 128)
So how do I tell the load function or the save function to assume the correct signatures?
I solved the issue.
The issue was that the object transformer_model in the above code is itself not a layer. So if we want to embed it inside another Keras model, we should use the internal Keras layer that the model wraps.
So changing the line
embedding_layer = transformer_model(input_ids_in, attention_mask=input_masks_in)[0]
to
embedding_layer = transformer_model.distilbert([input_ids_in, input_masks_in])[0]
makes everything work. Hope this helps someone else. It took a long time to debug through the tf.keras code to figure this one out, although in hindsight it is obvious. :)
Coincidentally, I ran into the same problem yesterday. My solution is very similar to yours. I assumed the problem was due to how TensorFlow Keras processes custom models, so the idea was to use the layers of the custom model inside my model. This has the advantage of not referring to the layer explicitly by name (in my case, that makes it easier to build more generic models with different pretrained encoders):
sent_encoder = getattr(transformers, self.model_name).from_pretrained(self.shortcut_weights).layers[0]
I haven't explored all the Hugging Face models, but the few that I tested seem to be custom models with only one custom layer.
Your solution also works like a charm; in fact, both solutions are the same if "distilbert" refers to ".layers[0]".
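The relationship between the two fixes can be illustrated without downloading a transformer. In the sketch below, `PretrainedWrapper` and its `core` attribute are hypothetical stand-ins for the Hugging Face model and its `distilbert` attribute. It only shows that the named attribute and `.layers[0]` are the same object, and that the inner layer can be called directly inside a functional model; it does not reproduce the original save/load error.

```python
import tensorflow as tf


class PretrainedWrapper(tf.keras.Model):
    """Hypothetical stand-in for a pretrained model wrapping one main layer."""

    def __init__(self):
        super().__init__()
        # Plays the role of `transformer_model.distilbert`.
        self.core = tf.keras.layers.Dense(4, name="core")

    def call(self, inputs):
        return self.core(inputs)


wrapper = PretrainedWrapper()

# The named attribute and `.layers[0]` refer to the same layer object.
same_layer = wrapper.layers[0] is wrapper.core

# Use the inner layer, not the wrapper model, when building the functional model.
inputs = tf.keras.layers.Input(shape=(8,), name="features")
outputs = wrapper.core(inputs)
model = tf.keras.Model(inputs=inputs, outputs=outputs)

print(same_layer, [layer.name for layer in model.layers])
```

Whether `.layers[0]` is the main layer for a given Hugging Face class still has to be checked per model, as noted above.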

TensorFlow 2.0: load a saved model without the optimizer

I trained my model and saved it like this:
network.save(os.path.join(args.logdir, "cifar_model.h5"), include_optimizer=False)
Now I would like to load it and continue training like this, but that doesn't work:
model = tf.keras.models.load_model("...\\cifar_model.h5", compile=False)
model.compile(
    optimizer=tf.keras.optimizers.RMSprop(learning_rate=0.001, decay=1e-6),
    # remaining compile arguments are not shown in the original post
)
model.tb_callback = tf.keras.callbacks.TensorBoard(args.logdir, update_freq=1000, profile_batch=1)
model.tb_callback.on_train_end = lambda *_: None
datagen = tf.keras.preprocessing.image.ImageDataGenerator(
# cifar.train.data["images"], cifar.train.data["labels"],
datagen.flow(cifar.train.data["images"], cifar.train.data["labels"], batch_size=args.batch_size),
# batch_size=args.batch_size,
validation_data=(cifar.dev.data["images"], cifar.dev.data["labels"]),
It throws an error:
AttributeError: 'Network' object has no attribute 'compile'
This should work according to https://www.tensorflow.org/alpha/guide/keras/saving_and_serializing
Note that I'm saving without the optimizer to avoid a bug with loading the optimizer.
I found out how to do this when the exact structure of the layers is known.
In that case, I can recreate the model and reuse the weights from the loaded model, starting from:
load = tf.keras.models.load_model("...\\cifar_model.h5", compile=False)
However, I couldn't apply the same approach through load.layers; I think it is possible if you don't have sequential layers.
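The approach described above (recreate the architecture, then copy the weights from the loaded model) might look like the following sketch. The tiny `build_network` architecture, the random data, and the loss are assumptions made only so that the example runs on its own; the real model would reuse the original CIFAR network definition. Note that compile must be the boolean False, since the string "False" is truthy.

```python
import os
import tempfile

import numpy as np
import tensorflow as tf


def build_network():
    # Hypothetical stand-in; it must match the saved architecture exactly.
    return tf.keras.Sequential([
        tf.keras.Input(shape=(8,)),
        tf.keras.layers.Dense(16, activation="relu"),
        tf.keras.layers.Dense(3),
    ])


# Save a model without its optimizer, as in the question.
original = build_network()
path = os.path.join(tempfile.mkdtemp(), "cifar_model.h5")
original.save(path, include_optimizer=False)

# Load without compiling, then copy the weights into a fresh model.
loaded = tf.keras.models.load_model(path, compile=False)
model = build_network()
model.set_weights(loaded.get_weights())
weights_copied = all(
    np.array_equal(a, b)
    for a, b in zip(original.get_weights(), model.get_weights())
)

# The fresh model is an ordinary Keras model, so compile() and fit() work.
model.compile(
    optimizer=tf.keras.optimizers.RMSprop(learning_rate=0.001),
    loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
    metrics=["accuracy"],
)
x = np.random.rand(32, 8).astype("float32")
y = np.random.randint(0, 3, size=(32,))
history = model.fit(x, y, epochs=1, verbose=0)
print(weights_copied, sorted(history.history))
```

This only carries over the weights. Optimizer state is not restored, which matches saving with include_optimizer=False.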

Saving tf.trainable_variables() using convert_variables_to_constants

I have a Keras model that I would like to convert to a Tensorflow protobuf (e.g. saved_model.pb).
This model comes from transfer learning on the VGG-19 network: the head was cut off and replaced with fully-connected + softmax layers, which were trained while the rest of the VGG-19 network was frozen.
I can load the model in Keras, and then use keras.backend.get_session() to run the model in tensorflow, generating the correct predictions:
frame = preprocess(cv2.imread("path/to/img.jpg"))
keras_model = keras.models.load_model("path/to/keras/model.h5")
keras_prediction = keras_model.predict(frame)
with keras.backend.get_session() as sess:
    tvars = tf.trainable_variables()
    output = sess.graph.get_tensor_by_name('Softmax:0')
    input_tensor = sess.graph.get_tensor_by_name('input_1:0')
    tf_prediction = sess.run(output, {input_tensor: frame})
    print(tf_prediction)  # this matches keras_prediction exactly
If I don't include the line tvars = tf.trainable_variables(), then the tf_prediction variable is completely wrong and doesn't match the output from keras_prediction at all. In fact all the values in the output (single array with 4 probability values) are exactly the same (~0.25, all adding to 1). This made me suspect that weights for the head are just initialized to 0 if tf.trainable_variables() is not called first, which was confirmed after inspecting the model variables. In any case, calling tf.trainable_variables() causes the tensorflow prediction to be correct.
The problem is that when I try to save this model, the variables from tf.trainable_variables() don't actually get saved to the .pb file:
with keras.backend.get_session() as sess:
    tvars = tf.trainable_variables()
    constant_graph = graph_util.convert_variables_to_constants(sess, sess.graph.as_graph_def(), ['Softmax'])
    graph_io.write_graph(constant_graph, './', 'saved_model.pb', as_text=False)
What I am asking is: how can I save a Keras model as a TensorFlow protobuf with the tf.trainable_variables() intact?
Thanks so much!
Your approach of freezing the variables in the graph (converting them to constants) should work, but it isn't necessary and is trickier than the other approaches (more on this below). If you want graph freezing for some reason (e.g. exporting to a mobile device), I'd need more details to help debug, as I'm not sure what Keras is doing implicitly behind the scenes with your graph. However, if you just want to save a graph and load it later, I can explain how to do that (though there is no guarantee that whatever Keras is doing won't interfere; I'm happy to help debug that).
So there are actually two formats at play here. One is the GraphDef, which is used for Checkpointing, as it does not contain metadata about inputs and outputs. The other is a MetaGraphDef which contains metadata and a graph def, the metadata being useful for prediction and running a ModelServer (from tensorflow/serving).
In either case you need to do more than just call graph_io.write_graph because the variables are usually stored outside the graphdef.
There are wrapper libraries for both these use cases. tf.train.Saver is primarily used for saving and restoring checkpoints.
However, since you want prediction, I would suggest using a tf.saved_model.builder.SavedModelBuilder to build a SavedModel binary. I've provided some boilerplate for this below:
from tensorflow.python.saved_model.signature_constants import DEFAULT_SERVING_SIGNATURE_DEF_KEY as DEFAULT_SIG_DEF
builder = tf.saved_model.builder.SavedModelBuilder('./mymodel')
with keras.backend.get_session() as sess:
    output = sess.graph.get_tensor_by_name('Softmax:0')
    input_tensor = sess.graph.get_tensor_by_name('input_1:0')
    sig_def = tf.saved_model.signature_def_utils.predict_signature_def(
        {'input': input_tensor},
        {'output': output}
    )
    builder.add_meta_graph_and_variables(
        sess, [tf.saved_model.tag_constants.SERVING],  # tags must be a list
        signature_def_map={DEFAULT_SIG_DEF: sig_def}
    )
builder.save()
After running this code you should have a mymodel/saved_model.pb file as well as a directory mymodel/variables/ with protobufs corresponding to the variable values.
Then to load the model again, simply use tf.saved_model.loader:
# Does Keras give you the ability to start with a fresh graph?
# If not you'll need to do this in a separate program to avoid
# conflicts with the old default graph
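with tf.Session(graph=tf.Graph()) as sess:
    meta_graph_def = tf.saved_model.loader.load(
        sess, [tf.saved_model.tag_constants.SERVING], './mymodel')
    # From this point variables and graph structure are restored
    sig_def = meta_graph_def.signature_def[DEFAULT_SIG_DEF]
    # The signature stores TensorInfo protos, so fetch and feed by tensor name
    print(sess.run(sig_def.outputs['output'].name,
                   feed_dict={sig_def.inputs['input'].name: frame}))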
Obviously there's a more efficient prediction available with this code through tensorflow/serving, or Cloud ML Engine, but this should work.
It's possible that Keras is doing something under the hood which will interfere with this process as well, and if so we'd like to hear about it (and I'd like to make sure that Keras users are able to freeze graphs as well, so if you want to send me a gist with your full code or something maybe I can find someone who knows Keras well to help me debug.)
EDIT: You can find an end to end example of this here: https://github.com/GoogleCloudPlatform/cloudml-samples/blob/master/census/keras/trainer/model.py#L85
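For readers on TensorFlow 2.x, the same export and load round trip can be sketched with the tf.compat.v1 aliases. This is an adaptation, not the code from the answer: the one-variable toy graph, the tensor names, and the temporary export directory are assumptions standing in for the Keras VGG-19 model.

```python
import os
import tempfile

import numpy as np
import tensorflow as tf

tf1 = tf.compat.v1
SIG_KEY = tf.saved_model.DEFAULT_SERVING_SIGNATURE_DEF_KEY
# The builder requires that the export directory does not exist yet.
export_dir = os.path.join(tempfile.mkdtemp(), "mymodel")

# Export: a toy graph with one variable stands in for the Keras model.
graph = tf.Graph()
with graph.as_default(), tf1.Session(graph=graph) as sess:
    x = tf1.placeholder(tf.float32, shape=[None, 3], name="input_1")
    w = tf1.get_variable("w", initializer=tf.constant([[1.0], [2.0], [3.0]]))
    y = tf.matmul(x, w, name="output")
    sess.run(tf1.global_variables_initializer())
    sig_def = tf1.saved_model.signature_def_utils.predict_signature_def(
        {"input": x}, {"output": y})
    builder = tf1.saved_model.builder.SavedModelBuilder(export_dir)
    builder.add_meta_graph_and_variables(
        sess, [tf.saved_model.SERVING], signature_def_map={SIG_KEY: sig_def})
    builder.save()

# Load into a fresh graph and run the signature by tensor name.
frame = np.array([[1.0, 1.0, 1.0]], dtype=np.float32)
with tf1.Session(graph=tf.Graph()) as sess:
    meta_graph_def = tf1.saved_model.loader.load(
        sess, [tf.saved_model.SERVING], export_dir)
    sig = meta_graph_def.signature_def[SIG_KEY]
    restored = sess.run(sig.outputs["output"].name,
                        feed_dict={sig.inputs["input"].name: frame})

print(restored)
print(sorted(os.listdir(export_dir)))
```

The variable values are written under the variables/ directory next to saved_model.pb, which is why writing the graph def alone loses them.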