Load a checkpoint and fine-tune using tf.estimator.Estimator - tensorflow

We're trying to translate old training code into more tf.estimator.Estimator-compliant code.
In the original code we fine-tune a model on a target dataset. Only some layers are loaded from the checkpoint before training takes place, using a combination of variables_to_restore and init_fn with MonitoredTrainingSession.
How can one achieve this kind of weight loading with the tf.estimator.Estimator approach?

You have two options; the first one is simpler:
1- Use tf.train.init_from_checkpoint in your model_fn.
2- model_fn returns an EstimatorSpec. You can set a scaffold via EstimatorSpec.
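A minimal sketch of option 1, assuming a toy model (the layer names and the assignment_map below are placeholders, not from the question):
import tensorflow as tf

def model_fn(features, labels, mode):
    # Build the graph first so the variables to be initialized exist.
    net = tf.layers.dense(features["x"], 64, name="dense_1")
    logits = tf.layers.dense(net, 4, name="logits")

    # Restore only the listed scopes from the checkpoint; every other variable
    # keeps its regular initializer, which mimics variables_to_restore/init_fn.
    tf.train.init_from_checkpoint(
        "path/to/model.ckpt",
        assignment_map={"dense_1/": "dense_1/"})

    # ... define the loss and train_op and return a tf.estimator.EstimatorSpec ...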

Alternatively, you can warm-start the whole Estimator from a checkpoint with tf.estimator.WarmStartSettings:
import tensorflow as tf

def model_fn(features, labels, mode):
    # your model definition here
    ...

# specify your saved checkpoint path
checkpoint_path = "model.ckpt"
ws = tf.estimator.WarmStartSettings(ckpt_to_initialize_from=checkpoint_path)
est = tf.estimator.Estimator(model_fn=model_fn, warm_start_from=ws)
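If you only want to warm-start a subset of layers, as in the original setup, WarmStartSettings also takes a vars_to_warm_start argument (a regex over variable names; the scope below is a placeholder):
ws = tf.estimator.WarmStartSettings(
    ckpt_to_initialize_from=checkpoint_path,
    vars_to_warm_start=".*dense_1.*")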

Related

How do I load the two stages of a saved Faster R-CNN separately in TF Object Detection 2.0?

I trained a Faster R-CNN from the TF Object Detection API and saved it using export_inference_graph.py. I have the following directory structure:
weights
|-checkpoint
|-frozen_inference_graph.pb
|-model.ckpt.data-00000-of-00001
|-model.ckpt.index
|-model.ckpt.meta
|-pipeline.config
|-saved_model
|--saved_model.pb
|--variables
I would like to load the first and second stages of the model separately. That is, I would like the following two models:
A model containing each variable in the scope FirstStageFeatureExtractor which accepts an image (or serialized tf.data.Example) as input, and outputs the feature map and RPN proposals.
A model containing each variable in the scopes SecondStageFeatureExtractor and SecondStageBoxPredictor which accepts a feature map and RPN proposals as input, and outputs the bounding box predictions and scores.
I basically want to be able to call _predict_first_stage and _predict_second_stage separately on my input data.
Currently, I only know how to load the entire model:
model = tf.saved_model.load("weights/saved_model")
model = model.signatures["serving_default"]
EDIT 6/7/2020:
For Model 1, I may be able to extract detection_features as in this question, but I'm still not sure about Model 2.
This was more difficult when Object Detection was only compatible with TF1, but is now pretty simple in TF2. There's a good example in this colab.
import os
import tensorflow as tf
from object_detection.builders import model_builder
from object_detection.utils import config_util

# Set path names
model_name = 'centernet_hg104_512x512_kpts_coco17_tpu-32'
pipeline_config = os.path.join('models/research/object_detection/configs/tf2/',
                               model_name + '.config')
model_dir = 'models/research/object_detection/test_data/checkpoint/'

# Load pipeline config and build a detection model
configs = config_util.get_configs_from_pipeline_file(pipeline_config)
model_config = configs['model']
detection_model = model_builder.build(model_config=model_config,
                                      is_training=False)

# Restore checkpoint
ckpt = tf.compat.v2.train.Checkpoint(model=detection_model)
ckpt.restore(os.path.join(model_dir, 'ckpt-0')).expect_partial()
From here one can call detection_model.predict() and associated methods such as _predict_first_stage and _predict_second_stage.
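For example, a hedged sketch: preprocess and predict are the public meta-architecture calls, while the underscore-prefixed stage methods named in the question are private and their exact signatures may differ between Object Detection API versions:
import tensorflow as tf

# Dummy input image; substitute your own tensor or tf.data pipeline.
image = tf.zeros([1, 512, 512, 3], dtype=tf.float32)

# Resize/normalize the image the way the pipeline config expects.
preprocessed, true_shapes = detection_model.preprocess(image)

# Full forward pass (both stages for a Faster R-CNN configuration).
prediction_dict = detection_model.predict(preprocessed, true_shapes)

# For a two-stage model you can also step through the stages yourself, e.g.
# detection_model._predict_first_stage(preprocessed) for the feature map and
# RPN proposals, then feed its outputs into the second stage.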

Tensorflow Combining Two Models End to End

In tensorflow it is fairly easy to load trained models back into tensorflow through the use of checkpoints. However, this use case seems oriented towards users that want to either run evaluation or additional training on a checkpointed model.
What is the simplest way in tensorflow to load a pre-trained model and use it (without training) to produce results which will then be used in a new model?
Right now the methods that seem most promising are tf.get_tensor_by_name() and tf.stop_gradient() in order to get the input and output tensors for the trained model loaded from tf.train.import_meta_graph().
What is the best practices setup for this sort of thing?
The most straightforward solution would be to freeze the pre-trained model variables using this function:
def freeze_graph(model_dir, output_node_names):
    """Extract the sub graph defined by the output nodes and convert
    all its variables into constants
    Args:
        model_dir: the root folder containing the checkpoint state file
        output_node_names: a string containing all the output nodes' names,
            comma separated
    """
    if not tf.gfile.Exists(model_dir):
        raise AssertionError("Export directory doesn't exist")
    if not output_node_names:
        print("You need to supply the name of the output node")
        return -1

    # We retrieve our checkpoint fullpath
    checkpoint = tf.train.get_checkpoint_state(model_dir)
    input_checkpoint = checkpoint.model_checkpoint_path

    # The folder containing the checkpoint (kept from the original snippet, unused below)
    absolute_model_dir = "/".join(input_checkpoint.split('/')[:-1])

    # We clear devices to allow TensorFlow to control on which device it will load operations
    clear_devices = True

    # We start a session using a temporary fresh Graph
    with tf.Session(graph=tf.Graph()) as sess:
        # We import the meta graph into the current default Graph
        saver = tf.train.import_meta_graph(input_checkpoint + '.meta',
                                           clear_devices=clear_devices)
        # We restore the weights
        saver.restore(sess, input_checkpoint)
        # We use a built-in TF helper to export variables to constants
        frozen_graph_def = tf.graph_util.convert_variables_to_constants(
            sess,  # The session is used to retrieve the weights
            tf.get_default_graph().as_graph_def(),  # The graph_def is used to retrieve the nodes
            output_node_names.split(",")  # The output node names are used to select the useful nodes
        )
    return frozen_graph_def
Then you'd be able to build your new model on top of the pre-trained one:
# Get the frozen graph definition (a GraphDef whose variables are now constants)
frozen_graph_def = freeze_graph(YOUR_MODEL_DIR, YOUR_OUTPUT_NODES)
# Import it into a fresh default graph
with tf.Graph().as_default() as graph:
    tf.import_graph_def(frozen_graph_def, name="")
    # Get the output tensor from the pre-trained model
    pre_trained_model_result = graph.get_tensor_by_name(OUTPUT_TENSOR_NAME_OF_PRETRAINED_MODEL)
    # Let's say you want the square root of the pre-trained model's result
    my_new_operation_results = tf.sqrt(pre_trained_model_result)
Because the pre-trained part is now all constants, it has no trainable variables, so you don't need tf.stop_gradient to keep it fixed.

How can I convert a trained Tensorflow model to Keras?

I have a trained Tensorflow model and weights vector which have been exported to protobuf and weights files respectively.
How can I convert these to JSON or YAML and HDF5 files which can be used by Keras?
I have the code for the Tensorflow model, so it would also be acceptable to convert the tf.Session to a keras model and save that in code.
I think a Keras callback is also a solution.
The ckpt file can be saved by TF with:
saver = tf.train.Saver()
saver.save(sess, checkpoint_name)
and to load the checkpoint in Keras, you need a callback class as follows:
class RestoreCkptCallback(keras.callbacks.Callback):
    def __init__(self, pretrained_file):
        self.pretrained_file = pretrained_file
        self.sess = keras.backend.get_session()
        self.saver = tf.train.Saver()

    def on_train_begin(self, logs=None):
        if self.pretrained_file:
            self.saver.restore(self.sess, self.pretrained_file)
            print('load weights: OK.')
Then in your keras script:
model.compile(loss='categorical_crossentropy', optimizer='rmsprop')
restore_ckpt_callback = RestoreCkptCallback(pretrained_file='./XXXX.ckpt')
model.fit(x_train, y_train, batch_size=128, epochs=20, callbacks=[restore_ckpt_callback])
That should work fine.
It is easy to implement and I hope it helps.
Francois Chollet, the creator of keras, stated in 04/2017 "you cannot turn an arbitrary TensorFlow checkpoint into a Keras model. What you can do, however, is build an equivalent Keras model then load into this Keras model the weights"
, see https://github.com/keras-team/keras/issues/5273 . To my knowledge this hasn't changed.
A small example:
First, you can extract the weights of a tensorflow checkpoint like this
import tensorflow as tf

PATH_REL_META = r'checkpoint1.meta'

# start tensorflow session
with tf.Session() as sess:
    # import graph
    saver = tf.train.import_meta_graph(PATH_REL_META)
    # load weights for graph
    saver.restore(sess, PATH_REL_META[:-5])

    # get all global variables (including model variables)
    vars_global = tf.global_variables()

    # get their name and value and put them into dictionary
    sess.as_default()
    model_vars = {}
    for var in vars_global:
        try:
            model_vars[var.name] = var.eval()
        except:
            print("For var={}, an exception occurred".format(var.name))
It might also be of use to export the tensorflow model for use in tensorboard, see https://stackoverflow.com/a/43569991/2135504
Second, you build your Keras model as usual and finalize it with model.compile. Note that you need to define each layer by name and add it to the model afterwards, e.g.
net = keras.models.Sequential()
layer_1 = keras.layers.Conv2D(6, (7,7), activation='relu', input_shape=(48,48,1))
net.add(layer_1)
...
net.compile(...)
Third, you can set the weights with the tensorflow values, e.g.
layer_1.set_weights([model_vars['conv7x7x1_1/kernel:0'], model_vars['conv7x7x1_1/bias:0']])
Currently, there is no direct built-in support in Tensorflow or Keras to convert a frozen model or checkpoint file to hdf5 format.
But since you have mentioned that you have the code of the Tensorflow model, you will have to rewrite that model's code in Keras. Then, you will have to read the values of your variables from the checkpoint file and assign them to the Keras model using the layer.set_weights(weights) method.
Rather than this methodology, I would suggest you do the training directly in Keras, as it is claimed that Keras' optimizers are 5-10% faster than Tensorflow's. Another way is to write your code in Tensorflow with the tf.contrib.keras module and save the file directly in hdf5 format.
Unsure if this is what you are looking for, but I happened to just do the same with the newly released keras support in TF 1.2. You can find more on the API here: https://www.tensorflow.org/api_docs/python/tf/contrib/keras
To save you a little time, I also found that I had to include keras modules as shown below with the additional python.keras appended to what is shown in the API docs.
from tensorflow.contrib.keras.python.keras.models import Sequential
Hope that helps get you where you want to go. Essentially once integrated in, you then just handle your model/weight export as usual.
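For instance, a hedged sketch assuming the TF 1.2 contrib.keras layout (the layers import path mirrors the models one above and may differ in later releases):
from tensorflow.contrib.keras.python.keras.models import Sequential
from tensorflow.contrib.keras.python.keras.layers import Dense

model = Sequential()
model.add(Dense(10, activation='softmax', input_shape=(784,)))
model.compile(loss='categorical_crossentropy', optimizer='rmsprop')
# ... train as usual with model.fit(...) ...
model.save('model.h5')  # standard Keras HDF5 export: architecture + weights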

Tensorflow Estimator API: Summaries

I can't manage to get summaries working with the Estimator API of Tensorflow.
The Estimator class is very useful for many reasons: I have already implemented my own classes which are really similar, but I am trying to switch to this one.
Here is the code sample:
import tensorflow as tf
import tensorflow.contrib.layers as layers
import tensorflow.contrib.learn as learn
import numpy as np
# To reproduce the error: docker run --rm -w /algo -v $(pwd):/algo tensorflow/tensorflow bash -c "python sample.py"
def model_fn(x, y, mode):
    logits = layers.fully_connected(x, 12, scope="dense-1")
    logits = layers.fully_connected(logits, 56, scope="dense-2")
    logits = layers.fully_connected(logits, 4, scope="dense-3")
    loss = tf.reduce_mean(tf.nn.sparse_softmax_cross_entropy_with_logits(logits=logits, labels=y), name="xentropy")
    return {"predictions": logits}, loss, tf.train.AdamOptimizer(0.001).minimize(loss)

def input_fun():
    """ To be completed for a 4 classes classification problem """
    feature = tf.constant(np.random.rand(100, 10))
    labels = tf.constant(np.random.random_integers(0, 3, size=(100,)))
    return feature, labels

estimator = learn.Estimator(model_fn=model_fn)
trainingConfig = tf.contrib.learn.RunConfig(save_checkpoints_secs=60)
estimator = learn.Estimator(model_fn=model_fn, model_dir="./tmp", config=trainingConfig)

# Works
estimator.fit(input_fn=input_fun, steps=2)

# The following code does not work
# Can't initialize saver
# saver = tf.train.Saver(max_to_keep=10) # Error: No variables to save

# The following fails because I am missing a saver... :(
hooks = [
    tf.train.LoggingTensorHook(["xentropy"], every_n_iter=100),
    tf.train.CheckpointSaverHook("./tmp", save_steps=1000, checkpoint_basename='model.ckpt'),
    tf.train.StepCounterHook(every_n_steps=100, output_dir="./tmp"),
    tf.train.SummarySaverHook(save_steps=100, output_dir="./tmp"),
]
estimator.fit(input_fn=input_fun, steps=2, monitors=hooks)
As you can see, I can create an Estimator and use it, but I can't manage to add hooks to the fitting process.
The logging hook works just fine, but the others require both tensors and a saver which I can't provide.
The tensors are defined in the model function, so I can't pass them to the SummarySaverHook, and the Saver can't be initialized because there is no tensor to save...
Is there a solution to my problem? (I am guessing yes but there is a lack of documentation of this part in the tensorflow documentation)
How can I initialize my saver? Or should I use other objects such as a Scaffold?
How can I pass summaries to the SummaryHook since they are defined in my model function?
Thanks in advance.
PS: I have seen the DNNClassifier API but I want to use the estimator API for Convolutional Nets and others. I need to create summaries for any estimator.
The intended use case is that you let the Estimator save summaries for you. There are options in RunConfig for configuring summary writing. RunConfigs get passed when constructing the Estimator.
Just have tf.summary.scalar("loss", loss) in the model_fn, and run the code without summary_hook. The loss is recorded and shown in TensorBoard.
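For instance, a sketch adapting the model_fn from the question; only the tf.summary.scalar line is new:
def model_fn(x, y, mode):
    logits = layers.fully_connected(x, 12, scope="dense-1")
    logits = layers.fully_connected(logits, 56, scope="dense-2")
    logits = layers.fully_connected(logits, 4, scope="dense-3")
    loss = tf.reduce_mean(
        tf.nn.sparse_softmax_cross_entropy_with_logits(logits=logits, labels=y),
        name="xentropy")
    # The Estimator's built-in summary saving picks this up and writes it to
    # model_dir, so it appears in TensorBoard without a manual SummarySaverHook.
    tf.summary.scalar("loss", loss)
    return {"predictions": logits}, loss, tf.train.AdamOptimizer(0.001).minimize(loss)
How often summaries are written can be tuned through the RunConfig passed to the Estimator (the save_summary_steps option in the contrib-era RunConfig, if I remember the name correctly).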
See also:
Tensorflow - Using tf.summary with 1.2 Estimator API

Unable to save the tf.contrib.learn wide and deep model in a tensorflow session and serve it on TensorFlow Serving

I am running the tf.contrib.learn wide and deep model in TensorFlow Serving, and to export the trained model I am using this piece of code:
with tf.Session() as sess:
    init_op = tf.initialize_all_variables()
    saver = tf.train.Saver()
    m.fit(input_fn=lambda: input_fn(df_train), steps=FLAGS.train_steps)
    print('model successfully fit!!')
    results = m.evaluate(input_fn=lambda: input_fn(df_test), steps=1)
    for key in sorted(results):
        print("%s: %s" % (key, results[key]))
    model_exporter = exporter.Exporter(saver)
    model_exporter.init(
        sess.graph.as_graph_def(),
        init_op=init_op,
        named_graph_signatures={
            'inputs': exporter.generic_signature({'input': df_train}),
            'outputs': exporter.generic_signature({'output': df_train[impressionflag]})})
    model_exporter.export(export_path, tf.constant(FLAGS.export_version), sess)
    print('Done exporting!')
But when the command saver = tf.train.Saver() runs, the error ValueError: No variables to save is displayed.
How can I save the model, so that a servable is created which is required while loading the exported model in tensorflow standard server? Any help is appreciated.
The graphs and sessions are contained in Estimator and not exposed or leaked. Thus by using Estimator.export() we can export the model and create a servable which can be used to run on model_servers.
Estimator.export() is now deprecated, so you need to use Estimator.export_savedmodel() instead.
Here I wrote a simple tutorial Exporting and Serving a TensorFlow Wide & Deep Model.
TL;DR
To export an estimator there are four steps (sketched in the code after this list):
Define features for export as a list of all features used during estimator initialization.
Create a feature config using create_feature_spec_for_parsing.
Build a serving_input_fn suitable for use in serving using input_fn_utils.build_parsing_serving_input_fn.
Export the model using export_savedmodel().
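A hedged sketch of those four steps, assuming the contrib-era API the tutorial targets (wide_columns, deep_columns, m and the export directory are placeholders from your own code):
import tensorflow as tf
from tensorflow.contrib import layers
from tensorflow.contrib.learn.python.learn.utils import input_fn_utils

# 1. All feature columns used when the estimator was constructed
feature_columns = wide_columns + deep_columns

# 2. Feature config describing how to parse serialized tf.Example protos
feature_spec = layers.create_feature_spec_for_parsing(feature_columns)

# 3. Serving input function that expects serialized tf.Examples at serving time
serving_input_fn = input_fn_utils.build_parsing_serving_input_fn(feature_spec)

# 4. Export the servable for tensorflow_model_server
m.export_savedmodel(export_dir_base='export', serving_input_fn=serving_input_fn)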
To run a client script properly you need to follow these four steps:
Create and place your script somewhere in the /serving/ folder, e.g. /serving/tensorflow_serving/example/
Create or modify corresponding BUILD file by adding a py_binary.
Build and run a model server, e.g. tensorflow_model_server.
Create, build and run a client that sends a tf.Example to our tensorflow_model_server for the inference.
For more details look at the tutorial itself.
Hope it helps.
Does your graph have any variables then? If not and all the operations work with constants instead, you can specify a flag in the Saver constructor:
saver = tf.train.Saver(allow_empty=True)