TensorFlow: load checkpoint, but only parts of it (convolutional layers) - tensorflow

Is it possible to only load specific layers (convolutional layers) out of one checkpoint file?
I've trained some CNNs fully-supervised and saved my progress (I'm doing object localization). To do auto-labelling I thought of building a weakly-supervised CNNs out of my current model...but since the weakly-supervised version has different fully-connected layers, I would like to select only the convolutional filters of my TensorFlow checkpoint file.
Of course I could manually save the weights of the corresponding layers, but due to the fact that they're already included in TensorFlow's checkpoint file I would like to extract them there, in order to have one single storing file.

TensorFlow 2.1 has many different public facilities for loading checkpoints (model.save, Checkpoint, saved_model, etc.), but to the best of my knowledge, none of them has a filtering API. So, let me suggest a snippet for hard cases, which uses tooling from the TF2.1 internal development tests.
import tensorflow as tf
from tensorflow.python.training.checkpoint_utils import load_checkpoint, list_variables

checkpoint_filename = '/path/to/our/weird/checkpoint.ckpt'
model = tf.keras.Model( ... )  # TF2.0 Model to initialize with the above checkpoint
variables_to_load = [ ... ]  # List of model weight names to update.

reader = load_checkpoint(checkpoint_filename)
for w in model.weights:
    name = w.name.split(':')[0]  # Strip the ':0' suffix. See (b/29227106)
    if name in variables_to_load:
        print(f"Updating {name}")
        w.assign(reader.get_tensor(
            # (Optional) Handle variable renaming
            {'/var_name1/in/model': '/var_name1/in/checkpoint',
             '/var_name2/in/model': '/var_name2/in/checkpoint',
             # ... and so on
             }.get(name, name)))
Note: model.weights and list_variables may help to inspect the variables in the model and in the checkpoint.
Note also that this method will not restore the model's optimizer state.
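As a minimal sketch of how variables_to_load could be built rather than typed by hand: list_variables(checkpoint_filename) returns (name, shape) pairs, so the convolutional entries can be picked by name prefix. The variable names below and the helper select_variables are hypothetical, and the sketch assumes the model and the checkpoint use the same names.

```python
def select_variables(ckpt_vars, prefixes):
    """Return the variable names that start with any of the given prefixes."""
    return [name for name, _shape in ckpt_vars
            if any(name.startswith(p) for p in prefixes)]


# Hypothetical (name, shape) pairs, in the form that
# list_variables(checkpoint_filename) returns them.
ckpt_vars = [
    ("conv1/kernel", [3, 3, 3, 32]),
    ("conv1/bias", [32]),
    ("conv2/kernel", [3, 3, 32, 64]),
    ("fc1/kernel", [1024, 10]),
    ("fc1/bias", [10]),
]

# Keep only the convolutional layers; the fully-connected ones are skipped.
variables_to_load = select_variables(ckpt_vars, prefixes=("conv",))
print(variables_to_load)
```

If the layers in your checkpoint are not named by type, an explicit list of names is the safer choice.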

Related

Is it possible to load and train a model, if the only thing we have is checkpoint files?

Is it possible to load and train a model from checkpoint files?
We have information about input and output Tensor shape.
Check point files
Yes. You can use tensorflow-keras following this example.
https://www.tensorflow.org/guide/checkpoint
Directly from tensorflow documentation.
List checkpoints
!ls ./tf_ckpts
which produces
checkpoint ckpt-8.data-00000-of-00001 ckpt-9.index
ckpt-10.data-00000-of-00001 ckpt-8.index
ckpt-10.index ckpt-9.data-00000-of-00001
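Note that a plain alphabetical sort of this listing would put ckpt-9 after ckpt-10. tf.train.latest_checkpoint avoids the problem by reading the checkpoint state file. If that file is missing, a rough sketch of picking the newest prefix by step number could look like this (latest_prefix is a hypothetical helper, not a TensorFlow function):

```python
import re


def latest_prefix(filenames):
    """Return the checkpoint prefix (e.g. 'ckpt-10') with the highest step number."""
    steps = {}
    for fname in filenames:
        # Only the .index files are inspected; there is one per checkpoint.
        match = re.match(r"^(.*-(\d+))\.index$", fname)
        if match:
            steps[match.group(1)] = int(match.group(2))
    return max(steps, key=steps.get) if steps else None


files = [
    "checkpoint",
    "ckpt-8.data-00000-of-00001", "ckpt-8.index",
    "ckpt-9.data-00000-of-00001", "ckpt-9.index",
    "ckpt-10.data-00000-of-00001", "ckpt-10.index",
]
print(latest_prefix(files))  # ckpt-10
```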
Recover from Checkpoint
Calling restore() on a tf.train.Checkpoint object queues the requested restorations, restoring variable values as soon as there's a matching path from the Checkpoint object. For example we can load just the bias from the model we defined above by reconstructing one path to it through the network and the layer.
to_restore = tf.Variable(tf.zeros([5])) # variables from your model.
print(to_restore.numpy()) # All zeros
fake_layer = tf.train.Checkpoint(bias=to_restore)
fake_net = tf.train.Checkpoint(l1=fake_layer)
new_root = tf.train.Checkpoint(net=fake_net)
status = new_root.restore(tf.train.latest_checkpoint('./tf_ckpts/'))
print(to_restore.numpy()) # We get the restored value now
To double check that it was restored you can type:
status.assert_existing_objects_matched()
and get the following output.
<tensorflow.python.training.tracking.util.CheckpointLoadStatus at 0x7f1d796da278>
Yes, it is possible if the checkpoint contains the parameters of the model (parameters such as W and b in W*x + b). I think you have that. In the case of transfer learning, you can use this with your files:
# Loads the weights
model.load_weights(checkpoint_path)
You need to know the architecture of the model and create the model before using this. Some models have their own specific way of loading a checkpoint.
Also, check this out: https://www.tensorflow.org/tutorials/keras/save_and_load

Tensorflow Object-Detection API - How does the Fine-Tuning of a model works?

This is a more general question about the Tensorflow Object-Detection API.
I am using this API; to be more concrete, I am fine-tuning a model on my dataset. According to the description of the API, I use the model_main.py script to retrain a model from a given checkpoint/frozen graph.
However, it is not clear to me how fine-tuning works within the API. Does a re-initialization of the last layer happen automatically, or do I have to implement something like that myself?
In the README files I did not find any hint concerning this topic. Maybe somebody could help me.
Whether you are training from scratch or from a checkpoint, model_main.py is the main program. Besides this program, all you need is a correct pipeline config file.
Fine-tuning can be separated into two steps: restoring weights and updating weights. Both steps can be configured through the train proto file; this proto corresponds to train_config in the pipeline config file.
train_config: {
  batch_size: 24
  optimizer { }
  fine_tune_checkpoint: "PATH_TO_BE_CONFIGURED/model.ckpt"
  fine_tune_checkpoint_type: "detection"
  # Note: The below line limits the training process to 200K steps, which we
  # empirically found to be sufficient enough to train the pets dataset. This
  # effectively bypasses the learning rate schedule (the learning rate will
  # never decay). Remove the below line to train indefinitely.
  num_steps: 200000
  data_augmentation_options {}
}
Step 1, restoring weights.
In this step, you configure which variables are restored by setting fine_tune_checkpoint_type; the options are detection and classification. Setting it to detection restores almost all variables from the checkpoint, while setting it to classification restores only the variables from the feature_extractor scope (all the layers in backbone networks such as VGG, ResNet, and MobileNet, which are called feature extractors).
Previously this was controlled by from_detection_checkpoint and load_all_detection_checkpoint_vars, but these two fields are deprecated.
Also notice that after you have configured fine_tune_checkpoint_type, the actual restore operation checks whether each variable in the graph exists in the checkpoint; if it does not, the variable is initialized with the routine initialization operation.
For example, suppose you want to fine-tune a ssd_mobilenet_v1_custom_data model and you downloaded the checkpoint ssd_mobilenet_v1_coco. When you set fine_tune_checkpoint_type: detection, all variables in the graph that are also available in the checkpoint file are restored, including the box predictor (last layer) weights. If you set fine_tune_checkpoint_type: classification, only the weights of the MobileNet layers are restored. If you use a checkpoint from a different model, say faster_rcnn_resnet_xxx, the variables in the graph are not available in the checkpoint, so you will see a "Variable XXX is not available in checkpoint" warning in the output log, and those variables won't be restored.
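The restore rule described above can be sketched in plain Python. This is a simplified model of the behaviour, not the Object Detection API's own code: plan_restore is a hypothetical helper, the variable names are made up, and it assumes the feature extractor variables share a FeatureExtractor/ prefix and keep the same names in the checkpoint.

```python
def plan_restore(graph_vars, ckpt_vars, checkpoint_type):
    """Split graph variables into (restored, left_to_initializer)."""
    if checkpoint_type == "detection":
        candidates = list(graph_vars)
    elif checkpoint_type == "classification":
        # Only the backbone (feature extractor) is considered.
        candidates = [v for v in graph_vars if v.startswith("FeatureExtractor/")]
    else:
        raise ValueError(f"unknown checkpoint type: {checkpoint_type}")
    # A candidate is restored only if the checkpoint actually contains it.
    restored = [v for v in candidates if v in ckpt_vars]
    initialized = [v for v in graph_vars if v not in restored]
    return restored, initialized


graph_vars = [
    "FeatureExtractor/MobilenetV1/Conv2d_0/weights",
    "FeatureExtractor/MobilenetV1/Conv2d_1_depthwise/depthwise_weights",
    "BoxPredictor_0/ClassPredictor/weights",
    "BoxPredictor_0/NewHead/weights",  # not present in the checkpoint
]
ckpt_vars = set(graph_vars[:3])

for kind in ("detection", "classification"):
    restored, initialized = plan_restore(graph_vars, ckpt_vars, kind)
    print(kind, len(restored), "restored,", len(initialized), "initialized")
```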
Step 2, updating weights
Now you have all weights restored and you want to keep training (fine-tuning) on your own dataset. Normally this should be enough.
But if you want to experiment and freeze some layers during training, you can customize the training by setting freeze_variables. Say you want to freeze all the weights of the MobileNet and only update the weights of the box predictor: you can set freeze_variables: [feature_extractor] so that all variables that have feature_extractor in their names won't be updated. For detailed info, please see another answer that I wrote.
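To make the name matching concrete, here is a small sketch of how such a filter could behave. It assumes each freeze_variables entry is treated as a regular expression searched anywhere in the variable name; split_trainable and the variable names are hypothetical and only illustrate the idea.

```python
import re


def split_trainable(variable_names, freeze_patterns):
    """Return (trainable, frozen) given regex patterns that mark frozen variables."""
    frozen = [name for name in variable_names
              if any(re.search(pattern, name) for pattern in freeze_patterns)]
    trainable = [name for name in variable_names if name not in frozen]
    return trainable, frozen


# Hypothetical variable names for illustration.
variable_names = [
    "feature_extractor/conv_0/weights",
    "feature_extractor/conv_1/weights",
    "box_predictor/class_predictor/weights",
    "box_predictor/box_encoding_predictor/weights",
]

trainable, frozen = split_trainable(variable_names, ["feature_extractor"])
print("trainable:", trainable)
print("frozen:", frozen)
```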
So to fine-tune a model on your custom dataset, you should prepare a custom config file. You can start with the sample config files and then modify some fields to suit your needs.

How to extract Tensorflow trained weights from graph.pbtxt to raw data

I have trained a custom neural network with the function:
tf.estimator.train_and_evaluate
After correct training, it contains the following files:
checkpoint
events.out.tfevents.1538489166.ti
model.ckpt-0.data-00000-of-00002
model.ckpt-0.index
model.ckpt-10.data-00000-of-00002
model.ckpt-10.index eval
graph.pbtxt
model.ckpt-0.data-00001-of-00002
model.ckpt-0.meta
model.ckpt-10.data-00001-of-00002
model.ckpt-10.meta
Now I need to export the weights and biases of every layer, into a raw data structure, e.g. an array, numpy.
I have read multiple pages on TensorFlow and on other topics, but I cannot find an answer to this question. The first thing I would assume is to put the files together into graph.pb with freeze.py, as suggested here:
Tensorflow: How to convert .meta, .data and .index model files into one graph.pb file
But then still the main question is unsolved.
If you wish to evaluate tensors alone, you can check out this question. But if you wish to e.g. deploy your network, you can take a look at TensorFlow serving, which is probably the most performant one right now. Or if you want to export this network to other frameworks and use them there, you can actually use ONNX for this purpose.
If saving weights and biases in a numpy array is your strict requirement, you can follow this example:
# In a TF shell, define all requirements and call the model function
y = model(x, is_training=False, reuse=tf.AUTO_REUSE) # For example
Once you call this function, you can see all the variables in the graph by running
tf.global_variables()
You need to restore all these variables from the latest checkpoint (say ckpt_dir) and then execute each of these variables to get the latest values.
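checkpoint = tf.train.latest_checkpoint('./model_dir/')
# Returns a function that, given a session, assigns the checkpoint values
fine_tune = tf.contrib.slim.assign_from_checkpoint_fn(checkpoint,
                                                      tf.global_variables(),
                                                      ignore_missing_vars=True)
sess = tf.Session()
sess.run(tf.global_variables_initializer())
fine_tune(sess)  # Overwrite the initial values with the ones from the checkpoint
gv = sess.run(tf.global_variables())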
Now gv will be a list of all the values of your variables (weights and biases); You can access any individual component via indexing - gv[5] etc. Or you can convert the entire thing into an array and save using numpy.
np.save('my_weights', np.array(gv))
This will save all your weights and biases in your current working directory as a numpy array - 'my_weights.npy'.
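Because the variables usually have different shapes, packing them into one array can fail or produce an object array, depending on the NumPy version. One alternative, sketched here under assumptions, is to store each array under its variable name with np.savez. The names and values below are stand-ins for [v.op.name for v in tf.global_variables()] and gv, and the output path is a temporary directory chosen for the example.

```python
import os
import tempfile

import numpy as np

# Hypothetical stand-ins for the variable names and the evaluated values (gv).
names = ["conv1/weights", "conv1/biases", "fc/weights"]
gv = [np.ones((3, 3, 1, 4)), np.zeros(4), np.ones((16, 2))]

# One entry per variable, keyed by its name.
path = os.path.join(tempfile.mkdtemp(), "my_weights.npz")
np.savez(path, **dict(zip(names, gv)))

# Load the archive back and look up a single variable by name.
loaded = np.load(path)
print(sorted(loaded.files))
print(loaded["conv1/biases"].shape)
```

Keeping the names alongside the values also avoids relying on the ordering of tf.global_variables() when the weights are read back.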
Hope this helps.

Tensorflow remove layers from pretrained model

Is there a way to load a pretrained model in Tensorflow and remove the top layers in the network? I am looking at Tensorflow release r1.10
The only documentation I could find is with tf.keras.Sequential.pop
https://www.tensorflow.org/versions/r1.10/api_docs/python/tf/keras/Sequential#pop
I want to manually prune a pretrained network by removing a bunch of top convolution layers and adding a custom fully convolutional layer.
EDIT:
The model is ssd_mobilenet_v1_coco downloaded from Tensorflow Model Zoo. I have access to both the frozen_inference_graph.pb model file and checkpoint file.
I do not have access to the Python code that was used to construct the model.
Thanks.
From inspecting the code, SSDMobileNetV1FeatureExtractor.extract_features delegates to research.slim.nets:
from nets import mobilenet_v1  # nets will have to be on your PYTHONPATH

with tf.variable_scope('MobilenetV1',
                       reuse=self._reuse_weights) as scope:
    with slim.arg_scope(
            mobilenet_v1.mobilenet_v1_arg_scope(
                is_training=None, regularize_depthwise=True)):
        with (slim.arg_scope(self._conv_hyperparams_fn())
              if self._override_base_feature_extractor_hyperparams
              else context_manager.IdentityContextManager()):
            _, image_features = mobilenet_v1.mobilenet_v1_base(
                ops.pad_to_multiple(preprocessed_inputs, self._pad_to_multiple),
                final_endpoint='Conv2d_13_pointwise',
                min_depth=self._min_depth,
                depth_multiplier=self._depth_multiplier,
                use_explicit_padding=self._use_explicit_padding,
                scope=scope)
The mobilenet_v1_base function takes a final_endpoint argument. Rather than prune the constructed graph, just construct the graph up until the endpoint you want.
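The idea of building only up to an endpoint can be illustrated without TensorFlow. The sketch below walks an ordered list of layer names and stops at the requested one; build_until is a hypothetical stand-in for the graph-building loop, and the names follow the Conv2d_N_depthwise / Conv2d_N_pointwise pattern seen in the snippet above.

```python
# Ordered endpoint names: one initial conv, then 13 depthwise/pointwise pairs.
ENDPOINTS = ["Conv2d_0"] + [
    f"Conv2d_{i}_{kind}"
    for i in range(1, 14)
    for kind in ("depthwise", "pointwise")
]


def build_until(final_endpoint):
    """Return the layers that would be constructed up to and including final_endpoint."""
    if final_endpoint not in ENDPOINTS:
        raise ValueError(f"unknown final_endpoint: {final_endpoint}")
    built = []
    for name in ENDPOINTS:
        built.append(name)  # a real builder would add the conv op here
        if name == final_endpoint:
            break  # nothing after the endpoint is ever created
    return built


print(build_until("Conv2d_2_pointwise"))
print(len(build_until("Conv2d_13_pointwise")))  # 27
```

Choosing an earlier final_endpoint therefore gives a truncated backbone on which a custom layer can be stacked.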

Saving tf.trainable_variables() using convert_variables_to_constants

I have a Keras model that I would like to convert to a Tensorflow protobuf (e.g. saved_model.pb).
This model comes from transfer learning on the VGG-19 network, in which the head was cut off and replaced with fully-connected + softmax layers that were trained while the rest of the VGG-19 network was frozen.
I can load the model in Keras, and then use keras.backend.get_session() to run the model in tensorflow, generating the correct predictions:
frame = preprocess(cv2.imread("path/to/img.jpg"))
keras_model = keras.models.load_model("path/to/keras/model.h5")
keras_prediction = keras_model.predict(frame)
print(keras_prediction)
with keras.backend.get_session() as sess:
    tvars = tf.trainable_variables()
    output = sess.graph.get_tensor_by_name('Softmax:0')
    input_tensor = sess.graph.get_tensor_by_name('input_1:0')
    tf_prediction = sess.run(output, {input_tensor: frame})
    print(tf_prediction)  # this matches keras_prediction exactly
If I don't include the line tvars = tf.trainable_variables(), then the tf_prediction variable is completely wrong and doesn't match the output from keras_prediction at all. In fact all the values in the output (single array with 4 probability values) are exactly the same (~0.25, all adding to 1). This made me suspect that weights for the head are just initialized to 0 if tf.trainable_variables() is not called first, which was confirmed after inspecting the model variables. In any case, calling tf.trainable_variables() causes the tensorflow prediction to be correct.
The problem is that when I try to save this model, the variables from tf.trainable_variables() don't actually get saved to the .pb file:
with keras.backend.get_session() as sess:
    tvars = tf.trainable_variables()
    constant_graph = graph_util.convert_variables_to_constants(sess, sess.graph.as_graph_def(), ['Softmax'])
    graph_io.write_graph(constant_graph, './', 'saved_model.pb', as_text=False)
What I am asking is, how can I save a Keras model as a TensorFlow protobuf with the tf.trainable_variables() intact?
Thanks so much!
So your approach of freezing the variables in the graph (converting to constants) should work, but it isn't necessary and is trickier than the other approaches (more on this below). If you want graph freezing for some reason (e.g. exporting to a mobile device), I'd need more details to help debug, as I'm not sure what implicit stuff Keras is doing behind the scenes with your graph. However, if you just want to save a graph and load it later, I can explain how to do that (though no guarantees that whatever Keras is doing won't screw it up; happy to help debug that).
So there are actually two formats at play here. One is the GraphDef, which is used for Checkpointing, as it does not contain metadata about inputs and outputs. The other is a MetaGraphDef which contains metadata and a graph def, the metadata being useful for prediction and running a ModelServer (from tensorflow/serving).
In either case you need to do more than just call graph_io.write_graph because the variables are usually stored outside the graphdef.
There are wrapper libraries for both these use cases. tf.train.Saver is primarily used for saving and restoring checkpoints.
However, since you want prediction, I would suggest using a tf.saved_model.builder.SavedModelBuilder to build a SavedModel binary. I've provided some boiler plate for this below:
from tensorflow.python.saved_model.signature_constants import DEFAULT_SERVING_SIGNATURE_DEF_KEY as DEFAULT_SIG_DEF

builder = tf.saved_model.builder.SavedModelBuilder('./mymodel')
with keras.backend.get_session() as sess:
    output = sess.graph.get_tensor_by_name('Softmax:0')
    input_tensor = sess.graph.get_tensor_by_name('input_1:0')
    sig_def = tf.saved_model.signature_def_utils.predict_signature_def(
        {'input': input_tensor},
        {'output': output}
    )
    builder.add_meta_graph_and_variables(
        sess, [tf.saved_model.tag_constants.SERVING],  # tags are passed as a list
        signature_def_map={
            DEFAULT_SIG_DEF: sig_def
        }
    )
    builder.save()
After running this code you should have a mymodel/saved_model.pb file as well as a directory mymodel/variables/ with protobufs corresponding to the variable values.
Then to load the model again, simply use tf.saved_model.loader:
# Does Keras give you the ability to start with a fresh graph?
# If not you'll need to do this in a separate program to avoid
# conflicts with the old default graph
with tf.Session(graph=tf.Graph()) as sess:
    meta_graph_def = tf.saved_model.loader.load(
        sess,
        [tf.saved_model.tag_constants.SERVING],  # same tag list as used when saving
        './mymodel'
    )
    # From this point variables and graph structure are restored
    sig_def = meta_graph_def.signature_def[DEFAULT_SIG_DEF]
    # The signature stores TensorInfo protos, so fetch and feed by tensor name
    print(sess.run(sig_def.outputs['output'].name,
                   feed_dict={sig_def.inputs['input'].name: frame}))
Obviously there's a more efficient prediction available with this code through tensorflow/serving, or Cloud ML Engine, but this should work.
It's possible that Keras is doing something under the hood which will interfere with this process as well, and if so we'd like to hear about it (and I'd like to make sure that Keras users are able to freeze graphs as well, so if you want to send me a gist with your full code or something maybe I can find someone who knows Keras well to help me debug.)
EDIT: You can find an end to end example of this here: https://github.com/GoogleCloudPlatform/cloudml-samples/blob/master/census/keras/trainer/model.py#L85