How to restore a tensorflow model that only has one file with extension ".model" - tensorflow

I want to use a pretrained tensorflow model provided by an unknown author. I do not know how he/she managed to save the tensorflow model (he/she used tensorflow version >= 1.2) to only one file with the extension '.model', as normally I get either three files '.meta', '.data', '.index' or one file with '.ckpt'.
How can I restore this pretrained model? How can I save a model to this format later?
Thanks.

I have also asked this question on a number of platforms with no assistance yet. So I decided to do some experimental work and this is what I found. This may be long but please bear with me.
To import a model in Tensor-flow we use
with tf.Session() as sess:
new_saver = tf.train.import_meta_graph('my_test_model-1000.meta')
new_saver.restore(sess, tf.train.latest_checkpoint('./'))
The .meta file contains all the variables, operations, collections, etc, of the trained model. What tf.train.latest_checkpoint('./') does is to use the checkpoint file (which simply keeps a record of latest checkpoint files saved) to import the xxxx_model.data-00000-of-00001. This .data-00000-of-00001 contains all the weights, biases, gradients, etc, that must be loaded into the variables contained in my_test_model-1000.meta.
Summary [Semi-complete code]
with tf.Session() as sess:
new_saver = tf.train.import_meta_graph('my_test_model-1000.meta')
#new_saver.restore(sess, tf.train.latest_checkpoint('./'))
tensor_variable = tf.trainable_variables()
for tensor_var in tensor_variable:
#print(sess.run(tensor_var))
print(tensor_var)
This initial code will print out all the variables from .meta that are trainable. If you try to run print(sess.run(tensor_var)) you will get an error. This is because, the variables have not been initialized. How ever, if you un-comment new_saver.restore(sess, tf.train.latest_checkpoint('./')) and run print(sess.run(tensor_var)), you will get all the variables alongside values loaded into the variables.
Now to “.model”
My best guess is that xxxxxx.model works a much like xxxx_model.data-00000-of-00001 from tensorflow. It does not contain variables and so if you try to do
with tf.Session() as sess:
new_saver = tf.train.import_meta_graph('xxx.model')
you will get an error. Remember, the reason is that, this .model file does not contain any variables nor operation graph of any form. If you also try to do
with tf.Session() as sess:
new_saver = tf.train.Saver()
new_saver.restore(sess, "xxxx.model")
you will similarly get an error. This is because, there are no corresponding variables to load values into. Therefore, if you ever obtain a xxx.model file, you will have to go through the pain of replicating all the variables and operations before trying to run new_saver.restore(sess, "xxxx.model"). If you are able to replicate the architecture, this will run smoothly with no issues, hopefully.
I am sorry this was long, but considering that there is almost no answer on the internet, I had to make a lecture out of it. :)

Related

Saving tf.trainable_variables() using convert_variables_to_constants

I have a Keras model that I would like to convert to a Tensorflow protobuf (e.g. saved_model.pb).
This model comes from transfer learning on the vgg-19 network in which and the head was cut-off and trained with fully-connected+softmax layers while the rest of the vgg-19 network was frozen
I can load the model in Keras, and then use keras.backend.get_session() to run the model in tensorflow, generating the correct predictions:
frame = preprocess(cv2.imread("path/to/img.jpg")
keras_model = keras.models.load_model("path/to/keras/model.h5")
keras_prediction = keras_model.predict(frame)
print(keras_prediction)
with keras.backend.get_session() as sess:
tvars = tf.trainable_variables()
output = sess.graph.get_tensor_by_name('Softmax:0')
input_tensor = sess.graph.get_tensor_by_name('input_1:0')
tf_prediction = sess.run(output, {input_tensor: frame})
print(tf_prediction) # this matches keras_prediction exactly
If I don't include the line tvars = tf.trainable_variables(), then the tf_prediction variable is completely wrong and doesn't match the output from keras_prediction at all. In fact all the values in the output (single array with 4 probability values) are exactly the same (~0.25, all adding to 1). This made me suspect that weights for the head are just initialized to 0 if tf.trainable_variables() is not called first, which was confirmed after inspecting the model variables. In any case, calling tf.trainable_variables() causes the tensorflow prediction to be correct.
The problem is that when I try to save this model, the variables from tf.trainable_variables() don't actually get saved to the .pb file:
with keras.backend.get_session() as sess:
tvars = tf.trainable_variables()
constant_graph = graph_util.convert_variables_to_constants(sess, sess.graph.as_graph_def(), ['Softmax'])
graph_io.write_graph(constant_graph, './', 'saved_model.pb', as_text=False)
What I am asking is, how can I save a Keras model as a Tensorflow protobuf with the tf.training_variables() intact?
Thanks so much!
So your approach of freezing the variables in the graph (converting to constants), should work, but isn't necessary and is trickier than the other approaches. (more on this below). If your want graph freezing for some reason (e.g. exporting to a mobile device), I'd need more details to help debug, as I'm not sure what implicit stuff Keras is doing behind the scenes with your graph. However, if you want to just save and load a graph later, I can explain how to do that, (though no guarantees that whatever Keras is doing won't screw it up..., happy to help debug that).
So there are actually two formats at play here. One is the GraphDef, which is used for Checkpointing, as it does not contain metadata about inputs and outputs. The other is a MetaGraphDef which contains metadata and a graph def, the metadata being useful for prediction and running a ModelServer (from tensorflow/serving).
In either case you need to do more than just call graph_io.write_graph because the variables are usually stored outside the graphdef.
There are wrapper libraries for both these use cases. tf.train.Saver is primarily used for saving and restoring checkpoints.
However, since you want prediction, I would suggest using a tf.saved_model.builder.SavedModelBuilder to build a SavedModel binary. I've provided some boiler plate for this below:
from tensorflow.python.saved_model.signature_constants import DEFAULT_SERVING_SIGNATURE_DEF_KEY as DEFAULT_SIG_DEF
builder = tf.saved_model.builder.SavedModelBuilder('./mymodel')
with keras.backend.get_session() as sess:
output = sess.graph.get_tensor_by_name('Softmax:0')
input_tensor = sess.graph.get_tensor_by_name('input_1:0')
sig_def = tf.saved_model.signature_def_utils.predict_signature_def(
{'input': input_tensor},
{'output': output}
)
builder.add_meta_graph_and_variables(
sess, tf.saved_model.tag_constants.SERVING,
signature_def_map={
DEFAULT_SIG_DEF: sig_def
}
)
builder.save()
After running this code you should have a mymodel/saved_model.pb file as well as a directory mymodel/variables/ with protobufs corresponding to the variable values.
Then to load the model again, simply use tf.saved_model.loader:
# Does Keras give you the ability to start with a fresh graph?
# If not you'll need to do this in a separate program to avoid
# conflicts with the old default graph
with tf.Session(graph=tf.Graph()):
meta_graph_def = tf.saved_model.loader.load(
sess,
tf.saved_model.tag_constants.SERVING,
'./mymodel'
)
# From this point variables and graph structure are restored
sig_def = meta_graph_def.signature_def[DEFAULT_SIG_DEF]
print(sess.run(sig_def.outputs['output'], feed_dict={sig_def.inputs['input']: frame}))
Obviously there's a more efficient prediction available with this code through tensorflow/serving, or Cloud ML Engine, but this should work.
It's possible that Keras is doing something under the hood which will interfere with this process as well, and if so we'd like to hear about it (and I'd like to make sure that Keras users are able to freeze graphs as well, so if you want to send me a gist with your full code or something maybe I can find someone who knows Keras well to help me debug.)
EDIT: You can find an end to end example of this here: https://github.com/GoogleCloudPlatform/cloudml-samples/blob/master/census/keras/trainer/model.py#L85

Tensorflow save/restore batch norm

I trained a model with batch norm in Tensorflow. I would like to save the model and restore it for further using. The batch norm is done by
def batch_norm(input, phase):
return tf.layers.batch_normalization(input, training=phase)
where the phase is True during training and False during testing.
It seems like simply calling
saver = tf.train.Saver()
saver.save(sess, savedir + "ckpt")
would not work well because when I restore the model it first says restored successfully. It also says Attempting to use uninitialized value batch_normalization_585/beta if I just run one node in the graph. Is this related to not saving the model properly or something else that I've missed?
I also had the "Attempting to use uninitialized value batch_normalization_585/beta" error. This comes from the fact that by declaring the saver with the empty brackets like this:
saver = tf.train.Saver()
The saver will save the variables contained in tf.trainable_variables() which do not contain the moving average of the batch normalization. To include this variables into the saved ckpt you need to do:
saver = tf.train.Saver(tf.global_variables())
Which saves ALL the variables, so it is very memory consuming. Or you must identify the variables that have moving avg or variance and save them by declaring them like:
saver = tf.train.Saver(tf.trainable_variables() + list_of_extra_variables)
Not sure if this needs to be explained, but just in case (and for other potential viewers).
Whenever you create an operation in TensorFlow, a new node is added to the graph. No two nodes in a graph can have the same name. You can define the name of any node you create, but if you don't give a name, TensorFlow will pick one for you in a deterministic way (that is, not randomly, but instead always with the same sequence). If you add two numbers, it will probably be Add, but if you do another addition, since no two nodes can have the same name, it may be something like Add_2. Once a node is created in a graph its name cannot be changed. Many functions create several subnodes in turn; for example, tf.layers.batch_normalization creates some internal variables beta and gamma.
Saving and restoring works in the following way:
You create a graph representing the model that you want. This graph contains the variables that will be saved by the saver.
You initialize, train or do whatever you want with that graph, and the variables in the model get assigned some values.
You call save on the saver to, well, save the values of the variables to a file.
Now you recreate the model in a different graph (it can be a different Python session altogether or just another graph coexisting with the first one). The model must be created in exactly the same way the first one was.
You call restore on the saver to retrieve the values of the variables.
In order for this to work, the names of the variables in the first and the second graph must be exactly the same.
In your example, TensorFlow is complaining about the variable batch_normalization_585/beta. It seems that you have called tf.layers.batch_normalization nearly 600 times in the same graph, so you have that many beta variables hanging around. I doubt that you actually need that many, so I guess you are just experimenting with the API and ended up with that many copies.
Here's a draft of something that should work:
import tensorflow as tf
def make_model():
input = tf.placeholder(...)
phase = tf.placeholder(...)
input_norm = tf.layers.batch_normalization(input, training=phase))
# Do some operations with input_norm
output = ...
saver = tf.train.Saver()
return input, output, phase, saver
# We work with one graph first
g1 = tf.Graph()
with g1.as_default():
input, output, phase, saver = make_model()
with tf.Session() as sess:
# Do your training or whatever...
saver.save(sess, savedir + "ckpt")
# We work with a second different graph now
g2 = tf.Graph()
with g2.as_default():
input, output, phase, saver = make_model()
with tf.Session() as sess:
saver.restore(sess, savedir + "ckpt")
# Continue using your model...
Again, the typical case is not to have two graphs side by side, but rather have one graph and then recreate it in another Python session later, but in the end both things are the same. The important part is that the model is created in the same way (and therefore with the same node names) in both cases.

Save and load Tensorflow model

I want to save a Tensorflow (0.12.0) model, including graph and variable values, then later load and execute it. I have the read the docs and other posts on this but cannot get the basics to work. I am using the technique from this page in the Tensorflow docs. Code:
Save a simple model:
myVar = tf.Variable(7.1)
tf.add_to_collection('modelVariables', myVar) # why?
init_op = tf.global_variables_initializer()
with tf.Session() as sess:
sess.run(init_op)
print sess.run(myVar)
saver0 = tf.train.Saver()
saver0.save(sess, './myModel.ckpt')
saver0.export_meta_graph('./myModel.meta')
Later, load and execute the model:
with tf.Session() as sess:
saver1 = tf.train.import_meta_graph('./myModel.meta')
saver1.restore(sess, './myModel.meta')
print sess.run(myVar)
Question 1: The saving code seems to work but the loading code produces this error:
W tensorflow/core/util/tensor_slice_reader.cc:95] Could not open ./myModel.meta: Data loss: not an sstable (bad magic number): perhaps your file is in a different file format and you need to use a different restore operator?
How to fix this?.
Question 2: I included this line to follow the pattern in the TF docs...
tf.add_to_collection('modelVariables', myVar)
... but why is that line necessary? Doesn't expert_meta_graphexport the entire graph by default? If not then does one need to add every variable in the graph to the collection before saving? Or do we just add to the collection those variables that will be accessed after the restore?
---------------------- Update January 12 2017 -----------------------------
Partial success based on Kashyap's suggestion below but a mystery still exists. The code below works but only if I include the lines containing tf.add_to_collection and tf.get_collection. Without those lines, 'load' mode throws an error in the last line:
NameError: name 'myVar' is not defined. My understanding was that by default Saver.save saves and restores all variables in the graph, so why is it necessary to specify the name of variables that will be used in the collection? I assume this has to do with mapping Tensorflow's variable names to Python names, but what are the rules of the game here? For which variables does this need to be done?
mode = 'load' # or 'save'
if mode == 'save':
myVar = tf.Variable(7.1)
init_op = tf.global_variables_initializer()
saver0 = tf.train.Saver()
tf.add_to_collection('myVar', myVar) ### WHY NECESSARY?
with tf.Session() as sess:
sess.run(init_op)
print sess.run(myVar)
saver0.save(sess, './myModel')
if mode == 'load':
with tf.Session() as sess:
saver1 = tf.train.import_meta_graph('./myModel.meta')
saver1.restore(sess, tf.train.latest_checkpoint('./'))
myVar = tf.get_collection('myVar')[0] ### WHY NECESSARY?
print sess.run(myVar)
Question1
This question has been already answered thoroughly here. You don't have to explicitly call export_meta_graph. Call the save method. This will generate the .meta file also (since save method will call the export_meta_graph method internally.)
For example
saver0.save(sess, './myModel.ckpt')
will produce myModel.ckpt file and also the myModel.ckpt.meta file.
Then you can restore the model using
with tf.Session() as sess:
saver1 = tf.train.import_meta_graph('./myModel.ckpt.meta')
saver1.restore(sess, './myModel')
print sess.run(myVar)
Question2
Collections are used to store custom information like learning rate,the regularisation factor that you have used and other information and these will be stored when you export the graph. Tensorflow itself defines some collections like "TRAINABLE_VARIABLES" which are used to get all the trainable variables of the model you built. You can chose to export all the collections in your graph or you can specify which collections to export in the export_meta_graph function.
Yes tensorflow will export all the variables that you define. But if you need any other information that needs to be exported to the graph then they can be added to the collection.
I've been trying to figure out the same thing and was able to successfully do it by using Supervisor. It automatically loads all variables and your graph etc. Here is the documentation - https://www.tensorflow.org/programmers_guide/supervisor. Below is my code -
sv = tf.train.Supervisor(logdir="/checkpoint', save_model_secs=60)
with sv.managed_session() as sess:
if not sv.should_stop():
#Do run/eval/train ops on sess as needed. Above works for both saving and loading
As you see, this is much simpler than using the Saver object and dealing with individual variables etc as long as the graph stays the same (my understanding is that Saver comes handy when we want to reuse a pre-trained model for a different graph).

Tensorflow freeze_graph script failing on model defined with Keras

I'm attempting to export a model built and trained with Keras to a protobuffer that I can load in a C++ script (as in this example). I've generated a .pb file containing the model definition and a .ckpt file containing the checkpoint data. However, when I try to merge them into a single file using the freeze_graph script I get the error:
ValueError: Fetch argument 'save/restore_all' of 'save/restore_all' cannot be interpreted as a Tensor. ("The name 'save/restore_all' refers to an Operation not in the graph.")
I'm saving the model like this:
with tf.Session() as sess:
model = nndetector.architecture.models.vgg19((3, 50, 50))
model.load_weights('/srv/nn/weights/scratch-vgg19.h5')
init_op = tf.initialize_all_variables()
sess.run(init_op)
graph_def = sess.graph.as_graph_def()
tf.train.write_graph(graph_def=graph_def, logdir='.', name='model.pb', as_text=False)
saver = tf.train.Saver()
saver.save(sess, 'model.ckpt')
nndetector.architecture.models.vgg19((3, 50, 50)) is simply a vgg19-like model defined in Keras.
I'm calling the freeze_graph script like this:
bazel-bin/tensorflow/python/tools/freeze_graph --input_graph=[path-to-model.pb] --input_checkpoint=[path-to-model.ckpt] --output_graph=[output-path] --output_node_names=sigmoid --input_binary=True
If I run the freeze_graph_test script everything works fine.
Does anyone know what I'm doing wrong?
Thanks.
Best regards
Philip
EDIT
I've tried printing tf.train.Saver().as_saver_def().restore_op_name which returns save/restore_all.
Additionally, I've tried a simple pure tensorflow example and still get the same error:
a = tf.Variable(tf.constant(1), name='a')
b = tf.Variable(tf.constant(2), name='b')
add = tf.add(a, b, 'sum')
with tf.Session() as sess:
sess.run(tf.initialize_all_variables())
tf.train.write_graph(graph_def=sess.graph.as_graph_def(), logdir='.', name='simple_as_binary.pb', as_text=False)
tf.train.Saver().save(sess, 'simple.ckpt')
And I'm actually also unable to restore the graph in python. Using the following code throws ValueError: No variables to save if I execute it separately from saving the graph (that is, if I both save and restore the model in the same script, everything works fine).
with gfile.FastGFile('simple_as_binary.pb') as f:
graph_def = tf.GraphDef()
graph_def.ParseFromString(f.read())
with tf.Session() as sess:
tf.import_graph_def(graph_def)
saver = tf.train.Saver()
saver.restore(sess, 'simple.ckpt')
I'm not sure if the two problems are related, or if I'm simply not restoring the model correctly in python.
The problem is the order of these two lines in your original program:
tf.train.write_graph(graph_def=sess.graph.as_graph_def(), logdir='.', name='simple_as_binary.pb', as_text=False)
tf.train.Saver().save(sess, 'simple.ckpt')
Calling tf.train.Saver() adds a set of nodes to the graph, including one called "save/restore_all". However, this program calls it after writing out the graph, so the file you pass to freeze_graph.py doesn't contain those nodes, which are necessary for doing the rewriting.
Reversing the two lines should make the script work as intended:
tf.train.Saver().save(sess, 'simple.ckpt')
tf.train.write_graph(graph_def=sess.graph.as_graph_def(), logdir='.', name='simple_as_binary.pb', as_text=False)
So, I got it working. Sort of.
By using tensorflow.python.client.graph_util.convert_variables_to_constants directly instead of first saving GraphDef and a checkpoint to disk and then using the freeze_graph tool/script, I have been able to save a GraphDef containing both graph definition and variables converted to constants.
EDIT
mrry updated his answer, which solved my issue of freeze_graph not working, but I'll leave this answer as well, in case anyone else could find it useful.

Tensorflow: Saving and restoring the model parameters

I am a beginner in TensorFlow, currently training a CNN.
I am using Saver in order to save the parameters used by the model, but I am having concerns whether this would itself store all the Variables used by the model, and is sufficient to restore the values to re-run the program for performing classification/testing on the trained network.
Let us look at the famous example MNIST given by TensorFlow.
In the example, we have bunch of Convolutional blocks, all of which have weight, and bias variables that gets initialised when the program is run.
W_conv1 = init_weight([5,5,1,32])
b_conv1 = init_bias([32])
After having processed several layers, we create a session, and initialise all the variables added to the graph.
sess = tf.Session()
sess.run(tf.initialize_all_variables())
saver = tf.train.Saver()
Here, is it possible to comment the saver.save code, and replace it by saver.restore(sess,file_path) after the training, in order to restore the weight, bias, etc., parameters back to the graph? Is this how it should be ?
for i in range(1000):
...
if i%500 == 0:
saver.save(sess,"model%d.cpkt"%(i))
I am currently training on large dataset, so terminating, and restarting the training is a waste of time, and resources so I request someone to please clarify before the I start the training.
If you want to save the final result only once, you can do this:
with tf.Session() as sess:
for i in range(1000):
...
path = saver.save(sess, "model.ckpt") # out of the loop
print "Saved:", path
In other programs, you can load the model using the path returned from saver.save for prediction or something. You can see some examples at https://github.com/sugyan/tensorflow-mnist.
Based on the explanation in here and Sung Kim solution I wrote a very simple model exactly for this problem. Basically in this way you need to create an object from the same class and restore its variables from the saver. You can find an example of this solution here.