generating pb file in tensorflow - tensorflow

I'm trying to generate a pb file using the method given in this tutorial,
http://cv-tricks.com/how-to/freeze-tensorflow-models/
import tensorflow as tf
saver = tf.train.import_meta_graph('/Users/pr/tensorflow/dogs-cats-model.meta', clear_devices=True)
graph = tf.get_default_graph()
input_graph_def = graph.as_graph_def()
sess = tf.Session()
saver.restore(sess, "./dogs-cats-model")
When I try to run this code I get this error -
DataLossError (see above for traceback): Unable to open table file ./dogs-cats-model: Data loss: file is too short to be an sstable: perhaps your file is in a different file format and you need to use a different restore operator?
WHen I googled this error most of them recommend to generate the meta file using version 2 format? Is that the right approach?
Tensorflow version used -
1.3.0

Apparently, you are using both '/Users/pr/tensorflow/dogs-cats-model.meta' and './dogs-cats-model.meta'. Are you sure they point to the same file?
The following code works well on my machine:
import tensorflow as tf
saver = tf.train.import_meta_graph('./dogs-cats-model.meta', clear_devices=True)
graph = tf.get_default_graph()
input_graph_def = graph.as_graph_def()
sess = tf.Session()
saver.restore(sess, "./dogs-cats-model")

Related

Can't import frozen graph with BatchNorm layer

I have trained a Keras model based on this repo.
After the training I save the model as checkpoint files like this:
sess=tf.keras.backend.get_session()
saver = tf.train.Saver()
saver.save(sess, current_run_path + '/checkpoint_files/model_{}.ckpt'.format(date))
Then I restore the graph from the checkpoint files and freeze it using the standard tf freeze_graph script. When I want to restore the frozen graph I get the following error:
Input 0 of node Conv_BN_1/cond/ReadVariableOp/Switch was passed float from Conv_BN_1/gamma:0 incompatible with expected resource
How can I fix this issue?
Edit: My problem is related to this question. Unfortunately, I can't use the workaround.
Edit 2:
I have opened an issue on github and created a gist to reproduce the error.
https://github.com/keras-team/keras/issues/11032
Just resolved the same issue. I connected this few answers: 1, 2, 3 and realized that issue originated from batchnorm layer working state: training or learning. So, in order to resolve that issue you just need to place one line before loading your model:
keras.backend.set_learning_phase(0)
Complete example, to export model
import tensorflow as tf
from tensorflow.python.framework import graph_io
from tensorflow.keras.applications.inception_v3 import InceptionV3
def freeze_graph(graph, session, output):
with graph.as_default():
graphdef_inf = tf.graph_util.remove_training_nodes(graph.as_graph_def())
graphdef_frozen = tf.graph_util.convert_variables_to_constants(session, graphdef_inf, output)
graph_io.write_graph(graphdef_frozen, ".", "frozen_model.pb", as_text=False)
tf.keras.backend.set_learning_phase(0) # this line most important
base_model = InceptionV3()
session = tf.keras.backend.get_session()
INPUT_NODE = base_model.inputs[0].op.name
OUTPUT_NODE = base_model.outputs[0].op.name
freeze_graph(session.graph, session, [out.op.name for out in base_model.outputs])
to load *.pb model:
from PIL import Image
import numpy as np
import tensorflow as tf
# https://i.imgur.com/tvOB18o.jpg
im = Image.open("/home/chichivica/Pictures/eagle.jpg").resize((299, 299), Image.BICUBIC)
im = np.array(im) / 255.0
im = im[None, ...]
graph_def = tf.GraphDef()
with tf.gfile.GFile("frozen_model.pb", "rb") as f:
graph_def.ParseFromString(f.read())
graph = tf.Graph()
with graph.as_default():
net_inp, net_out = tf.import_graph_def(
graph_def, return_elements=["input_1", "predictions/Softmax"]
)
with tf.Session(graph=graph) as sess:
out = sess.run(net_out.outputs[0], feed_dict={net_inp.outputs[0]: im})
print(np.argmax(out))
This is bug with Tensorflow 1.1x and as another answer stated, it is because of the internal batch norm learning vs inference state. In TF 1.14.0 you actually get a cryptic error when trying to freeze a batch norm layer.
Using set_learning_phase(0) will put the batch norm layer (and probably others like dropout) into inference mode and thus the batch norm layer will not work during training, leading to reduced accuracy.
My solution is this:
Create the model using a function (do not use K.set_learning_phase(0)):
def create_model():
inputs = Input(...)
...
return model
model = create_model()
Train model
Save weights:
model.save_weights("weights.h5")
Clear session (important so layer names are the same) and set learning phase to 0:
K.clear_session()
K.set_learning_phase(0)
Recreate model and load weights:
model = create_model()
model.load_weights("weights.h5")
Freeze as before
Thanks for pointing the main issue! I found that keras.backend.set_learning_phase(0) to be not working sometimes, at least in my case.
Another approach might be: for l in keras_model.layers: l.trainable = False

Tensorflow: Unable to open table file error

With tensorflow 1.2.0, I am trying to restore a saved model but I receive the error:
DataLossError (see above for traceback): Unable to open table file checkpoints/saved_2/saved_2_model_1.meta: Data loss: not an sstable (bad magic number): perhaps your file is in a different file format and you need to use a different restore operator?
[[Node: save/RestoreV2_185 = RestoreV2[dtypes=[DT_INT32], _device="/job:localhost/replica:0/task:0/cpu:0"](_arg_save/Const_0_0, save/RestoreV2_185/tensor_names, save/RestoreV2_185/shape_and_slices)]]
I am using the same tensorflow version for saving and restoring.
For saving:
saver = tf.train.Saver()
ckpt_dir = os.path.join(params['CHK_PATH'], folder)
if not os.path.exists(ckpt_dir):
os.makedirs(ckpt_dir)
ckpt_file = os.path.join(ckpt_dir, '{}'.format(name))
path = saver.save(sess, ckpt_file)
For restoring:
saver.restore(sess, ckpt_file)
I tried: model_saver = tf.train.Saver(write_version = saver_pb2.SaverDef.V1)
But the same problem remains.
saver.restore(sess,tf.train.latest_checkpoint(ckpt_dir))
works

How to restore a model by filename in Tensorflow r12?

I have run the distributed mnist example:
https://github.com/tensorflow/tensorflow/blob/r0.12/tensorflow/tools/dist_test/python/mnist_replica.py
Though I have set the
saver = tf.train.Saver(max_to_keep=0)
In previous release, like r11, I was able to run over each check point model and evaluate the precision of the model. This gave me a plot of the progress of the precision versus global steps (or iterations).
Prior to r12, tensorflow checkpoint models were saved in two files, model.ckpt-1234 and model-ckpt-1234.meta. One could restore a model by passing the model.ckpt-1234 filename like so saver.restore(sess,'model.ckpt-1234').
However, I've noticed that in r12, there are now three output files model.ckpt-1234.data-00000-of-000001, model.ckpt-1234.index, and model.ckpt-1234.meta.
I see that the the restore documentation says that a path such as /train/path/model.ckpt should be given to restore instead of a filename. Is there any way to load one checkpoint file at a time to evaluate it? I have tried passing the model.ckpt-1234.data-00000-of-000001, model.ckpt-1234.index, and model.ckpt-1234.meta files, but get errors like below:
W tensorflow/core/util/tensor_slice_reader.cc:95] Could not open logdir/2016-12-08-13-54/model.ckpt-0.data-00000-of-00001: Data loss: not an sstable (bad magic number): perhaps your file is in a different file format and you need to use a different restore operator?
NotFoundError (see above for traceback): Tensor name "hid_b" not found in checkpoint files logdir/2016-12-08-13-54/model.ckpt-0.index
[[Node: save/RestoreV2_1 = RestoreV2[dtypes=[DT_FLOAT], _device="/job:localhost/replica:0/task:0/cpu:0"](_recv_save/Const_0, save/RestoreV2_1/tensor_names, save/RestoreV2_1/shape_and_slices)]]
W tensorflow/core/util/tensor_slice_reader.cc:95] Could not open logdir/2016-12-08-13-54/model.ckpt-0.meta: Data loss: not an sstable (bad magic number): perhaps your file is in a different file format and you need to use a different restore operator?
I'm running on OSX Sierra with tensorflow r12 installed via pip.
Any guidance would be helpful.
Thank you.
I also used Tensorlfow r0.12 and I didn't think there is any issue for saving and restoring model. The following is a simple code that you can have a try:
import tensorflow as tf
# Create some variables.
v1 = tf.Variable(tf.random_normal([784, 200], stddev=0.35), name="v1")
v2 = tf.Variable(tf.random_normal([784, 200], stddev=0.35), name="v2")
# Add an op to initialize the variables.
init_op = tf.global_variables_initializer()
# Add ops to save and restore all the variables.
saver = tf.train.Saver()
# Later, launch the model, initialize the variables, do some work, save the
# variables to disk.
with tf.Session() as sess:
sess.run(init_op)
# Do some work with the model.
# Save the variables to disk.
save_path = saver.save(sess, "/tmp/model.ckpt")
print("Model saved in file: %s" % save_path)
# Later, launch the model, use the saver to restore variables from disk, and
# do some work with the model.
with tf.Session() as sess:
# Restore variables from disk.
saver.restore(sess, "/tmp/model.ckpt")
print("Model restored.")
# Do some work with the model
although in r0.12, the checkpoint is stored in multiple files, you can restore it by using the common prefix, which is 'model.ckpt' in your case.
The R12 has changed the checkpoint format. You should save the model in the old format.
import tensorflow as tf
from tensorflow.core.protobuf import saver_pb2
...
saver = tf.train.Saver(write_version = saver_pb2.SaverDef.V1)
saver.save(sess, './model.ckpt', global_step = step)
According to the TensorFlow v0.12.0 RC0’s release note:
New checkpoint format becomes the default in tf.train.Saver. Old V1
checkpoints continue to be readable; controlled by the write_version
argument, tf.train.Saver now by default writes out in the new V2
format. It significantly reduces the peak memory required and latency
incurred during restore.
see details in my blog.
You can restore the model like this:
saver = tf.train.import_meta_graph('./src/models/20170512-110547/model-20170512-110547.meta')
saver.restore(sess,'./src/models/20170512-110547/model-20170512-110547.ckpt-250000'))
Where the path '/src/models/20170512-110547/' contains three files:
model-20170512-110547.meta
model-20170512-110547.ckpt-250000.index
model-20170512-110547.ckpt-250000.data-00000-of-00001
And if in one directory there are more than one checkpoints,eg: there are checkpoint files in the path
./20170807-231648/:
checkpoint
model-20170807-231648-0.data-00000-of-00001
model-20170807-231648-0.index
model-20170807-231648-0.meta
model-20170807-231648-100000.data-00000-of-00001
model-20170807-231648-100000.index
model-20170807-231648-100000.meta
you can see that there are two checkpoints, so you can use this:
saver = tf.train.import_meta_graph('/home/tools/Tools/raoqiang/facenet/models/facenet/20170807-231648/model-20170807-231648-0.meta')
saver.restore(sess,tf.train.latest_checkpoint('/home/tools/Tools/raoqiang/facenet/models/facenet/20170807-231648/'))
OK, I can answer my own question. What I found was that my python script was adding an extra '/' to my path so I was executing:
saver.restore(sess,'/path/to/train//model.ckpt-1234')
somehow that was causing a problem with tensorflow.
When I removed it, calling:
saver.restore(sess,'/path/to/trian/model.ckpt-1234')
it worked as expected.
use only model.ckpt-1234
at least it works for me
I'm new to TF and met the same issue. After reading Yuan Ma's comments, I copied the '.index' to the same 'train\ckpt' folder together with '.data-00000-of-00001' file. Then it worked!
So, the .index file is sufficient when restoring the models.
I used TF on Win7, r12.

How to save a trained tensorflow model for later use for application?

I am a bit of a beginner with tensorflow so please excuse if this is a stupid question and the answer is obvious.
I have created a Tensorflow graph where starting with placeholders for X and y I have optimized some tensors which represent my model. Part of the graph is something where a vector of predictions can be calculated, e.g. for linear regression something like
y_model = tf.add(tf.mul(X,w),d)
y_vals = sess.run(y_model,feed_dict={....})
After training has been completed I have acceptable values for w and d and now I want to save my model for later. Then, in a different python session I want to restore the model so that I can again run
## Starting brand new python session
import tensorflow as tf
## somehow restor the graph and the values here: how????
## so that I can run this:
y_vals = sess.run(y_model,feed_dict={....})
for some different data and get back the y-values.
I want this to work in a way where the graph for calculating the y-values from the placeholders is also stored and restored - as long as the placeholders get fed the correct data, this should work transparently without the user (the one who applies the model) needing to know what the graph looks like).
As far as I understand tf.train.Saver().save(..) only saves the variables but I also want to save the graph. I think that tf.train.export_meta_graph could be relevant here but I do not understand how to use it correctly, the documentation is a bit cryptic to me and the examples do not even use export_meta_graph anywhere.
From the docs, try this:
# Create some variables.
v1 = tf.Variable(..., name="v1")
v2 = tf.Variable(..., name="v2")
...
# Add an op to initialize the variables.
init_op = tf.global_variables_initializer()
# Add ops to save and restore all the variables.
saver = tf.train.Saver()
# Later, launch the model, initialize the variables, do some work, save the
# variables to disk.
with tf.Session() as sess:
sess.run(init_op)
# Do some work with the model.
..
# Save the variables to disk.
save_path = saver.save(sess, "/tmp/model.ckpt")
print("Model saved in file: %s" % save_path)
You can specify the path.
And if you want to restore the model, try:
with tf.Session() as sess:
saver = tf.train.import_meta_graph('/tmp/model.ckpt.meta')
saver.restore(sess, "/tmp/model.ckpt")
Saving Graph in Tensorflow:
import tensorflow as tf
# Create some placeholder variables
x_pl = tf.placeholder(..., name="x")
y_pl = tf.placeholder(..., name="y")
# Add some operation to the Graph
add_op = tf.add(x, y)
with tf.Session() as sess:
# Add variable initializer
init = tf.global_variables_initializer()
# Add ops to save variables to checkpoints
# Unless var_list is specified Saver will save ALL named variables
# in Graph
# Optionally set maximum of 3 latest models to be saved
saver = tf.train.Saver(max_to_keep=3)
# Run variable initializer
sess.run(init)
for i in range(no_steps):
# Feed placeholders with some data and run operation
sess.run(add_op, feed_dict={x_pl: i+1, y_pl: i+5})
saver.save(sess, "path/to/checkpoint/model.ckpt", global_step=i)
This will save the following files:
1) Meta Graph
.meta file:
MetaGraphDef protocol buffer representation of MetaGraph which saves the complete Tf Graph structure i.e. the GraphDef that describes the dataflow and all metadata associated with it e.g. all variables, operations, collections, etc.
importing the graph structure will recreate the Graph and all its variables, then the corresponding values for these variables can be restored from the checkpoint file
if you don't want to restore the Graph however you can reconstruct all of the information in the MetaGraphDef by re-executing the Python code that builds the model n.b. you must recreate the EXACT SAME variables first before restoring their values from the checkpoint
since Meta Graph file is not always needed, you can switch off writing the file in saver.save using write_meta_graph=False
2) Checkpoint files
.data file:
binary file containing VALUES of all saved variables outlined in tf.train.Saver() (default is all variables)
.index file:
immutable table describing all tensors and their metadata checkpoint file:
keeps a record of latest checkpoint files saved
Restoring Graph in Tensorflow:
import tensorflow as tf
latest_checkpoint = tf.train.latest_checkpoint("path/to/checkpoint")
# Load latest checkpoint Graph via import_meta_graph:
# - construct protocol buffer from file content
# - add all nodes to current graph and recreate collections
# - return Saver
saver = tf.train.import_meta_graph(latest_checkpoint + '.meta')
# Start session
with tf.Session() as sess:
# Restore previously trained variables from disk
print("Restoring Model: {}".format("path/to/checkpoint"))
saver.restore(sess, latest_checkpoint)
# Retrieve protobuf graph definition
graph = tf.get_default_graph()
print("Restored Operations from MetaGraph:")
for op in graph.get_operations():
print(op.name)
# Access restored placeholder variables
x_pl = graph.get_tensor_by_name("x_pl:0")
y_pl = graph.get_tensor_by_name("y_pl:0")
# Access restored operation to re run
accuracy_op = graph.get_tensor_by_name("accuracy_op:0")
This is just a quick example with the basics, for a working implementation see here.
In order to save the graph, you need to freeze the graph.
Here is the python script for freezing the graph : https://github.com/tensorflow/tensorflow/blob/master/tensorflow/python/tools/freeze_graph.py
Here is a code snippet for freezing graph:
from tensorflow.python.tools import freeze_graph
freeze_graph.freeze_graph(input_graph_path, input_saver_def_path,
input_binary, checkpoint_path, output_node
restore_op_name, filename_tensor_name,
output_frozen_graph_name, True, "")
where output node corresponds to output tensor variable.
output = tf.nn.softmax(outer_layer_name,name="output")

Tensorflow freeze_graph script failing on model defined with Keras

I'm attempting to export a model built and trained with Keras to a protobuffer that I can load in a C++ script (as in this example). I've generated a .pb file containing the model definition and a .ckpt file containing the checkpoint data. However, when I try to merge them into a single file using the freeze_graph script I get the error:
ValueError: Fetch argument 'save/restore_all' of 'save/restore_all' cannot be interpreted as a Tensor. ("The name 'save/restore_all' refers to an Operation not in the graph.")
I'm saving the model like this:
with tf.Session() as sess:
model = nndetector.architecture.models.vgg19((3, 50, 50))
model.load_weights('/srv/nn/weights/scratch-vgg19.h5')
init_op = tf.initialize_all_variables()
sess.run(init_op)
graph_def = sess.graph.as_graph_def()
tf.train.write_graph(graph_def=graph_def, logdir='.', name='model.pb', as_text=False)
saver = tf.train.Saver()
saver.save(sess, 'model.ckpt')
nndetector.architecture.models.vgg19((3, 50, 50)) is simply a vgg19-like model defined in Keras.
I'm calling the freeze_graph script like this:
bazel-bin/tensorflow/python/tools/freeze_graph --input_graph=[path-to-model.pb] --input_checkpoint=[path-to-model.ckpt] --output_graph=[output-path] --output_node_names=sigmoid --input_binary=True
If I run the freeze_graph_test script everything works fine.
Does anyone know what I'm doing wrong?
Thanks.
Best regards
Philip
EDIT
I've tried printing tf.train.Saver().as_saver_def().restore_op_name which returns save/restore_all.
Additionally, I've tried a simple pure tensorflow example and still get the same error:
a = tf.Variable(tf.constant(1), name='a')
b = tf.Variable(tf.constant(2), name='b')
add = tf.add(a, b, 'sum')
with tf.Session() as sess:
sess.run(tf.initialize_all_variables())
tf.train.write_graph(graph_def=sess.graph.as_graph_def(), logdir='.', name='simple_as_binary.pb', as_text=False)
tf.train.Saver().save(sess, 'simple.ckpt')
And I'm actually also unable to restore the graph in python. Using the following code throws ValueError: No variables to save if I execute it separately from saving the graph (that is, if I both save and restore the model in the same script, everything works fine).
with gfile.FastGFile('simple_as_binary.pb') as f:
graph_def = tf.GraphDef()
graph_def.ParseFromString(f.read())
with tf.Session() as sess:
tf.import_graph_def(graph_def)
saver = tf.train.Saver()
saver.restore(sess, 'simple.ckpt')
I'm not sure if the two problems are related, or if I'm simply not restoring the model correctly in python.
The problem is the order of these two lines in your original program:
tf.train.write_graph(graph_def=sess.graph.as_graph_def(), logdir='.', name='simple_as_binary.pb', as_text=False)
tf.train.Saver().save(sess, 'simple.ckpt')
Calling tf.train.Saver() adds a set of nodes to the graph, including one called "save/restore_all". However, this program calls it after writing out the graph, so the file you pass to freeze_graph.py doesn't contain those nodes, which are necessary for doing the rewriting.
Reversing the two lines should make the script work as intended:
tf.train.Saver().save(sess, 'simple.ckpt')
tf.train.write_graph(graph_def=sess.graph.as_graph_def(), logdir='.', name='simple_as_binary.pb', as_text=False)
So, I got it working. Sort of.
By using tensorflow.python.client.graph_util.convert_variables_to_constants directly instead of first saving GraphDef and a checkpoint to disk and then using the freeze_graph tool/script, I have been able to save a GraphDef containing both graph definition and variables converted to constants.
EDIT
mrry updated his answer, which solved my issue of freeze_graph not working, but I'll leave this answer as well, in case anyone else could find it useful.