Error while loading a pretrained resnet model - tensorflow

I am trying to load the pre-trained ResNet model in the below link
https://drive.google.com/open?id=1xkVK92XLZOgYlpaRpG_-WP0Elzg4ewpw
But it gives RuntimeError: The Session graph is empty. Add operations to the graph before calling run().
What could be the possible issue?
import tensorflow as tf
import tensorflow.contrib.slim as slim
# Let's load a previously saved meta graph in the default graph
# This function returns a Saver
saver = tf.train.import_meta_graph('model.ckpt-0.meta')
# We can now access the default graph where all our metadata has been loaded
graph = tf.get_default_graph()
with tf.Session(graph=tf.Graph()) as sess:
saver.restore(sess, 'model.ckpt-0.data-00000-of-00001')
print('Worked')

you must have a model(Rough house),and load parameter(Bed, furniture).now you need a Rough house(operations,such as:tf.Variable(),tf.add(),tf.nn.softmax_cross_entropy_with_logits()).

with tf.Session() as sess:
# tf.saved_model.loader.load(sess, [tag_constants.TRAINING], export_dir)
saver = tf.train.import_meta_graph('C://Users//hardi//tutorial//resnet//model.ckpt.meta')
# new_saver = saver.restore(sess, tf.train.latest_checkpoint('C://Users//hardi//tutorial//resnet//'))
saver.restore(sess, 'model.ckpt')
graph = tf.get_default_graph()
print('success')
The error was to bring the saver instance in the loop and use 'model.ckpt' instead of 'model.ckpt-0.data-00000-of-00001' as V2 checkpoint
found solution here https://github.com/tensorflow/models/issues/2676

Related

tf.train.Saver not loading in same process

I am observing a strange behavior where Saver can't restore if the checkpoint was saved earlier in the same Python process. It loads fine if done from a different process. Here's some simple code that will show the problem.
import tensorflow.compat.v1 as tf
def train():
W = tf.Variable(tf.zeros([1, 1]))
saver = tf.train.Saver()
with tf.Session() as sess:
sess.run(tf.global_variables_initializer())
saver.save(sess, "./model.ckpt")
def predict():
W = tf.Variable(tf.zeros([1, 1]))
saver = tf.train.Saver()
with tf.Session() as sess:
sess.run(tf.global_variables_initializer())
saver.restore(sess, "./model.ckpt")
train()
predict()
Here we save and restore immediately after that in the same process. Restoration fails with errors like:
Key Variable_1 not found in checkpoint
But if I run just the predict() code again from a new Python process it works just fine.
#train()
predict()
Am I doing something wrong here?
After predict, if you run:
print([v for v in tf.trainable_variables()])
you will see that two different variables are being created. That's why TF is not able to restore the value of the second one.
In order to link both variables into a single one, you can either:
Pass a dictionary to the argument var_list of tf.train.Saver. For example:
saver = tf.train.Saver({'W': W})
Use auto-reusing when creating the variable. For example:
with tf.variable_scope('', reuse=tf.AUTO_REUSE):
W = tf.get_variable(initializer=lambda: tf.zeros([1, 1]),
name='W')

How to retrain more than the final layer of inception

I have used the retrain.py script provided at the tensorflow github to finetune a pretrained inceptionV3 model on my own dataset. I have the model saved to disk and now I want to use it as the starting point for another round of training in which I retrain all of the convolutional layers. Below is the code I am attempting to use to create the new graph. I thought that once I have the graph loaded into the default graph I could access the variables in it with tf.get_collection(tf.GraphKeys.GLOBAL_VARIABLES) and set them all to trainable. However, there doesn't appear to be anything in any of the tf.GraphKeys variables (GLOBAL_VARIABLES, TRAINABLE_VARIABLES, MODEL_VARIABLES etc.). So when I try to create the optimizer I get an error "ValueError: No variables to optimize." What am I doing wrong?
def create_graph(model_path, class_count):
"""Creates a graph from saved GraphDef file and returns a saver."""
with tf.Graph().as_default() as graph:
# Creates graph from saved graph_def.pb.
with tf.gfile.FastGFile(model_path, 'rb') as f:
graph_def = tf.GraphDef()
graph_def.ParseFromString(f.read())
_ = tf.import_graph_def(graph_def, name='')
#import logits from saved graph
logits_tensor = graph.get_tensor_by_name("final_training_ops/biases/final_biases:0")
ground_truth_input = tf.placeholder(tf.float32,
[None, class_count],
name='GroundTruthInput')
print(tf.get_collection(tf.GraphKeys.GLOBAL_VARIABLES)
#connect logits to the training ops
with tf.name_scope('cross_entropy'):
cross_entropy = tf.nn.softmax_cross_entropy_with_logits(
labels=ground_truth_input, logits=logits_tensor)
with tf.name_scope('total'):
cross_entropy_mean = tf.reduce_mean(cross_entropy)
tf.summary.scalar('cross_entropy', cross_entropy_mean)
with tf.name_scope('train'):
optimizer = tf.train.GradientDescentOptimizer(0.001)
train_step = optimizer.minimize(cross_entropy_mean)
return graph, ground_truth_input, cross_entropy_mean, train_step

Tensorflow session and graph context

My question is about context and the TensorFlow default sessions and graph.
The problem:
Tensorflow is unable to feed a placeholder in the following scenario:
Function Test defines a graph.
Function Test_Once defines a session.
When Function Test calls Test_Once -> Feeding fails.
When I change the code so function Test declares the graph + the session -> all is working.
Here is the code:
def test_once(g, saver, summary_writer, logits, images, summary_op):
"""Run a session once for a givven test image.
Args:
saver: Saver.
summary_writer: Summary writer.
logits:
summary_op: Summary op.
"""
with tf.Session(graph=g) as sess:
ckpt = tf.train.get_checkpoint_state(FLAGS.checkpoint_dir)
if ckpt and ckpt.model_checkpoint_path:
# Restores from checkpoint
saver.restore(sess, ckpt.model_checkpoint_path)
# extract global_step from it.
global_step = ckpt.model_checkpoint_path.split('/')[-1].split('-')[-1]
else:
print('No checkpoint file found')
return
images.astype(np.float32)
predictions = sess.run(logits, feed_dict={'InputPlaceHolder/TestInput:0':images})
summary = tf.Summary()
summary.ParseFromString(sess.run(summary_op))
summary_writer.add_summary(summary, global_step)
return (predictions)
def test():
"""Test LCPR with a test image"""
with tf.Graph().as_default() as g:
# Get image for testing
images, labels = lcpr.test_input()
# Build a Graph that computes the logits predictions from the
# inference model.
with tf.name_scope('InputPlaceHolder'):
test_image_placeholder = tf.placeholder(tf.float32, (None,None,None,3), 'TestInput')
# Display the training images in the visualizer.
# The 'max_outputs' default is 3. Not stated. (Max number of batch elements to generate images for.)
#tf.summary.image('input_images', test_image_placeholder)
with tf.name_scope('Inference'):
logits = lcpr.inference(test_image_placeholder)
# Restore the moving average version of the learned variables for eval.
variable_averages = tf.train.ExponentialMovingAverage(
lcpr.MOVING_AVERAGE_DECAY)
variables_to_restore = variable_averages.variables_to_restore()
saver = tf.train.Saver(variables_to_restore)
# Build the summary operation based on the TF collection of Summaries.
writer = tf.summary.FileWriter("/tmp/lcpr/test")
writer.add_graph(g)
summary_op = tf.summary.merge_all()
summary_writer = tf.summary.FileWriter(FLAGS.test_dir, g)
#Sadly, this will not work:
predictions = test_once(g, saver, summary_writer, logits, images, summary_op)
'''Alternative working option :
with tf.Session() as sess:
ckpt = tf.train.get_checkpoint_state(FLAGS.checkpoint_dir)
if ckpt and ckpt.model_checkpoint_path:
# Restores from checkpoint
saver.restore(sess, ckpt.model_checkpoint_path)
# Assuming model_checkpoint_path looks something like:
# /my-favorite-path/cifar10_train/model.ckpt-0,
# extract global_step from it.
global_step = ckpt.model_checkpoint_path.split('/')[-1].split('-')[-1]
else:
print('No checkpoint file found')
return
x = sess.run(logits, feed_dict={'InputPlaceHolder/TestInput:0':images})
print(x)
'''
The above code yeilds an error that the placeholder is not fed:
InvalidArgumentError (see above for traceback): You must feed a value for placeholder tensor 'InputPlaceHolder/TestInput' with dtype float
And it's not that TensorFlow does not recognize the placeholder. If I change the name from 'InputPlaceHolder/TestInput:0' to 'InputPlaceHolder/TestInput:1' I receive a message calming that 'InputPlaceHolder/TestInput' exists but has only 1 output. This makes sense, and I guess the session runs on my default graph.
Things only work for me if I stay within the same def:
If I change the code by running the commented part (starting ' with tf.Session() as sess:) directly from within the first function all works.
I wonder what am I missing?
My guess that is context related, maybe not assigning the session to the graph?
Solved. Stupid mistake
test_once calls sess.run twice. On the second time, indeed no placeholder is fed.... : summary.ParseFromString(sess.run(summary_op))

Why I need to use tf.global_variables_initializer() to initialize variables when I am using some pre trained model paramenters?

def train_model(model, batch_gen, num_train_steps, weights_fld):
saver = tf.train.Saver() # defaults to saving all variables - in this case embed_matrix, nce_weight, nce_bias
initial_step = 0
with tf.Session() as sess:
**sess.run(tf.global_variables_initializer())**
ckpt = tf.train.get_checkpoint_state(os.path.dirname('checkpoints/checkpoint'))
# if that checkpoint exists, restore from checkpoint
***if ckpt and ckpt.model_checkpoint_path:
saver.restore(sess, ckpt.model_checkpoint_path)***
In the above codes it's very clear how ow graph try to import pretrained parameters if there are any.(highlighted section)
So if have already trained parameter set (for example weight set of a neural net) why we still have to initialize variables with tf.global_variables_initializer()?
You do not have to use we tf.global_variables_initializer() if you use saver.restore(sess, file) before running any of the tensorflow graph.
Rewrite your code like so:
with tf.Session() as sess:
ckpt = tf.train.get_checkpoint_state(os.path.dirname('checkpoints/checkpoint'))
# if that checkpoint exists, restore from checkpoint
if ckpt and ckpt.model_checkpoint_path:
saver.restore(sess, ckpt.model_checkpoint_path)
else :
sess.run(tf.global_variables_initializer())
You can see a fully working example of another example here

Restore checkpoint in Tensorflow - tensor name not found

Trying to run the Inceptionv3 Tensorflow model with the architecture and the checkpoint provided by Google here.
My issue is that my script crashes on saver.restore(sess, "./inception_v3.ckpt") with the following error:
tensorflow.python.framework.errors.NotFoundError: Tensor name "InceptionV3/Mixed_5b/Branch_1/Conv2d_0b_5x5/biases" not found in checkpoint files ./inception_v3.ckpt
Here is my code:
import tensorflow as tf
import inception_v3
with tf.Session() as sess:
image = tf.read_file('./file.jpg')
# code to decode, crop, convert jpeg
eval_inputs = tf.pack([image])
logits, _ = inception_v3.inception_v3(eval_inputs, num_classes=1001, is_training=False)
sess.run(tf.initialize_all_variables())
saver = tf.train.Saver()
saver.restore(sess, "./inception_v3.ckpt")
I get the same errors with the other checkpoint/model combinations so this must be an issue with my code. Not sure what I am doing wrong though.
Thank you
Indeed the checkpoint file does not contain this tensor. Can you file a bug on github?
You need to call inception_v3() within the arg_scope() returned by inception_v3_arg_scope() like this:
import tensorflow as tf
import tensorflow.contrib.slim as slim
from nets.inception_v3 import inception_v3, inception_v3_arg_scope
height = 299
width = 299
channels = 3
# Create graph
X = tf.placeholder(tf.float32, shape=[None, height, width, channels])
with slim.arg_scope(inception_v3_arg_scope()):
logits, end_points = inception_v3(X, num_classes=1001,
is_training=False)
predictions = end_points["Predictions"]
saver = tf.train.Saver()
X_test = ... # your images, shape [batch_size, 299, 299, 3]
# Execute graph
with tf.Session() as sess:
saver.restore(sess, "./inception_v3.ckpt")
predictions_val = predictions.eval(feed_dict={X: X_test})
predicted_classes = np.argmax(predictions_val, axis=1)
I recommend clearly separating the construction phase and the execution phase. Just tested on a random photo on the web, and it worked fine. :)