Tensorflow object_detection: unable to find input and output tensors

I've successfully trained and saved a Faster R-CNN model using TensorFlow's Object Detection API. I'm now trying to run some inference with it, borrowing bits of code from this tutorial.
However, after I successfully restore the metagraph and the checkpoint, the system can't find the input and output nodes, and I get the following error:
KeyError: "The name 'image_tensor:0' refers to a Tensor which does not
exist. The operation, 'image_tensor', does not exist in the graph."
The checkpoint and metagraph were created by the train.py script, on my own data, following the instructions given here.
This is my code:
OUTPUT_DIR = "my_path/models/SSD_v1/train"
CKPT_DIR = OUTPUT_DIR
LATEST_CKPT_FILENAME = "checkpoint"
LAST_CKPT_FILE = os.path.join(CKPT_DIR, LATEST_CKPT_FILENAME)
MODEL_FILENAME_PATH = os.path.join(OUTPUT_DIR, "model.ckpt.meta")
def load_image_into_numpy_array(image):
(im_width, im_height) = image.size
return np.array(image.getdata()).reshape(
(im_height, im_width, 3)).astype(np.uint8)
def test_model(images_list, path_to_ckpt=None,
meta_graph=None):
if path_to_ckpt is None:
path_to_ckpt = tf.train.latest_checkpoint(CKPT_DIR, LATEST_CKPT_FILENAME)
if meta_graph is None:
meta_graph = MODEL_FILENAME_PATH
print("test_model launched")
tf.reset_default_graph()
detection_graph = tf.Graph()
with detection_graph.as_default():
with tf.Session(graph=detection_graph) as sess:
# Restore graph
saver = tf.train.import_meta_graph(meta_graph, clear_devices=True)
print('metagraph restored')
saver.restore(sess, path_to_ckpt)
print('graph restored')
image_tensor = detection_graph.get_tensor_by_name('image_tensor:0') # This is where the error happens
# Each box represents a part of the image where a particular object was detected.
detected_boxes = detection_graph.get_tensor_by_name('detection_boxes:0')
# Each score represent how level of confidence for each of the objects.
# Score is shown on the result image, together with the class label.
detected_scores = detection_graph.get_tensor_by_name('detection_scores:0')
detected_classes = detection_graph.get_tensor_by_name('detection_classes:0')
num_detections = graph.get_tensor_by_name('num_detections:0')
print("Output tensors: ")
print(detected_boxes)
print(detected_scores)
print(detected_classes)
print('')
for i, image in enumerate(images_list):
detected_boxes, detected_scores, detected_classes, num_detect = sess.run([detected_boxes, detected_scores, detected_classes, num_detections],
feed_dict={image_tensor: image})
print(i, num_detect, detected_boxes, detected_scores, detected_classes)
def main():
directory_path = "../data/samples/"
image_files = [f for f in os.listdir(directory_path) if os.path.isfile(os.path.join(directory_path, f))]
# Expand dimensions since the model expects images to have shape: [1, None, None, 3]
image_list = [ np.expand_dims(load_image_into_numpy_array(Image.open(os.path.join(directory_path, f))), axis=0) for f in image_files]
test_model(images_list=image_list)
if __name__=="__main__":
main()
Full error stacktrace:
Traceback (most recent call last):
  File "/home/guillaumedelaboulaye/PR8210PANO/faster-rcnn/pano_faster_rcnn/src/run_faster_rcnn_inference.py", line 99, in <module>
    main()
  File "/home/guillaumedelaboulaye/PR8210PANO/faster-rcnn/pano_faster_rcnn/src/run_faster_rcnn_inference.py", line 95, in main
    test_model(images_list=image_list)
  File "/home/guillaumedelaboulaye/PR8210PANO/faster-rcnn/pano_faster_rcnn/src/run_faster_rcnn_inference.py", line 48, in test_model
    image_tensor = graph.get_tensor_by_name('image_tensor:0')
  File "/home/guillaumedelaboulaye/PR8210PANO/faster-rcnn/venv/lib/python3.5/site-packages/tensorflow/python/framework/ops.py", line 2733, in get_tensor_by_name
    return self.as_graph_element(name, allow_tensor=True, allow_operation=False)
  File "/home/guillaumedelaboulaye/PR8210PANO/faster-rcnn/venv/lib/python3.5/site-packages/tensorflow/python/framework/ops.py", line 2584, in as_graph_element
    return self._as_graph_element_locked(obj, allow_tensor, allow_operation)
  File "/home/guillaumedelaboulaye/PR8210PANO/faster-rcnn/venv/lib/python3.5/site-packages/tensorflow/python/framework/ops.py", line 2626, in _as_graph_element_locked
    "graph." % (repr(name), repr(op_name)))
KeyError: "The name 'image_tensor:0' refers to a Tensor which does not exist. The operation, 'image_tensor', does not exist in the graph."

In the training graph, the input/output nodes are not given those names. What you need to do is "export" your trained model via the export_inference_graph.py tool. I believe it currently exports to a frozen graph or a SavedModel, but in future releases it will export to an ordinary checkpoint as well.
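For reference, once the model has been exported you can load the frozen graph and look up the named tensors. This is a minimal sketch along the lines of the tutorial's TF 1.x loading code (the path is a placeholder for wherever export_inference_graph.py wrote frozen_inference_graph.pb):

import tensorflow as tf

PATH_TO_FROZEN_GRAPH = 'exported_model/frozen_inference_graph.pb'  # placeholder path

detection_graph = tf.Graph()
with detection_graph.as_default():
    od_graph_def = tf.GraphDef()
    with tf.gfile.GFile(PATH_TO_FROZEN_GRAPH, 'rb') as fid:
        od_graph_def.ParseFromString(fid.read())
    tf.import_graph_def(od_graph_def, name='')
    # These lookups now succeed, because the exported graph defines the names:
    image_tensor = detection_graph.get_tensor_by_name('image_tensor:0')
    detected_boxes = detection_graph.get_tensor_by_name('detection_boxes:0')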

If you want sample code for finding the node names of the graph, refer to object_detection_tutorial.ipynb; after the "Load a (frozen) Tensorflow model into memory." block:

for node in od_graph_def.node:
    print(node.name)

That should list all the node names, which you can then enter in the subsequent blocks.

Related

Keras model compiles well outside SageMaker, but as soon as I try to train it in SageMaker with the TensorFlow instance I get an error

Here is the error: ValueError: Output tensors to a Model must be the output of a TensorFlow Layer (thus holding past layer metadata)
I am trying to train and deploy a multi-input Keras model with AWS SageMaker, but there seem to be some showstopper issues with the required libraries, which expect a single input for Keras models.
I have three categorical input variables and one numeric variable. The target variable is also categorical. I have no test or validation data; I am only interested in the training running without errors.
After data preparation I merged the arrays as follows and then stored them in S3:
input_train = np.column_stack((input_cat1, input_cat2, input_num, input_cat3))
training_input_path = sage_maker_session.upload_data('data/training.npz', key_prefix=prefix + training_folder)
print(training_input_path)
s3://sagemaker-eu-central-1-xxxxxxxxxxxxx/user_tracking/training/training.npz
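(The .npz itself is written before the upload. The saving step is not shown in the question, but given that train.py below reads the keys 'train_input' and 'train_output', it was presumably something like:)

np.savez('data/training.npz', train_input=input_train, train_output=target)  # assumed saving step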
In the train.py script (the entry_point), I fetch the file from S3 again. train.py also compiles without problems, just as it does outside SageMaker.
%%writefile train.py

### import library ###

if __name__ == "__main__":
    parser = argparse.ArgumentParser()
    parser.add_argument('--epochs', type=int, default=60)
    parser.add_argument('--batch-size', type=int, default=50)
    parser.add_argument('--model-dir', type=str, default=os.environ['SM_MODEL_DIR'])
    #parser.add_argument('--model-dir', type=str)
    parser.add_argument('--training', type=str, default=os.environ['SM_CHANNEL_TRAINING'])
    #parser.add_argument('--training', type=str, default='data')
    args, _ = parser.parse_known_args()
    epochs = args.epochs
    batch_size = args.batch_size
    model_dir = args.model_dir
    training_dir = args.training

    input_train = np.load(os.path.join(training_dir, 'training.npz'))['train_input']
    target = np.load(os.path.join(training_dir, 'training.npz'))['train_output']
    input_cat1 = input_train[:, 0].astype(np.int32)
    input_cat2 = input_train[:, 1].astype(np.int32)
    input_cat3 = input_train[:, 3:].astype(np.int32)
    input_num = input_train[:, 2].astype(np.float32)

    n_steps = 2  # number of timesteps in each sample
    num_unique_os = 5  # len(le_betriebsystem.classes_)+1
    num_unique_browser = 10  # len(le_browser.classes_)+1
    num_unique_actions = 210  # len(le_actionen.classes_)+1

    # numeric input
    numerical_input = tf.keras.Input(shape=(1,), name='numeric_input')
    # categorical inputs
    os_input = tf.keras.Input(shape=(1,), name='os_input')
    browser_input = tf.keras.Input(shape=(1,), name='browser_input')
    action_input = tf.keras.Input(shape=(max_seq_len,), name='action_input')

    emb_os = tf.keras.layers.Embedding(num_unique_os, 32)(os_input)
    emb_browser = tf.keras.layers.Embedding(num_unique_browser, 32)(browser_input)
    emb_actions = tf.keras.layers.Embedding(num_unique_actions, 64)(action_input)
    actions_repr = tf.keras.layers.LSTM(300, return_sequences=True)(emb_actions)
    actions_repr = tf.keras.layers.LSTM(200)(emb_actions)

    emb_os = tf.squeeze(emb_os, axis=1)
    emb_browser = tf.squeeze(emb_browser, axis=1)

    activity_repr = tf.keras.layers.Concatenate()([emb_os, emb_browser, actions_repr,
                                                   numerical_input])
    x = tf.keras.layers.RepeatVector(n_steps)(activity_repr)
    x = tf.keras.layers.LSTM(288, return_sequences=True)(x)
    next_n_actions = tf.keras.layers.Dense(num_unique_actions - 1, activation='softmax')(x)

    model = tf.keras.Model(inputs=[numerical_input, os_input, browser_input, action_input],
                           outputs=next_n_actions)
    model.summary()
    model.compile('adam', 'categorical_crossentropy', metrics=['accuracy'])
    history = model.fit({'numeric_input': input_num,
                         'os_input': input_cat1,
                         'browser_input': input_cat2,
                         'action_input': input_cat3}, target, batch_size=50, epochs=130)

    tf.saved_model.simple_save(
        tf.keras.backend.get_session(),
        os.path.join(model_dir, '1'),
        inputs={'inputs': model.input},
        outputs={t.name: t for t in model.outputs})
I received the expected output (two screenshots in the original post: the model summary and the metric trend over training).
When trying to do the whole thing again with the TensorFlow instance, the following error occurred:
Traceback (most recent call last):
  File "train.py", line 105, in <module>
    model = tf.keras.Model(inputs=[numerical_input, os_input, browser_input, action_input], outputs=next_n_actions)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/keras/engine/training.py", line 121, in __init__
    super(Model, self).__init__(*args, **kwargs)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/keras/engine/network.py", line 80, in __init__
    self._init_graph_network(*args, **kwargs)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/training/checkpointable/base.py", line 474, in _method_wrapper
    method(self, *args, **kwargs)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/keras/engine/network.py", line 224, in _init_graph_network
    '(thus holding past layer metadata). Found: ' + str(x))
ValueError: Output tensors to a Model must be the output of a TensorFlow Layer (thus holding past layer metadata). Found: Tensor("dense/truediv:0", shape=(?, 2, 209), dtype=float32)
2021-03-08 21:52:04,761 sagemaker-containers ERROR ExecuteUserScriptError:
Command "/usr/bin/python train.py --batch-size 50 --epochs 150 --model_dir s3://sagemaker-eu-central-1-xxxxxxxxxxxxxxxxx/sagemaker-tensorflow-scriptmode
I used the TensorFlow versions '2.0.4' and '1.15.4' respectively, with the kernels conda_tensorflow_p36 and conda_tensorflow2_p36.
For more of the code: https://gitlab.com/patricksardin08/data-science/-/tree/master/
Please, I need your help. I'm here around the clock if anyone wants me to explain the question in more detail.

Keras load_weights() not loading checkpoints

I have been following the RNN tutorial of Tensorflow
https://www.tensorflow.org/tutorials/text/text_generation
model.load_weights() is not working and throws the following error:
Traceback (most recent call last):
  File "C:/Users/swati.srivastava/PycharmProjects/TensorFlow Practice/main.py", line 1232, in <module>
    model.load_weights(tf.train.load_checkpoint("./training_checkpoints/ckpt_" + str(checkpoint_num)))
  File "C:\Users\swati.srivastava\PycharmProjects\TensorFlow Practice\venv\lib\site-packages\tensorflow\python\keras\engine\training.py", line 2260, in load_weights
    filepath, save_format = _detect_save_format(filepath)
  File "C:\Users\swati.srivastava\PycharmProjects\TensorFlow Practice\venv\lib\site-packages\tensorflow\python\keras\engine\training.py", line 2868, in _detect_save_format
    if saving_utils.is_hdf5_filepath(filepath):
  File "C:\Users\swati.srivastava\PycharmProjects\TensorFlow Practice\venv\lib\site-packages\tensorflow\python\keras\saving\saving_utils.py", line 327, in is_hdf5_filepath
    return (filepath.endswith('.h5') or filepath.endswith('.keras') or
AttributeError: 'tensorflow.python.util._pywrap_checkpoint_reader.C' object has no attribute 'endswith'

Process finished with exit code 1
My code is:

BATCH_SIZE = 64
VOCAB_SIZE = len(vocab)
EMBEDDING_DIM = 256
RNN_UNITS = 1024

def build_model(vocab_size, embedding_dim, rnn_units, batch_size):
    model = tf.keras.Sequential([
        tf.keras.layers.Embedding(vocab_size, embedding_dim, batch_input_shape=[batch_size, None]),
        tf.keras.layers.LSTM(rnn_units,
                             return_sequences=True,
                             stateful=True,
                             recurrent_initializer='glorot_uniform'),
        tf.keras.layers.Dense(vocab_size)
    ])
    return model

checkpoint_dir = './training_checkpoints'
checkpoint_prefix = os.path.join(checkpoint_dir, "ckpt_{epoch}")
checkpoint_callback = tf.keras.callbacks.ModelCheckpoint(
    filepath=checkpoint_prefix,
    save_weights_only=True
)

model = build_model(VOCAB_SIZE, EMBEDDING_DIM, RNN_UNITS, batch_size=1)
checkpoint_num = 2
model.load_weights(tf.train.load_checkpoint("./training_checkpoints/ckpt_" + str(checkpoint_num)))
model.build(tf.TensorShape([1, None]))
My project directory looks like this (screenshot in the original post), which shows that the training checkpoints are created and exist. None of the checkpoint files are empty.
The only solution I could find was at https://github.com/tensorflow/tensorflow/issues/38745, which says to set save_weights_only=True, which I have already done.
I think it is some sort of version conflict, but I am not sure.
Edit: Added the checkpoint_callback snippet. The training_checkpoints directory is created, as can be seen in the project directory image.
The issue is that the load_weights API expects a file path (a string), but your code passes it the CheckpointReader object returned by tf.train.load_checkpoint, which is why it fails as soon as Keras calls filepath.endswith.
I am assuming that you are using the ModelCheckpoint API for checkpoint creation, like below:

cp_callback = tf.keras.callbacks.ModelCheckpoint(filepath=checkpoint_path,
                                                 save_weights_only=True,
                                                 verbose=1)

Then you just provide the checkpoint path itself, in your case the checkpoint prefix such as './training_checkpoints/ckpt_2', rather than the reader object:

model.load_weights(checkpoint_path)
You can look at the documentation for more details.
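A convenient way to obtain that prefix, assuming the checkpoint_dir defined in the question, is tf.train.latest_checkpoint (a small sketch, not part of the original answer):

checkpoint_dir = './training_checkpoints'
latest = tf.train.latest_checkpoint(checkpoint_dir)  # e.g. './training_checkpoints/ckpt_2'
model.load_weights(latest)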

Tensorflow 2.1.0 - An op outside of the function building code is being passed a "Graph" tensor

I am trying to implement a recent paper. Part of this implementation involves moving from tf 1.14 to tf 2.1.0. The code was working with tf 1.14 but is no longer working.
NOTE: If I disable eager execution via tf.compat.v1.disable_eager_execution(), then the code works as expected.
Is this the solution? I've made plenty of models before in TF 2.x and never had to disable eager execution to achieve normal functionality.
I have distilled the problem to a very short gist that shows what's happening.
Links & Code First Followed By Detailed Error Message
Link to Gist -- https://gist.github.com/darien-schettler/fd5b25626e9eb5b1330cce670bf9cc17
Code
# version 2.1.0
import tensorflow as tf
# version 1.18.1
import numpy as np

# ######## DEFINE CUSTOM FUNCTION FOR TF LAMBDA LAYER ######## #
def resize_like(input_tensor, ref_tensor):
    """Resize an image tensor to the same size/shape as a reference image tensor.

    Args:
        input_tensor: (image tensor) Input image tensor that will be resized
        ref_tensor: (image tensor) Reference image tensor that we want to resize the input tensor to.

    Returns:
        reshaped tensor
    """
    reshaped_tensor = tf.image.resize(images=input_tensor,
                                      size=tf.shape(ref_tensor)[1:3],
                                      method=tf.image.ResizeMethod.NEAREST_NEIGHBOR,
                                      preserve_aspect_ratio=False,
                                      antialias=False,
                                      name=None)
    return reshaped_tensor
# ############################################################# #

# ############ DEFINE MODEL USING TF.KERAS FN API ############ #
# INPUTS
model_input_1 = tf.keras.layers.Input(shape=(160,160,3))
model_input_2 = tf.keras.layers.Input(shape=(160,160,3))

# OUTPUTS
model_output_1 = tf.keras.layers.Conv2D(filters=64,
                                        kernel_size=(1, 1),
                                        use_bias=False,
                                        kernel_initializer='he_normal',
                                        name='conv_name_base')(model_input_1)
model_output_2 = tf.keras.layers.Lambda(function=resize_like,
                                        arguments={'ref_tensor': model_output_1})(model_input_2)

# MODEL
model = tf.keras.models.Model(inputs=[model_input_1, model_input_2],
                              outputs=model_output_2,
                              name="test_model")
# ############################################################# #

# ######### TRY TO UTILIZE PREDICT WITH DUMMY INPUT ########## #
dummy_input = [np.ones((1,160,160,3)), np.zeros((1,160,160,3))]
model.predict(x=dummy_input) # >>>>ERROR OCCURS HERE<<<<
# ############################################################# #
Full Error
>>> model.predict(x=dummy_input) # >>>>ERROR OCCURS HERE<<<<
Traceback (most recent call last):
  File "/Users/<username>/.virtualenvs/<venv-name>/lib/python3.7/site-packages/tensorflow_core/python/eager/execute.py", line 61, in quick_execute
    num_outputs)
TypeError: An op outside of the function building code is being passed
a "Graph" tensor. It is possible to have Graph tensors
leak out of the function building context by including a
tf.init_scope in your function building code.
For example, the following function will fail:
  @tf.function
  def has_init_scope():
    my_constant = tf.constant(1.)
    with tf.init_scope():
      added = my_constant * 2
The graph tensor has name: conv_name_base_1/Identity:0

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Users/<user-name>/.virtualenvs/<venv-name>/lib/python3.7/site-packages/tensorflow_core/python/keras/engine/training.py", line 1013, in predict
    use_multiprocessing=use_multiprocessing)
  File "/Users/<user-name>/.virtualenvs/<venv-name>/lib/python3.7/site-packages/tensorflow_core/python/keras/engine/training_v2.py", line 498, in predict
    workers=workers, use_multiprocessing=use_multiprocessing, **kwargs)
  File "/Users/<user-name>/.virtualenvs/<venv-name>/lib/python3.7/site-packages/tensorflow_core/python/keras/engine/training_v2.py", line 475, in _model_iteration
    total_epochs=1)
  File "/Users/<user-name>/.virtualenvs/<venv-name>/lib/python3.7/site-packages/tensorflow_core/python/keras/engine/training_v2.py", line 128, in run_one_epoch
    batch_outs = execution_function(iterator)
  File "/Users/<user-name>/.virtualenvs/<venv-name>/lib/python3.7/site-packages/tensorflow_core/python/keras/engine/training_v2_utils.py", line 98, in execution_function
    distributed_function(input_fn))
  File "/Users/<user-name>/.virtualenvs/<venv-name>/lib/python3.7/site-packages/tensorflow_core/python/eager/def_function.py", line 568, in __call__
    result = self._call(*args, **kwds)
  File "/Users/<user-name>/.virtualenvs/<venv-name>/lib/python3.7/site-packages/tensorflow_core/python/eager/def_function.py", line 638, in _call
    return self._concrete_stateful_fn._filtered_call(canon_args, canon_kwds)  # pylint: disable=protected-access
  File "/Users/<user-name>/.virtualenvs/<venv-name>/lib/python3.7/site-packages/tensorflow_core/python/eager/function.py", line 1611, in _filtered_call
    self.captured_inputs)
  File "/Users/<user-name>/.virtualenvs/<venv-name>/lib/python3.7/site-packages/tensorflow_core/python/eager/function.py", line 1692, in _call_flat
    ctx, args, cancellation_manager=cancellation_manager))
  File "/Users/<user-name>/.virtualenvs/<venv-name>/lib/python3.7/site-packages/tensorflow_core/python/eager/function.py", line 545, in call
    ctx=ctx)
  File "/Users/<user-name>/.virtualenvs/<venv-name>/lib/python3.7/site-packages/tensorflow_core/python/eager/execute.py", line 75, in quick_execute
    "tensors, but found {}".format(keras_symbolic_tensors))
tensorflow.python.eager.core._SymbolicException: Inputs to eager execution function cannot be Keras symbolic tensors, but found [<tf.Tensor 'conv_name_base_1/Identity:0' shape=(None, 160, 160, 64) dtype=float32>]
One potential solution I thought of would be to replace the Lambda layer with a custom layer... this seems to fix the issue as well. Not sure what the best practices are surrounding this though. Code below.
# version 2.1.0
import tensorflow as tf
# version 1.18.1
import numpy as np

# ######## DEFINE CUSTOM LAYER DIRECTLY BY SUBCLASSING ######## #
class ResizeLike(tf.keras.layers.Layer):
    """tf.keras layer to resize a tensor to the reference tensor shape.

    Attributes:
        keras.layers.Layer: Base layer class.
            This is the class from which all layers inherit.
            - A layer is a class implementing common neural network
              operations, such as convolution, batch norm, etc.
            - These operations require managing weights,
              losses, updates, and inter-layer connectivity.
    """
    def __init__(self, **kwargs):
        super().__init__(**kwargs)

    def call(self, inputs, **kwargs):
        """TODO: docstring

        Args:
            inputs (TODO): TODO
            **kwargs: TODO

        Returns:
            TODO
        """
        input_tensor, ref_tensor = inputs
        return self.resize_like(input_tensor, ref_tensor)

    def resize_like(self, input_tensor, ref_tensor):
        """Resize an image tensor to the same size/shape as a reference image tensor.

        Args:
            input_tensor: (image tensor) Input image tensor that will be resized
            ref_tensor: (image tensor) Reference image tensor that we want to resize the input tensor to.

        Returns:
            reshaped tensor
        """
        reshaped_tensor = tf.image.resize(images=input_tensor,
                                          size=tf.shape(ref_tensor)[1:3],
                                          method=tf.image.ResizeMethod.NEAREST_NEIGHBOR,
                                          preserve_aspect_ratio=False,
                                          antialias=False)
        return reshaped_tensor
# ############################################################# #

# ############ DEFINE MODEL USING TF.KERAS FN API ############ #
# INPUTS
model_input_1 = tf.keras.layers.Input(shape=(160,160,3))
model_input_2 = tf.keras.layers.Input(shape=(160,160,3))

# OUTPUTS
model_output_1 = tf.keras.layers.Conv2D(filters=64,
                                        kernel_size=(1, 1),
                                        use_bias=False,
                                        kernel_initializer='he_normal',
                                        name='conv_name_base')(model_input_1)
model_output_2 = ResizeLike(name="resize_layer")([model_input_2, model_output_1])

# MODEL
model = tf.keras.models.Model(inputs=[model_input_1, model_input_2],
                              outputs=model_output_2,
                              name="test_model")
# ############################################################# #

# ######### TRY TO UTILIZE PREDICT WITH DUMMY INPUT ########## #
dummy_input = [np.ones((1,160,160,3)), np.zeros((1,160,160,3))]
model.predict(x=dummy_input)  # No error with the custom layer
# ############################################################# #
Thoughts??
Thanks in advance!!
Let me know if you would like me to provide anything else.
You can try the following steps:
Change resize_like as follows:

def resize_like(inputs):
    input_tensor, ref_tensor = inputs
    reshaped_tensor = tf.image.resize(images=input_tensor,
                                      size=tf.shape(ref_tensor)[1:3],
                                      method=tf.image.ResizeMethod.NEAREST_NEIGHBOR,
                                      preserve_aspect_ratio=False,
                                      antialias=False,
                                      name=None)
    return reshaped_tensor

Then, in the Lambda layer:

model_output_2 = tf.keras.layers.Lambda(function=resize_like)([model_input_2, model_output_1])
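With those two changes the question's script runs. A quick sanity check with the same dummy input (a sketch, not part of the original answer):

dummy_input = [np.ones((1, 160, 160, 3)), np.zeros((1, 160, 160, 3))]
preds = model.predict(x=dummy_input)
print(preds.shape)  # (1, 160, 160, 3): model_input_2 resized to the conv output's spatial size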

Getting error 'str' object has no attribute 'dtype' when exporting textsum model for TensorFlow Serving

I am currently trying to get a TF textsum model exported using the PREDICT signature. I have _Decode returning a result from a passed-in test article string, and I then pass that to build_tensor_info. It is in fact a string being returned.
Now when I run the textsum_export.py logic to export the model, it gets to the point where it is building out the TensorInfo object, but it errors out with the trace below. I know that the PREDICT signature is usually used with images. Is that the problem? Can I not use it for the textsum model because I am working with strings?
Error is:
Traceback (most recent call last):
  File "export_textsum.py", line 129, in Export
    tensor_info_outputs = tf.saved_model.utils.build_tensor_info(res)
  File "/usr/local/lib/python2.7/site-packages/tensorflow/python/saved_model/utils_impl.py", line 37, in build_tensor_info
    dtype_enum = dtypes.as_dtype(tensor.dtype).as_datatype_enum
AttributeError: 'str' object has no attribute 'dtype'
The TF session where the model is exported is below:
with tf.Session(config=config) as sess:
    # Restore variables from training checkpoints.
    ckpt = tf.train.get_checkpoint_state(FLAGS.checkpoint_dir)
    if ckpt and ckpt.model_checkpoint_path:
        saver.restore(sess, ckpt.model_checkpoint_path)
        global_step = ckpt.model_checkpoint_path.split('/')[-1].split('-')[-1]
        print('Successfully loaded model from %s at step=%s.' %
              (ckpt.model_checkpoint_path, global_step))
        res = decoder._Decode(saver, sess)
        print("Decoder value {}".format(type(res)))
    else:
        print('No checkpoint file found at %s' % FLAGS.checkpoint_dir)
        return

    # Export model
    export_path = os.path.join(FLAGS.export_dir, str(FLAGS.export_version))
    print('Exporting trained model to %s' % export_path)
    #-------------------------------------------
    tensor_info_inputs = tf.saved_model.utils.build_tensor_info(serialized_tf_example)
    tensor_info_outputs = tf.saved_model.utils.build_tensor_info(res)

    prediction_signature = (
        tf.saved_model.signature_def_utils.build_signature_def(
            inputs={tf.saved_model.signature_constants.PREDICT_INPUTS: tensor_info_inputs},
            outputs={tf.saved_model.signature_constants.PREDICT_OUTPUTS: tensor_info_outputs},
            method_name=tf.saved_model.signature_constants.PREDICT_METHOD_NAME
        ))
    #----------------------------------
    legacy_init_op = tf.group(tf.tables_initializer(), name='legacy_init_op')

    builder = saved_model_builder.SavedModelBuilder(export_path)
    builder.add_meta_graph_and_variables(
        sess=sess,
        tags=[tf.saved_model.tag_constants.SERVING],
        signature_def_map={
            'predict': prediction_signature,
        },
        legacy_init_op=legacy_init_op)
    builder.save()
    print('Successfully exported model to %s' % export_path)
The PREDICT signature works with tensors. If res is a Python str variable, it must first be converted to a tensor; the resulting tensor will have dtype tf.string:

res_tensor = tf.convert_to_tensor(res)
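Then hand the converted tensor, not the raw string, to build_tensor_info (a sketch completing the answer against the question's code):

res_tensor = tf.convert_to_tensor(res)  # dtype tf.string
tensor_info_outputs = tf.saved_model.utils.build_tensor_info(res_tensor)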

Creating a Slim classifier using pretrained ResNet V2 model

I am trying to create an image classifier that utilizes the pre-trained ResNet V2 model provided in the slim documentation.
Here is the code so far:
import tensorflow as tf
slim = tf.contrib.slim
from PIL import Image
from inception_resnet_v2 import *
import numpy as np

checkpoint_file = 'inception_resnet_v2_2016_08_30.ckpt'
sample_images = ['carrot.jpg']

input_tensor = tf.placeholder(tf.float32, shape=(None,299,299,3), name='input_image')
scaled_input_tensor = tf.scalar_mul((1.0/255), input_tensor)
scaled_input_tensor = tf.subtract(scaled_input_tensor, 0.5)
scaled_input_tensor = tf.multiply(scaled_input_tensor, 2.0)

variables_to_restore = slim.get_model_variables()
print(variables_to_restore)
init_fn = slim.assign_from_checkpoint_fn(
    checkpoint_file,
    slim.get_model_variables('InceptionResnetV2'))
sess = tf.Session()
init_fn(sess)

arg_scope = inception_resnet_v2_arg_scope()
with slim.arg_scope(arg_scope):
    logits, end_points = inception_resnet_v2(scaled_input_tensor, is_training=False)

for image in sample_images:
    im = Image.open(image).resize((299,299))
    im = np.array(im)
    im = im.reshape(-1,299,299,3)
    predict_values, logit_values = sess.run([end_points['Predictions'], logits], feed_dict={input_tensor: im})
    print(np.max(predict_values), np.max(logit_values))
    print(np.argmax(predict_values), np.argmax(logit_values))
The problem is I keep getting this error:
Traceback (most recent call last):
  File "./classify.py", line 21, in <module>
    slim.get_model_variables('InceptionResnetV2'))
  File "/home/ubuntu/tensorflow/local/lib/python2.7/site-packages/tensorflow/contrib/framework/python/ops/variables.py", line 584, in assign_from_checkpoint_fn
    saver = tf_saver.Saver(var_list, reshape=reshape_variables)
  File "/home/ubuntu/tensorflow/local/lib/python2.7/site-packages/tensorflow/python/training/saver.py", line 1040, in __init__
    self.build()
  File "/home/ubuntu/tensorflow/local/lib/python2.7/site-packages/tensorflow/python/training/saver.py", line 1061, in build
    raise ValueError("No variables to save")
ValueError: No variables to save
So it seems TF/Slim is unable to find any variables, which is made clear when I call:

variables_to_restore = slim.get_model_variables()
print(variables_to_restore)

as it outputs an empty array.
How can I go about using the pre-trained model?
This happens because you haven't constructed the model in your graph yet, so there are no variables starting with the name "InceptionResnetV2" for the saver to capture and restore.
You should put the model construction before the call to slim.get_model_variables(). For instance:

with slim.arg_scope(arg_scope):
    logits, end_points = inception_resnet_v2(scaled_input_tensor, is_training=False)

variables_to_restore = slim.get_model_variables()

This way the model's variables are constructed first, and you should see that variables_to_restore is no longer empty.
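Putting it together, the restore sequence from the question becomes (a sketch that only reorders the question's own code):

arg_scope = inception_resnet_v2_arg_scope()
with slim.arg_scope(arg_scope):
    logits, end_points = inception_resnet_v2(scaled_input_tensor, is_training=False)

# Now the model variables exist and can be matched by name.
variables_to_restore = slim.get_model_variables('InceptionResnetV2')
init_fn = slim.assign_from_checkpoint_fn(checkpoint_file, variables_to_restore)

sess = tf.Session()
init_fn(sess)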
You need to manually add the model variables. Try this:

with slim.arg_scope(arg_scope):
    logits, end_points = inception_resnet_v2(scaled_input_tensor, is_training=False)

# Add model variables
for var in tf.global_variables(scope='inception_resnet_v2'):
    slim.add_model_variable(var)