I want to compile the TensorFlow Graph to Movidius Graph. I have used Model Zoo's ssd_mobilenet_v1_coco model to train it on my own dataset.
Then I ran
python object_detection/export_inference_graph.py \
--input_type=image_tensor \
--pipeline_config_path=/home/redtwo/nsir/ssd_mobilenet_v1_coco.config \
--trained_checkpoint_prefix=/home/redtwo/nsir/train/model.ckpt-3362 \
--output_directory=/home/redtwo/nsir/output
which generates me frozen_interference_graph.pb & saved_model/saved_model.pb
Now to convert this saved model into Movidius graph. There are commands given
Export GraphDef file
python3 ../tensorflow/tensorflow/python/tools/freeze_graph.py \
--input_graph=inception_v3.pb \
--input_binary=true \
--input_checkpoint=inception_v3.ckpt \
--output_graph=inception_v3_frozen.pb \
--output_node_name=InceptionV3/Predictions/Reshape_1
Freeze model for inference
python3 ../tensorflow/tensorflow/python/tools/freeze_graph.py \
--input_graph=inception_v3.pb \
--input_binary=true \
--input_checkpoint=inception_v3.ckpt \
--output_graph=inception_v3_frozen.pb \
--output_node_name=InceptionV3/Predictions/Reshape_1
which can finally be feed to NCS Intel Movidius SDK
mvNCCompile -s 12 inception_v3_frozen.pb -in=input -on=InceptionV3/Predictions/Reshape_1
All of this is given at Intel Movidius Website here: https://movidius.github.io/ncsdk/tf_modelzoo.html
My model was already trained i.e. output/frozen_inference_graph. Why do I again freeze it using /slim/export_inference_graph.py or it's the output/saved_model/saved_model.py that will go as input to slim/export_inference_graph.py??
All I want is output_node_name=Inceptionv3/Predictions/Reshape_1. How to get this output_name_name directory structure & anything inside it? I don't know what all it contains
what output node should I use for model zoo's ssd_mobilenet_v1_coco model(trained on my own custom dataset)
python freeze_graph.py \
--input_graph=/path/to/graph.pbtxt \
--input_checkpoint=/path/to/model.ckpt-22480 \
--input_binary=false \
--output_graph=/path/to/frozen_graph.pb \
--output_node_names="the nodes that you want to output e.g. InceptionV3/Predictions/Reshape_1 for Inception V3 "
Things I understand & don't understand:
input_checkpoint: ✓ [check points that were created during training]
output_graph: ✓ [path to output frozen graph]
out_node_names: X
I don't understand out_node_names parameter & what should inside this considering its ssd_mobilnet not inception_v3
System information
What is the top-level directory of the model you are using:
Have I written custom code (as opposed to using a stock example script provided in TensorFlow):
OS Platform and Distribution (e.g., Linux Ubuntu 16.04): Linux Ubuntu 16.04
TensorFlow installed from (source or binary): TensorFlow installed with pip
TensorFlow version (use command below): 1.13.1
Bazel version (if compiling from source):
CUDA/cuDNN version: V10.1.168/7.*
GPU model and memory: 2080Ti 11Gb
Exact command to reproduce:
The graph in saved_model/saved_model.pb is the graph definition(graph architecture) of the pretrained inception_v3 model without the weights loaded to the graph. The frozen_interference_graph.pb is the graph frozen with the checkpoints you have provided and taking the default output nodes of the inception_v3 model.
To get output node names summarise_graph tool can be used
You can use the below commands to use summarise_graph tool if bazel is installed
bazel build tensorflow/tools/graph_transforms:summarize_graph
bazel-bin/tensorflow/tools/graph_transforms/summarize_graph \
--in_graph=/tmp/inception_v3_inf_graph.pb
In case if bazel is not installed Output nodes can be obtained using the tensorboard or any other graph visualising tools like Netron.
The additional freeze_graph.py can be used to freeze the graph specifying the output nodes(ie in a case where additional output nodes are added to the inceptionV3). The frozen_interference_graph.pb is also an equaly good fit for infrencing.
I'm trying to use object_detection model with tf 1.4. I realize that there is no training directory that used to contain some ckpt files as in the previous tf versions, so any suggestions where is the new destination or should I use the ones from old versions?
I'm using the below command
python3 export_inference_graph.py \
--input_type image_tensor \
--pipeline_config_path samples/configs/faster_rcnn_resnet101_coco.config \
--trained_checkpoint_prefix model.ckpt \
--output_directory inference
You can download the full sets including the .ckpt files from this link
I was following this tutorial to use tensorflow serving using my object detection model. I am using tensorflow object detection for generating the model. I have created a frozen model using this exporter (the generated frozen model works using python script).
The frozen graph directory has following contents ( nothing on variables directory)
variables/
saved_model.pb
Now when I try to serve the model using the following command,
tensorflow_model_server --port=9000 --model_name=ssd --model_base_path=/serving/ssd_frozen/
It always shows me
...
tensorflow_serving/model_servers/server_core.cc:421] (Re-)adding
model: ssd 2017-08-07 10:22:43.892834: W
tensorflow_serving/sources/storage_path/file_system_storage_path_source.cc:262]
No versions of servable ssd found under base path /serving/ssd_frozen/
2017-08-07 10:22:44.892901: W
tensorflow_serving/sources/storage_path/file_system_storage_path_source.cc:262]
No versions of servable ssd found under base path /serving/ssd_frozen/
...
I had same problem, the reason is because object detection api does not assign version of your model when exporting your detection model. However, tensorflow serving requires you to assign a version number of your detection model, so that you could choose different versions of your models to serve. In your case, you should put your detection model(.pb file and variables folder) under folder:
/serving/ssd_frozen/1/. In this way, you will assign your model to version 1, and tensorflow serving will automatically load this version since you only have one version. By default tensorflow serving will automatically serve the latest version(ie, the largest number of versions).
Note, after you created 1/ folder, the model_base_path is still need to be set to --model_base_path=/serving/ssd_frozen/.
For new version of tf serving, as you know, it no longer supports the model format used to be exported by SessionBundle but now SavedModelBuilder.
I suppose it's better to restore a session from your older model format and then export it by SavedModelBuilder. You can indicate the version of your model with it.
def export_saved_model(version, path, sess=None):
tf.app.flags.DEFINE_integer('version', version, 'version number of the model.')
tf.app.flags.DEFINE_string('work_dir', path, 'your older model directory.')
tf.app.flags.DEFINE_string('model_dir', '/tmp/model_name', 'saved model directory')
FLAGS = tf.app.flags.FLAGS
# you can give the session and export your model immediately after training
if not sess:
saver = tf.train.import_meta_graph(os.path.join(path, 'xxx.ckpt.meta'))
saver.restore(sess, tf.train.latest_checkpoint(path))
export_path = os.path.join(
tf.compat.as_bytes(FLAGS.model_dir),
tf.compat.as_bytes(str(FLAGS.version)))
builder = tf.saved_model.builder.SavedModelBuilder(export_path)
# define the signature def map here
# ...
legacy_init_op = tf.group(tf.tables_initializer(), name='legacy_init_op')
builder.add_meta_graph_and_variables(
sess, [tf.saved_model.tag_constants.SERVING],
signature_def_map={
'predict_xxx':
prediction_signature
},
legacy_init_op=legacy_init_op
)
builder.save()
print('Export SavedModel!')
you could find main part of the code above in tf serving example.
Finally it will generate the SavedModel in a format that can be served.
Create a version folder under like - serving/model_name/0000123/saved_model.pb
Answer's above already explained why it is important to keep a version number inside the model folder. Follow below link , here they have different sets of built models , you can take it as a reference.
https://github.com/tensorflow/serving/tree/master/tensorflow_serving/servables/tensorflow/testdata
I was doing this on my personal computer running Ubuntu, not Docker. Note I am in a directory called "serving". This is where I saved my folder "mobile_weight". I had to create a new folder, "0000123" inside "mobile_weight". My path looks like serving->mobile_weight->0000123->(variables folder and saved_model.pb)
The command from the tensorflow serving tutorial should look like (Change model_name and your directory):
nohup tensorflow_model_server \
--rest_api_port=8501 \
--model_name=model_weight \
--model_base_path=/home/murage/Desktop/serving/mobile_weight >server.log 2>&1
So my entire terminal screen looks like:
murage#murage-HP-Spectre-x360-Convertible:~/Desktop/serving$ nohup tensorflow_model_server --rest_api_port=8501 --model_name=model_weight --model_base_path=/home/murage/Desktop/serving/mobile_weight >server.log 2>&1
That error message can also result due to issues with the --volume argument.
Ensure your --volume mount is actually correct and points to the model's dir, as this is a general 'model not found' error, but it just seems more complex.
If on windows just use cmd, otherwise its easy to accidentally use linux file path and linux separators in cygwin or gitbash. Even with the correct file structure you can get OP's error if you don't use the windows absolute path.
#using cygwin
$ echo $TESTDATA
/home/username/directory/serving/tensorflow_serving/servables/tensorflow/testdata
$ docker run -t --rm -p 8501:8501 -v "$TESTDATA/saved_model_half_plus_two_cpu:/models/half_plus_two" -e MODEL_NAME=half_plus_two tensorflow/serving
2021-01-22 20:12:28.995834: W tensorflow_serving/sources/storage_path/file_system_storage_path_source.cc:267] No versions of servable half_plus_two found under base path /models/half_plus_two. Did you forget to name your leaf directory as a number (eg. '/1/')?
Then calling the same command with the same unchanged file structure but with the full windows path using windows file separators, and it works:
#using cygwin
$ export TESTDATA="$(cygpath -w "/home/username/directory/serving/tensorflow_serving/servables/tensorflow/testdata")"
$ echo $TESTDATA
C:\Users\username\directory\serving\tensorflow_serving\servables\tensorflow\testdata
$ docker run -t --rm -p 8501:8501 -v "$TESTDATA\\saved_model_half_plus_two_cpu:/models/half_plus_two" -e MODEL_NAME=half_plus_two tensorflow/serving
2021-01-22 21:10:49.527049: I tensorflow_serving/core/basic_manager.cc:740] Successfully reserved resources to load servable {name: half_plus_two version: 1}
I'm trying train a model using Tensorflow on the Google Cloud ml-engine. It seems that tensorflow can't get to the libcupti files on the cloud compute machine due to the LD_LIBRARY_PATH not pointing to the correct directory, as implied by the log entry below:
lineno: 126
message: "Couldn't open CUDA library libcupti.so.8.0.
LD_LIBRARY_PATH: /usr/local/cuda/lib64"
levelname: "INFO"
pathname: "tensorflow/stream_executor/dso_loader.cc"
created: 1491143889.84344
As far as I know, the libcupti files are all in /usr/local/cuda/extras/CUPTI/lib64, so I would need to append this to the LD_LIBRARY_PATH variable, but how would I do that when submitting a job via a gcloud ml-engine jobs submit training $JOB_NAME command? Or maybe there's an easier solution?
I tried to use GPU with tensorflow on google cloud and it works for me. In my code I didn't do any GPU specific setting (nor set anything with LD_LIBRARY_PATH)
I think you can try with just a simple and standard tensorflow code and with you submit the job you attach a config then the job should automatically use GPU to do the calculation for you.
Try add a file such as cloudml-gpu.yaml in your module with the following content:
trainingInput:
scaleTier: CUSTOM
# standard_gpu provides 1 GPU. Change to complex_model_m_gpu for 4
GPUs
masterType: standard_gpu
runtimeVersion: "1.0"
Then add a option called --config=trainer/cloudml-gpu.yaml (suppose your training module is in a folder called trainer). For example:
export BUCKET_NAME=tf-learn-simple-sentiment
export JOB_NAME="example_5_train_$(date +%Y%m%d_%H%M%S)"
export JOB_DIR=gs://$BUCKET_NAME/$JOB_NAME
export REGION=europe-west1
gcloud ml-engine jobs submit training $JOB_NAME \
--job-dir gs://$BUCKET_NAME/$JOB_NAME \
--runtime-version 1.0 \
--module-name trainer.example5-keras \
--package-path ./trainer \
--region $REGION \
--config=trainer/cloudml-gpu.yaml \
-- \
--train-file gs://tf-learn-simple-sentiment/sentiment_set.pickle
I have downloaded the Inception_v3 model of Tensorflow using following command:
curl http://download.tensorflow.org/models/image/imagenet/inception-2015-12-05.tgz -o /tmp/inceptionv3.tgz
tar xzf /tmp/inceptionv3.tgz -C /tmp/
Now I have a classify_image_graph_def.pb file, which I believe is the model.
My question is, how to evaluate this model against the ImageNet 2012 data? Is there any scripts, or do I have to write some python code?
The classify_image.py script in the TensorFlow GitHub repository uses that pre-trained Inception model to perform classification on arbitrary JPEG images. You could adapt this script to evaluate it against the ImageNet 2012 data.