Right now we are able to serve models using TensorFlow Serving. We have used the following method to export the model and host it with TensorFlow Serving.
------------
For exporting
------------------
from tensorflow.contrib.session_bundle import exporter
K.set_learning_phase(0)
export_path = ... # where to save the exported graph
export_version = ... # version number (integer)
saver = tf.train.Saver(sharded=True)
model_exporter = exporter.Exporter(saver)
signature = exporter.classification_signature(input_tensor=model.input,
                                              scores_tensor=model.output)
model_exporter.init(sess.graph.as_graph_def(),
                    default_graph_signature=signature)
model_exporter.export(export_path, tf.constant(export_version), sess)
--------------------------------------
For hosting
-----------------------------------------------
bazel-bin/tensorflow_serving/model_servers/tensorflow_model_server --port=9000 --model_name=default --model_base_path=/serving/models
However, our issue is that we want Keras to be integrated with TensorFlow Serving, i.e. we would like to serve Keras-built models through TensorFlow Serving.
The reason we want this is that our architecture uses a couple of different ways to train models, such as deeplearning4j + Keras and
TensorFlow + Keras, but for serving we would like to use only one serving engine, namely TensorFlow Serving. We don't see any straightforward way to achieve that. Any comments?
Thank you.
Very recently TensorFlow changed the way it exports models, so the majority of the tutorials available on the web are outdated. I honestly don't know how deeplearning4j works, but I use Keras quite often. I managed to create a simple example that I already posted on this issue in the TensorFlow Serving GitHub.
I'm not sure whether this will help you, but I'd like to share how I did it and maybe it will give you some insights. My first trial, prior to creating my custom model, was to use a pretrained model available in Keras, such as VGG19. I did this as follows.
Model creation
import keras.backend as K
from keras.applications import VGG19
from keras.models import Model
# very important to do this as a first thing
K.set_learning_phase(0)
model = VGG19(include_top=True, weights='imagenet')
# The creation of a new model might be optional depending on the goal
config = model.get_config()
weights = model.get_weights()
new_model = Model.from_config(config)
new_model.set_weights(weights)
Exporting the model
from tensorflow.python.saved_model import builder as saved_model_builder
from tensorflow.python.saved_model import utils
from tensorflow.python.saved_model import tag_constants, signature_constants
from tensorflow.python.saved_model.signature_def_utils_impl import build_signature_def, predict_signature_def
from tensorflow.contrib.session_bundle import exporter
export_path = 'folder_to_export'
builder = saved_model_builder.SavedModelBuilder(export_path)
signature = predict_signature_def(inputs={'images': new_model.input},
                                  outputs={'scores': new_model.output})

with K.get_session() as sess:
    builder.add_meta_graph_and_variables(sess=sess,
                                         tags=[tag_constants.SERVING],
                                         signature_def_map={'predict': signature})
    builder.save()
Some side notes
The exact steps can vary depending on the Keras, TensorFlow, and TensorFlow Serving versions; I used the latest ones at the time of writing.
Beware of the names of the signatures, since they have to be used in the client as well.
When creating the client, all preprocessing steps that the model needs (preprocess_input(), for example) must be executed client-side, as sketched in the client example below. I didn't try to add such a step to the graph itself, as the Inception client example does.
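For reference, here is a minimal sketch of what such a client could look like. It assumes the server runs on localhost:9000 with --model_name=vgg19 and the 'predict' signature exported above; the host, port, model name, and image path are placeholders, and depending on the tensorflow-serving-api version the stub module may be prediction_service_pb2 instead of prediction_service_pb2_grpc.
# Minimal gRPC client sketch: client-side preprocessing plus a Predict call.
import grpc
import numpy as np
import tensorflow as tf
from keras.applications.vgg19 import preprocess_input
from keras.preprocessing import image
from tensorflow_serving.apis import predict_pb2, prediction_service_pb2_grpc

channel = grpc.insecure_channel('localhost:9000')
stub = prediction_service_pb2_grpc.PredictionServiceStub(channel)

# The same preprocessing the model saw during training, done in the client.
img = image.img_to_array(image.load_img('cat.jpg', target_size=(224, 224)))
batch = preprocess_input(np.expand_dims(img, axis=0))

request = predict_pb2.PredictRequest()
request.model_spec.name = 'vgg19'               # must match --model_name
request.model_spec.signature_name = 'predict'   # must match the exported signature key
request.inputs['images'].CopyFrom(tf.make_tensor_proto(batch, dtype=tf.float32))

result = stub.Predict(request, 10.0)  # 10 second timeout
print(result.outputs['scores'])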
With respect to serving different models within the same server, I think that something similar to the creation of a model_config_file might help you. To do so, you can create a config file similar to this:
model_config_list: {
  config: {
    name: "my_model_1",
    base_path: "/tmp/model_1",
    model_platform: "tensorflow"
  },
  config: {
    name: "my_model_2",
    base_path: "/tmp/model_2",
    model_platform: "tensorflow"
  }
}
Finally, you can run the server like this:
bazel-bin/tensorflow_serving/model_servers/tensorflow_model_server --port=9000 --model_config_file=model_config.conf
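On the client side, picking one of the configured models is then just a matter of setting model_spec.name; a short sketch building on the gRPC client sketched above (the signature name depends on how that particular model was exported):
# Select a model from the config file by its 'name' field.
request = predict_pb2.PredictRequest()
request.model_spec.name = 'my_model_1'
request.model_spec.signature_name = 'predict'  # whatever signature that model was exported with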
Try this script I wrote; it converts Keras models into TensorFlow frozen graphs (I saw that some models give rise to strange behaviours when you export them without freezing the variables).
import sys

import tensorflow as tf
from keras import backend as K
from keras.models import load_model
from tensorflow.python.framework import graph_io
from tensorflow.python.framework import graph_util
from tensorflow.python.saved_model import signature_constants
from tensorflow.python.saved_model import tag_constants

K.set_learning_phase(0)
K.set_image_data_format('channels_last')

INPUT_MODEL = sys.argv[1]               # path to the Keras model file
OUTPUT_SERVABLE_FOLDER = sys.argv[2]    # where the SavedModel will be written
INPUT_TENSOR = sys.argv[3]              # name of the input tensor, e.g. 'input_1:0'
NUMBER_OF_OUTPUTS = 1
OUTPUT_NODE_PREFIX = 'output_node'
OUTPUT_FOLDER = 'frozen'
OUTPUT_GRAPH = 'frozen_model.pb'

try:
    model = load_model(INPUT_MODEL)
except ValueError as err:
    print('Please check the input saved model file')
    raise err

# Rename the model outputs so the frozen graph has predictable node names.
output = [None] * NUMBER_OF_OUTPUTS
output_node_names = [None] * NUMBER_OF_OUTPUTS
for i in range(NUMBER_OF_OUTPUTS):
    output_node_names[i] = OUTPUT_NODE_PREFIX + str(i)
    output[i] = tf.identity(model.outputs[i], name=output_node_names[i])
print('Output tensor names: ', output_node_names)

sess = K.get_session()
try:
    # Freeze: convert all variables to constants and write the graph to disk.
    frozen_graph = graph_util.convert_variables_to_constants(
        sess, sess.graph.as_graph_def(), output_node_names)
    graph_io.write_graph(frozen_graph, OUTPUT_FOLDER, OUTPUT_GRAPH, as_text=False)
    print(f'Frozen graph ready for inference/serving at {OUTPUT_FOLDER}/{OUTPUT_GRAPH}')
except Exception as err:
    print('Error occurred while freezing the graph')
    raise err

# Wrap the frozen graph in a SavedModel so TensorFlow Serving can load it.
builder = tf.saved_model.builder.SavedModelBuilder(OUTPUT_SERVABLE_FOLDER)

with tf.gfile.GFile(f'{OUTPUT_FOLDER}/{OUTPUT_GRAPH}', "rb") as f:
    graph_def = tf.GraphDef()
    graph_def.ParseFromString(f.read())

sigs = {}
OUTPUT_TENSOR = output_node_names
with tf.Session(graph=tf.Graph()) as sess:
    tf.import_graph_def(graph_def, name="")
    g = tf.get_default_graph()
    inp = g.get_tensor_by_name(INPUT_TENSOR)
    out = g.get_tensor_by_name(OUTPUT_TENSOR[0] + ':0')

    sigs[signature_constants.DEFAULT_SERVING_SIGNATURE_DEF_KEY] = \
        tf.saved_model.signature_def_utils.predict_signature_def(
            {"input": inp}, {"output": out})

    builder.add_meta_graph_and_variables(sess,
                                         [tag_constants.SERVING],
                                         signature_def_map=sigs)
try:
    builder.save()
    print(f'Model ready for deployment at {OUTPUT_SERVABLE_FOLDER}/saved_model.pb')
    print('Prediction signature:')
    print(sigs[signature_constants.DEFAULT_SERVING_SIGNATURE_DEF_KEY])
except Exception as err:
    print('Error occurred, please check the frozen graph')
    raise err
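As a quick sanity check (not part of the original script), the exported SavedModel's signature can be inspected before pointing TensorFlow Serving at it; the export directory below is a placeholder for whatever OUTPUT_SERVABLE_FOLDER was.
# Sanity-check sketch: load the exported SavedModel and print its serving signature.
import tensorflow as tf
from tensorflow.python.saved_model import signature_constants, tag_constants

with tf.Session(graph=tf.Graph()) as sess:
    meta_graph = tf.saved_model.loader.load(
        sess, [tag_constants.SERVING], 'path/to/servable_folder')
    print(meta_graph.signature_def[signature_constants.DEFAULT_SERVING_SIGNATURE_DEF_KEY])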
I recently wrote this blog post that explains how to save a Keras model and serve it with TensorFlow Serving.
TL;DR:
Saving a pretrained InceptionV3 model:
import os
import keras
import tensorflow as tf

# Load a pretrained inception_v3
inception_model = keras.applications.inception_v3.InceptionV3(weights='imagenet')

# Define a destination path for the model
MODEL_EXPORT_DIR = '/tmp/inception_v3'
MODEL_VERSION = 1
MODEL_EXPORT_PATH = os.path.join(MODEL_EXPORT_DIR, str(MODEL_VERSION))

# We'll need to create an input mapping, and name each of the input tensors.
# In the inception_v3 Keras model, there is only a single input and we'll name it 'image'
input_names = ['image']
name_to_input = {name: t_input for name, t_input in zip(input_names, inception_model.inputs)}

# Save the model to the MODEL_EXPORT_PATH
# Note: via the 'name_to_input' mapping, the names defined here will also be used for querying the service later
tf.saved_model.simple_save(
    keras.backend.get_session(),
    MODEL_EXPORT_PATH,
    inputs=name_to_input,
    outputs={t.name: t for t in inception_model.outputs})
And then starting a TensorFlow Serving Docker container:
Copy the saved model to the host's specified directory (source=/tmp/inception_v3 in this example).
Run the Docker container:
docker run -d -p 8501:8501 --name keras_inception_v3 --mount type=bind,source=/tmp/inception_v3,target=/models/inception_v3 -e MODEL_NAME=inception_v3 -t tensorflow/serving
Verify that there's network access to the TensorFlow Serving service. To get the local Docker IP (172.*.*.*) for testing, run:
docker inspect -f '{{range .NetworkSettings.Networks}}{{.IPAddress}}{{end}}' keras_inception_v3
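Once the container is up, a prediction request can be sent to the REST endpoint on port 8501. This is only a sketch: the server address, image path, and 299x299 InceptionV3 preprocessing are assumptions, and 'image' is the input name chosen in the simple_save call above.
# REST prediction sketch against the container started above.
import json

import numpy as np
import requests
from keras.applications.inception_v3 import preprocess_input
from keras.preprocessing import image

SERVER = 'http://localhost:8501'  # or the 172.*.*.* address from docker inspect

img = image.img_to_array(image.load_img('test.jpg', target_size=(299, 299)))
batch = preprocess_input(np.expand_dims(img, axis=0))

payload = json.dumps({'instances': [{'image': batch[0].tolist()}]})
response = requests.post(SERVER + '/v1/models/inception_v3:predict', data=payload)
print(response.json())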
Related
I am working on an HPC with no internet access on worker nodes, and the only option to save a SetFit trainer after training is to push it to the Hugging Face Hub. How do I go about saving it locally to disk?
https://github.com/huggingface/setfit
SetFit has this method:
model._save_pretrained(save_directory)
and to load it:
saved_model = SetFitModel._from_pretrained(save_directory)
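Put together, a minimal sketch of saving and reloading locally could look like this (trainer is the SetFitTrainer from the question and 'my-setfit-model' is an arbitrary directory name; note these are underscore-prefixed methods, so they may change between setfit versions):
# Hedged sketch: save the trained model locally and reload it later.
from setfit import SetFitModel

save_directory = 'my-setfit-model'  # arbitrary local path
trainer.model._save_pretrained(save_directory)

# Later, on the same filesystem:
saved_model = SetFitModel._from_pretrained(save_directory)
preds = saved_model.predict(["i loved the spiderman movie!"])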
I think you can do this with either pickle or joblib
import pickle
import joblib
pickle.dump(trainer, open('model.pkl', 'wb'))
joblib.dump(trainer, 'model.joblib')
And load in the future with:
job_model = joblib.load('model.joblib')
pkl_model = pickle.load(open('model.pkl', 'rb'))
As an alternative to pushing your Trainer to the Hub as described in SetFit for Text Classification, you can save your trainer locally and use it for prediction.
There is a predict method in the source code; you can use that same method to make predictions from your SetFit object.
Save your model locally:
import joblib
# trainer is your SetFit object: setfit.trainer.SetFitTrainer
joblib.dump(trainer, 'my-awesome-setfit-model.joblib')
Load your model and make a classification or inference from your model:
# Load the trainer
trainer = joblib.load('my-awesome-setfit-model.joblib')
# Use the model and predict
trainer.model.predict(["i loved the spiderman movie!", "pineapple on pizza is the worst 🤮"])
You can use the sklearn wrapper:
Train the model
from setfit.modeling import SKLearnWrapper
from sentence_transformers import SentenceTransformer
from sklearn.linear_model import LogisticRegression
model = SentenceTransformer("sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2")
clf = SKLearnWrapper(model, LogisticRegression())
sentences = ["good", "bad", "very good"]
labels = [1, 0, 1]
clf.fit(sentences, labels)
pred1 = clf.predict(["gooood"])
Save the model
path = "model1"
clf.save(path)
Load the model
clf = SKLearnWrapper(None, None)
clf.load(path)
Test
pred2 = clf.predict(["gooood"])
assert pred1 == pred2
I have downloaded the checkpoints along with the model for MobileNet v3. After extracting the rar file, I get two folders and two other files. The directory looks like the following:
Main Folder
    ema (folder)
        checkpoint
        model-x.data-00000-of-00001
        model-x.index
        model-x.meta
    pristine (folder)
        model.ckpt-y.data-00000-of-00001
        model.ckpt-y.index
        model.ckpt-y.meta
    .pb
    .tflite
I have tried many code snippets, a few of which are below.
import tensorflow as tf
from tensorflow.python.platform import gfile
model_path = "./weights/v3-large-minimalistic_224_1.0_uint8/model.ckpt-3868848"
detection_graph = tf.Graph()
with tf.Session(graph=detection_graph) as sess:
    # Load the graph with the trained states
    loader = tf.train.import_meta_graph(model_path + '.meta')
    loader.restore(sess, model_path)
The above code results in the following error:
Node {{node batch_processing/distort_image/switch_case/indexed_case}} of type Case has '_lower_using_switch_merge' attr set but it does not support lowering.
I also tried the following code:
import tensorflow as tf
import sys
sys.path.insert(0, 'models/research/slim')
from nets.mobilenet import mobilenet_v3
tf.reset_default_graph()
file_input = tf.placeholder(tf.string, ())
image = tf.image.decode_jpeg(tf.read_file('test.jpg'))
images = tf.expand_dims(image, 0)
images = tf.cast(images, tf.float32) / 128. - 1
images.set_shape((None, None, None, 3))
images = tf.image.resize_images(images, (224, 224))
model = mobilenet_v3.wrapped_partial(mobilenet_v3.mobilenet,
                                     new_defaults={'scope': 'MobilenetEdgeTPU'},
                                     conv_defs=mobilenet_v3.V3_LARGE_MINIMALISTIC,
                                     depth_multiplier=1.0)

with tf.contrib.slim.arg_scope(mobilenet_v3.training_scope(is_training=False)):
    logits, endpoints = model(images)

ema = tf.train.ExponentialMovingAverage(0.999)
vars = ema.variables_to_restore()
print(vars)

with tf.Session() as sess:
    tf.train.Saver(vars).restore(sess, './weights/v3-large-minimalistic_224_1.0_uint8/saved_model.pb')
    tf.train.Saver().save(sess, './weights/v3-large-minimalistic_224_1.0_uint8/pristine/model.ckpt')
The above code generates the following error:
Unable to open table file ./weights/v3-large-minimalistic_224_1.0_uint8/saved_model.pb: Data loss: not an sstable (bad magic number): perhaps your file is in a different file format and you need to use a different restore operator?
[[node save/RestoreV2 (defined at <ipython-input-11-1531bbfd84bb>:29) ]]
How can I load the MobileNet v3 model along with the checkpoints and use it for my data?
Try this:
with tf.contrib.slim.arg_scope(mobilenet_v3.training_scope(is_training=False)):
    logits, endpoints = mobilenet_v3.large_minimalistic(images)
instead of:
model = mobilenet_v3.wrapped_partial(mobilenet_v3.mobilenet,
                                     new_defaults={'scope': 'MobilenetEdgeTPU'},
                                     conv_defs=mobilenet_v3.V3_LARGE_MINIMALISTIC,
                                     depth_multiplier=1.0)

with tf.contrib.slim.arg_scope(mobilenet_v3.training_scope(is_training=False)):
    logits, endpoints = model(images)
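Also note that tf.train.Saver.restore expects a checkpoint prefix, not the saved_model.pb file, which is exactly what triggers the 'not an sstable' error above. Here is a hedged sketch of restoring from the extracted checkpoint files instead, reusing the graph (images, logits, ema) built in your second snippet; 'model-x' is a placeholder for the actual checkpoint name in the ema folder.
# Restore from the checkpoint prefix (the common stem of the .index/.data/.meta
# files), not from saved_model.pb.
ckpt_prefix = './weights/v3-large-minimalistic_224_1.0_uint8/ema/model-x'  # placeholder name

saver = tf.train.Saver(ema.variables_to_restore())
with tf.Session() as sess:
    saver.restore(sess, ckpt_prefix)
    predictions = sess.run(logits)  # 'test.jpg' is read inside the graph defined above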
I have looked at several posts on Stack Overflow and have been at it for a few days now, but alas, I'm not able to properly serve an object detection model through TensorFlow Serving.
I have visited the following links:
How to properly serve an object detection model from Tensorflow Object Detection API?
and
https://github.com/tensorflow/tensorflow/issues/11863
Here's what I have done.
I have downloaded the ssd_mobilenet_v1_coco_11_06_2017.tar.gz, which contains the following files:
frozen_inference_graph.pb
graph.pbtxt
model.ckpt.data-00000-of-00001
model.ckpt.index
model.ckpt.meta
Using the following script, I was able to successfully convert the frozen_inference_graph.pb to a SavedModel (under the directory ssd_mobilenet_v1_coco_11_06_2017/saved):
import tensorflow as tf
from tensorflow.python.saved_model import signature_constants
from tensorflow.python.saved_model import tag_constants
import ipdb
# Specify version 1
export_dir = './saved/1'
graph_pb = 'frozen_inference_graph.pb'
builder = tf.saved_model.builder.SavedModelBuilder(export_dir)
with tf.gfile.GFile(graph_pb, "rb") as f:
    graph_def = tf.GraphDef()
    graph_def.ParseFromString(f.read())

sigs = {}

with tf.Session(graph=tf.Graph()) as sess:
    # name="" is important to ensure we don't get spurious prefixing
    tf.import_graph_def(graph_def, name="")
    g = tf.get_default_graph()
    ipdb.set_trace()
    inp = g.get_tensor_by_name("image_tensor:0")
    outputs = {}
    outputs["detection_boxes"] = g.get_tensor_by_name('detection_boxes:0')
    outputs["detection_scores"] = g.get_tensor_by_name('detection_scores:0')
    outputs["detection_classes"] = g.get_tensor_by_name('detection_classes:0')
    outputs["num_detections"] = g.get_tensor_by_name('num_detections:0')
    output_tensor = tf.concat([tf.expand_dims(t, 0) for t in outputs], 0)
    # or use tf.gather??
    # out = g.get_tensor_by_name("generator/Tanh:0")

    sigs[signature_constants.DEFAULT_SERVING_SIGNATURE_DEF_KEY] = \
        tf.saved_model.signature_def_utils.predict_signature_def(
            {"in": inp}, {"out": output_tensor})

    sigs["predict_images"] = \
        tf.saved_model.signature_def_utils.predict_signature_def(
            {"in": inp}, {"out": output_tensor})

    builder.add_meta_graph_and_variables(sess,
                                         [tag_constants.SERVING],
                                         signature_def_map=sigs)

builder.save()
However, when I serve the model with the following command, I get an error:
bazel-bin/tensorflow_serving/model_servers/tensorflow_model_server
--port=9000 --model_base_path=/serving/ssd_mobilenet_v1_coco_11_06_2017/saved
2017-09-17 22:33:21.325087: W tensorflow_serving/sources/storage_path/file_system_storage_path_source.cc:268] No versions of servable default found under base path /serving/ssd_mobilenet_v1_coco_11_06_2017/saved/1
I understand I will need a client to connect to the server to do the prediction. However, I'm not even able to serve the model properly.
You need to change the export signature somewhat from what the original post did. This script makes the necessary changes for you:
OBJECT_DETECTION_CONFIG=object_detection/samples/configs/ssd_mobilenet_v1_pets.config

python object_detection/export_inference_graph.py \
    --input_type encoded_image_string_tensor \
    --pipeline_config_path ${OBJECT_DETECTION_CONFIG} \
    --trained_checkpoint_prefix ${YOUR_LOCAL_CHK_DIR}/model.ckpt-${CHECKPOINT_NUMBER} \
    --output_directory ${YOUR_LOCAL_EXPORT_DIR}
For more details on what the program is doing, see:
https://cloud.google.com/blog/big-data/2017/09/performing-prediction-with-tensorflow-object-detection-models-on-google-cloud-machine-learning-engine
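For completeness, once the model has been re-exported with --input_type encoded_image_string_tensor and loaded into TensorFlow Serving, a request can send the raw encoded JPEG bytes. This is an assumption-heavy sketch: it assumes the exported signature exposes its input under the key 'inputs' (verify with saved_model_cli show --dir <export_dir> --all), and that the server runs on localhost:9000 with --model_name=ssd_mobilenet; the image path is a placeholder.
# Hedged gRPC client sketch for a model exported with an encoded_image_string_tensor input.
import grpc
import tensorflow as tf
from tensorflow_serving.apis import predict_pb2, prediction_service_pb2_grpc

channel = grpc.insecure_channel('localhost:9000')
stub = prediction_service_pb2_grpc.PredictionServiceStub(channel)

with open('test.jpg', 'rb') as f:  # placeholder image path
    jpeg_bytes = f.read()

request = predict_pb2.PredictRequest()
request.model_spec.name = 'ssd_mobilenet'              # assumed --model_name
request.model_spec.signature_name = 'serving_default'
request.inputs['inputs'].CopyFrom(tf.make_tensor_proto([jpeg_bytes], shape=[1]))

result = stub.Predict(request, 10.0)  # 10 second timeout
print(result.outputs['detection_boxes'])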
I've been following the TensorFlow for Poets 2 codelab on a model I've trained, and have created a frozen, quantized graph with embedded weights. It's captured in a single file - say my_quant_graph.pb.
Since I can use that graph for inference with the TensorFlow Android inference library just fine, I thought I could do the same with Cloud ML Engine, but it seems Cloud ML Engine only works with a SavedModel.
How can I simply convert a frozen/quantized graph in a single pb file to use on ML engine?
It turns out that a SavedModel provides some extra info around a saved graph. Assuming a frozen graph doesn't need assets, it only needs a serving signature specified.
Here's the python code I ran to convert my graph to a format that Cloud ML engine accepted. Note I only have a single pair of input/output tensors.
import tensorflow as tf
from tensorflow.python.saved_model import signature_constants
from tensorflow.python.saved_model import tag_constants
export_dir = './saved'
graph_pb = 'my_quant_graph.pb'
builder = tf.saved_model.builder.SavedModelBuilder(export_dir)
with tf.gfile.GFile(graph_pb, "rb") as f:
    graph_def = tf.GraphDef()
    graph_def.ParseFromString(f.read())

sigs = {}

with tf.Session(graph=tf.Graph()) as sess:
    # name="" is important to ensure we don't get spurious prefixing
    tf.import_graph_def(graph_def, name="")
    g = tf.get_default_graph()
    inp = g.get_tensor_by_name("real_A_and_B_images:0")
    out = g.get_tensor_by_name("generator/Tanh:0")

    sigs[signature_constants.DEFAULT_SERVING_SIGNATURE_DEF_KEY] = \
        tf.saved_model.signature_def_utils.predict_signature_def(
            {"in": inp}, {"out": out})

    builder.add_meta_graph_and_variables(sess,
                                         [tag_constants.SERVING],
                                         signature_def_map=sigs)

builder.save()
Here is a sample with multiple output nodes:
# Convert PtotoBuf model to saved_model, format for TF Serving
# https://cloud.google.com/ai-platform/prediction/docs/exporting-savedmodel-for-prediction
import shutil
import tensorflow.compat.v1 as tf
from tensorflow.python.saved_model import signature_constants
from tensorflow.python.saved_model import tag_constants
export_dir = './1'  # TF Serving supports running different versions of the same model, so we put the current model in the '1' folder.
graph_pb = 'frozen_inference_graph.pb'
# Clear out folder
shutil.rmtree(export_dir, ignore_errors=True)
builder = tf.saved_model.builder.SavedModelBuilder(export_dir)
with tf.io.gfile.GFile(graph_pb, "rb") as f:
    graph_def = tf.GraphDef()
    graph_def.ParseFromString(f.read())

sigs = {}

with tf.Session(graph=tf.Graph()) as sess:
    # Prepare input and outputs of model
    tf.import_graph_def(graph_def, name="")
    g = tf.get_default_graph()
    image_tensor = g.get_tensor_by_name("image_tensor:0")
    num_detections = g.get_tensor_by_name("num_detections:0")
    detection_scores = g.get_tensor_by_name("detection_scores:0")
    detection_boxes = g.get_tensor_by_name("detection_boxes:0")
    detection_classes = g.get_tensor_by_name("detection_classes:0")

    sigs[signature_constants.DEFAULT_SERVING_SIGNATURE_DEF_KEY] = \
        tf.saved_model.signature_def_utils.predict_signature_def(
            {"input_image": image_tensor},
            {"num_detections": num_detections,
             "detection_scores": detection_scores,
             "detection_boxes": detection_boxes,
             "detection_classes": detection_classes})

    builder.add_meta_graph_and_variables(sess,
                                         [tag_constants.SERVING],
                                         signature_def_map=sigs)

builder.save()
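Once this SavedModel is loaded by TensorFlow Serving, a REST request addresses the inputs and outputs by the names used in the signature above. A minimal sketch, assuming the server was started with --rest_api_port=8501 and --model_name=detector (both placeholder values), with PIL used only to read the test image:
# Hedged REST client sketch for the multi-output signature defined above.
import json

import numpy as np
import requests
from PIL import Image

img = np.array(Image.open('test.jpg'))  # placeholder image path, HxWx3 uint8
payload = json.dumps({'instances': [{'input_image': img.tolist()}]})

response = requests.post('http://localhost:8501/v1/models/detector:predict', data=payload)
result = response.json()['predictions'][0]
print(result['num_detections'], result['detection_classes'][:5])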
I'm trying to benchmark the inference-phase performance of my Keras model built with the TensorFlow backend. I was thinking that the TensorFlow Benchmark tool was the proper way to go.
I've managed to build and run the example on Desktop with the tensorflow_inception_graph.pb and everything seems to work fine.
What I can't seem to figure out is how to save the Keras model as a proper .pb model. I'm able to get the TensorFlow graph from the Keras model as follows:
import keras.backend as K
K.set_learning_phase(0)
trained_model = function_that_returns_compiled_model()
sess = K.get_session()
sess.graph # This works
# Get the input tensor name for TF Benchmark
trained_model.input
> <tf.Tensor 'input_1:0' shape=(?, 360, 480, 3) dtype=float32>
# Get the output tensor name for TF Benchmark
trained_model.output
> <tf.Tensor 'reshape_2/Reshape:0' shape=(?, 360, 480, 12) dtype=float32>
I've now been trying to save the model in a couple of different ways.
import tensorflow as tf
from tensorflow.contrib.session_bundle import exporter
model = trained_model
export_path = "path/to/folder" # where to save the exported graph
export_version = 1 # version number (integer)
saver = tf.train.Saver(sharded=True)
model_exporter = exporter.Exporter(saver)
signature = exporter.classification_signature(input_tensor=model.input, scores_tensor=model.output)
model_exporter.init(sess.graph.as_graph_def(), default_graph_signature=signature)
model_exporter.export(export_path, tf.constant(export_version), sess)
This produces a folder with some files I don't know what to do with.
I would now run the Benchmark tool with something like this
bazel-bin/tensorflow/tools/benchmark/benchmark_model \
--graph=tensorflow/tools/benchmark/what_file.pb \
--input_layer="input_1:0" \
--input_layer_shape="1,360,480,3" \
--input_layer_type="float" \
--output_layer="reshape_2/Reshape:0"
But no matter which file I try to use as what_file.pb, I get: Error during inference: Invalid argument: Session was not created with a graph before Run()!
So I got this to work. I just needed to convert all variables in the TensorFlow graph to constants and then save the graph definition.
Here's a small example:
import tensorflow as tf
from keras import backend as K
from tensorflow.python.framework import graph_util
K.set_learning_phase(0)
model = function_that_returns_your_keras_model()
sess = K.get_session()
output_node_name = "my_output_node" # Name of your output node
with sess as sess:
    init_op = tf.global_variables_initializer()
    sess.run(init_op)
    graph_def = sess.graph.as_graph_def()
    output_graph_def = graph_util.convert_variables_to_constants(
        sess,
        sess.graph.as_graph_def(),
        output_node_name.split(","))
    tf.train.write_graph(output_graph_def,
                         logdir="my_dir",
                         name="my_model.pb",
                         as_text=False)
Now just call the TensorFlow Benchmark tool with my_model.pb as the graph.
You're saving the parameters of the model and not the graph definition; to save the graph, use tf.get_default_graph().as_graph_def().SerializeToString() and then write that to a file.
That said, I don't think the benchmark tool will work, since it has no way to initialize the variables your model depends on.
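For reference, a minimal sketch of that suggestion ('graph_only.pb' is an arbitrary output filename), keeping in mind that it stores only the graph structure, not the variable values:
# Sketch: serialize only the graph definition (no weights) to disk, as suggested above.
import tensorflow as tf

graph_def = tf.get_default_graph().as_graph_def()
with open('graph_only.pb', 'wb') as f:
    f.write(graph_def.SerializeToString())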