Deploying Keras Models via Google Cloud ML - tensorflow

I am looking to use Google Cloud ML to host my Keras models so that I can call the API and make some predictions. I am running into some issues from the Keras side of things.
So far I have been able to build a model using TensorFlow and deploy it on CloudML. In order for this to work I had to make some changes to my basic TF code. The changes are documented here: https://cloud.google.com/ml/docs/how-tos/preparing-models#code_changes
I have also been able to train a similar model using Keras. I can even save the model in the same export and export.meta format as I would get with TF.
import tensorflow as tf
from keras import backend as K

saver = tf.train.Saver()
session = K.get_session()
saver.save(session, 'export')
The part I am missing is: how do I add the placeholders for input and output to the graph I build in Keras?

After training your model on Google Cloud ML Engine (check out this awesome tutorial), I named the input and output of my graph with:
signature = predict_signature_def(inputs={'NAME_YOUR_INPUT': new_Model.input},
                                  outputs={'NAME_YOUR_OUTPUT': new_Model.output})
You can see the full exporting example for an already trained keras model 'model.h5' below.
import keras.backend as K
import tensorflow as tf
from keras.models import load_model, Sequential
from tensorflow.python.saved_model import builder as saved_model_builder
from tensorflow.python.saved_model import tag_constants, signature_constants
from tensorflow.python.saved_model.signature_def_utils_impl import predict_signature_def

# reset session
K.clear_session()
sess = tf.Session()
K.set_session(sess)

# disable loading of learning nodes
K.set_learning_phase(0)

# load model
model = load_model('model.h5')
config = model.get_config()
weights = model.get_weights()
new_Model = Sequential.from_config(config)
new_Model.set_weights(weights)

# export saved model
export_path = 'YOUR_EXPORT_PATH' + '/export'
builder = saved_model_builder.SavedModelBuilder(export_path)

signature = predict_signature_def(inputs={'NAME_YOUR_INPUT': new_Model.input},
                                  outputs={'NAME_YOUR_OUTPUT': new_Model.output})

with K.get_session() as sess:
    builder.add_meta_graph_and_variables(sess=sess,
                                         tags=[tag_constants.SERVING],
                                         signature_def_map={
                                             signature_constants.DEFAULT_SERVING_SIGNATURE_DEF_KEY: signature})
    builder.save()
You can also see my full implementation.
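Once the SavedModel has been written to YOUR_EXPORT_PATH/export, here is a hedged sketch of how you might copy it to a bucket and deploy it on Cloud ML Engine (the bucket, model, and version names are placeholders; SavedModel exports need runtime version 1.2 or above, as discussed further down):
gsutil cp -r YOUR_EXPORT_PATH/export gs://YOUR_BUCKET/keras_export/
gcloud ml-engine models create my_keras_model --regions us-central1
gcloud ml-engine versions create v1 \
    --model my_keras_model \
    --origin gs://YOUR_BUCKET/keras_export/export \
    --runtime-version 1.2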

I found out that in order to use Keras on Google Cloud, you have to install it with a setup.py script and put it in the same folder where you run the gcloud command:
├── setup.py
└── trainer
    ├── __init__.py
    ├── cloudml-gpu.yaml
    ├── example5-keras.py
And in the setup.py you put content such as:
from setuptools import setup, find_packages

setup(name='example5',
      version='0.1',
      packages=find_packages(),
      description='example to run keras on gcloud ml-engine',
      author='Fuyang Liu',
      author_email='fuyang.liu@example.com',
      license='MIT',
      install_requires=[
          'keras',
          'h5py'
      ],
      zip_safe=False)
Then you can start your job on gcloud like this:
export BUCKET_NAME=tf-learn-simple-sentiment
export JOB_NAME="example_5_train_$(date +%Y%m%d_%H%M%S)"
export JOB_DIR=gs://$BUCKET_NAME/$JOB_NAME
export REGION=europe-west1
gcloud ml-engine jobs submit training $JOB_NAME \
  --job-dir gs://$BUCKET_NAME/$JOB_NAME \
  --runtime-version 1.0 \
  --module-name trainer.example5-keras \
  --package-path ./trainer \
  --region $REGION \
  --config=trainer/cloudml-gpu.yaml \
  -- \
  --train-file gs://tf-learn-simple-sentiment/sentiment_set.pickle
To use a GPU, add a file such as cloudml-gpu.yaml in your module with the following content:
trainingInput:
  scaleTier: CUSTOM
  # standard_gpu provides 1 GPU. Change to complex_model_m_gpu for 4 GPUs.
  masterType: standard_gpu
  runtimeVersion: "1.0"
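The submit command above passes --train-file after the bare --, and gcloud also forwards --job-dir to the module. A minimal sketch of how trainer/example5-keras.py might pick these up (only the argument names come from the command above; the rest is illustrative):
import argparse

parser = argparse.ArgumentParser()
parser.add_argument('--train-file', required=True,
                    help='GCS path to the training data (e.g. the pickle above)')
parser.add_argument('--job-dir', default=None,
                    help='GCS location for checkpoints/exports, forwarded by gcloud')
args = parser.parse_args()

print('Training with data from', args.train_file)
# ... build and train the Keras model here ...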

I don't know much about Keras. I consulted with some experts, and the following should work:
import json
import tensorflow as tf
from keras import backend as K

# Build the model first
model = ...

# Declare the inputs and outputs for CloudML
inputs = dict(zip((layer.name for layer in model.input_layers),
                  (t.name for t in model.inputs)))
tf.add_to_collection('inputs', json.dumps(inputs))

outputs = dict(zip((layer.name for layer in model.output_layers),
                   (t.name for t in model.outputs)))
tf.add_to_collection('outputs', json.dumps(outputs))

# Fit/train the model
model.fit(...)

# Export the model
saver = tf.train.Saver()
session = K.get_session()
saver.save(session, 'export')
Some important points:
You have to call tf.add_to_collection after you create the model, but before you ever call K.get_session(), fit, etc.
Be sure to set the names of the input and output layers when you add them to the graph, because you'll need to refer to them when you send prediction requests.
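As a minimal sketch of that last point (the layer names here are hypothetical), you can name the input and output layers when you build the model so the collections above get stable keys:
from keras.layers import Dense, Input
from keras.models import Model

# 'my_input' and 'my_output' become the names you refer to in prediction requests
inputs = Input(shape=(10,), name='my_input')
hidden = Dense(32, activation='relu')(inputs)
outputs = Dense(1, name='my_output')(hidden)
model = Model(inputs=inputs, outputs=outputs)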

Here's another answer that may help. Assuming you already have a Keras model, you should be able to append this to the end of your script and get an ML Engine-compatible version of the model (a protocol buffer). Note that you need to upload the saved_model.pb file and the sibling variables directory to ML Engine for it to work. Note also that the .pb file must be named saved_model.pb or saved_model.pbtxt.
Assuming your model is named model:
from keras import backend as K
from tensorflow import saved_model

model_builder = saved_model.builder.SavedModelBuilder("exported_model")

inputs = {
    'input': saved_model.utils.build_tensor_info(model.input)
}
outputs = {
    'earnings': saved_model.utils.build_tensor_info(model.output)
}

signature_def = saved_model.signature_def_utils.build_signature_def(
    inputs=inputs,
    outputs=outputs,
    method_name=saved_model.signature_constants.PREDICT_METHOD_NAME
)

model_builder.add_meta_graph_and_variables(
    K.get_session(),
    tags=[saved_model.tag_constants.SERVING],
    signature_def_map={saved_model.signature_constants.DEFAULT_SERVING_SIGNATURE_DEF_KEY: signature_def}
)
model_builder.save()
This will export the model to the exported_model directory.
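Before uploading, a quick way to sanity-check the export is the saved_model_cli tool that ships with TensorFlow (directory name as in the snippet above); it should list the 'input' and 'earnings' tensors from the signature:
saved_model_cli show --dir exported_model --tag_set serve --signature_def serving_default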

Related

Convert a TensorFlow model into a format that can be served

I am following the TensorFlow Serving documentation to convert my trained model into a format that can be served in a Docker container. As I'm new to TensorFlow, I am struggling to convert this trained model into a form that is suitable for serving.
The model is already trained and I have the checkpoint file and the .meta file. I need to get the .pb file and the variables folder from these two files. Can anyone suggest an approach for how to get this done for serving the model?
.
|-- tensorflow model
    |-- 1
        |-- saved_model.pb
        |-- variables
            |-- variables.data-00000-of-00001
            |-- variables.index
There are multiple ways of doing this, and other methods may be required for more complex models.
I am currently using the method described here, which works great for tf.keras.models.Model and tf.keras.Sequential models (not sure about TensorFlow subclassed models).
Below is a minimal working example, including creating a model using Python (judging by your folder structure, it seems like you have already completed this and can ignore the first step).
import tensorflow as tf
from tensorflow.keras.layers import Input, Dense
from tensorflow.keras.models import Model
import tensorflow.keras.backend as K

inputs = Input(shape=(2,))
x = Dense(128, activation='relu')(inputs)
x = Dense(32, activation='relu')(x)
outputs = Dense(1)(x)
model = Model(inputs=inputs, outputs=outputs)
model.compile(optimizer='adam', loss='mse')

# loading existing weights; the model architecture must be the same as the existing model
#model.load_weights(".//MODEL_WEIGHT_PATH//WEIGHT.h5")

export_path = 'SAVE_PATH//tensorflow_model//1'

with K.get_session() as sess:
    tf.saved_model.simple_save(
        sess,
        export_path,
        inputs={'inputs': model.input},  # for single input
        #inputs={t.name[:-5]: t for t in model.input},  # for multiple inputs
        outputs={'outputs': model.output})
I suggest you use folder name "tensorflow_model" instead of "tensorflow model", to avoid possible problems with spaces.
Then we can start the TensorFlow Serving Docker container from a terminal (on Windows, use ^ instead of \ for line breaks, and //C/ instead of C:\ in paths):
docker run -p 8501:8501 --name tfserving_test \
  --mount type=bind,source="SAVE_PATH/tensorflow_model",target=/models/tensorflow_model \
  -e MODEL_NAME=tensorflow_model -t tensorflow/serving
Now the container should be up and running, and we can test the serving with python
import requests
import json
#import numpy as np

payload = {
    "instances": [{'inputs': [1., 1.]}]
}

r = requests.post('http://localhost:8501/v1/models/tensorflow_model:predict', json=payload)
print(json.loads(r.content))
# {'predictions': [[0.121025]]}
The container is working with our model, giving the prediction 0.121025 for the input [1., 1.]
I hope this helps:
import tensorflow as tf
from tensorflow.contrib.keras import backend as K
from tensorflow.python.client import device_lib

K.set_learning_phase(0)
model = tf.keras.models.load_model('my_model.h5')
export_path = './'

with K.get_session() as sess:
    tf.saved_model.simple_save(
        sess,
        export_path,
        inputs={'input_image': model.input},
        outputs={t.name: t for t in model.outputs}
    )

print('Converted to SavedModel!!!')
From your question, do you mean that you no longer have access to the model and only have the checkpoint and .meta files?
If that is the case, you can refer to the links below, which have the code for converting those files into a .pb file.
Tensorflow: How to convert .meta, .data and .index model files into one graph.pb file
https://github.com/petewarden/tensorflow_makefile/blob/master/tensorflow/python/tools/freeze_graph.py
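Alternatively, since you already have the checkpoint and .meta files, here is a minimal sketch (TF 1.x, assuming a version that provides tf.saved_model.simple_save) of restoring the graph and re-exporting it as a SavedModel; the checkpoint prefix and tensor names are hypothetical and must match your graph:
import tensorflow as tf

export_dir = 'tensorflow_model/1'

with tf.Session(graph=tf.Graph()) as sess:
    # Rebuild the graph from the .meta file and restore the weights
    saver = tf.train.import_meta_graph('model.ckpt.meta')
    saver.restore(sess, 'model.ckpt')

    graph = tf.get_default_graph()
    in_tensor = graph.get_tensor_by_name('input:0')    # hypothetical input tensor name
    out_tensor = graph.get_tensor_by_name('output:0')  # hypothetical output tensor name

    # Writes saved_model.pb plus the variables/ folder you need
    tf.saved_model.simple_save(sess,
                               export_dir,
                               inputs={'inputs': in_tensor},
                               outputs={'outputs': out_tensor})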
If you have access to the trained model, then I guess you are currently saving it using tf.train.Saver. Instead, you can save and export the model using any of the three commonly used approaches below:
tf.saved_model.simple_save => In this case, only the Predict API is supported during serving. An example of this is given by KrisR89 in his answer.
tf.saved_model.builder.SavedModelBuilder => In this case, you can define the SignatureDefs, i.e., the APIs you want to expose during serving.
You can find an example of how to use it at the link below:
https://github.com/tensorflow/serving/blob/master/tensorflow_serving/example/mnist_saved_model.py
The third way is shown below:
classifier = tf.estimator.DNNClassifier(config=training_config,
                                        feature_columns=feature_columns,
                                        hidden_units=[256, 32],
                                        optimizer=tf.train.AdamOptimizer(1e-4),
                                        n_classes=NUM_CLASSES,
                                        dropout=0.1,
                                        model_dir=FLAGS.model_dir)
classifier.export_savedmodel(FLAGS.saved_dir,
                             serving_input_receiver_fn=serving_input_receiver_fn)
An example of how to save a model using Estimators can be found at the link below. This supports the Predict and Classify APIs.
https://github.com/yu-iskw/tensorflow-serving-example/blob/master/python/train/mnist_premodeled_estimator.py
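The Estimator snippet above references a serving_input_receiver_fn without defining it; here is a hedged sketch of what it might look like (the feature name 'x' and the shape are placeholders that must match your feature_columns):
import tensorflow as tf

def serving_input_receiver_fn():
    # Placeholder that the server feeds with incoming request data
    receiver_tensors = {
        'x': tf.placeholder(dtype=tf.float32, shape=[None, 784], name='x')
    }
    # Here the raw inputs are passed through unchanged as features
    return tf.estimator.export.ServingInputReceiver(receiver_tensors, receiver_tensors)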
Let me know if this information helps or if you need any further help.

How to make the tensorflow hub embeddings servable using tensorflow serving?

I am trying to use an embeddings module from TensorFlow Hub as a servable. I am new to TensorFlow. Currently, I am using the Universal Sentence Encoder embeddings as a lookup to convert sentences to embeddings and then using those embeddings to find the similarity to another sentence.
My current code to convert sentences into embeddings is:
with tf.Session() as session:
    session.run([tf.global_variables_initializer(), tf.tables_initializer()])
    sen_embeddings = session.run(self.embed(prepared_text))
prepared_text is a list of sentences. How do I take this model and make it a servable?
Right now you probably need to do this by hand. Here is my solution, similar to the previous answer but more general: it shows how to use any other module without guessing the input parameters, and is extended with verification and usage:
import tensorflow as tf
import tensorflow_hub as hub
from tensorflow.saved_model import simple_save

export_dir = "/tmp/tfserving/universal_encoder/00000001"

with tf.Session(graph=tf.Graph()) as sess:
    module = hub.Module("https://tfhub.dev/google/universal-sentence-encoder/2")
    input_params = module.get_input_info_dict()
    # take a look at which tensor the module accepts - 'text' is the input tensor name
    text_input = tf.placeholder(name='text', dtype=input_params['text'].dtype,
                                shape=input_params['text'].get_shape())
    sess.run([tf.global_variables_initializer(), tf.tables_initializer()])
    embeddings = module(text_input)

    simple_save(sess,
                export_dir,
                inputs={'text': text_input},
                outputs={'embeddings': embeddings},
                legacy_init_op=tf.tables_initializer())
Thanks to module.get_input_info_dict() you know which tensor names you need to pass to the model - you use this name as the key for inputs={} in the simple_save method.
Remember that to serve the model it needs to be in a directory path ending with a version number; that's why '00000001' is the last path component, in which saved_model.pb resides.
After exporting your module, the quickest way to see if your model is exported properly for serving is to use the saved_model_cli tool:
saved_model_cli run --dir /tmp/tfserving/universal_encoder/00000001 --tag_set serve --signature_def serving_default --input_exprs 'text=["what this is"]'
To serve the model from docker:
docker pull tensorflow/serving
docker run -p 8501:8501 -v /tmp/tfserving/universal_encoder:/models/universal_encoder -e MODEL_NAME=universal_encoder -t tensorflow/serving
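Once the container is up, a quick hedged check of the REST endpoint (model name universal_encoder and port 8501 as used in the docker run above):
import json
import requests

payload = {"instances": ["what this is", "another sentence"]}
r = requests.post('http://localhost:8501/v1/models/universal_encoder:predict',
                  json=payload)
# One embedding vector per input sentence
print(json.loads(r.content)['predictions'])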
Currently, the hub modules cannot be consumed by Tensorflow Serving directly. You will have to load the module into an empty graph and then export it using the SavedModelBuilder. For example:
import tensorflow as tf
import tensorflow_hub as hub

with tf.Graph().as_default():
    module = hub.Module("http://tfhub.dev/google/universal-sentence-encoder/2")
    text = tf.placeholder(tf.string, [None])
    embedding = module(text)

    init_op = tf.group([tf.global_variables_initializer(), tf.tables_initializer()])

    with tf.Session() as session:
        session.run(init_op)
        tf.saved_model.simple_save(
            session,
            "/tmp/serving_saved_model",
            inputs={"text": text},
            outputs={"embedding": embedding},
            legacy_init_op=tf.tables_initializer()
        )
This will export your model (to the folder /tmp/serving_saved_model) in the desired format for serving. After this, you can follow the instructions given in the documentation here: https://www.tensorflow.org/serving/serving_basic
Note that the other answers are for TensorFlow 1. Most TF Hub models for TensorFlow 2 will already be compatible with TF Serving. For example, to deploy the USE-Large model:
Download the model, either via the tensorflow_hub library or just https://tfhub.dev/google/universal-sentence-encoder-large/5
Put the content into folders representing the model name and version, e.g. models/use-large/5
Run the TF Serving application, e.g. via Docker:
docker run -t --rm -p 8501:8501 \
-v "$PATH_TO_YOUR_WORKSPACE/models:/models" \
-e MODEL_NAME="use-large" \
tensorflow/serving
The model will be available at localhost:8501/v1/models/use-large:
curl -d '{"instances": ["Hey!"]}' \
-X POST http://localhost:8501/v1/models/use-large:predict
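For steps 1 and 2 above, here is a hedged sketch using the tensorflow_hub library (assuming a version that provides hub.resolve, e.g. 0.8+; otherwise just download and untar the model from tfhub.dev manually):
import shutil
import tensorflow_hub as hub

# Download (or reuse the cached copy of) the module and get its local path
local_path = hub.resolve("https://tfhub.dev/google/universal-sentence-encoder-large/5")

# Copy it into the <model_name>/<version> layout that TF Serving expects
shutil.copytree(local_path, "models/use-large/5")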

Upload SavedModel to ML engine

I'm trying to upload my saved model to ML engine so I can consume my model online, however I am getting the below error:
I am using tensorflow version 1.5 locally to train my model, based on the Tensorflow for poets tutorial (https://codelabs.developers.google.com/codelabs/tensorflow-for-poets/).
I am then converting my model using the below 'save_model.py' script:
import tensorflow as tf
from tensorflow.python.saved_model import signature_constants
from tensorflow.python.saved_model import tag_constants
from tensorflow.python.saved_model import builder as saved_model_builder

input_graph = 'retrained_graph.pb'
saved_model_dir = 'my_model'

with tf.Graph().as_default() as graph:
    # Read in the export graph
    with tf.gfile.FastGFile(input_graph, 'rb') as f:
        graph_def = tf.GraphDef()
        graph_def.ParseFromString(f.read())
        tf.import_graph_def(graph_def, name='')

    # Define SavedModel Signature (inputs and outputs)
    in_image = graph.get_tensor_by_name('DecodeJpeg/contents:0')
    inputs = {'image_bytes': tf.saved_model.utils.build_tensor_info(in_image)}

    out_classes = graph.get_tensor_by_name('final_result:0')
    outputs = {'prediction': tf.saved_model.utils.build_tensor_info(out_classes)}

    signature = tf.saved_model.signature_def_utils.build_signature_def(
        inputs=inputs,
        outputs=outputs,
        method_name='tensorflow/serving/predict'
    )

    with tf.Session(graph=graph) as sess:
        # Save out the SavedModel.
        b = saved_model_builder.SavedModelBuilder(saved_model_dir)
        b.add_meta_graph_and_variables(sess,
                                       [tf.saved_model.tag_constants.SERVING],
                                       signature_def_map={'serving_default': signature})
        b.save()
Is the error message saying to please use runtime 1.2 or above referring to the TensorFlow version? Or is my save_model.py doing something incorrectly?
You will need to use gcloud to deploy your model. The console does not let you manually specify the runtime version (i.e. it assumes TensorFlow 1.0). Further note that 1.5 is not yet available but will be very soon. That said, your model might work with 1.4, so it's worth a try.
The command to run is (v1 here is the name you give the new version):
gcloud ml-engine versions create v1 --model mymodel --origin=gs://mybucket --runtime-version 1.4
And in the near future you can use --runtime-version 1.5.
For more info, see the reference docs, particular the gcloud examples.

Use keras with tensorflow serving

I was wondering how to use my model trained with keras on a production server.
I heard about tensorflow serving, but I can't figure out how to use it with my keras model.
I found this link : https://blog.keras.io/keras-as-a-simplified-interface-to-tensorflow-tutorial.html
But I don't know how to initialize the sess variable, since my model is already trained and everything.
Is there any way to do this?
You can initialise your session variable as
from keras import backend as K
sess = K.get_session()
and go about exporting the model as in the tutorial (note that the import for exporter has changed):
import tensorflow as tf
from tensorflow.contrib.session_bundle import exporter

K.set_learning_phase(0)

export_path = ...     # where to save the exported graph
export_version = ...  # version number (integer)

saver = tf.train.Saver(sharded=True)
model_exporter = exporter.Exporter(saver)
signature = exporter.classification_signature(input_tensor=model.input,
                                              scores_tensor=model.output)
model_exporter.init(sess.graph.as_graph_def(),
                    default_graph_signature=signature)
model_exporter.export(export_path, tf.constant(export_version), sess)
A good alternative to TensorFlow Serving can be TensorCraft, a simple HTTP server that stores models (I'm the author of this tool). Currently it supports only the TensorFlow SavedModel format.
Before using the model, you need to export it using the TensorFlow API, pack it into a TAR archive, and push it to the server.
You can find more details in the project documentation.

How to serve retrained Inception model using Tensorflow Serving?

So I have trained the Inception model to recognize flowers according to this guide: https://www.tensorflow.org/versions/r0.8/how_tos/image_retraining/index.html
bazel build tensorflow/examples/image_retraining:retrain
bazel-bin/tensorflow/examples/image_retraining/retrain --image_dir ~/flower_photos
To classify the image via command line, I can do this:
bazel build tensorflow/examples/label_image:label_image && \
bazel-bin/tensorflow/examples/label_image/label_image \
--graph=/tmp/output_graph.pb --labels=/tmp/output_labels.txt \
--output_layer=final_result \
--image=$HOME/flower_photos/daisy/21652746_cc379e0eea_m.jpg
But how do I serve this graph via TensorFlow Serving?
The guide on setting up TensorFlow Serving (https://tensorflow.github.io/serving/serving_basic) does not explain how to incorporate the graph (output_graph.pb). The server expects a different file format:
$>ls /tmp/mnist_model/00000001
checkpoint export-00000-of-00001 export.meta
To serve the graph after you have trained it, you would need to export it using this API: https://www.tensorflow.org/versions/r0.8/api_docs/python/train.html#export_meta_graph
That API generates the metagraph def that is needed by the serving code (this will generate the .meta file you are asking about).
Also, you need to save a checkpoint using Saver.save(), i.e. the Saver class: https://www.tensorflow.org/versions/r0.8/api_docs/python/train.html#Saver
Once you have done this, you will have both the metagraph def and the checkpoint files that are needed to restore the graph.
You have to export the model. I have a PR that exports the model during retraining. The gist of it is below:
import tensorflow as tf

def export_model(sess, architecture, saved_model_dir):
    if architecture == 'inception_v3':
        input_tensor = 'DecodeJpeg/contents:0'
    elif architecture.startswith('mobilenet_'):
        input_tensor = 'input:0'
    else:
        raise ValueError('Unknown architecture', architecture)

    in_image = sess.graph.get_tensor_by_name(input_tensor)
    inputs = {'image': tf.saved_model.utils.build_tensor_info(in_image)}

    out_classes = sess.graph.get_tensor_by_name('final_result:0')
    outputs = {'prediction': tf.saved_model.utils.build_tensor_info(out_classes)}

    signature = tf.saved_model.signature_def_utils.build_signature_def(
        inputs=inputs,
        outputs=outputs,
        method_name=tf.saved_model.signature_constants.PREDICT_METHOD_NAME
    )

    legacy_init_op = tf.group(tf.tables_initializer(), name='legacy_init_op')

    # Save out the SavedModel.
    builder = tf.saved_model.builder.SavedModelBuilder(saved_model_dir)
    builder.add_meta_graph_and_variables(
        sess, [tf.saved_model.tag_constants.SERVING],
        signature_def_map={
            tf.saved_model.signature_constants.DEFAULT_SERVING_SIGNATURE_DEF_KEY: signature
        },
        legacy_init_op=legacy_init_op)
    builder.save()
The above will create a variables directory and a saved_model.pb file. If you put them under a parent directory representing the version number (e.g. 1/), then you can call TensorFlow Serving via:
tensorflow_model_server --port=9000 --model_name=inception --model_base_path=/path/to/saved_models/
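With the server running on port 9000 (gRPC), here is a hedged client sketch, assuming the tensorflow-serving-api and grpcio packages are installed; the image path is a placeholder, and the signature and key names ('serving_default', 'image', 'prediction') come from export_model() above:
import grpc
import tensorflow as tf
from tensorflow_serving.apis import predict_pb2, prediction_service_pb2_grpc

channel = grpc.insecure_channel('localhost:9000')
stub = prediction_service_pb2_grpc.PredictionServiceStub(channel)

with open('daisy.jpg', 'rb') as f:  # hypothetical test image
    image_bytes = f.read()

request = predict_pb2.PredictRequest()
request.model_spec.name = 'inception'
request.model_spec.signature_name = 'serving_default'
# The Inception graph expects a scalar string of JPEG bytes
request.inputs['image'].CopyFrom(tf.make_tensor_proto(image_bytes, shape=[]))

response = stub.Predict(request, 10.0)  # 10 second timeout
print(response.outputs['prediction'])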
Check out this gist on how to load your .pb output graph in a Session:
https://github.com/eldor4do/Tensorflow-Examples/blob/master/retraining-example.py